1. Install Boost
yum install boost-devel boost-test boost
2. Install zlib, bzip2, and xz
yum install zlib bzip2 xz
3. Install cmake and make
yum install cmake make
4. Install kenlm
http://kheafield.com/code/kenlm/
https://github.com/kpu/kenlm/
unzip kenlm-master.zip
mv kenlm-master kenlm
cd kenlm
mkdir build
cd build
cmake ..
make
5. Install the kenlm Python package (run from the kenlm source root, where setup.py lives)
python setup.py install
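To confirm that the install step succeeded, one option is to check whether the module can be found on the import path. This is a minimal sketch; the helper name is_installed is hypothetical and not part of kenlm.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether the kenlm Python package is visible to this interpreter.
print("kenlm importable:", is_installed("kenlm"))
```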
6. Basic usage
6.1 Data (contents of the training file test; one sentence per line, tokens separated by spaces)
河南大学 真棒
中国 人民 我 爱 你
北京 欢迎 您
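The three lines above can be saved to disk from Python. This is a minimal sketch, assuming the corpus file is named test (the name passed to --text in the training command); the helper write_corpus is hypothetical.

```python
from pathlib import Path

# Sample sentences from this section; tokens are already separated by spaces.
SENTENCES = [
    "河南大学 真棒",
    "中国 人民 我 爱 你",
    "北京 欢迎 您",
]


def write_corpus(path, sentences):
    """Write one sentence per line in UTF-8 and return the number of lines written."""
    text = "\n".join(s.strip() for s in sentences) + "\n"
    Path(path).write_text(text, encoding="utf-8")
    return len(sentences)
```

Calling write_corpus("test", SENTENCES) creates the file in the current directory.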
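6.2 Training
Run lmplz from the build directory, where bin/lmplz was compiled. The -o option sets the n-gram order; without --text and --arpa, lmplz reads the corpus from stdin and writes the ARPA model to stdout:
bin/lmplz -o 3
With explicit input and output files:
bin/lmplz -o 3 --verbose_header --text test --arpa test.arpa
On a corpus this small, lmplz aborts with the following error:
Could not calculate Kneser-Ney discounts for 3-grams with adjusted count 4 because we didn't observe any 3-grams with adjusted count 3; Is this small or artificial data?
Try deduplicating the input. To override this error for e.g. a class-based model, rerun with --discount_fallback
As the message suggests, rerun with --discount_fallback:
bin/lmplz -o 3 --verbose_header --text test --arpa test.arpa --discount_fallback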
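The training command can also be driven from Python. The sketch below only assembles the flags that appear in this section and hands them to subprocess; the function names build_lmplz_command and train, and the default path bin/lmplz, are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def build_lmplz_command(text_path, arpa_path, order=3, discount_fallback=True,
                        lmplz="bin/lmplz"):
    """Return the lmplz argument list, using only flags shown in this section."""
    cmd = [lmplz, "-o", str(order), "--verbose_header",
           "--text", str(text_path), "--arpa", str(arpa_path)]
    if discount_fallback:
        # Suggested by the lmplz error message for small or artificial data.
        cmd.append("--discount_fallback")
    return cmd


def train(text_path, arpa_path, **kwargs):
    """Run lmplz; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_lmplz_command(text_path, arpa_path, **kwargs), check=True)
    return Path(arpa_path)
```

Calling train("test", "test.arpa") from the build directory would run the same command as above with --discount_fallback appended.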
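6.3 Using Python
import kenlm
# Load the ARPA model produced in 6.2.
model = kenlm.Model('test.arpa')
# score() returns the log10 probability; bos/eos control whether <s> and </s> are added.
print(model.score('中国', bos=False, eos=False))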
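Because score() returns a total log10 probability, it is often easier to compare sentences by perplexity. The following is a minimal sketch of that conversion, not kenlm's own implementation; the helper names are hypothetical, and counting </s> as one extra scored token is an assumption about the normalization.

```python
def perplexity_from_log10(log10_prob, n_tokens):
    """Convert a total log10 probability over n_tokens scored tokens to perplexity."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return 10.0 ** (-log10_prob / n_tokens)


def sentence_perplexity(model, sentence):
    """Score a space-separated sentence with <s> and </s> added.

    The end-of-sentence token is counted as one scored token (assumption).
    """
    n_tokens = len(sentence.split()) + 1  # +1 for </s>
    return perplexity_from_log10(model.score(sentence, bos=True, eos=True), n_tokens)
```

With the model loaded as above, sentence_perplexity(model, '中国 人民') gives a per-token figure in which lower values mean the model finds the sentence more likely.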
Original article: https://blog.csdn.net/make_progress/article/details/107517552