[Datawhale Study Check-in] tianchi-intel-PaddleOCR
I: Install Paddle-gpu
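Before moving on, it can help to confirm that the install is visible to Python. The following is a minimal sketch, not part of the original project: check_paddle is a hypothetical helper name, and it assumes the package is importable as paddle and exposes paddle.is_compiled_with_cuda().

```python
import importlib


def check_paddle():
    """Report whether Paddle is importable and whether it was built with CUDA."""
    try:
        paddle = importlib.import_module("paddle")
    except (ImportError, OSError):
        # Not installed, or a GPU build whose CUDA libraries cannot be loaded.
        return {"installed": False, "version": None, "cuda": False}
    return {
        "installed": True,
        "version": paddle.__version__,
        "cuda": bool(paddle.is_compiled_with_cuda()),
    }


print(check_paddle())
```

If "cuda" comes back False, the CPU build was probably installed instead of the GPU one. For a fuller self-test, paddle.utils.run_check() runs a small computation on the available device.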
II: Clone the project
3. git clone https://gitee.com/coggle/tianchi-intel-PaddleOCR
4. cd tianchi-intel-PaddleOCR
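5. Do not use the one-click training that ships in run.sh; run the steps below by hand instead
6. python3 down_image.py (downloads the images)
7. Download the pretrained inference models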
7.1 mkdir inference
7.2 cd inference
7.3 Download the model weights for the three stages (detection, angle classification, recognition)
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar
7.4 Extract them into the current folder
tar -xf ch_ppocr_server_v2.0_rec_infer.tar
tar -xf ch_ppocr_server_v2.0_det_infer.tar
tar -xf ch_ppocr_mobile_v2.0_cls_infer.tar
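On machines without wget or tar (plain Windows, for example), steps 7.1 to 7.4 can be approximated with the Python standard library. This is a rough sketch under some assumptions: archive_name and fetch_and_extract are hypothetical helper names, and each archive is assumed to unpack into a folder named after the archive minus ".tar", which matches the paths used in the commands of this post.

```python
import tarfile
import urllib.request
from pathlib import Path

# The three archive URLs listed in step 7.3.
MODEL_URLS = [
    "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_infer.tar",
    "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar",
    "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_infer.tar",
]


def archive_name(url):
    """Return the file name at the end of a URL."""
    return url.rsplit("/", 1)[-1]


def fetch_and_extract(url, dest=Path("inference")):
    """Download the archive into dest (skipped if already there) and unpack it."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / archive_name(url)
    if not archive.exists():
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    # Assumed layout: the archive unpacks into a folder with the same stem.
    return dest / archive.stem
```

Running `for url in MODEL_URLS: fetch_and_extract(url)` from the project root should leave the three model folders under inference/, the same layout the shell commands produce.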
8. Verify inference (mind your current directory: run this from the project root, not from inference/)
python tools/infer/predict_system.py --image_dir="./1.jpg" --det_model_dir="./inference/ch_ppocr_server_v2.0_det_infer/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
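Long commands like this one are easy to break when copied from a web page, because typographic quotes reach the program as literal characters in the path. One way around that is to build the argument list in Python and skip shell quoting entirely. This is only a sketch: predict_command and run_prediction are hypothetical helpers, and the flags are the ones used in the command above.

```python
import subprocess
import sys


def predict_command(image, det_dir, rec_dir, cls_dir,
                    script="tools/infer/predict_system.py"):
    """Build the argument list for the prediction script (no shell quoting)."""
    return [
        sys.executable, script,
        f"--image_dir={image}",
        f"--det_model_dir={det_dir}",
        f"--rec_model_dir={rec_dir}",
        f"--cls_model_dir={cls_dir}",
        "--use_angle_cls=True",
        "--use_space_char=True",
    ]


def run_prediction(*args, **kwargs):
    """Run the prediction script from the current directory; raises on failure."""
    subprocess.run(predict_command(*args, **kwargs), check=True)
```

Called from the project root as run_prediction("./1.jpg", "./inference/ch_ppocr_server_v2.0_det_infer/", "./inference/ch_ppocr_server_v2.0_rec_infer/", "./inference/ch_ppocr_mobile_v2.0_cls_infer/"), this should be equivalent to the command in step 8.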
III: Transfer learning
9. Fine-tune the detection model from the pretrained weights
9.1 cd inference
9.2 wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_det_train.tar
tar -xf ch_ppocr_server_v2.0_det_train.tar
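9.3 On Windows, set the data-loader worker count (num_workers) to 0 for both the training set and the validation set in the config file: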
configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml
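The edit in 9.3 can also be scripted. Below is a minimal sketch that rewrites every num_workers entry to 0 with a plain text substitution, so comments and key order in the YAML file are left alone. set_num_workers_zero and patch_config are hypothetical helper names, and the sketch assumes the worker counts appear as lines of the form "num_workers: <integer>".

```python
import re
from pathlib import Path

# Matches "num_workers: <int>" at any indentation, keeping the prefix intact.
_NUM_WORKERS = re.compile(r"^([ \t]*num_workers:[ \t]*)\d+", flags=re.MULTILINE)


def set_num_workers_zero(yaml_text):
    """Return the config text with every num_workers value replaced by 0."""
    return _NUM_WORKERS.sub(r"\g<1>0", yaml_text)


def patch_config(path):
    """Rewrite the config file in place."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    path.write_text(set_num_workers_zero(text), encoding="utf-8")
```

For example, patch_config("configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml") run from the project root. Keeping a backup copy of the file first is a good idea, since the change is made in place.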
9.4 Start training from the project root (edit the config file first; training converged in about 10 epochs)
python tools/train.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.pretrain_weights=./inference/ch_ppocr_server_v2.0_det_train/
IV: Prediction
10. Export the model
python tools/export_model.py -c configs/det/ch_ppocr_v2.0/ch_det_res18_db_v2.0.yml -o Global.pretrained_model=output/ch_db_res18/best_accuracy Global.save_inference_dir=output/ch_db_res18/
Run prediction on the test set
python tools/infer/predict_system_tianchi.py --image_dir="./doc/imgs/11.jpg" --det_model_dir="output/ch_db_res18/" --rec_model_dir="./inference/ch_ppocr_server_v2.0_rec_infer/" --cls_model_dir="./inference/ch_ppocr_mobile_v2.0_cls_infer/" --use_angle_cls=True --use_space_char=True
Compress the result files
zip -r submit.zip Xeon1OCR_round1_test*
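Where the zip command is not available (a stock Windows install, for instance), the same archive can be put together with Python's zipfile module. This is a sketch only: zip_results is a hypothetical helper, and it assumes the result files or folders sit in the given root directory and match the same Xeon1OCR_round1_test* pattern.

```python
import zipfile
from pathlib import Path


def zip_results(root=".", pattern="Xeon1OCR_round1_test*", out="submit.zip"):
    """Pack every file or folder under root that matches pattern into out."""
    root = Path(root)
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        for top in sorted(root.glob(pattern)):
            if top.is_file():
                files = [top]
            else:
                # Recurse into matching folders, like "zip -r" does.
                files = sorted(p for p in top.rglob("*") if p.is_file())
            for p in files:
                # Store paths relative to root so the archive has no absolute paths.
                zf.write(p, arcname=p.relative_to(root).as_posix())
    return Path(out)
```

Calling zip_results() from the folder that holds the results writes submit.zip there. Unlike zip -r, this sketch stores files only and does not add separate entries for directories.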
AI Studio project link:
https://aistudio.baidu.com/aistudio/projectdetail/2183888?shared=1