Building a PaddleOCR + Flask + Layui Web API Platform from Scratch (Part 1: Bank Card Recognition)

见贤思齐 · Posted on 2024-9-10 19:18:09

Preface

For a business requirement we needed to recognize bank card numbers. To keep costs down I looked at various open-source frameworks and settled on PaddleOCR + Flask + Layui to build an OCR platform that exposes a web API. This post tries to explain the whole setup from a beginner's perspective; please forgive any shortcomings. The source code is linked at the end. It runs locally or can be deployed directly to Linux, and it includes the trained models.

1. Environment setup

Download the source

Download the 2.8 branch of PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.8

Prepare the runtime environment

Create the Python environment with Anaconda (this post uses Python 3.8.5). For details see the repository notes: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/environment.md

Install the requirements

From the project root, install requirements.txt. If the download is slow, try a mirror in China.

    pip install -r requirements.txt

Recognizing a bank card number accurately takes roughly two steps: detect where the text is, then run text recognition on the detected regions. If the image is not upright, orientation detection is needed as well. This post only trains and runs inference for text detection and text recognition. See the official notes for details: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/training.md

2. Text detection

Training data

There are two sources of training data: label your own data with PPOCRLabel, or use a dataset someone else has already labeled.

PPOCRLabel installation and usage tutorial: https://www.jianshu.com/p/4133fbf91981
Labeled text-detection dataset with more than 3000 bank card images: https://download.csdn.net/download/YY007H/85374437

Training the model

Before training, have a look at the official documentation, which explains training and fine-tuning in detail:
Text detection: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/detection.md
Fine-tuning: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/finetune.md

This post fine-tunes the PP-OCRv3 model (config file: ch_PP-OCRv3_det_student.yml, pretrained model: ch_PP-OCRv3_det_distill_train.tar).

Download ch_PP-OCRv3_det_distill_train.tar from: https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_det_distill_train.tar

The ch_PP-OCRv3_det_student.yml config file:

    Global:
      debug: false
      use_gpu: true
      epoch_num: 1200  # maximum number of training epochs
      log_smooth_window: 20  # length of the log queue; each print outputs the median of the queue
      print_batch_step: 2  # log printing interval
      save_model_dir: ./output/ch_PP-OCRv3_det_student/  # output model path
      save_epoch_step: 1200  # model saving interval
      eval_batch_step: 1500  # model evaluation interval
      cal_metric_during_train: false
      pretrained_model: ./pretrain_models/ch_PP-OCRv3_det_distill_train/student.pdparams  # pretrained model path
      checkpoints: null
      save_inference_dir: null
      use_visualdl: True
      infer_img: doc/imgs_en/img_10.jpg
      save_res_path: ./output/ch_PP-OCRv3_det_student/predicts_db.txt
      distributed: true

    Architecture:
      model_type: det
      algorithm: DB
      Transform:
      Backbone:
        name: MobileNetV3
        scale: 0.5
        model_name: large
        disable_se: True
      Neck:
        name: RSEFPN
        out_channels: 96
        shortcut: True
      Head:
        name: DBHead
        k: 50

    Loss:
      name: DBLoss
      balance_loss: true
      main_loss_type: DiceLoss
      alpha: 5
      beta: 10
      ohem_ratio: 3

    Optimizer:
      name: Adam
      beta1: 0.9
      beta2: 0.999
      lr:
        name: Cosine
        learning_rate: 0.001
        warmup_epoch: 2
      regularizer:
        name: L2
        factor: 0

    PostProcess:
      name: DBPostProcess
      thresh: 0.3
      box_thresh: 0.6
      max_candidates: 1000
      unclip_ratio: 1.5

    Metric:
      name: DetMetric
      main_indicator: hmean

    Train:
      dataset:
        name: SimpleDataSet
        data_dir: ./pretrain_models/train_data/  # labeled dataset path
        label_file_list:
        - "./pretrain_models/train_data/bank/bank1/real_det_train.txt"
        - "./pretrain_models/train_data/bank/bank2/real_det_train.txt"
        - "./pretrain_models/train_data/bank/bank3/real_det_train.txt"
        ratio_list: [1.0, 1.0, 1.0]
        transforms:
        - DecodeImage:
            img_mode: BGR
            channel_first: false
        - DetLabelEncode: null
        - IaaAugment:
            augmenter_args:
            - type: Fliplr
              args:
                p: 0.5
            - type: Affine
              args:
                rotate:
                - -10
                - 10
            - type: Resize
              args:
                size:
                - 0.5
                - 3
        - EastRandomCropData:
            size:
            - 960
            - 960
            max_tries: 50
            keep_ratio: true
        - MakeBorderMap:
            shrink_ratio: 0.4
            thresh_min: 0.3
            thresh_max: 0.7
        - MakeShrinkMap:
            shrink_ratio: 0.4
            min_text_size: 8
        - NormalizeImage:
            scale: 1./255.
            mean:
            - 0.485
            - 0.456
            - 0.406
            std:
            - 0.229
            - 0.224
            - 0.225
            order: hwc
        - ToCHWImage: null
        - KeepKeys:
            keep_keys:
            - image
            - threshold_map
            - threshold_mask
            - shrink_map
            - shrink_mask
      loader:
        shuffle: true
        drop_last: false
        batch_size_per_card: 14  # batch size per GPU
        num_workers: 14

    Eval:
      dataset:
        name: SimpleDataSet
        data_dir: ./pretrain_models/train_data/
        label_file_list:
        - "./pretrain_models/train_data/bank/bank1/real_det_test.txt"
        - "./pretrain_models/train_data/bank/bank2/real_det_test.txt"
        - "./pretrain_models/train_data/bank/bank3/real_det_test.txt"
        ratio_list: [1.0, 1.0, 1.0]
        transforms:
        - DecodeImage:
            img_mode: BGR
            channel_first: false
        - DetLabelEncode: null
        - DetResizeForTest: null
        - NormalizeImage:
            scale: 1./255.
            mean:
            - 0.485
            - 0.456
            - 0.406
            std:
            - 0.229
            - 0.224
            - 0.225
            order: hwc
        - ToCHWImage: null
        - KeepKeys:
            keep_keys:
            - image
            - shape
            - polys
            - ignore_tags
      loader:
        shuffle: false
        drop_last: false
        batch_size_per_card: 1
        num_workers: 8

Parameter reference for the config: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/config.md

Adjust batch_size_per_card and num_workers under Train and Eval to suit your own machine. Under Eval, batch_size_per_card must be 1.

Run the training:

    python -m paddle.distributed.launch --gpus 0 tools/train.py -c pretrain_models/ch_PP-OCRv3_det_student.yml

My machine has a 3060 12G GPU and training took about 4 days. Your time will depend on your hardware and on the training parameters.

Validating the model

Run the following command:

    python tools/infer_det.py -c pretrain_models/ch_PP-OCRv3_det_student.yml -o Global.pretrained_model="./output/ch_PP-OCRv3_det_student/best_accuracy" Global.infer_img="./output/ch_PP-OCRv3_det_student/det_input/03.png"

Exporting the model

Two source changes fix the mismatch between the exported model and the trained model, and the problem of detection boxes being too small.

File 1: tools/infer/predict_det.py. Change:

    "DetResizeForTest": {
        # "limit_side_len": args.det_limit_side_len,
        # "limit_type": args.det_limit_type,
        "resize_long": args.det_resize_long,
    }

File 2: tools/infer/utility.py. Change:

    parser.add_argument("--det_resize_long", type=float, default=960)
    parser.add_argument("--det_db_unclip_ratio", type=float, default=3)

Then export the model:

    python tools/export_model.py -c pretrain_models/ch_PP-OCRv3_det_student.yml -o Global.pretrained_model="./output/ch_PP-OCRv3_det_student/best_accuracy" Global.save_inference_dir="./inference/ch_PP-OCRv3_det_student/"

Running inference

Pick an image and test the detection result:

    python tools/infer/predict_det.py --det_algorithm="DB" --det_model_dir="./inference/ch_PP-OCRv3_det_student/" --image_dir="./output/ch_PP-OCRv3_det_student/det_input/03.png" --use_gpu=True

3. Text recognition

Training data

Dataset of cropped bank card number images for recognition training: https://download.csdn.net/download/YY007H/88571384

Training the model

Before training, have a look at the official documentation, which explains training and fine-tuning in detail:
Text recognition: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/recognition.md
Fine-tuning: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/finetune.md

This post fine-tunes the PP-OCRv3 model (config file: ch_PP-OCRv3_rec_distillation.yml, pretrained model: ch_PP-OCRv3_rec_train.tar).

Download ch_PP-OCRv3_rec_train.tar from: https://paddleocr.bj.bcebos.com/PP-OCRv3/chinese/ch_PP-OCRv3_rec_train.tar

The ch_PP-OCRv3_rec_distillation.yml config file:

    Global:
      debug: false
      use_gpu: true
      epoch_num: 500
      log_smooth_window: 20
      print_batch_step: 10
      save_model_dir: ./output/ch_PP-OCRv3_rec_train/  # output model path
      save_epoch_step: 50
      eval_batch_step: 100
      cal_metric_during_train: true
      pretrained_model: ./pretrain_models/ch_PP-OCRv3_rec_train/best_accuracy.pdparams  # pretrained model path
      checkpoints:
      save_inference_dir:
      use_visualdl: false
      infer_img: doc/imgs_words/ch/word_1.jpg
      character_dict_path: ppocr/utils/ppocr_keys_bank.txt
      max_text_length: &max_text_length 25
      infer_mode: false
      use_space_char: False
      distributed: true
      save_res_path: ./output/ch_PP-OCRv3_rec_train/rec/ch_PP-OCRv3_rec_train.txt
      d2s_train_image_shape: [3, 48, -1]

    Optimizer:
      name: Adam
      beta1: 0.9
      beta2: 0.999
      lr:
        name: Piecewise
        decay_epochs: [700]
        values: [0.0005, 0.00005]
        warmup_epoch: 5
      regularizer:
        name: L2
        factor: 3.0e-05

    Architecture:
      model_type: &model_type "rec"
      name: DistillationModel
      algorithm: Distillation
      Models:
        Teacher:
          pretrained:
          freeze_params: false
          return_all_feats: true
          model_type: *model_type
          algorithm: SVTR_LCNet
          Transform:
          Backbone:
            name: MobileNetV1Enhance
            scale: 0.5
            last_conv_stride: [1, 2]
            last_pool_type: avg
            last_pool_kernel_size: [2, 2]
          Head:
            name: MultiHead
            head_list:
            - CTCHead:
                Neck:
                  name: svtr
                  dims: 64
                  depth: 2
                  hidden_dims: 120
                  use_guide: False
                Head:
                  name: CTCHead
                  fc_decay: 0.00001
            - SARHead:
                enc_dim: 512
                max_text_length: *max_text_length
        Student:
          pretrained:
          freeze_params: false
          return_all_feats: true
          model_type: *model_type
          algorithm: SVTR_LCNet
          Transform:
          Backbone:
            name: MobileNetV1Enhance
            scale: 0.5
            last_conv_stride: [1, 2]
            last_pool_type: avg
            last_pool_kernel_size: [2, 2]
          Head:
            name: MultiHead
            head_list:
            - CTCHead:
                Neck:
                  name: svtr
                  dims: 64
                  depth: 2
                  hidden_dims: 120
                  use_guide: True
                Head:
                  fc_decay: 0.00001
            - SARHead:
                enc_dim: 512
                max_text_length: *max_text_length

    Loss:
      name: CombinedLoss
      loss_config_list:
      - DistillationDMLLoss:
          weight: 1.0
          act: "softmax"
          use_log: true
          model_name_pairs:
          - ["Student", "Teacher"]
          key: head_out
          multi_head: True
          dis_head: ctc
          name: dml_ctc
      - DistillationDMLLoss:
          weight: 0.5
          act: "softmax"
          use_log: true
          model_name_pairs:
          - ["Student", "Teacher"]
          key: head_out
          multi_head: True
          dis_head: sar
          name: dml_sar
      - DistillationDistanceLoss:
          weight: 1.0
          mode: "l2"
          model_name_pairs:
          - ["Student", "Teacher"]
          key: backbone_out
      - DistillationCTCLoss:
          weight: 1.0
          model_name_list: ["Student", "Teacher"]
          key: head_out
          multi_head: True
      - DistillationSARLoss:
          weight: 1.0
          model_name_list: ["Student", "Teacher"]
          key: head_out
          multi_head: True

    PostProcess:
      name: DistillationCTCLabelDecode
      model_name: ["Student", "Teacher"]
      key: head_out
      multi_head: True

    Metric:
      name: DistillationMetric
      base_metric_name: RecMetric
      main_indicator: acc
      key: "Student"
      ignore_space: False

    Train:
      dataset:
        name: SimpleDataSet
        data_dir: ./pretrain_models/rec_train_data/  # labeled dataset path
        ext_op_transform_idx: 1
        label_file_list:
        - "./pretrain_models/rec_train_data/bank/rec/bank1/real_rec_train.txt"
        - "./pretrain_models/rec_train_data/bank/rec/bank2/real_rec_train.txt"
        - "./pretrain_models/rec_train_data/bank/rec/bank3/real_rec_train.txt"
        ratio_list: [1.0, 1.0, 1.0]
        transforms:
        - DecodeImage:
            img_mode: BGR
            channel_first: false
        - RecAug:
        - MultiLabelEncode:
        - RecResizeImg:
            image_shape: [3, 48, 320]
        - KeepKeys:
            keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
      loader:
        shuffle: true
        batch_size_per_card: 32
        drop_last: true
        num_workers: 8

    Eval:
      dataset:
        name: SimpleDataSet
        data_dir: ./pretrain_models/rec_train_data/
        label_file_list:
        - "./pretrain_models/rec_train_data/bank/rec/bank1/real_rec_test.txt"
        - "./pretrain_models/rec_train_data/bank/rec/bank2/real_rec_test.txt"
        - "./pretrain_models/rec_train_data/bank/rec/bank3/real_rec_test.txt"
        ratio_list: [1.0, 1.0, 1.0]
        transforms:
        - DecodeImage:
            img_mode: BGR
            channel_first: false
        - MultiLabelEncode:
        - RecResizeImg:
            image_shape: [3, 48, 320]
        - KeepKeys:
            keep_keys:
            - image
            - label_ctc
            - label_sar
            - length
            - valid_ratio
      loader:
        shuffle: false
        drop_last: false
        batch_size_per_card: 32
        num_workers: 8

Parameter reference for the config: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.8/doc/doc_ch/config.md

Adjust batch_size_per_card and num_workers under Train and Eval to suit your own machine. Note that with epoch_num set to 500, the Piecewise decay at epoch 700 is never reached, so after warm-up the learning rate stays at 0.0005 for the whole run.

Run the training:

    python -m paddle.distributed.launch --gpus 0 tools/train.py -c pretrain_models/ch_PP-OCRv3_rec_distillation.yml

Export the model:

    python tools/export_model.py -c pretrain_models/ch_PP-OCRv3_rec_distillation.yml -o Global.pretrained_model="./output/ch_PP-OCRv3_rec_train/best_accuracy" Global.save_inference_dir="./inference/ch_PP-OCRv3_rec_train/"

The recognition training set has roughly 10,000 samples. On the same 3060 12G machine, training took about 2 days. Your time will depend on your hardware and on the training parameters.

4. Wiring PaddleOCR detection and recognition together with Flask + Layui

Install Flask

Flask is a Python web framework. It is simpler than Django and easier to pick up.

    pip install Flask

Create a web folder in the project root to hold the Flask code. For a pleasant online experience the front end uses the Layui framework. Download the Layui static files and put them under the web directory. Download: https://layui.dev/

The static folder holds css, font, js and other static files. The templates folder holds the html (it is Flask's default template directory). The temp folder holds the images submitted for OCR and their results. (The directory screenshot is not preserved.)

The front-end layout is simple: a menu at the top, file upload on the left, and results on the right. (The layout screenshot is not preserved.)

Wiring up the whole flow

Create app.py as the web API entry point. The post does not show the imports. The snippets below assume something along these lines, where the tools.* module paths are those of the PaddleOCR 2.8 repository and the repository root must be on sys.path:

    import copy
    import os
    import sys
    import time

    import cv2
    import requests
    from flask import Blueprint, Flask, jsonify, render_template, request
    from PIL import Image

    from params import read_params
    from tools.infer.predict_system import TextSystem
    from tools.infer.utility import draw_ocr_box_txt, parse_args

Add a view that returns the html page:

    @app.route('/')
    def index_view():
        return render_template('index.html')

Register temp as a static directory so the front end can reach the recognized files in it:

    app = Flask(__name__)
    temp_bp = Blueprint('temp_bp', __name__, static_folder='temp')
    app.register_blueprint(temp_bp, url_prefix='/')

Add the endpoint that accepts an uploaded file for recognition:

    @app.route("/upload", methods=["POST"])
    def upload():
        file = request.files.get("file")
        name, extension = os.path.splitext(os.path.basename(file.filename))
        # Store the upload under a timestamp-based name in temp/.
        filename = str(int(time.time())) + extension
        filepath = os.path.join("temp", filename)
        file.save(filepath)

        bank_card_num, bank_card_file = ocr(filepath, True)
        return jsonify({'success': True, 'bank_card_num': bank_card_num, 'bank_card_file': bank_card_file})

Add the endpoint that downloads a remote file for recognition. In the original post this view is also named ocr, the same as the helper below, which only works because the helper is defined later in the file. It is renamed to ocr_view here to avoid the clash; the URL is unchanged.

    @app.route("/ocr", methods=["POST"])
    def ocr_view():
        url = request.form['url']
        file_url = url.split('/')[-1]
        name, extension = os.path.splitext(os.path.basename(file_url))
        filename = str(int(time.time())) + extension
        filepath = os.path.join("temp", filename)

        # Download the remote image into temp/.
        r = requests.get(url)
        with open(filepath, 'wb') as temp_file:
            temp_file.write(r.content)

        bank_card_num, bank_card_file = ocr(filepath, False)
        return jsonify({'success': True, 'bank_card_num': bank_card_num, 'bank_card_file': bank_card_file})

Add the recognition method:

    def ocr(image_file, is_visualize):
        cfg = merge_configs()
        text_sys = TextSystem(cfg)

        predicted_data = read_image(image_file)
        dt_boxes, rec_res, time_dict = text_sys(predicted_data)

        # Nothing detected or recognized: return no card number instead of
        # failing on rec_res[0].
        if not rec_res:
            return None, None

        result_file = None
        if is_visualize:
            result_file = draw_result(dt_boxes, rec_res, image_file)

        # The text of the first recognized box is taken as the card number.
        return rec_res[0][0], result_file

Add the method that loads the configuration:

    def merge_configs():
        # Hide the real command-line arguments so parse_args() returns its defaults.
        backup_argv = copy.deepcopy(sys.argv)
        sys.argv = sys.argv[:1]
        cfg = parse_args()

        # Overlay the values from params.py.
        update_cfg_map = vars(read_params())
        for key in update_cfg_map:
            cfg.__setattr__(key, update_cfg_map[key])

        sys.argv = copy.deepcopy(backup_argv)
        return cfg

Add the method that loads an image:

    def read_image(img_path):
        assert os.path.isfile(img_path), "The {} isn't a valid file.".format(img_path)
        img = cv2.imread(img_path)
        if img is None:
            return None
        return img

Add the method that draws the recognition result onto the image:

    def draw_result(dt_boxes, rec_res, image_file):
        img = read_image(image_file)
        image = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        boxes = dt_boxes
        txts = [rec_res[i][0] for i in range(len(rec_res))]
        scores = [rec_res[i][1] for i in range(len(rec_res))]

        draw_img = draw_ocr_box_txt(
            image,
            boxes,
            txts,
            scores,
            0.5,
            font_path="../doc/fonts/simfang.ttf")

        # Save next to the input as <name>_result<ext>, converting RGB back to BGR.
        name, extension = os.path.splitext(os.path.basename(image_file))
        directory = os.path.dirname(image_file)
        result_file = os.path.join(directory, name + "_result" + extension)
        cv2.imwrite(result_file, draw_img[:, :, ::-1])
        return result_file

Add the startup code:

    if __name__ == '__main__':
        app.run(host="0.0.0.0", port=8321)

Create a params.py file to hold the configuration. The Config class is not shown in the post; it is assumed here to be an empty attribute container.

    class Config(object):
        pass


    def read_params():
        cfg = Config()

        # params for text detector
        cfg.det_algorithm = "DB"
        cfg.det_model_dir = "/data/paddle_ocr/models/ch_PP-OCRv3_det_student/"

        # params for text recognizer
        cfg.rec_model_dir = "/data/paddle_ocr/models/ch_PP-OCRv3_rec_train/Student"
        cfg.rec_char_dict_path = "/data/paddle_ocr/models/ch_PP-OCRv3_rec_train/ppocr_keys_bank.txt"

        cfg.use_gpu = False
        cfg.ir_optim = True

        return cfg

The trained detection model, recognition model and recognition dictionary all go in the models directory under the project root. Whether to enable the GPU at deployment depends on the server's hardware.

Deploying to Linux

Set up the Python environment on Linux and install the dependencies from requirements.txt. In the web directory, create a run.sh file with the following content:

    nohup /usr/bin/python3 /data/paddle_ocr/web/app.py > app.log 2>&1 &

Start it with:

    cd /data/paddle_ocr/web
    chmod +x run.sh
    sh run.sh

Besides starting it directly with the python command, you can also deploy the Flask application behind a WSGI server such as Gunicorn or uWSGI. That is not covered here.

5. Results

Online upload demo: (screenshot not preserved)

Remote download call: (screenshot not preserved)

Source code: 基于PaddleOCR+Flask+Layui的webapi平台（一、银行卡识别）
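A possible refinement of the recognition method in section 4, which takes the text of the first recognized box (rec_res[0][0]) as the card number. If the detector also picks up other text on the card, such as an expiry date, the first box is not necessarily the number. The sketch below is not part of the original project. It assumes rec_res is a list of (text, score) pairs, as the drawing code in section 4 implies, and the helper name pick_card_number, the 12 to 19 digit length range and the 0.5 score threshold are all illustrative assumptions to tune for your own cards.

```python
import re
from typing import Optional, Sequence, Tuple

# Assumed plausible length range for a card number; adjust for your cards.
MIN_DIGITS = 12
MAX_DIGITS = 19


def pick_card_number(
    rec_res: Optional[Sequence[Tuple[str, float]]],
    min_score: float = 0.5,
) -> Optional[str]:
    """Return the highest-scoring digit string of plausible length, or None."""
    best_text, best_score = None, min_score
    for text, score in rec_res or []:
        # Keep digits only, dropping spaces and stray punctuation.
        digits = re.sub(r"\D", "", text)
        if MIN_DIGITS <= len(digits) <= MAX_DIGITS and score >= best_score:
            best_text, best_score = digits, score
    return best_text


# The first box is an expiry date, so it is skipped in favour of the number.
sample = [("12/28", 0.97), ("6222 0212 3456 7890 123", 0.93)]
print(pick_card_number(sample))  # 6222021234567890123
```

With this in place, the recognition method could return pick_card_number(rec_res) instead of rec_res[0][0]. It also returns None when nothing was recognized, so the caller can report a failure rather than raise an error.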
