Different results using python package and C++ inference #14630

@timminator

🔎 Search before asking

  • I have searched the PaddleOCR Docs and found no similar bug report.
  • I have searched the PaddleOCR Issues and found no similar bug report.
  • I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (Problem Description)

Over the last few days I tried the C++ inference build of PaddleOCR and noticed that its results differ slightly from those of the Python package when both are run on the same image with the same parameters. Here are two examples.

The first image:
Image

Here is the result when using the C++ version:

.\paddleocr.exe -det_model_dir="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\det\en\en_PP-OCRv3_det_infer" 
-rec_model_dir="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\rec\latin\latin_PP-OCRv3_rec_infer" 
-image_dir="C:\Users\GPUVM\Downloads\402374996-663c3ab7-10a0-475b-820a-df862a4fc049.png" -use_angle_cls="true" 
-cls_model_dir="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\cls\ch_ppocr_mobile_v2.0_cls_infer" 
-rec_char_dict_path="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\ppocr\utils\dict\latin_dict.txt"
...
predict img: C:\Users\GPUVM\Downloads\402374996-663c3ab7-10a0-475b-820a-df862a4fc049.png
0       det boxes: [[114,21],[564,20],[564,44],[114,45]] rec text: et cesser de orétendre chercher rec score: 0.942833
1       det boxes: [[140,59],[539,59],[539,85],[140,85]] rec text: qui se fait passer pour nous. rec score: 0.988313
The detection visualized image saved in ./output//402374996-663c3ab7-10a0-475b-820a-df862a4fc049.png

And here using Python:

paddleocr --image_dir "C:\Users\GPUVM\Downloads\402374996-663c3ab7-10a0-475b-820a-df862a4fc049.png" 
--det_model_dir "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\det\en\en_PP-OCRv3_det_infer" 
--rec_model_dir "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\rec\latin\latin_PP-OCRv3_rec_infer" 
--cls_model_dir "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\cls\ch_ppocr_mobile_v2.0_cls_infer" 
--rec_char_dict_path "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\ppocr\utils\dict\latin_dict.txt" --use_angle_cls true
...
[2025/02/07 01:21:04] ppocr INFO: [[[114.0, 22.0], [565.0, 21.0], [565.0, 45.0], [114.0, 46.0]], ('et cesser de prétenclre chercher', 0.9665742516517639)]
[2025/02/07 01:21:04] ppocr INFO: [[[141.0, 60.0], [539.0, 60.0], [539.0, 86.0], [141.0, 86.0]], ('qui se fait passer pour nous.', 0.9676637053489685)]

Both versions have trouble with the word "prétendre", but they misread it differently: C++ returns "orétendre" and Python returns "prétenclre". The recognition confidence differs as well.

Here is another example:
Image

Here is the result when using the C++ version:

.\paddleocr.exe -det_model_dir="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\det\en\en_PP-OCRv3_det_infer"
 -rec_model_dir="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\rec\en\en_PP-OCRv4_rec_infer" 
-image_dir="C:\Users\GPUVM\Downloads\402374329-cb1397c6-888f-45e6-b205-4f18f830430f.png" -use_angle_cls="true" 
-cls_model_dir="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\cls\ch_ppocr_mobile_v2.0_cls_infer" 
-rec_char_dict_path="C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\ppocr\utils\en_dict.txt"
...
predict img: C:\Users\GPUVM\Downloads\402374329-cb1397c6-888f-45e6-b205-4f18f830430f.png
0       det boxes: [[91,55],[734,59],[734,117],[91,113]] rec text: My mommy always said rec score: 0.965158
1       det boxes: [[89,149],[731,153],[731,198],[89,194]] rec text: there were no monsters. rec score: 0.984143
The detection visualized image saved in ./output//402374329-cb1397c6-888f-45e6-b205-4f18f830430f.png

And here using Python:

paddleocr --image_dir "C:\Users\GPUVM\Downloads\402374329-cb1397c6-888f-45e6-b205-4f18f830430f.png" 
--det_model_dir "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\det\en\en_PP-OCRv3_det_infer" 
--rec_model_dir "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\rec\en\en_PP-OCRv4_rec_infer" 
--cls_model_dir "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\build\Release\PP-OCRv4\cls\ch_ppocr_mobile_v2.0_cls_infer" 
--rec_char_dict_path "C:\Users\GPUVM\Desktop\projects\cpp\PaddleOCR\PaddleOCR\deploy\cpp_infer\ppocr\utils\en_dict.txt" --use_angle_cls true
...
[2025/02/07 01:29:31] ppocr INFO: [[[92.0, 56.0], [735.0, 60.0], [734.0, 118.0], [91.0, 113.0]], ('My mommy always said', 0.9898611307144165)]
[2025/02/07 01:29:31] ppocr INFO: [[[90.0, 149.0], [732.0, 153.0], [731.0, 198.0], [89.0, 194.0]], ('there were no monsters.', 0.9909024238586426)]

Both versions recognize the text correctly, but the confidence scores differ again.
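
To put numbers on the differences in the two examples, the logged values can be compared directly. The following is a minimal sketch, not part of PaddleOCR: the `compare` helper and its dictionary keys are hypothetical names, and the boxes, texts, and scores are copied by hand from the logs above.

```python
# Each entry: (box corners, recognized text, recognition score),
# copied from the C++ and Python logs quoted in this issue.
cpp_results = [
    ([[114, 21], [564, 20], [564, 44], [114, 45]], "et cesser de orétendre chercher", 0.942833),
    ([[140, 59], [539, 59], [539, 85], [140, 85]], "qui se fait passer pour nous.", 0.988313),
    ([[91, 55], [734, 59], [734, 117], [91, 113]], "My mommy always said", 0.965158),
    ([[89, 149], [731, 153], [731, 198], [89, 194]], "there were no monsters.", 0.984143),
]
py_results = [
    ([[114.0, 22.0], [565.0, 21.0], [565.0, 45.0], [114.0, 46.0]], "et cesser de prétenclre chercher", 0.9665742516517639),
    ([[141.0, 60.0], [539.0, 60.0], [539.0, 86.0], [141.0, 86.0]], "qui se fait passer pour nous.", 0.9676637053489685),
    ([[92.0, 56.0], [735.0, 60.0], [734.0, 118.0], [91.0, 113.0]], "My mommy always said", 0.9898611307144165),
    ([[90.0, 149.0], [732.0, 153.0], [731.0, 198.0], [89.0, 194.0]], "there were no monsters.", 0.9909024238586426),
]


def compare(cpp, py):
    """Pair up results line by line and summarize how far they differ."""
    rows = []
    for (c_box, c_text, c_score), (p_box, p_text, p_score) in zip(cpp, py):
        # Largest absolute difference over all x and y corner coordinates.
        max_shift = max(
            abs(c - p)
            for c_pt, p_pt in zip(c_box, p_box)
            for c, p in zip(c_pt, p_pt)
        )
        rows.append({
            "same_text": c_text == p_text,
            "max_box_shift_px": max_shift,
            "score_delta": p_score - c_score,  # Python minus C++
        })
    return rows


report = compare(cpp_results, py_results)
for row in report:
    print(row)
```

On these logs the box corners shift by at most one pixel, and the score difference goes in both directions (Python is lower on the second line of the first image and higher on the other three), so this only quantifies the gap and does not identify its cause.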

My question: is this behaviour expected and correct?
If so, what causes it?

🏃‍♂️ Environment (Runtime Environment)

Windows 11 23H2
PaddleOCR 2.9
paddlepaddle-gpu 3.0.0-rc0 for C++ and Python

🌰 Minimal Reproducible Example (minimal demo that reproduces the problem)

No further code required.
