Closed
Labels: OSD (Orientation and Script Detection)
Description
Environment
- Tesseract Version: 4.00 alpha
- Commit Number: 1b0379c
- Platform: Presumably all - found on Ubuntu 17.04, confirmed on macOS Sierra
Current Behavior:
When using eng.traineddata from tessdata_fast in -psm 0 (OSD only) mode, Tesseract crashes with a segmentation fault for all input files. Example:
b095cb28b6c868b99d19e1c64b48a626bc4cb944 osd.traineddata
31abd495e0f719db4f524c447e9d855124a0b0d6 eng.traineddata
$ tesseract -psm 0 testing/phototest.tif stdout
Segmentation fault
Behaviour is the same using tessdata_best.
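For scripted reproduction, a crash like the one above can be told apart from an ordinary error exit by looking at the child process's return code. The following is only a minimal sketch, assuming a POSIX system (where Python's subprocess module reports death by signal N as return code -N) and a `tesseract` binary on the PATH. The helper names `describe_exit` and `run_osd` are hypothetical and not part of Tesseract; the command line mirrors the one in this report.

```python
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"crashed ({name})"
    return f"failed (exit status {returncode})"


def run_osd(image_path: str, tesseract: str = "tesseract") -> str:
    """Run the OSD command from this report and label how it exited.

    Hypothetical helper: assumes the same -psm 0 invocation as above.
    """
    proc = subprocess.run(
        [tesseract, "-psm", "0", image_path, "stdout"],
        capture_output=True,
        text=True,
    )
    return describe_exit(proc.returncode)
```

Calling `run_osd("testing/phototest.tif")` against each traineddata set would then give a one-line verdict per configuration, which is convenient when checking several builds.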
After replacing with tessdata/eng.traineddata, OSD works fine:
b095cb28b6c868b99d19e1c64b48a626bc4cb944 osd.traineddata
cdcfae0c5c272b5b2f0406cc91ac5d022f7df7f4 eng.traineddata
$ tesseract -psm 0 testing/phototest.tif stdout
Page 1
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 15.98
Script: Latin
Script confidence: 460.00
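The successful output above is line-oriented "key: value" text, so it is easy to check automatically. Below is a small sketch that parses it into a dictionary. The function name `parse_osd` is hypothetical, and the format is assumed to be exactly what is shown in this report; other Tesseract versions may print different fields.

```python
def parse_osd(text: str) -> dict:
    """Parse 'Key: value' lines of OSD output into a dict.

    Numeric values become floats; everything else stays a string.
    Lines without a colon (such as the 'Page 1' header) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        try:
            fields[key.strip()] = float(value)
        except ValueError:
            fields[key.strip()] = value
    return fields


# Sample copied from the working run in this report.
sample = """Page 1
Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 15.98
Script: Latin
Script confidence: 460.00
"""

osd = parse_osd(sample)
print(osd["Script"], osd["Orientation in degrees"])
```

A regression check could then assert that `osd` contains a "Script" entry instead of comparing raw text.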
I discovered this with the packages from the Ubuntu 17.04 PPA (ppa:alex-p/tesseract-ocr) and reproduced it on macOS with Tesseract built from source.
Stack trace (Linux version)
#0 0x00007f5e14a5494d in tesseract::Classify::CharNormClassifier(TBLOB*, tesseract::TrainingSample const&, ADAPT_RESULTS*) () from /usr/lib/libtesseract.so.4
#1 0x00007f5e14a55885 in tesseract::Classify::DoAdaptiveMatch(TBLOB*, ADAPT_RESULTS*) () from /usr/lib/libtesseract.so.4
#2 0x00007f5e14a55f89 in tesseract::Classify::AdaptiveClassifier(TBLOB*, BLOB_CHOICE_LIST*) () from /usr/lib/libtesseract.so.4
#3 0x00007f5e14973c82 in os_detect_blob(BLOBNBOX*, OrientationDetector*, ScriptDetector*, OSResults*, tesseract::Tesseract*) () from /usr/lib/libtesseract.so.4
#4 0x00007f5e1497413b in os_detect_blobs(GenericVector<int> const*, BLOBNBOX_CLIST*, OSResults*, tesseract::Tesseract*) () from /usr/lib/libtesseract.so.4
#5 0x00007f5e1497453d in os_detect(TO_BLOCK_LIST*, OSResults*, tesseract::Tesseract*) () from /usr/lib/libtesseract.so.4
#6 0x00007f5e14974782 in orientation_and_script_detection(STRING&, OSResults*, tesseract::Tesseract*) () from /usr/lib/libtesseract.so.4
#7 0x00007f5e14943f8f in tesseract::TessBaseAPI::DetectOS(OSResults*) () from /usr/lib/libtesseract.so.4
#8 0x00007f5e149440b9 in tesseract::TessBaseAPI::DetectOrientationScript(int*, float*, char const**, float*) () from /usr/lib/libtesseract.so.4
#9 0x00007f5e149441b1 in tesseract::TessBaseAPI::GetOsdText(int) () from /usr/lib/libtesseract.so.4
#10 0x00007f5e1494bcd4 in tesseract::TessOsdRenderer::AddImageHandler(tesseract::TessBaseAPI*) () from /usr/lib/libtesseract.so.4