Skip to content

Segmentation fault when using integer models for LSTM training #1573

@DihuiLai

Description

@DihuiLai

I am running the tutorial on training lstm by fine tuning it following the link https://github.com/tesseract-ocr/tesseract/wiki/TrainingTesseract-4.00#fine-tuning-for-impact

The training works OK when I follow the tutorial instruction and fine tune from .lstm extracted from tessdata/best/eng.traineddata. However the training failed when I try to extract .lstm from tessdata/eng.traineddata

Environment

  • Tesseract Version: tesseract 4.0.0-beta.1-232-g45a6

  • Platform: <ubuntu 16.04>

The code I am trying to execute:
training/lstmtraining --model_output ~/tesstutorial/impact_from_full/impact --continue_from ~/tesstutorial/impact_from_full/eng.lstm --traineddata tessdata/eng.traineddata --train_listfile ~/tesstutorial/engeval/eng.training_files.txt --max_iterations 400

The eng.lstm is extracted by "training/combine_tessdata -e tessdata/eng.traineddata ~/tesstutorial/impact_from_full/eng.lstm"

The code will work if I use the tessdata/best/eng.traineddata

The error that I got:
Loaded file /home/dlai/tesstutorial/impact_from_full/eng.lstm, unpacking...
Warning: LSTMTrainer deserialized an LSTMRecognizer!
Continuing from /home/dlai/tesstutorial/impact_from_full/eng.lstm
Loaded 72/72 pages (1-72) of document /home/dlai/tesstutorial/engeval/eng.FreeSans.exp0.lstmf
!int_mode_:Error:Assert failed:in file weightmatrix.cpp, line 244
Segmentation fault (core dumped)

Thanks very much

Dihui

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions