Closed
Description
The usage instructions are given in https://github.com/tesseract-ocr/tesseract/blob/master/training/combine_lang_model.cpp#L43-58:
  // Check validity of input flags.
  if (FLAGS_input_unicharset.empty() || FLAGS_script_dir.empty() ||
      FLAGS_output_dir.empty() || FLAGS_lang.empty()) {
    tprintf("Usage: %s --input_unicharset filename --script_dir dirname\n",
            argv[0]);
    tprintf(" --output_dir rootdir --lang lang [--lang_is_rtl]\n");
    tprintf(" [--words file --puncs file --numbers file]\n");
    tprintf("Sets properties on the input unicharset file, and writes:\n");
    tprintf("rootdir/lang/lang.charset_size=ddd.txt\n");
    tprintf("rootdir/lang/lang.traineddata\n");
    tprintf("rootdir/lang/lang.unicharset\n");
    tprintf("If the 3 word lists are provided, the dawgs are also added to");
    tprintf(" the traineddata file.\n");
    tprintf("The output unicharset and charset_size files are just for human");
    tprintf(" readability.\n");
However, the usage information actually displayed is:
USAGE: combine_lang_model
--lang_is_rtl True if lang being processed is written right-to-left (type:bool default:false)
--pass_through_recoder If true, the recoder is a simple pass-through of the unicharset. Otherwise, potentially a compression of it (type:bool default:false)
--input_unicharset Unicharset to complete and use in encoding (type:string default:)
--script_dir Directory name for input script unicharsets (type:string default:)
--words File listing words to use for the system dictionary (type:string default:)
--puncs File listing punctuation patterns (type:string default:)
--numbers File listing number patterns (type:string default:)
--output_dir Root directory for output files (type:string default:)
--version_str Version string to add to traineddata file (type:string default:)
--lang Name of language being processed (type:string default:)
So it looks like the program calls the common training argument parser, which prints its own flag listing and exits, so the tool-specific usage text above is not shown:
int main(int argc, char** argv) {
  tesseract::ParseCommandLineFlags(argv[0], &argc, &argv, true);
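The suspected control flow can be sketched in isolation. This is a minimal model and does not use Tesseract's actual code: `Flags`, `CommonParse`, `WhichUsage`, and `UsageSource` are hypothetical names. It assumes the generic listing appears because the shared parser handles a `--help` request itself and exits before the tool's own flag check runs. Instead of exiting, the stand-in parser returns `true` at the point where a real parser would print its listing and stop.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not Tesseract's real API.
enum class UsageSource { kNone, kCommonParser, kToolSpecific };

struct Flags {
  std::string input_unicharset, script_dir, output_dir, lang;
};

// Models a shared flag parser. Returns true when it handled "--help" itself,
// which is where a real parser would print its generic listing and exit.
bool CommonParse(const std::vector<std::string>& args, Flags* flags) {
  assert(flags != nullptr);
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i] == "--help") return true;
    if (i + 1 >= args.size()) continue;  // flag without a value: ignore
    if (args[i] == "--input_unicharset") flags->input_unicharset = args[++i];
    else if (args[i] == "--script_dir") flags->script_dir = args[++i];
    else if (args[i] == "--output_dir") flags->output_dir = args[++i];
    else if (args[i] == "--lang") flags->lang = args[++i];
  }
  return false;
}

// Reports which usage message, if any, the user would end up seeing.
UsageSource WhichUsage(const std::vector<std::string>& args) {
  Flags flags;
  // The shared parser runs first, so its help output wins.
  if (CommonParse(args, &flags)) return UsageSource::kCommonParser;
  // The tool-specific check is reached only if the parser returned normally.
  if (flags.input_unicharset.empty() || flags.script_dir.empty() ||
      flags.output_dir.empty() || flags.lang.empty()) {
    return UsageSource::kToolSpecific;
  }
  return UsageSource::kNone;
}
```

In this model, `WhichUsage({"--help"})` yields `UsageSource::kCommonParser`, while an empty argument list falls through to the tool-specific check. If the real tool behaves the same way, the detailed usage text is reachable only by omitting a required flag, not by asking for help.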
Related: #1297