tesseract  v4.0.0-17-g361f3264
Open Source OCR Engine
tesseract::LanguageModelDawgInfo Struct Reference

#include <lm_state.h>


Public Member Functions

 LanguageModelDawgInfo (const DawgPositionVector *a, PermuterType pt)
 

Public Attributes

DawgPositionVector active_dawgs
 
PermuterType permuter
 

Detailed Description

The following structs are used for storing the state of the language model in the segmentation search graph. In this graph the nodes are BLOB_CHOICEs and the links are the relationships between the underlying blobs (see segsearch.h for a more detailed description).

Each BLOB_CHOICE contains a LanguageModelState struct, which holds a list of the N best paths (a list of ViterbiStateEntry) explored by the Viterbi search leading up to and including this BLOB_CHOICE.

Each ViterbiStateEntry contains information from various components of the language model: the dawgs in which the path is found, the character ngram model probability of the path, script/chartype/font consistency info, and state for language-specific heuristics (e.g. hyphenated and compound words, lower/upper case preferences, etc.).

Each ViterbiStateEntry also contains the parent pointer, so that the path that it represents (WERD_CHOICE) can be constructed by following these parent pointers.

Struct for storing additional information used by the Dawg language model component. It stores the set of active dawgs in which the sequence of letters on a path can be found.
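The parent-pointer scheme described above can be illustrated with a minimal sketch. It does not use the real Tesseract types: PathEntry, ReconstructPath and ExamplePath are hypothetical names, and a single char stands in for the character carried by a BLOB_CHOICE. The only idea taken from the description is that a path is recovered by walking parent pointers back from its last entry.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a ViterbiStateEntry: one character
// plus a link to the entry that precedes it on the path.
struct PathEntry {
  char letter;              // stand-in for the BLOB_CHOICE's character
  const PathEntry* parent;  // nullptr marks the start of the path
};

// Walks the parent pointers from the last entry back to the start, then
// reverses the collected letters to obtain the word in reading order.
std::string ReconstructPath(const PathEntry* last) {
  std::string reversed;
  for (const PathEntry* e = last; e != nullptr; e = e->parent) {
    reversed.push_back(e->letter);
  }
  return std::string(reversed.rbegin(), reversed.rend());
}

// Builds the three-entry path c -> a -> t and reconstructs it from its end.
std::string ExamplePath() {
  PathEntry c{'c', nullptr};
  PathEntry a{'a', &c};
  PathEntry t{'t', &a};
  std::string word = ReconstructPath(&t);
  assert(word == "cat");
  return word;
}
```

Because each entry stores only a link to its predecessor, several candidate paths can share a common prefix without duplicating it, which is presumably why the search keeps parent pointers instead of full strings.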

Constructor & Destructor Documentation

◆ LanguageModelDawgInfo()

tesseract::LanguageModelDawgInfo::LanguageModelDawgInfo ( const DawgPositionVector *  a,
PermuterType  pt 
)
inline
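As a rough sketch of how this constructor might be used, the block below rebuilds the struct with stand-in types. Several things here are assumptions and not taken from this page: DawgPosition is reduced to a single hypothetical field, DawgPositionVector is replaced by a std::vector alias, PermuterType lists only two illustrative values, and the constructor is assumed to copy the pointed-to vector into active_dawgs, as the by-value member type suggests. InfoOwnsItsCopy is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the real Tesseract types, reduced to what the sketch needs.
struct DawgPosition {
  int dawg_index;  // hypothetical single field
};
using DawgPositionVector = std::vector<DawgPosition>;  // assumption: vector-like
enum PermuterType { NO_PERM, SYSTEM_DAWG_PERM };       // illustrative subset

struct LanguageModelDawgInfo {
  // Assumed behavior: copy the pointed-to vector so the struct owns its state.
  LanguageModelDawgInfo(const DawgPositionVector* a, PermuterType pt)
      : active_dawgs(*a), permuter(pt) {}
  DawgPositionVector active_dawgs;
  PermuterType permuter;
};

// Returns true if the info keeps its own copy after the source is cleared.
bool InfoOwnsItsCopy() {
  DawgPositionVector positions;
  positions.push_back(DawgPosition{0});
  positions.push_back(DawgPosition{2});
  LanguageModelDawgInfo info(&positions, SYSTEM_DAWG_PERM);
  positions.clear();
  return info.active_dawgs.size() == 2 && info.permuter == SYSTEM_DAWG_PERM;
}
```

Under that copying assumption, the caller can reuse or discard its own vector after construction without affecting the state stored on the path.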

Member Data Documentation

◆ active_dawgs

DawgPositionVector tesseract::LanguageModelDawgInfo::active_dawgs

◆ permuter

PermuterType tesseract::LanguageModelDawgInfo::permuter

The documentation for this struct was generated from the following file: