A Method for Correcting Classification Errors of Recognized Characters
Optical recognition of text documents is an inevitably error-prone process. To identify and correct these errors, systems use post-processing techniques that are usually based on dictionary search. Dictionaries can yield acceptable recognition quality for Latin, Cyrillic, and other phonetic alphabets, but they are of little use for languages in which separating text into individual words is atypical or optional (Chinese, Japanese, Korean, Vietnamese, and others). This paper discusses known methods for addressing this problem and proposes a new approach to correcting certain types of errors, based on ensembles of neural networks (with a distinct neural network for each possible character). The approach reduces the number of recognition errors in hieroglyphic text and reduces dependence on dictionary quality when recognizing texts in phonetic alphabets.
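
The idea of keeping a separate network for each possible character can be illustrated with a minimal sketch. It is not the method of this paper: the abstract does not say how the per-character networks are trained or how their outputs are combined, so everything below is an assumption made for illustration. Each "network" is reduced to a single logistic unit with hand-set weights, the only feature is a hypothetical width-to-height ratio of the glyph, and the names `make_logistic_verifier` and `rerank` are invented here.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]
Verifier = Callable[[Features], float]


def make_logistic_verifier(weights: Sequence[float], bias: float) -> Verifier:
    """Build a one-unit stand-in for a per-character network (assumption:
    a real system would use a trained network here, not fixed weights)."""
    def verify(features: Features) -> float:
        z = bias + sum(w * x for w, x in zip(weights, features))
        return 1.0 / (1.0 + math.exp(-z))  # score in (0, 1)
    return verify


def rerank(
    candidates: List[Tuple[str, float]],
    features: Features,
    ensemble: Dict[str, Verifier],
) -> List[Tuple[str, float]]:
    """Re-score the recognizer's candidates with each character's own verifier.

    Candidates without a verifier keep the recognizer's original confidence.
    The result is sorted from most to least plausible.
    """
    rescored = []
    for char, ocr_confidence in candidates:
        verifier = ensemble.get(char)
        score = verifier(features) if verifier is not None else ocr_confidence
        rescored.append((char, score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical ensemble: one verifier per character class.
# Hand-set weights: "O" responds to wide glyphs, "0" to narrow ones.
ensemble = {
    "O": make_logistic_verifier(weights=[10.0], bias=-8.0),
    "0": make_logistic_verifier(weights=[-10.0], bias=8.0),
}

# The recognizer prefers "O", but the glyph is narrow (ratio 0.6).
ocr_candidates = [("O", 0.6), ("0", 0.4)]
ranked = rerank(ocr_candidates, features=[0.6], ensemble=ensemble)
best_char = ranked[0][0]
```

In this toy case the verifier for "0" scores the narrow glyph higher than the verifier for "O", so the recognizer's first choice is overturned without consulting a dictionary.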