
Artificial training data generation for the task of character recognition of fields of Russian passport

© 2018 A.V. Gayer, Y.S. Chernyshova, A.V. Sheshkus

National University of Science and Technology “MISIS”, 119049 Moscow, Leninsky prosp., 4, Russia
Smart Engines, 117312 Moscow, Prospekt 60-Letiya Oktyabrya, 9, Russia

Received 31 Aug 2017

This paper describes an algorithm for synthesizing training data for convolutional neural networks. The problem is topical because collecting enough natural training data is difficult and expensive, and in some cases impossible. Existing methods for data generation are reviewed, and a data generator based on a set of fonts and backgrounds is developed. Neural networks were trained on the generated datasets, including intermediate versions of them, and their accuracy was compared with that of a network trained on natural data: images of symbols from fields of the Russian passport. The proposed approach proved effective: a network trained on the synthesized dataset reaches accuracy comparable to that of a network trained on natural data.

Key words: OCR systems, artificial datasets, ANN, machine learning, synthetic training dataset

DOI: 10.1134/S023500921803006X

Cite: Gayer A. V., Chernyshova Y. S., Sheshkus A. V. Generatsiya iskusstvennoi obuchayushchei vyborki dlya zadachi raspoznavaniya simvolov polei pasporta rf [Artificial training data generation for the task of character recognition of fields of Russian passport]. Sensornye sistemy [Sensory systems]. 2018. V. 32(3). P. 230-235 (in Russian). doi: 10.1134/S023500921803006X
