In this paper we consider the applicability of similarity metric-based classifiers to word recognition. Approaches
based on recognizing single characters are well studied, but they perform poorly on some kinds of text, especially
text that is hard to segment, such as handwritten text, Arabic script, or text with ligatures. Moreover, when images
are heavily noised or corrupted by camera imperfections, touching characters can appear in the text. These problems
commonly arise in recognition systems for text that follows a predefined pattern, where the set of words is limited.
In such cases it is reasonable to recognize whole words, although the dictionary can be large and may be unknown at
training time. In this study we propose using similarity metric-based neural networks to recognize word images. We
compare a similarity metric-based neural network with a classifying network on words collected from the “gender”
field of the Russian national passport. To keep the comparison fair, the parameters of all layers except the last one
were the same for both types of networks. The results show that similarity metric-based neural networks are suitable
for word recognition. The main advantages of the proposed method are that the network alphabet can be extended after
training and that no character segmentation is required.
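
The alphabet-extension property mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `MetricWordClassifier`, the function `toy_embed`, the labels, and the 2x2 "images" are all hypothetical stand-ins, and a trained similarity metric-based network would take the place of `toy_embed`. The sketch only shows the general idea that a word image is assigned the label of the nearest stored reference embedding, so a new word can be added by storing one more reference, with no retraining and no character segmentation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_embed(image):
    """Hypothetical stand-in for a trained metric network.

    Maps an image (a list of rows of floats) to a vector of row means.
    """
    return [sum(row) / len(row) for row in image]


class MetricWordClassifier:
    """Nearest-reference classifier over whole-word embeddings (assumed design)."""

    def __init__(self, embed):
        self.embed = embed      # embedding function, fixed after training
        self.references = {}    # word label -> reference embedding

    def add_word(self, label, image):
        # Extending the alphabet stores one more reference embedding;
        # the embedding function itself is left unchanged.
        self.references[label] = self.embed(image)

    def predict(self, image):
        # The whole word image is embedded at once, so no character
        # segmentation step is involved.
        query = self.embed(image)
        return min(
            self.references,
            key=lambda label: euclidean(query, self.references[label]),
        )


# Hypothetical labels and toy 2x2 "word images".
clf = MetricWordClassifier(toy_embed)
clf.add_word("MALE", [[1.0, 1.0], [0.0, 0.0]])
clf.add_word("FEMALE", [[0.0, 0.0], [1.0, 1.0]])
first = clf.predict([[0.9, 0.8], [0.1, 0.0]])

# A word unseen at "training" time is added without retraining.
clf.add_word("OTHER", [[1.0, 1.0], [1.0, 1.0]])
second = clf.predict([[0.9, 1.0], [1.0, 0.9]])
```

In this toy setup `first` resolves to the hypothetical label "MALE" and `second` to the newly added "OTHER". A classifying network with a fixed output layer would instead need its last layer changed and retrained to accept a new word.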
Key words:
text recognition, convolutional neural networks, deep learning, siamese neural networks, metrics learning
DOI: 10.1134/S0235009219010049
Cite:
Chirvonaya A. N., Lynchenko A. E., Chernyshova Y. S., Sheshkus A. V.
Sravnenie klassifitsiruyushchei i metricheskoi svertochnykh setei na primere raspoznavaniya polya “pol” pasporta grazhdanina rf
[Comparison of the classifying and similarity metric-based neural networks through the recognition of the field “gender” in the Russian Federation passport].
Sensornye sistemy [Sensory systems].
2019.
V. 33(1).
P. 65-69 (in Russian). doi: 10.1134/S0235009219010049