
Maximal directions discrepancy as accuracy criterion of images projective normalization for optical text recognition

© 2020 I. A. Konovalenko, D. V. Polevoy, D. P. Nikolaev

Institute for Information Transmission Problems (Kharkevich Institute) RAS 127051 Moscow, B. Karetny per. 19, Russia
Smart Engines Service LLC 117312 Moscow, pr. 60-letiya Oktyabrya, 9, Russia
Institute for System Analysis of Federal Research Center “Computer Science and Control” RAS 117312 Moscow, pr. 60-letiya Oktyabrya, 9, Russia
National University of Science and Technology “MISIS” 119991 Moscow, Leninsky prospect, 4, Russia
Moscow Institute of Physics and Technology (State University) 141701 Dolgoprudny, Institutsky pereulok, 9, Moscow Region, Russia

Received 08 Oct 2019

Projective normalization (a special case of orthocorrection) is routinely applied to photographs of documents before optical recognition. Several accuracy criteria for projective normalization are known, but almost all of them characterize only the coordinates discrepancy. Text fields of documents, however, are usually elongated, so even a small coordinates discrepancy can be accompanied by a large directions discrepancy, which significantly degrades both the segmentation of the field and the recognition of the individual characters in it. Accurate correction of directions discrepancy is also needed in tomography when a spiral scanning scheme is used for measurement or when projections are recorded in tomosynthesis schemes. To describe the accuracy of image projective normalization at a point, the pointwise maximal directions discrepancy is proposed. As an accuracy criterion for the projective normalization of the entire image, the maximal directions discrepancy is proposed, defined as the maximum of the pointwise maximal directions discrepancy over the region of interest. An analytical solution to the problem of calculating the pointwise maximal directions discrepancy is obtained. The hypothesis that the pointwise maximal directions discrepancy is a quasiconvex function is put forward and confirmed numerically. A theorem is proved stating that the supremum of a quasiconvex function over a bounded closed set equals its supremum over the extreme points of the convex hull of that set. Based on the hypothesis and the theorem, an analytical solution to the problem of calculating the maximal directions discrepancy over a polyhedral region of interest is proposed.
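
The criterion described above can be illustrated with a small numerical sketch. It is not the authors' analytical solution, which the abstract does not reproduce. It rests on several assumptions: the normalization error is modelled as a residual homography `H` (the identity when normalization is exact), the discrepancy of a direction at a point is taken to be the angle between that direction and its image under the Jacobian of `H`, and the maximum over directions is found by brute-force sampling. All function names are hypothetical. The maximum over a polygonal region of interest is taken over its vertices only, which is justified only if the quasiconvexity hypothesis holds.

```python
import numpy as np

def homography_jacobian(H, p):
    """Jacobian at point p of the plane map induced by the 3x3 homography H."""
    x, y = p
    u, v, w = H @ np.array([x, y, 1.0])
    # Quotient rule applied to the components (u / w, v / w).
    return np.array([
        [H[0, 0] * w - u * H[2, 0], H[0, 1] * w - u * H[2, 1]],
        [H[1, 0] * w - v * H[2, 0], H[1, 1] * w - v * H[2, 1]],
    ]) / w**2

def pointwise_max_directions_discrepancy(H, p, n_dirs=3600):
    """Largest angle (radians) between a direction at p and its image under H.

    Assumed definition; evaluated by sampling n_dirs directions, not analytically.
    """
    J = homography_jacobian(H, p)
    # Directions d and -d give the same angle, so half a turn is enough.
    phi = np.linspace(0.0, np.pi, n_dirs, endpoint=False)
    d = np.stack([np.cos(phi), np.sin(phi)])   # shape (2, n_dirs)
    t = J @ d                                  # transformed directions
    cross = d[0] * t[1] - d[1] * t[0]
    dot = d[0] * t[0] + d[1] * t[1]
    return float(np.max(np.abs(np.arctan2(cross, dot))))

def max_directions_discrepancy(H, polygon_vertices, n_dirs=3600):
    """Maximum of the pointwise value over the vertices of a polygonal ROI.

    Checking vertices only relies on the quasiconvexity hypothesis.
    """
    return max(pointwise_max_directions_discrepancy(H, p, n_dirs)
               for p in polygon_vertices)

# Example: a text field 400 x 40 pixels and a slightly projective residual map.
field = [(0.0, 0.0), (400.0, 0.0), (400.0, 40.0), (0.0, 40.0)]
H_residual = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [1e-4, 0.0, 1.0]])
criterion = max_directions_discrepancy(H_residual, field)
```

For a residual map that is a pure rotation, every direction at every point is turned by the same angle, so the sketch returns that angle regardless of the region of interest, which makes rotation a convenient sanity check.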

Key words: orthocorrection, perspective correction, images projective normalization, accuracy criteria, directions discrepancy, optical character recognition, mathematical programming

DOI: 10.31857/S0235009220020079

Cite: Konovalenko I. A., Polevoy D. V., Nikolaev D. P. Maksimalnaya nevyazka napravlenii kak kriterii tochnosti proektivnoi normalizatsii izobrazheniya pri opticheskom raspoznavanii teksta [Maximal directions discrepancy as accuracy criterion of images projective normalization for optical text recognition]. Sensornye sistemy [Sensory systems]. 2020. V. 34(2). P. 131–146 (in Russian). doi: 10.31857/S0235009220020079
