Methods of training data augmentation in the task of image classification

© 2018 S. O. Emelyanov, A. A. Ivanova, E. A. Shvets, D. P. Nikolaev

Institute for Information Transmission Problems RAS, 127051 Moscow, Bolshoi Karetny lane, 19, Russia

Received 26 Feb 2018

Machine learning, and artificial neural network training in particular, is one of the most prominent approaches to object recognition in images. Such networks are typically trained on large datasets containing tens or hundreds of thousands of examples. In practice, however, gathering a dataset of that size is a very complex task. In this paper we consider existing methods for training an efficient classifier when such a large training dataset is unavailable. We focus on one particular approach, data augmentation, and on the various methods of implementing it, and we present a systematic approach to choosing the augmentation transformations and their parameters for each particular task.
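To make the term concrete, the sketch below shows what data augmentation can look like in its simplest form: each labelled image is turned into several randomly transformed copies that keep the original label. It is an illustration only and not the procedure proposed in the paper. The function names (`hflip`, `shift_right`, `add_noise`, `augment`), the choice of transformations, and the parameters `max_shift` and `sigma` are all assumptions made for this example, and images are represented as plain lists of grayscale pixel rows to keep the code dependency-free.

```python
import random


def hflip(img):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def shift_right(img, dx, fill=0):
    """Shift an image dx pixels to the right, padding the left edge with `fill`."""
    if dx <= 0:
        return [row[:] for row in img]
    return [([fill] * dx + row)[:len(row)] for row in img]


def add_noise(img, sigma, rng):
    """Add Gaussian noise to every pixel and clamp the result to 0..255."""
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in img
    ]


def augment(img, label, n, max_shift=1, sigma=5.0, seed=0):
    """Return n randomly transformed (image, label) pairs for one sample.

    Hypothetical parameters: max_shift bounds the horizontal shift in
    pixels, sigma is the noise standard deviation in intensity units.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Flip with probability 0.5; otherwise work on a copy of the input.
        x = hflip(img) if rng.random() < 0.5 else [row[:] for row in img]
        x = shift_right(x, rng.randint(0, max_shift))
        x = add_noise(x, sigma, rng)
        samples.append((x, label))
    return samples
```

Calling `augment(img, "cat", 4)` would yield four variants of `img`, all labelled `"cat"`. Whether a given transformation preserves the label depends on the task: a horizontal flip is usually harmless for photographs of animals but can change the class of a character image, which is why the set of transformations and their parameter ranges has to be chosen per task.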

Key words: neural networks, data augmentation, image classification, small datasets

DOI: 10.1134/S0235009218030058

Cite: Emelyanov S. O., Ivanova A. A., Shvets E. A., Nikolaev D. P. Metody augmentatsii obuchayushchikh vyborok v zadachakh klassifikatsii izobrazhenii [Methods of training data augmentation in the task of image classification]. Sensornye sistemy [Sensory systems]. 2018. V. 32(3). P. 236-245 (in Russian). doi: 10.1134/S0235009218030058
