In this article we consider the stereo matching problem using the classic dynamic time warping (DTW) algorithm. The
method relies on similarity metrics computed over small image fragments (pixel neighborhoods). We study how the quality
of the algorithm depends on the choice of similarity metric. We compare a norm of the point-wise differences between
pixel neighborhoods with neural network-based metrics, where the networks have a relatively small number of neurons
(about 1000) and an output vector of dimension 64. In the latter case the similarity metric is a norm of the difference
between the network output vectors. We show that this modification of DTW achieves better results than the unmodified
method based on the distance between raw pixel neighborhoods. All neural networks in this article were trained on the
open Middlebury Stereo Datasets and KITTI.
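The matching scheme summarized above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that every pixel of a left and a right scanline is already represented by a descriptor vector (either a flattened pixel neighborhood or a network output), that descriptors are compared with the Euclidean norm (the order of the norm is not recoverable from this text), and that the textbook symmetric DTW step pattern is used. The names `l2_distance`, `dtw_align` and `disparities` are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean norm of the difference of two descriptor vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_align(left, right, dist=l2_distance):
    """Textbook DTW between two non-empty descriptor sequences (one scanline each).

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    matching left[i] to right[j], in increasing order.
    """
    n, m = len(left), len(right)
    inf = float("inf")
    # acc[i][j] is the minimal accumulated cost of aligning left[:i] with right[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = dist(left[i - 1], right[j - 1]) + best_prev

    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return acc[n][m], path


def disparities(path, n):
    """Disparity i - j for each left pixel, taken from its first match in the path."""
    disp = [None] * n
    for i, j in path:
        if disp[i] is None:
            disp[i] = i - j
    return disp


# Toy scanlines with one-dimensional descriptors; the pattern 5, 9 is shifted by one pixel.
left_line = [[0], [0], [5], [9], [0]]
right_line = [[0], [5], [9], [0], [0]]
cost, path = dtw_align(left_line, right_line)
print(cost, disparities(path, len(left_line)))
```

Replacing the pixel-based metric with a learned one only changes the descriptors passed in (for example, 64-dimensional network outputs instead of flattened neighborhoods); the alignment code stays the same.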
Key words:
computer vision, stereoscopy, stereo matching, dynamic time warping, machine learning, Siamese neural networks
DOI: 10.1134/S0235009218030101
Cite:
Okhlopkov D. O., Gladilin S. A., Fedorenko F. A.
Postroenie metricheskogo priznakovogo prostranstva pri pomoshchi siamskikh neironnykh setei dlya vychisleniya karty disparatnosti
[Learning a metric feature space with siamese networks for disparity map computation].
Sensornye sistemy [Sensory systems].
2018.
V. 32(3).
P. 253-259 (in Russian). doi: 10.1134/S0235009218030101