Mgr. Dmytro Mishkin, Ph.D.

StereoGlue: Robust Estimation with Single-Point Solvers

Authors: Barath, D., Mgr. Dmytro Mishkin, Ph.D., Cavalli, L., Sarlin, P.E., Hruby, P., Pollefeys, M.
Publication: Computer Vision – ECCV 2024, Part LVII. Springer, Cham, 2025. p. 421-441. LNCS. vol. 15115. ISSN 0302-9743. ISBN 978-3-031-72997-3.
Year: 2025

DOI: 10.1007/978-3-031-72998-0_24
Link: https://doi.org/10.1007/978-3-031-72998-0_24
Affiliation: Visual Recognition Group
Abstract:
We propose StereoGlue, a method designed for joint feature matching and robust estimation that effectively reduces the combinatorial complexity of these tasks using single-point minimal solvers. StereoGlue is applicable to a range of problems, including but not limited to relative pose and homography estimation, determining absolute pose with 2D-3D correspondences, and estimating 3D rigid transformations between point clouds. StereoGlue starts with a set of one-to-many tentative correspondences, iteratively forms tentative matches, and estimates the minimal sample model. This model then facilitates guided matching, leading to consistent one-to-one matches, whose number serves as the model score. StereoGlue is superior to the state-of-the-art robust estimators on real-world datasets on multiple problems, improving upon a number of recent feature detectors and matchers. Additionally, it shows improvements in point cloud matching and absolute camera pose estimation. The code is at: https://github.com/danini/stereoglue.

A Large-Scale Homography Benchmark

Authors: Barath, D., Mgr. Dmytro Mishkin, Ph.D., Polic, M., Forstner, W., prof. Ing. Jiří Matas, Ph.D.
Publication: Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). USA: IEEE Computer Society, 2023. p. 21360-21370. ISSN 2575-7075. ISBN 979-8-3503-0129-8.
Year: 2023

DOI: 10.1109/CVPR52729.2023.02046
Link: https://doi.org/10.1109/CVPR52729.2023.02046
Affiliation: Visual Recognition Group
Abstract:
We present a large-scale dataset of Planes in 3D, Pi3D, of roughly 1000 planes observed in 10 000 images from the 1DSfM dataset, and HEB, a large-scale homography estimation benchmark leveraging Pi3D. The applications of the Pi3D dataset are diverse, e.g. training or evaluating monocular depth, surface normal estimation and image matching algorithms. The HEB dataset consists of 226 260 homographies and includes roughly 4M correspondences. The homographies link images that often undergo significant viewpoint and illumination changes. As applications of HEB, we perform a rigorous evaluation of a wide range of robust estimators and deep learning-based correspondence filtering methods, establishing the current state-of-the-art in robust homography estimation. We also evaluate the uncertainty of the SIFT orientations and scales w.r.t. the ground truth coming from the underlying homographies and provide code for comparing the uncertainty of custom detectors. The dataset is available at https://github.com/danini/homography-benchmark.

DoG Accuracy Via Equivariance: Get The Interpolation Right

Authors: Ing. et Ing. Václav Vávra, Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: 2023 IEEE International Conference on Image Processing (ICIP). New York: Institute of Electrical and Electronics Engineers, 2023. p. 136-140. ISBN 978-1-7281-9835-4.
Year: 2023

DOI: 10.1109/ICIP49359.2023.10222153
Link: https://doi.org/10.1109/ICIP49359.2023.10222153
Affiliation: Visual Recognition Group
Abstract:
We study the influence of image interpolation algorithms on local feature detectors operating on a scale pyramid, focusing on the Difference-of-Gaussian, as used in SIFT. We show that commonly used implementations, such as in OpenCV and Kornia, are neither rotational nor scale equivariant. We present a simple solution and demonstrate its positive influence on the downstream image matching tasks. The implementation of the method has been accepted in standard libraries OpenCV and Kornia.

HarrisZ+: Harris corner selection for next-gen image matching pipelines

Authors: Bellavia, F., Mgr. Dmytro Mishkin, Ph.D.
Publication: Pattern Recognition Letters. 2022, 2022(158) 141-147. ISSN 0167-8655.
Year: 2022

DOI: 10.1016/j.patrec.2022.04.022
Link: https://doi.org/10.1016/j.patrec.2022.04.022
Affiliation: Visual Recognition Group
Abstract:
Due to its role in many computer vision tasks, image matching has been actively investigated by researchers, which has led to better and more discriminative feature descriptors and to more robust matching strategies, helped by the advent of deep learning and the increased computational power of modern hardware. Despite these achievements, the keypoint extraction process at the base of the image matching pipeline has not seen equivalent progress. This paper presents HarrisZ+, an upgrade to the HarrisZ corner detector, optimized to synergistically take advantage of the recent improvements in the other steps of the image matching pipeline. HarrisZ+ does not only consist of a tuning of the setup parameters, but also introduces further refinements to the selection criteria delineated by HarrisZ, thus providing more, yet still discriminative, keypoints, which are better distributed over the image and have higher localization accuracy. The image matching pipeline including HarrisZ+, together with the other modern components, obtained state-of-the-art results among classic image matching pipelines in several recent matching benchmarks. These results are quite close to those obtained by the more recent fully deep end-to-end trainable approaches and show that research in classic image matching methods can still yield a substantial margin of improvement.

Efficient Initial Pose-Graph Generation for Global SfM

Authors: Baráth, D., Mgr. Dmytro Mishkin, Ph.D., Eichhardt, I., Shipachev, I., prof. Ing. Jiří Matas, Ph.D.
Publication: Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). USA: IEEE Computer Society, 2021. p. 14541-14550. ISSN 2575-7075. ISBN 978-1-6654-4509-2.
Year: 2021

DOI: 10.1109/CVPR46437.2021.01431
Link: https://doi.org/10.1109/CVPR46437.2021.01431
Affiliation: Visual Recognition Group
Abstract:
We propose ways to speed up the initial pose-graph generation for global Structure-from-Motion algorithms. To avoid forming tentative point correspondences by FLANN and geometric verification by RANSAC, which are the most time-consuming steps of the pose-graph creation, we propose two new methods -- built on the fact that image pairs usually are matched consecutively. Thus, candidate relative poses can be recovered from paths in the partly-built pose-graph. We propose a heuristic for the A* traversal, considering global similarity of images and the quality of the pose-graph edges. Given a relative pose from a path, descriptor-based feature matching is made "light-weight" by exploiting the known epipolar geometry. To speed up PROSAC-based sampling when RANSAC is applied, we propose a third method to order the correspondences by their inlier probabilities from previous estimations. The algorithms are tested on 402130 image pairs from the 1DSfM dataset and they speed up the feature matching 17 times and pose estimation 5 times. The source code will be made public.

Image Matching across Wide Baselines: From Paper to Practice

Authors: Jin, Y., Mgr. Dmytro Mishkin, Ph.D., Mishchuk, A., prof. Ing. Jiří Matas, Ph.D., Fua, P., Yi, K.M., Trulls, E.
Publication: International Journal of Computer Vision. 2021, 129 517-547. ISSN 0920-5691.
Year: 2021

DOI: 10.1007/s11263-020-01385-0
Link: https://doi.org/10.1007/s11263-020-01385-0
Affiliation: Visual Recognition Group
Abstract:
We introduce a comprehensive benchmark for local features and robust estimation algorithms, focusing on the downstream task -- the accuracy of the reconstructed camera pose -- as our primary metric. Our pipeline's modular structure allows easy integration, configuration, and combination of different methods and heuristics. This is demonstrated by embedding dozens of popular algorithms and evaluating them, from seminal works to the cutting edge of machine learning research. We show that with proper settings, classical solutions may still outperform the perceived state of the art. Besides establishing the actual state of the art, the conducted experiments reveal unexpected properties of Structure from Motion (SfM) pipelines that can help improve their performance, for both algorithmic and learned methods. Data and code are online at https://github.com/team-yi-ubc/image-matching-benchmark, providing an easy-to-use and flexible framework for the benchmarking of local features and robust estimation methods, both alongside and against top-performing methods. This work provides a basis for the Image Matching Challenge, https://vision.uvic.ca/image-matching-challenge/.

Kornia: an Open Source Differentiable Computer Vision Library for PyTorch

Authors: Riba, E., Mgr. Dmytro Mishkin, Ph.D., Ponsa, D., Rublee, E., Bradski, G.
Publication: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). New Jersey: IEEE, 2020. p. 3663-3672. ISSN 2642-9381. ISBN 978-1-7281-6553-0.
Year: 2020

DOI: 10.1109/WACV45572.2020.9093363
Link: https://doi.org/10.1109/WACV45572.2020.9093363
Affiliation: Visual Recognition Group
Abstract:
This work presents Kornia -- an open source computer vision library which consists of a set of differentiable routines and modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions. Inspired by OpenCV, Kornia is composed of a set of modules containing operators that can be inserted inside neural networks to train models to perform image transformations, camera calibration, epipolar geometry, and low level image processing techniques such as filtering and edge detection that operate directly on high dimensional tensor representations. Examples of classical vision problems implemented using our framework are also provided including a benchmark comparing to existing vision libraries.

Saddle: Fast and repeatable features with good coverage

Authors: Aldana Iuit, J., Mgr. Dmytro Mishkin, Ph.D., prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: Image and Vision Computing. 2020, 97 ISSN 0262-8856.
Year: 2020

DOI: 10.1016/j.imavis.2019.08.011
Link: https://doi.org/10.1016/j.imavis.2019.08.011
Affiliation: Visual Recognition Group
Abstract:
A novel similarity-covariant feature detector is presented that extracts points whose neighborhoods, when treated as a 3D intensity surface, have a saddle-like intensity profile. The saddle condition is verified efficiently by intensity comparisons on two concentric rings that must have exactly two dark-to-bright and two bright-to-dark transitions satisfying certain geometric constraints. Saddle is a fast approximation of the Hessian detector, just as ORB, which implements the FAST detector, is for the Harris detector. We propose to use the matching strategy called first geometrically inconsistent with binary descriptors, which is suitable for our feature detector, and include experiments with fixed-point descriptors, both hand-crafted and learned. Experiments show that the Saddle features are general, evenly spread and appear in high density in a range of images. The Saddle detector is among the fastest proposed. In comparison with detectors of similar speed, the Saddle features show superior matching performance on a number of challenging datasets. Compared to recently proposed deep-learning based interest point detectors and popular hand-crafted keypoint detectors, evaluated for repeatability on the ApolloScape dataset [1], the Saddle detector shows the best performance in most of the street-level view sequences, a.k.a. traversals.

Leveraging Outdoor Webcams for Local Descriptor Learning

Authors: Pultar, M., Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: Proceedings of the 24th Computer Vision Winter Workshop. Graz: Verlag der TU Graz, 2019. p. 51-60. ISBN 978-3-85125-652-9.
Year: 2019

DOI: 10.3217/978-3-85125-652-9-06
Link: https://doi.org/10.3217/978-3-85125-652-9-06
Affiliation: Visual Recognition Group
Abstract:
We present AMOS Patches, a large set of image cut-outs, intended primarily for the robustification of trainable local feature descriptors to illumination and appearance changes. Images contributing to AMOS Patches originate from the AMOS dataset of recordings from a large set of outdoor webcams. The semiautomatic method used to generate AMOS Patches is described. It includes camera selection, viewpoint clustering and patch selection. For training, we provide both the registered full source images and the patches. A new descriptor, trained on the AMOS Patches and 6Brown datasets, is introduced. It achieves state-of-the-art performance in matching under illumination changes on standard benchmarks.

DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks

Authors: Kupyn, O., Budzan, V., Mykhailych, M., Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: CVPR 2018: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018. p. 8183-8192. ISSN 2575-7075. ISBN 978-1-5386-6420-9.
Year: 2018

Affiliation: Visual Recognition Group
Abstract:
We present DeblurGAN, an end-to-end learned method for motion deblurring. The learning is based on a conditional GAN and the content loss. DeblurGAN achieves state-of-the-art performance both in the structural similarity measure and visual appearance. The quality of the deblurring model is also evaluated in a novel way on a real-world problem - object detection on (de-)blurred images. The method is 5 times faster than the closest competitor - DeepDeblur. We also introduce a novel method for generating synthetic motion blurred images from sharp ones, allowing realistic dataset augmentation. The model, code and the dataset are available at https://github.com/KupynOrest/DeblurGAN

Repeatability Is Not Enough: Learning Affine Regions via Discriminability

Authors: Mgr. Dmytro Mishkin, Ph.D., Radenović, F., prof. Ing. Jiří Matas, Ph.D.
Publication: ECCV2018: Proceedings of the European Conference on Computer Vision, Part IX. Springer, Cham, 2018. p. 287-304. Lecture Notes in Computer Science. vol. 11213. ISSN 0302-9743. ISBN 978-3-030-01239-7.
Year: 2018

DOI: 10.1007/978-3-030-01240-3_18
Link: https://doi.org/10.1007/978-3-030-01240-3_18
Affiliation: Visual Recognition Group
Abstract:
A method for learning local affine-covariant regions is presented. We show that maximizing geometric repeatability does not lead to local regions, a.k.a. features, that are reliably matched, and that this necessitates descriptor-based learning. We explore factors that influence such learning and registration: the loss function, descriptor type, geometric parametrization and the trade-off between matchability and geometric accuracy, and we propose a novel hard negative-constant loss function for learning of affine regions. The affine shape estimator – AffNet – trained with the hard negative-constant loss outperforms the state-of-the-art in bag-of-words image retrieval and wide baseline stereo. The proposed training process does not require precisely geometrically aligned patches. The source codes and trained weights are available at https://github.com/ducha-aiki/affnet

In the Saddle: Chasing fast and repeatable features

Authors: Aldana Iuit, J., Mgr. Dmytro Mishkin, Ph.D., prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: 2016 23rd International Conference on Pattern Recognition (ICPR). Institute of Electrical and Electronics Engineers, 2017. p. 675-680. ISSN 1051-4651. ISBN 978-1-5090-4847-2.
Year: 2017

DOI: 10.1109/ICPR.2016.7899712
Link: https://doi.org/10.1109/ICPR.2016.7899712
Affiliation: Visual Recognition Group
Abstract:
A novel similarity-covariant feature detector is presented that extracts points whose neighborhoods, when treated as a 3D intensity surface, have a saddle-like intensity profile. The saddle condition is verified efficiently by intensity comparisons on two concentric rings that must have exactly two dark-to-bright and two bright-to-dark transitions satisfying certain geometric constraints. Experiments show that the Saddle features are general, evenly spread and appear in high density in a range of images. The Saddle detector is among the fastest proposed. In comparison with detectors of similar speed, the Saddle features show superior matching performance on a number of challenging datasets.

Systematic Evaluation of Convolution Neural Network Advances on the ImageNet

Authors: Mgr. Dmytro Mishkin, Ph.D., Sergievskiy, N., prof. Ing. Jiří Matas, Ph.D.
Publication: Computer Vision and Image Understanding. 2017, 161 11-19. ISSN 1077-3142.
Year: 2017

DOI: 10.1016/j.cviu.2017.05.007
Link: https://doi.org/10.1016/j.cviu.2017.05.007
Affiliation: Visual Recognition Group
Abstract:
The paper systematically studies the impact of a range of recent advances in convolution neural network (CNN) architectures and learning methods on the object categorization (ILSVRC) problem. The evaluation tests the influence of the following choices of the architecture: non-linearity (ReLU, ELU, maxout, compatibility with batch normalization), pooling variants (stochastic, max, average, mixed), network width, classifier design (convolutional, fully-connected, SPP), image pre-processing, and of learning parameters: learning rate, batch size, cleanliness of the data, etc. The performance gains of the proposed modifications are first tested individually and then in combination. The sum of individual gains is greater than the observed improvement when all modifications are introduced, but the “deficit” is small, suggesting independence of their benefits. We show that the use of 128 × 128 pixel images is sufficient to make qualitative conclusions about optimal network structure that hold for the full size Caffe and VGG nets. The results are obtained an order of magnitude faster than with the standard 224 pixel images.

Working hard to know your neighbor's margins: Local descriptor learning loss

Authors: Mishchuk, A., Mgr. Dmytro Mishkin, Ph.D., Radenović, F., prof. Ing. Jiří Matas, Ph.D.
Publication: Advances in Neural Information Processing Systems 30. Neural Information Processing Systems (NIPS) Foundation, 2017. p. 4827-4838. vol. 30. ISSN 1049-5258.
Year: 2017

Affiliation: Visual Recognition Group
Abstract:
We introduce a loss for metric learning, which is inspired by Lowe's matching criterion for SIFT. We show that the proposed loss, which maximizes the distance between the closest positive and closest negative example in the batch, is better than complex regularization methods; it works well for both shallow and deep convolution network architectures. Applying the novel loss to the L2Net CNN architecture results in a compact descriptor named HardNet. It has the same dimensionality as SIFT (128) and shows state-of-the-art performance in wide baseline stereo, patch verification and instance retrieval benchmarks.
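
The loss described in this abstract can be illustrated with a minimal sketch. It assumes a batch of matching (anchor, positive) descriptor pairs, so that every non-matching descriptor in the batch serves as a negative, and it uses a triplet margin form with a margin of 1.0; the function names, the plain-list data layout and the margin value are illustrative assumptions, not details taken from the abstract.

```python
import math


def l2(a, b):
    """Euclidean distance between two descriptors given as lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hardest_in_batch_loss(anchors, positives, margin=1.0):
    """Triplet margin loss using the closest negative in the batch.

    anchors[i] and positives[i] describe the same patch; all other
    pairings are treated as negatives. Needs at least two pairs.
    The margin value is an assumption made for this sketch.
    """
    n = len(anchors)
    dist = [[l2(a, p) for p in positives] for a in anchors]
    total = 0.0
    for i in range(n):
        d_pos = dist[i][i]
        # Closest non-matching positive for anchor i.
        neg_row = min(dist[i][j] for j in range(n) if j != i)
        # Closest non-matching anchor for positive i.
        neg_col = min(dist[j][i] for j in range(n) if j != i)
        d_neg = min(neg_row, neg_col)
        total += max(0.0, margin + d_pos - d_neg)
    return total / n


# Well-separated pairs give zero loss; swapped pairs are penalized.
good = hardest_in_batch_loss([[1.0, 0.0], [0.0, 1.0]],
                             [[1.0, 0.0], [0.0, 1.0]])
bad = hardest_in_batch_loss([[1.0, 0.0], [0.0, 1.0]],
                            [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this quantity pushes the closest negative away from the matching pair, which is the behaviour the abstract attributes to the loss.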

All you need is a good init

Authors: Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: International Conference on Learning Representations 2016. Computational and Biological Learning Society, 2016.
Year: 2016

Affiliation: Department of Cybernetics, Visual Recognition Group
Abstract:
Layer-sequential unit-variance (LSUV) initialization - a simple method for weight initialization for deep net learning - is proposed. The method consists of two steps. First, pre-initialize weights of each convolution or inner-product layer with orthonormal matrices. Second, proceed from the first to the final layer, normalizing the variance of the output of each layer to be equal to one. Experiments with different activation functions (maxout, ReLU-family, tanh) show that the proposed initialization leads to learning of very deep nets that (i) produces networks with test accuracy better than or equal to that of standard methods and (ii) is at least as fast as the complex schemes proposed specifically for very deep nets such as FitNets (Romero et al. (2015)) and Highway (Srivastava et al. (2015)). Performance is evaluated on GoogLeNet, CaffeNet, FitNets and Residual nets and the state-of-the-art, or very close to it, is achieved on the MNIST, CIFAR-10/100 and ImageNet datasets.
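
The two steps named in this abstract can be sketched in plain Python for a small stack of fully-connected layers. This is a simplified illustration under stated assumptions, not the authors' implementation: the ReLU between layers, the Gram-Schmidt construction of the orthonormal rows, the tolerance `tol`, the iteration cap `max_iter` and all function names are choices made for the example.

```python
import math
import random


def orthonormal_rows(n_out, n_in, rng):
    """Step 1: orthonormal rows via Gram-Schmidt (needs n_out <= n_in)."""
    rows = []
    while len(rows) < n_out:
        v = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:
            rows.append([a / norm for a in v])
    return rows


def forward(weights, x):
    """Output W x of one linear layer for a single input vector."""
    return [sum(w * a for w, a in zip(row, x)) for row in weights]


def variance(batch_out):
    """Variance over all output values of a layer on the batch."""
    vals = [v for out in batch_out for v in out]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def lsuv_init(layer_sizes, batch, rng, tol=0.01, max_iter=10):
    """Step 2: go layer by layer, rescaling weights to unit output variance."""
    layers = []
    inputs = batch
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = orthonormal_rows(n_out, n_in, rng)
        for _ in range(max_iter):
            var = variance([forward(w, x) for x in inputs])
            if abs(var - 1.0) < tol:
                break
            scale = 1.0 / math.sqrt(var)
            w = [[a * scale for a in row] for row in w]
        layers.append(w)
        # Assumed ReLU non-linearity feeds the next layer.
        inputs = [[max(0.0, v) for v in forward(w, x)] for x in inputs]
    return layers


rng = random.Random(0)
batch = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(64)]
layers = lsuv_init([16, 8, 8, 4], batch, rng)
```

Because the output variance of a linear layer scales with the square of a weight multiplier, a single rescaling is normally enough here; the loop mirrors the iterate-until-within-tolerance description.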

Very Deep Residual Networks with MaxOut for Plant Identification in the Wild

Authors: Šulc, M., Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: Working Notes of CLEF 2016 - Conference and Labs of the Evaluation forum. Aachen: CEUR Workshop Proceedings, 2016. pp. 579-586. CEUR Workshop Proceedings. vol. 1609. ISSN 1613-0073.
Year: 2016

Affiliation: Department of Cybernetics, Visual Recognition Group
Abstract:
The paper presents our deep learning approach to automatic recognition of plant species from photos. We utilized a very deep 152-layer residual network model pre-trained on ImageNet, replaced the original fully connected layer with two randomly initialized fully connected layers connected with maxout, and fine-tuned the network on the PlantCLEF 2016 training data. Bagging of 3 networks was used to further improve accuracy. With the proposed approach we scored among the top 3 teams in the PlantCLEF 2016 plant identification challenge.

MODS: Fast and robust method for two-view matching

Authors: Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D., Perďoch, M.
Publication: Computer Vision and Image Understanding. 2015, 141 81-93. ISSN 1077-3142.
Year: 2015

DOI: 10.1016/j.cviu.2015.08.005
Link: https://doi.org/10.1016/j.cviu.2015.08.005
Affiliation: Department of Cybernetics
Abstract:
A novel algorithm for wide-baseline matching called MODS - matching on demand with view synthesis - is presented. The MODS algorithm is experimentally shown to solve a broader range of wide-baseline problems than the state of the art while being nearly as fast as standard matchers on simple problems. The apparent robustness vs. speed trade-off is finessed by the use of progressively more time-consuming feature detectors and by on-demand generation of synthesized images that is performed until a reliable estimate of geometry is obtained. We introduce an improved method for tentative correspondence selection, applicable both with and without view synthesis. A modification of the standard first to second nearest distance rule increases the number of correct matches by 5-20% at no additional computational cost. Performance of the MODS algorithm is evaluated on several standard publicly available datasets, and on a new set of geometrically challenging wide baseline problems that is made public together with the ground truth. Experiments show that MODS outperforms the state-of-the-art in robustness and speed. Moreover, MODS performs well on other classes of difficult two-view problems like matching of images from different modalities, with wide temporal baseline or with significant lighting changes.

Place Recognition with WxBS Retrieval

Authors: Mgr. Dmytro Mishkin, Ph.D., Perďoch, M., prof. Ing. Jiří Matas, Ph.D.
Publication: CVPR 2015 Workshop on Visual Place Recognition in Changing Environments. 2015.
Year: 2015

Affiliation: Department of Cybernetics
Abstract:
We present a novel visual place recognition method designed for operation in challenging conditions such as those encountered in day-to-night or winter-to-summer matching. The proposed WxBS Retrieval method is novel in enriching a bag-of-words approach with the use of multiple detectors, descriptors with suitable visual vocabularies, view synthesis, and adaptive thresholding to compensate for large variations in contrast and richness of features in different conditions. Evaluated on the public Visual Place Recognition in Changing Environments (VPRiCE) dataset, the method achieved precision 0.689, recall 0.798 and F1-score 0.740. The precision and F1-score are the best results reported so far for the VPRiCE dataset. Experiments show that the combination of retrieval and matching algorithms with detectors and descriptors insensitive to gradient reversal and contrast leads to both high accuracy and scalability.
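
As a quick consistency check of the figures quoted in this abstract, the F1-score can be recomputed from the reported precision and recall, assuming the standard definition of F1 as their harmonic mean (the abstract does not spell out the formula, and the function name is illustrative).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2.0 * precision * recall / (precision + recall)


# Values reported in the abstract for the VPRiCE dataset.
f1 = f1_score(0.689, 0.798)
print(f"{f1:.3f}")
```

Rounded to three decimals, the result agrees with the reported F1-score of 0.740.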

WxBS: Wide Baseline Stereo Generalizations

Authors: Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D., Perďoch, M., Lenc, K.
Publication: Proceedings of the British Machine Vision Conference (BMVC). London: British Machine Vision Association, 2015. ISBN 978-1-901725-53-7.
Year: 2015

Affiliation: Department of Cybernetics
Abstract:
We have presented a new problem -- the wide multiple baseline stereo (WxBS) -- which considers matching of images that simultaneously differ in more than one image acquisition factor such as viewpoint, illumination, sensor type, or where object appearance changes significantly, e.g. over time. A new dataset with the ground truth for evaluation of matching algorithms has been introduced and will be made public. We have extensively tested a large set of popular and recent detectors and descriptors and show that the combination of RootSIFT and HalfRootSIFT as descriptors with MSER and Hessian-Affine detectors works best for many different nuisance factors. We show that simple adaptive thresholding improves Hessian-Affine, DoG, MSER (and possibly other) detectors and allows them to be used on infrared and low contrast images. A novel matching algorithm for addressing the WxBS problem has been introduced. We have shown experimentally that the WxBS-M matcher dominates the state-of-the-art methods on both the new and existing datasets.

A Few Things One Should Know About Feature Extraction, Description and Matching

Authors: Lenc, K., prof. Ing. Jiří Matas, Ph.D., Mgr. Dmytro Mishkin, Ph.D.
Publication: CVWW2014: Proceedings of the 19th Computer Vision Winter Workshop. Praha: Czech Society for Cybernetics and Informatics, 2014. p. 67-74. ISBN 978-80-260-5641-6.
Year: 2014

Affiliation: Department of Cybernetics
Abstract:
We explore the computational bottlenecks of the affine feature extraction process and show how this process can be sped up by 2-3 times with no or very modest loss of performance. With our improvements, the speed of the Hessian-Affine and MSER detectors is comparable with that of the similarity-invariant SURF and DoG-SIFT detectors. The improvements presented include a faster anisotropic patch extraction algorithm which does not depend on the feature scale, and a speed-up of feature dominant orientation estimation and SIFT descriptor computation using a look-up table. In the second part of the paper we explore the performance of the recently proposed first geometrically inconsistent nearest neighbour criterion and of the dominant orientation generation process.

Matching of Images of Non-planar Objects with View Synthesis

Authors: Mgr. Dmytro Mishkin, Ph.D., prof. Ing. Jiří Matas, Ph.D.
Publication: SOFSEM 2014: Theory and Practice of Computer Science. Cham: Springer International Publishing AG, 2014. pp. 30-39. Lecture notes in computer science. ISSN 0302-9743. ISBN 978-3-319-04297-8.
Year: 2014

DOI: 10.1007/978-3-319-04298-5_4
Link: https://doi.org/10.1007/978-3-319-04298-5_4
Affiliation: Department of Cybernetics
Abstract:
We explore the performance of the recently proposed two-view image matching algorithms using affine view synthesis ASIFT (Morel and Yu, 2009) [14] and MODS (Mishkin, Perdoch and Matas, 2013) [10] on images of objects that do not have significant local texture and that are locally not well approximated by planes. Experiments show that view synthesis improves matching results on images of such objects, but the number of useful synthetic views is lower than for planar objects matching. The best detector for matching images of 3D objects is the Hessian-Affine in the Sparse configuration. The iterative MODS matcher performs comparably, confirming it is a robust, generic method for two view matching that performs well for different types of scenes and a wide range of viewing conditions.

Two-view Matching with View Synthesis Revisited

Authors: Mgr. Dmytro Mishkin, Ph.D., Perďoch, M., prof. Ing. Jiří Matas, Ph.D.
Publication: 2013 28th International Conference of Image and Vision Computing New Zealand (IVCNZ 2013). Piscataway: IEEE, 2013. pp. 436-441. ISSN 2151-2191. ISBN 978-1-4799-0882-0.
Year: 2013

DOI: 10.1109/IVCNZ.2013.6727054
Link: https://doi.org/10.1109/IVCNZ.2013.6727054
Affiliation: Department of Cybernetics
Abstract:
Wide-baseline matching focussing on problems with extreme viewpoint change is considered. We introduce the use of view synthesis with affine-covariant detectors to solve such problems and show that matching with the Hessian-Affine or MSER detectors outperforms the state-of-the-art ASIFT [19]. To minimise the loss of speed caused by view synthesis, we propose the Matching On Demand with view Synthesis algorithm (MODS) that uses progressively more synthesized images and more (time-consuming) detectors until reliable estimation of geometry is possible. We show experimentally that the MODS algorithm solves problems beyond the state-of-the-art and yet is comparable in speed to standard wide-baseline matchers on simpler problems. Minor contributions include an improved method for tentative correspondence selection, applicable both with and without view synthesis, and a view synthesis setup that greatly improves MSER robustness to blur and scale change while increasing its running time by only 10%.
