Mgr. Oleksandr Shekhovtsov, Ph.D.

Cold Analysis of Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D.,
Publikace: International Conference on Machine Learning (ICML). Proceedings of Machine Learning Research, 2023. p. 30931-30955. ISSN 2640-3498.
Rok: 2023

Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
Many problems in machine learning require an estimate of the gradient of an expectation in discrete random variables with respect to the sampling distribution. This work is motivated by the development of the Gumbel-Softmax family of estimators, which use a temperature-controlled relaxation of discrete variables. The state-of-the art in this family, the Gumbel-Rao estimator uses an extra internal sampling to reduce the variance, which may be costly. We analyze this estimator and show that it possesses a zero temperature limit with a surprisingly simple closed form. The limit estimator, called ZGR, has favorable bias and variance properties, it is easy to implement and computationally inexpensive. It decomposes as the average of the straight through (ST) estimator and DARN estimator — two basic but not very well performing on their own estimators. We demonstrate that the simple ST–ZGR family of estimators practically dominates in the bias-variance tradeoffs the whole GR family while also outperforming SOTA unbiased estimators.

Generalized Differentiable RANSAC

Autoři: Tong Wei, MSc., Patel, Y., Mgr. Oleksandr Shekhovtsov, Ph.D., prof. Ing. Jiří Matas, Ph.D., Baráth, D.
Publikace: ICCV2023: Proceedings of the International Conference on Computer Vision. Piscataway: IEEE, 2023. p. 17603-17614. ISSN 1550-5499. ISBN 979-8-3503-0719-1.
Rok: 2023

DOI: 10.1109/ICCV51070.2023.01618
Odkaz: https://doi.org/10.1109/ICCV51070.2023.01618
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
We propose ∇-RANSAC, a generalized differentiable RANSAC that allows learning the entire randomized robust estimation pipeline. The proposed approach enables the use of relaxation techniques for estimating the gradients in the sampling distribution, which are then propagated through a differentiable solver. The trainable quality function marginalizes over the scores from all the models estimated within ∇-RANSAC to guide the network learning accurate and useful inlier probabilities or to train feature detection and matching networks. Our method directly maximizes the probability of drawing a good hypothesis, allowing us to learn better sampling distributions. We test ∇-RANSAC on various real-world scenarios on fundamental and essential matrix estimation, and 3D point cloud registration, outdoors and indoors, with handcrafted and learning-based features. It is superior to the state-of-the-art in terms of accuracy while running at a similar speed to its less accurate alternatives. The code and trained models are available at https://github.com/weitong8591/differentiable_ransac.

VAE Approximation Error: ELBO and Exponential Families

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Schlesinger, D., Flach, B.
Publikace: The Tenth International Conference on Learning Representations. Massachusetts: OpenReview.net / University of Massachusetts, 2022.
Rok: 2022

Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
The importance of Variational Autoencoders reaches far beyond standalone generative models -- the approach is also used for learning latent representations and can be generalized to semi-supervised learning. This requires a thorough analysis of their commonly known shortcomings: posterior collapse and approximation errors. This paper analyzes VAE approximation errors caused by the combination of the ELBO objective and encoder models from conditional exponential families, including, but not limited to, commonly used conditionally independent discrete and continuous models. We characterize subclasses of generative models consistent with these encoder families. We show that the ELBO optimizer is pulled away from the likelihood optimizer towards the consistent subset and study this effect experimentally. Importantly, this subset can not be enlarged, and the respective error cannot be decreased, by considering deeper encoder/decoder networks.

Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D.,
Publikace: Proceedings of the 43rd DAGM German Conference (DAGM GCPR 2021). Cham: Springer, 2021. p. 127-141. LNCS. vol. 13024. ISSN 0302-9743. ISBN 978-3-030-92658-8.
Rok: 2021

DOI: 10.1007/978-3-030-92659-5_8
Odkaz: https://doi.org/10.1007/978-3-030-92659-5_8
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
Discrete and especially binary random variables occur in many machine learning models, notably in variational autoencoders with binary latent states and in stochastic binary networks. When learning such models, a key tool is an estimator of the gradient of the expected loss with respect to the probabilities of binary variables. The straight-through (ST) estimator gained popularity due to its simplicity and efficiency, in particular in deep networks where unbiased estimators are impractical. Several techniques were proposed to improve over ST while keeping the same low computational complexity: Gumbel-Softmax, ST-Gumbel-Softmax, BayesBiNN, FouST. We conduct a theoretical analysis of bias and variance of these methods in order to understand tradeoffs and verify the originally claimed properties. The presented theoretical results allow for better understanding of these methods and in some cases reveal serious issues.

Initialization and Transfer Learning of Stochastic Binary Networks from Real-Valued Ones

Autoři: Livochka, A., Mgr. Oleksandr Shekhovtsov, Ph.D.,
Publikace: 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGITION WORKSHOPS (CVPRW 2021). New York: IEEE Computer Society Press, 2021. p. 4655-4663. ISSN 2160-7516. ISBN 978-1-6654-4899-4.
Rok: 2021

DOI: 10.1109/CVPRW53098.2021.00524
Odkaz: https://doi.org/10.1109/CVPRW53098.2021.00524
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
We consider the training of binary neural networks (BNNs) using the stochastic relaxation approach, which leads to stochastic binary networks (SBNs). We identify that a severe obstacle to training deep SBNs without skip connections is already the initialization phase. While smaller models can be trained from a random (possibly data-driven) initialization, for deeper models and large datasets, it becomes increasingly difficult to obtain non-vanishing and low variance gradients when initializing randomly.

Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Yanush, V.
Publikace: Proceedings of the 43rd DAGM German Conference (DAGM GCPR 2021). Cham: Springer, 2021. p. 111-126. LNCS. vol. 13024. ISSN 0302-9743. ISBN 978-3-030-92658-8.
Rok: 2021

DOI: 10.1007/978-3-030-92659-5_7
Odkaz: https://doi.org/10.1007/978-3-030-92659-5_7
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
Training neural networks with binary weights and activa-tions is a challenging problem due to the lack of gradients and difficulty ofoptimization over discrete weights. Many successful experimental resultshave been achieved with empirical straight-through (ST) approaches,proposing a variety of ad-hoc rules for propagating gradients throughnon-differentiable activations and updating discrete weights. At the sametime, ST methods can be truly derived as estimators in the stochasticbinary network (SBN) model with Bernoulli weights. We advance thesederivations to a more complete and systematic study. We analyze proper-ties, estimation accuracy, obtain different forms of correct ST estimatorsfor activations and weights, explain existing empirical approaches andtheir shortcomings, explain how latent weights arise from the mirrordescent method when optimizing over probabilities. This allows to rein-troduce ST methods, long known empirically, as sound approximations,apply them with clarity and develop further improvements.

Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems

Autoři: Knobelreiter, P., Sormann, C., Mgr. Oleksandr Shekhovtsov, Ph.D., Fraundorfer, F., Pock, T.
Publikace: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). USA: IEEE Computer Society, 2020. p. 7897-7906. ISSN 1063-6919. ISBN 978-1-7281-7169-2.
Rok: 2020

DOI: 10.1109/CVPR42600.2020.00792
Odkaz: https://doi.org/10.1109/CVPR42600.2020.00792
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
It has been proposed by many researchers that combining deep neural networks with graphical models can create more efficient and better regularized composite models. The main difficulties in implementing this in practice are associated with a discrepancy in suitable learning objectives as well as with the necessity of approximations for the inference. In this work we take one of the simplest inference methods, a truncated max-product Belief Propagation, and add what is necessary to make it a proper component of a deep learning model: connect it to learning formulations with losses on marginals and compute the backprop operation. This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs), allowing us to design a hierarchical model composing BP inference and CNNs at different scale levels. The model is applicable to a range of dense prediction problems, is well-Trainable and provides parameter-efficient and robust solutions in stereo, flow and semantic segmentation.

Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Yanush, V., Flach, B.
Publikace: Advances in Neural Information Processing Systems 33 (NeurIPS 2020). Montreal: Neural Information Processing Society, 2020. ISSN 1049-5258.
Rok: 2020

Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
In neural networks with binary activations and or binary weights the training by gradient descent is complicated as the model has piecewise constant response. We consider stochastic binary networks, obtained by adding noises in front of activations. The expected model response becomes a smooth function of parameters, its gradient is well defined but it is challenging to estimate it accurately. We propose a new method for this estimation problem combining sampling and analytic approximation steps. The method has a significantly reduced variance at the price of a small bias which gives a very practical tradeoff in comparison with existing unbiased and biased estimators. We further show that one extra linearization step leads to a deep straight-through estimator previously known only as an ad-hoc heuristic. We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models with both proposed methods.

Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization

Autoři: Tourani, S., Mgr. Oleksandr Shekhovtsov, Ph.D., Rither, C., Savchynskyy, B.
Publikace: Proceedings of Machine Learning Research. Proceedings of Machine Learning Research, 2020. vol. Volume 108. ISSN 2640-3498.
Rok: 2020

Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
We consider the maximum-a-posteriori inference problem in discrete graphical models and study solvers based on the dual block-coordinate ascent rule. We map all existing solvers in a single framework, allowing for a better understanding of their design principles. We theoretically show that some block-optimizing updates are sub-optimal and how to strictly improve them. On a wide range of problem instances of varying graph connectivity, we study the performance of existing solvers as well as new variants that can be obtained within the framework. As a result of this exploration we build a new state-of-the art solver, performing uniformly better on the whole range of test instances.

Feed-forward Propagation in Probabilistic Neural Networks with Categorical and Max Layers

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Flach, B.
Publikace: International Conference on Learning Representations. Massachusetts: OpenReview.net / University of Massachusetts, 2019.
Rok: 2019

Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
Probabilistic Neural Networks deal with various sources of stochasticity: input noise, dropout, stochastic neurons, parameter uncertainties modeled as random variables, etc. In this paper we revisit a feed-forward propagation approach that allows one to estimate for each neuron its mean and variance \wrt all mentioned sources of stochasticity. In contrast, standard NNs propagate only point estimates, discarding the uncertainty. Methods propagating also the variance have been proposed by several authors in different context. The view presented here attempts to clarify the assumptions and derivation behind such methods, relate them to classical NNs and broaden their scope of applicability. The main technical contributions are new approximations for the distributions of argmax and max-related transforms, which allow for fully analytic uncertainty propagation in networks with softmax and max-pooling layers as well as leaky ReLU activations. We evaluate the accuracy of the approximation and suggest a simple calibration. Applying the method to networks with dropout allows for faster training and gives improved test likelihoods without the need of sampling.

Stochastic Normalizations as Bayesian Learning

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Flach, B.
Publikace: ACCV 2018: Proceedings of the 14th Asian Conference on Computer Vision, Part II. Springer, 2019. p. 463-479. LNCS. vol. 11362. ISSN 0302-9743. ISBN 978-3-030-20889-9.
Rok: 2019

DOI: 10.1007/978-3-030-20890-5_30
Odkaz: https://doi.org/10.1007/978-3-030-20890-5_30
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
In this work we investigate the reasons why Batch Normalization (BN) improves the generalization performance of deep networks. We argue that one major reason, distinguishing it from data-independent normalization methods, is randomness of batch statistics. This randomness appears in the parameters rather than in activations and admits an interpretation as a practical Bayesian learning. We apply this idea to other (deterministic) normalization techniques that are oblivious to the batch size. We show that their generalization performance can be improved significantly by Bayesian learning of the same form. We obtain test performance comparable to BN and, at the same time, better validation losses suitable for subsequent output uncertainty estimation through approximate Bayesian posterior.

MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models

Autoři: Tourani, S., Mgr. Oleksandr Shekhovtsov, Ph.D., Rother, C., Savchynskyy, B.
Publikace: ECCV2018: Proceedings of the European Conference on Computer Vision, Part IV. Berlin: Springer-Verlag, 2018. p. 264-281. Lecture Notes in Computer Science. vol. 11208. ISSN 0302-9743. ISBN 978-3-030-01224-3.
Rok: 2018

DOI: 10.1007/978-3-030-01225-0_16
Odkaz: https://doi.org/10.1007/978-3-030-01225-0_16
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
Dense, discrete Graphical Models with pairwise potentials are a powerful class of models which are employed in state-of-the-art computer vision and bio-imaging applications. This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle. Surprisingly, by making a small change to a low-performing solver, the Max Product Linear Programming (MPLP) algorithm [7], we derive the new solver MPLP++ that significantly outperforms all existing solvers by a large margin, including the state-of-the-art solver Tree-Reweighted Sequential (TRW-S) message-passing algorithm [17]. Additionally, our solver is highly parallel, in contrast to TRW-S, which gives a further boost in performance with the proposed GPU and multi-thread CPU implementations. We verify the superiority of our algorithm on dense problems from publicly available benchmarks as well as a new benchmark for 6D Object Pose estimation. We also provide an ablation study with respect to graph density.

Normalization of Neural Networks using Analytic Variance Propagation

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Flach, B.
Publikace: Proceedings of the 23rd Computer Vision Winter Workshop. Praha: Czech Society for Cybernetics and Informatics, 2018. p. 45-53. ISBN 978-80-270-3395-9.
Rok: 2018

Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
We address the problem of estimating statistics of hidden units in a neural network using a method of analytic moment propagation. These statistics are useful for approximate whitening of the inputs in front of saturating non-linearities such as a sigmoid function. This is important for initialization of training and for reducing the accumulated scale and bias dependencies (compensating covariate shift), which presumably eases the learning. In batch normalization, which is currently a very widely applied technique, sample estimates of statistics of hidden units over a batch are used. The proposed estimation uses an analytic propagation of mean and variance of the training set through the network. The result depends on the network structure and its current weights but not on the specific batch input. The estimates are suitable for initialization and normalization, efficient to compute and independent of the batch size. The experimental verification well supports these claims. However, the method does not share the generalization properties of BN, to which our experiments give some additional insight.

Scalable full flow with learned binary descriptors

Autoři: Munda, G., Mgr. Oleksandr Shekhovtsov, Ph.D., Knöbelreiter, P., Pock, T.
Publikace: 39th German Conference on Pattern Recognition. Springer, Cham, 2017. p. 321-332. Lecture Notes in Computer Science. vol. 10496. ISSN 0302-9743. ISBN 978-3-319-66708-9.
Rok: 2017

DOI: 10.1007/978-3-319-66709-6_26
Odkaz: https://doi.org/10.1007/978-3-319-66709-6_26
Pracoviště: Skupina vizuálního rozpoznávání
Anotace:
We propose a method for large displacement optical flow in which local matching costs are learned by a convolutional neural network (CNN) and a smoothness prior is imposed by a conditional random field (CRF). We tackle the computation- and memory-intensive operations on the 4D cost volume by a min-projection which reduces memory complexity from quadratic to linear and binary descriptors for efficient matching. This enables evaluation of the cost on the fly and allows to perform learning and CRF inference on high resolution images without ever storing the 4D cost volume. To address the problem of learning binary descriptors we propose a new hybrid learning scheme. In contrast to current state of the art approaches for learning binary CNNs we can compute the exact non-zero gradient within our model. We compare several methods for training binary descriptors and show results on public available benchmarks.

A Distributed Mincut/Maxflow Algorithm Combining Path Augmentation and Push-Relabel

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Hlaváč, V.
Publikace: International Journal of Computer Vision. 2013, 104(3), 315-342. ISSN 0920-5691.
Rok: 2013

DOI: 10.1007/s11263-012-0571-2
Odkaz: https://doi.org/10.1007/s11263-012-0571-2
Pracoviště: Katedra kybernetiky
Anotace:
We propose a novel MRF-based model for image matching. Given two images, the task is to estimate a mapping from one image to another, in order to maximize the matching quality. We consider mappings defined by discrete deformation field constrained to preserve 2-dimensional continuity. We approach the corresponding optimization problem by the TRW-S (sequential Tree-reweighted message passing) algorithm [Wainwright-03, Kolmogorov-05]. Our model design allows for a considerably wider class of natural transformation and yields a compact representation of the optimization task. For this model TRW-S algorithm demonstrated nice practical performance on our experiments. We also propose a concise derivation of the TRW-S algorithm as a sequential maximization of the lower bound on the energy function.

Curvature Prior for MRF-based Segmentation and Shape Inpainting

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Kohli, P., Rother, C.
Publikace: DAGM/OAGM 2012: Pattern Recognition - Joint 34th DAGM and 36th OAGM Symposium. Heidelberg: Springer, 2012. p. 41-51. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-32716-2.
Rok: 2012

Pracoviště: Katedra kybernetiky
Anotace:
Most image labeling problems such as segmentation and image reconstruction are fundamentally ill-posed and suffer from ambiguities and noise. Higher order image priors encode high level structural dependencies between pixels and are key to overcoming these problems. However, these priors in general lead to computationally intractable models. This paper addresses the problem of discovering compact representations of higher order priors which allow efficient inference. We propose a framework for solving this problem which uses a recently proposed representation of higher order functions where they are encoded as lower envelopes of linear functions. Maximum a Posterior inference on our learned models reduces to minimizing a pairwise function of discrete variables, which can be done approximately using standard methods. We show that our framework can learn a compact representation that approximates a prior that encourages low curvature shapes. We evaluate the approximation accuracy, discuss properties of the trained model, and demonstrate it on shape inpainting and image segmentation problems.

A Distributed Mincut/Maxflow Algorithm Combining Path Augmentation and Push-Relabel

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Hlaváč, V.
Publikace: Proceedings of the 8th International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition (EMMCVPR). Berlin: Springer, 2011, pp. 1-16. Lecture Notes in Computer Science. ISBN 978-3-642-23093-6.
Rok: 2011

Pracoviště: Katedra kybernetiky
Anotace:
We develop a novel distributed algorithm for the minimum cut problem. We primarily aim at solving large sparse problems. Assuming vertices of the graph are partitioned into several regions, the algorithm performs path augmentations inside the regions and updates of the push-relabel style between the regions. The interaction between regions is considered expensive (regions are loaded into the memory one-by-one or located on separate machines in a network). The algorithm works in sweeps - passes over all regions. Let B be the set of vertices incident to inter-region edges of the graph. We present a sequential and parallel versions of the algorithm which terminate in at most 2|B|^2+1 sweeps. The competing algorithm by Delong and Boykov uses push-relabel updates inside regions. In the case of a fixed partition we prove that this algorithm has a tight O(n^2) bound on the number of sweeps, where n is the number of vertices.

On Partial Opimality by Auxiliary Submodular Problems

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Hlaváč, V.
Publikace: Control Systems and Computers. 2011, 232(2), 71-78. ISSN 0130-5395.
Rok: 2011

Pracoviště: Katedra kybernetiky
Anotace:
We prove several relations between 3 energy minimization techniques. Methods for determining a provably optimal partial assignment of variables by Ivan Kovtun (IK), the linear programming relaxation approach (LP) and the popular expansion move algorithm by Yuri Boykov. We propose a novel suffcient condition of optimal partial assignment, which is based on LP relaxation and called LP-autarky. We show that methods of Kovtun, which build auxiliary submodular problems, fulfill this suffcient condition. The following link is thus established: LP relaxation cannot be tightened by IK. For non-submodular problems this is a non-trivial result. In the case of two labels, LP relaxation provides optimal partial assignment, known as persistency, which, as we show, dominates IK. Relating IK with expansion move, we show that the set of fixed points of expansion move with any "truncation" rule for the initial problem and the problem restricted by one-vs-all method of IK would coincide.

Joint Image GMM and Shading MAP Estimation

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Hlaváč, V.
Publikace: ICPR'2010: Proceedings of the 20th International Conference on Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2010, pp. 1360-1363. ISSN 1051-4651. ISBN 978-0-7695-4109-9.
Rok: 2010

Pracoviště: Katedra kybernetiky
Anotace:
We consider a simple statistical model of the image, in which the image is represented as a sum of two parts: one part is explained by an i.i.d. color Gaussian mixture and the other part by a (piecewise-) smooth grayscale shading function. The smoothness is ensured by a quadratic (Tikhonov) or total variation regularization. We derive an EM algorithm to estimate simultaneously the parameters of the mixture model and the shading. Our algorithms for both kinds of the regularization solve for shading and mean parameters of the mixture model jointly.

Scalable Multi-View Stereo

Autoři: Jančošek, M., Mgr. Oleksandr Shekhovtsov, Ph.D., Pajdla, T.
Publikace: 3DIM '09: The 2009 IEEE International Workshop on 3-D Digital Imaging and Modeling. Los Alamitos: IEEE Computer Society Press, 2009. p. 1-8. ISBN 978-1-4244-4441-0.
Rok: 2009

Pracoviště: Katedra kybernetiky
Anotace:
This paper presents a scalable multi-view stereo reconstruction method which can deal with a large number of large unorganized images in affordable time and effort. The computational effort of our technique is a linear function of the surface area of the observed scene which is conveniently discretized to represent sufficient but not excessive detail. Our technique works as a filter on a limited number of images at a time and can thus process arbitrarily large data sets using limited memory. By building reconstructions gradually, we avoid unnecessary processing of data which bring little improvement. In experiments with Middlebury and Strecha's databases, we demonstrate that we achieve results comparable to the state of the art with considerably smaller effort than used by previous methods. We present a large scale experiments in which we processed 294 unorganized images of an outdoor scene and reconstruct its 3D model and 1000 images from the Google Street View Pittsburgh.

A Discrete Search Method for Multi-modal Non-Rigid Image Registration

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Garcia Arteaga, J., doc. Ing. Tomáš Werner, Ph.D.,
Publikace: NORDIA 2008: Proceedings of the 2008 IEEE CVPR Workshop on Non-Rigid Shape Analysis and Deformable Image Alignment. Los Alamitos: IEEE Computer Society Press, 2008. pp. 915-920. ISSN 1063-6919. ISBN 978-1-4244-2339-2.
Rok: 2008

Pracoviště: Katedra kybernetiky
Anotace:
We consider the problem of image matching under the unknown statistical dependence of the signals, i.e. a signal in one image may correspond to one or more signals in the other image with different probabilities. This problem is widely known as multimodal image registration and is commonly solved by the maximization of the empirical mutual information between the images. The deformation is typically represented in a parametric form and optimization w.r.t. it is performed using gradient-based methods. In contrast, we represent the deformation as a field of discretized displacements and optimize w.r.t. it using pairwise Gibbs energy minimization technique. This has potential advantage of finding good solutions even for problems having many local minima.

Efficient MRF Deformation Model for Non-Rigid Image Matching

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Kovtun, I., Hlaváč, V.
Publikace: Computer Vision and Image Understanding. 2008, 112(1), 91-99. ISSN 1077-3142.
Rok: 2008

DOI: 10.1016/j.cviu.2008.06.006
Odkaz: https://doi.org/10.1016/j.cviu.2008.06.006
Pracoviště: Katedra kybernetiky
Anotace:
We propose a novel MRF-based model for deformable image matching (also known as registration). The deformation is described by a field of discrete variables, representing displacements of (blocks of) pixels. Discontinuities in the deformation are prohibited by imposing hard pairwise constraints in the model. Exact maximum a posteriori inference is intractable and we apply a linear programming relaxation technique. We show that, when reformulated in the form of two coupled fields of x- and y- displacements, the problem leads to a simpler relaxation to which we apply the TRW-S (Sequential Tree-Reweighted Message passing) algorithm [Wainwright-03, Kolmogorov-05]. This enables image registration with large displacements at a single scale. We employ fast message updates for a special type of interaction as was proposed [Felzenszwalb and Huttenlocher-04] for the max-product Belief Propagation (BP) and introduce a few independent speedups. In contrast to BP, the TRW-S allows us to compute per

On Partial Optimality in Multi-label MRFs

Autoři: Kohli, P., Mgr. Oleksandr Shekhovtsov, Ph.D., Rother, C., Kolmogorov, V., Torr, P.
Publikace: ICML 2008: Proceedings of the 25th International Conference on Machine Learning. New York: ACM, 2008. p. 1-8. ISBN 978-1-60558-205-4.
Rok: 2008

Pracoviště: Katedra kybernetiky
Anotace:
We consider the problem of optimizing multi-label MRFs, which is in general NP-hard and ubiquitous in low-level computer vision. One approach for its solution is to formulate it as an integer linear programming and relax the integrality constraints. The approach we consider in this paper is to first convert the multi-label MRF into an equivalent binary-label MRF and then to relax it. The resulting relaxation can be efficiently solved using a maximum flow algorithm. Its solution provides us with a partially optimal labelling of the binary variables. This partial labelling is then easily transferred to the multi-label problem. We study the theoretical properties of the new relaxation and compare it with the standard one. Specifically, we compare tightness, and characterize a subclass of problems where the two relaxations coincide. We propose several combined algorithms based on the technique and demonstrate their performance on challenging computer vision problems.

Efficient MRF Deformation Model for Non-Rigid Image Matching

Autoři: Mgr. Oleksandr Shekhovtsov, Ph.D., Kovtun, I., Hlaváč, V.
Publikace: CVPR 2007: Proceedings of the Computer Vision and Pattern Recognition conference. Los Alamitos: IEEE Computer Society, 2007. ISSN 1053-587X. ISBN 1-4244-1180-7.
Rok: 2007

Pracoviště: Katedra kybernetiky
Anotace:
We propose a novel MRF-based model for deformable image matching. Given two images, the task is to estimate a mapping from one image to the other maximizing the quality of the match. We consider mappings defined by a discrete deformation field constrained to preserve 2D continuity. We pose the task as finding MAP configurations of a pairwise MRF.

Unified Framework for Semiring-Based Arc Consistency and Relaxation Labeling

Autoři: doc. Ing. Tomáš Werner, Ph.D., Mgr. Oleksandr Shekhovtsov, Ph.D.,
Publikace: CVWW 2007: Proceedings of the 12th Computer Vision Winter Workshop. Graz: Verlag der Technischen Universität Graz, 2007. p. 27-34. ISBN 978-3-902465-60-3.
Rok: 2007

Pracoviště: Katedra kybernetiky
Anotace:
Constraint Satisfaction Problem (CSP), including its soft modifications, is ubiquitous in artificial intelligence and related fields. In computer vision and pattern recognition, the crisp CSP is more known as the consistent labeling problem and certain soft CSPs as certain inference problems in Markov Random Fields. Many soft CSPs can be seen as special cases of the semiring-based CSP (SCSP), using two abstract operations that form a semiring. A fundamental concept to tackle the CSP, as well as the SCSPs with idempotent semiring multiplication, are arc consistency algorithms, also known as relaxation labeling. Attempts have been made to generalize arc consistency for soft CSPs with non-idempotent semiring multiplication. We achieve such generalization by generalizing max-sum diffusion of Kovalevsky and Koval, used to decrease Schlesinger's upper bound on the max-sum CSP. We formulate the proposed generalized arc consistency in the semiring framework.

Mgr. Oleksandr Shekhovtsov, Ph.D.

Všechny publikace

Cold Analysis of Rao-Blackwellized Straight-Through Gumbel-Softmax Gradient Estimator

Generalized Differentiable RANSAC

VAE Approximation Error: ELBO and Exponential Families

Bias-Variance Tradeoffs in Single-Sample Binary Gradient Estimators

Initialization and Transfer Learning of Stochastic Binary Networks from Real-Valued Ones

Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks

Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems

Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization

Feed-forward Propagation in Probabilistic Neural Networks with Categorical and Max Layers

Stochastic Normalizations as Bayesian Learning

MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical Models

Normalization of Neural Networks using Analytic Variance Propagation

Scalable full flow with learned binary descriptors

A Distributed Mincut/Maxflow Algorithm Combining Path Augmentation and Push-Relabel

Curvature Prior for MRF-based Segmentation and Shape Inpainting

A Distributed Mincut/Maxflow Algorithm Combining Path Augmentation and Push-Relabel

On Partial Opimality by Auxiliary Submodular Problems

Joint Image GMM and Shading MAP Estimation

Scalable Multi-View Stereo

A Discrete Search Method for Multi-modal Non-Rigid Image Registration

Efficient MRF Deformation Model for Non-Rigid Image Matching

On Partial Optimality in Multi-label MRFs

Efficient MRF Deformation Model for Non-Rigid Image Matching

Unified Framework for Semiring-Based Arc Consistency and Relaxation Labeling

Mějte přehled