People

prof. Mgr. Ondřej Chum, Ph.D.

All publications

Edge Augmentation for Large-Scale Sketch Recognition without Sketches

  • DOI: 10.1109/ICPR56361.2022.9956233
  • Link: https://doi.org/10.1109/ICPR56361.2022.9956233
  • Department: Visual Recognition Group
  • Abstract:
    This work addresses scaling up the sketch classification task to a large number of categories. Collecting sketches for training is a slow and tedious process that has so far precluded any attempts at large-scale sketch recognition. We overcome the lack of training sketch data by exploiting labeled collections of natural images that are easier to obtain. To bridge the domain gap we present a novel augmentation technique that is tailored to the task of learning sketch recognition from a training set of natural images. Randomization is introduced in the parameters of edge detection and edge selection. Natural images are translated to a pseudo-novel domain called "randomized Binary Thin Edges" (rBTE), which is used as a training domain instead of natural images. The ability to scale up is demonstrated by training CNN-based sketch recognition on more than 2.5 times as many categories as used previously. For this purpose, a dataset of natural images from 874 categories is constructed by combining a number of popular computer vision datasets. The categories are selected to be suitable for sketch recognition. To estimate the performance, a subset of 393 categories with sketches is also collected.

Results and findings of the 2021 Image Similarity Challenge

  • Authors: Papakipos, Z., doc. Georgios Tolias, Ph.D., Ing. Tomáš Jeníček, Pizzi, E., Yokoo, S., Wang, W., Sun, Y., Zhang, W., Yang, Y., Addicam, S., Papadakis, S.M., Ferrer, C.C., prof. Mgr. Ondřej Chum, Ph.D., Douze, M.
  • Publication: Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track. Proceedings of Machine Learning Research, 2022. p. 1-12. vol. 176. ISSN 1938-7228.
  • Year: 2022
  • Department: Visual Recognition Group
  • Abstract:
    The 2021 Image Similarity Challenge introduced a dataset to serve as a benchmark to evaluate image copy detection methods. There were 200 participants in the competition. This paper presents a quantitative and qualitative analysis of the top submissions. It appears that the most difficult image transformations involve either severe image crops or overlaying onto unrelated images, combined with local pixel perturbations. The key algorithmic elements in the winning submissions are: training on strong augmentations, self-supervised learning, score normalization, explicit overlay detection, and global descriptor matching followed by pairwise image comparison.

Minimal Solvers for Rectifying from Radially-Distorted Conjugate Translations

  • DOI: 10.1109/TPAMI.2020.2992261
  • Link: https://doi.org/10.1109/TPAMI.2020.2992261
  • Department: Visual Recognition Group
  • Abstract:
    This paper introduces minimal solvers that jointly solve for radial lens undistortion and affine-rectification using local features extracted from the image of coplanar translated and reflected scene texture, which is common in man-made environments. The proposed solvers accommodate different types of local features and sampling strategies, and three of the proposed variants require just one feature correspondence. State-of-the-art techniques from algebraic geometry are used to simplify the formulation of the solvers. The generated solvers are stable, small and fast. Synthetic and real-image experiments show that the proposed solvers have superior robustness to noise compared to the state of the art. The solvers are integrated with an automated system for rectifying imaged scene planes from coplanar repeated texture. Accurate rectifications on challenging imagery taken with narrow to wide field-of-view lenses demonstrate the applicability of the proposed solvers.

Graph convolutional networks for learning with few clean and many noisy labels

  • DOI: 10.1007/978-3-030-58607-2_17
  • Link: https://doi.org/10.1007/978-3-030-58607-2_17
  • Department: Visual Recognition Group
  • Abstract:
    In this work we consider the problem of learning a classifier from noisy labels when a few clean labeled examples are given. The structure of clean and noisy data is modeled by a graph per class and Graph Convolutional Networks (GCN) are used to predict class relevance of noisy examples. For each class, the GCN is treated as a binary classifier, which learns to discriminate clean from noisy examples using a weighted binary cross-entropy loss function. The GCN-inferred “clean” probability is then exploited as a relevance measure. Each noisy example is weighted by its relevance when learning a classifier for the end task. We evaluate our method on an extended version of a few-shot learning problem, where the few clean examples of novel classes are supplemented with additional noisy data. Experimental results show that our GCN-based cleaning process significantly improves the classification accuracy over not cleaning the noisy data, as well as over standard few-shot classification where only a few clean examples are used.

Learning and aggregating deep local descriptors for instance-level recognition

  • DOI: 10.1007/978-3-030-58452-8_27
  • Link: https://doi.org/10.1007/978-3-030-58452-8_27
  • Department: Visual Recognition Group
  • Abstract:
    We propose an efficient method to learn deep local descriptors for instance-level recognition. The training only requires examples of positive and negative image pairs and is performed as metric learning of sum-pooled global image descriptors. At inference, the local descriptors are provided by the activations of internal components of the network. We demonstrate why such an approach learns local descriptors that work well for image similarity estimation with classical efficient match kernel methods. The experimental validation studies the trade-off between performance and memory requirements of the state-of-the-art image search approach based on match kernels. Compared to existing local descriptors, the proposed ones perform better in two instance-level recognition tasks and keep memory requirements lower. We experimentally show that global descriptors are not effective enough at large scale and that local descriptors are essential. We achieve state-of-the-art performance, in some cases even with a backbone network as small as ResNet18.

Minimal Solvers for Rectifying from Radially-Distorted Scales and Change of Scales

  • DOI: 10.1007/s11263-019-01216-x
  • Link: https://doi.org/10.1007/s11263-019-01216-x
  • Department: Visual Recognition Group
  • Abstract:
    This paper introduces the first minimal solvers that jointly estimate lens distortion and affine rectification from the image of rigidly-transformed coplanar features. The solvers work on scenes without straight lines and, in general, relax strong assumptions about scene content made by the state of the art. The proposed solvers use the affine invariant that coplanar repeats have the same scale in rectified space. The solvers are separated into two groups that differ by how the equal scale invariant of rectified space is used to place constraints on the lens undistortion and rectification parameters. We demonstrate a principled approach for generating stable minimal solvers by the Gröbner basis method, which is accomplished by sampling feasible monomial bases to maximize numerical stability. Synthetic and real-image experiments confirm that the proposed solvers demonstrate superior robustness to noise compared to the state of the art. Accurate rectifications on imagery taken with narrow to fisheye field-of-view lenses demonstrate the wide applicability of the proposed method. The method is fully automatic.

Saddle: Fast and repeatable features with good coverage

  • DOI: 10.1016/j.imavis.2019.08.011
  • Link: https://doi.org/10.1016/j.imavis.2019.08.011
  • Department: Visual Recognition Group
  • Abstract:
    A novel similarity-covariant feature detector is presented that extracts points whose neighborhoods, when treated as a 3D intensity surface, have a saddle-like intensity profile. The saddle condition is verified efficiently by intensity comparisons on two concentric rings that must have exactly two dark-to-bright and two bright-to-dark transitions satisfying certain geometric constraints. Saddle is a fast approximation of the Hessian detector, just as ORB, which implements the FAST detector, is for the Harris detector. We propose to use the matching strategy called first geometrically inconsistent with binary descriptors, which is suitable for our feature detector, including experiments with fixed-point descriptors, both hand-crafted and learned. Experiments show that the Saddle features are general, evenly spread, and appear in high density in a range of images. The Saddle detector is among the fastest proposed. In comparison with detectors of similar speed, the Saddle features show superior matching performance on a number of challenging datasets. Compared to recently proposed deep-learning-based interest point detectors and popular hand-crafted keypoint detectors, evaluated for repeatability on the ApolloScape dataset [1], the Saddle detector shows the best performance in most of the street-level view sequences, a.k.a. traversals.

Explicit Spatial Encoding for Deep Local Descriptors

  • DOI: 10.1109/CVPR.2019.00962
  • Link: https://doi.org/10.1109/CVPR.2019.00962
  • Department: Visual Recognition Group
  • Abstract:
    We propose a kernelized deep local-patch descriptor based on efficient match kernels of neural network activations. The response of each receptive field is encoded together with its spatial location using explicit feature maps. Two location parametrizations, Cartesian and polar, are used to provide robustness to different types of canonical patch misalignment. Additionally, we analyze how the conventional architecture, i.e. a fully connected layer attached after the convolutional part, encodes responses in a spatially variant way. In contrast, explicit spatial encoding is used in our descriptor, whose potential applications are not limited to local patches. We evaluate the descriptor on standard benchmarks. Both versions, encoding 32x32 or 64x64 patches, consistently outperform all other methods on all benchmarks. The number of parameters of the model is independent of the input patch resolution.

Fine-tuning CNN Image Retrieval with No Human Annotation

  • DOI: 10.1109/TPAMI.2018.2846566
  • Link: https://doi.org/10.1109/TPAMI.2018.2846566
  • Department: Visual Recognition Group
  • Abstract:
    Image descriptors based on activations of Convolutional Neural Networks (CNNs) have become dominant in image retrieval due to their discriminative power, compactness of representation, and search efficiency. Training of CNNs, either from scratch or fine-tuning, requires a large amount of annotated data, where a high quality of annotation is often crucial. In this work, we propose to fine-tune CNNs for image retrieval on a large collection of unordered images in a fully automated manner. Reconstructed 3D models obtained by the state-of-the-art retrieval and structure-from-motion methods guide the selection of the training data. We show that both hard-positive and hard-negative examples, selected by exploiting the geometry and the camera positions available from the 3D models, enhance the performance of particular-object retrieval. CNN descriptor whitening discriminatively learned from the same training data outperforms commonly used PCA whitening. We propose a novel trainable Generalized-Mean (GeM) pooling layer that generalizes max and average pooling and show that it boosts retrieval performance. Applying the proposed method to the VGG network achieves state-of-the-art performance on the standard benchmarks: Oxford Buildings, Paris, and Holidays datasets.
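    The generalized-mean pooling mentioned in this abstract can be illustrated with a minimal sketch. It assumes one feature map flattened into a list of non-negative activations and a fixed exponent p, whereas the abstract describes a trainable layer. The function name `gem_pool` and the `eps` clamp are illustrative assumptions, not the authors' code. With p = 1 the result is average pooling, and as p grows it approaches max pooling.

```python
def gem_pool(activations, p=3.0, eps=1e-6):
    """Generalized-mean pooling of one feature map's spatial activations.

    Assumption: activations are non-negative (e.g. post-ReLU); values are
    clamped to eps so that the power is well defined for any p > 0.
    """
    clamped = [max(a, eps) for a in activations]
    mean_of_powers = sum(a ** p for a in clamped) / len(clamped)
    return mean_of_powers ** (1.0 / p)


feature_map = [0.1, 0.5, 2.0, 0.0]          # hypothetical activations
avg_like = gem_pool(feature_map, p=1.0)     # behaves like average pooling
mid = gem_pool(feature_map, p=3.0)          # between average and max
max_like = gem_pool(feature_map, p=50.0)    # approaches max pooling
```

    In a trainable setting p would be a learned parameter updated by backpropagation; here it is fixed only to show how the exponent interpolates between the two classical pooling operations.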

Graph-based particular object discovery

  • DOI: 10.1007/s00138-019-01005-z
  • Link: https://doi.org/10.1007/s00138-019-01005-z
  • Department: Visual Recognition Group
  • Abstract:
    Severe background clutter is challenging in many computer vision tasks, including large-scale image retrieval. Global descriptors, which are popular due to their memory and search efficiency, are especially prone to corruption by such clutter. Eliminating the impact of the clutter on the image descriptor increases the chance of retrieving relevant images and prevents topic drift due to actually retrieving the clutter in the case of query expansion. In this work, we propose a novel salient region detection method. It captures, in an unsupervised manner, patterns that are both discriminative and common in the dataset. Saliency is based on a centrality measure of a nearest neighbor graph constructed from regional CNN representations of dataset images. The proposed method exploits recent CNN architectures trained for object retrieval to construct the image representation from the salient regions. We improve particular object retrieval on challenging datasets containing small objects.

Hybrid Diffusion: Spectral-Temporal Graph Filtering for Manifold Ranking

  • Authors: Iscen, A., Avrithis, Y., doc. Georgios Tolias, Ph.D., Furon, T., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: ACCV 2018: Proceedings of the 14th Asian Conference on Computer Vision, Part II. Springer, 2019. p. 301-316. LNCS. vol. 11362. ISSN 0302-9743. ISBN 978-3-030-20889-9.
  • Year: 2019
  • DOI: 10.1007/978-3-030-20890-5_20
  • Link: https://doi.org/10.1007/978-3-030-20890-5_20
  • Department: Visual Recognition Group
  • Abstract:
    State-of-the-art image retrieval performance is achieved with CNN features and manifold ranking using a k-NN similarity graph that is pre-computed off-line. The two most successful existing approaches are temporal filtering, where manifold ranking amounts to solving a sparse linear system online, and spectral filtering, where eigen-decomposition of the adjacency matrix is performed off-line and then manifold ranking amounts to dot-product search online. The former suffers from expensive queries and the latter from significant space overhead. Here we introduce a novel, theoretically well-founded hybrid filtering approach allowing full control of the space-time trade-off between these two extremes. Experimentally, we verify that our hybrid method delivers results on par with the state of the art, with lower memory demands compared to spectral filtering approaches and faster compared to temporal filtering.
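    As a rough illustration of the temporal-filtering side of this trade-off, manifold ranking can be written as the linear system (I - αS) f = (1 - α) y, where S is the symmetrically normalized adjacency matrix of the similarity graph and y marks the query. The sketch below solves it by simple fixed-point iteration on a tiny dense graph. The graph, the value of α, and the function names are assumptions chosen for illustration; this is not the hybrid method of the paper, and a real system would use sparse matrices and a proper solver.

```python
import math


def normalize_adjacency(W):
    """Return S = D^(-1/2) W D^(-1/2) for a symmetric adjacency matrix W.

    Assumption: every node has at least one neighbor (non-zero degree).
    """
    degree = [sum(row) for row in W]
    n = len(W)
    return [[W[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]


def manifold_rank(S, y, alpha=0.9, iterations=200):
    """Iterate f <- alpha * S f + (1 - alpha) * y until (near) convergence."""
    n = len(y)
    f = list(y)
    for _ in range(iterations):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * y[i]
             for i in range(n)]
    return f


# Hypothetical graph: nodes 0-1-2 form a chain, nodes 3-4 form a separate pair.
W = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
S = normalize_adjacency(W)
query = [1.0, 0.0, 0.0, 0.0, 0.0]   # node 0 is the query
scores = manifold_rank(S, query)
```

    Nodes in the query's connected component receive positive scores that decay along the chain, while nodes in the other component stay at zero, which is the behavior that Euclidean nearest-neighbor search alone does not capture.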

Label Propagation for Deep Semi-supervised Learning

  • DOI: 10.1109/CVPR.2019.00521
  • Link: https://doi.org/10.1109/CVPR.2019.00521
  • Department: Visual Recognition Group
  • Abstract:
    Semi-supervised learning is becoming increasingly important because it can combine data carefully labeled by humans with abundant unlabeled data to train deep neural networks. Classic methods for semi-supervised learning that have focused on transductive learning have not been fully exploited in the inductive framework followed by modern deep learning. The same holds for the manifold assumption, namely that similar examples should get the same prediction. In this work, we employ a transductive label propagation method that is based on the manifold assumption to make predictions on the entire dataset and use these predictions to generate pseudo-labels for the unlabeled data and train a deep neural network. At the core of the transductive method lies a nearest neighbor graph of the dataset that we create based on the embeddings of the same network. Therefore our learning process iterates between these two steps. We improve performance on several datasets, especially in the few-labels regime, and show that our work is complementary to the current state of the art.

Linking Art through Human Poses

  • Authors: Ing. Tomáš Jeníček, prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: ICDAR2019: Proceedings of the 15th IAPR International Conference on Document Analysis and Recognition. Piscataway, NJ: IEEE, 2019. p. 1338-1345. ISSN 1520-5363. ISBN 978-1-7281-3015-6.
  • Year: 2019
  • DOI: 10.1109/ICDAR.2019.00216
  • Link: https://doi.org/10.1109/ICDAR.2019.00216
  • Department: Visual Recognition Group
  • Abstract:
    We address the discovery of composition transfer in artworks based on their visual content. Automated analysis of large art collections, which are growing as a result of art digitization among museums and galleries, is an important tool for art history and assists cultural heritage preservation. Modern image retrieval systems offer good performance on visually similar artworks, but fail in the cases of more abstract composition transfer. The proposed approach links artworks through the pose similarity of human figures depicted in images. Human figures are the subject of a large fraction of visual art from the Middle Ages to modernity and their distinctive poses were often a source of inspiration among artists. The method consists of two steps: fast pose matching and robust spatial verification. We experimentally show that explicit human pose matching is superior to standard content-based image retrieval methods on a manually annotated art composition transfer dataset.

Local Features and Visual Words Emerge in Activations

  • Authors: Simeoni, O., Avrithis, Y., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: CVPR 2019: Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2019. p. 11643-11652. ISSN 2575-7075. ISBN 978-1-7281-3293-8.
  • Year: 2019
  • DOI: 10.1109/CVPR.2019.01192
  • Link: https://doi.org/10.1109/CVPR.2019.01192
  • Department: Visual Recognition Group
  • Abstract:
    We propose a novel method of deep spatial matching (DSM) for image retrieval. Initial ranking is based on image descriptors extracted from convolutional neural network activations by global pooling, as in recent state-of-the-art work. However, the same sparse 3D activation tensor is also approximated by a collection of local features. These local features are then robustly matched to approximate the optimal alignment of the tensors. This happens without any network modification, additional layers or training. No local feature detection happens on the original image. No local feature descriptors and no visual vocabulary are needed throughout the whole process. We experimentally show that the proposed method achieves state-of-the-art performance on standard benchmarks across different network architectures and different global pooling methods. The highest gain in performance is achieved when diffusion on the nearest-neighbor graph of global descriptors is initiated from spatially verified images.

No Fear of the Dark: Image Retrieval Under Varying Illumination Conditions

  • DOI: 10.1109/ICCV.2019.00979
  • Link: https://doi.org/10.1109/ICCV.2019.00979
  • Department: Visual Recognition Group
  • Abstract:
    Image retrieval under varying illumination conditions, such as day and night images, is addressed by image preprocessing, both hand-crafted and learned. Prior to extracting image descriptors by a convolutional neural network, images are photometrically normalised in order to reduce the descriptor sensitivity to illumination changes. We propose a learnable normalisation based on the U-Net architecture, which is trained on a combination of single-camera multi-exposure images and a newly constructed collection of similar views of landmarks during day and night. We experimentally show that both hand-crafted normalisation based on local histogram equalisation and the learnable normalisation outperform standard approaches in varying illumination conditions, while staying on par with the state-of-the-art methods on daylight illumination benchmarks, such as Oxford or Paris datasets.

Rectification from Radially-Distorted Scales

  • DOI: 10.1007/978-3-030-20873-8_3
  • Link: https://doi.org/10.1007/978-3-030-20873-8_3
  • Department: Visual Recognition Group
  • Abstract:
    This paper introduces the first minimal solvers that jointly estimate lens distortion and affine rectification from repetitions of rigidly-transformed coplanar local features. The proposed solvers incorporate lens distortion into the camera model and extend accurate rectification to wide-angle images that contain nearly any type of coplanar repeated content. We demonstrate a principled approach to generating stable minimal solvers by the Gröbner basis method, which is accomplished by sampling feasible monomial bases to maximize numerical stability. Synthetic and real-image experiments confirm that the solvers give accurate rectifications from noisy measurements if used in a RANSAC-based estimator. The proposed solvers demonstrate superior robustness to noise compared to the state of the art. The solvers work on scenes without straight lines and, in general, relax strong assumptions about scene content made by the state of the art. Accurate rectifications on imagery taken with narrow focal length to fisheye lenses demonstrate the wide applicability of the proposed method. The method is automatic, and the code is published at https://github.com/prittjam/repeats.

Targeted Mismatch Adversarial Attack: Query With a Flower to Retrieve the Tower

  • DOI: 10.1109/ICCV.2019.00514
  • Link: https://doi.org/10.1109/ICCV.2019.00514
  • Department: Visual Recognition Group
  • Abstract:
    Access to online visual search engines implies sharing of private user content: the query images. We introduce the concept of targeted mismatch attack for deep-learning-based retrieval systems to generate an adversarial image that conceals the query image. The generated image looks nothing like the user-intended query, but leads to identical or very similar retrieval results. Transferring attacks to fully unseen networks is challenging. We show successful attacks on partially unknown systems by designing various loss functions for the adversarial image construction. These include loss functions, for example, for an unknown global pooling operation or an unknown input resolution of the retrieval system. We evaluate the attacks on standard retrieval benchmarks and compare the results retrieved with the original and adversarial image.

Understanding and Improving Kernel Local Descriptors

  • DOI: 10.1007/s11263-018-1137-8
  • Link: https://doi.org/10.1007/s11263-018-1137-8
  • Department: Visual Recognition Group
  • Abstract:
    We propose a multiple-kernel local-patch descriptor based on efficient match kernels from pixel gradients. It combines two parametrizations of gradient position and direction; each parametrization provides robustness to a different type of patch mis-registration: polar parametrization for noise in the patch dominant orientation detection, Cartesian for imprecise location of the feature point. Combined with whitening of the descriptor space, which is learned with or without supervision, the performance is significantly improved. We analyze the effect of the whitening on patch similarity and demonstrate its semantic meaning. Our unsupervised variant is the best performing descriptor constructed without the need for labeled data. Despite the simplicity of the proposed descriptor, it competes well with deep learning approaches on a number of different tasks.

Deep Shape Matching

  • Authors: Radenović, F., doc. Georgios Tolias, Ph.D., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: ECCV2018: Proceedings of the European Conference on Computer Vision, Part V. Springer, Cham, 2018. p. 774-791. Lecture Notes in Computer Science. vol. 11209. ISSN 0302-9743. ISBN 978-3-030-01227-4.
  • Year: 2018
  • DOI: 10.1007/978-3-030-01228-1_46
  • Link: https://doi.org/10.1007/978-3-030-01228-1_46
  • Department: Visual Recognition Group
  • Abstract:
    We cast shape matching as metric learning with convolutional networks. We break the end-to-end process of image representation into two parts. Firstly, well-established efficient methods are chosen to turn the images into edge maps. Secondly, the network is trained with edge maps of landmark images, which are automatically obtained by a structure-from-motion pipeline. The learned representation is evaluated on a range of different tasks, providing improvements on challenging cases of domain generalization, generic sketch-based image retrieval or its fine-grained counterpart. In contrast to other methods that learn a different model per task, object category, or domain, we use the same network throughout all our experiments, achieving state-of-the-art results in multiple benchmarks.

Efficient Contour Match Kernel

  • DOI: 10.1016/j.imavis.2018.04.006
  • Link: https://doi.org/10.1016/j.imavis.2018.04.006
  • Department: Visual Recognition Group
  • Abstract:
    We propose a novel concept of asymmetric feature maps (AFM), which allows evaluating multiple kernels between a query and database entries without increasing the memory requirements. To demonstrate the advantages of the AFM method, we derive an efficient contour match kernel, a short-vector image representation that, due to asymmetric feature maps, supports efficient scale and translation invariant sketch-based image retrieval. Unlike most of the short-code based retrieval systems, the proposed method provides the query localization in the retrieved image. The efficiency of the search is boosted by approximating a 2D translation search via a trigonometric polynomial of scores by 1D projections. The projections are a special case of AFM. An order of magnitude speed-up is achieved compared to traditional trigonometric polynomials. The results are boosted by an image-based average query expansion approach and, without any learning, significantly outperform the state-of-the-art hand-crafted descriptors on standard benchmarks. Our method competes well with recent CNN-based approaches that require large amounts of labeled sketches, images and sketch-image pairs.

Fast Spectral Ranking for Similarity Search

  • Authors: Iscen, A., Avrithis, Y., doc. Georgios Tolias, Ph.D., Furon, T., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: CVPR 2018: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018. p. 7632-7641. ISSN 2575-7075. ISBN 978-1-5386-6420-9.
  • Year: 2018
  • DOI: 10.1109/CVPR.2018.00796
  • Link: https://doi.org/10.1109/CVPR.2018.00796
  • Department: Visual Recognition Group
  • Abstract:
    Despite the success of deep learning on representing images for particular object retrieval, recent studies show that the learned representations still lie on manifolds in a high dimensional space. This makes the Euclidean nearest neighbor search biased for this task. Exploring the manifolds online remains expensive even if a nearest neighbor graph has been computed offline. This work introduces an explicit embedding reducing manifold search to Euclidean search followed by dot product similarity search. This is equivalent to linear graph filtering of a sparse signal in the frequency domain. To speed up online search, we compute an approximate Fourier basis of the graph offline. We improve the state of the art on particular object retrieval datasets including the challenging Instre dataset containing small objects. At a scale of 10^5 images, the offl

Local Orthogonal-Group Testing

  • Authors: Iscen, A., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: ECCV2018: Proceedings of the European Conference on Computer Vision, Part II. Cham: Springer International Publishing, 2018. p. 460-476. Image Processing, Computer Vision, Pattern Recognition, and Graphics. vol. 11206. ISSN 0302-9743. ISBN 978-3-030-01215-1.
  • Year: 2018
  • DOI: 10.1007/978-3-030-01216-8_28
  • Link: https://doi.org/10.1007/978-3-030-01216-8_28
  • Department: Visual Recognition Group
  • Abstract:
    This work addresses approximate nearest neighbor search applied in the domain of large-scale image retrieval. Within the group testing framework we propose an efficient off-line construction of the search structures. The linear-time complexity orthogonal grouping increases the probability that at most one element from each group matches a given query. Non-maxima suppression within each group efficiently reduces the number of false positive results at no extra cost. Unlike in other well-performing approaches, all processing is local, fast, and suitable for processing data in batches and in parallel. We experimentally show that the proposed method achieves the search accuracy of exhaustive search with a significant reduction in search complexity. The method can be naturally combined with existing embedding methods.

Mining on Manifolds: Metric Learning without Labels

  • Authors: Iscen, A., doc. Georgios Tolias, Ph.D., Avrithis, Y., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: CVPR 2018: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018. p. 7642-7651. ISSN 2575-7075. ISBN 978-1-5386-6420-9.
  • Year: 2018
  • DOI: 10.1109/CVPR.2018.00797
  • Link: https://doi.org/10.1109/CVPR.2018.00797
  • Department: Visual Recognition Group
  • Abstract:
    In this work we present a novel unsupervised framework for hard training example mining. The only input to the method is a collection of images relevant to the target application and a meaningful initial representation, provided e.g. by a pre-trained CNN. Positive examples are distant points on a single manifold, while negative examples are nearby points on different manifolds. Both types of examples are revealed by disagreements between Euclidean and manifold similarities. The discovered examples can be used in training with any discriminative loss. The method is applied to unsupervised fine-tuning of pre-trained networks for fine-grained classification and particular object retrieval. Our models are on par with or outperform prior models that are fully or partially supervised.

Radially-Distorted Conjugate Translations

  • DOI: 10.1109/CVPR.2018.00213
  • Link: https://doi.org/10.1109/CVPR.2018.00213
  • Department: Visual Recognition Group
  • Abstract:
    This paper introduces the first minimal solvers that jointly solve for affine-rectification and radial lens distortion from coplanar repeated patterns. Even with imagery from moderately distorted lenses, plane rectification using the pinhole camera model is inaccurate or invalid. The proposed solvers incorporate lens distortion into the camera model and extend accurate rectification to wide-angle imagery, which is now common from consumer cameras. The solvers are derived from constraints induced by the conjugate translations of an imaged scene plane, which are integrated with the division model for radial lens distortion. The hidden-variable trick with ideal saturation is used to reformulate the constraints so that the solvers generated by the Gröbner-basis method are stable, small and fast. Rectification and lens distortion are recovered from either one conjugately translated affine-covariant feature or two independently translated similarity-covariant features. The proposed solvers are used in a RANSAC-based estimator, which gives accurate rectifications after few iterations. The proposed solvers are evaluated against the state of the art and demonstrate significantly better rectifications on noisy measurements. Qualitative results on diverse imagery demonstrate high-accuracy undistortion and rectification.
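    The division model for radial lens distortion referred to in this abstract can be sketched in a few lines. In its common one-parameter form, a distorted point (x, y), measured relative to the distortion center, maps to the undistorted point (x, y) / (1 + λ r²) with r² = x² + y². The function name, the default center, and the sample value of λ below are illustrative assumptions; the sketch shows only the undistortion step, not the minimal solvers of the paper.

```python
def undistort_division(x, y, lam, cx=0.0, cy=0.0):
    """Undistort a point with the one-parameter division model.

    (cx, cy) is the assumed distortion center; lam is the distortion
    parameter (negative values correspond to barrel distortion).
    """
    dx, dy = x - cx, y - cy
    scale = 1.0 + lam * (dx * dx + dy * dy)
    return cx + dx / scale, cy + dy / scale


# With lam = 0 the model reduces to the identity (pinhole camera).
same = undistort_division(0.3, -0.4, lam=0.0)

# With barrel distortion (lam < 0) points move away from the center.
moved = undistort_division(1.0, 0.0, lam=-0.2)
```

    Equivalently, in homogeneous coordinates the undistorted point is (x, y, 1 + λ r²), which keeps the distortion parameter confined to the third coordinate and is one reason this model is convenient for algebraic solvers.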

Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

  • Authors: Radenović, F., Iscen, A., doc. Georgios Tolias, Ph.D., Avrithis, Y., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: CVPR 2018: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE, 2018. p. 5706-5715. ISSN 2575-7075. ISBN 978-1-5386-6420-9.
  • Year: 2018
  • DOI: 10.1109/CVPR.2018.00598
  • Link: https://doi.org/10.1109/CVPR.2018.00598
  • Department: Visual Recognition Group
  • Annotation:
    In this paper we address issues with image retrieval benchmarking on the standard and popular Oxford 5k and Paris 6k datasets. In particular, annotation errors, the size of the dataset, and the level of challenge are addressed: new annotation for both datasets is created with extra attention to the reliability of the ground truth. Three new protocols of varying difficulty are introduced. The protocols allow fair comparison between different methods, including those using a dataset pre-processing stage. For each dataset, 15 new challenging queries are introduced. Finally, a new set of 1M hard, semi-automatically cleaned distractors is selected. An extensive comparison of the state-of-the-art methods is performed on the new benchmark. Different types of methods are evaluated, ranging from local-feature-based to modern CNN-based methods. The best results are achieved by taking the best of the two worlds. Most importantly, image retrieval appears far from being solved.

Unsupervised object discovery for instance recognition

  • Authors: Simeoni, O., Iscen, A., doc. Georgios Tolias, Ph.D., Avrithis, Y., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: 2018 IEEE Winter Conference on Applications of Computer Vision, WACV 2018. Institute of Electrical and Electronics Engineers Inc, 2018. p. 1745-1754. ISSN 2472-6737. ISBN 978-1-5386-4886-5.
  • Year: 2018
  • DOI: 10.1109/WACV.2018.00194
  • Link: https://doi.org/10.1109/WACV.2018.00194
  • Department: Visual Recognition Group
  • Annotation:
    Severe background clutter is challenging in many computer vision tasks, including large-scale image retrieval. Global descriptors, which are popular due to their memory and search efficiency, are especially prone to corruption by such clutter. Eliminating the impact of the clutter on the image descriptor increases the chance of retrieving relevant images and prevents topic drift due to actually retrieving the clutter in the case of query expansion. In this work, we propose a novel salient region detection method. It captures, in an unsupervised manner, patterns that are both discriminative and common in the dataset. Saliency is based on a centrality measure of a nearest neighbor graph constructed from regional CNN representations of dataset images. The descriptors derived from the salient regions improve particular object retrieval, most noticeably in large collections containing small objects.

Asymmetric Feature Maps with Application to Sketch Based Retrieval

  • DOI: 10.1109/CVPR.2017.655
  • Link: https://doi.org/10.1109/CVPR.2017.655
  • Department: Visual Recognition Group
  • Annotation:
    We propose a novel concept of asymmetric feature maps (AFM), which allows multiple kernels between a query and database entries to be evaluated without increasing the memory requirements. To demonstrate the advantages of the AFM method, we derive a short vector image representation that, due to asymmetric feature maps, supports efficient scale and translation invariant sketch-based image retrieval. Unlike most of the short-code based retrieval systems, the proposed method provides the query localization in the retrieved image. The efficiency of the search is boosted by approximating a 2D translation search via a trigonometric polynomial of scores by 1D projections. The projections are a special case of AFM. An order of magnitude speed-up is achieved compared to traditional trigonometric polynomials. The results are boosted by an image-based average query expansion, significantly exceeding the state of the art on standard benchmarks.

Efficient Diffusion on Region Manifolds: Recovering Small Objects with Compact CNN Representations

  • Authors: Iscen, A., doc. Georgios Tolias, Ph.D., Avrithis, Y., Furon, T., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: CVPR 2017: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society Press, 2017. p. 926-935. ISSN 1063-6919. ISBN 978-1-5386-0457-1.
  • Year: 2017
  • DOI: 10.1109/CVPR.2017.105
  • Link: https://doi.org/10.1109/CVPR.2017.105
  • Department: Visual Recognition Group
  • Annotation:
    Query expansion is a popular method to improve the quality of image retrieval with both conventional and CNN representations. It has been so far limited to global image similarity. This work focuses on diffusion, a mechanism that captures the image manifold in the feature space. The diffusion is carried out on descriptors of overlapping image regions rather than on a global image descriptor like in previous approaches. An efficient off-line stage allows optional reduction in the number of stored regions. In the on-line stage, the proposed handling of unseen queries in the indexing stage removes additional computation to adjust the precomputed data. We perform diffusion through a sparse linear system solver, yielding practical query times well below one second. Experimentally, we observe a significant boost in performance of image retrieval with compact CNN descriptors on standard benchmarks, especially when the query object covers only a small part of the image. Small objects have been a common failure case of CNN-based retrieval.

In the Saddle: Chasing fast and repeatable features

  • DOI: 10.1109/ICPR.2016.7899712
  • Link: https://doi.org/10.1109/ICPR.2016.7899712
  • Department: Visual Recognition Group
  • Annotation:
    We present a novel similarity-covariant feature detector that extracts points whose neighborhoods, when treated as a 3D intensity surface, have a saddle-like intensity profile. The saddle condition is verified efficiently by intensity comparisons on two concentric rings that must have exactly two dark-to-bright and two bright-to-dark transitions satisfying certain geometric constraints. Experiments show that the Saddle features are general, evenly spread and appear in high density in a range of images. The Saddle detector is among the fastest proposed. In comparison with detectors of similar speed, the Saddle features show superior matching performance on a number of challenging datasets.

Multiple-Kernel Local-Patch Descriptor

  • DOI: 10.5244/C.31.184
  • Link: https://doi.org/10.5244/C.31.184
  • Department: Visual Recognition Group
  • Annotation:
    We propose a multiple-kernel local-patch descriptor based on efficient match kernels of patch gradients. It combines two parametrizations of gradient position and direction; each parametrization provides robustness to a different type of patch misregistration: polar parametrization for noise in the patch dominant orientation detection, Cartesian for imprecise location of the feature point. Even though handcrafted, the proposed method consistently outperforms the state-of-the-art methods on two local patch benchmarks.

Optimizing explicit feature maps on intervals

  • DOI: 10.1016/j.imavis.2017.07.001
  • Link: https://doi.org/10.1016/j.imavis.2017.07.001
  • Department: Visual Recognition Group
  • Annotation:
    Approximating non-linear kernels by finite-dimensional feature maps is a popular approach for accelerating training and evaluation of support vector machines or to encode information into efficient match kernels. We propose a novel method of data independent construction of low-dimensional feature maps. The problem is formulated as a linear program that jointly considers two competing objectives: the quality of the approximation and the dimensionality of the feature map.

Panorama to panorama matching for location recognition

  • DOI: 10.1145/3078971.3079033
  • Link: https://doi.org/10.1145/3078971.3079033
  • Department: Visual Recognition Group
  • Annotation:
    Location recognition is commonly treated as visual instance retrieval on “street view” imagery. The dataset items and queries are panoramic views, i.e. groups of images taken at a single location. This work introduces a novel panorama-to-panorama matching process, either by aggregating features of individual images in a group or by explicitly constructing a larger panorama. In either case, multiple views are used as queries. We reach near perfect location recognition on a standard benchmark with only four query views.

Robust data whitening as an iteratively re-weighted least squares problem

  • DOI: 10.1007/978-3-319-59126-1_20
  • Link: https://doi.org/10.1007/978-3-319-59126-1_20
  • Department: Visual Recognition Group
  • Annotation:
    The entries of high-dimensional measurements, such as image or feature descriptors, are often correlated, which leads to a bias in similarity estimation. To remove the correlation, a linear transformation, called whitening, is commonly used. In this work, we analyze robust estimation of the whitening transformation in the presence of outliers. Inspired by the Iteratively Re-weighted Least Squares approach, we iterate between centering and applying a transformation matrix, a process which is shown to converge to a solution that minimizes the sum of ℓ2 norms. The approach is developed for unsupervised scenarios, but is further extended to supervised cases. We demonstrate the robustness of our method to outliers on synthetic 2D data and also show improvements compared to conventional whitening on real data for image retrieval with CNN-based representation. Finally, our robust estimation is not limited to data whitening, but can be used for robust patch rectification, e.g. with MSER features.

CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples

  • Authors: Radenović, F., doc. Georgios Tolias, Ph.D., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: Computer Vision – ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I. Springer, 2016. p. 3-20. Lecture Notes in Computer Science. vol. 9905. ISSN 0302-9743. ISBN 978-3-319-46447-3.
  • Year: 2016
  • DOI: 10.1007/978-3-319-46448-0_1
  • Link: https://doi.org/10.1007/978-3-319-46448-0_1
  • Department: Visual Recognition Group
  • Annotation:
    Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in many computer vision tasks. However, this achievement is preceded by extreme manual annotation in order to perform either training from scratch or fine-tuning for the target task. In this work, we propose to fine-tune CNN for image retrieval from a large collection of unordered images in a fully automated manner. We employ state-of-the-art retrieval and Structure-from-Motion (SfM) methods to obtain 3D models, which are used to guide the selection of the training data for CNN fine-tuning. We show that both hard positive and hard negative examples enhance the final performance in particular object retrieval with compact codes.

Coplanar Repeats by Energy Minimization

  • DOI: 10.5244/C.30.107
  • Link: https://doi.org/10.5244/C.30.107
  • Department: Visual Recognition Group
  • Annotation:
    This paper proposes an automated method to detect, group and rectify arbitrarily arranged coplanar repeated elements via energy minimization. The proposed energy functional combines several features that model how planes with coplanar repeats are projected into images and captures global interactions between different coplanar repeat groups and scene planes. An inference framework based on a recent variant of α-expansion is described and fast convergence is demonstrated. We compare the proposed method to two widely-used geometric multi-model fitting methods using a new dataset of annotated images containing multiple scene planes with coplanar repeats in varied arrangements. The evaluation shows a significant improvement in the accuracy of rectifications computed from coplanar repeats detected with the proposed method versus those detected with the baseline methods.

From Dusk till Dawn: Modeling in the Dark

  • Authors: Radenović, F., Schönberger, J. L., Ji, D., Frahm, J., prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publication: CVPR 2016: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2016. p. 5488-5496. ISSN 1063-6919. ISBN 978-1-4673-8851-1.
  • Year: 2016
  • DOI: 10.1109/CVPR.2016.592
  • Link: https://doi.org/10.1109/CVPR.2016.592
  • Department: Visual Recognition Group
  • Annotation:
    Internet photo collections naturally contain a large variety of illumination conditions, with the largest difference between day and night images. Current modeling techniques do not embrace the broad illumination range often leading to reconstruction failure or severe artifacts. We present an algorithm that leverages the appearance variety to obtain more complete and accurate scene geometry along with consistent multi-illumination appearance information. The proposed method relies on automatic scene appearance grouping, which is used to obtain separate dense 3D models. Subsequent model fusion combines the separate models into a complete and accurate reconstruction of the scene. In addition, we propose a method to derive the appearance information for the model under the different illumination conditions, even for scene parts that are not observed under one illumination condition. To achieve this, we develop a cross-illumination color transfer technique. We evaluate our method on a large variety of landmarks from across Europe reconstructed from a database of 7.4M images.

Camera Elevation Estimation from a Single Mountain Landscape Photograph

  • Authors: Čadík, M., Vašíček, J., Hradiš, M., Radenović, F., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: Proceedings of the British Machine Vision Conference (BMVC). London: British Machine Vision Association, 2015. ISBN 978-1-901725-53-7.
  • Year: 2015
  • DOI: 10.5244/C.29.30
  • Link: https://doi.org/10.5244/C.29.30
  • Department: Department of Cybernetics
  • Annotation:
    This work addresses the problem of camera elevation estimation from a single photograph in an outdoor environment. We introduce a new benchmark dataset of one-hundred thousand images with annotated camera elevation called Alps100K. We propose and experimentally evaluate two automatic data-driven approaches to camera elevation estimation: one based on convolutional neural networks, the other on local features. To compare the proposed methods to human performance, an experiment with 100 subjects is conducted. The experimental results show that both proposed approaches outperform humans and that the best result is achieved by their combination.

Efficient Image Detail Mining

  • Authors: Mikulík, A., Radenović, F., prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publication: ACCV 2014: Proceedings of the 12th Asian Conference on Computer Vision, Part II. Cham: Springer, 2015. p. 118-132. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-319-16807-4.
  • Year: 2015
  • DOI: 10.1007/978-3-319-16808-1_9
  • Link: https://doi.org/10.1007/978-3-319-16808-1_9
  • Department: Department of Cybernetics
  • Annotation:
    Two novel problems straddling the boundary between image retrieval and data mining are formulated: for every pixel in the query image, (i) find the database image with the maximum resolution depicting the pixel and (ii) find the frequency with which it is photographed in detail. An efficient and reliable solution for both problems is proposed based on two novel techniques, the hierarchical query expansion that exploits the document at a time (DAAT) inverted file and a geometric consistency verification sufficiently robust to prevent topic drift within a zooming search. Experiments show that the proposed method finds surprisingly fine details on landmarks, even those that are hardly noticeable for humans.

From Single Image Query to Detailed 3D Reconstruction

  • Authors: Schonberger, J., Radenović, F., prof. Mgr. Ondřej Chum, Ph.D., Frahm, J.
  • Publication: CVPR 2015: Proceedings of the 2015 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE Computer Society Press, 2015. p. 5126-5134. ISSN 1063-6919. ISBN 978-1-4673-6964-0.
  • Year: 2015
  • DOI: 10.1109/CVPR.2015.7299148
  • Link: https://doi.org/10.1109/CVPR.2015.7299148
  • Department: Department of Cybernetics
  • Annotation:
    Structure-from-Motion for unordered image collections has significantly advanced in scale over the last decade. This impressive progress can be in part attributed to the introduction of efficient retrieval methods for those systems. While this boosts scalability, it also limits the amount of detail that the large-scale reconstruction systems are able to produce. In this paper, we propose a joint reconstruction and retrieval system that maintains the scalability of large-scale Structure-from-Motion systems while also recovering the often lost ability of reconstructing fine details of the scene. We demonstrate our proposed method on a large-scale dataset of 7.4 million images downloaded from the Internet.

Low dimensional explicit feature maps

  • Authors: prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: 2015 IEEE International Conference on Computer Vision (ICCV 2015). Piscataway: IEEE, 2015. p. 4077-4085. ISSN 1550-5499. ISBN 978-1-4673-8391-2.
  • Year: 2015
  • DOI: 10.1109/ICCV.2015.464
  • Link: https://doi.org/10.1109/ICCV.2015.464
  • Department: Visual Recognition Group
  • Annotation:
    Approximating non-linear kernels by finite-dimensional feature maps is a popular approach for speeding up training and evaluation of support vector machines or to encode information into efficient match kernels. We propose a novel method of data independent construction of low dimensional feature maps. The problem is cast as a linear program which jointly considers competing objectives: the quality of the approximation and the dimensionality of the feature map. For both shift-invariant and homogeneous kernels the proposed method achieves better approximations at the same dimensionality or comparable approximations at lower dimensionality of the feature map compared with state-of-the-art methods.

Multiple Measurements and Joint Dimensionality Reduction for Large Scale Image Search with Short Vectors

  • Authors: Radenović, F., Jégou, H., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: ICMR 2015: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval. New York: ACM, 2015. p. 587-590. ISBN 978-1-4503-3274-3.
  • Year: 2015
  • DOI: 10.1145/2671188.2749366
  • Link: https://doi.org/10.1145/2671188.2749366
  • Department: Department of Cybernetics
  • Annotation:
    This paper addresses the construction of a short-vector (128D) image representation for large-scale image and particular object retrieval. In particular, the method of joint dimensionality reduction of multiple vocabularies is considered. We study a variety of vocabulary generation techniques: different k-means initializations, different descriptor transformations, different measurement regions for descriptor extraction. Our extensive evaluation shows that different combinations of vocabularies, each partitioning the descriptor space in a different yet complementary manner, result in a significant performance improvement, which exceeds the state-of-the-art.

Towards Visual Words to Words Text Detection with a General Bag of Words Representation

  • DOI: 10.1109/ICDAR.2015.7333840
  • Link: https://doi.org/10.1109/ICDAR.2015.7333840
  • Department: Department of Cybernetics
  • Annotation:
    We address the problem of text localization and retrieval in real world images. We are the first to study the retrieval of text images, i.e. the selection of images containing text in large collections at high speed. We propose a novel representation, textual visual words, which describe text by generic visual words that geometrically consistently predict bottom and top lines of text. The visual words are discretized SIFT descriptors of Hessian features. The features may correspond to various structures present in the text: character fragments, individual characters or their arrangements. The textual words representation is invariant to affine transformation of the image and local linear change of intensity. Experiments demonstrate that the proposed method outperforms the state-of-the-art on the MS dataset. The proposed method detects blurry, small font, low contrast, noisy text from real world images.

Detection, Rectification and Segmentation of Coplanar Repeated Patterns

  • DOI: 10.1109/CVPR.2014.380
  • Link: https://doi.org/10.1109/CVPR.2014.380
  • Department: Department of Cybernetics
  • Annotation:
    This paper presents a novel and general method for the detection, rectification and segmentation of imaged coplanar repeated patterns. The only assumption made of the scene geometry is that repeated scene elements are mapped to each other by planar Euclidean transformations. The class of patterns covered is broad and includes nearly all commonly seen, planar, man-made repeated patterns. In addition, novel linear constraints are used to reduce geometric ambiguity between the rectified imaged pattern and the scene pattern. Rectification to within a similarity of the scene plane is achieved from one rotated repeat, or to within a similarity with a scale ambiguity along the axis of symmetry from one reflected repeat. A stratum of constraints is derived that gives the necessary configuration of repeats for each successive level of rectification. A generative model for the imaged pattern is inferred and used to segment the pattern with pixel accuracy. Qualitative results are shown on a broad range of image types on which state-of-the-art methods fail.

Relevance Assessment for Visual Video Re-ranking

  • Authors: Aldana Iuit, J., prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publication: Image Analysis and Recognition: 11th International Conference (ICIAR 2014). Berlin: Springer-Verlag, 2014, pp. 421-430. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-319-11757-7. Available from: http://dx.doi.org/10.1007/978-3-319-11758-4_46
  • Year: 2014
  • DOI: 10.1007/978-3-319-11758-4_46
  • Link: https://doi.org/10.1007/978-3-319-11758-4_46
  • Department: Department of Cybernetics
  • Annotation:
    The following problem is considered: Given a name or phrase specifying an object, collect images and videos from the internet possibly depicting the object using a textual query on their name or annotation. A visual model from the images is built and used to rank the videos by relevance to the object of interest. Shot relevance is defined as the duration of the visibility of the object of interest. The model is based on local image features. The relevant shot detection builds on wide baseline stereo matching. The method is tested on 10 text phrases corresponding to 10 landmarks. The pool of 100 videos collected by querying YouTube includes seven relevant videos for each landmark. The implementation runs faster than real-time at 208 frames per second. Averaged over the set of landmarks, at recall 0.95 the method has a mean precision of 0.65 and a mean Average Precision (mAP) of 0.92.

Approximate Models for Fast and Accurate Epipolar Geometry Estimation

  • DOI: 10.1109/IVCNZ.2013.6727000
  • Link: https://doi.org/10.1109/IVCNZ.2013.6727000
  • Department: Department of Cybernetics
  • Annotation:
    This paper investigates the plausibility of using approximate models for hypothesis generation in a RANSAC framework to accurately and reliably estimate the fundamental matrix. Two novel fundamental matrix estimators are introduced that sample two correspondences to generate affine-fundamental matrices for RANSAC hypotheses. A new RANSAC framework is presented that uses local optimization to estimate the fundamental matrix from the consensus correspondence sets of verified hypotheses, which are approximate models. The proposed estimators are shown to perform better than other approximate models that have previously been used in the literature for fundamental matrix estimation in a rigorous evaluation. In addition, the proposed estimators are over 30 times faster, in terms of models verified, than the 7-point method, and offer comparable accuracy and repeatability on a large subset of the test set.

Image Retrieval for Online Browsing in Large Image Collections

  • DOI: 10.1007/978-3-642-41062-8_2
  • Link: https://doi.org/10.1007/978-3-642-41062-8_2
  • Department: Department of Cybernetics
  • Annotation:
    Two new methods for large scale image retrieval are proposed, showing that the classical ranking of images based on similarity addresses only one of the possible user requirements. The novel retrieval methods add zoom-in and zoom-out capabilities and answer the 'What is this?' and 'Where is this?' questions. The functionality is obtained by modifying the scoring and ranking functions of a standard bag-of-words image retrieval pipeline. We show the importance of the DAAT scoring and query expansion for recall of zoomed images. The proposed methods were tested on a standard large annotated image dataset together with images of Sagrada Familia and 100000 image confusers downloaded from Flickr. For completeness, we present in detail the components of image retrieval pipelines in state-of-the-art systems. Finally, open problems related to zoom-in and zoom-out queries are discussed.

Learning Vocabularies over a Fine Quantization

  • DOI: 10.1007/s11263-012-0600-1
  • Link: https://doi.org/10.1007/s11263-012-0600-1
  • Department: Department of Cybernetics
  • Annotation:
    A novel similarity measure for bag-of-words type large scale image retrieval is presented. The similarity function is learned in an unsupervised manner, requires no extra space over the standard bag-of-words method and is more discriminative than both L2-based soft assignment and Hamming embedding. The novel similarity function achieves mean average precision that is superior to any result published in the literature on the standard Oxford 5k, Oxford 105k and Paris datasets/protocols. We study the effect of a fine quantization and very large vocabularies (up to 64 million words) and show that the performance of specific object retrieval increases with the size of the vocabulary. This observation is in contradiction with previously published methods. We further demonstrate that the large vocabularies increase the speed of the tf-idf scoring step.

USAC: A Universal Framework for Random Sample Consensus

  • DOI: 10.1109/TPAMI.2012.257
  • Link: https://doi.org/10.1109/TPAMI.2012.257
  • Department: Department of Cybernetics
  • Annotation:
    A computational problem that arises frequently in computer vision is that of estimating the parameters of a model from data that have been contaminated by noise and outliers. More generally, any practical system that seeks to estimate quantities from noisy data measurements must have at its core some means of dealing with data contamination. The random sample consensus (RANSAC) algorithm is one of the most popular tools for robust estimation. Recent years have seen an explosion of activity in this area, leading to the development of a number of techniques that improve upon the efficiency and robustness of the basic RANSAC algorithm. In this paper, we present a comprehensive overview of recent research in RANSAC-based robust estimation by analyzing and comparing various approaches that have been explored over the years. We provide a common context for this analysis by introducing a new framework for robust estimation, which we call Universal RANSAC (USAC). USAC extends the simple hypothesize-and-verify structure of standard RANSAC to incorporate a number of important practical and computational considerations. In addition, we provide a general-purpose C++ software library that implements the USAC framework by leveraging state-of-the-art algorithms for the various modules. This implementation thus addresses many of the limitations of standard RANSAC within a single unified package. We benchmark the performance of the algorithm on a large collection of estimation problems. The implementation we provide can be used by researchers either as a stand-alone tool for robust estimation or as a benchmark for evaluating new techniques.
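
The hypothesize-and-verify structure of standard RANSAC that the annotation refers to can be illustrated with a small 2D line-fitting sketch. This is a textbook version under stated assumptions, not the USAC library or its API; the function names, the default threshold and the fixed iteration count are hypothetical choices for illustration.

```python
import random

def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0 and unit normal."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:                      # degenerate sample: identical points
        return None
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def ransac_line(points, threshold=0.1, iterations=200, seed=0):
    """Return the line with the largest consensus set and that set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        p, q = rng.sample(points, 2)     # hypothesize from a minimal sample
        model = fit_line(p, q)
        if model is None:
            continue
        a, b, c = model
        # verify: count points within `threshold` of the hypothesized line
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

A practical estimator would typically replace the fixed iteration count with an adaptive stopping criterion and refit the model to the final consensus set.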

Fast Computation of min-Hash Signatures for Image Collections

  • Authors: prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publication: CVPR 2012: Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York: IEEE Computer Society Press, 2012, pp. 3077-3084. ISSN 1063-6919. ISBN 978-1-4673-1228-8.
  • Year: 2012
  • DOI: 10.1109/CVPR.2012.6248039
  • Link: https://doi.org/10.1109/CVPR.2012.6248039
  • Department: Department of Cybernetics
  • Annotation:
    A new method for highly efficient min-Hash generation for document collections is proposed. It exploits the inverted file structure which is available in many applications based on a bag or a set of words. Fast min-Hash generation is important in applications such as image clustering where good recall and precision requires a large number of min-Hash signatures.
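
For readers unfamiliar with min-Hash, the following sketch shows the plain, document-by-document signature computation for sets of visual-word ids. It does not implement the inverted-file speed-up proposed in the paper; the linear hash family, the prime modulus and all function names are assumptions made for illustration.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1; word ids are assumed to be smaller than this

def make_hash_functions(n, seed=0):
    """Draw n random linear hash functions h(w) = (a*w + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]

def minhash_signature(words, hash_functions):
    """One min-Hash per hash function; `words` must be a non-empty set of ids."""
    return [min((a * w + b) % PRIME for w in words) for a, b in hash_functions]

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing min-Hashes, an estimate of the set overlap."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)
```

Identical sets always produce identical signatures, and the fraction of agreeing entries approximates the Jaccard similarity, which is why many signatures are needed for good recall and precision.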

Fixing the Locally Optimized RANSAC

  • DOI: 10.5244/C.26.95
  • Link: https://doi.org/10.5244/C.26.95
  • Department: Department of Cybernetics
  • Annotation:
    The paper revisits the problem of local optimization for RANSAC. Improvements of the LO-RANSAC procedure are proposed: the use of a truncated quadratic cost function and the introduction of a limit on the number of inliers used for the least squares computation. Several implementation issues are also addressed. The implementation is made publicly available.
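
A truncated quadratic cost is commonly written as the sum of min(r², θ²) over the residuals r, with θ the inlier threshold. The sketch below assumes this common form; whether it matches the exact cost used in the paper is not stated in the annotation, and the function name is hypothetical.

```python
def truncated_quadratic_cost(residuals, threshold):
    """Sum of squared residuals, each capped at threshold squared.

    Inliers contribute their squared error; every outlier contributes
    the same constant, so gross errors cannot dominate the score.
    """
    cap = threshold * threshold
    return sum(min(r * r, cap) for r in residuals)
```

Compared with plain inlier counting, this score also distinguishes between models that have the same number of inliers but fit them with different accuracy.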

Homography Estimation from Correspondences of Local Elliptical Features

  • Department: Department of Cybernetics
  • Annotation:
    We propose a novel unified approach for homography estimation from two or more correspondences of local elliptical features. The method finds a homography defined by first-order Taylor expansions at two (or more) points. The approximations are affine transformations that are constrained by the ellipse-to-ellipse correspondences. Unlike methods based on projective invariants of conics, the proposed method generates only a single homography model per pair of ellipse correspondences. We show experimentally that the proposed method generates models of precision comparable to or better than the state-of-the-art at lower computational costs.

Negative Evidences and Co-occurences in Image Retrieval: The Benefit of PCA and Whitening

  • Authors: Jégou, H., prof. Mgr. Ondřej Chum, Ph.D.
  • Publication: Computer Vision - ECCV 2012. Heidelberg: Springer, 2012. p. 774-787. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-33708-6.
  • Year: 2012
  • DOI: 10.1007/978-3-642-33709-3_55
  • Link: https://doi.org/10.1007/978-3-642-33709-3_55
  • Department: Department of Cybernetics
  • Annotation:
    The paper addresses large scale image retrieval with short vector representations. We study dimensionality reduction by Principal Component Analysis (PCA) and propose improvements to its different phases. We show and explicitly exploit relations between i) mean subtraction and the negative evidence, i.e., a visual word that is mutually missing in two descriptions being compared, and ii) the axis de-correlation and the co-occurrences phenomenon. Finally, we propose an effective way to alleviate the quantization artifacts through a joint dimensionality reduction of multiple vocabularies. The proposed techniques are simple, yet significantly and consistently improve over the state of the art on compact image representations. Complementary experiments in image classification show that the methods are generally applicable.

Planar Affine Rectification from Change of Scale

  • Department: Department of Cybernetics
  • Annotation:
    A method for affine rectification of a plane exploiting knowledge of relative scale changes is presented. The rectifying transformation is fully specified by the relative scale change at three non-collinear points or by two pairs of points where the relative scale change is known; the relative scale change between the pairs is not required. The method also allows homography estimation between two views of a planar scene from three point-with-scale correspondences. The proposed method is simple to implement and without parameters; linear and thus supporting (algebraic) least squares solutions; and general, without restrictions on either the shape of the corresponding features or their mutual position. The wide applicability of the method is demonstrated on text rectification, detection of repetitive patterns, texture normalization and estimation of homography from three point-with-scale correspondences.

Total Recall II: Query Expansion Revisited

  • Autoři: prof. Mgr. Ondřej Chum, Ph.D., Mikulík, A., Perďoch, M., prof. Ing. Jiří Matas, Ph.D.
  • Publikace: CVPR 2011: Proceedings of the 2011 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2011. pp. 889-896. IEEE Conference on Computer Vision and Pattern Recognition. ISSN 1063-6919. ISBN 978-1-4577-0393-5.
  • Rok: 2011
  • DOI: 10.1109/CVPR.2011.5995601
  • Odkaz: https://doi.org/10.1109/CVPR.2011.5995601
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Most effective particular object and image retrieval approaches are based on the bag-of-words (BoW) model. All state-of-the-art retrieval results have been achieved by methods that include a query expansion that brings a significant boost in performance. We introduce three modifications to automatic query expansion: (i) a method capable of preventing query expansion failure caused by the presence of confusers, (ii) an improved spatial verification and re-ranking step that incrementally builds a statistical model of the query object, and (iii) learning of relevant spatial context to boost retrieval performance. The three improvements of query expansion were evaluated on established Paris and Oxford datasets according to a standard protocol, and state-of-the-art results were achieved.

Construction of Precise Local Affine Frames

  • Autoři: Mikulík, A., prof. Ing. Jiří Matas, Ph.D., Perďoch, M., prof. Mgr. Ondřej Chum, Ph.D.
  • Publikace: ICPR'2010: Proceedings of the 20th International Conference on Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2010, pp. 3565-3569. ISSN 1051-4651. ISBN 978-0-7695-4109-9.
  • Rok: 2010
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a novel method for the refinement of Maximally Stable Extremal Region (MSER) boundaries to sub-pixel precision by taking into account the intensity function in the 2x2 neighborhood of the contour points. The proposed method improves the repeatability and precision of Local Affine Frames (LAFs) constructed on extremal regions. Additionally, we propose a novel method for detection of local curvature extrema on the refined contour. Experimental evaluation on publicly available datasets shows that matching with the modified LAFs leads to a higher number of correspondences and a higher inlier ratio in more than 80% of the test image pairs. Since the processing time of the contour refinement is negligible, there is no reason not to include the algorithms as a standard part of the MSER detector and LAF constructions.

Image Matching and Retrieval by Repetitive Patterns

  • Autoři: Doubek, P., prof. Ing. Jiří Matas, Ph.D., Perďoch, M., prof. Mgr. Ondřej Chum, Ph.D.
  • Publikace: ICPR'2010: Proceedings of the 20th International Conference on Pattern Recognition. Los Alamitos: IEEE Computer Society Press, 2010, pp. 3195-3198. ISSN 1051-4651. ISBN 978-0-7695-4109-9.
  • Rok: 2010
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Detection of repetitive patterns in images has been studied for a long time in computer vision. This paper discusses a method for representing a lattice or line pattern by a shift-invariant descriptor of the repeating element. The descriptor overcomes shift ambiguity and can be matched between different views. The pattern matching is then demonstrated in a retrieval experiment, where different images of the same buildings are retrieved solely by repetitive patterns.

Large Scale Discovery of Spatially Related Images

  • DOI: 10.1109/TPAMI.2009.166
  • Odkaz: https://doi.org/10.1109/TPAMI.2009.166
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a randomized data mining method that finds clusters of spatially overlapping images. The core of the method relies on the min-Hash algorithm for fast detection of pairs of images with spatial overlap, the so-called cluster seeds. The seeds are then used as visual queries to obtain clusters which are formed as transitive closures of sets of partially overlapping images that include the seed. We show that the probability of finding a seed for an image cluster rapidly increases with the size of the cluster.
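
As a rough illustration of the min-Hash idea that the abstract relies on, the sketch below estimates the overlap of two sets of visual-word ids from the fraction of min-hashes on which they agree. The affine hash family, the number of hash functions, and all function names are assumptions made for this example; they are not taken from the paper.

```python
import random

def make_hash_funcs(num_hashes, seed=0):
    """Return random affine hashes h(x) = (a*x + b) mod p (an assumed hash family)."""
    rng = random.Random(seed)
    p = 2_147_483_647  # large prime, 2**31 - 1
    params = [(rng.randrange(1, p), rng.randrange(0, p)) for _ in range(num_hashes)]
    return [lambda x, a=a, b=b: (a * x + b) % p for a, b in params]

def minhash_sketch(visual_words, hash_funcs):
    """Describe a set by its minimum hash value under each hash function."""
    return [min(h(w) for w in visual_words) for h in hash_funcs]

def estimated_overlap(sketch_a, sketch_b):
    """Fraction of agreeing min-hashes; approximates |A & B| / |A | B|."""
    matches = sum(1 for x, y in zip(sketch_a, sketch_b) if x == y)
    return matches / len(sketch_a)
```

Two images whose sketches agree on many positions are likely to share a large fraction of visual words, so such pairs are natural candidates for the cluster seeds mentioned above.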

Learning a Fine Vocabulary

  • Autoři: Mikulík, A., Perďoch, M., prof. Mgr. Ondřej Chum, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publikace: Computer Vision - ECCV 2010, 11th European Conference on Computer Vision, Proceedings, Part III. Heidelberg: Springer, 2010. pp. 1-14. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-15557-4.
  • Rok: 2010
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We present a novel similarity measure for bag-of-words type large scale image retrieval. The similarity function is learned in an unsupervised manner, requires no extra space over the standard bag-of-words method and is more discriminative than both L2-based soft assignment and Hamming embedding. Experimentally we show that the novel similarity function achieves mean average precision that is superior to any result published in the literature on the standard Oxford 105k dataset/protocol. At the same time, retrieval with the proposed similarity function is faster than the reference method.

Unsupervised Discovery of Co-occurrence in Sparse High Dimensional Data

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    An efficient min-Hash based algorithm for discovery of dependencies in sparse high-dimensional data is presented. The dependencies are represented by sets of features co-occurring with high probability and are called co-ocsets. Sparse high dimensional descriptors, such as bag of words, have been proven very effective in the domain of image retrieval. To maintain high efficiency even for very large data collections, features are assumed independent. We show experimentally that co-ocsets are not rare, i.e. the independence assumption is often violated, and that they may ruin retrieval performance if present in the query image. Two methods for managing co-ocsets in such cases are proposed. Both methods significantly outperform the state-of-the-art in image retrieval, one is also significantly faster.

Efficient Representation of Local Geometry for Large Scale Object Retrieval

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    State of the art methods for image and object retrieval exploit both appearance (via visual words) and local geometry (spatial extent, relative pose). In large scale problems, memory becomes a limiting factor - local geometry is stored for each feature detected in each image and requires storage larger than the inverted file and term frequency and inverted document frequency weights together. We propose a novel method for learning discretized local geometry representation based on minimization of average reprojection error in the space of ellipses. The representation requires only 24 bits per feature without drop in performance. Additionally, we show that if the gravity vector assumption is used consistently from the feature description to spatial verification, it improves retrieval performance and decreases the memory footprint. The proposed method outperforms state of the art retrieval algorithms in a standard image retrieval benchmark.

Geometric min-Hashing: Finding a (thick) needle in a haystack

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a novel hashing scheme for image retrieval, clustering and automatic object discovery. Unlike commonly used bag-of-words approaches, the spatial extent of image features is exploited in our method. The geometric information is used both to construct repeatable hash keys and to increase the discriminability of the description. Each hash key combines visual appearance (visual words) with semi-local geometric information. Compared with the state-of-the-art min-hash, the proposed method has both higher recall (probability of collision for hashes on the same object) and lower false positive rates (random collisions). The advantages of geometric min-hashing approach are most pronounced in the presence of viewpoint and scale change, significant occlusion or small physical overlap of the viewing fields. We demonstrate the power of the proposed method on small object discovery in a large unordered collection of images and on a large scale image clustering problem.

Lost in Quantization: Improving Particular Object Retrieval in Large Scale Image Databases

  • Autoři: Philbin, J., prof. Mgr. Ondřej Chum, Ph.D., Isard, M., Šivic, J., Zisserman, A.
  • Publikace: CVPR 2008: Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Madison: Omnipress, 2008. p. 8. ISSN 1063-6919. ISBN 978-1-4244-2242-5.
  • Rok: 2008
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The state of the art in visual object retrieval from large databases is achieved by systems that are inspired by text retrieval. A key component of these approaches is that local regions of images are characterized using high-dimensional descriptors which are then mapped to "visual words" selected from a discrete vocabulary. This paper explores techniques to map each visual region to a weighted set of words, allowing the inclusion of features which were lost in the quantization stage of previous systems. The set of visual words is obtained by selecting words based on proximity in descriptor space. We describe how this representation may be incorporated into a standard tf-idf architecture, and how spatial verification is modified in the case of this soft-assignment.

Near Duplicate Image Detection: min-Hash and tf-idf Weighting

  • Autoři: prof. Mgr. Ondřej Chum, Ph.D., Philbin, J., Zisserman, A.
  • Publikace: BMVC 2008: Proceedings of the 19th British Machine Vision Conference. London: British Machine Vision Association, 2008. p. 493-502. ISBN 978-1-901725-36-0.
  • Rok: 2008
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    This paper proposes two novel image similarity measures for fast indexing via locality sensitive hashing. The similarity measures are applied and evaluated in the context of near duplicate image detection. The proposed method uses a visual vocabulary of vector quantized local feature descriptors (SIFT) and for retrieval exploits enhanced min-Hash techniques. Standard min-Hash uses an approximate set intersection between document descriptors as a similarity measure. We propose an efficient way of exploiting more sophisticated similarity measures that have proven to be essential in image / particular object retrieval. The proposed similarity measures do not require extra computational effort compared to the original measure. We focus primarily on scalability to very large image and video databases, where fast query processing is necessary. The method requires only a small amount of data to be stored for each image. We demonstrate our method on the TrecVid 2006 data set which c

Optimal Randomized RANSAC

  • DOI: 10.1109/TPAMI.2007.70787
  • Odkaz: https://doi.org/10.1109/TPAMI.2007.70787
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A randomized model verification strategy for RANSAC is presented. The proposed method finds, like RANSAC, a solution that is optimal with user-specified probability. The solution is found in time that is close to the shortest possible and superior to any deterministic verification strategy. A provably fastest model verification strategy is designed for the (theoretical) situation when the contamination of data by outliers is known. In this case, the algorithm is the fastest possible (on the average) of all randomized RANSAC algorithms guaranteeing a confidence in the solution. The derivation of the optimality property is based on Wald's theory of sequential decision making, in particular, a modified sequential probability ratio test (SPRT). Next, the R-RANSAC with SPRT algorithm is introduced. The algorithm removes the requirement for a priori knowledge of the fraction of outliers and estimates the quantity online. We show experimentally that on standard test data, the method has perf

3D Geometry from Uncalibrated Images

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We present an automatic pipeline for recovering the geometry of a 3D scene from a set of unordered, uncalibrated images. The contributions in the paper are the presentation of the system as a whole, from images to geometry, the estimation of the local scale for various scene components in the orientation-topology module, the procedure for orienting the cloud components, and the method for dealing with points of contact. The methods are aimed to process complex scenes and nonuniformly sampled, noisy data sets.

Epipolar Geometry from Two Correspondences

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The paper takes the approach of epipolar geometry from three correspondences to an extreme.

Geometric Hashing with Local Affine Frames

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a novel representation of local image structure and a matching scheme that are insensitive to a wide range of appearance changes. The representation is a collection of local affine frames that are constructed on outer boundaries of maximally stable extremal regions (MSERs) in an affine-covariant way. Each local affine frame is described by a relative location of other local affine frames in its neighborhood. The image is thus represented by quantities that depend only on the location of the boundaries of MSERs. Inter-image correspondences between local affine frames are formed in constant time by geometric hashing. Direct detection of local affine frames removes the requirement of point-based hashing to establish reference frames in a combinatorial way, which has in the case of affine transform complexity that is cubic in the number of points. Local affine frames, which are also the quantities represented in the hash table, occupy a 6D space and hence data collisions

Effective Use of Pattern Recognition Method for Composition of Structure Microphotographs

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Detailed knowledge of material structure is necessary in advanced applications. Commercial systems use systematic scanning of small patches of structure cuts. Since an exact geometric and photometric relation of the scanned images is not known, composition of the overall image is non-trivial. Two independent methods exploiting image overlaps for precise registration are proposed and evaluated. The first method is based on robust matching of maximally stable extremal regions, the second one compares image columns. Both methods show comparable performance.

Matching with PROSAC - Progressive Sample Consensus

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A new robust matching method is proposed. The Progressive Sample Consensus (PROSAC) algorithm exploits the linear ordering defined on the set of correspondences by a similarity function used in establishing tentative correspondences. Unlike RANSAC, which treats all correspondences equally and draws random samples uniformly from the full set, PROSAC samples are drawn from progressively larger sets of top-ranked correspondences. Under the mild assumption that the similarity measure predicts correctness of a match better than random guessing, we show that PROSAC achieves large computational savings. Experiments demonstrate it is often significantly faster (up to more than a hundred times) than RANSAC. For the derived size of the sampled set of correspondences as a function of the number of samples already drawn, PROSAC converges towards RANSAC in the worst case. The power of the method is demonstrated on wide-baseline matching problems.
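
The sampling idea described above, drawing minimal samples from progressively larger sets of top-ranked correspondences, can be sketched roughly as follows. The linear growth schedule and the function name `progressive_samples` are simplifying assumptions made for this illustration; the paper derives its own growth function, which is not reproduced here.

```python
import random

def progressive_samples(ranked, sample_size, num_samples, seed=0):
    """Yield minimal samples drawn from progressively larger prefixes of `ranked`.

    `ranked` must be ordered from the most to the least similar correspondence.
    The linear growth of the prefix is an assumption of this sketch, not the
    growth function derived in the paper.
    """
    if sample_size > len(ranked):
        raise ValueError("sample_size exceeds the number of correspondences")
    rng = random.Random(seed)
    total = len(ranked)
    for t in range(num_samples):
        # The prefix grows from sample_size (first draw) to the full set (last draw).
        frac = t / max(1, num_samples - 1)
        prefix = sample_size + round(frac * (total - sample_size))
        yield rng.sample(ranked[:prefix], sample_size)
```

Early samples come only from the best-ranked correspondences; by the last draw the whole set is used, which mirrors the statement that the procedure approaches uniform RANSAC sampling in the worst case.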

Optimal Randomised RANSAC

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A randomized model verification strategy for RANSAC is presented. The proposed method finds, like RANSAC, a solution that is optimal with user-controllable probability. A provably optimal model verification strategy is designed for the situation when the contamination of data by outliers is known, i.e., the algorithm is the fastest possible (on average) of all randomized RANSAC algorithms guaranteeing a given confidence in the solution. The derivation of the optimality property is based on Wald's theory of sequential decision making. The R-RANSAC with SPRT, which does not require the a priori knowledge of the fraction of outliers and has results close to the optimal strategy, is introduced. We show experimentally that on standard test data the method is 2 to 10 times faster than the standard RANSAC and up to 4 times faster than previously published methods.

Randomized RANSAC with Sequential Probability Ratio Test

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A randomized model verification strategy for RANSAC is presented. The proposed method finds, like RANSAC, a solution that is optimal with user-controllable probability. A provably optimal model verification strategy is designed for the situation when the contamination of data by outliers is known, i.e. the algorithm is the fastest possible (on average) of all randomized RANSAC algorithms guaranteeing confidence in the solution. The derivation of the optimality property is based on Wald's theory of sequential decision making. The R-RANSAC with SPRT, which does not require the a priori knowledge of the fraction of outliers and has results close to the optimal strategy, is introduced. We show experimentally that on standard test data the method is 2 to 10 times faster than the standard RANSAC and up to 4 times faster than previously published methods.
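
A sequential probability ratio test for verifying a single hypothesis can be sketched as below. This is a minimal illustration under stated assumptions: `eps` and `delta` stand for the probabilities that a data point is consistent with a good and with a bad model, `threshold` is the decision threshold, and the function name is hypothetical. The abstract does not specify these details, and the online estimation of the outlier fraction is not shown.

```python
def sprt_accepts(consistency_flags, eps, delta, threshold):
    """Return (accepted, points_checked) for one model hypothesis.

    consistency_flags: iterable of booleans, True if a point fits the model.
    eps:       assumed probability that a point fits a good model.
    delta:     assumed probability that a point fits a bad model (delta < eps).
    threshold: decision threshold (> 1); the model is rejected as soon as the
               likelihood ratio exceeds it.
    """
    ratio = 1.0
    checked = 0
    for consistent in consistency_flags:
        checked += 1
        if consistent:
            ratio *= delta / eps                   # evidence for a good model
        else:
            ratio *= (1.0 - delta) / (1.0 - eps)   # evidence for a bad model
        if ratio > threshold:
            return False, checked                  # early rejection
    return True, checked
```

The saving comes from early rejection: a bad model is typically discarded after a handful of points rather than after a pass over the whole data set.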

The Geometric Error for Homographies

  • Autoři: prof. Mgr. Ondřej Chum, Ph.D., Pajdla, T., Sturm, P.
  • Publikace: Computer Vision and Image Understanding. 2005, 97(1), 86-102. ISSN 1077-3142.
  • Rok: 2005
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We address the problem of finding optimal point correspondences between images related by a homography: given a homography and a pair of matching points, determine a pair of points that are exactly consistent with the homography and that minimize the geometric distance to the given points. This problem is tightly linked to the triangulation problem, i.e., the optimal 3D reconstruction of points from image pairs. Our problem is non-linear and iterative optimization methods may fall into local minima. In this paper, we show how the problem can be reduced to the solution of a polynomial of degree eight in a single variable, which can be computed numerically. Local minima are thus explicitly modeled and can be avoided. An application where this method significantly improves reconstruction accuracy is discussed. Besides the general case of homographies, we also examine the case of affine transformations, and closely study the relationships between the geometric error and the commonly used S

Two-view Geometry Estimation Unaffected by a Dominant Plane

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A RANSAC-based algorithm for robust estimation of epipolar geometry from point correspondences in the possible presence of a dominant scene plane is presented. The algorithm handles scenes with (i) all points in a single plane, (ii) majority of points in a single plane and the rest off the plane, (iii) no dominant plane. It is not required to know a priori which of the cases (i) - (iii) occurs. The algorithm exploits a theorem we proved, that if five or more of seven correspondences are related by a homography then there is an epipolar geometry consistent with the seven-tuple as well as with all correspondences related by the homography. This means that a seven point sample consisting of two outliers and five inliers lying in a dominant plane produces an epipolar geometry which is completely wrong and yet consistent with a high number of correspondences. The theorem explains why RANSAC often fails to estimate epipolar geometry in the presence of a dominant plane. Rather surprisingly, t

Enhancing RANSAC by Generalized Model Optimization

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    An extension of the RANSAC procedure is proposed. By adding a generalized model optimization step (the LO step) applied only to models with a score (quality) better than all previous ones, an algorithm with the following desirable properties is obtained: a near perfect agreement with theoretical (i.e. optimal) performance and lower sensitivity to noise and poor conditioning. The chosen scheduling strategy is shown to guarantee that the optimization step is applied so rarely that it has minimal impact on the execution time.
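
The scheduling described above, running the optimization only when a new best model appears, can be sketched on a toy line-fitting problem. The model (a line y = a*x + b), the vertical-distance inlier test, the least-squares refit used as the optimization step, and all names are assumptions chosen for this illustration, not the paper's implementation.

```python
import random

def fit_line(points):
    """Least-squares fit of y = a*x + b; returns None for degenerate input."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    if abs(denom) < 1e-12:
        return None
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b

def inliers_of(model, points, tol):
    """Points whose vertical distance to the line is at most tol."""
    a, b = model
    return [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]

def lo_ransac_line(points, tol=0.1, iterations=200, seed=0):
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_line(rng.sample(points, 2))  # minimal sample: two points
        if model is None:
            continue
        inliers = inliers_of(model, points, tol)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
            # Optimization step: runs only when a new best model was found.
            refined = fit_line(inliers)
            if refined is not None:
                refined_inliers = inliers_of(refined, points, tol)
                if len(refined_inliers) > len(best_inliers):
                    best_model, best_inliers = refined, refined_inliers
    return best_model, best_inliers
```

Because the best-so-far score improves only rarely over the course of a run, the refit is executed a small number of times, which is the intuition behind the claim of minimal impact on execution time.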

Epipolar Geometry Estimation via RANSAC Benefits from the Oriented Epipolar Constraint

Randomized RANSAC with T_d,d test

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Many computer vision algorithms include a robust estimation step where model parameters are computed from a data set containing a significant proportion of outliers. The RANSAC algorithm is possibly the most widely used robust estimator in the field of computer vision. In the paper we show that under a broad range of conditions, RANSAC efficiency is significantly improved if its hypothesis evaluation step is randomized.
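
One way to picture a randomized evaluation step is a cheap pre-test on a few randomly chosen data points, followed by full verification only for hypotheses that pass. The sketch below assumes the pre-test requires all `d` sampled points to be consistent with the model; the function names and the default `d=1` are illustrative assumptions, not details stated in the abstract.

```python
import random

def passes_pretest(is_consistent, data, d, rng):
    """True only if all d randomly drawn points are consistent with the model."""
    return all(is_consistent(p) for p in rng.sample(data, d))

def randomized_support(is_consistent, data, d=1, rng=None):
    """Return the support of a hypothesis, or None if the pre-test rejected it."""
    rng = rng or random.Random(0)
    if not passes_pretest(is_consistent, data, d, rng):
        return None  # rejected cheaply, without scanning all data points
    return sum(1 for p in data if is_consistent(p))
```

Most hypotheses generated from contaminated samples fail the pre-test after checking only `d` points. The price is that a good model is occasionally rejected, so more hypotheses have to be drawn.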

Robust wide-baseline stereo from maximally stable extremal regions

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The wide-baseline stereo problem, i.e. the problem of establishing correspondences between a pair of images taken from different viewpoints, is studied.

Towards Complete Free-Form Reconstruction of Complex 3D Scenes from an Unordered Set of Uncalibrated Images

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    This paper describes a method for accurate dense reconstruction of a complex scene from a small set of high-resolution unorganized still images taken by a hand-held digital camera. A fully automatic data processing pipeline is proposed. Highly discriminative features are first detected in all images. Correspondences are then found in all image pairs by wide-baseline stereo matching and used in a scene structure and camera reconstruction step that can cope with occlusion and outliers. Image pairs suitable for dense matching are automatically selected, rectified and used in dense binocular matching. The dense point cloud obtained as the union of all pairwise reconstructions is fused by local approximation using oriented geometric primitives. For texturing, every primitive is mapped on the image with the best resolution.

Epipolar Geometry from Three Correspondences

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    In this paper, LO-RANSAC 3-LAF, a new algorithm for the correspondence problem, is described. Exploiting processes proposed for computation of affine-invariant local frames, three point-to-point correspondences are found for each region-to-region correspondence. Consequently, it is sufficient to select only triplets of region correspondences in the hypothesis stage of epipolar geometry estimation by RANSAC.

Joint Orientation of Epipoles

Locally optimized RANSAC

On the Interaction between Object Recognition and Colour Constancy

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    In this paper we investigate some aspects of the interaction between colour constancy and object recognition. We demonstrate that even under severe changes of illumination, many objects are reliably recognised if relying only on geometry and on invariant representation of local colour appearance. We feel that colour constancy as a preprocessing step of an object recognition algorithm is important only in cases when colour is the major (or the only available) clue for object discrimination. We also show that successful object recognition allows for "colour constancy by recognition" - an approach where the global photometric transformation is estimated from locally corresponding image patches.

Evaluating error of homography

  • Autoři: prof. Mgr. Ondřej Chum, Ph.D., Pajdla, T.
  • Publikace: Proceedings of the CVWW'02. Wien: Pattern Recognition & Image Processing Group, Vienna University of Technology, 2002, pp. 315-324.
  • Rok: 2002

Local Affine Frames for Wide-Baseline Stereo

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A novel procedure for establishing wide-baseline correspondence is introduced. Tentative correspondences are established by matching photometrically normalised colour measurements represented in a local affine frame. The affine frames are obtained by a number of affine invariant constructions on robustly detected maximally stable extremal regions of data-dependent shape. Several processes for local affine frame construction are proposed and proved affine covariant. The potential of the proposed approach is demonstrated on demanding wide-baseline matching problems. Correspondence between two views taken from different viewpoints and camera orientations as well as at very different scales is reliably established. For the scale change present (a factor more than 3), the zoomed-in image covers less than 10% of the wider view.

Randomized RANSAC

Randomized RANSAC with Td,d test

Robust Wide baseline Stereo from Maximally Stable Extremal Regions

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    The paper received the "Best Scientific Paper" award at the BMVC 02 conference.

Rotational Invariants for Wide-baseline Stereo

Responsible for this page: Ing. Mgr. Radovan Suk