
Ing. Jonáš Šerých

All publications

Dense Matchers for Dense Tracking

  • Department: Visual Recognition Group
  • Annotation:
    Optical flow is a useful input for various applications, including 3D reconstruction, pose estimation, tracking, and structure-from-motion. Despite its utility, the field of dense long-term tracking, especially over wide baselines, has not been extensively explored. This paper extends the concept of combining multiple optical flows over logarithmically spaced intervals as proposed by MFT. We demonstrate the compatibility of MFT with different optical flow networks, yielding results that surpass their individual performance. Moreover, we present a simple yet effective combination of these networks within the MFT framework. This approach proves to be competitive with more sophisticated, non-causal methods in terms of position prediction accuracy, highlighting the potential of MFT in enhancing long-term tracking applications.

MFT: Long-Term Tracking of Every Pixel

  • DOI: 10.1109/WACV57701.2024.00669
  • Link: https://doi.org/10.1109/WACV57701.2024.00669
  • Department: Department of Cybernetics, Visual Recognition Group
  • Annotation:
    We propose MFT -- Multi-Flow dense Tracker -- a novel method for dense, pixel-level, long-term tracking. The approach exploits optical flows estimated not only between consecutive frames, but also for pairs of frames at logarithmically spaced intervals. It selects the most reliable sequence of flows on the basis of estimates of its geometric accuracy and the probability of occlusion, both provided by a pre-trained CNN. We show that MFT achieves competitive performance on the TAP-Vid benchmark, outperforming baselines by a significant margin, and tracking densely orders of magnitude faster than the state-of-the-art point-tracking methods. The method is insensitive to medium-length occlusions and it is robustified by estimating flow with respect to the reference frame, which reduces drift.
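
The selection idea in the annotation above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_delta` cap, the `occlusion_threshold`, and the candidate dictionary layout are all assumptions made for illustration. In the actual method the uncertainty and occlusion estimates come from a pre-trained CNN; here they are simply given as numbers.

```python
def log_spaced_deltas(t, max_delta=32):
    """Return frame gaps 1, 2, 4, ... that fit before frame t.

    max_delta is a hypothetical cap, not a value taken from the source.
    """
    deltas = []
    d = 1
    while d <= min(t, max_delta):
        deltas.append(d)
        d *= 2
    return deltas


def select_candidate(candidates, occlusion_threshold=0.5):
    """Pick the most reliable chained estimate for a single pixel.

    Each candidate is a dict with (assumed) keys:
      'position'  - predicted (x, y) in the current frame
      'sigma'     - accumulated uncertainty of the flow chain
      'occlusion' - estimated probability that the pixel is occluded
    Returns None when every candidate is judged occluded.
    """
    visible = [c for c in candidates if c["occlusion"] < occlusion_threshold]
    if not visible:
        return None
    return min(visible, key=lambda c: c["sigma"])


if __name__ == "__main__":
    # At frame 10, flows would be computed from frames 9, 8, 6 and 2.
    print([10 - d for d in log_spaced_deltas(10)])
    candidates = [
        {"position": (10.0, 5.0), "sigma": 2.0, "occlusion": 0.1},
        {"position": (10.5, 5.2), "sigma": 0.5, "occlusion": 0.9},
        {"position": (10.2, 5.1), "sigma": 1.0, "occlusion": 0.2},
    ]
    print(select_candidate(candidates)["position"])
```

The second candidate has the lowest uncertainty but is rejected as likely occluded, so the third one is chosen. A flow computed directly against the reference frame would enter as one more candidate in the same list.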

Planar Object Tracking via Weighted Optical Flow

  • Authors: Ing. Jonáš Šerých, prof. Ing. Jiří Matas, Ph.D.
  • Publication: Proc. of the 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE, 2023. p. 1593-1602. ISSN 2642-9381. ISBN 978-1-6654-9346-8.
  • Year: 2023
  • DOI: 10.1109/WACV56688.2023.00164
  • Link: https://doi.org/10.1109/WACV56688.2023.00164
  • Department: Visual Recognition Group
  • Annotation:
    We propose WOFT - a novel method for planar object tracking that estimates a full 8 degrees-of-freedom pose, i.e. the homography w.r.t. a reference view. The method uses a novel module that leverages dense optical flow and assigns a weight to each optical flow correspondence, estimating a homography by weighted least squares in a fully differentiable manner. The trained module assigns zero weights to incorrect correspondences (outliers) in most cases, making the method robust and eliminating the need for the typically used non-differentiable robust estimators like RANSAC. The proposed weighted optical flow tracker (WOFT) achieves state-of-the-art performance on two benchmarks, POT-210 [23] and POIC [7], tracking consistently well across a wide range of scenarios.
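
To make the weighted least squares step concrete, here is a rough sketch of estimating a homography from weighted correspondences. It is an illustration under stated assumptions, not the WOFT module: it minimizes a weighted algebraic error with the normalization h33 = 1, solves the normal equations with a small hand-written Gaussian elimination, and takes the weights as given rather than predicting them with a network. All function names are hypothetical.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def weighted_homography(src, dst, weights):
    """Weighted least squares homography with h33 fixed to 1 (assumption).

    Accumulates the normal equations A^T W A h = A^T W b, where each
    correspondence (x, y) -> (u, v) contributes two linear equations.
    """
    AtA = [[0.0] * 8 for _ in range(8)]
    Atb = [0.0] * 8
    for (x, y), (u, v), w in zip(src, dst, weights):
        rows = (
            ([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y], u),
            ([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y], v),
        )
        for a, rhs in rows:
            for i in range(8):
                Atb[i] += w * a[i] * rhs
                for j in range(8):
                    AtA[i][j] += w * a[i] * a[j]
    h = solve_linear(AtA, Atb)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(H, point):
    """Map a 2D point through H, including the projective division."""
    x, y = point
    d = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / d,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / d)
```

A correspondence with weight zero contributes nothing to the normal equations, which is how down-weighting outliers can replace a separate robust estimator in this simplified setting. Every operation is a sum, product, or division, so the same computation can be expressed in an automatic differentiation framework.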

Visual Coin-Tracking: Tracking of Planar Double-Sided Objects

  • DOI: 10.1007/978-3-030-33676-9_22
  • Link: https://doi.org/10.1007/978-3-030-33676-9_22
  • Department: Visual Recognition Group
  • Annotation:
    We introduce a new video analysis problem – tracking of rigid planar objects in sequences where both their sides are visible. Such coin-like objects often rotate fast about an arbitrary axis, producing unique challenges such as rapid changes in incident light and aspect ratio, and rotational motion blur. Although such objects are common, neither tracking sequences containing them nor suitable algorithms have been published. As a second contribution, we present a novel coin-tracking benchmark containing 17 video sequences annotated with object segmentation masks. Experiments show that the sequences differ significantly from the ones encountered in standard tracking datasets. We propose a baseline coin-tracking method based on convolutional neural network segmentation and explicit pose modeling. Its performance confirms that coin-tracking is an open and challenging problem.

Fast L1-Based RANSAC for Homography Estimation

  • Department: Department of Cybernetics, Visual Recognition Group
  • Annotation:
    We revisit the problem of local optimization (LO) in RANSAC for homography estimation. The standard state-of-the-art LO-RANSAC improves the plain version's accuracy and stability, but it can be computationally demanding, is complex to implement, and requires setting multiple parameters. We show that employing L1 minimization instead of the standard LO step of LO-RANSAC leads to results with similar precision. At the same time, the proposed L1 minimization is significantly faster than the standard LO step of [8], it is easy to implement, and it has only a few parameters, all of which have an intuitive interpretation. On the negative side, the L1 minimization does not achieve the robustness of the standard LO step: its probability of failure is higher.
