Mgr. Jan Šochman, Ph.D.

All publications

Monocular Arbitrary Moving Object Discovery and Segmentation

  • Affiliation: Visual Recognition Group
  • Abstract:
    We propose a method for the discovery and segmentation of objects that move independently in the scene, either as a whole or in part. Given three monocular video frames, the method outputs semantically meaningful regions, i.e. regions corresponding to the whole object, even when only a part of it moves. The architecture of the CNN-based end-to-end method, called Raptor, combines semantic and motion backbones, which pass their outputs to a final region segmentation network. The semantic backbone is trained in a class-agnostic manner in order to generalise to object classes beyond the training data. The core of the motion branch is a geometrical cost volume computed from optical flow, optical expansion, mono-depth and the estimated camera motion. Evaluation of the proposed architecture on the instance motion segmentation and binary moving-static segmentation problems on the KITTI, DAVIS-Moving and YTVOS-Moving datasets shows that the proposed method achieves state-of-the-art results on all the datasets and generalises well to various environments. For the KITTI dataset, we provide an upgraded instance motion segmentation annotation which covers all moving objects. Dataset, code and models are available on the GitHub project page github.com/michalneoral/Raptor.

A new semi-supervised method improving optical flow on distant domains

  • Authors: Novák, T., Mgr. Jan Šochman, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publication: Proceedings of the 25th Computer Vision Winter Workshop Conference February 3-5, 2020, Rogaška Slatina, Slovenia. Ljubljana: Slovenian Pattern Recognition Society, 2020. p. 37-45. ISBN 978-961-90901-9-0.
  • Year: 2020
  • Affiliation: Visual Recognition Group
  • Abstract:
    We propose a semi-supervised approach to learning by formulating the optimization as constrained gradient descent on a loss function that includes unsupervised terms. The method is demonstrated on semi-supervised optical flow training that promotes photo-consistency and smoothness of the flow. We show that the unsupervised objective significantly improves the estimation on a distant domain while maintaining the performance on the original domain. As a result, we achieve state-of-the-art results on the Creative Flow+ dataset among CNN-based methods that did not train on any samples from the dataset.

Continual Occlusion and Optical Flow Estimation

  • DOI: 10.1007/978-3-030-20870-7_10
  • Link: https://doi.org/10.1007/978-3-030-20870-7_10
  • Affiliation: Visual Recognition Group
  • Abstract:
    Two optical flow estimation problems are addressed: (i) occlusion estimation and handling, and (ii) estimation from image sequences longer than two frames. The proposed ContinualFlow method estimates occlusions before flow, avoiding the use of flow corrupted by occlusions for their estimation. We show that providing occlusion masks as an additional input to flow estimation improves the standard performance metric by more than 25% on both KITTI and Sintel. As a second contribution, a novel method for incorporating information from past frames into flow estimation is introduced. The previous frame flow serves as an input to occlusion estimation and as a prior in occluded regions, i.e. those without visual correspondences. By continually using the previous frame flow, ContinualFlow performance improves further by 18% on KITTI and 7% on Sintel, achieving top performance on KITTI and Sintel.

Object Scene Flow with Temporal Consistency

  • Authors: Ing. Michal Neoral, Mgr. Jan Šochman, Ph.D.
  • Publication: Proceedings of the 22nd Computer Vision Winter Workshop. Wien: Pattern Recognition & Image Processing Group, Vienna University of Technology, 2017. ISBN 978-3-200-04969-7.
  • Year: 2017
  • Affiliation: Department of Cybernetics, Visual Recognition Group
  • Abstract:
    In this paper, we propose several improvements of the Object Scene Flow (OSF) algorithm [14]. The OSF uses neither the scene flow estimated in the previous frame nor the object labels and their corresponding object motion information. The goal of this paper is to use this information in order to produce temporally consistent output throughout the whole video sequence. We evaluate the progress on the KITTI’15 multiframe dataset. We show that propagating the labels and the corresponding motion information using the estimated flow reduces the false negative rate (missed cars). Together with two further proposed improvements, the overall reduction in false negatives is 42%. The proposed improvements also reduce EPE on the KITTI’15 scene flow from 10.63% to 9.65%.

Robust abandoned object detection integrating wide area visual surveillance and social context

  • Authors: Ferryman, J., Hogg, D., Mgr. Jan Šochman, Ph.D., Behera, A., Rodriguez-Serrano, J.A., Worgan, S., Li, L.Z., Leung, V., Evans, M., Cornic, P., Herbin, S., Schlenger, S., Dose, M.
  • Publication: Pattern Recognition Letters. 2013, 34(7), 789-798. ISSN 0167-8655.
  • Year: 2013
  • DOI: 10.1016/j.patrec.2013.01.018
  • Link: https://doi.org/10.1016/j.patrec.2013.01.018
  • Affiliation: Department of Cybernetics
  • Abstract:
    This paper presents a video surveillance framework that robustly and efficiently detects abandoned objects in surveillance scenes. The framework is based on a novel threat assessment algorithm which combines the concept of ownership with automatic understanding of social relations in order to infer abandonment of objects. Implementation is achieved through development of a logic-based inference engine based on Prolog. Threat detection performance is evaluated by testing against a range of datasets describing realistic situations and demonstrates a reduction in the number of false alarms generated. The proposed system represents the approach employed in the EU SUBITO project (Surveillance of Unattended Baggage and the Identification and Tracking of the Owner

A System for Real-time Detection and Tracking of Vehicles from a Single Car-mounted Camera

  • Affiliation: Department of Cybernetics
  • Abstract:
    A novel system for detection and tracking of vehicles from a single car-mounted camera is presented. The core of the system consists of high-performance vision algorithms, the WaldBoost detector and the TLD tracker, which are scheduled so that real-time performance is achieved. The vehicle monitoring system is evaluated on a new dataset collected on Italian motorways which is provided with approximate ground truth (GT) obtained from laser scans. For a wide range of distances, the recall and precision of detection for cars are excellent. Statistics for trucks are also reported. The dataset with the ground truth is made public.

Who Knows Who - Inverting the Social Force Model for Finding Groups

  • Authors: Mgr. Jan Šochman, Ph.D., Hogg, D.
  • Publication: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops). Los Alamitos: IEEE Computer Society Press, 2011. p. 830-837. ISBN 978-1-4673-0063-6.
  • Year: 2011
  • Affiliation: Department of Cybernetics
  • Abstract:
    Social groups based on friendship or family relations are a very common phenomenon in human crowds and a valuable cue for a crowd activity recognition system. In this paper we present an algorithm for automatic on-line inference of social groups from observed trajectories of individual people. The method is based on the Social Force Model (SFM), widely used in crowd simulation applications, which specifies several attractive and repulsive forces influencing each individual relative to the other pedestrians and their environment. The main contribution of the paper is an algorithm for inference of the social groups (parameters of the SFM) based on analysis of the observed trajectories through the attractive or repulsive forces which could lead to such behaviour. The proposed SFM-based method shows a clear advantage, especially in more crowded scenarios where other state-of-the-art methods fail. The applicability of the algorithm is illustrated on an abandoned bag scenario.

Interpreting Structures in Man-made Scenes - Combining Low-Level and High-Level Structure Sources

  • Authors: Terzic, K., Hotz, L., Mgr. Jan Šochman, Ph.D.
  • Publication: ICAART 2010 - Proceedings of the International Conference on Agents and Artificial Intelligence. Setúbal: INSTICC Press, 2010. pp. 357-364. ISBN 978-989-674-021-4.
  • Year: 2010
  • Affiliation: Department of Cybernetics
  • Abstract:
    Recognizing structure is an important aspect of interpreting many computer vision domains. Structure can manifest itself both visually, in terms of repeated low-level phenomena, and conceptually, in terms of a high-level compositional hierarchy. In this paper, we demonstrate an approach for combining a low-level repetitive structure detector with a logical high-level interpretation system. We evaluate the performance on a set of images from the building facade domain.

Learning Fast Emulators of Binary Decision Processes

  • DOI: 10.1007/s11263-009-0229-x
  • Link: https://doi.org/10.1007/s11263-009-0229-x
  • Affiliation: Department of Cybernetics
  • Abstract:
    We show how existing binary decision algorithms can be approximated by a fast trained WaldBoost classifier. WaldBoost learning minimises the decision time of the classifier while guaranteeing predefined precision. The WaldBoost algorithm together with bootstrapping is able to efficiently handle an effectively unlimited number of training examples provided by the implementation of the approximated algorithm. Two interest point detectors, the Hessian-Laplace and the Kadir-Brady saliency detectors, are emulated to demonstrate the approach. Experiments show that while the repeatability and matching scores are similar for the original and emulated algorithms, a 9-fold speed-up for the Hessian-Laplace detector and a 142-fold speed-up for the Kadir-Brady detector are achieved.

Mobile Mapping of Vertical Traffic Infrastructure

  • Authors: Doubek, P., Perďoch, M., prof. Ing. Jiří Matas, Ph.D., Mgr. Jan Šochman, Ph.D.
  • Publication: CVWW 2008: Proceedings of the 13th Computer Vision Winter Workshop. Ljubljana: Slovenian Pattern Recognition Society, 2008, pp. 115-122. ISBN 978-961-90901-4-5.
  • Year: 2008
  • Affiliation: Department of Cybernetics
  • Abstract:
    In this paper, we present a method for detection and localization of vertical traffic infrastructure using video sequences recorded by a survey vehicle. A search for pole-like structures in the images creates initial 2D hypotheses. They are fused on the ground plane to form 3D hypotheses, which are finally verified and classified by a search for the distinguished part of the infrastructure. Each step is followed by pruning the set of hypotheses using an SVM classifier. The method was tested in a streetlight detection application with video sequences containing over one thousand streetlights.

Training Sequential On-line Boosting Classifier for Visual Tracking

  • Authors: Grabner, H., Mgr. Jan Šochman, Ph.D., Bischof, H., prof. Ing. Jiří Matas, Ph.D.
  • Publication: ICPR 2008: Proceedings of the 19th International Conference on Pattern Recognition. Madison: Omnipress, 2008. pp. 1360-1363. ISSN 1051-4651. ISBN 978-1-4244-2174-9.
  • Year: 2008
  • Affiliation: Department of Cybernetics
  • Abstract:
    On-line boosting allows a trained classifier to be adapted to changing environmental conditions or to use sequentially available training data. Yet, two important problems in on-line boosting training remain unsolved: (i) classifier evaluation speed optimization and (ii) automatic classifier complexity estimation. In this paper we show how on-line boosting can be combined with Wald's sequential decision theory to solve both of these problems. The properties of the proposed on-line WaldBoost algorithm are demonstrated on a visual tracking problem. The complexity of the classifier changes dynamically depending on the difficulty of the problem. On average, a speed-up by a factor of 5-10 is achieved compared to non-sequential on-line boosting.

Wald's Sequential Analysis for Time-constrained Vision Problems

  • Affiliation: Department of Cybernetics
  • Abstract:
    In detection and matching problems in computer vision, both classification errors and time to decision characterize the quality of an algorithmic solution. It is shown how to formalize such problems in the framework of sequential decision-making and derive quasi-optimal time-constrained solutions for three vision problems. The methodology is applied to face and interest point detection and to the RANSAC robust estimator. Error rates of the proposed face detector are comparable to those of state-of-the-art methods. In the interest point application, the output of the Hessian-Laplace detector [Mikolajczyk-IJCV04] is approximated by a sequential WaldBoost classifier which is about five times faster than the original with comparable repeatability. A sequential strategy based on Wald's SPRT for evaluation of model quality in RANSAC leads to a significant speed-up in geometric matching problems.
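
    Illustration (not taken from the paper): the sequential decision-making referred to in the abstract above rests on Wald's sequential probability ratio test (SPRT). The following is a minimal textbook sketch in Python, assuming independent measurements supplied as per-measurement log-likelihood ratios and using Wald's standard threshold approximations. The function name `sprt` and its parameters are hypothetical and do not come from the authors' code.

```python
import math
from typing import Iterable, Tuple


def sprt(log_likelihood_ratios: Iterable[float],
         alpha: float, beta: float) -> Tuple[int, int]:
    """Textbook SPRT over a stream of per-measurement log-likelihood ratios.

    alpha: target false positive rate, beta: target false negative rate.
    Returns (decision, n_used): +1 accepts H1, -1 accepts H0,
    0 means the measurements ran out before a decision was reached.
    """
    # Wald's approximate decision thresholds in the log domain.
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))

    total = 0.0
    n_used = 0
    for n_used, llr in enumerate(log_likelihood_ratios, start=1):
        # Summing the ratios assumes independent measurements.
        total += llr
        if total >= upper:
            return +1, n_used
        if total <= lower:
            return -1, n_used
    return 0, n_used


if __name__ == "__main__":
    # Strong, consistent evidence for H1 stops the test early.
    print(sprt([1.0] * 10, alpha=0.05, beta=0.05))
```

    The test stops as soon as the accumulated evidence crosses either threshold, so easy inputs are decided after few measurements and hard ones take longer. This is the source of the time savings the abstract describes.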

Learning A Fast Emulator of a Binary Decision Process

  • Affiliation: Department of Cybernetics
  • Abstract:
    Computation time is an important performance characteristic of computer vision algorithms. This paper shows how existing (slow) binary-valued decision algorithms can be approximated by a trained WaldBoost classifier, which minimises the decision time while guaranteeing predefined approximation precision. The core idea is to take an existing algorithm as a black box performing some useful binary decision task and to train the WaldBoost classifier as its emulator. Two interest point detectors, the Hessian-Laplace and the Kadir-Brady saliency detectors, are emulated to demonstrate the approach. The experiments show similar repeatability and matching scores for the original and emulated algorithms while achieving a 70-fold speed-up for the Kadir-Brady detector.

Wald's Sequential Analysis for Time-constrained Vision Problems

  • Affiliation: Department of Cybernetics
  • Abstract:
    In detection and matching problems in computer vision, both classification errors and time to decision characterize the quality of an algorithmic solution. We show how to formalize such problems in the framework of sequential decision-making and derive quasi-optimal time-constrained solutions for three vision problems. The methodology is applied to face and interest point detection and to the RANSAC robust estimator. Error rates of the proposed face detector are comparable to those of state-of-the-art methods. In the interest point application, the output of the Hessian-Laplace detector [Mikolajczyk-IJCV04] is approximated by a sequential WaldBoost classifier which is about five times faster than the original with comparable repeatability. A sequential strategy based on Wald's SPRT for evaluation of model quality in RANSAC leads to a significant speed-up in geometric matching problems.

WaldBoost - Learning for Time Constrained Sequential Detection

  • Affiliation: Department of Cybernetics
  • Abstract:
    In many computer vision classification problems, both the error and the time characterize the quality of a decision. We show that such problems can be formalized in the framework of sequential decision-making. If the false positive and false negative error rates are given, the optimal strategy in terms of the shortest average time to decision (number of measurements used) is Wald's sequential probability ratio test (SPRT). We build on the optimal SPRT and extend its capabilities to problems with dependent measurements. We show how the limitations of SPRT to a priori ordered measurements and known joint probability density functions can be overcome. We propose an algorithm with a near-optimal trade-off between time and error rate, called WaldBoost, which integrates the AdaBoost algorithm for measurement selection and ordering and the joint probability density estimation with the optimal SPRT decision strategy. The WaldBoost algorithm is tested on the face detection problem. The results are

AdaBoost with Totally Corrective Updates for Fast Face Detection

  • Authors: Mgr. Jan Šochman, Ph.D., prof. Ing. Jiří Matas, Ph.D.
  • Publication: FGR '04: Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition. Los Alamitos: IEEE Computer Society Press, 2004. pp. 445-450. ISBN 0-7695-2122-3.
  • Year: 2004

Inter-stage Feature Propagation in Cascade Building with AdaBoost

Page maintained by: Ing. Mgr. Radovan Suk