
doc. Ing. Karel Zimmermann, Ph.D.

All publications

Self-Supervised Depth Correction of Lidar Measurements From Map Consistency Loss

  • DOI: 10.1109/LRA.2023.3287791
  • Link: https://doi.org/10.1109/LRA.2023.3287791
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    Depth perception is considered an invaluable source of information in the context of 3D mapping and various robotics applications. However, point cloud maps acquired using consumer-level light detection and ranging sensors (lidars) still suffer from bias related to local surface properties such as the measuring beam-to-surface incidence angle. This fact has recently motivated researchers to exploit traditional filters, as well as the deep learning paradigm, in order to suppress the aforementioned depth sensor error while preserving geometric and map consistency details. Despite the effort, depth correction of lidar measurements is still an open challenge, mainly due to the lack of clean 3D data that could be used as ground truth. In this letter, we introduce two novel point cloud map consistency losses, which facilitate self-supervised learning of lidar depth correction models on real data. Specifically, the models exploit multiple point cloud measurements of the same scene from different viewpoints in order to learn to reduce the bias based on the constructed map consistency signal. Complementary to the removal of the bias from the measurements, we demonstrate that the depth correction models help to reduce localization drift.
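
The map-consistency idea can be sketched in a few lines (a toy simplification with an invented grid-cell criterion, not the authors' actual loss): scans of the same scene taken from different viewpoints are merged, and the vertical spread of points landing in the same map cell is penalized; a bias-free map of a flat surface scores zero.

```python
from collections import defaultdict

def map_consistency_loss(scans, cell=0.5):
    """Toy map-consistency loss: merge scans taken from different
    viewpoints and penalize the vertical spread of points that fall
    into the same 2D grid cell of the aggregated map."""
    cells = defaultdict(list)
    for scan in scans:                      # each scan: list of (x, y, z)
        for x, y, z in scan:
            cells[(round(x / cell), round(y / cell))].append(z)
    loss, n = 0.0, 0
    for zs in cells.values():
        if len(zs) < 2:
            continue
        mean = sum(zs) / len(zs)
        loss += sum((z - mean) ** 2 for z in zs)
        n += len(zs)
    return loss / max(n, 1)

# Two views of the same flat floor; the second is biased upward by 5 cm.
clean  = [(0.1 * i, 0.0, 0.00) for i in range(20)]
biased = [(0.1 * i, 0.0, 0.05) for i in range(20)]
assert map_consistency_loss([clean, clean]) == 0.0
assert map_consistency_loss([clean, biased]) > 0.0
```

Minimizing such a signal with respect to the parameters of a correction model requires no ground-truth depth, which is what makes the learning self-supervised.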

T-UDA: Temporal Unsupervised Domain Adaptation in Sequential Point Clouds

  • DOI: 10.1109/IROS55552.2023.10341446
  • Link: https://doi.org/10.1109/IROS55552.2023.10341446
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    Deep perception models have to reliably cope with an open-world setting of domain shifts induced by different geographic regions, sensor properties, mounting positions, and several other reasons. Since covering all domains with annotated data is technically intractable due to the endless possible variations, researchers focus on unsupervised domain adaptation (UDA) methods that adapt models trained on one (source) domain with annotations available to another (target) domain for which only unannotated data are available. Current predominant methods either leverage semi-supervised approaches, e.g., teacher-student setup, or exploit privileged data, such as other sensor modalities or temporal data consistency. We introduce a novel domain adaptation method that leverages the best of both approaches. Our approach combines input data's temporal and cross-sensor geometric consistency with the mean teacher method. Dubbed T-UDA for “temporal UDA”, such a combination yields massive performance gains for the task of 3D semantic segmentation of driving scenes. Experiments are conducted on Waymo Open Dataset, nuScenes, and SemanticKITTI, for two popular 3D point cloud architectures, Cylinder3D and MinkowskiNet. Our codes are publicly available on https://github.com/ctu-vras/T-UDA.
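
The mean teacher component mentioned above rests on a standard update rule: the teacher's weights track an exponential moving average of the student's. A minimal sketch (dict-of-floats stands in for a real network; layer names are illustrative):

```python
def ema_update(teacher, student, decay=0.999):
    """Mean-teacher update: the teacher's weights are an exponential
    moving average (EMA) of the student's weights."""
    return {k: decay * teacher[k] + (1.0 - decay) * student[k]
            for k in teacher}

teacher = {"conv1": 0.0, "conv2": 1.0}
student = {"conv1": 1.0, "conv2": 1.0}
teacher = ema_update(teacher, student, decay=0.9)
assert abs(teacher["conv1"] - 0.1) < 1e-9   # moved 10% toward the student
assert abs(teacher["conv2"] - 1.0) < 1e-9
```

The slowly moving teacher then produces the (pseudo-)targets that supervise the student on the unannotated target domain.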

Teachers in Concordance for Pseudo-Labeling of 3D Sequential Data

  • DOI: 10.1109/LRA.2022.3226029
  • Link: https://doi.org/10.1109/LRA.2022.3226029
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    Automatic pseudo-labeling is a powerful tool to tap into large amounts of sequential unlabeled data. It is especially appealing in safety-critical applications of autonomous driving, where performance requirements are extreme, datasets are large, and manual labeling is very challenging. We propose to leverage sequences of point clouds to boost the pseudo-labeling technique in a teacher-student setup via training multiple teachers, each with access to different temporal information. This set of teachers, dubbed Concordance, provides higher quality pseudo-labels for student training than standard methods. The output of multiple teachers is combined via a novel pseudo-label confidence-guided criterion. Our experimental evaluation focuses on the 3D point cloud domain and urban driving scenarios. We show the performance of our method applied to 3D semantic segmentation and 3D object detection on three benchmark datasets. Our approach, which uses only 20% manual labels, outperforms some fully supervised methods. A notable performance boost is achieved for classes rarely appearing in training data. Our code will be made publicly available.
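
A confidence-guided fusion of several teachers can be sketched as follows (an illustrative criterion with an invented threshold, not the paper's exact one): per-class probabilities are averaged over the teachers, and a pseudo-label is kept only where the fused confidence clears the threshold.

```python
def combine_pseudo_labels(teacher_probs, threshold=0.8):
    """Toy confidence-guided fusion of teachers' predictions.
    Returns a class index per point, or None for points left unlabeled."""
    labels = []
    for point_probs in zip(*teacher_probs):   # iterate over points
        n_cls = len(point_probs[0])
        fused = [sum(p[c] for p in point_probs) / len(point_probs)
                 for c in range(n_cls)]
        conf = max(fused)
        labels.append(fused.index(conf) if conf >= threshold else None)
    return labels

# Two teachers, two points, three classes (road, car, pedestrian).
t1 = [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]]
t2 = [[0.8, 0.1, 0.1],   [0.3, 0.4, 0.3]]
labels = combine_pseudo_labels([t1, t2], threshold=0.8)
assert labels == [0, None]   # confident point kept, ambiguous one dropped
```

Dropping low-confidence points trades label coverage for label quality, which is the usual motivation for such a criterion in student training.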

Autonomous state-based flipper control for articulated tracked robots in urban environments

  • DOI: 10.1109/LRA.2022.3185762
  • Link: https://doi.org/10.1109/LRA.2022.3185762
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We demonstrate a hybrid approach to autonomous flipper control, focusing on a fusion of hard-coded and learned knowledge. The result is a sample-efficient and modifiable control structure that can be used in conjunction with a mapping/navigation stack. The backbone of the control policy is formulated as a state machine whose states define various flipper action templates and local control behaviors. It is also used as an interface that facilitates the gathering of demonstrations to train the transitions of the state machine. We propose a soft-differentiable state machine neural network that mitigates the shortcomings of its naively implemented counterpart and improves over a multi-layer perceptron baseline in the task of state-transition classification. We show that by training on several minutes of user-gathered demonstrations in simulation, our approach is capable of a zero-shot domain transfer to a wide range of obstacles on a similar real robotic platform. Our results show a considerable increase in performance over a previous competing approach in several essential criteria. A subset of this work was successfully used in the Defense Advanced Research Projects Agency (DARPA) Subterranean Challenge to relieve the operator of manual flipper control. We autonomously traversed stairs and other obstacles, improving map coverage.
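
The soft state-machine idea can be illustrated in miniature (a toy construction with invented states and logits, not the paper's network): each state distributes its probability mass over successor states through a softmax of transition scores, so the whole machine stays differentiable.

```python
import math

def soft_transition(state_probs, logits):
    """One step of a toy soft state machine: push the current soft state
    distribution through per-state transition scores via softmax."""
    n = len(state_probs)
    next_probs = [0.0] * n
    for i, p in enumerate(state_probs):
        exps = [math.exp(l) for l in logits[i]]   # softmax over successors
        z = sum(exps)
        for j in range(n):
            next_probs[j] += p * exps[j] / z
    return next_probs

# States: 0 = "drive flat", 1 = "climb obstacle" (illustrative names).
logits = [[2.0, 0.0],    # from state 0, prefer staying in state 0
          [0.0, 2.0]]    # from state 1, prefer staying in state 1
probs = soft_transition([1.0, 0.0], logits)
assert probs[0] > probs[1]
assert abs(sum(probs) - 1.0) < 1e-9
```

In practice the logits would come from a learned network conditioned on observations, and training would backpropagate through exactly this kind of soft update.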

Learning to Predict Lidar Intensities

  • DOI: 10.1109/TITS.2020.3037980
  • Link: https://doi.org/10.1109/TITS.2020.3037980
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We propose a data-driven method for simulating lidar sensors. The method reads computer-generated data, and (i) extracts geometrically simulated lidar point clouds and (ii) predicts the strength of the lidar response – the lidar intensities. Qualitative evaluation of the proposed pipeline demonstrates the ability to predict systematic failures such as no/low responses on polished parts of car bodyworks and windows, or strong responses on reflective surfaces such as traffic signs and license/registration plates. We also experimentally show that enhancing the training set with such simulated data improves segmentation accuracy on a real dataset when access to real data is limited. An implementation of the resulting lidar simulator for the GTA V game, as well as the accompanying large dataset, is made publicly available.
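
For intuition about what such a model must capture, a crude physics-style baseline (our illustrative stand-in, not the learned model from the paper) predicts return strength from range, incidence angle, and surface reflectivity under a Lambertian assumption:

```python
import math

def toy_intensity(distance, incidence_deg, reflectivity):
    """Illustrative Lambertian baseline: the return falls off with the
    cosine of the beam-to-surface incidence angle and with squared range."""
    return reflectivity * math.cos(math.radians(incidence_deg)) / distance**2

# A retroreflective traffic sign returns more than glossy glass at an angle.
sign  = toy_intensity(distance=10.0, incidence_deg=0.0,  reflectivity=0.9)
glass = toy_intensity(distance=10.0, incidence_deg=60.0, reflectivity=0.1)
assert sign > glass
```

The systematic failures named in the abstract (polished bodywork, windows, retroreflective plates) are exactly where such a simple model breaks down and a data-driven predictor earns its keep.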

Trajectory Optimization using Learned Robot-Terrain Interaction Model in Exploration of Large Subterranean Environments

  • DOI: 10.1109/LRA.2022.3147332
  • Link: https://doi.org/10.1109/LRA.2022.3147332
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We consider the task of active exploration of large subterranean environments with a ground mobile robot. Our goal is to autonomously explore a large unknown area and to obtain accurate coverage and localization of objects of interest (artifacts). The exploration is constrained by the restricted operation time in rescue scenarios, as well as by hard, rough terrain. To this end, we introduce a novel optimization strategy that respects these constraints by maximizing the environment coverage by onboard sensors while producing feasible trajectories with the help of a learned robot-terrain interaction model. The approach is evaluated in diverse subterranean simulated environments, showing the viability of active exploration in challenging scenarios. In addition, we demonstrate that the local trajectory optimization improves global coverage of an environment as well as the overall object detection results.
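
The coverage-versus-feasibility trade-off can be caricatured in a few lines (waypoint names, visibility sets, and the risk weight are all invented for illustration; the paper optimizes continuous trajectories, not discrete waypoint sets):

```python
def trajectory_cost(waypoints, visible, risk, w_risk=1.0):
    """Toy trade-off: reward sensor coverage, penalize terrain risk.
    `visible[w]` is the set of map cells seen from waypoint w;
    `risk[w]` is a traversability penalty from a terrain model."""
    coverage = len(set().union(*(visible[w] for w in waypoints)))
    penalty = sum(risk[w] for w in waypoints)
    return -coverage + w_risk * penalty

visible = {"A": {1, 2}, "B": {2, 3}, "C": {4, 5, 6}}
risk    = {"A": 0.1, "B": 0.1, "C": 5.0}   # C covers more but is too risky
assert trajectory_cost(["A", "B"], visible, risk) < trajectory_cost(["C"], visible, risk)
```

A learned robot-terrain interaction model plays the role of `risk` here: it tells the optimizer which high-coverage trajectories are actually feasible to execute.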

Pose consistency KKT-loss for weakly supervised learning of robot-terrain interaction model

  • DOI: 10.1109/LRA.2021.3076957
  • Link: https://doi.org/10.1109/LRA.2021.3076957
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We address the problem of self-supervised learning for predicting the shape of supporting terrain (i.e. the terrain which will provide rigid support for the robot during its traversal) from sparse input measurements. The learning method exploits two types of ground-truth labels: dense 2.5D maps and robot poses, both estimated by a standard SLAM procedure from offline recorded measurements. We show that robot poses are required because straightforward supervised learning from the 3D maps alone suffers from: (i) exaggerated height of the supporting terrain caused by terrain flexibility (vegetation, shallow water, snow or sand) and (ii) missing or noisy measurements caused by high spectral absorbance or non-Lambertian reflectance of the measured surface. We address the learning from robot poses by introducing a novel KKT-loss, which emerges as the distance from the necessary Karush-Kuhn-Tucker conditions for constrained local optima of a simplified first-principle model of the robot-terrain interaction. We experimentally verify that the proposed weakly supervised learning from ground-truth robot poses boosts the accuracy of the predicted support heightmaps and increases the accuracy of estimated robot poses. All experiments are conducted on a dataset captured by a real platform. Both the dataset and the code which replicates the experiments in the paper are made publicly available as part of the submission.
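
Schematically, and only as our illustrative reconstruction (the paper's exact formulation may differ), a KKT-style loss measures how far the observed robot pose p is from satisfying the Karush-Kuhn-Tucker conditions of a simple constrained model, e.g. minimizing potential energy E(p) subject to non-penetration constraints g_i(p, h) ≥ 0 on the predicted support heightmap h:

```latex
\begin{aligned}
&\text{stationarity:} && \nabla_p E(p) - \sum_i \lambda_i \nabla_p g_i(p, h) = 0,\\
&\text{feasibility:}  && g_i(p, h) \ge 0, \qquad \lambda_i \ge 0,\\
&\text{complementarity:} && \lambda_i \, g_i(p, h) = 0,\\[4pt]
&\mathcal{L}_{\mathrm{KKT}}(h) =
  \Bigl\|\nabla_p E(p) - \sum_i \lambda_i \nabla_p g_i(p, h)\Bigr\|^2
  + \sum_i \Bigl( \bigl[-g_i(p, h)\bigr]_+^2 + \bigl(\lambda_i \, g_i(p, h)\bigr)^2 \Bigr).
\end{aligned}
```

A heightmap that makes the observed poses constrained local optima of the interaction model drives this loss to zero, which is how the pose labels supervise the terrain prediction without clean depth ground truth.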

Blind Hexapod Locomotion in Complex Terrain with Gait Adaptation Using Deep Reinforcement Learning and Classification

  • DOI: 10.1007/s10846-020-01162-8
  • Link: https://doi.org/10.1007/s10846-020-01162-8
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We present a scalable two-level architecture for hexapod locomotion through complex terrain without the use of exteroceptive sensors. Our approach assumes that the target complex terrain can be modeled by N discrete terrain distributions which capture the individual difficulties of the target terrain. Expert policies (physical locomotion controllers) modeled by artificial neural networks are trained independently on these individual scenarios using deep reinforcement learning. These policies are then autonomously multiplexed during inference using a recurrent neural network terrain classifier conditioned on the state history, giving an adaptive gait appropriate for the current terrain. We perform several tests to assess policy robustness by changing various parameters, such as contact, friction, and actuator properties. We also show experiments with goal-based positional control of such a system and a way of selecting several gait criteria during deployment, giving us a complete solution for blind hexapod locomotion in a practical setting. The hexapod platform and all our experiments are modeled in the MuJoCo physics simulator. Demonstrations are available in the supplementary video.
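
The two-level structure reduces, at inference time, to a simple dispatch: a terrain classifier's output selects which expert policy acts. A minimal sketch (terrain names, gaits, and the hard argmax selection are illustrative simplifications):

```python
def multiplex_policy(terrain_probs, experts, state):
    """Toy policy multiplexing: the most likely terrain class, as judged
    by the classifier, selects the expert locomotion policy to run."""
    best = max(terrain_probs, key=terrain_probs.get)
    return experts[best](state)

experts = {"flat":  lambda s: "tripod_gait",
           "rocks": lambda s: "cautious_gait"}
action = multiplex_policy({"flat": 0.2, "rocks": 0.8}, experts, state=None)
assert action == "cautious_gait"
```

In the paper the classifier is recurrent and conditioned on the state history, which lets a blind robot infer the terrain class from proprioception alone.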

DARPA Subterranean Challenge: Multi-robotic exploration of underground environments

  • DOI: 10.1007/978-3-030-43890-6_22
  • Link: https://doi.org/10.1007/978-3-030-43890-6_22
  • Workplace: Artificial Intelligence Center, Vision for Robots and Autonomous Systems, Multi-robot Systems
  • Abstract:
    The Subterranean Challenge (SubT) is a contest organised by the Defense Advanced Research Projects Agency (DARPA). The contest reflects the requirement of increasing the safety and efficiency of underground search-and-rescue missions. In the SubT challenge, teams of mobile robots have to detect, localise, and report positions of specific objects in an underground environment. This paper provides a description of the multi-robot heterogeneous exploration system of our CTU-CRAS team, which scored third place in the Tunnel Circuit round, surpassing the performance of all other non-DARPA-funded competitors. In addition to the description of the platforms, algorithms, and strategies used, we also discuss the lessons learned by participating in such a contest.

Simultaneous exploration and segmentation for search and rescue

  • DOI: 10.1002/rob.21847
  • Link: https://doi.org/10.1002/rob.21847
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We consider the problem of active victim segmentation during a search‐and‐rescue (SAR) exploration mission. The robot is equipped with a multimodal sensor suite consisting of a camera, lidar, and pan‐tilt thermal sensor. The robot enters an unknown scene, builds a 3D model incrementally, and the proposed method simultaneously (a) segments the victims from incomplete multimodal measurements and (b) controls the motion of the thermal camera. Both of these tasks are difficult due to the lack of natural training data and the limited number of real‐world trials. In particular, we overcome the absence of training data for the segmentation task by employing a manually designed generative model, which provides a semisynthetic training data set. The limited number of real‐world trials is tackled by self‐supervised initialization and optimization‐based guiding of the motion control learning. In addition to that, we provide a quantitative evaluation of the proposed method on several real testing scenarios using the real SAR robot. Finally, we also provide a data set which will allow for further development of algorithms on the real data.

Data-driven Policy Transfer with Imprecise Perception Simulation

  • DOI: 10.1109/LRA.2018.2857927
  • Link: https://doi.org/10.1109/LRA.2018.2857927
  • Workplace: Department of Cybernetics, Vision for Robots and Autonomous Systems
  • Abstract:
    This paper presents a complete pipeline for learning continuous motion control policies for a mobile robot when only a non-differentiable physics simulator of robot-terrain interactions is available. The multi-modal state estimation of the robot is also complex and difficult to simulate, so we simultaneously learn a generative model which refines simulator outputs. We propose a coarse-to-fine learning paradigm, where the coarse motion planning is alternated with guided learning and policy transfer to the real robot. The policy is jointly optimized with the generative model. We evaluate the method on a real-world platform.

Controlling Robot Morphology From Incomplete Measurements

  • DOI: 10.1109/TIE.2016.2580125
  • Link: https://doi.org/10.1109/TIE.2016.2580125
  • Workplace: Department of Cybernetics, Vision for Robots and Autonomous Systems
  • Abstract:
    Mobile robots with complex morphology are essential for traversing rough terrains in Urban Search & Rescue missions. Since teleoperation of the complex morphology places a high cognitive load on the operator, the morphology is controlled autonomously. The autonomous control measures the robot state and the surrounding terrain, which is usually only partially observable, and thus the data are often incomplete. We marginalize the control over the missing measurements and evaluate an explicit safety condition. If the safety condition is violated, tactile terrain exploration by the body-mounted robotic arm gathers the missing data.
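
Marginalizing over a missing measurement can be sketched as follows (states, belief probabilities, and the confidence threshold are invented for illustration): the action is taken only if it is safe in expectation over the unobserved terrain states; otherwise the tactile exploration would be triggered to resolve the uncertainty.

```python
def marginalized_safety(belief, is_safe, threshold=0.95):
    """Toy safety check under a missing measurement: sum the belief mass
    of terrain states in which the action would be safe and compare it
    against a confidence threshold."""
    p_safe = sum(p for state, p in belief.items() if is_safe(state))
    return p_safe >= threshold

beliefs = {"step_10cm": 0.7, "step_25cm": 0.3}   # unobserved obstacle height
assert marginalized_safety(beliefs, lambda s: s == "step_10cm") is False
assert marginalized_safety({"step_10cm": 1.0}, lambda s: s == "step_10cm") is True
```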

Fast Simulation of Vehicles with Non-deformable Tracks

  • DOI: 10.1109/IROS.2017.8206546
  • Link: https://doi.org/10.1109/IROS.2017.8206546
  • Workplace: Department of Cybernetics, Vision for Robots and Autonomous Systems
  • Abstract:
    This paper presents a novel technique that allows for both computationally fast and sufficiently plausible simulation of vehicles with non-deformable tracks. The method is based on an effect we have called Contact Surface Motion. A comparison with several other methods for simulation of tracked vehicle dynamics is presented with the aim to evaluate methods that are available off-the-shelf or with minimum effort in general-purpose robotics simulators. The proposed method is implemented as a plugin for the open-source physics-based simulator Gazebo using the Open Dynamics Engine.

Learning for Active 3D Mapping

  • DOI: 10.1109/ICCV.2017.171
  • Link: https://doi.org/10.1109/ICCV.2017.171
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    We propose an active 3D mapping method for depth sensors which allow individual control of depth-measuring rays, such as the newly emerging solid-state lidars. The method simultaneously (i) learns to reconstruct a dense 3D occupancy map from sparse depth measurements, and (ii) optimizes the reactive control of depth-measuring rays. To make the first step towards online control optimization, we propose a fast prioritized greedy algorithm, which needs to update its cost function for only a small fraction of possible rays. The approximation ratio of the greedy algorithm is derived. An experimental evaluation on a subset of the KITTI dataset demonstrates a significant improvement in 3D map accuracy when learning-to-reconstruct from sparse measurements is coupled with the optimization of depth-measuring rays.
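
The greedy ray selection can be sketched in miniature (the overlap/decay model and the gain values are invented; the paper's prioritized variant avoids recomputing most gains): repeatedly pick the ray with the highest remaining information gain and discount the rays it overlaps.

```python
def greedy_rays(gains, budget):
    """Toy greedy ray selection: pick the depth-measuring ray with the
    highest remaining gain, then decay the gain of neighbouring rays."""
    gains = dict(gains)
    chosen = []
    for _ in range(budget):
        ray = max(gains, key=gains.get)
        chosen.append(ray)
        gains.pop(ray)
        for other in gains:                 # neighbours lose half the value
            if abs(other - ray) == 1:
                gains[other] *= 0.5
    return chosen

# Rays indexed by angle bin; ray 5 is most informative, then the far ray 0.
picked = greedy_rays({0: 0.6, 4: 0.5, 5: 1.0, 6: 0.5}, budget=2)
assert picked == [5, 0]
```

Greedy maximization of a (sub)modular-style coverage objective is exactly the setting where an approximation ratio, as derived in the paper, can be expected.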

Autonomous Flipper Control with Safety Constraints

  • DOI: 10.1109/IROS.2016.7759447
  • Link: https://doi.org/10.1109/IROS.2016.7759447
  • Workplace: Department of Cybernetics, Vision for Robots and Autonomous Systems
  • Abstract:
    Policy Gradient methods require many real-world trials. Some of the trials may endanger the robot system and cause its rapid wear. Therefore, a safe or at least gentle-to-wear exploration is a desired property. We incorporate bounds on the probability of unwanted trials into the recent Contextual Relative Entropy Policy Search method. The proposed algorithm is evaluated on the task of autonomous flipper control for a real Search and Rescue rover platform.

Touching without vision: terrain perception in sensory deprived environments

  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    In this paper we demonstrate a combined hardware and software solution that enhances the sensor suite and perception capabilities of a mobile robot intended for real Urban Search & Rescue missions. A common fail-case when exploring the unknown environment of a disaster site is the outage or deterioration of the exteroceptive sensory measurements that the robot heavily relies on, especially for localization and navigation purposes. Deprivation of visual and laser modalities caused by dense smoke motivated us to develop a novel solution comprised of force sensor arrays embedded into the tracks of our platform. Furthermore, we also exploit a robotic arm for active perception in cases when the prediction based on force sensors is too uncertain. Besides the integration of hardware, we also propose a framework exploiting Gaussian processes followed by Gibbs sampling to process raw sensor measurements and provide a probabilistic interpretation of the underlying terrain profile. In the end, the profile is perceived by proprioceptive means only and successfully substitutes for the lack of exteroceptive measurements in the close vicinity of the robot when traversing unknown and unseen obstacles. We evaluated our solution on real-world terrains.
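
Recovering a terrain profile from sparse track readings can be illustrated with a kernel smoother (a Nadaraya-Watson simplification standing in for the Gaussian-process regression used in the paper; sensor positions, heights, and the length scale are invented):

```python
import math

def terrain_profile(samples, x, length=0.15):
    """Kernel-smoothed estimate of terrain height under the track from
    sparse force-sensor readings `samples` = [(position, height), ...]."""
    w = [math.exp(-((x - xi) ** 2) / (2 * length ** 2)) for xi, _ in samples]
    return sum(wi * hi for wi, (_, hi) in zip(w, samples)) / sum(w)

# Force cells along the track sense a 10 cm step between 0.2 m and 0.4 m.
samples = [(0.0, 0.0), (0.2, 0.0), (0.4, 0.1), (0.6, 0.1)]
h = terrain_profile(samples, x=0.5)
assert 0.05 < h <= 0.1   # interpolates onto the step's upper level
```

A full GP would additionally return a variance at each query point, which is the uncertainty signal that triggers the active tactile exploration by the arm.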

Adaptive traversability of partially occluded obstacles

  • Authors: doc. Ing. Karel Zimmermann, Ph.D., Zuzánek, P., Reinštein, M., Petříček, T., Hlaváč, V.
  • Publication: 2015 IEEE International Conference on Robotics and Automation (ICRA 2015). Piscataway: IEEE, 2015. p. 3959-3964. ISSN 1050-4729. ISBN 978-1-4799-6923-4.
  • Year: 2015
  • DOI: 10.1109/ICRA.2015.7139752
  • Link: https://doi.org/10.1109/ICRA.2015.7139752
  • Workplace: Department of Cybernetics
  • Abstract:
    Controlling mobile robots with complex articulated parts and hence many degrees of freedom generates a high cognitive load on the operator, especially under demanding conditions such as in Urban Search & Rescue missions. We propose a solution based on reinforcement learning in order to accommodate the robot morphology automatically to the terrain and the obstacles it traverses. In this paper, we concentrate on the crucial issue of predicting rewards from incomplete or missing data. For this purpose we exploit Gaussian processes as predictors combined with decision trees. We demonstrate our achievements in a series of experiments on real data.

Safe Exploration for Reinforcement Learning in Real Unstructured Environments

  • Workplace: Department of Cybernetics
  • Abstract:
    In USAR (Urban Search and Rescue) missions, robots are often required to operate in an unknown environment and with imprecise data coming from their sensors. However, it is highly desirable that the robots act only in a safe manner and do not perform actions that could damage them. To train some tasks with the robot, we utilize reinforcement learning (RL). However, this machine learning method requires the robot to perform actions leading to unknown states, which may be dangerous. We develop a framework for training a safety function which constrains possible actions to a subset of truly safe actions. Our approach utilizes two basic concepts. First, a "core" of the safety function is given by a cautious simulator and possibly also by manually given examples. Second, a classifier training phase is performed (using Neyman-Pearson SVMs), which extends the safety function to the states where the simulator fails to recognize safe states.
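
The two-stage safety function can be sketched as a disjunction (a toy illustration with invented action features and thresholds; the real classifier is a Neyman-Pearson SVM, not a hand-set rule): an action passes if the cautious simulator core marks it safe, or if the learned classifier extends safety to states the simulator cannot judge.

```python
def safe(action, simulator_safe, classifier_safe):
    """Toy two-stage safety function: the cautious simulator provides the
    core; the learned classifier extends it to additional safe states."""
    return simulator_safe(action) or classifier_safe(action)

sim = lambda a: a["pitch_deg"] < 10   # overly cautious simulator core
clf = lambda a: a["pitch_deg"] < 25   # learned, less conservative extension
assert safe({"pitch_deg": 5},  sim, clf) is True
assert safe({"pitch_deg": 20}, sim, clf) is True    # rescued by classifier
assert safe({"pitch_deg": 40}, sim, clf) is False
```

The Neyman-Pearson formulation matters because it lets one bound the rate of unsafe actions misclassified as safe, which is the asymmetric error that endangers the robot.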

Accepted Autonomy for Search and Rescue Robotics

  • Authors: Zuzánek, P., doc. Ing. Karel Zimmermann, Ph.D., Hlaváč, V.
  • Publication: Modelling and Simulation for Autonomous Systems. Cham: Springer, 2014. pp. 231-240. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-319-13822-0.
  • Year: 2014
  • DOI: 10.1007/978-3-319-13823-7_21
  • Link: https://doi.org/10.1007/978-3-319-13823-7_21
  • Workplace: Department of Cybernetics
  • Abstract:
    Since exploration of unknown disaster areas during Search and Rescue missions is often dangerous, teleoperated robotic platforms are usually used as a suitable replacement for a human rescuer. Advanced robotic platforms usually have many degrees of freedom to be controlled, e.g. speed, azimuth, camera view, or the angles of articulated sub-tracks. Manual control of all available degrees of freedom often leads to unwanted cognitive overload of the operator, whose attention should be mainly focused on reaching the mission goals. On the other hand, there are fully autonomous systems requiring minimal attention but allowing almost no interaction, which is usually not acceptable for the operator. An operator-accepted level of autonomy is usually a trade-off between fully teleoperated and completely autonomous robots. The main contribution of our paper is an extensive survey of accepted-autonomy solutions for Search and Rescue robots with a special focus on traversing unstructured terrain; a brief summary of our own system is also provided. Since an integral part of any Search and Rescue robot is the ability to traverse complex terrain, we describe a system for a teleoperated skid-steer robot with articulated sub-tracks (flippers), in which the operator controls robot speed and azimuth, while flipper posture and stiffness are controlled autonomously. The system for autonomous flipper control is trained from semi-autonomously collected training samples to maximize platform stability and motion smoothness on challenging obstacles.

Adaptive Traversability of Unknown Complex Terrain with Obstacles for Mobile Robots

  • Authors: doc. Ing. Karel Zimmermann, Ph.D., Zuzánek, P., Reinštein, M., Hlaváč, V.
  • Publication: ICRA2014: Proceedings of 2014 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2014. p. 5177-5182. ISSN 1050-4729. ISBN 978-1-4799-3684-7.
  • Year: 2014
  • DOI: 10.1109/ICRA.2014.6907619
  • Link: https://doi.org/10.1109/ICRA.2014.6907619
  • Workplace: Department of Cybernetics
  • Abstract:
    In this paper we introduce the concept of Adaptive Traversability (AT), which we define as a means of autonomous motion control adapting the robot morphology, i.e. the configuration of articulated parts and their compliances, to traverse unknown complex terrain with obstacles in an optimal way. We verify this concept by proposing a reinforcement-learning-based AT algorithm for mobile robots operating in such conditions. We demonstrate the functionality by training the AT algorithm under lab conditions on simple EUR-pallet obstacles and then testing it successfully on natural obstacles in a forest. For quantitative evaluation we define a metric based on comparison with an expert operator. Exploiting the proposed AT algorithm significantly decreases the cognitive load of the operator.

Designing, developing, and deploying systems to support human-robot teams in disaster response

  • Authors: Kruijff, G.J.M., Kruijff-Korbayova, I., Keshavdas, S., Larochelle, B., Janíček, M., Colas, F., Liu, M., Pomerleau, F., Siegwart, R., Neerincx, M.A., Looije, R., Smets, N.J.J.M, Mioch, T., van Diggelen, J., Pirri, F., Gianni, M., Ferri, F., Menna, M., Worst, R., Linder, T., Tretyakov, V., Surmann, H., prof. Ing. Tomáš Svoboda, Ph.D., Reinštein, M., doc. Ing. Karel Zimmermann, Ph.D., Petříček, T., Hlaváč, V.
  • Publication: Advanced Robotics. 2014, 28(23), 1547-1570. ISSN 0169-1864.
  • Year: 2014
  • DOI: 10.1080/01691864.2014.985335
  • Link: https://doi.org/10.1080/01691864.2014.985335
  • Workplace: Department of Cybernetics
  • Abstract:
    This paper describes our experience in designing, developing and deploying systems for supporting human-robot teams during disaster response. It is based on R&D performed in the EU-funded project NIFTi. NIFTi aimed at building intelligent, collaborative robots that could work together with humans in exploring a disaster site, to make a situational assessment. To achieve this aim, NIFTi addressed key scientific design aspects in building up situation awareness in a human-robot team, developing systems using a user-centric methodology involving end users throughout the entire R&D cycle, and regularly deploying implemented systems under real-life circumstances for experimentation and testing. This has yielded substantial scientific advances in the state-of-the-art in robot mapping, robot autonomy for operating in harsh terrain, collaborative planning, and human-robot interaction. NIFTi deployed its system in actual disaster response activities in Northern Italy, in July 2012, aiding in structure damage assessment.

Experience in System Design for Human-Robot Teaming in Urban Search & Rescue

  • Authors: Kruijff, G.J.M., Janíček, M., Keshavdas, S., Larochelle, B., Zender, H., Smets, N.J.J.M., Mioch, T., Neerincx, M.A., Diggelen, J.V., Colas, F., Liu, M., Pomerleau, F., Siegwart, R., Hlaváč, V., prof. Ing. Tomáš Svoboda, Ph.D., Petříček, T., Reinštein, M., doc. Ing. Karel Zimmermann, Ph.D., Pirri, F., Gianni, M., Papadakis, P., Sinha, A., Balmer, P., Tomatis, N., Worst, R., Linder, T., Surmann, H., Tretyakov, V., Corrao, S., Pratzler-Wanczura, S., Sulk, M.
  • Publication: Field and Service Robotics. Heidelberg: Springer, 2014. p. 111-125. Springer Tracts in Advanced Robotics. ISSN 1610-7438. ISBN 978-3-642-40685-0.
  • Year: 2014
  • DOI: 10.1007/978-3-642-40686-7_8
  • Link: https://doi.org/10.1007/978-3-642-40686-7_8
  • Workplace: Department of Cybernetics
  • Abstract:
    The paper describes experience with applying a user-centric design methodology in developing systems for human-robot teaming in Urban Search & Rescue. A human-robot team consists of several semi-autonomous robots (rovers/UGVs, microcopters/UAVs), several humans at an off-site command post (mission commander, UGV operators) and one on-site human (UAV operator). This system has been developed in close cooperation with several rescue organizations, and has been deployed in a real-life tunnel accident use case. The human-robot team jointly explores an accident site, communicating using a multi-modal team interface, and spoken dialogue. The paper describes the development of this complex socio-technical system per se, as well as recent experience in evaluating the performance of this system.

Multi-view traffic sign detection, recognition, and 3D localisation

  • DOI: 10.1007/s00138-011-0391-3
  • Link: https://doi.org/10.1007/s00138-011-0391-3
  • Workplace: Vision for Robots and Autonomous Systems
  • Abstract:
    Several applications require information about street furniture. Part of the task is to survey all traffic signs. This has to be done for millions of km of road, and the exercise needs to be repeated every so often. We used a van with eight roof-mounted cameras to drive through the streets and took images every meter. The paper proposes a pipeline for the efficient detection and recognition of traffic signs from such images. The task is challenging, as illumination conditions change regularly, occlusions are frequent, sign positions and orientations vary substantially, and the actual signs are far less similar among equal types than one might expect. We combine 2D and 3D techniques to improve results beyond the state-of-the-art, which is still very much preoccupied with single view analysis. For the initial detection in single frames, we use a set of colour- and shape-based criteria. They yield a set of candidate sign patterns. The selection of such candidates allows for a significant speed up over a sliding window approach while keeping similar performance. A speedup is also achieved through a proposed efficient bounded evaluation of AdaBoost detectors. The 2D detections in multiple views are subsequently combined to generate 3D hypotheses. A Minimum Description Length formulation yields the set of 3D traffic signs that best explains the 2D detections. The paper comes with a publicly available database, with more than 13,000 traffic sign annotations.

Non-Rigid Object Detection with Local Interleaved Sequential Alignment (LISA)

  • DOI: 10.1109/TPAMI.2013.171
  • Link: https://doi.org/10.1109/TPAMI.2013.171
  • Workplace: Department of Cybernetics
  • Abstract:
    This paper shows that the successively evaluated features used in a sliding window detection process to decide about object presence/absence also contain knowledge about object deformation. We exploit these detection features to estimate the object deformation. Estimated deformation is then immediately applied to not yet evaluated features to align them with the observed image data. In our approach, the alignment estimators are jointly learned with the detector. The joint process allows for the learning of each detection stage from less deformed training samples than in the previous stage. For the alignment estimation we propose regressors that approximate non-linear regression functions and compute the alignment parameters extremely fast.

Detection of Curvilinear Objects in Aerial Images

  • Authors: Zuzánek, P., doc. Ing. Karel Zimmermann, Ph.D., Hlaváč, V.
  • Publication: CVWW 2013: Proceedings of the 18th Computer Vision Winter Workshop. Vienna: Vienna University of Technology, 2013, pp. 9-15. ISBN 978-3-200-02943-9.
  • Year: 2013
  • Workplace: Department of Cybernetics
  • Abstract:
    This paper introduces a general framework for autonomous detection of curvilinear objects in aerial images. Our contribution is two-fold. First, we designed a simple yet efficient method which sequentially prunes the space of possible curvilinear objects and thus reduces both the false-negative detection rate and the computational resources required with respect to exhaustive search methods. Second, our method can handle many types of curvilinear objects (e.g. roads, pipelines). We tested the method on our own dataset consisting of highway images. The produced dataset is publicly available. We reached 93.07% overall accuracy.

Domain Adaptation for Sequential Detection

  • Autoři: Fojtů, Š., doc. Ing. Karel Zimmermann, Ph.D., Pajdla, T., Hlaváč, V.
  • Publikace: SCIA 2013: Proceedings of the 18th Scandinavian Conference on Image Analysis. Heidelberg: Springer, 2013, pp. 215-224. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-38885-9. Available from: http://link.springer.com/chapter/10.1007/978-3-642-38886-6_21
  • Rok: 2013
  • DOI: 10.1007/978-3-642-38886-6_21
  • Odkaz: https://doi.org/10.1007/978-3-642-38886-6_21
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a domain adaptation method for a sequential decision-making process. While most state-of-the-art approaches focus on SVM detectors, we propose a domain adaptation method for a sequential detector similar to WaldBoost, which is suitable for real-time processing. The work is motivated by applications in surveillance, where detectors must be adapted to new observation conditions. We address the situation where the new observation is related to the previous observation by a parametric transformation. We propose a learning procedure which reveals the hidden transformation between the old and new data. The transformation essentially allows the knowledge from the old data to be transferred to the new data. We show that our method can achieve a 60% speedup in training w.r.t. the baseline WaldBoost algorithm while outperforming it in precision.
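    The idea of recovering a hidden parametric transformation between domains can be illustrated with a minimal sketch. Assuming (this is a simplification, not the paper's formulation) that old and new 1-D observations are related by an affine map, the transformation can be recovered by least squares:

    ```python
    import numpy as np

    def estimate_affine_1d(old, new):
        """Recover scale a and offset b such that new ≈ a*old + b.

        A toy stand-in for hidden-transformation estimation: knowing (a, b)
        lets knowledge learned on `old` be reused on the new domain.
        """
        A = np.stack([old, np.ones_like(old)], axis=1)   # design matrix [x, 1]
        (a, b), *_ = np.linalg.lstsq(A, new, rcond=None)
        return a, b
    ```

    Once (a, b) is known, samples or detector responses learned on the old domain can be mapped into the new one instead of retraining from scratch.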

Exploiting Features - Locally Interleaved Sequential Alignment for Object Detection

  • Autoři: doc. Ing. Karel Zimmermann, Ph.D., Hurych, D., prof. Ing. Tomáš Svoboda, Ph.D.
  • Publikace: Computer Vision - ACCV 2012, 11th Asian Conference on Computer Vision, Part 1. Heidelberg: Springer, 2013, pp. 446-459. Lecture Notes in Computer Science. ISSN 0302-9743. ISBN 978-3-642-37330-5. Available from: ftp://cmp.felk.cvut.cz/pub/cmp/articles/hurycd1/hurych-accv2012.pdf
  • Rok: 2013
  • DOI: 10.1007/978-3-642-37331-2_34
  • Odkaz: https://doi.org/10.1007/978-3-642-37331-2_34
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We exploit image features multiple times in order to make a sequential decision process faster and better performing. In the decision process, features providing knowledge about the object's presence or absence in a given detection window are successively evaluated. We show that these features also provide information about the object's position within the evaluated window. The classification process is sequentially interleaved with estimating the correct position. The position estimate is used for steering the features yet to be evaluated. This locally interleaved sequential alignment (LISA) allows running an object detector on a sparser grid, which speeds up the process. The position alignment is jointly learned with the detector. We achieve a better detection rate since the method allows for training the detector on perfectly aligned image samples. For estimation of the alignment we propose a learnable regressor that approximates a non-linear regression function and runs in negligible time.

Mutual On-Line Learning for Detection and Tracking in High-Resolution Images

  • Autoři: Hurych, D., doc. Ing. Karel Zimmermann, Ph.D., prof. Ing. Tomáš Svoboda, Ph.D.
  • Publikace: Proceedings of the International Joint Conference on Computer Vision, Imaging and Computer Graphics. Theory and Applications (VISIGRAPP2011). Heidelberg: Springer, 2013, pp. 240-256. Communications in Computer and Information Science. ISSN 1865-0929. ISBN 978-3-642-32349-2.
  • Rok: 2013
  • DOI: 10.1007/978-3-642-32350-8_15
  • Odkaz: https://doi.org/10.1007/978-3-642-32350-8_15
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    This paper addresses object detection and tracking in high-resolution omnidirectional images. The foreseen application is a visual subsystem of a rescue robot equipped with an omnidirectional camera, which demands real-time efficiency and robustness against a changing viewpoint. Object detectors typically do not guarantee a specific frame rate; the detection time may vastly depend on scene complexity and image resolution. An adapted tracker can often help to overcome situations where the appearance of the object is far from the training set. On the other hand, once a tracker is lost, it almost never finds the object again. We propose a combined solution where a very efficient tracker (based on sequential linear predictors) incrementally accommodates varying appearance and speeds up the whole process. Next, we propose to incrementally update the detector with examples collected by the tracker. We experimentally show that the performance of the combined algorithm, measured by the ratio between false positives and false negatives, outperforms both individual algorithms. The tracker allows running the expensive detector only sparsely, enabling the combined solution to run in real time on 12 MPx images from a high-resolution omnidirectional camera (Ladybug3).

Terrain Adaptive Odometry for Mobile Skid-steer Robots

  • Autoři: Reinštein, M., Kubelka, V., doc. Ing. Karel Zimmermann, Ph.D.
  • Publikace: ICRA2013: Proceedings of 2013 IEEE International Conference on Robotics and Automation. Piscataway: IEEE, 2013, pp. 4706-4711. ISSN 1050-4729. ISBN 978-1-4673-5641-1.
  • Rok: 2013
  • DOI: 10.1109/ICRA.2013.6631247
  • Odkaz: https://doi.org/10.1109/ICRA.2013.6631247
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    This paper proposes a novel approach to improving the precision and reliability of odometry for skid-steer mobile robots by means inspired by robotic terrain classification (RTC). In contrast to standard RTC approaches, we do not provide human-labeled discrete terrain categories; we classify the terrain directly by the values of coefficients correcting the robot's odometry. These coefficients make the odometry model adaptable to the terrain type due to inherent slip compensation. Estimation of the correction coefficients is based on feature extraction from vibration data measured by an inertial measurement unit and on a regression function trained offline. Statistical features from the time domain, frequency domain, and wavelet features were explored, and the best were automatically selected. To provide a ground-truth trajectory for offline training, a portable overhead camera tracking system was developed. Experimental evaluation on rough outdoor terrain showed a 67.9% and 7.5% improvement in position RMSE with respect to a state-of-the-art odometry model. Moreover, the proposed approach is straightforward, easy to implement online, and low on computational demands.
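    The regression pipeline can be sketched as follows. This is a minimal illustration, not the paper's model: `vibration_features` uses only a few time-domain statistics (the paper also selects frequency-domain and wavelet features), and the weight vector `w` stands in for the offline-trained regressor.

    ```python
    import numpy as np

    def vibration_features(accel):
        """Simple time-domain features from a window of IMU accelerations."""
        return np.array([accel.mean(), accel.std(), np.abs(np.diff(accel)).mean()])

    def corrected_odometry(d_wheel, accel, w):
        """Scale a wheel-odometry increment by a regressed slip coefficient.

        `w` stands in for offline-trained regression weights; any values used
        with it here are made up for illustration.
        """
        c = float(vibration_features(accel) @ w)   # terrain-dependent coefficient
        return c * d_wheel
    ```

    The correction coefficient plays the role of a continuous "terrain label": it directly compensates slip instead of naming the terrain type.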

A Unified Framework for Planning and Execution-Monitoring of Mobile Robots

  • Autoři: Gianni, M., Papadakis, P., Pirri, F., Liu, M., Pomerleau, F., Colas, F., doc. Ing. Karel Zimmermann, Ph.D., prof. Ing. Tomáš Svoboda, Ph.D., Petříček, T., Kruijff, G., Khambhaita, H., Zender, H.
  • Publikace: Automated Action Planning for Autonomous Mobile Robots: Papers from the AAAI Workshop (WS-11-09). Menlo Park: AAAI Press, 2011, pp. 39-44. ISBN 978-1-57735-525-0.
  • Rok: 2011
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We present an original integration of high-level planning and execution with incoming perceptual information from vision, SLAM, topological map segmentation, and dialogue. The task of the robot system implementing the integrated model is to explore unknown areas and report detected objects to an operator by speaking aloud. The knowledge base of the planner maintains a graph-based representation of the metric map that is dynamically constructed via an unsupervised topological segmentation method and augmented with information about the type and position of objects detected within the map, such as cars or containers. According to this knowledge, the cognitive robot can infer strategies, generating parametric plans that are instantiated from the perceptual processes. Finally, a model-based approach for the execution and control of the robot system is proposed.

Detection of unseen patches trackable by linear predictors

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Linear predictors (LPs) are used for tracking because of their computational efficiency, which is better than that of steepest-descent methods (e.g. Lucas-Kanade). The only disadvantage of LPs is the necessary learning phase, which hinders the predictors' applicability as a general patch tracker. We address this limitation and propose to learn a bank of LPs offline and develop an online detector which selects image regions that could be tracked by some predictor from the bank. The proposed detector differs significantly from the usual solutions that attempt to find the closest match between a candidate patch and a database of exemplars. We construct the detector directly from the learned linear predictor. The detector positively detects the learned patches, but also many other image patches which were not used in the LP learning phase. This means that an LP is able to track previously unseen image patches whose appearances often differ significantly from the patches used for learning.

Fast Learnable Object Tracking and Detection in High-resolution Omnidirectional Images

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    This paper addresses object detection and tracking in high-resolution omnidirectional images. The foreseen application is a visual subsystem of a rescue robot equipped with an omnidirectional camera, which demands real-time efficiency and robustness against a changing viewpoint. Object detectors typically do not guarantee a specific frame rate; the detection time may vastly depend on scene complexity and image resolution. An adapted tracker can often help to overcome situations where the appearance of the object is far from the training set. On the other hand, once a tracker is lost, it almost never finds the object again. We propose a combined solution where a very efficient tracker (based on sequential linear predictors) incrementally accommodates varying appearance and speeds up the whole process. We experimentally show that the performance of the combined algorithm, measured by the ratio between false positives and false negatives, outperforms both individual algorithms.

Improving Cascade of Classifiers by Sliding Window Alignment in Between

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We improve an object detector based on a cascade of classifiers by a local alignment of the sliding window. The detector needs to operate on a relatively sparse grid in order to achieve real-time performance on high-resolution images. The proposed local alignment in the middle of the cascade improves its recognition performance whilst retaining the necessary speed. We show that the moment of the alignment matters and discuss the performance in terms of false negatives and false positives. The proposed method is tested on a car detection problem.

Multi-view Traffic Sign Detection, Recognition, and 3D Localisation

  • DOI: 10.1007/s00138-011-0391-3
  • Odkaz: https://doi.org/10.1007/s00138-011-0391-3
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Several applications require information about street furniture. Part of the task is to survey all traffic signs. This has to be done for millions of km of road, and the exercise needs to be repeated every so often. We used a van with eight roof-mounted cameras to drive through the streets and took images every meter. The paper proposes a pipeline for the efficient detection and recognition of traffic signs from such images. The task is challenging, as illumination conditions change regularly, occlusions are frequent, sign positions and orientations vary substantially, and the actual signs are far less similar among equal types than one might expect. We combine 2D and 3D techniques to improve results beyond the state-of-the-art, which is still very much preoccupied with single-view analysis. For the initial detection in single frames, we use a set of colour- and shape-based criteria. They yield a set of candidate sign patterns. The selection of such candidates allows for a significant speed-up over a sliding-window approach while keeping similar performance. A speedup is also achieved through a proposed efficient bounded evaluation of AdaBoost detectors. The 2D detections in multiple views are subsequently combined to generate 3D hypotheses. A Minimum Description Length formulation yields the set of 3D traffic signs that best explains the 2D detections. The paper comes with a publicly available database with more than 13,000 traffic sign annotations.

Inerciálně stabilizovaná kamerová základna pro bezpilotní letoun s automatickým sledováním pozemních cílů

  • Pracoviště: Katedra kybernetiky, Katedra řídicí techniky
  • Anotace:
    The article presents a development project of an inertially and image-stabilized camera platform, currently being carried out by a team at CTU in cooperation with the Military Technical Institute of Aviation and Air Defence and the Prague company ESSA. The specific figures given in the article concern the already completed phase of the project, in which a first functional prototype based on two direct-drive motors and MEMS gyroscopes was developed. A second, improved version is currently under development; it uses two-stage stabilization of the optical axis and avoids the attractive but expensive direct-drive motor technology.

Anytime learning for the NoSLLiP tracker

  • DOI: 10.1016/j.physletb.2003.10.07
  • Odkaz: https://doi.org/10.1016/j.physletb.2003.10.07
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Anytime learning for the Sequence of Learned Linear Predictors (SLLiP) tracker is proposed. Since learning might be time-consuming for large problems, we present an anytime learning algorithm which, after a very short initialization period, provides a solution with a defined precision. As SLLiP tracking requires only a fraction of the processing power of an ordinary PC, the learning can continue in a parallel background thread, continuously delivering improved SLLiPs, i.e. faster ones with lower computational complexity and the same pre-defined precision. The proposed approach is verified on publicly available sequences with approximately 12,000 ground-truthed frames. The learning time is shown to be twenty times smaller than that of learning based on the linear programming proposed in the paper that introduced the SLLiP tracker [TR]. Its robustness and accuracy are similar. Superiority in frame rate and robustness with respect to the SIFT detector, the Lucas-Kanade tracker, and Jurie's tracker is also demonstrated.

Tracking by an Optimal Sequence of Linear Predictors

  • DOI: 10.1109/TPAMI.2008.119
  • Odkaz: https://doi.org/10.1109/TPAMI.2008.119
  • Pracoviště: Katedra kybernetiky
  • Anotace:
    We propose a learning approach to tracking that explicitly minimizes the computational complexity of the tracking process subject to a user-defined probability of failure (loss-of-lock) and precision. The tracker is formed by a Number of Sequences of Learned Linear Predictors (NoSLLiP). Robustness of NoSLLiP is achieved by modeling the object as a collection of local motion predictors: object motion is estimated by the outlier-tolerant RANSAC algorithm from the local predictions. The efficiency of the NoSLLiP tracker stems from (i) the simplicity of the local predictors and (ii) the fact that all design decisions - the number of local predictors used by the tracker, their computational complexity (i.e. the number of observations the prediction is based on), their locations, as well as the number of RANSAC iterations - are subject to the optimization (learning) process. All time-consuming operations are performed during the learning stage.
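    The robust fusion of local predictions can be illustrated with a toy 1-D version. This is a sketch under simplifying assumptions, not the paper's implementation: each local linear predictor is reduced to a single translation estimate, and a one-point RANSAC-style loop keeps the hypothesis with the most inliers.

    ```python
    import random

    def ransac_translation(local_predictions, tol=1.0, iters=20, seed=0):
        """Robust fusion of local predictor outputs into one object motion.

        Each element of `local_predictions` is a translation estimate from one
        local predictor; the loop keeps the hypothesis with the most inliers,
        tolerating outlier predictors (toy 1-D version).
        """
        rng = random.Random(seed)
        best, best_inliers = None, -1
        for _ in range(iters):
            h = rng.choice(local_predictions)                 # 1-point hypothesis
            inliers = [p for p in local_predictions if abs(p - h) <= tol]
            if len(inliers) > best_inliers:
                # Refit on the inlier set (mean) and remember the support size.
                best, best_inliers = sum(inliers) / len(inliers), len(inliers)
        return best
    ```

    A single gross outlier among the local predictions is simply voted out, which is what makes the constellation of cheap local predictors robust.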

Simultaneous learning of motion and appearance

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A new learning method for motion estimation of objects with significantly varying appearance is proposed. Varying object appearance is represented by a low dimensional space of appearance parameters. The appearance mapping and motion estimation method are optimized simultaneously. Appearance parameters are estimated by unsupervised learning. The method is experimentally verified by a tracking application on sequences which exhibit strong variable illumination, non-rigid deformations and self-occlusions.

Adaptive parameter optimization for real-time tracking

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Adaptation of a tracking procedure combined in a common way with a Kalman filter is formulated as a constrained optimization problem, where the trade-off between precision and loss-of-lock probability is explicitly taken into account. While the tracker is learned to minimize computational complexity during the learning stage, in the tracking stage the precision is maximized online under a constraint imposed by the loss-of-lock probability, resulting in an optimal setting of the tracking procedure. We experimentally show that the proposed method converges to a steady solution in all variables. In contrast to common Kalman-filter-based tracking, we achieve a significantly lower state covariance matrix. We also show that if the covariance matrix is continuously updated, the method is able to adapt to different situations: if the dynamic model is precise enough, the tracker is allowed to spend a longer time on fine motion estimation; if the motion becomes saccadic, i.e. unpredictable, the setting is adapted accordingly.

Learning Efficient Linear Predictors for Motion Estimation

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A novel object representation for tracking is proposed. The tracked object is represented as a constellation of spatially localised linear predictors which are learned on a single training image. In the learning stage, sets of pixels whose intensities allow for optimal least-squares predictions of the transformations are selected as the support of the linear predictor. The approach comprises three contributions: learning object-specific linear predictors, explicitly dealing with the predictor precision versus computational complexity trade-off, and selecting a view-specific set of predictors suitable for a global object motion estimate. Robustness to occlusion is achieved by a RANSAC procedure. The learned tracker is very efficient, achieving frame rates generally higher than 30 frames per second despite the Matlab implementation.
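    The learning stage can be sketched in 1-D. This is a minimal least-squares illustration under assumed conventions (synthetic shifts of a fixed support set), not the paper's pixel-selection procedure:

    ```python
    import numpy as np

    def learn_linear_predictor(signal, support, max_shift=2):
        """Learn a least-squares linear predictor for 1-D translation.

        For every synthetic shift t, sample intensities at `support + t` and
        regress t from the intensity difference to the template, a toy 1-D
        analogue of learning a linear predictor from one training image.
        """
        template = signal[support]
        X, y = [], []
        for t in range(-max_shift, max_shift + 1):
            obs = signal[np.clip(support + t, 0, len(signal) - 1)]
            X.append(obs - template)
            y.append(float(t))
        w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
        return w, template

    def predict_shift(w, template, signal, support):
        """Predict how far the sampled support has moved from the template."""
        return float(w @ (signal[support] - template))
    ```

    At run time the prediction is a single dot product, which is where the computational efficiency of linear predictors comes from.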

Multiview 3D Tracking with an Incrementally Constructed 3D Model

  • DOI: 10.1109/3DPVT.2006.101
  • Odkaz: https://doi.org/10.1109/3DPVT.2006.101
  • Pracoviště: Katedra kybernetiky

A New Class of Learnable Detectors for Categorisation

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    A new class of image-level detectors that can be adapted by machine learning techniques to detect parts of objects from a given category is proposed. A classifier (e.g. a neural network or AdaBoost) within the detector selects a relevant subset of extremal regions, i.e. regions that are connected components of a thresholded image. The properties of extremal regions render the detector very robust to illumination change.
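    Extremal regions are easy to enumerate. A toy 1-D sketch (not the paper's detector, which works on 2-D images and feeds region descriptors to a classifier) shows the idea:

    ```python
    def extremal_regions_1d(signal, thresholds):
        """Enumerate extremal regions of a 1-D signal: connected runs of
        samples above each threshold. Regions are nested across thresholds,
        and a monotone intensity change preserves the region structure,
        which is the source of the robustness to illumination.
        """
        regions = set()
        for theta in thresholds:
            start = None
            for i, v in enumerate(signal + [float("-inf")]):   # sentinel ends runs
                if v > theta and start is None:
                    start = i
                elif v <= theta and start is not None:
                    regions.add((start, i))                    # half-open [start, i)
                    start = None
        return regions
    ```

    A classifier would then select the subset of these candidate regions that corresponds to the object parts of interest.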

Probabilistic Estimation of Articulated Body Model from Multiview Data

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    An optimization algorithm and statistical description of articulated body model estimation is proposed. The optimization algorithm fits the model into segmented multiview images. The input of our algorithm is a sequence of segmented images captured by several cameras and a structure of the articulated model. The output of the optimization procedure is shape and motion of the articulated model. The optimization runs over all cameras and all images in the sequence. We focus on description and optimization of probability distribution of the model parameters given segmented multiview sequence. We demonstrate the performance of the algorithm on real sequences of walking human.

Unconstrained Licence Plate Detection

  • Pracoviště: Katedra kybernetiky
  • Anotace:
    Licence plate and traffic sign detection and recognition have a number of different applications relevant to transportation systems, such as traffic monitoring, detection of stolen vehicles, driver navigation support, and statistical research. A number of methods have been proposed, but only for particular cases and working under constraints (e.g. a known text direction or high resolution). Therefore, a new class of locally threshold-separable detectors based on extremal regions, which can be adapted by machine learning techniques to arbitrary shapes, is proposed. On a test set of licence plate images taken from viewpoints ranging over (-45°, 45°) and scales from seven to hundreds of pixels in height, even under bad illumination conditions and partial occlusions, high detection accuracy (95%) is achieved. Finally, we demonstrate the detector's generic abilities on traffic sign detection. The standard classifier (a neural network) within the detector selects a relevant subset of extremal regions.

Za stránku zodpovídá: Ing. Mgr. Radovan Suk