
Ing. Tomáš Rouček

All publications

Contrastive Learning for Image Registration in Visual Teach and Repeat Navigation

  • DOI: 10.3390/s22082975
  • Link: https://doi.org/10.3390/s22082975
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    Visual teach and repeat navigation (VT&R) is popular in robotics thanks to its simplicity and versatility. It enables mobile robots equipped with a camera to traverse learned paths without the need to create globally consistent metric maps. Although teach and repeat frameworks have been reported to be relatively robust to changing environments, they still struggle with day-to-night and seasonal changes. This paper aims to find the horizontal displacement between prerecorded and currently perceived images required to steer a robot towards the previously traversed path. We employ a fully convolutional neural network to obtain dense representations of the images that are robust to changes in the environment and variations in illumination. The proposed model achieves state-of-the-art performance on multiple datasets with seasonal and day/night variations. In addition, our experiments show that it is possible to use the model to generate additional training examples that can be used to further improve the original model's robustness. We also conducted a real-world experiment on a mobile robot to demonstrate the suitability of our method for VT&R.
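
    As an illustration of the registration idea described above, here is a minimal, self-contained sketch (the fully convolutional feature extractor is mocked with random tensors; the function name and shapes are illustrative assumptions, not the paper's code): dense representations of the teach and repeat images are cross-correlated along the horizontal axis, and the correlation peak gives the displacement used for steering.

      import numpy as np

      def horizontal_displacement(map_feats, live_feats):
          """Estimate the horizontal shift (in feature-map columns) between two
          dense representations of shape (channels, height, width)."""
          assert map_feats.shape == live_feats.shape
          _, _, width = map_feats.shape
          # Collapse the height dimension so each image becomes a (channels, width) signal.
          a = map_feats.mean(axis=1)
          b = live_feats.mean(axis=1)
          # Zero-mean each channel so the correlation peak reflects structure, not brightness.
          a -= a.mean(axis=1, keepdims=True)
          b -= b.mean(axis=1, keepdims=True)
          # Cross-correlate every channel along the width and sum the responses.
          scores = np.zeros(2 * width - 1)
          for ca, cb in zip(a, b):
              scores += np.correlate(ca, cb, mode="full")
          return int(np.argmax(scores)) - (width - 1)

      # Toy usage: a random "teach" feature tensor and a copy shifted by 5 columns.
      rng = np.random.default_rng(0)
      teach = rng.standard_normal((16, 8, 64))
      repeat = np.roll(teach, -5, axis=2)
      print(horizontal_displacement(teach, repeat))  # recovers the injected shift of 5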

Embedding Weather Simulation in Auto-Labelling Pipelines Improves Vehicle Detection in Adverse Conditions

  • DOI: 10.3390/s22228855
  • Link: https://doi.org/10.3390/s22228855
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    The performance of deep learning-based detection methods has made them an attractive option for robotic perception. However, their training typically requires large volumes of data containing all the various situations the robots may potentially encounter during their routine operation. Thus, the workforce required for data collection and annotation is a significant bottleneck when deploying robots in the real world. This applies especially to outdoor deployments, where robots have to face various adverse weather conditions. We present a method that allows an autonomous car transporter to train its neural networks for vehicle detection without human supervision or annotation. We provide the robot with a hand-coded algorithm for detecting cars in LiDAR scans in favourable weather conditions and complement this algorithm with a tracking method and a weather simulator. As the robot traverses its environment, it can collect data samples, which can be subsequently processed into training samples for the neural networks. As the tracking method is applied offline, it can exploit detections made both before and after the currently processed scan, so the quality of the resulting annotations exceeds that of the raw detections. Along with the acquisition of the labels, the weather simulator is able to alter the raw sensory data, which are then fed into the neural network together with the labels. We show how this pipeline, run in an offline fashion, can exploit off-the-shelf weather simulation for the auto-labelling training scheme in a simulator-in-the-loop manner. We show how such a framework produces an effective detector and how the weather simulator-in-the-loop is beneficial for the robustness of the detector. Thus, our automatic data annotation pipeline significantly reduces not only the data annotation but also the data collection effort. This allows the integration of deep learning algorithms into existing robotic systems without the need for tedious data annotation and collection in all possible situations. Moreover, the method provides annotated datasets that can be used to develop other methods. To promote the reproducibility of our research, we provide our datasets, codes and models online.
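
    A toy, self-contained sketch of the offline auto-labelling loop described above (all data, thresholds and function names are hypothetical stand-ins, not the published pipeline): a hand-coded detector fires only in some frames, an offline tracker interpolates the gaps using both earlier and later detections, and a simple weather perturbation is applied to the raw scans before each scan/label pair is stored for training.

      import numpy as np

      def interpolate_track(detections, num_frames):
          """Fill in missing per-frame object positions by linear interpolation
          between the nearest past and future detections (offline smoothing)."""
          frames = sorted(detections)
          xs = [detections[f][0] for f in frames]
          ys = [detections[f][1] for f in frames]
          all_frames = np.arange(num_frames)
          return np.stack([np.interp(all_frames, frames, xs),
                           np.interp(all_frames, frames, ys)], axis=1)

      def simulate_fog(scan, rng, dropout=0.3, noise=0.05):
          """Crude stand-in for a weather simulator: randomly drop lidar returns
          and add range noise to the surviving points."""
          keep = rng.random(len(scan)) > dropout
          return scan[keep] + rng.normal(0.0, noise, size=(keep.sum(), scan.shape[1]))

      rng = np.random.default_rng(1)
      num_frames = 10
      scans = [rng.uniform(-10, 10, size=(100, 3)) for _ in range(num_frames)]
      # The hand-coded detector only fired in a few favourable frames.
      sparse_detections = {0: (1.0, 2.0), 4: (3.0, 2.5), 9: (6.0, 3.0)}

      labels = interpolate_track(sparse_detections, num_frames)
      training_set = [(simulate_fog(scan, rng), label)
                      for scan, label in zip(scans, labels)]
      print(len(training_set), training_set[3][1])  # interpolated label for frame 3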

Self-Supervised Robust Feature Matching Pipeline for Teach and Repeat Navigation

  • DOI: 10.3390/s22082836
  • Link: https://doi.org/10.3390/s22082836
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    The performance of deep neural networks and the low cost of computational hardware have made computer vision a popular choice in many robotic systems. An attractive feature of deep-learned methods is their ability to cope with appearance changes caused by day-night cycles and seasonal variations. However, deep learning of neural networks typically relies on large numbers of hand-annotated images, which requires significant effort for data collection and annotation. We present a method that allows autonomous, self-supervised training of a neural network in visual teach-and-repeat (VT&R) tasks, where a mobile robot has to traverse a previously taught path repeatedly. Our method is based on a fusion of two image registration schemes: one based on a Siamese neural network and another on point-feature matching. As the robot traverses the taught paths, it uses the results of feature-based matching to train the neural network, which, in turn, provides coarse registration estimates to the feature matcher. We show that as the neural network gets trained, the accuracy and robustness of the navigation increase, making the robot capable of dealing with significant changes in the environment. This method can significantly reduce the data annotation effort when designing new robotic systems or introducing robots into new environments. Moreover, the method provides annotated datasets that can be deployed in other navigation systems. To promote the reproducibility of the research presented herein, we provide our datasets, codes and trained models online.
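
    A small numpy sketch of the mutual bootstrapping described above (hypothetical data and thresholds, not the released implementation): displacements of matched point features, filtered by the network's coarse estimate, become the self-supervised training targets for the network, while the network's output narrows the matcher's search window on later traversals.

      import numpy as np

      def feature_displacement(matched_x_teach, matched_x_repeat, prior, window=50.0):
          """Robust horizontal displacement from matched keypoint x-coordinates,
          keeping only matches consistent with the coarse prior from the network."""
          shifts = matched_x_repeat - matched_x_teach
          consistent = shifts[np.abs(shifts - prior) < window]
          if consistent.size == 0:
              return None  # registration failed; no training sample is generated
          return float(np.median(consistent))

      # Hypothetical keypoint x-coordinates matched between a teach and a repeat image.
      rng = np.random.default_rng(2)
      x_teach = rng.uniform(0, 640, size=200)
      x_repeat = x_teach + 23.0 + rng.normal(0, 2, size=200)  # true shift: 23 px
      x_repeat[:40] = rng.uniform(0, 640, size=40)            # 20% wrong matches

      coarse_prior = 30.0  # coarse estimate coming from the Siamese network
      label = feature_displacement(x_teach, x_repeat, coarse_prior)
      print(label)  # close to 23 px; used as a self-supervised target for the network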

Semi-supervised Learning for Image Alignment in Teach and Repeat Navigation

  • DOI: 10.1145/3477314.3507045
  • Link: https://doi.org/10.1145/3477314.3507045
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    Visual teach and repeat navigation (VT&R) is a framework that enables mobile robots to traverse previously learned paths. In principle, it relies on computer vision techniques that can compare the camera's current view to a model based on the images captured during the teaching phase. However, these techniques are usually not robust enough when significant changes occur in the environment between the teach and repeat phases. In this paper, we show that contrastive learning methods can learn how the environment changes and improve the robustness of a VT&R framework. We apply a fully convolutional Siamese network to register the images of the teaching and repeat phases. The horizontal displacement between the images is then used in a visual servoing manner to keep the robot on the intended trajectory. The experiments performed on several datasets containing seasonal variations indicate that our method outperforms state-of-the-art algorithms tailored to the purpose of registering images captured in different seasons.
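
    To illustrate the visual-servoing step mentioned above, a tiny sketch follows (the gain, field of view and sign convention are assumptions for illustration, not values from the paper): the horizontal displacement reported by the Siamese network is converted into a proportional steering correction applied while the robot replays the taught path.

      def steering_correction(displacement_px, image_width=640, fov_deg=60.0, gain=0.8):
          """Convert a horizontal image displacement into an angular-velocity command.
          With this (assumed) convention, a positive displacement means the taught
          view appears shifted to the left, so the robot turns left to compensate."""
          deg_per_px = fov_deg / image_width
          heading_error_deg = displacement_px * deg_per_px
          return gain * heading_error_deg  # deg/s, added on top of the taught motion

      print(steering_correction(32))  # ~2.4 deg/s correction for a 32-pixel offset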

Toward Benchmarking of Long-Term Spatio-Temporal Maps of Pedestrian Flows for Human-Aware Navigation

  • DOI: 10.3389/frobt.2022.890013
  • Link: https://doi.org/10.3389/frobt.2022.890013
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    Despite the advances in mobile robotics, the introduction of autonomous robots in human-populated environments is rather slow. One of the fundamental reasons is the acceptance of robots by people directly affected by a robot's presence. Understanding human behavior and dynamics is essential for planning when and how robots should traverse busy environments without disrupting people's natural motion and causing irritation. Research has exploited various techniques to build spatio-temporal representations of people's presence and flows and compared their applicability to planning optimal paths in the future. Many works compare dynamic map-building techniques by showing how one method performs against another on a particular dataset, but without consistent datasets and high-quality comparison metrics, it is difficult to assess how these methods compare as a whole and in specific tasks. This article proposes a methodology for creating high-quality criteria with interpretable results for comparing long-term spatio-temporal representations for human-aware path planning and human-aware navigation scheduling. Two criteria derived from the methodology are then applied to compare the representations built by the techniques found in the literature. The approaches are compared on a real-world, long-term dataset, and the concept is validated in a field experiment on a robotic platform deployed in a human-populated environment. Our results indicate that continuous spatio-temporal methods that model spatial and temporal phenomena independently outperformed the other modeling approaches. Our results provide a baseline for future work to compare a wide range of methods employed for long-term navigation and give researchers an understanding of how these methods compare in various scenarios.
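
    To make the idea of such a criterion concrete, here is a toy, self-contained example (our own illustrative metric and synthetic data, not the article's benchmark): a static model and a simple daily-harmonic temporal model of pedestrian presence are both scored by their mean squared prediction error on held-out days, a crude stand-in for the interpretable, predictive criteria discussed above.

      import numpy as np

      DAY = 24 * 3600.0

      def fit_harmonic(times, presence):
          """Least-squares fit of presence(t) ~ a + b*cos(wt) + c*sin(wt), w = one day."""
          w = 2 * np.pi / DAY
          A = np.stack([np.ones_like(times), np.cos(w * times), np.sin(w * times)], axis=1)
          coef, *_ = np.linalg.lstsq(A, presence, rcond=None)
          return coef

      def predict_harmonic(coef, times):
          w = 2 * np.pi / DAY
          A = np.stack([np.ones_like(times), np.cos(w * times), np.sin(w * times)], axis=1)
          return np.clip(A @ coef, 0.0, 1.0)

      def true_presence(t):
          """Synthetic ground-truth probability of a pedestrian being present at time t."""
          return 0.4 + 0.3 * np.cos(2 * np.pi * (t - 12 * 3600) / DAY)

      rng = np.random.default_rng(3)
      t_train = np.arange(0, 7 * DAY, 600.0)       # one week of observations
      t_test = np.arange(7 * DAY, 9 * DAY, 600.0)  # two held-out days
      y_train = (rng.random(t_train.shape) < true_presence(t_train)).astype(float)
      y_test = (rng.random(t_test.shape) < true_presence(t_test)).astype(float)

      static_err = np.mean((y_test - y_train.mean()) ** 2)
      temporal_model = fit_harmonic(t_train, y_train)
      temporal_err = np.mean((y_test - predict_harmonic(temporal_model, t_test)) ** 2)
      print(f"static MSE {static_err:.3f} vs temporal MSE {temporal_err:.3f}")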

Learning to see through the haze: Multi-sensor learning-fusion System for Vulnerable Traffic Participant Detection in Fog

  • DOI: 10.1016/j.robot.2020.103687
  • Link: https://doi.org/10.1016/j.robot.2020.103687
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    We present an experimental investigation of a multi-sensor fusion-learning system for detecting pedestrians in foggy weather conditions. The method combines two pipelines for people detection running on two different sensors commonly found on moving vehicles: lidar and radar. The two pipelines are not only combined by sensor fusion, but information from one pipeline is used to train the other. We build upon our previous work, where we showed that a lidar pipeline can be used to train a Support Vector Machine (SVM)-based pipeline to interpret radar data, which is useful when conditions become unfavourable for the original lidar pipeline. In this paper, we test the method in a wider range of conditions, such as from a moving vehicle and with multiple people present. Additionally, we compare how the traditional SVM performs against a modern deep neural network when interpreting the radar data in these experiments. Our experiments indicate that either approach results in a progressive improvement in performance during normal operation. Furthermore, our experiments indicate that in the event of the loss of information from a sensor, pedestrian detection and position estimation remain effective.
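
    A compact sketch of the cross-sensor training idea (illustrative only; the radar features and the lidar "teacher" labels below are synthetic stand-ins): labels produced by the lidar pipeline in clear weather supervise an SVM that classifies radar returns, so detection can continue when fog degrades the lidar.

      import numpy as np
      from sklearn.svm import SVC

      rng = np.random.default_rng(4)

      # Synthetic radar feature vectors (e.g. range, doppler, cross-section) for 500 returns.
      pedestrian_returns = rng.normal(loc=[5.0, 1.2, -10.0], scale=0.8, size=(250, 3))
      clutter_returns = rng.normal(loc=[8.0, 0.0, -2.0], scale=1.5, size=(250, 3))
      radar_features = np.vstack([pedestrian_returns, clutter_returns])

      # In clear weather the lidar pipeline provides these labels "for free".
      lidar_labels = np.array([1] * 250 + [0] * 250)

      clf = SVC(kernel="rbf", C=1.0, gamma="scale")
      clf.fit(radar_features, lidar_labels)

      # Later, in fog, the lidar is unreliable but the radar classifier still answers.
      foggy_return = np.array([[5.2, 1.0, -9.5]])
      print("pedestrian" if clf.predict(foggy_return)[0] == 1 else "clutter")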

Robust Image Alignment for Outdoor Teach-and-Repeat Navigation

  • DOI: 10.1109/ECMR50962.2021.9568832
  • Link: https://doi.org/10.1109/ECMR50962.2021.9568832
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    Visual Teach-and-Repeat robot navigation suffers from environmental changes over time, and it struggles in real-world long-term deployments. We propose a robust robot bearing correction method based on traditional principles, aided by the abstraction provided by the higher layers of widely available pre-trained Convolutional Neural Networks (CNNs). Our method applies a two-dimensional Fast Fourier Transform-based approach over several different convolution filters from the higher levels of a CNN to robustly estimate the alignment between two corresponding images. The method also estimates its uncertainty, which is essential for the navigation system to decide how much it can trust the bearing correction. We show that our "learning-free" method is comparable with the state-of-the-art methods when the environmental conditions change only slightly, but it outperforms them at night.
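
    A minimal numpy sketch of the alignment idea (assumed tensor shapes and a toy uncertainty measure; not the released implementation): activations of several CNN filters are cross-correlated via the FFT along the horizontal axis, the peak of the aggregated correlation gives the bearing correction, and the peak's prominence serves as a crude confidence estimate.

      import numpy as np

      def fft_alignment(feats_a, feats_b):
          """feats_*: (channels, height, width) activations of a pre-trained CNN.
          Returns the horizontal shift in columns and the correlation-peak prominence."""
          fa = np.fft.rfft(feats_a, axis=-1)
          fb = np.fft.rfft(feats_b, axis=-1)
          # Circular cross-correlation of every row of every channel via the FFT.
          corr = np.fft.irfft(fa * np.conj(fb), n=feats_a.shape[-1], axis=-1)
          score = corr.sum(axis=(0, 1))  # aggregate over channels and rows
          peak = int(np.argmax(score))
          width = feats_a.shape[-1]
          shift = peak if peak <= width // 2 else peak - width
          prominence = float(score[peak] / (np.abs(score).mean() + 1e-9))
          return shift, prominence

      # Toy usage: random "activations" and a copy rolled by 7 columns.
      rng = np.random.default_rng(5)
      a = rng.standard_normal((32, 6, 96))
      b = np.roll(a, 7, axis=-1)
      print(fft_alignment(a, b))  # recovers the 7-column shift (as -7 under this convention)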

CHRONOROBOTICS: Representing the Structure of Time for Service Robots

  • DOI: 10.1145/3440084.3441195
  • Link: https://doi.org/10.1145/3440084.3441195
  • Department: Department of Computer Science, Artificial Intelligence Center
  • Annotation:
    Chronorobotics is the investigation of scientific methods allowing robots to adapt to and learn from the perpetual changes occurring in natural and human-populated environments. We present methods that can introduce the notion of dynamics into spatial environment models, resulting in representations which provide service robots with the ability to predict future states of changing environments. Several long-term experiments indicate that the aforementioned methods gradually improve the efficiency of robots' autonomous operations over time. More importantly, the experiments indicate that chronorobotic concepts improve robots' ability to seamlessly merge into human-populated environments, which is important for their integration and acceptance in human societies.

DARPA Subterranean Challenge: Multi-robotic exploration of underground environments

  • DOI: 10.1007/978-3-030-43890-6_22
  • Link: https://doi.org/10.1007/978-3-030-43890-6_22
  • Department: Artificial Intelligence Center, Vision for Robotics and Autonomous Systems, Multi-robot Systems
  • Annotation:
    The Subterranean Challenge (SubT) is a contest organised by the Defense Advanced Research Projects Agency (DARPA). The contest reflects the need to increase the safety and efficiency of underground search-and-rescue missions. In the SubT challenge, teams of mobile robots have to detect, localise and report the positions of specific objects in an underground environment. This paper provides a description of the heterogeneous multi-robot exploration system of our CTU-CRAS team, which scored third place in the Tunnel Circuit round, surpassing the performance of all other non-DARPA-funded competitors. In addition to describing the platforms, algorithms and strategies used, we also discuss the lessons learned from participating in such a contest.

Responsible person: Ing. Mgr. Radovan Suk