Mgr. Vojtěch Čermák, Ph.D.
All publications
FungiTastic: A multi-modal dataset and benchmark for image categorization
- Authors: Picek, L., Ing. Klára Janoušková, Mgr. Vojtěch Čermák, Ph.D., prof. Ing. Jiří Matas, Ph.D.
- Publication: 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Los Alamitos: IEEE Computer Society, 2025. p. 2037-2047. ISSN 2160-7516. ISBN 979-8-3315-9994-2.
- Year: 2025
- DOI: 10.1109/CVPRW67362.2025.00192
- Link: https://doi.org/10.1109/CVPRW67362.2025.00192
- Department: Visual Recognition Group
Abstract:
We introduce a new, challenging benchmark and a dataset, FungiTastic, based on fungal records continuously collected over a twenty-year span. The dataset is labelled and curated by experts and consists of about 350k multimodal observations of 6k fine-grained categories (species). The fungi observations include photographs and additional data, e.g., meteorological and climatic data, satellite images, and body part segmentation masks. FungiTastic is one of the few benchmarks that include a test set with DNA-sequenced ground truth of unprecedented label reliability. The benchmark is designed to support (i) standard closed-set classification, (ii) open-set classification, (iii) multi-modal classification, (iv) few-shot learning, (v) domain shift, and many more. We provide tailored baselines for many use cases, a multitude of ready-to-use pre-trained models on HuggingFace, and a framework for model training. The documentation and the baselines are available on GitHub and Kaggle.
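The open-set setting mentioned in the abstract asks a classifier to reject images of species absent from the training set. A minimal sketch of one generic rejection strategy, thresholding the top softmax probability; this is a common baseline pattern, not necessarily the one used in the paper, and the species names are purely illustrative:

```python
import numpy as np

def open_set_predict(logits, known_classes, threshold=0.5):
    """Assign a known class, or 'unknown' when the top softmax
    probability falls below a rejection threshold."""
    logits = np.asarray(logits, dtype=float)
    z = logits - logits.max()          # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    top = int(probs.argmax())
    if probs[top] < threshold:
        return "unknown"
    return known_classes[top]

species = ["A. muscaria", "B. edulis", "C. cibarius"]
# A confident (peaked) prediction maps to a known species...
pred_known = open_set_predict([4.0, 0.1, 0.2], species)
# ...while a flat distribution is rejected as a novel category.
pred_novel = open_set_predict([0.1, 0.2, 0.15], species)
```

The threshold trades recall on known species against false acceptance of novel ones; in practice it would be tuned on a validation set containing held-out categories.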
LifeCLEF 2025 Teaser: Challenges on Species Presence Prediction and Identification, and Individual Animal Identification
- Authors: Joly, A., Picek, L., Kahl, S., Goëau, H., Adam, L., prof. Ing. Jiří Matas, Ph.D., Ing. Klára Janoušková, Mgr. Vojtěch Čermák, Ph.D.
- Publication: Advances in Information Retrieval. Cham: Springer International Publishing, 2025. p. 373-381. LNCS. vol. 15576 LNCS. ISSN 1611-3349. ISBN 978-3-031-88720-8.
- Year: 2025
- DOI: 10.1007/978-3-031-88720-8_57
- Link: https://doi.org/10.1007/978-3-031-88720-8_57
- Department: Visual Recognition Group
Abstract:
Accurate identification, monitoring, and understanding of species distribution is important for biodiversity conservation, invasive species control, understanding climate change, and ecosystem management. Current methodologies for species identification, animal re-identification, and large-scale population monitoring are both resource-intensive and technically complex, posing significant challenges for widespread implementation. This highlights a need for automated, scalable solutions to enhance efficiency and accuracy. Since 2011, the LifeCLEF lab has driven progress in this field by organizing annual challenges to promote innovation in biodiversity informatics. The 2025 edition introduces five – one new, and four continued – data-driven tasks aimed at addressing current challenges in species recognition: (i) AnimalCLEF: multi-species individual animal identification, (ii) BirdCLEF: bird species identification in soundscape recordings, (iii) FungiCLEF: few-shot classification with rare fungi species, (iv) GeoLifeCLEF: multi-modal species prediction using remote sensing and large-scale biodiversity data, and (v) PlantCLEF: multi-species plant identification in vegetation plot images.
Overview of LifeCLEF 2025: Challenges on Species Presence Prediction and Identification, and Individual Animal Identification
- Authors: Picek, L., Kahl, S., Goeau, H., Adam, L., Ing. Klára Janoušková, prof. Ing. Jiří Matas, Ph.D., Mgr. Vojtěch Čermák, Ph.D.
- Publication: Proceedings of the CLEF 2025: Experimental IR Meets Multilinguality, Multimodality, and Interaction. Cham: Springer International Publishing, 2025. p. 338-362. LNCS. vol. 16089. ISSN 1611-3349. ISBN 978-3-032-04354-2.
- Year: 2025
- DOI: 10.1007/978-3-032-04354-2_19
- Link: https://doi.org/10.1007/978-3-032-04354-2_19
- Department: Visual Recognition Group
Abstract:
Biodiversity monitoring using AI-powered tools has become vital for tracking species distributions and assessing ecosystem health on a large scale. Automated image- and sound-based species recognition, in particular, continues to accelerate conservation efforts by enabling rapid, low-cost surveys of vulnerable populations. However, the ever-growing variety of algorithms and data sources underscores the need for standardized benchmarks to assess real-world performance. Since 2011, the LifeCLEF lab has filled this role by organizing annual evaluations that promote collaboration among AI experts, citizen science, and ecologists. In this overview, we report on the LifeCLEF 2025 edition, which featured five distinct, data-driven tasks: (i) AnimalCLEF, focusing on open-set individual animal re-identification; (ii) BirdCLEF+, covering species recognition in complex acoustic soundscape recordings; (iii) FungiCLEF, addressing few-shot classification of rare fungi species; (iv) GeoLifeCLEF, combining environmental and high-resolution remote sensing with occurrence records to predict plant species presence; and (v) PlantCLEF, aiming to identify multiple co-occurring plant species in vegetation-plot imagery. This paper provides an overview of the motivation, methodology, and main outcomes of the five challenges.
SeaTurtleID2022: A Long-span Dataset for Reliable Sea Turtle Re-identification
- Authors: Adam, L., Mgr. Vojtěch Čermák, Ph.D., Papafitsoros, K., Picek, L.
- Publication: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE, 2024. p. 7131-7141. ISSN 2642-9381. ISBN 979-8-3503-1892-0.
- Year: 2024
- DOI: 10.1109/WACV57701.2024.00699
- Link: https://doi.org/10.1109/WACV57701.2024.00699
- Department: Visual Recognition Group
Abstract:
This paper introduces SeaTurtleID2022, the first public large-scale, long-span dataset of sea turtle photographs captured in the wild. The dataset contains 8729 photographs of 438 unique individuals collected over 13 years, making it the longest-spanning dataset for animal re-identification. Each photograph includes various annotations, e.g., identity, encounter timestamp, and body part segmentation masks. Instead of a standard "random" split, the dataset allows for two realistic and ecologically motivated splits: (i) time-aware: a closed-set split with training, validation, and test data from different days/years, and (ii) open-set: with new, unknown individuals in the validation and test sets. We show that time-aware splits are essential for benchmarking re-identification methods, as random splits lead to performance overestimation. Furthermore, baseline instance segmentation and re-identification performance over various body parts is provided. Finally, an end-to-end system for sea turtle re-identification is proposed and evaluated. The proposed system, based on Hybrid Task Cascade for head instance segmentation and an ArcFace-trained feature extractor, achieved an accuracy of 86.8%.
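The time-aware split described in the abstract partitions records by encounter date rather than at random, so that no training image shares a day (or year) with a test image. A minimal sketch of the idea; the cut-off dates and record fields below are illustrative, not the dataset's actual schema:

```python
from datetime import date

def time_aware_split(observations, train_before, test_from):
    """Split photo records by encounter date so that training,
    validation, and test images come from disjoint time periods,
    avoiding the leakage a random split would introduce."""
    train, valid, test = [], [], []
    for obs in observations:
        if obs["date"] < train_before:
            train.append(obs)
        elif obs["date"] >= test_from:
            test.append(obs)
        else:
            valid.append(obs)
    return train, valid, test

# Hypothetical records: same individual photographed years apart.
records = [
    {"id": "turtle-1", "date": date(2012, 6, 1)},
    {"id": "turtle-1", "date": date(2018, 7, 3)},
    {"id": "turtle-2", "date": date(2021, 5, 9)},
]
train, valid, test = time_aware_split(records, date(2016, 1, 1), date(2020, 1, 1))
```

Because the same individual's appearance and the camera conditions drift over years, evaluating on a later period is what makes the benchmark reflect real deployment rather than memorised backgrounds.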
WildlifeDatasets: An Open-source Toolkit for Animal Re-identification
- Authors: Mgr. Vojtěch Čermák, Ph.D., Picek, L., Adam, L., Papafitsoros, K.
- Publication: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE, 2024. p. 5941-5951. ISSN 2642-9381. ISBN 979-8-3503-1892-0.
- Year: 2024
- DOI: 10.1109/WACV57701.2024.00585
- Link: https://doi.org/10.1109/WACV57701.2024.00585
- Department: Visual Recognition Group
Abstract:
In this paper, we present WildlifeDatasets - an open-source toolkit intended primarily for ecologists and computer-vision / machine-learning researchers. WildlifeDatasets is written in Python, allows straightforward access to publicly available wildlife datasets, and provides a wide variety of methods for dataset pre-processing, performance analysis, and model fine-tuning. We showcase the toolkit in various scenarios and baseline experiments, including, to the best of our knowledge, the most comprehensive experimental comparison of datasets and methods for wildlife re-identification, covering both local descriptors and deep learning approaches. Furthermore, we provide the first-ever foundation model for individual re-identification within a wide range of species - MegaDescriptor - that provides state-of-the-art performance on animal re-identification datasets and outperforms other pre-trained models such as CLIP and DINOv2 by a significant margin. To make the model available to the general public and to allow easy integration with any existing wildlife monitoring applications, we provide multiple MegaDescriptor flavors (i.e., Small, Medium, and Large) through the HuggingFace hub.
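Re-identification with a deep descriptor such as MegaDescriptor typically reduces to nearest-neighbour search over embeddings: a query image's feature vector is matched against a gallery of known individuals. A toy NumPy sketch of that matching step, in which the 2-D vectors and labels stand in for real model features and is not the toolkit's actual API:

```python
import numpy as np

def identify(query, gallery, labels):
    """Nearest-neighbour re-identification: return the gallery label
    whose embedding has the highest cosine similarity to the query."""
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                      # cosine similarities to each gallery row
    best = int(sims.argmax())
    return labels[best], float(sims[best])

# Hypothetical unit embeddings for two known individuals.
gallery = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = ["zebra-07", "zebra-12"]
who, score = identify(np.array([0.9, 0.1]), gallery, labels)
```

In a real pipeline the gallery would hold one or more descriptor vectors per known individual, and a similarity threshold would decide whether a low-scoring query is a new, unseen animal.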
Adversarial Examples by Perturbing High-level Features in Intermediate Decoder Layers
- Authors: Mgr. Vojtěch Čermák, Ph.D., Adam, L.
- Publication: ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2. Porto: SciTePress - Science and Technology Publications, 2022. p. 496-507. ISSN 2184-433X. ISBN 978-989-758-547-0.
- Year: 2022
- DOI: 10.5220/0010892800003116
- Link: https://doi.org/10.5220/0010892800003116
- Department: Department of Computer Science, Artificial Intelligence Center
Abstract:
We propose a novel method for creating adversarial examples. Instead of perturbing pixels, we use an encoder-decoder representation of the input image and perturb intermediate layers in the decoder. This changes the high-level features provided by the generative model. Therefore, our perturbation possesses semantic meaning, such as a longer beak or green tints. We formulate this task as an optimization problem by minimizing the Wasserstein distance between the adversarial and initial images under a misclassification constraint. We employ the projected gradient method with a simple inexact projection. Due to the projection, all iterations are feasible, and our method always generates adversarial images. We perform numerical experiments by fooling MNIST and ImageNet classifiers in both targeted and untargeted settings. We demonstrate that our adversarial images are much less vulnerable to steganographic defence techniques than pixel-based attacks. Moreover, we show that our method modifies key features such as edges and that defence techniques based on adversarial training are vulnerable to our attacks.
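The optimization structure, minimising distance to the original input subject to a misclassification constraint while a projection keeps every iterate feasible, can be illustrated on a deliberately simplified problem. Here the "classifier" is linear, so the constraint set is a half-space and the projection is exact and closed-form (unlike the paper's inexact projection), and we perturb the input directly rather than intermediate decoder layers:

```python
import numpy as np

def projected_gradient_attack(x0, w, b, margin=0.1, lr=0.1, steps=200):
    """Toy projected-gradient scheme: minimise ||x - x0||^2 subject to
    the misclassification constraint w @ x + b >= margin. Every iterate
    is projected back into the feasible set, so it stays adversarial."""
    def project(x):
        # Exact Euclidean projection onto the half-space {x : w@x + b >= margin}.
        gap = margin - (w @ x + b)
        if gap > 0:
            x = x + gap * w / (w @ w)
        return x

    x = project(x0.copy())            # start feasible: already misclassified
    for _ in range(steps):
        x = x - lr * 2 * (x - x0)     # gradient step on the distance term
        x = project(x)                # re-enter the misclassified region
    return x

x0 = np.array([-1.0, 0.0])            # input correctly classified (w@x0 + b < 0)
w, b = np.array([1.0, 0.0]), 0.0
x_adv = projected_gradient_attack(x0, w, b)
```

The feasibility-at-every-step property mirrors the paper's claim that the method always outputs an adversarial image; in the paper, the distance term is a Wasserstein distance and the variables live in intermediate decoder layers rather than pixel space.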