People
Ing. Bc. Radim Špetlík
All publications
EEPPR: Event-based Estimation of Periodic Phenomena Rate using Correlation in 3D
- Authors: Kolář, J., Ing. Bc. Radim Špetlík, prof. Ing. Jiří Matas, Ph.D.
- Publication: Proceedings of SPIE - The International Society for Optical Engineering. Bellingham (Washington): SPIE, 2025. p. 1-7. 1. vol. 13517. ISSN 1996-756X. ISBN 9781510688285.
- Year: 2025
- DOI: 10.1117/12.3055033
- Link: https://doi.org/10.1117/12.3055033
- Affiliation: Visual Recognition Group
Abstract:
We present a novel method for measuring the rate of periodic phenomena (e.g., rotation, flicker, and vibration) with an event camera, a device that asynchronously reports brightness changes at independently operating pixels with high temporal resolution. The approach assumes that, for a periodic phenomenon, a highly similar set of events is generated within a spatio-temporal window at a time difference corresponding to its period. The sets of similar events are detected by correlation in the spatio-temporal event-stream space. The proposed method, EEPPR, is evaluated on a dataset of 12 sequences of periodic phenomena, i.e., flashing light and vibration, and periodic motion, e.g., rotation, ranging from 3.2 Hz to 2 kHz (equivalent to 192–120,000 RPM). EEPPR significantly outperforms published methods on this dataset, achieving a mean relative error of 0.1% and setting a new state of the art. The dataset and code are publicly available on GitHub.
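The period-from-correlation idea can be illustrated with a simplified one-dimensional sketch: instead of correlating full spatio-temporal windows as EEPPR does, the snippet below bins a synthetic event stream in time and reads the period off the dominant autocorrelation lag. All names, parameters, and the synthetic data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def estimate_rate(event_times, bin_ms=1.0, min_lag_bins=5):
    """Estimate the rate (Hz) of a periodic event stream by autocorrelating
    binned event counts and picking the dominant non-zero lag. This is a 1D
    simplification of EEPPR, which correlates spatio-temporal windows."""
    t = np.asarray(event_times, dtype=float)
    edges = np.arange(t.min(), t.max() + bin_ms / 1e3, bin_ms / 1e3)
    counts, _ = np.histogram(t, bins=edges)
    counts = counts - counts.mean()
    ac = np.correlate(counts, counts, mode="full")[counts.size - 1:]
    lag = min_lag_bins + np.argmax(ac[min_lag_bins:])   # skip the lag-0 peak
    return 1.0 / (lag * bin_ms / 1e3)

# Synthetic stream: a burst of 20 events every 10 ms (100 Hz), with jitter.
rng = np.random.default_rng(0)
times = np.sort(np.concatenate(
    [k * 0.010 + rng.normal(0, 2e-4, 20) for k in range(100)]))
rate = estimate_rate(times)
print(round(rate, 1))  # close to 100.0 Hz
```

The `min_lag_bins` guard only discards the trivial lag-0 peak; a real implementation would also bound the search to a plausible frequency range.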
Efficient Real-Time Quadcopter Propeller Detection and Attribute Estimation with High-Resolution Event Camera
- Authors: Ing. Bc. Radim Špetlík, Uhrová, T., prof. Ing. Jiří Matas, Ph.D.
- Publication: Image Analysis. Springer, Cham, 2025. p. 217-230. 1. vol. 15725 LNCS. ISSN 1611-3349. ISBN 978-3-031-95911-0.
- Year: 2025
- DOI: 10.1007/978-3-031-95911-0_16
- Link: https://doi.org/10.1007/978-3-031-95911-0_16
- Affiliation: Visual Recognition Group
Abstract:
In this paper, we present a computationally efficient method for real-time detection and state estimation of quadcopter propellers in high-resolution event-camera streams. We model local event arrivals as Poisson processes and exploit the memoryless nature of inter-arrival times to robustly detect periodic bursts from rotating blades, even at high rotational speeds. Unlike approaches that process data in chunks, our method updates the detection metrics for each incoming event. Once a propeller is detected, we first calculate its angular speed and then fit an ellipse to the aggregated propeller events to estimate pitch and roll. We introduce a new dataset (speeds 1100–8200 RPM; tilt angles 0°, 10°, and 90°) and achieve near-perfect detection accuracy at an average real-time factor of 0.94 on a single CPU core, demonstrating the suitability of the approach for onboard deployment.
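The Poisson-based detection principle can be sketched as follows: for a homogeneous Poisson background, inter-arrival times are exponentially distributed (coefficient of variation 1), while periodic blade bursts make the inter-arrival statistics strongly non-exponential. The per-event update below, using Welford running statistics, is a hypothetical stand-in for the paper's actual metric; the synthetic streams are invented for illustration.

```python
import numpy as np

class PeriodicBurstDetector:
    """Per-event detector: tracks the running mean/variance of inter-arrival
    times (Welford's algorithm). For a homogeneous Poisson process the
    coefficient of variation (CV) is ~1; periodic bursts push it well above 1.
    An illustrative stand-in, not the paper's exact detection metric."""
    def __init__(self):
        self.last_t = None
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def update(self, t):
        if self.last_t is not None:
            dt = t - self.last_t
            self.n += 1
            d = dt - self.mean
            self.mean += d / self.n
            self.m2 += d * (dt - self.mean)
        self.last_t = t

    def cv(self):
        if self.n < 2 or self.mean == 0.0:
            return 1.0
        return (self.m2 / (self.n - 1)) ** 0.5 / self.mean

rng = np.random.default_rng(1)
poisson = np.cumsum(rng.exponential(1e-4, 5000))          # background noise
bursts = np.sort(np.concatenate(                          # blade passes, 100 Hz
    [k * 0.01 + rng.uniform(0, 3e-4, 50) for k in range(100)]))

cvs = {}
for name, stream in [("poisson", poisson), ("bursts", bursts)]:
    det = PeriodicBurstDetector()
    for t in stream:
        det.update(t)
    cvs[name] = det.cv()
print({k: round(v, 2) for k, v in cvs.items()})
```

Because the statistics are updated once per event, the detector needs no chunking or buffering, matching the event-by-event processing style described in the abstract.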
Sex Classification from Human Scent Using Image Interpretation of 2D Gas Chromatography-Mass Spectrometry Data
- Authors: Hlavsa, J., Ing. Bc. Radim Špetlík, Čechová, J., Pojmanová, P., prof. Ing. Jiří Matas, Ph.D.
- Publication: Image Analysis. Springer, Cham, 2025. p. 457-470. 1. vol. 15725 LNCS. ISSN 1611-3349. ISBN 978-3-031-95911-0.
- Year: 2025
- DOI: 10.1007/978-3-031-95911-0_32
- Link: https://doi.org/10.1007/978-3-031-95911-0_32
- Affiliation: Visual Recognition Group
Abstract:
Two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC ToF-MS) provides detailed chemical profiles of complex mixtures, making it useful in areas such as environmental monitoring and medical diagnostics. A promising application is sex classification from human scent, where subtle chemical differences indicate biological sex. In this paper, we propose a pattern recognition approach to sex classification that interprets raw GC×GC ToF-MS data as images, moving beyond traditional compound-based analysis. Our approach employs convolutional neural networks (CNNs) to analyze these images, and we compare its performance against established techniques – linear SVM, Ridge regression, and QDA – demonstrating robust and competitive results. Furthermore, we introduce and release a new dataset of GC×GC ToF-MS measurements to support reproducibility in future studies. Using an identity-aware cross-validation strategy, where test subjects are completely unseen during training, our method achieves approximately 88% accuracy on 504 measurements from 40 individuals.
Single-Image Localised Reflection Removal with k-Order Differences Term
- Authors: Ing. Bc. Radim Špetlík, prof. Ing. Jiří Matas, Ph.D.
- Publication: Image Analysis. Springer, Cham, 2025. p. 106-119. 1. vol. 15726 LNCS. ISSN 1611-3349. ISBN 978-3-031-95918-9.
- Year: 2025
- DOI: 10.1007/978-3-031-95918-9_8
- Link: https://doi.org/10.1007/978-3-031-95918-9_8
- Affiliation: Visual Recognition Group
Abstract:
We introduce the problem of localized glass-reflection removal (LGRR), which targets the removal of highlights caused by light reflections on glass surfaces. Our approach uses an end-to-end convolutional neural network trained on MS-COCO image crops with synthetic reflections generated from real light source images. We propose a cost function that includes image difference terms and show that it improves reflection removal and inpainting compared to standard L1 and L2 losses. Experimental results demonstrate that our method effectively reduces localized reflections.
StructuReiser: A Structure-preserving Video Stylization Method
- Authors: Ing. Bc. Radim Špetlík, Futschik, D., prof. Ing. Daniel Sýkora, Ph.D.
- Publication: COMPUTER GRAPHICS FORUM. 2025, 44(4), ISSN 0167-7055.
- Year: 2025
- DOI: 10.1111/cgf.70161
- Link: https://doi.org/10.1111/cgf.70161
- Affiliation: Department of Computer Graphics and Interaction, Visual Recognition Group
Abstract:
We introduce StructuReiser, a novel video-to-video translation method that transforms input videos into stylized sequences using a set of user-provided keyframes. Unlike most existing methods, StructuReiser strictly adheres to the structural elements of the target video, preserving the original identity while seamlessly applying the desired stylistic transformations. This provides a level of control and consistency that is challenging to achieve with previous text-driven or keyframe-based approaches, including large video models. Furthermore, StructuReiser supports real-time inference on standard graphics hardware as well as custom keyframe editing, enabling interactive applications and expanding possibilities for creative expression and video manipulation.
Single-Image Deblurring, Trajectory and Shape Recovery of Fast Moving Objects with Denoising Diffusion Probabilistic Models
- Authors: Ing. Bc. Radim Špetlík, Ing. Denis Rozumný, Dr.sc.ETH, prof. Ing. Jiří Matas, Ph.D.
- Publication: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE, 2024. p. 6843-6852. ISSN 2642-9381. ISBN 979-8-3503-1892-0.
- Year: 2024
- DOI: 10.1109/WACV57701.2024.00671
- Link: https://doi.org/10.1109/WACV57701.2024.00671
- Affiliation: Visual Recognition Group
Abstract:
The blurry appearance of fast-moving objects in video frames has been successfully used to reconstruct object appearance and motion in both 2D and 3D domains. The proposed method addresses the novel, severely ill-posed task of single-image fast-moving-object deblurring, shape, and trajectory recovery; previous approaches require at least three consecutive video frames. Given a single image, the method outputs the object's 2D appearance and position in a series of sub-frames as if captured by a high-speed camera (i.e., temporal super-resolution). The proposed SI-DDPM-FMO method is trained end-to-end on a synthetic dataset with various moving objects, yet it generalizes well to real-world data from several publicly available datasets. SI-DDPM-FMO performs similarly to or better than recent multi-frame methods and a carefully designed baseline method.
Iris Verification with Convolutional Neural Network and Unit-Circle Layer
- Authors: Ing. Bc. Radim Špetlík, Razumenić, I.
- Publication: DAGM GCPR 2019. Cham: Springer, 2019. p. 274-287. LNCS. vol. 11824. ISSN 0302-9743. ISBN 978-3-030-33675-2.
- Year: 2019
- DOI: 10.1007/978-3-030-33676-9_19
- Link: https://doi.org/10.1007/978-3-030-33676-9_19
- Affiliation: Department of Cybernetics, Visual Recognition Group
Abstract:
We propose a novel convolutional neural network to verify a match between two normalized images of the human iris. The network is trained end-to-end and validated on three publicly available datasets, yielding state-of-the-art results against four baseline methods. The network outperforms the state-of-the-art method by a 10% margin on the CASIA.v4 dataset. In the network, we use a novel “Unit-Circle” layer which replaces the Gabor-filtering step of a common iris-verification pipeline. We show that the layer improves the performance of the model by up to 15% on previously unseen data.
Non-contact reflectance photoplethysmography: Progress, limitations, and myths
- Authors: Ing. Bc. Radim Špetlík, Ing. Jan Čech, Ph.D., prof. Ing. Jiří Matas, Ph.D.
- Publication: FG 2018: Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition. Piscataway: IEEE, 2018. p. 702-709. ISSN 2326-5396. ISBN 978-1-5386-2335-0.
- Year: 2018
- DOI: 10.1109/FG.2018.00111
- Link: https://doi.org/10.1109/FG.2018.00111
- Affiliation: Visual Recognition Group
Abstract:
Photoplethysmography (PPG) is a non-invasive method of measuring changes of blood volume in human tissue. The literature on non-contact reflectance PPG related to cardiovascular activity is extensively reviewed. We identify the key factors limiting the performance of PPG methods and the reproducibility of the research: a lack of publicly available datasets, incomplete descriptions of the data used in published experiments (missing details on video compression, lighting setup, and subjects' skin type), the use of unreliable pulse oximeters for ground-truth reference, and the absence of standard experimental protocols. Two experiments with 5 participants are presented, showing that the quality of the reconstructed signal (1) is adversely affected by a reduction of spatial resolution, which also amplifies the effects of H.264 video compression, and (2) is improved by precise pixel-to-pixel stabilization.
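The finding that reduced spatial resolution degrades the reconstructed signal is consistent with the standard reflectance-PPG front-end of spatially averaging a skin region per frame: averaging fewer pixels suppresses less sensor noise. The toy simulation below makes this concrete; all parameters (frame rate, pulse frequency, noise level, the spectral-peak SNR measure) are invented for illustration and are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
fps, seconds, hr_hz = 30, 10, 1.2            # 1.2 Hz pulse = 72 bpm
t = np.arange(fps * seconds) / fps
pulse = 0.5 * np.sin(2 * np.pi * hr_hz * t)  # tiny brightness modulation

def pulse_snr(n_pixels):
    """Spatially average n_pixels noisy 'skin' pixels per frame, then
    measure the spectral peak at the pulse frequency against the
    median of the rest of the spectrum."""
    noise = rng.normal(0, 5.0, (t.size, n_pixels))   # per-pixel sensor noise
    sig = (pulse[:, None] + noise).mean(axis=1)      # per-frame ROI average
    spec = np.abs(np.fft.rfft(sig - sig.mean()))
    freqs = np.fft.rfftfreq(t.size, 1 / fps)
    peak = spec[np.argmin(np.abs(freqs - hr_hz))]
    return peak / np.median(spec)

snr_hi, snr_lo = pulse_snr(10_000), pulse_snr(16)    # large vs. tiny ROI
print(snr_hi > snr_lo)  # averaging more pixels yields a cleaner pulse
```

Averaging N independent pixels reduces the per-frame noise standard deviation by a factor of √N, which is why shrinking the region of interest (or downsampling the video) quickly buries the sub-percent pulse modulation in noise.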
Visual Heart Rate Estimation with Convolutional Neural Network
- Authors: Ing. Bc. Radim Špetlík, Ing. Vojtěch Franc, Ph.D., Ing. Jan Čech, Ph.D., prof. Ing. Jiří Matas, Ph.D.
- Publication: BMVC2018: Proceedings of the British Machine Vision Conference. London: British Machine Vision Association, 2018.
- Year: 2018
- Affiliation: Visual Recognition Group, Machine Learning
Abstract:
We propose a novel two-step convolutional neural network to estimate heart rate from a sequence of facial images. The network is trained end-to-end by alternating optimization and validated on three publicly available datasets, yielding state-of-the-art results against three baseline methods. The network outperforms the state-of-the-art method by a 40% margin on a newly collected dataset. A challenging dataset of 204 fitness-themed videos is introduced. The dataset is designed to test the robustness of heart rate estimation methods to illumination changes and subject motion. 17 subjects perform 4 activities (talking, rowing, exercising on a stationary bike and on an elliptical trainer) in 3 lighting setups. Each activity is captured by two RGB web cameras: one placed on a tripod, the other attached to the fitness machine, which vibrates significantly. Subjects' ages range from 20 to 53 years; the mean heart rate is ≈110 bpm with a standard deviation of ≈25 bpm.
Visual Language Identification from Facial Landmarks
- Authors: Ing. Bc. Radim Špetlík, Ing. Jan Čech, Ph.D., Ing. Vojtěch Franc, Ph.D., prof. Ing. Jiří Matas, Ph.D.
- Publication: Image Analysis, Part II. Springer, Cham, 2017. p. 389-400. Lecture Notes in Computer Science. vol. 10270. ISSN 0302-9743. ISBN 978-3-319-59128-5.
- Year: 2017
- DOI: 10.1007/978-3-319-59129-2_33
- Link: https://doi.org/10.1007/978-3-319-59129-2_33
- Affiliation: Department of Cybernetics, Visual Recognition Group, Machine Learning
Abstract:
Automatic Visual Language IDentification (VLID), i.e., the problem of identifying the language being spoken from visual information alone, using no audio, is studied. The proposed method employs facial landmarks automatically detected in a video. A convex optimisation problem is formulated to find jointly both the discriminative representation (a soft histogram over a set of lip shapes) and the classifier. A 10-fold cross-validation is performed on a dataset consisting of 644 videos collected from youtube.com, resulting in an accuracy of 73% in pairwise discrimination between English and French (chance level 50%). A study in which 10 videos were used suggests that the proposed method performs better than the average human in discriminating between the languages.