People

Ing. Bc. Radim Špetlík

All publications

Single-Image Deblurring, Trajectory and Shape Recovery of Fast Moving Objects with Denoising Diffusion Probabilistic Models

Iris Verification with Convolutional Neural Network and Unit-Circle Layer

  • Authors: Ing. Bc. Radim Špetlík, Razumenić, I.
  • Publication: DAGM GCPR 2019. Cham: Springer, 2019. p. 274-287. LNCS. vol. 11824. ISSN 0302-9743. ISBN 978-3-030-33675-2.
  • Year: 2019
  • DOI: 10.1007/978-3-030-33676-9_19
  • Link: https://doi.org/10.1007/978-3-030-33676-9_19
  • Department: Department of Cybernetics, Visual Recognition Group
  • Abstract:
    We propose a novel convolutional neural network to verify a match between two normalized images of the human iris. The network is trained end-to-end and validated on three publicly available datasets, yielding state-of-the-art results against four baseline methods. The network outperforms the state-of-the-art method on the CASIA.v4 dataset by a 10% margin. In the network, we use a novel “Unit-Circle” layer which replaces the Gabor-filtering step in a common iris-verification pipeline. We show that the layer improves the performance of the model by up to 15% on previously unseen data.

Non-contact reflectance photoplethysmography: Progress, limitations, and myths

  • DOI: 10.1109/FG.2018.00111
  • Link: https://doi.org/10.1109/FG.2018.00111
  • Department: Visual Recognition Group
  • Abstract:
    Photoplethysmography (PPG) is a non-invasive method of measuring changes of blood volume in human tissue. The literature on non-contact reflectance PPG related to cardiovascular activity is extensively reviewed. We identify key factors limiting the performance of the PPG methods and reproducibility of the research as: a lack of publicly available datasets and incomplete description of data used in published experiments (missing details on video compression, lighting setup and subject’s skin type), use of unreliable pulse oximeter devices for ground-truth reference and missing standard experimental protocols. Two experiments with 5 participants are presented showing that the quality of the reconstructed signal (1) is adversely affected by a reduction of spatial resolution that also amplifies the effects of H.264 video compression and (2) is improved by precise pixel-to-pixel stabilization.

Visual Heart Rate Estimation with Convolutional Neural Network

  • Department: Visual Recognition Group, Machine Learning
  • Abstract:
    We propose a novel two-step convolutional neural network to estimate a heart rate from a sequence of facial images. The network is trained end-to-end by alternating optimization and validated on three publicly available datasets, yielding state-of-the-art results against three baseline methods. The network outperforms the state-of-the-art method on a newly collected dataset by a 40% margin. A challenging dataset of 204 fitness-themed videos is introduced. The dataset is designed to test the robustness of heart rate estimation methods to illumination changes and subject motion. 17 subjects perform 4 activities (talking, rowing, exercising on a stationary bike and an elliptical trainer) in 3 lighting setups. Each activity is captured by two RGB web cameras: one is placed on a tripod, the other is attached to the fitness machine, which vibrates significantly. The subjects' ages range from 20 to 53 years, the mean heart rate is ≈ 110, the standard deviation ≈ 25.

Visual Language Identification from Facial Landmarks

  • DOI: 10.1007/978-3-319-59129-2_33
  • Link: https://doi.org/10.1007/978-3-319-59129-2_33
  • Department: Department of Cybernetics, Visual Recognition Group, Machine Learning
  • Abstract:
    Automatic Visual Language IDentification (VLID), i.e. the problem of using visual information to identify the language being spoken, using no audio information, is studied. The proposed method employs facial landmarks automatically detected in a video. A convex optimisation problem to jointly find both the discriminative representation (a soft histogram over a set of lip shapes) and the classifier is formulated. A 10-fold cross-validation is performed on a dataset consisting of 644 videos collected from youtube.com, resulting in an accuracy of 73% in pairwise discrimination between English and French (chance level is 50%). A study, in which 10 videos were used, suggests that the proposed method performs better than an average human in discriminating between the languages.

Page maintained by: Ing. Mgr. Radovan Suk