About the Group
In general, we work on the analysis and processing of speech signals, with a focus on speech recognition systems and on speech enhancement for communication purposes. Our current activities concentrate mainly on the following sub-tasks:
- large-vocabulary continuous speech recognition, with a narrower focus on degraded speech from noisy environments and on spontaneous speech
- feature extraction with a focus on robustness, including the incorporation of speech production information into the feature vector (articulatory features)
- optimization of acoustic modeling in HMM-based systems (discriminative and adaptive techniques, combination of ANN and HMM)
- language modeling for spontaneous speech (dictionaries with reduced pronunciations, class-based LMs)
- automatic phonetic segmentation
- collection and preparation of speech and text data
- detection of speech activity
Our Purpose
The recognition tasks above are used in systems for the on-line transcription of speech into text. Typical examples include dictation apps on PCs, subtitling for online videos, on-line or off-line transcription of audio recordings with optional indexing for archiving, voice-controlled telephone information systems (either simply replacing unavailable tone-dialing options or using natural dialogue for communication), voice control systems for various devices (especially in cars), systems for phonetic segmentation in support of basic phonetic research, and special techniques for speech analysis (e.g. of pathological speech).
Speech enhancement can be applied to any communication in a noisy environment, where removing noise from the transmission significantly increases the intelligibility of the utterance, and also to reducing the perceptible degradation of noise-corrupted speech (such as speech in a moving car, in public spaces, in reverberant rooms, or captured by distant microphones). Speech detection is an integral part of many recognition and enhancement systems (for example, detecting the start and end of an utterance in command recognition, or estimating the background characteristics during enhancement). The collection and subsequent processing of speech and text data is necessary for training recognition systems, which are based on statistical models or on the principles of artificial intelligence.
Our Funding
In recent years, our research has been supported by the following grants:
- the Czech Science Foundation (1996-2011) and AV ČR (2004-2007),
- COST (1994-2005),
- research plan (2005-2011),
- FRVŠ (2010, 2011),
- internal grant of the Czech Technical University (2012-2013).
We participated in the European projects SpeechDat-E (1999-2000), SPEECON (2002-2003), and LC-StarII (2006-2007). We have worked with the following companies and institutions in bilateral projects: Siemens AG, Munich, Germany (1999, 2006); Škoda Mladá Boleslav (2002-2003); TEMIC-Harman/Becker, Ulm, Germany (2000-2004); and Radboud University Nijmegen, Netherlands (2008-2009).
Our research is currently supported by the internal CTU grant SGS14/191/OHK3/3T/13 (2014-2016); another project is under review in the grant competition of the Czech Science Foundation.
We are also currently partnering with Zoom International to help solve some of their problems.