National Repository of Grey Literature
Convolutional Networks for Lip Reading
Kadleček, Josef ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This thesis deals with current methods for automatic speech recognition and lip reading using neural networks. It also examines similarities between neural network architectures for audio and visual data, and surveys the datasets available in the field of audiovisual automatic speech recognition. The main contribution of the thesis is a set of experiments comparing different changes to the neural network architecture and their impact on the results. The thesis includes an implementation of a system for automatic speech recognition from audio (CER: 12.6 %) and visual (CER: 57.7 %) data. The architectures of both systems are based on feature extraction via convolutional networks, followed by recurrent LSTM layers, another layer of convolutions, and the CTC loss function.
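The CER values quoted in the abstract above are character error rates. The following is a minimal sketch of how such a figure is commonly computed, assuming the standard definition (Levenshtein edit distance between the reference and the recognized text, divided by the reference length). The function names `edit_distance` and `cer` are illustrative and are not taken from the thesis, whose exact evaluation procedure is not described in this record.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum number of character insertions,
    deletions, and substitutions needed to turn ref into hyp."""
    # prev[j] holds the distance between the processed prefix of ref
    # and the first j characters of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate, assuming the usual definition:
    edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One missing character out of five reference characters.
    print(f"CER: {100 * cer('hello', 'helo'):.1f} %")
```

Under this definition a CER of 12.6 % corresponds to roughly one character error per eight reference characters. Note that CER can exceed 100 % when the hypothesis contains many insertions.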
Topic Detection from Spoken Speech
Škeřík, Zdeněk ; Szőke, Igor (referee) ; Schwarz, Petr (advisor)
This thesis addresses topic detection in spoken speech. The first part deals with transcribing speech to text. The thesis then describes two approaches to topic detection: a machine-learning-based solution and an expert solution that composes a very precise query describing the document topic. Both methods are tested on a set of recordings and compared.
Dictation System for the Android Platform
Horák, Miroslav ; Pešán, Jan (referee) ; Schwarz, Petr (advisor)
The aim of this bachelor's thesis is to create a distributed dictation system. Dictation is performed in real time. The client part is intended for the Android platform, and the server part for Windows OS. An existing transcription core is used for the speech transcription.
