National Repository of Grey Literature: 18 records found
End-to-End Speech Recognition for Low-Resource Languages
Sokolovskii, Vladislav ; Schwarz, Petr (referee) ; Karafiát, Martin (advisor)
The field of automatic speech recognition has begun to adopt end-to-end neural network solutions for building speech recognizers. However, the data-hungry nature of such systems allows recognizers to be built only for high-resource languages such as English, Chinese, or Spanish. In low-resource scenarios, solutions must be developed to mitigate the lack of data. One of the most effective techniques is fine-tuning a pretrained model. The problem with existing fine-tuning approaches is that the token sets of the target and source languages usually differ. This is why previous approaches to cross-lingual transfer learning required replacing the output layer, mixing tokens from different languages in the output layer, using a universal token set, or using a separate output layer for each language. This is undesirable because sharing across languages is then latent and uncontrollable in the output space when the language-specific graphemes are disjoint. This thesis therefore proposes mapping tokens into a common set before pretraining begins. The existing solution is transliteration of the source language into the target language; the new approach is romanization, in which the token set of the target language is romanized to match the English alphabet. Diacritics can subsequently be restored from the romanized hypotheses using a separate restoration model. This has the advantage of increasing sharing in the output grapheme space.
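As an illustration of the romanization step mentioned in the abstract above, the following is a minimal sketch that maps accented Latin-script graphemes to plain ASCII letters by stripping combining marks. The function name `romanize` is hypothetical, and this is an assumption about one simple way to romanize text, not the method used in the thesis. Letters without a canonical decomposition (for example "ł") and non-Latin scripts would need a real transliteration table.

```python
import unicodedata


def romanize(text: str) -> str:
    """Strip diacritics so graphemes fall into the plain Latin alphabet.

    Hypothetical helper: decomposes each character (NFD) and drops the
    combining marks, e.g. "č" becomes "c".
    """
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters and a
    # non-zero combining class for accents such as the caron or acute.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(romanize("rozpoznávání řeči"))
```

Restoring the diacritics afterwards is the harder direction, which is why the abstract assigns it to a separate restoration model.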
Domain Specific Data Crawling for Language Model Adaptation
Gregušová, Sabína ; Švec, Ján (referee) ; Karafiát, Martin (advisor)
The goal of this thesis is to implement a system for automatic language model adaptation for the Phonexia ASR system. The system expects input in the form of source text, which is analysed to select appropriate terms for web search. Every web search returns a set of documents that undergo cleaning and filtering. The resulting web corpus is mixed with the Phonexia model and evaluated. To estimate the best parameters, I conducted three sets of experiments for Hindi, Czech and Mandarin. The results were very favourable: the implemented system decreased perplexity and Word Error Rate in most cases.
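The abstract above does not say how the web corpus is mixed with the existing model. A common choice is linear interpolation of the two language models, evaluated by perplexity on held-out text. The sketch below assumes that choice and uses toy unigram models; the names `interpolate`, `perplexity`, `lam` and `floor` are hypothetical and not taken from the thesis or from Phonexia software.

```python
import math


def interpolate(p_base: dict, p_web: dict, lam: float) -> dict:
    """Linearly interpolate two unigram models: (1 - lam) * base + lam * web."""
    vocab = set(p_base) | set(p_web)
    return {
        w: (1.0 - lam) * p_base.get(w, 0.0) + lam * p_web.get(w, 0.0)
        for w in vocab
    }


def perplexity(model: dict, tokens: list, floor: float = 1e-9) -> float:
    """Perplexity of a unigram model on a token list.

    Unseen tokens get a small floor probability (an assumption made here
    to avoid log(0); real systems use proper smoothing).
    """
    log_sum = sum(math.log(max(model.get(t, 0.0), floor)) for t in tokens)
    return math.exp(-log_sum / len(tokens))


base = {"a": 0.5, "b": 0.5}
web = {"a": 1.0}
mixed = interpolate(base, web, lam=0.5)
print(perplexity(base, ["a", "a", "b"]), perplexity(mixed, ["a", "a", "b"]))
```

In practice the interpolation weight would be tuned on held-out in-domain data, which is the kind of parameter estimation the experiments in the abstract refer to.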
Impact of Environment Acoustics on Speech Recognition Accuracy
Paliesek, Jakub ; Karafiát, Martin (referee) ; Szőke, Igor (advisor)
This diploma thesis deals with the impact of room acoustics on automatic speech recognition (ASR) accuracy. Experiments were evaluated on the LibriSpeech speech corpus and ReverbDB, a database of impulse responses and noise. The ASR systems used were based on the Mini LibriSpeech recipe for Kaldi. First, the thesis examined how well an ASR system can learn to transcribe in selected environments when the same acoustic conditions are used during training and testing. Next, experiments were carried out with modifications of the ASR architecture to achieve better robustness against new conditions by using methods for adaptation to room acoustics: r-vectors and i-vectors. The recently proposed r-vector method was shown to be beneficial even when real impulse responses are used for data augmentation.
Low-Dimensional Matrix Factorization in End-To-End Speech Recognition Systems
Gajdár, Matúš ; Grézl, František (referee) ; Karafiát, Martin (advisor)
The project covers automatic speech recognition with neural network training using low-dimensional matrix factorization. We describe time delay neural networks with factorization (TDNN-F) and without it (TDNN), implemented in PyTorch. We compare the PyTorch implementation with the Kaldi toolkit and achieve similar results in experiments with various network architectures. The last chapter describes the impact of low-dimensional matrix factorization on end-to-end speech recognition systems, as well as a modification of the system with TDNN(-F) networks. With specific network settings, we achieved better results with systems using factorization. Additionally, we reduced the complexity of training by decreasing the number of network parameters through the use of TDNN(-F) networks.
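The parameter reduction mentioned in the abstract above follows from replacing one weight matrix of shape rows x cols with the product of two smaller matrices of shapes rows x rank and rank x cols. The sketch below only counts parameters; the function names and the example sizes (1024, 128) are illustrative assumptions, not values from the thesis.

```python
def dense_params(rows: int, cols: int) -> int:
    """Parameters in a single full weight matrix (biases ignored)."""
    return rows * cols


def factored_params(rows: int, cols: int, rank: int) -> int:
    """Parameters in a rank-limited factorization (rows x rank)(rank x cols)."""
    return rank * (rows + cols)


def compression_ratio(rows: int, cols: int, rank: int) -> float:
    """How many times fewer parameters the factored layer has."""
    return dense_params(rows, cols) / factored_params(rows, cols, rank)


# Illustrative sizes only: a 1024 x 1024 layer with a 128-dimensional bottleneck.
print(dense_params(1024, 1024), factored_params(1024, 1024, 128))
print(compression_ratio(1024, 1024, 128))
```

The factorization saves parameters only when rank is smaller than rows * cols / (rows + cols), which for a square layer means less than half the layer width.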
Grammar Based Automatic Speech Recognizer
Škorvaga, Vojtěch ; Karafiát, Martin (referee) ; Schwarz, Petr (advisor)
This work describes the development of a system for compiling recognition networks for speech recognition from grammars in the Speech Recognition Grammar Specification (SRGS) format defined by the W3C. Together with the new module, the recognizer was integrated into the FreeSwitch software telephone switch using a combination of the MRCPv2, SIP and RTP network protocols, and tested.
New Techniques in Neural Networks Training - Connectionist Temporal Classification
Gajdár, Matúš ; Švec, Ján (referee) ; Karafiát, Martin (advisor)
This bachelor's thesis deals with neural networks and their use in speech recognition. First, it presents the theory of speech recognition, followed by the theory of neural networks in connection with the connectionist temporal classification (CTC) method. The next chapter introduces the toolkits used for training neural networks and the experiments carried out with them to determine the impact of the connectionist temporal classification method on the precision of phoneme decoding. The last chapter summarizes the work and gives an overall evaluation of the experiments.
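For readers unfamiliar with connectionist temporal classification, the sketch below shows the standard CTC collapsing rule applied to a frame-level label path: merge consecutive repeats, then drop the blank symbol. The function name `ctc_collapse` and the use of "_" as the blank are assumptions made for illustration; the abstract does not describe the decoding code used in the thesis.

```python
def ctc_collapse(path, blank="_"):
    """Collapse a frame-level CTC path into an output label sequence.

    Consecutive duplicates are merged first, then blanks are removed,
    so a blank between two identical labels keeps both of them.
    """
    output = []
    previous = None
    for symbol in path:
        if symbol != previous and symbol != blank:
            output.append(symbol)
        previous = symbol
    return output


print("".join(ctc_collapse("hh_e_ll_lo")))
```

In greedy decoding, the input path would be the most probable label at each frame of the network output.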
Fast and Accurate Keyword Spotting System
Lenčéš, Marián ; Karafiát, Martin (referee) ; Schwarz, Petr (advisor)
This bachelor's thesis deals with fast and accurate detection of keywords in audio recordings. The aim was to study approaches to word detection and to create several types of language models, which were then compared with each other. The focus is on detecting keywords in English spoken audio recordings.
Activity of Neural Network in Hidden Layers - Visualisation and Analysis
Fábry, Marko ; Grézl, František (referee) ; Karafiát, Martin (advisor)
The goal of this work was to create a system capable of visualising the activation function values produced by neurons in the hidden layers of neural networks used for speech recognition. The work also describes experiments comparing visualisation methods, visualisations of neural networks with different architectures, and neural networks trained on different types of input data. The visualisation system implemented in this work is based on previous work by Khe Chai Sim and is extended with new methods of data normalization. The Kaldi toolkit was used to prepare data for neural network training, and the CNTK framework was used for the training itself. The core of this work, the visualisation system, was implemented in Python.
Recurrent Neural Networks for Speech Recognition
Nováčik, Tomáš ; Karafiát, Martin (referee) ; Veselý, Karel (advisor)
This master's thesis deals with the implementation of various types of recurrent neural networks in the Lua programming language using the Torch library. It focuses on finding an optimal strategy for training recurrent neural networks and also tries to minimize training time. Furthermore, various regularization techniques are investigated and implemented in the recurrent neural network architecture. The implemented recurrent neural networks are compared on a speech recognition task using the AMI dataset, where they model the acoustic information. Their performance is also compared to a standard feedforward neural network. The best results are achieved using the BLSTM architecture. The recurrent neural networks are also trained with the CTC objective function on the TIMIT dataset. The best result is again achieved using the BLSTM architecture.
Set of JavaApplets Demonstrations for Speech Processing
Kudr, Michal ; Karafiát, Martin (referee) ; Černocký, Jan (advisor)
The goal of the thesis is to become familiar with the methods and techniques used in speech processing. Using the knowledge obtained, I propose three Java applets demonstrating selected methods. The thesis also contains a theoretical analysis of the selected problems.
