National Repository of Grey Literature
Neural Language Model Acceleration
Labaš, Dominik ; Černocký, Jan (referee) ; Beneš, Karel (advisor)
This work addresses the topic of neural language model acceleration. The aim is to optimize a feed-forward neural network language model. To accelerate the network, we changed the activation function, pre-computed the matrices used for calculating the hidden layer, implemented a cache of the model's histories, and used an unnormalized model. The best-performing model was accelerated by 75.3 %.
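To illustrate two of the acceleration ideas mentioned above (a cache of hidden-layer activations for repeated histories and an unnormalized output), here is a minimal numpy sketch with made-up sizes and random weights; it is not the thesis implementation.

```python
# Illustrative sketch only (not the thesis code): caching hidden-layer
# activations of a feed-forward LM for repeated n-gram histories.
import numpy as np

rng = np.random.default_rng(0)
V, H, N = 1000, 64, 3               # assumed vocabulary size, hidden units, history length
E = rng.normal(size=(V, 16))        # embedding table
W_h = rng.normal(size=(N * 16, H))  # input -> hidden weights
W_o = rng.normal(size=(H, V))       # hidden -> output weights

hidden_cache = {}                   # history cache: maps an n-gram history to its hidden vector

def hidden(history):
    """Return (and cache) the hidden-layer activation for a history tuple."""
    if history not in hidden_cache:
        x = np.concatenate([E[w] for w in history])
        hidden_cache[history] = np.maximum(x @ W_h, 0.0)   # cheap ReLU activation
    return hidden_cache[history]

def unnormalized_score(history, word):
    """Skip softmax normalization: return a raw score for a single word."""
    return float(hidden(history) @ W_o[:, word])

print(unnormalized_score((1, 2, 3), 42))
print(unnormalized_score((1, 2, 3), 7))    # second call reuses the cached hidden vector
```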
OCR Trained with Unannotated Data
Buchal, Petr ; Dobeš, Petr (referee) ; Hradiš, Michal (advisor)
The creation of a high-quality optical character recognition (OCR) system requires a large amount of labeled data, and obtaining, or rather creating, such a quantity of labeled data is a costly process. This thesis focuses on several methods which efficiently use unlabeled data for training an OCR neural network. The proposed methods fall into the category of self-training algorithms, and their general approach can be summarized as follows. First, a seed model is trained on a limited amount of labeled data. Then, the seed model, in combination with a language model, is used to produce pseudo-labels for the unlabeled data. The machine-labeled data are combined with the training data used for the seed model and used again to create the target model. The success of the individual methods is measured on the handwritten ICFHR 2014 Bentham dataset. Experiments were conducted on two datasets representing different degrees of labeled-data availability. The best model trained on the smaller dataset achieved 3.70 % CER, a relative improvement of 42 % over the seed model, and the best model trained on the bigger dataset achieved 1.90 % CER, a relative improvement of 26 % over the seed model. This thesis shows that the proposed methods can efficiently use unlabeled data to improve the OCR error rate.
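The general self-training loop described above can be sketched as follows; the function names are placeholders, not the thesis code.

```python
# Illustrative sketch of the general self-training loop (placeholder callables).
def self_training(labeled_data, unlabeled_images, train, transcribe, is_confident):
    """Train a seed model, pseudo-label unlabeled data, retrain a target model."""
    seed_model = train(labeled_data)                       # 1) seed model on limited labeled data
    pseudo_labeled = []
    for image in unlabeled_images:                         # 2) machine-label the unlabeled data
        text, score = transcribe(seed_model, image)        #    (optionally rescored with a language model)
        if is_confident(score):                            # keep only confident pseudo-labels
            pseudo_labeled.append((image, text))
    target_model = train(labeled_data + pseudo_labeled)    # 3) retrain on the combined data
    return target_model
```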
Dynamic Decoder for Speech Recognition
Veselý, Michal ; Glembek, Ondřej (referee) ; Schwarz, Petr (advisor)
The result of this work is a fully working and significantly optimized implementation of a dynamic decoder. The decoder is based on dynamic recognition-network generation and decoding by a modified version of the Token Passing algorithm. The implemented solution provides results very similar to the original static decoder from BSCORE (the API of the Phonexia company). Compared to BSCORE, this implementation offers a significant reduction in memory usage, which makes the use of more complex language models possible. It also facilitates integrating speech recognition into mobile devices and dynamically adding new words to the system.
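A toy sketch of the Token Passing idea, where tokens carry accumulated costs through network states and only the best token per state survives each frame; the real decoder additionally generates the recognition network dynamically and works with acoustic scores.

```python
# Toy illustration of Token Passing over a static, hand-made network.
def token_passing(transitions, start_state, frame_costs):
    """transitions: {state: [next_state, ...]}; frame_costs: list of {state: cost} per frame."""
    tokens = {start_state: 0.0}                       # state -> best accumulated cost
    for costs in frame_costs:
        new_tokens = {}
        for state, acc in tokens.items():
            for nxt in transitions.get(state, []):
                cand = acc + costs.get(nxt, float("inf"))
                if cand < new_tokens.get(nxt, float("inf")):
                    new_tokens[nxt] = cand            # token recombination: keep the cheaper token
        tokens = new_tokens
    return tokens

# Tiny example: two frames over a three-state chain.
print(token_passing({0: [1], 1: [2]}, 0, [{1: 0.5}, {2: 0.2}]))
```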
Neural Network for Autocomplete in the Browser
Kubík, Ján Jakub ; Zemčík, Pavel (referee) ; Kolář, Martin (advisor)
The goal of this thesis is to create and train a neural network and use it in a web browser to predict English text sequences while the user is writing, with the intention of simplifying the writing of frequent phrases. The problem is solved with a recurrent neural network that predicts output text from the text input. The trained neural network is then used in a Google Chrome extension. By normalizing the output of the neural network, choosing text with a sampling decoding algorithm, and concatenating the results, the extension is able to generate English word sequences, which are shown to the user as suggested text. The neural network is optimized by selecting the right loss function and a suitable number of recurrent layers, neurons in the layers, and training epochs. The thesis contributes to enhancing the everyday experience of writing on the Internet by using a neural network for English word-sequence autocomplete in the browser.
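A minimal sketch of the sampling-based decoding step described above, assuming a `step` callable that returns the network's output logits for a token sequence; this is an illustration, not the extension's actual code.

```python
# Illustrative sampling decoding from a language model's normalized output.
import numpy as np

rng = np.random.default_rng()

def sample_next(logits, temperature=1.0):
    """Normalize the network output with a softmax and sample the next token."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

def generate(step, prompt_tokens, length=5):
    """step(tokens) -> logits over the vocabulary; returns the continued sequence."""
    tokens = list(prompt_tokens)
    for _ in range(length):
        tokens.append(sample_next(step(tokens)))
    return tokens
```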
Modelling Prosodic Dynamics for Speaker Recognition
Jančík, Zdeněk ; Fapšo, Michal (referee) ; Matějka, Pavel (advisor)
Most current automatic speaker recognition systems extract speaker-dependent features from short-term spectral information, an approach that ignores long-term information. I explored an approach that uses the fundamental frequency and energy trajectories of each speaker, modelling prosody dynamics on individual phonemes or syllables. It is known from the literature that prosodic systems do not work as well as acoustic ones, but they improve the overall system when fused with it. I verified this assumption by fusing my results with a state-of-the-art acoustic system from BUT. Data from standard evaluation campaigns organized by the National Institute of Standards and Technology are used for all experiments.
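A minimal sketch of score-level fusion of a prosodic and an acoustic system; the weights here are made up, whereas in practice fusion weights are trained (e.g. by logistic regression).

```python
# Illustrative linear fusion of per-trial scores from two speaker-verification systems.
def fuse(acoustic_score, prosodic_score, w_acoustic=0.85, w_prosodic=0.15):
    """Weighted sum of the two systems' scores for one verification trial."""
    return w_acoustic * acoustic_score + w_prosodic * prosodic_score

# Example trial: the prosodic score slightly shifts the acoustic decision.
print(fuse(acoustic_score=2.1, prosodic_score=0.4))
```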
Statistical Language Models Based on Neural Networks
Mikolov, Tomáš ; Zweig, Geoffrey (referee) ; Hajič, Jan (referee) ; Černocký, Jan (advisor)
Statistical language models are an important part of many successful applications, such as automatic speech recognition and machine translation (a well-known example is Google Translate). Traditional techniques for estimating these models are based on so-called N-grams. Despite the known shortcomings of these techniques and the enormous effort of research groups across many fields (speech recognition, machine translation, neuroscience, artificial intelligence, natural language processing, data compression, psychology, etc.), N-grams have essentially remained the most successful technique. The goal of this thesis is to present several language model architectures based on neural networks. Although these models are computationally more demanding than N-gram models, the techniques developed in this thesis make their efficient use in real applications possible. The achieved reduction in speech recognition errors compared to the best N-gram models reaches 20 %. A model based on a recurrent neural network achieves the best published results on the well-known Penn Treebank dataset.
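For illustration, a toy numpy sketch of one step of a simple (Elman-style) recurrent language model, the architecture family the thesis builds on; sizes and weights are arbitrary.

```python
# Toy Elman-style recurrent LM step: one word in, next-word distribution out.
import numpy as np

rng = np.random.default_rng(1)
V, H = 50, 16                              # toy vocabulary and hidden size
U = rng.normal(scale=0.1, size=(V, H))     # input (one-hot word) -> hidden
W = rng.normal(scale=0.1, size=(H, H))     # previous hidden -> hidden (recurrence)
Vout = rng.normal(scale=0.1, size=(H, V))  # hidden -> output

def step(word_id, h_prev):
    """Consume one word, return next-word distribution and the new hidden state."""
    h = np.tanh(U[word_id] + h_prev @ W)
    logits = h @ Vout
    probs = np.exp(logits - logits.max())
    return probs / probs.sum(), h

h = np.zeros(H)
for w in [3, 17, 5]:                       # feed a toy word sequence
    probs, h = step(w, h)
print(probs.argmax())                      # most likely next word under the toy model
```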
Czech-English Translation
Petrželka, Jiří ; Schmidt, Marek (referee) ; Smrž, Pavel (advisor)
This master's thesis describes the principles of statistical machine translation and demonstrates how to assemble the Moses statistical machine translation system. In the preparatory phase, freely available bilingual Czech-English corpora are surveyed. An empirical analysis of the time requirements of multi-threaded word-alignment tools demonstrates that MGIZA++ can achieve up to a five-fold speed-up, while PGIZA++ achieves up to an eight-fold speed-up (compared to GIZA++). Three ways of morphological pre-processing of the Czech training data are tested using simple non-factored models. While simple lemmatization can decrease BLEU, more sophisticated approaches mostly increase BLEU. The positive effects of morphological pre-processing fade as the corpus size grows. The relationship between other corpus characteristics (size, genre, additional data) and the resulting BLEU is measured empirically. The final system is trained on the CzEng 0.9 corpus and evaluated on a test sample from the WMT 2010 workshop.
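As an aside, BLEU scores like those discussed above can be computed, for example, with the sacrebleu package; this is only an illustration and not necessarily the tooling used in the thesis.

```python
# Illustrative BLEU evaluation of translation output with sacrebleu.
import sacrebleu

hypotheses = ["the cat sat on the mat"]
references = [["the cat is sitting on the mat"]]   # one reference stream, parallel to hypotheses

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)                                  # corpus-level BLEU score
```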
Analysis of Video Recordings of Financial Market News
Mikula, Michal
This work deals with the analysis of video recordings of news reports from the field of financial markets. Many financial media outlets increasingly publish information via video, and in some cases even prefer this format. Manual analysis of these videos is very time-consuming, so this work creates a tool enabling their automatic analysis. It covers two main areas: automatic speech recognition, for obtaining transcripts of the videos, and natural language processing, for performing text analysis on a given video. The text analysis includes sentiment analysis, text summarization, and key-phrase extraction.
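A hedged sketch of the text-analysis stage applied to an ASR transcript, using off-the-shelf Hugging Face pipelines and a crude frequency count as a stand-in for key-phrase extraction; the thesis's actual models and tools are not specified here.

```python
# Illustrative text analysis of a transcript (assumed tools, not the thesis implementation).
from collections import Counter
from transformers import pipeline

transcript = "Markets rallied today as the central bank signalled lower interest rates."

sentiment = pipeline("sentiment-analysis")(transcript)
summary = pipeline("summarization")(transcript, max_length=40, min_length=10)

# Crude key-phrase stand-in: the most frequent longer words in the transcript.
words = [w.lower().strip(".,") for w in transcript.split() if len(w) > 4]
key_phrases = Counter(words).most_common(5)

print(sentiment, summary, key_phrases, sep="\n")
```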
Reordering Text Fragments Using a Language Model
Holubec, Michael ; Kocour, Martin (referee) ; Beneš, Karel (advisor)
The aim of this work is to construct a language model and experimentally verify its effectiveness in identifying reading order. For this purpose, a language model with an LSTM architecture was constructed. The work designs and implements three methods for identifying reading order: language analysis, spatial analysis, and combined analysis; the language analysis and the combined analysis use the constructed language model. The success of the language model, and of all three methods, was measured on three datasets containing newspaper articles. Language analysis reaches 57.6 % and spatial analysis reaches 91.6 %, while combined analysis achieved the best result of 92.9 %. The work shows that the language model can be used to identify reading order, but additional data (e.g. spatial data) are needed to achieve good results.
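A minimal sketch of the language-analysis idea: score candidate orderings of text fragments with a language model and keep the most probable one; `lm_log_prob` is a placeholder for the thesis's LSTM scorer.

```python
# Illustrative reading-order search by language-model scoring of fragment orderings.
from itertools import permutations

def best_reading_order(fragments, lm_log_prob):
    """lm_log_prob(text) -> log-probability of the text under the language model."""
    best_order, best_score = None, float("-inf")
    for order in permutations(fragments):            # exhaustive; feasible only for a few fragments
        score = lm_log_prob(" ".join(order))
        if score > best_score:
            best_order, best_score = order, score
    return list(best_order)
```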
Domain Specific Data Crawling for Language Model Adaptation
Gregušová, Sabína ; Švec, Ján (referee) ; Karafiát, Martin (advisor)
The goal of this thesis is to implement a system for automatic language model adaptation for the Phonexia ASR system. The system expects input in the form of a source text, which is analysed, and appropriate terms for web search are chosen. Every web search results in a set of documents that undergo cleaning and filtering procedures. The resulting web corpus is mixed with the Phonexia model and evaluated. To estimate the optimal parameters, I conducted three sets of experiments for Hindi, Czech, and Mandarin. The results of the experiments were very favourable, and the implemented system managed to decrease perplexity and Word Error Rate in most cases.
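A toy sketch of mixing a web-crawled model with a baseline model by linear interpolation and checking perplexity; it uses unigram probabilities only, whereas the real system interpolates full language models and also measures WER.

```python
# Illustrative linear interpolation of two toy unigram LMs and perplexity evaluation.
import math

def interpolate(p_base, p_web, lam=0.5):
    """Return an interpolated word-probability function (lam weights the base model)."""
    return lambda w: lam * p_base.get(w, 1e-9) + (1 - lam) * p_web.get(w, 1e-9)

def perplexity(prob, words):
    return math.exp(-sum(math.log(prob(w)) for w in words) / len(words))

p_base = {"the": 0.5, "market": 0.2, "bank": 0.3}    # made-up baseline model
p_web  = {"the": 0.4, "market": 0.4, "bank": 0.2}    # made-up web-adapted model
test = ["the", "market", "bank", "the"]

mixed = interpolate(p_base, p_web, lam=0.6)
print(perplexity(mixed, test))
```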
