National Repository of Grey Literature: 17 records found
Convolutional Networks for Handwriting Recognition
Sladký, Jan ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This thesis deals with handwriting recognition using convolutional neural networks. Based on a review of current methods, a model combining convolutional and recurrent neural networks with Connectionist Temporal Classification (CTC) was chosen. A Vertical Attention Module, which selects the relevant information in each column corresponding to the text in the image, was then implemented in this model and compared with other options for vertical aggregation between the convolutional and recurrent networks. The experiments were run on a dataset containing over 80,000 lines of text from Czech letters from the 20th century. The results show that the Vertical Attention Module almost always achieves the best results across all types of convolutional networks used. The best network achieved a character error rate of 8.9%. The contribution of this work is a neural network with a newly introduced element that can recognize lines of text.
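As a rough illustration of what vertical aggregation by attention can look like, the sketch below collapses a feature map of shape (channels, height, width) into one feature vector per column by taking a softmax-weighted sum over the height axis. This is not the thesis's implementation: the function names, the plain-list data layout, and the fact that the attention scores are passed in rather than predicted by a learned layer are all assumptions made for the example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def vertical_attention(features, scores):
    """Aggregate features[c][h][w] over height using scores[h][w].

    Hypothetical helper: in a trained model the scores would come from a
    learned layer; here they are simply given as input.
    Returns output[c][w], one value per channel and image column.
    """
    height = len(scores)
    width = len(scores[0])
    # weights[w][h]: attention over the rows of column w, summing to 1.
    weights = [
        softmax([scores[h][w] for h in range(height)]) for w in range(width)
    ]
    return [
        [
            sum(weights[w][h] * channel[h][w] for h in range(height))
            for w in range(width)
        ]
        for channel in features
    ]
```

With uniform scores this reduces to average pooling over the height; with one strongly dominant score per column it approaches picking a single row, which is the "selecting the relevant information in each column" behaviour the abstract describes.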
Holistic License Plate Recognition Based on Convolution Neural Networks
Le, Hoang Anh ; Hradiš, Michal (referee) ; Špaňhel, Jakub (advisor)
The main goal of this work was to create a holistic license plate reader, with an emphasis on achieving the highest possible accuracy on low-quality images. A combination of convolutional and recurrent neural networks using LSTM and CTC was designed and implemented; its inputs are cut-outs of the entire license plate. Competing networks were also implemented for comparison. The networks were compared on a total of 4 datasets, and my design achieved the best results, with a recognition accuracy of 97.6%.
Convolutional Networks for Lip Reading
Kadleček, Josef ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This thesis deals with current methods for automatic speech recognition and lip reading via neural networks. It also examines the similarities between neural network architectures for audio and visual data, and the datasets available in the field of audiovisual automatic speech recognition. The main contribution of this thesis is a set of experiments comparing different changes in neural network architecture and their impact on results. The thesis includes an implementation of a system for automatic speech recognition from audio (CER: 12.6 %) and visual (CER: 57.7 %) data. The architectures of both systems are based on feature extraction via convolutional networks, followed by recurrent LSTM layers, another layer of convolutions, and the CTC loss function.
New Techniques in Neural Networks Training - Connectionist Temporal Classification
Gajdár, Matúš ; Švec, Ján (referee) ; Karafiát, Martin (advisor)
This bachelor’s thesis deals with neural networks and their use in speech recognition. It first covers the theory of speech recognition and then the theory of neural networks in connection with the connectionist temporal classification method. The next chapter introduces the toolkits used for training the neural networks, along with the experiments carried out with them to determine the impact of the connectionist temporal classification method on the precision of phoneme decoding. The last chapter summarizes the work and gives an overall evaluation of the experiments.
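For readers unfamiliar with how a CTC-trained network's output is turned into a phoneme or character sequence, the following is a minimal sketch of greedy (best-path) CTC decoding: take the highest-scoring label in each frame, merge consecutive repeats, and drop the blank label. It is a generic illustration, not code from the thesis or its toolkits; the function name, the blank index of 0, and the toy alphabet are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding.

    frame_scores: list of per-frame score lists, one score per label.
    alphabet: label index -> symbol; the entry at `blank` is a placeholder.
    Both the layout and the blank index are assumptions for this sketch.
    """
    # Best label per frame (argmax over the label axis).
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    decoded = []
    previous = None
    for label in best_path:
        # Emit a symbol only when the label changes and is not blank.
        if label != previous and label != blank:
            decoded.append(alphabet[label])
        previous = label
    return "".join(decoded)


# Toy usage: the best path a a - a b b - collapses to "aab".
path = [1, 1, 0, 1, 2, 2, 0]
frames = [[0.8 if i == k else 0.1 for i in range(3)] for k in path]
print(ctc_greedy_decode(frames, ["-", "a", "b"]))
```

The blank between the two runs of `a` is what allows a repeated symbol to survive the merge step.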
Recurrent Neural Networks for Speech Recognition
Nováčik, Tomáš ; Karafiát, Martin (referee) ; Veselý, Karel (advisor)
This master's thesis deals with the implementation of various types of recurrent neural networks in the Lua programming language using the Torch library. It focuses on finding an optimal strategy for training recurrent neural networks and also tries to minimize the duration of the training. Furthermore, various regularization techniques are investigated and implemented in the recurrent neural network architecture. The implemented recurrent neural networks are compared on a speech recognition task using the AMI dataset, where they model the acoustic information. Their performance is also compared to a standard feedforward neural network. The best results are achieved using the BLSTM architecture. The recurrent neural networks are also trained with the CTC objective function on the TIMIT dataset. The best result is again achieved using the BLSTM architecture.
Holistic License Plate Recognition Based on Convolution Neural Networks
Morbitzer, Dušan ; Juránek, Roman (referee) ; Špaňhel, Jakub (advisor)
The goal of this work is to create a neural network model for holistic recognition of license plates, focused on accuracy and on shortening the learning process. The model was implemented as a combination of a convolutional neural network for extracting deep features of a plate and a bidirectional LSTM with CTC. The trained model was compared to another implementation using a holistic approach that was trained on the same dataset. My network design achieved better recognition results on a dataset different from the training one, with an error rate of 8.3 %.
Improving Consistency in Text Recognition Datasets
Tvarožný, Matúš ; Hradiš, Michal (referee) ; Kišš, Martin (advisor)
This work is concerned with increasing the consistency of datasets for text recognition. It describes the problems that cause inconsistency and then presents solutions to eliminate them. It investigates the effect of the properties of the polygons defining the text line boundaries, and hence how a modified version of the dataset composed of ideal text line variants affected the accuracy of the model. Further, the work focuses on detecting and then removing or modifying text lines whose ground truth transcription does not match the actual text they contain. Experiments showed that removing the visual inconsistency from the training set did not have a significant effect on the trained model, but modifying the test set improved the OCR accuracy of the model by 1.1% CER. After the dataset was modified so that it no longer contained mutually inconsistent pairs of recognized text and corresponding ground truth, the re-trained model improved by at most 0.2% CER. The main finding of this work is the proven beneficial effect of removing inconsistencies from test sets, which makes it possible to determine a more realistic error rate of the OCR model.
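The character error rate (CER) quoted in several of these abstracts is commonly computed as the total Levenshtein edit distance between recognized text and ground truth, divided by the total number of ground-truth characters. The sketch below shows that common definition; whether each thesis uses exactly this normalization is an assumption, and the function names are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete a reference character
                    curr[j - 1] + 1,     # insert a hypothesis character
                    prev[j - 1] + cost,  # substitute, or match for free
                )
            )
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER: total edits / total reference characters."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(
        edit_distance(r, h) for r, h in zip(references, hypotheses)
    )
    return total_edits / total_chars
```

Under this definition, a wrong ground-truth transcription in the test set counts as recognition errors even when the model reads the line correctly, which is why cleaning the test set can lower the measured CER without any change to the model.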
Fast Discriminative Neural Networks for Text Correction
Chupáč, Sebastián ; Beneš, Karel (referee) ; Kohút, Jan (advisor)
The goal of this work is to propose and implement a fast discriminative neural network that needs only one forward pass to detect and correct mistakes in text data. Multiple architectures were implemented for detection and correction separately. These models make use of convolutional layers, LSTM layers, and the CTC loss function. The models were trained and evaluated on datasets made from three different text corpora. The experiments and evaluation demonstrate the ability of these models to detect and correct mistakes at the character level with only one fast forward pass.
