National Repository of Grey Literature: 33 records found (records 24 - 33)
Deep Learning for OCR in GUI
Hamerník, Pavel ; Špaňhel, Jakub (referee) ; Lysek, Tomáš (advisor)
Optical character recognition (OCR) has been a topic of interest for many years. It is defined as the process of digitizing a document image into a sequence of characters. Despite decades of intense research, building OCR systems with capabilities comparable to those of a human remains an open challenge. This work presents the design and implementation of such a system, which is capable of detecting text in graphical user interfaces.
An automatic football match event detection
Dvonč, Tomáš ; Říha, Kamil (referee) ; Přinosil, Jiří (advisor)
This diploma thesis describes methods suitable for automatic detection of events in video sequences of football matches. The first part of the work analyzes and develops procedures for extracting information from the available data. The second part deals with the implementation of selected methods and a neural network algorithm for corner kick detection. Two experiments were performed: the first captures static information from a single image, and the second focuses on detection from spatio-temporal data. The output of this work is a program for automatic event detection, which can be used to interpret the results of the experiments. The work may serve as a basis for gaining new knowledge about the problem and for further development of event detection in football.
Image based smoke and fire detection
Ďuriš, Denis ; Burda, Karel (referee) ; Přinosil, Jiří (advisor)
This diploma thesis deals with the detection of fire and smoke in an image signal. The approach uses a combination of convolutional and recurrent neural networks. The machine learning models created in this work contain inception modules and long short-term memory blocks. The research part describes selected machine learning models used for fire detection in static and dynamic image data. As part of the solution, a data set of videos and still images was created and used to train the designed neural networks. The results of this approach are evaluated in the conclusion.
Convolutional Networks for Historic Text Recognition
Vešelíny, Peter ; Kolář, Martin (referee) ; Kišš, Martin (advisor)
This thesis deals with text line recognition in historical documents. The historical texts, dating from the 17th to the 19th century, are written in the Fraktur typeface. The character recognition problem is solved using a sequence-to-sequence neural network architecture, which is based on an encoder-decoder model and contains an attention mechanism. A dataset was created from texts in the German archive Deutsches Textarchiv. This archive contains 3,897 different German books that have transcripts and corresponding page images available. The created dataset was used to train and experiment with the proposed neural network. The experiments investigated several convolutional models, hyperparameters, and the effects of positional embedding. The final tool recognizes characters with an accuracy of 99.63%. The contribution of this work is the dataset and the neural network, which can be used to recognize historical documents.
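The abstract reports a character-level accuracy but does not say how it was computed. A common convention, assumed here and not confirmed by the thesis, is one minus the edit (Levenshtein) distance divided by the reference length. The function names `edit_distance` and `character_accuracy` below are hypothetical and serve only as a minimal sketch of that convention:

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and a hypothesis string."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Assumed metric: 1 - edit_distance / len(ref); can be negative for very long hypotheses."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return 1.0 - edit_distance(ref, hyp) / len(ref)
```

Under this assumed definition, a single wrong character in a four-character line gives an accuracy of 0.75.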
Holistic License Plate Recognition Based on Convolution Neural Networks
Le, Hoang Anh ; Hradiš, Michal (referee) ; Špaňhel, Jakub (advisor)
The main goal of this work was to create a holistic license plate reader, with an emphasis on achieving the highest possible accuracy on low-quality images. A combination of convolutional and recurrent neural networks using LSTM and CTC was designed and implemented; its inputs are cut-outs of the entire license plate. Competing networks were also implemented for comparison. The networks were compared on a total of 4 datasets, and the proposed design achieved the best results, with a recognition accuracy of 97.6%.
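The abstract mentions CTC without describing how the network output is turned into a plate string. One standard option, shown here as an illustrative sketch and not as the thesis's own decoder, is greedy (best-path) CTC decoding: take the most likely label per frame, merge consecutive repeats, then drop blanks. The function name `ctc_greedy_decode` and the choice of `0` as the blank index are assumptions:

```python
from typing import List


def ctc_greedy_decode(frame_labels: List[int], blank: int = 0) -> List[int]:
    """Collapse a per-frame best-label sequence into an output label sequence.

    frame_labels: the argmax label for each time step (assumed already computed).
    blank: index of the CTC blank symbol (assumed to be 0 here).
    """
    decoded = []
    prev = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame and is not blank.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

For example, the frame sequence `[0, 1, 1, 0, 1, 2, 2, 0]` collapses to `[1, 1, 2]`: the blank between the two runs of `1` keeps them as separate characters, which is how CTC represents doubled characters.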
Convolutional Networks for Historic Text Recognition
Kišš, Martin ; Zemčík, Pavel (referee) ; Hradiš, Michal (advisor)
The aim of this work is to create a tool for automatic transcription of historical documents. The work focuses mainly on recognizing texts from the modern period written in the Fraktur typeface. The problem is solved with a newly designed recurrent convolutional neural network and a Spatial Transformer Network. The solution also includes a generator of artificial historical texts. This generator is used to create an artificial data set on which the convolutional neural network for line recognition is trained. The network is then tested on real historical lines of text, on which it achieves a character accuracy of up to 89.0%. The main contributions of this work are the newly designed neural network for text line recognition and the artificial text generator, which makes it possible to train the neural network to recognize real historical lines of text.
Neural networks for automatic speaker, language, and sex identification
Do, Ngoc ; Jurčíček, Filip (advisor) ; Peterek, Nino (referee)
Title: Neural networks for automatic speaker, language, and sex identification. Author: Bich-Ngoc Do. Department: Institute of Formal and Applied Linguistics. Supervisor: Ing. Mgr. Filip Jurčíček, Ph.D., Institute of Formal and Applied Linguistics, and Dr. Marco Wiering, Faculty of Mathematics and Natural Sciences, University of Groningen. Abstract: Speaker recognition is a challenging task with applications in many areas, such as access control and forensic science. In recent years, the deep learning paradigm and its branch, deep neural networks, have emerged as powerful machine learning techniques and achieved state-of-the-art results in many fields of natural language processing and speech technology. The aim of this work is therefore to explore the capability of one deep neural network model, the recurrent neural network, in speaker recognition. Our proposed systems are evaluated on the TIMIT corpus using a speaker identification task. Under the same test conditions, our systems could not surpass the reference systems due to the sparsity of validation data. In general, our experiments show that the best system configuration is a combination of MFCCs with their dynamic features and a recurrent neural network model. We also experiment with recurrent neural networks and convolutional neural...
Recurrent Neural Networks for Speech Recognition
Nováčik, Tomáš ; Karafiát, Martin (referee) ; Veselý, Karel (advisor)
This master's thesis deals with the implementation of various types of recurrent neural networks in the Lua programming language using the Torch library. It focuses on finding an optimal strategy for training recurrent neural networks and also tries to minimize the training time. Furthermore, various regularization techniques are investigated and implemented in the recurrent neural network architecture. The implemented recurrent neural networks are compared on a speech recognition task using the AMI dataset, where they model the acoustic information. Their performance is also compared to that of a standard feedforward neural network. The best results are achieved using the BLSTM architecture. The recurrent neural networks are also trained with the CTC objective function on the TIMIT dataset, where the best result is again achieved using the BLSTM architecture.
Image Captioning with Recurrent Neural Networks
Kvita, Jakub ; Španěl, Michal (referee) ; Hradiš, Michal (advisor)
This thesis deals with the automatic generation of image captions using several kinds of neural networks. The work is based on papers from the MS COCO Captioning Challenge 2015 and on character-level language models popularized by A. Karpathy. The proposed model is a combination of a convolutional and a recurrent neural network with an encoder-decoder architecture. The vector representing the encoded image is passed to the language model as the memory values of the LSTM layers in the network. The thesis examines how well a model with such a simple architecture can describe images and how it compares with other current models. One of the conclusions of the work is that the proposed architecture is not sufficient for any kind of image captioning.
