National Repository of Grey Literature: 29 records found (records 17 - 26)
Model Compression of Denoising Diffusion Probabilistic Models for Image Generation
Dobiš, Lukáš ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This master's thesis focuses on optimizing the computational efficiency of generative diffusion models by evaluating conventional neural network compression methods on the Denoising Diffusion Probabilistic Model (DDPM) architecture. Model compression was performed on the parameters of a pretrained DDPM network using several quantization and pruning methods. These methods were evaluated on three different image datasets. The results confirm that the implemented compression methods are suitable for deploying diffusion models on small devices with limited resources, or for reducing their computational operating costs.
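As an illustration of the kind of quantization the abstract refers to, here is a minimal sketch of uniform 8-bit quantization of a weight vector. The abstract does not say which quantization schemes were used, so the per-tensor min/max scheme and the names `quantize_uint8` and `dequantize` are assumptions made for this example only.

```python
def quantize_uint8(weights):
    """Map floats to integer codes in 0..255 using a per-tensor scale and offset.

    Assumed scheme (not taken from the thesis): uniform min/max quantization.
    """
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 when all weights are equal
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + lo for c in codes]


weights = [-0.51, -0.02, 0.0, 0.13, 0.77]
codes, scale, lo = quantize_uint8(weights)
restored = dequantize(codes, scale, lo)

# The worst-case rounding error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a reconstruction error bounded by roughly `scale / 2`.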
Deep Neural Networks for Historical Document Classification
Pinkeová, Bettina ; Kohút, Jan (referee) ; Kišš, Martin (advisor)
The aim of this work is to create a system for classifying historical documents, specifically according to their place of origin. Several systems are proposed for this problem. The first designed and implemented system is based on a convolutional neural network with a self-attention mechanism in place of an average pooling layer. Another system is based on the BEiT model, which is built on a vision transformer. The BEiT model was pretrained on masked image modelling and subsequently trained on the given classification task. The system based on the convolutional neural network achieved an accuracy of 81.6%, and the system based on masked image modelling achieved an accuracy of 82.9%. The systems implemented in this work surpassed the systems participating in the ICDAR 2021 conference in terms of success.
Multi-Modal Text Recognition
Kabáč, Michal ; Herout, Adam (referee) ; Kišš, Martin (advisor)
The aim of this thesis is to describe and create a method for correcting text recognizer outputs using speech recognition. The thesis presents an overview of current neural-network methods for text and speech recognition, as well as several existing methods for combining the outputs of two modalities. Several approaches to correcting recognizer outputs, based on algorithms or on neural networks, are designed and implemented. The best approach proved to be an algorithm that searches the recognizer outputs using Levenshtein alignment. It scans the outputs if the uncertainty of the text recognizer's character is less than a pre-selected limit. As part of the work, an annotation server for text transcripts was created and used to collect recordings for evaluating the experiments.
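The Levenshtein alignment mentioned in this abstract can be sketched as follows. This is a generic illustration, not the thesis's algorithm: the function names, the per-character confidence list and the rule "take the speech character wherever the text recognizer's confidence is below a threshold" are all assumptions made for the example.

```python
def levenshtein_align(a, b):
    """Return (edit distance, aligned pairs); None in a pair marks a gap."""
    n, m = len(a), len(b)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution

    # Walk back from the bottom-right corner to recover one optimal alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    return dist[n][m], pairs[::-1]


def correct(ocr_text, ocr_conf, asr_text, threshold=0.5):
    """Hypothetical rule: replace low-confidence OCR characters by the aligned ASR character."""
    _, pairs = levenshtein_align(ocr_text, asr_text)
    out, k = [], 0  # k indexes characters of ocr_text
    for ocr_ch, asr_ch in pairs:
        if ocr_ch is None:
            continue  # character present only in the ASR output: ignored in this sketch
        if asr_ch is not None and ocr_conf[k] < threshold:
            out.append(asr_ch)
        else:
            out.append(ocr_ch)
        k += 1
    return "".join(out)


fixed = correct("hcllo", [0.9, 0.2, 0.9, 0.9, 0.9], "hello")
```

Here only the second character falls below the threshold, so it alone is taken from the speech transcript.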
Improving Consistency in Text Recognition Datasets
Tvarožný, Matúš ; Hradiš, Michal (referee) ; Kišš, Martin (advisor)
This work is concerned with increasing the consistency of datasets for text recognition. It describes the problems that cause inconsistency and then presents solutions to eliminate it. The work investigates the effect of the properties of the polygons defining the text line boundaries, and hence how a modified version of the dataset composed of ideal text line variants affected the accuracy of the model. Further, the work focuses on detecting and then removing or modifying text lines whose ground truth transcription does not match the text they actually contain. Experiments showed that removing the visual inconsistency from the training set did not have a significant effect on the trained model, but modifying the test set improved the OCR accuracy of the model by 1.1% CER. After the dataset was modified so that it did not contain mutually inconsistent pairs of recognized text and corresponding ground truth, the re-trained model improved by at most 0.2% CER. The main finding of this work is the proven benefit of removing inconsistencies from test sets, which makes it possible to determine a more realistic error rate for the OCR model.
Deep Neural Network Pruning for Text Recognition
Petráš, Simon ; Hradiš, Michal (referee) ; Kišš, Martin (advisor)
This work deals with pruning neural networks for handwriting recognition. The aim of the work is to create a program for pruning the network. Two types of neural networks are pruned, namely convolutional and recurrent neural networks. During the pruning of the convolutional part, various parameter selection criteria were tested. The result of the work is a model that achieves a 20% speed-up while increasing the network's error by only 0.4%, along with a number of other models that are faster but incur higher error.
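One common parameter selection criterion for pruning convolutional layers is the L1 norm of each filter. The abstract does not name the criteria that were tested, so the sketch below is an assumed example; `prune_filters` and `keep_ratio` are hypothetical names, and filters are represented as flat lists of weights for brevity.

```python
def prune_filters(filters, keep_ratio):
    """Return indices of the filters to keep, ranked by L1 norm (assumed criterion)."""
    norms = [sum(abs(w) for w in f) for f in filters]
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:n_keep])  # restore the original filter order


filters = [
    [0.9, -0.8, 0.7],     # L1 norm 2.40
    [0.01, 0.02, -0.01],  # L1 norm 0.04
    [0.5, 0.4, -0.6],     # L1 norm 1.50
    [0.05, -0.03, 0.02],  # L1 norm 0.10
]
kept = prune_filters(filters, keep_ratio=0.5)
```

Removing whole filters shrinks the layer and the input of the following layer, which is what yields an actual speed-up, as opposed to zeroing individual weights.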
Text Recognition Enhanced by Writer Identity
Trněný, Matěj ; Kišš, Martin (referee) ; Kohút, Jan (advisor)
The objective of this thesis was to implement a neural network for text recognition enhanced by writer identity. An adversarial learning method was selected for this purpose, and its usefulness was verified by experiments. The resulting network should yield better results on data that are not similar to the data in the training set. Its accuracy was compared with single-task learning and multi-task learning. The network using single-task learning reached an average character recognition error of 7.995%, the network using multi-task learning reached an average error of 7.565%, and the network using adversarial learning reached an average error of 7.573%. Compared with single-task learning, multi-task learning is an improvement of 5.38% and adversarial learning an improvement of 5.28%.
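The improvement figures in this abstract are consistent with a relative reduction of the character error against the single-task baseline. The short check below reproduces them; the function name `relative_improvement` is introduced here for illustration.

```python
def relative_improvement(baseline_error, new_error):
    """Relative error reduction against the baseline, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error


single_task, multi_task, adversarial = 7.995, 7.565, 7.573  # errors from the abstract, in %

multi_gain = relative_improvement(single_task, multi_task)
adversarial_gain = relative_improvement(single_task, adversarial)
```

Rounded to two decimals, these give the 5.38% and 5.28% quoted in the abstract.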
Convolutional Networks for Handwriting Recognition
Sladký, Jan ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This thesis deals with handwriting recognition using convolutional neural networks. Among current methods, a network model consisting of convolutional and recurrent neural networks with Connectionist Temporal Classification was chosen. The Vertical Attention Module, which selects the relevant information in each column corresponding to the text in the image, was then implemented in this model. The module was compared with other options for vertical aggregation between the convolutional and recurrent networks. The experiments were carried out on a dataset containing over 80,000 lines of text from Czech letters from the 20th century. The results show that the Vertical Attention Module almost always achieves the best results across all types of convolutional networks used. The resulting network achieved its best result with a character error rate of 8.9%. The contribution of this work is a neural network with a newly introduced element that can recognize lines of text.
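The idea of selecting relevant information in each column can be sketched as a softmax-weighted sum over the height of a feature map. The abstract does not describe how the Vertical Attention Module is built, so this is an assumed simplification: a single-channel map, with the attention scores passed in directly rather than predicted by a learned layer.

```python
import math


def vertical_attention(features, scores):
    """Collapse an H x W map to W values with a softmax over each column's scores."""
    height, width = len(features), len(features[0])
    out = []
    for x in range(width):
        col_scores = [scores[y][x] for y in range(height)]
        peak = max(col_scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in col_scores]
        total = sum(exps)
        out.append(sum(features[y][x] * e / total for y, e in enumerate(exps)))
    return out


features = [[1.0, 2.0],
            [3.0, 4.0]]

# Equal scores reduce to average pooling over the height.
averaged = vertical_attention(features, [[0.0, 0.0], [0.0, 0.0]])

# Strongly peaked scores pick one row per column.
selected = vertical_attention(features, [[100.0, 0.0], [0.0, 100.0]])
```

With uniform scores the module behaves like average pooling, while peaked scores let each column pick the rows where the text actually lies.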
Automatic Delivery Note Transcription
Necpál, Dávid ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This bachelor thesis aims to create a system for automatic transcription of delivery notes, which are documents with a fixed structure. The solution is divided into two parts. The first part is the detection of table lines and the subsequent detection and extraction of the cells that contain the required data. The second part is the recognition of handwritten numeric characters in the images of the cropped cells. On well-scanned delivery notes, the resulting system detects the cells with the required data with 100% accuracy, while the success rate of numeric character recognition is more than 95% for individual characters and more than 92% for entire character sequences. The benefit of this work is a system for automatic transcription of delivery notes that speeds up and simplifies the otherwise lengthy retyping of delivery note contents into a retail information system. By using this system, an employee saves more than 50% of the time on each delivery note.
Online Tool for Recognition of Tables in Images
Inhliziian, Bohdan ; Kišš, Martin (referee) ; Herout, Adam (advisor)
This work addresses the problem of recognising tables in images. The goal is to convert a table into an XLS file through a web application. The Probabilistic Hough Transform algorithm was used for line detection, and the Tesseract tool was used to detect text in cells. The program was deployed on Amazon AWS and is accessed by the web app through an API. An algorithm for line merging was created, as well as an algorithm for removing lines that do not belong to the table and wrongly detected lines (text, noise). The solution gives users who manually retype data from tables in documents and books a program that does everything automatically; they only need to upload photos to the web application.
Convolutional Networks for Historic Text Recognition
Vešelíny, Peter ; Kolář, Martin (referee) ; Kišš, Martin (advisor)
This thesis deals with text line recognition in historical documents. Historical texts dating from the 17th to the 19th century are written in the Fraktur typeface. The character recognition problem is solved using a sequence-to-sequence neural network architecture, which is based on an encoder-decoder model and contains an attention mechanism. In this thesis, a dataset was created from texts in the German archive Deutsches Textarchiv. This archive contains 3,897 different German books with available transcripts and corresponding page images. The created dataset was used to train and experiment with the proposed neural network. During the experiments, several convolutional models, hyperparameters and the effects of positional embedding were investigated. The final tool recognizes characters with an accuracy of 99.63%. The contribution of this work is the aforementioned dataset and neural network, which can be used to recognize historical documents.
