National Repository of Grey Literature: 22 records found, showing 1 - 10
Detection of Card Type and Point Score in the Game Hobiti (Detekce typu a bodového ohodnocení kartiček ve hře Hobiti)
Hlinský, Martin ; Kohút, Jan (referee) ; Vaško, Marek (advisor)
This thesis aims to create a card detector by training a model to detect the type and score of a card on a synthetically generated dataset. The YOLOv8 model is used for training. The first step is to take pictures of the cards, which then go through a pre-processing stage so that they are aligned and contain no background. These pre-processed card images are combined with photos from other datasets in a generator that randomly translates, rotates, and otherwise transforms them to simulate photos of possible card placements. The generator's output is roughly 50 000 annotated images in the case of the Hobiti game, but different dataset sizes and pre-trained weights are compared in the experiments. The latest generation of trained detectors was validated on a real dataset for unbiased testing, and the most accurate model trained on purely synthetic datasets achieved precision of up to 81.5 % according to the mAP50 metric. A point counter, for example, can then be built on top of the final detector; a prototype of one is also described in this thesis.
Automated Metadata Extraction From Document Images
Křivánek, Jakub ; Vaško, Marek (referee) ; Kohút, Jan (advisor)
This Bachelor thesis addresses the problem of extracting structured data from scans of documents from Czech libraries. The aim of the thesis is to simplify the time-consuming manual process for librarians. I focused on creating datasets from documents of Czech libraries and on detecting metadata on these datasets. I created one dataset for books and another for periodicals. Detection was performed by classifying lines read from the documents. This utilized a fully connected neural network and a network employing a Transformer Encoder. The second method of metadata detection is based on object detection in document scans using the YOLOv8 model. Detection using the fully connected neural network achieves an F1 score of 0.83 on the book dataset and 0.78 on the periodicals dataset. The Transformer Encoder network achieves F1 scores of 0.84 on the book dataset and 0.59 on the periodicals dataset. The YOLO model achieves an F1 score of 0.86 (confidence at 0.549) on the book dataset and 0.7 (confidence at 0.336) on the periodicals dataset.
Web Application for Efficient Annotation of Object Attributes in Video (Webová aplikace pro efektivní anotaci atributů objektů ve videu)
Pernický, Michal ; Kohút, Jan (referee) ; Hradiš, Michal (advisor)
The goal of this work is to develop a web application for annotating video object attributes that combines an efficient user interface with an assistant classifier providing predictions. In contrast to currently available tools, the solution focuses directly on objects without assigning them to the original videos. Key features are the ability to filter objects by their attributes and to confirm or reject predicted attribute values in bulk. User testing showed that the application reduces annotation time by up to half, which indicates that further work on this annotation principle is worthwhile.
Assessment of Uncertainty of Neural Net Predictions in the Tasks of Classification, Detection and Segmentation
Vlasák, Jiří ; Kohút, Jan (referee) ; Herout, Adam (advisor)
This work focuses on comparing three widely used methods for improving uncertainty estimation: Deep Ensembles, Monte Carlo Dropout, and Temperature Scaling. These methods are applied to six computer vision models, both pretrained and trained from scratch. The models are then evaluated on computer vision datasets for classification, semantic segmentation, and object detection using a wide range of metrics. The models are also evaluated on distorted versions of these datasets to measure their performance on out-of-distribution data. These modified models achieve promising results. Ensembles outperform the other models by as much as 70 % in accuracy and 0.2 in IoU on the distorted MedSeg COVID-19 segmentation dataset, while also outperforming the other models on the CIFAR-100 and FMNIST datasets.
Recognition of Driving Lane Borders in Video from On-Board Camera
Fridrich, David ; Kohút, Jan (referee) ; Herout, Adam (advisor)
This paper deals with lane detection. Specifically, it covers a custom generator of synthetic images, its use in training neural networks, testing on the UNet convolutional neural network (CNN) model, and the possibility of extending this model to SALMnet (Structure-Aware Lane Marking Detection Network) by adding an SGCA module (semantic-guided channel attention) and a PDC module (pyramid deformable convolution). Training on synthetic datasets gives very accurate results, reaching around 95 % accuracy (even 99 % for easier images). Training on real datasets gives lower accuracy, depending on the difficulty of the dataset. TuSimple has easier and clearer images and reaches about 62 %. CuLane is much more complex, and accuracy there is around 37 %.
Text Recognition Enhanced by Writer Identity
Trněný, Matěj ; Kišš, Martin (referee) ; Kohút, Jan (advisor)
The objective of this thesis was to implement a neural network for text recognition enhanced by the writer's identity. An adversarial learning method was selected for this purpose, and its usefulness was verified experimentally. The resulting network is expected to yield better results on data that differ from the training set. Its accuracy was compared with single-task learning and multi-task learning. The single-task network reached an average character recognition error of 7.995 %, the multi-task network 7.565 %, and the adversarial network 7.573 %. Relative to single-task learning, multi-task learning is an improvement of 5.38 % and adversarial learning an improvement of 5.28 %.
Generating Animations with Neural Networks
Dráber, Filip ; Kohút, Jan (referee) ; Hradiš, Michal (advisor)
Although motion capture is already a tool meant to help animators simplify the most complex aspects of creating realistic animations, a great deal of effort is still hidden in annotating and structuring the captured data. I address this problem by designing a neural network that can be trained on a motion capture dataset to reproduce human motion, which is visualized in an application that allows the user to control the motion. I also experiment with various methods of training an autoregressive model and, based on the results, determine which method best balances training time and performance. A further observation concerns how adding control values to the features of the generated frames affects the use of recurrent neural networks for this task.
Adaptation of Neural Networks to Target Writer
Sekula, Jakub ; Hradiš, Michal (referee) ; Kohút, Jan (advisor)
This bachelor's thesis deals with the adaptation of neural networks to a specific writer, with the aim of improving recognition of that writer's handwritten text. The method I use is fast, requires only a small training dataset, and uses regularization that tries to keep the distribution of the regularized weights in the adapted network similar to that in the original network. I tested this method on a dataset of printed text called IMPACT and on a dataset of handwritten text. On the handwritten dataset, I improved recognition on two diaries from pre-adaptation error rates of 10.82 % and 1.82 % to 8.48 % and 0.77 %, using a small number of adaptation iterations and a small number of training lines. On the IMPACT dataset, I improved the recognition error rate from 32.88 % to 5.30 %.
Learning to Generate Images with Convolutional Neural Networks
Kohút, Jan ; Kolář, Martin (referee) ; Hradiš, Michal (advisor)
The aim of this Bachelor's thesis is to design and analyze convolutional neural networks that generate images of characters based on their parameters. The parameters are the character type, font, character colour, background colour, translation, and rotation. The neural networks learned a multidimensional representation of each parameter. Relations inside these representations are similar to the relations among the parameter values themselves. The networks generate characters with new parameter values by interpolating between learned values. The networks are thus capable of generalizing the image generation problem.
Transformer Neural Networks for Handwritten Text Recognition
Vešelíny, Peter ; Beneš, Karel (referee) ; Kohút, Jan (advisor)
This Master's thesis aims to design a system based on the transformer neural network and to perform experiments with the proposed model on the task of handwritten text recognition. A multilingual dataset with predominantly Czech texts is used. The experiments examine the influence of basic hyperparameters, such as network size, convolutional encoder type, and the choice of text tokenizer. I also use Czech-language text corpora to train the network decoder. Furthermore, I experiment with using additional textual information during decoding; this information comes from the previous line of the transcribed image. The transformer achieves a character recognition error rate of 3.41 % on the test dataset, which is 0.16 % worse than the recurrent neural network. To compare this model with other transformer-based models from available articles, the network was trained on the IAM dataset, where it achieved an error of 2.48 % and therefore outperformed the other models on the handwritten text recognition task.
