National Repository of Grey Literature: 215 records found (records 11 - 20)
Neural Networks for Optical Music Recognition
Vlach, Vojtěch ; Kohút, Jan (referee) ; Hradiš, Michal (advisor)
This thesis considers the problem of optical music recognition from images to text using artificial intelligence and neural networks. I chose specifically the field of printed polyphonic music (several notes and voices sounding at the same time). The goal of this thesis is to create a model capable of recognising complex notation and to compare its accuracy with previous literature and other known models. I solved the chosen problem by utilizing the Vision Transformer architecture, testing several network variants to find the most powerful one, and by creating a new dataset of polyphonic music. The work presents the process of creating the dataset by synthesizing images from the MusicXML format using the MuseScore program. The most successful variant of the Vision Transformer architecture achieves an error rate of only 7.86%, which is very promising for further development and utilization. The main finding is that the architecture has the potential to dominate in this field, just as it does in other areas of research, and that there is now a functional solution for the specific task of polyphonic music notation recognition, which has so far only been a subject of debate.
Detection and Classification of Vehicles for Embedded Platforms
Skaloš, Patrik ; Hradiš, Michal (referee) ; Špaňhel, Jakub (advisor)
This thesis evaluates the speed-accuracy trade-offs of the state-of-the-art YOLOv8 object detectors for vehicle detection in surveillance camera footage on embedded and low-power devices. YOLOv8 models of various sizes, including one with the efficient MobileNetV2 network as a feature extractor and a YOLOv8-femto model with fewer than 60,000 parameters, were tested on six devices, including three embedded platforms from the NVIDIA Jetson family and the low-compute Raspberry Pi 4B computer. The thesis takes into account various factors affecting model performance, such as quantization, input resolution, inference libraries, and batch sizes during inference. This study provides useful information for the development and deployment of vehicle detectors on a wide range of devices, from low-power CPUs to specialized embedded platforms.
Multiplayer Game for a Mobile Device
Čechák, Daniel ; Hradiš, Michal (referee) ; Bambušek, Daniel (advisor)
The aim of this work is to design and develop a simple and yet fun game for mobile devices, which can entertain players and make their moments of rest more pleasant. The game is developed in Unity. It is a board game for 2-4 players and contains four minigames. Players move around the board using dice rolls and are affected by bonuses gained in minigames based on their skill. The thesis describes the process of designing this game as well as its implementation. Last but not least, the work includes several methods of testing for this game. The game has been published and is freely available on Google Play under the name MiniBoard Champ.
User Interface for Efficient Corrections of OCR Output
Szepsi, Pavol ; Kapinus, Michal (referee) ; Hradiš, Michal (advisor)
The aim of this bachelor thesis was to design and implement a web user interface for checking and correcting OCR outputs that is suitable for mobile and touchscreen devices. The interface offers the OCR output variants, which the user can select to modify the recognized text. It is implemented in JavaScript using the Vue JS framework. The XAMPP package is used for the server part, and Axios is used for communication between the user interface and the server. The resulting interface allows users to quickly and easily correct OCR outputs, either on a computer or on a mobile device.
Fingerprint Recognition with Graph Neural Networks
Pospíšil, Ondřej ; Špaňhel, Jakub (referee) ; Hradiš, Michal (advisor)
This thesis deals with the verification of fingerprints based on their graph representation. The proposed method uses a graph neural network and a combinatorial solver to obtain the matching between the minutiae points of a pair of fingerprints. The matched minutiae points are used to align the fingerprints using a transformation estimated by the RANSAC algorithm. The aligned fingerprints are processed by the SimGNN model. The resulting similarity score is then combined with the metrics obtained from the aligned fingerprints. The experiments summarize the selection of method parameters and the evaluation of fingerprint matching and verification accuracy. The contribution of this work is a new stable method of fingerprint alignment by solving the graph matching problem. The proposed verification method does not achieve high accuracy due to too few minutiae attributes and poor discriminating power of the metrics used.
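The alignment step can be illustrated with a minimal sketch. It assumes that the matched minutiae are given as two arrays of 2D coordinates and that the transformation is rigid (rotation plus translation); the abstract does not state the transformation model, and the function names, iteration count, and inlier tolerance below are hypothetical.

```python
import numpy as np


def estimate_rigid(src, dst):
    """Least-squares rotation R and translation t such that dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), never a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def ransac_align(src, dst, n_iters=200, inlier_tol=3.0, seed=0):
    """RANSAC over matched point pairs; returns (R, t, inlier_mask).

    n_iters and inlier_tol are illustrative values, not taken from the thesis.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Two point pairs are the minimal sample for a 2D rigid transform.
        idx = rng.choice(len(src), size=2, replace=False)
        R, t = estimate_rigid(src[idx], dst[idx])
        residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = residuals < inlier_tol
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 2:
        raise ValueError("not enough inliers to estimate a transformation")
    # Refit on all inliers of the best hypothesis.
    R, t = estimate_rigid(src[best], dst[best])
    return R, t, best
```

Wrongly matched minutiae end up outside the inlier mask, so they do not influence the final refit.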
Generative Neural Networks for Handwritten Text
Ševčík, Pavel ; Dobeš, Petr (referee) ; Hradiš, Michal (advisor)
The aim of this study was to create a generative neural network for handwritten text lines. The model produces variable-sized images of handwritten text lines based on the expected style. The proposed method exceeds existing models in image quality and can be used to generate both individual words and entire lines of handwritten text. It uses the attention mechanism to extract the features for each character from the text query and arranges them on the line by inserting spaces between them. The new approach allows more granular control of the symbol positions on the line, which leads to smoother style interpolations. In contrast to the previous approach, the proposed method uses a Gaussian filter to spread the features of individual symbols to the surrounding area. This approach also makes it possible to train the model to predict symbol positions using the adversarial loss (GAN). In addition, annotations of the horizontal symbol positions on the lines of the IAM dataset of handwritten text have been created.
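One possible reading of the Gaussian spreading step is sketched below. It assumes each symbol has one feature vector and one horizontal position, and that the features are spread along the line axis with Gaussian weights; the function name, the array shapes, and the value of `sigma` are hypothetical and not taken from the thesis.

```python
import numpy as np


def spread_symbol_features(features, positions, width, sigma=2.0):
    """Spread per-symbol features along a line of `width` columns.

    features:  array of shape (n_symbols, dim), one vector per symbol
    positions: array of shape (n_symbols,), horizontal symbol centres
    returns:   array of shape (width, dim)
    """
    features = np.asarray(features, dtype=float)
    positions = np.asarray(positions, dtype=float)
    xs = np.arange(width)[:, None]  # column coordinates, shape (width, 1)
    # Gaussian weight of every symbol at every column, shape (width, n_symbols).
    weights = np.exp(-0.5 * ((xs - positions[None, :]) / sigma) ** 2)
    return weights @ features
```

Because the weights are differentiable with respect to `positions`, a formulation like this would let gradients from an adversarial loss reach a position predictor, which fits the training setup the abstract describes.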
Neural Networks for Automatic Table Recognition
Piwowarski, Lukáš ; Španěl, Michal (referee) ; Hradiš, Michal (advisor)
This thesis introduces the reader to current table recognition techniques, which are used primarily to extract information from handwritten or printed historical tables. We also present a method based on a graph neural network, inspired by the approaches introduced. The method consists of three phases: graph initialization, node/edge classification, and graph-to-text transformation. In the graph initialization phase, we use a node visibility algorithm and OCR to create an initial graph representation of the input table. In the node/edge classification phase, the nodes and edges are classified, and in the graph-to-text transformation phase, we align the graph nodes to a grid, which is then used to create the final text representation of the table. Our implemented model achieved an accuracy of 68% for horizontal neighbor detection, 71% for vertical neighbor detection, and 83% for cell detection on the ABP dataset.
Iris Image Quality Assessment
Vaško, Marek ; Herout, Adam (referee) ; Hradiš, Michal (advisor)
Iris image recognition is one of the most accurate ways of biometric identification. Various verification errors can be caused if the biometric system receives poor input. By assessing the image quality it is possible to eliminate inputs causing such errors. There has been relatively little development in the field of iris quality assessment, and many methods that could potentially be used have not yet been tested in this area. This work focuses on different quality assessment methods used in face recognition and applies them to iris identification. The solution uses verification systems based on various iResNet and MobileNetV3 architectures. Selected quality assessment methods are applied to individual systems. Depending on the method, quality assessment either trains the system directly or uses its outputs to obtain information about quality. The resulting system achieves a reduction of the false non-match rate by up to 56% with an absolute value of 0.5% for iResNet50 and by up to 22% with an absolute value of 6.4% for MobileNetV3 when using the best quality assessment method. The results are given for the University of Notre Dame Iris CrossSensor 2013 data set with an input reject rate of 10% and a false match rate of 0.1%.
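The evaluation idea, rejecting the lowest-quality inputs and measuring the false non-match rate (FNMR) on the rest, can be sketched as follows. This is a simplified illustration: it assumes one quality value per genuine comparison and a fixed decision threshold, whereas in practice the threshold would be derived from impostor scores at the target false match rate. All names and numbers are hypothetical.

```python
def fnmr(genuine_scores, threshold):
    """Fraction of genuine comparisons whose similarity falls below the threshold."""
    if not genuine_scores:
        raise ValueError("no genuine scores given")
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)


def fnmr_after_rejection(genuine_scores, qualities, threshold, reject_rate):
    """Discard the lowest-quality fraction of comparisons, then compute FNMR."""
    if len(genuine_scores) != len(qualities):
        raise ValueError("scores and qualities must have the same length")
    n_reject = int(len(genuine_scores) * reject_rate)
    # Indices sorted from lowest to highest quality.
    order = sorted(range(len(qualities)), key=lambda i: qualities[i])
    kept = order[n_reject:]
    return fnmr([genuine_scores[i] for i in kept], threshold)


# Made-up example data: ten genuine comparisons with quality estimates.
scores = [0.9, 0.8, 0.2, 0.85, 0.3, 0.95, 0.7, 0.75, 0.88, 0.92]
quality = [0.9, 0.8, 0.1, 0.7, 0.6, 0.95, 0.5, 0.65, 0.85, 0.75]
baseline = fnmr(scores, threshold=0.5)
filtered = fnmr_after_rejection(scores, quality, threshold=0.5, reject_rate=0.1)
relative_reduction = 1.0 - filtered / baseline
```

A good quality measure assigns low values to exactly those inputs that would otherwise produce false non-matches, so `filtered` drops well below `baseline` at a small reject rate.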
Web-Based Image Annotation Tool
Vostřejž, Tomáš ; Hradiš, Michal (referee) ; Špaňhel, Jakub (advisor)
This thesis deals with the development of a web application for image data annotation. It describes the design and implementation of the client and server sides of a tool that works with video files and supports object tracking using interpolation. The tool is implemented in JavaScript using the Angular platform and the Express library. It allows the user to create point, line, stroke, rectangle, and polygon annotations. Annotations are created based on annotation templates that the tool organizes into groups. Datasets have one or more annotation groups, and the user has the option to transfer and reuse them between datasets using a personal library. The tool exports the resulting annotations in JSON format.
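Object tracking by interpolation can be illustrated with a short sketch. It assumes that the user annotates a rectangle on a few keyframes and that the frames in between are filled in linearly; the abstract does not specify the interpolation scheme, and the `Box` type and function name are hypothetical. The tool itself is written in JavaScript, and Python is used here only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Box:
    x: float
    y: float
    w: float
    h: float


def interpolate_box(keyframes: Dict[int, Box], frame: int) -> Box:
    """Linearly interpolate a rectangle for `frame` from annotated keyframes."""
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    frames = sorted(keyframes)
    # Outside the annotated range, hold the nearest keyframe.
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    for lo, hi in zip(frames, frames[1:]):
        if lo <= frame <= hi:
            t = (frame - lo) / (hi - lo)
            a, b = keyframes[lo], keyframes[hi]
            return Box(
                a.x + t * (b.x - a.x),
                a.y + t * (b.y - a.y),
                a.w + t * (b.w - a.w),
                a.h + t * (b.h - a.h),
            )
    raise AssertionError("unreachable: frame lies between two keyframes")
```

With this scheme the annotator only corrects the frames where the object changes direction or size, and every correction becomes a new keyframe.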
Web Interface for Human Corrections of Automatic Transcript and Tagging
Plhal, Jan ; Hradiš, Michal (referee) ; Szőke, Igor (advisor)
The goal of this thesis is the design and implementation of a web application for human correction of automatic transcripts and annotation of VHF (very high frequency) pilot-tower radio communication. The main inspiration is the ATCO2 SpokenData annotation service from the company ReplayWell, which was rebuilt from the ground up into a more user-friendly form with a number of user-customizable elements. A functional design and a graphical design of the application in the form of a wireframe were created based on the studied principles of usable user interface design and on insights gained from a survey of annotation applications. The designed application was successfully implemented using the selected technologies. The usability of the application was verified and improved with the help of user testing, and the application was deployed on the SpokenData website.
