The use of deep neural networks for the evaluation of metallographic cross-sections
Semančík, Adam ; Mendřický, Radomír (referee) ; Hurník, Jakub (advisor)
This diploma thesis investigates the application of deep neural networks to improve the evaluation of metallographic cross-sections of materials produced by additive manufacturing. It focuses on two advanced image processing techniques: semantic segmentation and image super-resolution. For semantic segmentation, the U-Net architecture was used to classify defects such as two types of pores. In addition, the SRGAN (Super-Resolution Generative Adversarial Network) model was used to increase image resolution, which potentially improves segmentation accuracy. The research evaluates whether a model trained on AlSi10Mg can evaluate the materials Cu99 and Ti6Al4V sufficiently well. It also evaluates the impact of super-resolution on segmentation performance. The results showed that while the segmentation model achieved good results on AlSi10Mg, generalization to other materials requires more diverse training data. Owing to computational constraints, the combined effect of super-resolution and segmentation remains inconclusive, which indicates the need for further research with more powerful computing resources.
Generative Models for 3D Shape Completion
Zdravecký, Peter ; Španěl, Michal (referee) ; Kubík, Tibor (advisor)
Scanned 3D models often suffer from defects caused by occlusion, scanning imperfections, or incompleteness of the model itself. The goal of this thesis is to develop an automated process for completing the missing parts of 3D shapes using deep learning. The proposed solution builds on the previous work DiffComplete, which uses a generative diffusion process to fill in missing parts of 3D shapes. The task is thus treated as a generative problem. The results demonstrate the high effectiveness of this model, with an IoU score reaching 81.6 on a specific test set consisting of furniture shapes. Moreover, the model successfully generalizes to shapes that are not included in the training set, achieving an average IoU score of 70.9. Besides describing data-oriented experiments, the thesis contributes to the current state of 3D shape completion in two ways. First, it addresses the biggest limitation, computational cost, by processing the input in a low-resolution space. Second, it makes use of user input (in the form of a region of interest), which allows the user to better control the generation process in ambiguous situations.
Deep Neural Networks for Reinforcement Learning in Real-Time Strategy
Barilla, Marco ; Dobeš, Petr (referee) ; Kolář, Martin (advisor)
Machine learning is one of the fastest growing branches of modern science. It is a subfield of artificial intelligence research concerned with making computers help us solve complex modern problems. Games play an important role in this field because they provide an ideal environment for testing new approaches and benchmarking them against human performance. Starcraft 2 is currently in the spotlight, thanks to its broad player base and its complexity. The practical goal of this paper is to create an advantage actor-critic agent that is able to operate in the environment of this game.
Ball Tracking in Sports Video
Motlík, Matúš ; Špaňhel, Jakub (referee) ; Bartl, Vojtěch (advisor)
This master's thesis deals with automatic detection and tracking of a soccer ball in sports videos. Based on the introduced techniques for tracking small objects in high-resolution videos, efficient convolutional neural networks are designed and used for automatic object detection by a modified version of the SORT tracking algorithm. A set of experiments with images processed at different resolutions and with various frequencies of detection extraction is carried out in order to examine the trade-off between processing speed and tracking accuracy. The results of the experiments are presented and used to form proposals for future work, which could lead to improvements in tracking accuracy while maintaining reasonable processing speed.
Modelling Music Waveforms Using Wavenet
Slanináková, Terézia ; Landini, Federico Nicolás (referee) ; Beneš, Karel (advisor)
This thesis explores the possibilities of modelling music and speech with WaveNet, a deep neural network for generating raw audio waveforms. Using existing implementations, WaveNet was trained on multiple datasets and produced several audio files. Multiple experiments were carried out with various hyperparameter setups of WaveNet to find the settings that give the best results. Furthermore, multiple generation schemes were used, each with a different impact on the quality of the generated audio. This quality was evaluated by human assessment via a questionnaire, in which the musical samples were rated with scores of 2 to 3.1818 on a 5-point scale, which is comparable to the rating of the reference audio from the original WaveNet paper (3.1818).
Deep Neural Network Optimization
Bažík, Martin ; Wiglasz, Michal (referee) ; Sekanina, Lukáš (advisor)
The goal of this thesis was to design, implement and analyze various optimizations of deep neural networks in order to improve the observed parameters. The optimizations are based on modifying the data representation used by neural network operations and on searching for the best combination of the network's hyper-parameters. The convolutional neural networks used for these optimizations were built on the LeNet-5 architecture and trained on the MNIST, CIFAR-10, and SVHN datasets. The neural networks and their optimizations were implemented within the Tiny-dnn library using the C++ programming language.
Neural Network Based Dereverberation
Karlík, Pavol ; Černocký, Jan (referee) ; Žmolíková, Kateřina (advisor)
In recent years, the use of neural networks in speech processing has increased significantly. This thesis focuses on implementing and evaluating a speech dereverberation framework that uses a deep neural network (DNN) to estimate the power spectral density of the signal. The proposed framework is based on the state-of-the-art speech enhancement algorithm called Weighted prediction error (WPE), which is known to effectively reduce reverberation in the speech signal. This thesis summarizes the theory of dereverberation, neural networks and the Weighted prediction error algorithm. Different DNN architectures are tested and trained on different datasets with varying properties. The results show that our framework is able to outperform the conventional WPE, especially in situations where the duration of the processed signal is short.
Deep Learning for Image Stitching
Držíková, Diana Maxima ; Vaško, Marek (referee) ; Španěl, Michal (advisor)
Image stitching is not as unfamiliar a concept as it may seem at first glance. Surely every ordinary user of technology has already encountered the term panoramic image. In the background on the device, overlapping images are stitched together to produce a high-quality image. For this process to work, existing algorithms must reliably and accurately detect interest points, according to which an image can be positioned correctly. This thesis presents traditional image stitching methods as well as methods based on deep neural networks. The two main models described and used are implementations of SuperPoint and SuperGlue. The implementation is adapted into a matching system for more than two images. Other experiments that were tried and that helped in understanding the problem are also described and evaluated.
Blood vessel segmentation in retinal images using deep learning approaches
Serečunová, Stanislava ; Vičar, Tomáš (referee) ; Kolář, Radim (advisor)
This diploma thesis deals with the application of deep neural networks, with a focus on image segmentation. The theoretical part contains a description of deep neural networks and a summary of widely used convolutional architectures for segmenting objects from an image. The practical part of the work was devoted to testing existing network architectures. For this purpose, the open-source software library Tensorflow, used from the Python programming language, was employed. A frequent problem with the use of convolutional neural networks is the requirement for a large amount of input data. To overcome this obstacle, a new data set consisting of a combination of five freely available databases was created. The selected U-net network architecture was tested on the first modification of the newly created data set. Based on the test results, the chosen network architecture was modified. In this way, a new network was created that achieves better performance than the original network. The modified architecture was then trained on a newly created data set that contains images of different types taken with various fundus cameras. As a result, the trained network is more robust and allows segmentation of retinal blood vessels from images with different parameters. The modified architecture was tested on the STARE, CHASE, and HRF databases. The results were compared with segmentation methods published in the literature that are based on convolutional neural networks, as well as with classical segmentation methods. The created network shows a high success rate in retinal blood vessel segmentation, comparable to state-of-the-art methods.
Automatic Pronunciation Evaluation of Non-Native English Speakers
Gazdík, Peter ; Szőke, Igor (referee) ; Žmolíková, Kateřina (advisor)
Computer-Assisted Pronunciation Training (CAPT) is becoming more and more popular these days. However, the accuracy of existing CAPT systems is still quite low. Therefore, this diploma thesis focuses on improving existing methods for automatic pronunciation evaluation at the segmental level. The first part describes common techniques for this task. Afterwards, we propose a system based on two approaches. Finally, the experiments performed show a significant improvement over the reference system.

