National Repository of Grey Literature
Active Learning for Processing of Archive Sources
Hříbek, David ; Zbořil, František (referee) ; Rozman, Jaroslav (advisor)
This work deals with the creation of a system that allows uploading and annotating scans of historical documents, followed by active learning of optical character recognition (OCR) models on the available annotations (marked lines and their transcripts). The work describes the character recognition process, classifies its techniques and presents an existing system for character recognition, with emphasis on machine learning methods. Furthermore, the methods of active learning are explained and a method for active learning of the available OCR models from annotated scans is proposed. The rest of the work deals with system design, implementation, available datasets, evaluation of the author's own OCR model and testing of the entire system.
Semi-Automatic Word Normalization in Parish Records
Hříbek, David ; Zbořil, František (referee) ; Rozman, Jaroslav (advisor)
This work extends the DEMoS web application for the management of parish records with support for normalization (assigning a normalized written form to individual words) of names, surnames, occupations, domiciles and other types of words occurring in parish records. The solution uses a duplicate record detection process, which makes it possible to sort the words from parish records into clusters of similar words. As a result of the clustering, normalized word variants can be shared within these clusters. Thus, for words entered by users, DEMoS suggests normalized variants that were used not only for the same words but also for similar words. The work also proposes automatic testing of word clustering. In total, 640 different combinations of clustering parameters were tested for each word type, and the best clustering parameters were then selected for each word type. By normalizing words, the DEMoS application significantly increases the efficiency of searching in parish records. Records are also easier to read.
Detection of Fake News Using Machine Learning
Koreň, Matej ; Zbořil, František (referee) ; Hříbek, David (advisor)
This thesis focuses on the use of machine learning in fake news detection. For this purpose, four models were selected: Bayesian, Decision Tree, Support Vector Machine and a Neural Network. In five experiments on various datasets, these models were trained, tested, evaluated and compared with state-of-the-art methods. The final implementation takes the form of a Python package, which allows its users to replicate this procedure with their own data. Beyond the scope of the assignment, a Slovak dataset, Dezinfo SK, was created.
Measuring the Thickness of Material Layers Removed from a Sample in an Electron Microscope
Kutálek, Jiří ; Hříbek, David (referee) ; Čadík, Martin (advisor)
The motivation for this work arises from the interest of Thermo Fisher Scientific in developing a method for measuring the thickness of layers of material sputtered off a sample in an electron microscope. The main goal of the work is to design a measurement method that is more efficient in practice than existing methods. A secondary goal is to find a way to obtain ground truth for the measurements, which would make it possible to evaluate the proposed method. This work presents two new measurement methods that detect the position of the sample edge in the image, and a way to obtain ground truth that consists of burning small pits (dots) into the sample surface and then detecting and counting the dots in images of the sample. To evaluate all the methods, I collected three sets of images. The experimental results show that one of the proposed methods, Top-Down FIB, measures consistent values close to the expected mean and, when compared against the ground truth, performs slightly better than the state-of-the-art method. Moreover, the algorithm that counts dots in the image proved to be a usable method for obtaining ground truth, as it achieved more stable results than the alternative ground truth generated by manual annotation of the data.
Active Learning for Work with Archive Materials
Štajerová, Alžbeta ; Hříbek, David (referee) ; Rozman, Jaroslav (advisor)
The aim of this Master's thesis is to design and implement an OCR system for archival historical documents containing handwritten text. The first part of the thesis covers optical character recognition and the stages of the OCR pipeline. It then describes active learning and its methods. The thesis reviews the available solutions for recognition of handwritten historical documents. I further describe the neural network architectures used for text detection. The thesis results in the design and subsequent implementation of a system for text recognition in historical documents that enables user annotation and full-text search in annotation records.
Time Series Prediction
Dvořáček, Tomáš ; Rozman, Jaroslav (referee) ; Hříbek, David (advisor)
The aim of this thesis is to design and implement a program that can analyze univariate and multivariate time series from a given input and predict their future evolution. The solution uses both statistical approaches and approaches in which time series are predicted using neural networks.
Methods for Clustering and Searching in Image Data from Electron Microscopes
Plachý, Tomáš ; Hříbek, David (referee) ; Čadík, Martin (advisor)
This thesis deals with the problem of clustering images from electron microscopy. These images can be clustered by visual similarity or by metadata, which describe the settings of the microscope. The goal of this thesis is to compare these two clustering approaches and to explore the possibility of using clustering to split a set of images into two parts: one containing correct images and the other containing images that capture an error during the automated operation of an electron microscope. The conclusion of this thesis is that the visual differences and the differences in metadata between correct and erroneous images from electron microscopy are so small that they cannot be distinguished by unsupervised clustering techniques. However, a positive contribution of this work is the demonstration that the chosen methods can be used to cluster images into groups corresponding to different phases of the microscope's operation, which will make the manual analysis of these images easier.