keywords:"DataSet" - Search results - Digital repository

	Column-oriented and Image Data Format Benchmarks Tarageľ, Marián ; Bartl, Vojtěch (reviewer) ; Špaňhel, Jakub (supervisor) This bachelor's thesis evaluates different data formats for storing tabular and image data. To that end, a new benchmark of data formats was designed and divided into three suites: uncompressed tabular formats, compressed tabular formats, and image storage. The overall tabular results suggest that Feather is the fastest format for saving and reading and that Parquet is the most memory-efficient. The image storage results show that SQLite is the fastest image store and that the PNG format requires the least space. These results can contribute to a better understanding of how different data formats behave and help in choosing the right format for tabular and image data.
	Detekce karet při turnajích v pokru [Card Detection in Poker Tournaments] Kovalets, Vladyslav ; Šilling, Petr (reviewer) ; Vaško, Marek (supervisor) This bachelor's thesis focuses on developing an advanced system for automatically recognizing and tracking playing cards in video recordings of poker games. Convolutional neural networks, specifically the YOLO network, were chosen as the core tool; they allow cards on the table and in players' hands to be identified effectively even under difficult conditions. The work included creating an extensive dataset for training and testing the card detector, which achieved a recognition accuracy of 98.7%. For reliable card tracking, an algorithm was designed that minimizes detector errors and improves the overall accuracy of the system. The results suggest that the proposed system has potential for practical use.
	Large Language Models for Generating Code Focusing on Embedded Systems Vadovič, Matej ; Nosko, Svetozár (reviewer) ; Smrž, Pavel (supervisor) The goal of this work was to adapt a pre-trained language model to generate code for embedded systems. The work introduces a new dataset for fine-tuning code generation models, consisting of 50,000 pairs of source code and comments focused on embedded systems programming. The source code was collected from the GitHub platform. Two new code generation models, based on pre-trained transformer models, were fine-tuned on the new corpus. The MicroCoder model is based on the CodeLLaMA-Instruct 7B model and was fine-tuned with the QLoRA technique to minimize computational requirements. The second model, MicroCoderFIM, is based on the StarCoderBase 1B model and supports code infilling. The models were compared using the BLEU, CodeBLEU, ChrF++, and ROUGE-L metrics. MicroCoderFIM adapts best to the new task, with over 120% improvement in all measured metrics. The model weights and the new dataset are freely accessible in a public repository.
	Doplnění chybějící části obrazu pomocí hlubokého učení [Image Inpainting Using Deep Learning] Zobaník, Radek ; Kubík, Tibor (reviewer) ; Šilling, Petr (supervisor) In this thesis, an application was created for testing and comparing deep-learning methods for image inpainting, and two methods were trained: pconv, with a convolutional architecture, and AOT-GAN, with a GAN architecture. The thesis describes the design of the resulting application, its functionality, and the key points of its implementation. A dataset was selected on which the chosen models were optimally trained. Experiments on the AOT-GAN model examined how the number of AOT blocks in the generator affects the resulting inpainted image. All experiments were compared qualitatively and quantitatively. The methods achieved respectable results on natural scenery.
	Detekce malware domén pomocí metod strojového učení [Detection of Malware Domains Using Machine Learning Methods] Ebert, Tomáš ; Poliakov, Daniel (reviewer) ; Hranický, Radek (supervisor) This bachelor's thesis deals with detecting malware domains using machine learning methods, based on various information obtained about a domain (DNS records, geolocation data, etc.). With threats spreading rapidly, and not only in the form of malware, current approaches are often insufficient, whether in the speed of detecting malware domains or in recognizing at all whether a domain is dangerous. The output of this thesis is a trained XGBoost classifier model whose advantage is fast and efficient real-time detection, in contrast to detection using blacklists, which often receive domain data with a delay of a week. For this model, 131 thousand malware domains were collected, which made it possible to obtain a model with high metric values. In experiments, the XGBoost classifier achieved an F1 score of 96.8786% with a false positive rate of 0.004887.
	Benchmark of the Computational Tools for the Prediction of the Effect of Mutations on Protein Stability Berezný, Matej ; Martínek, Tomáš (reviewer) ; Musil, Miloš (supervisor) Protein design requires understanding how mutations influence protein stability. Numerous online predictors exist for this purpose, but it is challenging to compare them or to use them collectively. To address this, I developed BenchStab, a console application and Python package designed for the swift and straightforward operation of 18 predictors, gathering results from a series of mutants. BenchStab is freely available on GitHub and can be expanded to include more predictors. To avoid potential dataset bias towards some predictors, I constructed a new, unique dataset sourced from FireProtDB. I used this dataset to assess 24 distinct prediction methods from three different perspectives.
	Analyzing a person’s handwriting for recognizing his/her emotional state Chudárek, Aleš ; Matoušek, Jiří (reviewer) ; Malik, Aamir Saeed (supervisor) Emotion recognition from handwriting is a challenging and interdisciplinary task that can provide insights into the psychological and emotional aspects of the writer. In this study, we developed and evaluated a machine learning model that can predict the emotional state of a writer from their handwriting samples. We utilized the EMOTHAW dataset, which consists of handwriting and drawing samples from subjects whose emotional states are measured by the DASS test, which gives a score for depression, anxiety, and stress, and the CIU Handwritten database for verification and experimentation. We extracted a large number of features inspired by standard graphology work, as well as features specific to online data. We used ANOVA to select statistically significant features and normalized the data using Z-Score, MinMax, IQR, or Log. We reduced the dimensionality of the features using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). We employed ensemble learning, a meta-approach that seeks to reduce the errors of a single model by exploiting the diversity and complementarity of multiple models. The structure of our classifier depends on multiple arguments, resulting in over 300,000 different configurations. We optimized the arguments using argument freezing. We found the best classifiers for binary and trinary classification for each emotion, resulting in six optimal models. We evaluated our models using different metrics, such as accuracy, precision, recall, and F1-score. Our models reached adequate results in all metrics. In addition to finding the classifiers, this thesis explored the importance of each extracted feature, providing a sorted list of the most significant features used for emotion recognition from handwriting. We also enhanced the EMOTHAW database by identifying tasks that are more indicative of specific emotions, thereby reducing the need for a full task battery for emotional analysis.
	Sémantická segmentace leteckých snímků [Semantic Segmentation of Aerial Images] Pazdera, Jiří ; Králík, Jan (reviewer) ; Adámek, Roman (supervisor) This thesis deals with the semantic segmentation of aerial images and their subsequent use for planning a route through the captured terrain. The first part introduces the topic and gives a theoretical description of the current state of the art. The second part describes the testing of available segmentation methods, the development of a custom dataset, and the training of an existing neural network model. Finally, route planning with a suitable algorithm is demonstrated.
	Automatická vizuální podpora pro Q-řazení [Automatic Visual Support for Q-Sorting] Kán, Dávid ; Hradiš, Michal (reviewer) ; Vaško, Marek (supervisor) This bachelor's thesis deals with integrating Q-sorting with computer vision methods for object detection. The goal of the work is to create a program that uses visual support to facilitate Q-sorting and to prevent errors in it. The work also covers the creation of a suitable dataset for training the model and for experiments, one that takes into account the way the cards are laid out and the environment. The implemented program is a console application written in Python. It uses YOLOv8 to detect objects and Pero OCR to retrieve text from the cards. Using the created test set, experiments were performed on the trained model and the program was tested.
	Automatická kontrola dopravního značení [Automatic Traffic Sign Inspection] Čechmánek, Roman ; Klíma, Ondřej (reviewer) ; Musil, Petr (supervisor) The aim of this thesis is to create a low-cost tool capable of automating the inspection of traffic signs. This involves working with recordings of drives along roads, made with an inexpensive recording device such as a GoPro action camera or a dashboard camera. The inspection is based on traffic signs localized by the system and on historical traffic sign mapping data. The result is a system whose inputs are drive recordings and historical data and whose outputs are two files containing the inspection results. The first is a GeoJSON file, suitable for further processing of the collected data; the second is an HTML file, which provides a simple user interface visualizing the inspection results on an interactive web map.
