keywords:"Data augmentation" - Search Results - Digital Repository

	Data Sets for Network Security Setinský, Jiří ; Hranický, Radek (referee) ; Tisovčík, Peter (advisor) In network security, machine learning techniques are used to detect anomalies and malware in network traffic. A high-quality dataset is needed to train a network classifier with high accuracy. The aim of this paper is to modify the dataset using machine learning techniques so that its quality improves and the trained model reaches higher accuracy. The dataset is analyzed by a clustering algorithm, and each cluster is characterized by a statistical description derived from the attributes of the input dataset. The statistical description, together with information from the original classifier, is used to compute a score. The score serves as a weight in the modification phase. Cluster analysis makes it possible to filter out the data that are important for training the final model. The proposed approach allows us to mitigate redundancy in the dataset or to augment it with missing data. The result is a modification framework that can reduce datasets or aggregate them in order to create a compact dataset that reflects actual network traffic. Models trained on the created datasets achieved higher accuracy than the existing solution. Detailed record
	Keyboard and Keys Image Recognition Lorenc, Jan ; Lichtner, Ondrej (referee) ; Pluskal, Jan (advisor) The aim of this thesis is to create a solution for recognizing keys on a keyboard in order to automate robotic typing. As part of the work, datasets are created for detecting a keyboard in an image, recognizing characters in an image, and subsequently correcting the detected characters based on various keyboard layouts. The thesis presents different approaches to the problem of recognizing characters on a keyboard and selects the most suitable one. The proposed procedure is divided into 3 phases, which correspond to the prepared datasets. Using neural networks and the Canny edge detection method, the keyboard is first recognized in the image, and then individual characters are detected within the found keyboard. In the last phase, the results are post-processed (correcting characters, filling in unrecognized characters, finding special keys, etc.). Results are evaluated for each part. The contribution of the thesis lies in the creation of datasets for detecting a keyboard and its keys and, above all, in a modular and extensible solution for the detection process with promising results. Detailed record
	Automatic Speech Detection for VHF Channel Nováková, Mária ; Veselý, Karel (referee) ; Szőke, Igor (advisor) Background noise in the audio of aviation communication is a problem that air traffic control operators face daily. To ensure safe air transport, communication between the tower and the aircraft must be as effective as possible. Voice activity detection plays a major role in improving communication quality. Correct speech detection is essential for systems to recognize the start of communication. Communication starts when the push-to-talk button is pressed on the radio system. There are various approaches and implementations for speech recognition. Speech detection can be made more precise with the help of neural networks. An advantage of using artificial intelligence is its ability to adapt to new stimuli. This thesis offers a solution for detecting speech and push-to-talk events in aviation communication. The proposed solutions will be evaluated and compared. Finally, the available GPVAD implementation is reworked to solve this problem. Machine learning once again has the opportunity to demonstrate its capabilities. Detailed record
	Dataset augmentation with style transfer methods Wolny, Michał ; Ligocki, Adam (referee) ; Kratochvíla, Lukáš (advisor) This bachelor's thesis focuses on the research of dataset augmentation and style transfer methods. From the range of available style transfer algorithms, three very different methods were selected, implemented and then experimentally used for dataset augmentation. The effectiveness of augmentation using these methods was verified by performing a statistical analysis of each newly created dataset compared to the original, unmodified dataset. The results of the analysis provide important information about changes in statistical characteristics such as entropy, mean, median, variance, and standard deviation. This information helped to evaluate the effectiveness and impact of the augmentation methods used on the augmented dataset and provide evidence of their potential. Detailed record
	Imbalanced data training approaches in neural network Vicianová, Veronika ; Ředina, Richard (referee) ; Jakubíček, Roman (advisor) This thesis deals with the research and implementation of methods that eliminate the influence of an imbalanced dataset on the learning of neural networks. Individual methods are compared with each other for different levels of imbalance. The experiments carried out in the work are also compared with the available literature and a control experiment, which was carried out without the method of eliminating the influence of an imbalanced dataset. The experiments are extended to another dataset containing the original imbalance and compared. In the theoretical section, the topic of neural networks and the problems that may occur during learning are brought up. Subsequently, convolutional networks and their optimization algorithms are presented. The thesis also contains a more detailed presentation of the issue of an imbalanced dataset, including the metrics used in experiments and approaches used to eliminate this problem. Detailed record
	Research of the new augmentation methods for online handwriting Sigmund, Jan ; Burget, Radim (referee) ; Zvončák, Vojtěch (advisor) Graphomotor difficulties of school-aged children are characterised by problems in handwriting and drawing and can lead to developmental dysgraphia. Timely clinical diagnosis is critical to provide preventive care. In practice, however, it is not feasible on a day-to-day basis due to the need for expert staff and the prevalence of difficulties, which is up to 30%. Machine learning models can serve as an accessible, objective tool for evaluating graphomotor functioning. In most cases there is not enough data collected, which results in poor classification performance. Therefore, this thesis focuses on data augmentation of online handwriting. Artificial samples are generated by recombining intrinsic mode functions (IMFs) obtained by empirical mode decomposition. IMFs are calculated for healthy controls, numbering 72, and children with graphomotor difficulties, 94 children in total. The decomposition is performed specifically on the X and Y coordinate time series. IMFs with the same indices from different subjects are randomly interchanged, thus producing a new signal. Then, the graphomotor features of the original and artificial time series are extracted. Only the spatial ones related to the coordinates are selected. Finally, the correlations of the features of the two databases will be analyzed and compared. Detailed record
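The recombination step described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the IMFs have already been computed by some empirical mode decomposition routine and stored in one array with an equal number of IMFs and samples per subject, and the name `recombine_imfs` is hypothetical.

```python
import numpy as np

def recombine_imfs(imfs, rng):
    """Build one artificial signal from precomputed IMFs.

    imfs: array of shape (n_subjects, n_imfs, n_samples); assumed to be
          already padded or truncated to a common shape (an assumption,
          not something the abstract states).
    For every IMF index k, a donor subject is drawn at random and that
    subject's k-th IMF is used; the chosen IMFs are then summed.
    """
    n_subjects, n_imfs, _ = imfs.shape
    donors = rng.integers(0, n_subjects, size=n_imfs)
    picked = imfs[donors, np.arange(n_imfs), :]  # shape (n_imfs, n_samples)
    return picked.sum(axis=0), donors

# Toy data standing in for real IMFs of one coordinate (X or Y).
rng = np.random.default_rng(0)
toy_imfs = rng.normal(size=(5, 4, 100))
artificial, donors = recombine_imfs(toy_imfs, rng)
```

In this sketch the same function would be applied separately to the X and Y coordinate series; whether the thesis draws donors jointly or independently for the two coordinates is not stated in the abstract.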
	Algorithms for improving the detection of selected cardiac arrhythmias Šandová, Hana ; Ředina, Richard (referee) ; Novotná, Petra (advisor) The work deals with the generation of ECG arrhythmias that are underrepresented in databases. The theoretical part of the thesis is devoted to a literature review of academic publications on arrhythmia classification using deep learning and on data augmentation methods for ECG. The practical part of the thesis deals with a noise generator, because adding noise to signals could make the dataset richer. Functions for augmentation of atrial flutter and of 2nd and 3rd degree atrioventricular block were created. Generation of 2nd degree atrioventricular block using generative adversarial networks (GAN) was also attempted. Deep learning-based ECG classifiers were used to evaluate the efficiency of the proposed technique in generating synthetic ECG data. Detailed record
	Pedestrian Detector Domain Shift Robustness Evaluation, And Domain Shift Error Mitigation Proposal Zemčík, Tomáš This paper evaluates daytime to nighttime traffic image domain shift on Faster R-CNN and SSD based pedestrian and cyclist detectors. Daytime image trained detectors are applied on a newly compiled nighttime image dataset and their performance is evaluated against detectors trained on both daytime and nighttime images. Faster R-CNN based detectors proved relatively robust, but still clearly inferior to the models trained on nighttime images; the SSD based model proved noncompetitive. Approaches to the domain shift deterioration mitigation were proposed and future work outlined. Detailed record
	Techniques For Avoiding Model Overfitting On Small Dataset Kratochvila, Lukas Building a deep learning model on a small dataset is difficult, even impossible. To avoid overfitting, we must constrain the model being trained. Techniques such as data augmentation, regularization, or data normalization can be crucial. We have created a benchmark with a simple CNN image classifier in order to find the best techniques. As a result, we compare different types of data augmentation, weight regularization, and data normalization on a small dataset. Detailed record
	Segmentation of multiple sclerosis lesions using deep neural networks Sasko, Dominik ; Myška, Vojtěch (referee) ; Kolařík, Martin (advisor) The main aim of this master's thesis was the automatic segmentation of multiple sclerosis lesions in MRI images. The thesis tested state-of-the-art segmentation methods using deep neural networks and compared approaches to initializing network weights by transfer learning and self-supervised learning. Automatic segmentation of multiple sclerosis lesions is itself very challenging, primarily because of the high imbalance of the dataset (brain scans usually contain only a small amount of damaged tissue). Another challenge is the manual annotation of these lesions, since two different doctors may mark different parts of the brain as damaged, and the Dice Coefficient of these annotations is approximately 0.86. Simplifying the lesion annotation process through automation could improve the calculation of the amount of lesions, which could lead to better diagnostics for individual patients. Our goal was to design two techniques using transfer learning to pre-train weights that could later improve the results of current segmentation models. The theoretical part describes the division of artificial intelligence, machine learning and deep neural networks and their use in image segmentation. Multiple sclerosis, its types, symptoms, diagnosis and treatment are then described. The practical part begins with data preprocessing. First, the brain scans were adjusted to the same resolution with the same voxel size. This was necessary because three different datasets were used, in which the scans were acquired by different devices from different manufacturers. One dataset also contained the skull, so it had to be removed using the FSL tool to keep only the patient's brain. We used 3D scans (FLAIR, T1 and T2 modalities), which were split into individual 2D slices and used as input to a neural network with an encoder-decoder architecture. The training dataset contained 6720 slices with a resolution of 192 x 192 pixels (after removing slices whose mask contained no value). The loss function used was Combo loss (a combination of Dice Loss with a modified Cross-Entropy). The first method focused on using pre-trained weights from the ImageNet dataset in the encoder of the U-Net architecture, with the encoder weights either frozen or not frozen, and on a subsequent comparison with random weight initialization. In this case we used only the FLAIR modality. Transfer learning raised the monitored metric from approximately 0.4 to 0.6. The difference between frozen and unfrozen encoder weights was around 0.02. The second proposed technique used a self-supervised context encoder with Generative Adversarial Networks (GAN) to pre-train the weights. This network used all three mentioned modalities, including slices with empty masks (23040 images in total). The task of the GAN was to complete a brain scan that had been covered by a black checkerboard-shaped mask. The weights learned this way were then loaded into the encoder and applied to our segmentation problem. This experiment did not show better results, with DSC values of 0.29 and 0.09 (unfrozen and frozen encoder weights). The sharp drop in the metric may have been caused by using weights pre-trained on distant problems (segmentation and the self-supervised context encoder), as well as by the difficulty of the task due to the imbalanced dataset. Detailed record
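Two quantities named in this abstract, the Dice coefficient (DSC) and the checkerboard-shaped mask used for the GAN pre-training task, can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names, the smoothing term `eps`, and the block size of 32 pixels are hypothetical and are not taken from the thesis; only the 192 x 192 slice size comes from the abstract.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|).

    eps is a small smoothing term (an assumption) that avoids division
    by zero when both masks are empty.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def checkerboard_mask(size, block):
    """Boolean size x size mask; True marks the squares to black out."""
    rows, cols = np.indices((size, size))
    return ((rows // block + cols // block) % 2).astype(bool)

# 192 x 192 matches the slice resolution in the abstract;
# the 32-pixel block size is a hypothetical choice.
mask = checkerboard_mask(192, 32)
slice_2d = np.ones((192, 192), dtype=np.float32)  # stand-in for an MRI slice
masked_slice = np.where(mask, 0.0, slice_2d)      # masked squares set to black

score = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])  # 2*1 / (2 + 1)
```

A Dice loss is commonly taken as one minus this coefficient computed on soft predictions; how the thesis weights it against the modified cross-entropy inside Combo loss is not given in the abstract.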
