National Repository of Grey Literature: 27 records found, showing records 18 - 27
Machine Learning Optimization of KPI Prediction
Haris, Daniel ; Burget, Radek (referee) ; Bartík, Vladimír (advisor)
This thesis aims to optimize the machine learning algorithms that predict KPI metrics for an organization. The organization uses machine learning to predict whether projects will meet the planned deadlines of the last phase of the development process. The work focuses on the analysis of prediction models and sets the goal of selecting new candidate models for the prediction system. We implemented a system that automatically selects the best feature variables for learning. Trained models were evaluated with several performance metrics, and the best candidates were chosen for the prediction. The candidate models achieved higher accuracy, which means that the prediction system provides more reliable responses. We also suggested other improvements that could increase the accuracy of the forecast.
Feature selection for text classification with Naive Bayes
Lux, Erik ; Petříčková, Zuzana (advisor) ; Petříček, Martin (referee)
The work presents the field of document classification. It describes existing techniques with emphasis on the Naive Bayes classifier. Several existing feature selection methods suitable for the Naive Bayes classifier are discussed. This theoretical background is the basis for the implementation of a classification library based on the Naive Bayes method. Besides the classification program, the library provides a range of document preprocessing tools. They make it possible to work with different types of documents and, more importantly, they significantly reduce redundant document dimensions. Finally, we tested the library on two different datasets and compared the implemented feature selection methods. The functionality of the whole library is verified in practice by integrating it into the open-source email client Mailpuccino.
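As an illustration of the kind of feature selection the abstract above refers to, the following is a minimal sketch of chi-square term scoring, a common choice for text classification. The abstract does not name the methods the thesis implements, so chi-square is an assumption made here for illustration, and the function names `chi_square_scores` and `select_top_k` and the toy documents are hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each term by the chi-square statistic of its 2x2 table:
    term present/absent versus document in the `target` class or not.

    docs: list of token lists (hypothetical toy data, not from the thesis).
    labels: one class label per document.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    doc_sets = [set(d) for d in docs]
    # Document frequency of each term, overall and within the target class.
    df = Counter(t for s in doc_sets for t in s)
    df_pos = Counter(
        t for s, y in zip(doc_sets, labels) if y == target for t in s
    )
    scores = {}
    for term, n_t in df.items():
        a = df_pos[term]      # term present, target class
        b = n_t - a           # term present, other classes
        c = n_pos - a         # term absent, target class
        d = n - n_t - c       # term absent, other classes
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


docs = [
    ["cheap", "pills", "now"],
    ["cheap", "offer", "now"],
    ["meeting", "now"],
    ["project", "meeting"],
]
labels = ["spam", "spam", "ham", "ham"]
scores = chi_square_scores(docs, labels, target="spam")
selected = select_top_k(scores, 2)
```

Only the selected terms would then be passed to the classifier, which is one way the dimensionality of the document representation can be reduced.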
Sentiment Analysis of Customer Reviews
Hrabák, Jan ; Helman, Karel (advisor) ; Malá, Ivana (referee)
This thesis focuses on sentiment analysis of unstructured text and its practical application to real data downloaded from the website Yelp.com. The objective of the theoretical part is to summarize the history, methods and possible applications of sentiment analysis. The reader is introduced to the important terms and processes of sentiment analysis. The theoretical part concentrates on the Naive Bayes classifier, which is used in the practical part of the thesis. The practical part gives a detailed description of the data set and of the construction and testing of the model. Finally, the pros and cons of the chosen model are presented and some possibilities for its use are described.
Electronic module for acoustic detection
Maršál, Martin ; Klusáček, Jan (referee) ; Havránek, Zdeněk (advisor)
This diploma thesis deals with the design and implementation of an electronic module for acoustic detection. The module's task is to detect predetermined acoustic signals using a learned classification model. The module is intended mainly for security purposes. The proposed module uses machine learning techniques to identify and classify sounds. Because it can be retrained for a different set of sounds, the module becomes a universal sound detector. Sound is acquired with a digital MEMS microphone, for which a conversion filter was designed and implemented. The resulting system is implemented in the firmware of a microcontroller running a real-time operating system. The individual functions of the system are implemented with regard to possible optimization (a less powerful MCU or battery power). The module transmits the detection results to the master station over an Ethernet network. When multiple modules are connected to the network, they form a distributed system, which is designed for precise time synchronization using the PTP protocol defined by the IEEE-1588 standard.
Classification Framework
Koroncziová, Dominika ; Otrusina, Lubomír (referee) ; Kouřil, Jan (advisor)
The goal of this work is the design and implementation of machine learning software based on the RapidMiner library. The finished application integrates the most commonly used algorithms and processes implemented in RapidMiner into an easily usable program. The application contains a simple command line interface, as well as a graphical interface that simplifies the selection of multiple parameters. The program also provides a tool for creating standalone programs that can be used for classification with a pre-trained model. Beyond the original requirements, support for textual data from Wikipedia was also implemented, providing a tool for downloading and preprocessing the data so that it can be used as training input. This text focuses on the specifics of the algorithms and classifiers used and on their features and uses, and describes the design and implementation of the system. As part of this work, several tests were run in order to validate the efficiency and functionality of the program. The test results are included at the end of the thesis.
Optimization of Heuristic Analysis of Executable Files
Wiglasz, Michal ; Křoustek, Jakub (referee) ; Hruška, Tomáš (advisor)
This BSc thesis was carried out during a study stay at the Università della Svizzera italiana, Switzerland. The thesis describes the implementation of a classification tool for detecting unknown malware based on its behaviour, which could replace the current solution based on manually chosen attribute scores and a threshold. The database used for training and testing was provided by AVG Technologies, a company that specializes in antivirus and security systems. Five different classifiers were compared in order to find the best one for implementation: Naive Bayes, a decision tree, a random forest, a neural network and a support vector machine. After a series of experiments, the Naive Bayes classifier was selected. The implemented application covers all necessary steps: attribute extraction, training, estimation of the performance and classification of unknown samples. Because the company is willing to tolerate a false positive rate of only 1% or less, the accuracy of the implemented classifier is only 61.7%, which is less than 1% better than the currently used approach. However, it automates the learning process and allows quick re-training (on average around 12 seconds for 90 thousand training samples).
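The trade-off mentioned in the abstract above, where a cap on the false positive rate limits the achievable accuracy, can be illustrated with a minimal sketch. It assumes only that a classifier outputs a score per sample, with higher scores meaning "more likely malware". The scores, labels and the function names `threshold_for_fpr` and `accuracy_at` are hypothetical and are not taken from the thesis.

```python
def threshold_for_fpr(scores, labels, max_fpr):
    """Return the lowest threshold whose false positive rate is <= max_fpr.

    scores: classifier scores, higher = more likely malware (hypothetical data).
    labels: 1 for malware, 0 for clean; at least one clean sample is assumed.
    A sample is flagged as malware when score >= threshold.
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    # Candidates: every observed score, plus one above the maximum, which
    # flags nothing and therefore always satisfies the constraint.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    for t in candidates:
        false_pos = sum(1 for s in negatives if s >= t)
        if false_pos / len(negatives) <= max_fpr:
            return t


def accuracy_at(scores, labels, threshold):
    """Fraction of samples classified correctly at the given threshold."""
    correct = sum(
        1 for s, y in zip(scores, labels) if (s >= threshold) == (y == 1)
    )
    return correct / len(labels)


# Toy data: one clean sample (0.6) scores higher than one malware sample (0.4).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

strict = threshold_for_fpr(scores, labels, max_fpr=0.0)
strict_accuracy = accuracy_at(scores, labels, strict)
```

With a strict cap, the threshold has to move above the highest-scoring clean sample, so malware samples that score below it are missed. This is the general mechanism by which a tight false positive budget lowers overall accuracy.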
Web Application for Managing and Classifying Information from Distributed Sources
Vrána, Pavel ; Chmelař, Petr (referee) ; Drozd, Michal (advisor)
This master's thesis deals with data mining techniques and the classification of data into specified categories. The goal of the thesis is to implement a web portal for the administration and classification of data from distributed sources. To achieve this goal, it is necessary to test different methods and find the most appropriate one for classifying web articles. Based on the results obtained, an automated application will be developed for downloading and classifying data from different sources, which could ultimately replace a user who would otherwise perform all of these tasks manually.
Intelligent Mailbox
Pohlídal, Antonín ; Drozd, Michal (referee) ; Chmelař, Petr (advisor)
This master's thesis deals with the use of text classification for sorting incoming emails. First, it describes Knowledge Discovery in Databases and analyzes in detail text classification with selected methods. It then describes email communication and the SMTP, POP3 and IMAP protocols. The next part contains the design of a system that classifies incoming emails and describes the related technologies, i.e. Apache James Server, PostgreSQL and RapidMiner. The implementation of all necessary components is then described. The last part contains experiments with the email server using the Enron Dataset.
Using of Data Mining Method for Analysis of Social Networks
Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor)
The thesis discusses data mining of social media. It gives an introduction to the topic of data mining and possible mining methods. The thesis also explores social media and social networks, what they are able to offer and what problems they bring. Three different APIs of three social networking sites are examined, together with the opportunities they provide for data mining. Techniques of text mining and document classification are explored. The implementation of a web application that mines data from the social site Twitter using the SVM algorithm is described. The implemented application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments executed both in the RapidMiner software and in the implemented web application are then proposed and their results examined.
Using Data Mining in Various Industries
Fabian, Jaroslav ; Novotný, Jakub (referee) ; Kříž, Jiří (advisor)
This master’s thesis concerns the use of data mining techniques in the banking, insurance and shopping centre industries. The thesis theoretically describes algorithms and the CRISP-DM methodology dedicated to data mining processes. Using this theoretical knowledge and these methods, the thesis suggests possible solutions for various industries within business intelligence processes.
