National Repository of Grey Literature 7 records found  Search took 0.01 seconds. 
Using of Data Mining Method for Analysis of Social Networks
Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor)
Thesis discusses data mining the social media. It gives an introduction about the topic of data mining and possible mining methods. Thesis also explores social media and social networks, what are they able to offer and what problems do they bring. Three different APIs of three social networking sites are examined with their opportunities they provide for data mining. Techniques of text mining and document classification are explored. An implementation of a web application that mines data from social site Twitter using the algorithm SVM is being described. Implemented application is classifying tweets based on their text where classes represent tweets' continents of origin. Several experiments executed both in RapidMiner software and in implemented web application are then proposed and their results examined.
Optimization of Heuristic Analysis of Executable Files
Wiglasz, Michal ; Křoustek, Jakub (referee) ; Hruška, Tomáš (advisor)
This BSc Thesis was performed during a study stay at the Universita della Svizzera italiana, Swiss. This thesis describes the implementation of a classification tool for detection of unknown malware based on their behaviour which could replace current solution, based on manually chosen attributes'scores and a threshold. The database used for training and testing was provided by AVG Technologies company, which specializes in antivirus and security systems. Five different classifiers were compared in order to find the best one for implementation: Naive Bayes, a decision tree, RandomForrest, a neural net and a support vector machine. After series of experiments, the Naive Bayes classifier was selected. The implemented application covers all necessary steps: attribute extraction, training, estimation of the performance and classification of unknown samples. Because the company is willing to tolerate false positive rate of only 1% or less, the accuracy of the implemented classifier is only 61.7%, which is less than 1% better than the currently used approach. However it provides automation of the learning process and allows quick re-training (in average around 12 seconds for 90 thousand training samples).
Artificial Intelligence Document Classification
Molnár, Ondřej ; Kačic, Matej (referee) ; Třeštíková, Lenka (advisor)
This paper deals with document classification using artificial intelligence. It describes the principles of classification and machine learning. It also introduces AI methods and presents Naive Bayes classification method in detail. Provides practical implementation of the classifier in MS Office and discusses other possible extensions.
Artificial Intelligence Document Classification
Molnár, Ondřej ; Kačic, Matej (referee) ; Třeštíková, Lenka (advisor)
This paper deals with document classification using artificial intelligence. It describes the principles of classification and machine learning. It also introduces AI methods and presents Naive Bayes classification method in detail. Provides practical implementation of the classifier in MS Office and discusses other possible extensions.
Sentiment Analysis of Customer Reviews
Hrabák, Jan ; Helman, Karel (advisor) ; Malá, Ivana (referee)
This thesis is focused on sentiment analysis of unstructured text and its practical application on the real data downloaded from website Yelp.com The objectives of the theoretical part of this thesis is to sum up the information related to history, methods and possible applications of sentiment analysis. A reader is acquainted with important terms and processes of sentiment analysis. Theoretical part is focused on Naive Bayes classifier, that will be used in practical part of this thesis. In practical part there is detailed description of data set, construction and testing of model. At the end there are presented pros and cons of the chosen model and described some possibilities of its usage.
Optimization of Heuristic Analysis of Executable Files
Wiglasz, Michal ; Křoustek, Jakub (referee) ; Hruška, Tomáš (advisor)
This BSc Thesis was performed during a study stay at the Universita della Svizzera italiana, Swiss. This thesis describes the implementation of a classification tool for detection of unknown malware based on their behaviour which could replace current solution, based on manually chosen attributes'scores and a threshold. The database used for training and testing was provided by AVG Technologies company, which specializes in antivirus and security systems. Five different classifiers were compared in order to find the best one for implementation: Naive Bayes, a decision tree, RandomForrest, a neural net and a support vector machine. After series of experiments, the Naive Bayes classifier was selected. The implemented application covers all necessary steps: attribute extraction, training, estimation of the performance and classification of unknown samples. Because the company is willing to tolerate false positive rate of only 1% or less, the accuracy of the implemented classifier is only 61.7%, which is less than 1% better than the currently used approach. However it provides automation of the learning process and allows quick re-training (in average around 12 seconds for 90 thousand training samples).
Using of Data Mining Method for Analysis of Social Networks
Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor)
Thesis discusses data mining the social media. It gives an introduction about the topic of data mining and possible mining methods. Thesis also explores social media and social networks, what are they able to offer and what problems do they bring. Three different APIs of three social networking sites are examined with their opportunities they provide for data mining. Techniques of text mining and document classification are explored. An implementation of a web application that mines data from social site Twitter using the algorithm SVM is being described. Implemented application is classifying tweets based on their text where classes represent tweets' continents of origin. Several experiments executed both in RapidMiner software and in implemented web application are then proposed and their results examined.

Interested in being notified about new results for this query?
Subscribe to the RSS feed.