National Repository of Grey Literature: 36 records found, showing 1 - 10
Data Mining Methods for Text Analysis
Kozák, Ondřej ; Marcoň, Petr (referee) ; Dohnal, Přemysl (advisor)
This bachelor thesis explores the current methodology and possibilities of text mining and applies a selection of its methods. The thesis describes methods for preprocessing, for converting text to a vector space, and for text analysis, and discusses their possible applications. Different preprocessing methods were applied to the text, and the conversion to vector space was then demonstrated using simple methods such as BOW, bag of n-grams, and TF-IDF, as well as the machine learning methods FastText and GloVe. LSA, LDA, TextRank, and cosine similarity were then applied to the resulting vectors to extract information from the text.
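As a rough illustration of the pipeline this abstract names (bag of words, TF-IDF weighting, cosine similarity), the following is a minimal sketch in plain Python. It is not taken from the thesis: the whitespace tokenizer, the smoothed IDF formula, the function names, and the three sample sentences are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)      # bag-of-words counts
        total = len(doc) or 1      # avoid division by zero for empty docs
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, for illustration only.
docs = [
    "text mining extracts information from text",
    "data mining extracts patterns from data",
    "speech recognition uses language models",
]
vocab, vectors = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
```

Here the first two sample sentences share several terms, so `sim_01` is positive, while the first and third share none, so `sim_02` is zero. A real system would replace the tokenizer with proper preprocessing such as stop-word removal and stemming or lemmatization.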
Comparison of Classification Methods
Dočekal, Martin ; Zendulka, Jaroslav (referee) ; Burgetová, Ivana (advisor)
This thesis compares classification methods. First, classification methods based on machine learning are described; then a classifier comparison system is designed and implemented. The thesis also describes several classification tasks and datasets on which the designed system is tested. The classification tasks are evaluated using standard metrics. The thesis also presents the design and implementation of a classifier based on the principle of evolutionary algorithms.
Using of Data Mining Method for Analysis of Social Networks
Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor)
This thesis discusses data mining of social media. It introduces data mining and the available mining methods. It also explores social media and social networks, what they can offer, and what problems they bring. The APIs of three social networking sites are examined with regard to the opportunities they provide for data mining. Techniques of text mining and document classification are explored. The thesis describes the implementation of a web application that mines data from the social site Twitter using the SVM algorithm. The application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments, executed both in the RapidMiner software and in the implemented web application, are then proposed and their results examined.
Mining of Textual Data from the Web for Speech Recognition
Kubalík, Jakub ; Plchot, Oldřich (referee) ; Mikolov, Tomáš (advisor)
The primary goal of this project was to study language modeling for speech recognition and techniques for obtaining textual data from the Web. The text presents the basic techniques of speech recognition and describes in more detail language models based on statistical methods. In particular, the thesis deals with criteria for evaluating the quality of language models and speech recognition systems. The text further describes data mining models and techniques, especially information retrieval. It then presents the problems associated with obtaining data from the Web and, in contrast, the Google search engine. Part of the project was the design and implementation of a system for obtaining text from the Web, which is described in due detail. However, the main goal of the thesis was to verify whether data obtained from the Web can bring any benefit to speech recognition. The techniques described thus try to find the optimal way to use data obtained from the Web to improve both sample language models and models deployed in real recognition systems.
Advanced Machine-Learning Methods for Text Classification
Dočekal, Martin ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This thesis deals with advanced machine-learning methods for text classification. First, these methods are described, and then a text classification system is created based on them. The system also provides tools for document preprocessing and classifier evaluation. The thesis describes the use of the system in a real-life task.
Semantic Similarity of Texts
Bradáč, Václav ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This paper deals with determining the semantic similarity of texts, focusing on scalability. Part of the work is a theoretical overview of the tools used to implement the system on test data. The tested corpus contains expert articles in English. The aim is to analyze these articles, modified to facilitate the analysis of their semantic analogues. One of the most utilized tools is the representation of data in a vector space model.
Events and Places Agregation and Suggestions from Facebook
Dubeň, Matej ; Plchot, Oldřich (referee) ; Szőke, Igor (advisor)
The aim of this bachelor thesis is to explain the design and implementation of an Android application, "Let's Go Out", which can recommend Facebook events and places to the user. The recommendation uses a hybrid recommender system approach that combines collaborative filtering with content-based recommendation, tracks the user's interaction with the application and, based on the recorded data, adapts the recommendation process. The thesis also describes the testing process, which compares the recommender systems of competing applications and points out achievements.
Classification Framework
Koroncziová, Dominika ; Otrusina, Lubomír (referee) ; Kouřil, Jan (advisor)
The goal of this work is the design and implementation of machine learning software based on the RapidMiner library. The finished application integrates the most commonly used algorithms and processes implemented in RapidMiner into an easily usable program. The application contains a simple command line interface, as well as a graphical interface to simplify the selection of multiple parameters. The program also provides a tool for creating standalone programs that can be used for classification with a pre-trained model. Beyond the original requirements, the ability to work with textual data from Wikipedia was also implemented, providing a tool for downloading and preprocessing the data for use as training input. This text focuses on the specifics of the algorithms and classifiers used and on their features and uses, and describes the design and implementation of the system. As part of this work, several tests were run to validate the efficiency and functionality of the program. The test results are included at the end of the thesis.
DNS Data Analysis for Mobile Device Identification Purposes
Sporni, Alex ; Bartík, Vladimír (referee) ; Burgetová, Ivana (advisor)
This bachelor's thesis deals with the identification of mobile devices based on DNS data analysis. The thesis provides a theoretical introduction to the computer communication model and explains the importance of DNS in network communication between devices. It also presents the provided data sets, which contain real communication of mobile devices. These data sets must be parsed with a suitable technique and stored in a database to make the data easier to manipulate in the later stages of implementation. The work further describes individual techniques of data processing. It also describes in detail the methodologies for evaluating relevance using TF-IDF and the application of cosine similarity to identify the mobile devices. The main output of this work is the evaluation of the achieved results.
Analysis of Mobile Devices Network Communication Data
Abraham, Lukáš ; Bartík, Vladimír (referee) ; Burgetová, Ivana (advisor)
The work first describes the DNS and SSL/TLS protocols, focusing mainly on communication between devices using these protocols. It then covers data preprocessing and data cleaning. The thesis further deals with basic data mining techniques such as data classification, association rules, information retrieval, regression analysis, and cluster analysis. The next chapter discusses how to identify mobile devices on the network. The data sets, which contain data collected from communication over the above-mentioned protocols and are used in the practical part, are then evaluated. The thesis then presents the design of a system for analyzing network communication data, describing the libraries used and the entire system implementation. Finally, a large number of experiments are performed and evaluated.
