National Repository of Grey Literature: 77 records found, showing records 48 - 57
Rating of IT services through analysis of unstructured data
Kovykov, Maxim ; Vencovský, Filip (advisor) ; Bruckner, Tomáš (referee)
The main topic of this thesis is text mining and the rating of services through summarization of unstructured text. The main goal is to describe a method for service rating, based on previous research. The described method is then applied to real data. Another goal is to describe the toolset needed to fulfill these goals; this toolset is then used to implement the described method. The main contribution of this thesis is the implementation and application of the method on real data. The thesis is split into two parts: theory and practical application. Outputs of the practical application are provided as an appendix.
Statistical methods in stylometry
Dupal, Pavel ; Kaspříková, Nikola (advisor) ; Šulc, Zdeněk (referee)
The aim of this thesis is to provide an overview of some of the commonly used methods in the area of authorship attribution (stylometry). The text begins with a recap of the field's history from the end of the 19th century to the present, and the required terminology from the field of text mining is presented and explained. What follows is a list of selected methods from the field of multidimensional statistics (principal components analysis, cluster analysis) and machine learning (Support Vector Machines, Naive Bayes) and their application to stylometric problems, including several methods created specifically for use in this field (bootstrap consensus tree, contrast analysis). Finally, these same methods are applied to a practical problem of authorship verification based on a corpus built from the works of four internet writers.
Application of text mining methods for analysis of users movie reviews
Palatínus, Vojtěch ; Matějka, Martin (advisor) ; Novotný, Ota (referee)
This thesis defines the challenges of working with unstructured data. It focuses specifically on the transformation of unstructured data into structured data using text mining methods, and takes a closer look at the so-called Big Data phenomenon. The goal of the thesis is to introduce the problems that occur when working with unstructured data, to show how such data can be transformed into a structured format using text mining methods, and to analyze the mined data from user reviews published on the website of The Internet Movie Database. The aim is to familiarize the reader with unstructured data and to demonstrate, by example, how text mining methods can be used to mine relevant information from this type of data.
Mining texts at the discourse level
Van de Moosdijk, Sara Francisca ; Pecina, Pavel (advisor) ; Novák, Michal (referee)
Linguistic discourse refers to the meaning of larger text segments, and could be very useful for guiding attempts at text mining such as document selection or summarization. The aim of this project is to apply discourse information to Knowledge Discovery in Databases. As far as we know, this is the first attempt at combining these two very different fields, so the goal is to create a basis for this type of knowledge extraction. We approach the problem by extracting discourse relations using unsupervised methods, and then model the data using pattern structures in Formal Concept Analysis. Our method is applied to a corpus of medical articles compiled from PubMed. This medical data can be further enhanced with concepts from the UMLS MetaThesaurus, which are combined with the UMLS Semantic Network to apply as an ontology in the pattern structures. The results show that despite having a large amount of noise, the method is promising and could be applied to domains other than the medical domain.
How to Create Self-Driven Education: The Social Web & Social Sciences, Coursera & Khan Academy 2014 Case Study
Růžička, Jakub ; Remr, Jiří (advisor) ; Soukup, Petr (referee)
This diploma thesis is concerned with the possibilities of employing social web data in the social sciences. Its theoretical part describes the changes in education in the context of the dynamics of contemporary society within three fundamental (interrelated) dimensions of technology (the cause and/or the tool for the change), work (new models of collaboration), and economics (sustainability of free & open-source business models). The main methodological part of the thesis is focused on the issues of sampling, sample representativeness, validity & reliability assessment, ethics, and data collection of the emerging social web research in social sciences. The research part includes illustrative social web analyses and conclusions of the author's 2014 Coursera & Khan Academy on the Social Web research and provides the full research report in its attachment to compare its results to the theoretical part in order to provide a "naive" (as derived from the social web mentions and networks) answer to the fundamental question: "How to Create Self-Driven Education?"
Vulnerability Reports Analysis and Management
Domány, Dušan ; Toropila, Daniel (advisor) ; Galgonek, Jakub (referee)
Vulnerabilities in software products can often represent a significant security threat if they are discovered by malicious attackers. It is therefore important to identify these vulnerabilities and report their presence to the responsible persons before they are exploited. The number of security reports about discovered vulnerabilities in various software products has grown rapidly over the last decade, and it is becoming more and more difficult to process all of the incoming reports manually. This work discusses various methods that can be used to automate several important processes in collecting and sorting the reports. The reports are analyzed in various ways, including text mining techniques, and the results of the analysis are applied in the form of a practical implementation.
Sentiment Analysis with Use of Data Mining
Sychra, Martin ; Burget, Radek (referee) ; Bartík, Vladimír (advisor)
The theme of the work is sentiment analysis, especially in terms of informatics (marginally from a linguistic point of view). The linguistic part discusses the term sentiment and language methods for its analysis, e.g. lemmatization, POS tagging, using a list of stopwords, etc. More attention is paid to the structure of the sentiment analyzer, which is based on several machine learning methods (support vector machines, Naive Bayes and maximum entropy classification). On the basis of the theoretical background, a functional analyzer is designed and implemented. The experiments are focused mainly on comparing the classification methods and on the benefits of using the individual preprocessing methods. The success rate of the constructed classifier reaches up to 84 % in cross-validation.
Automatic recognition of meaning in texts
Jeleček, Jiří ; Dvořák, Pavel (referee) ; Povoda, Lukáš (advisor)
This work designs and implements a system that uses text mining techniques to detect emotions in Czech, English and German texts. Because the system is built mostly on machine learning techniques, a training set was designed and created, which was later used to build the classifier model using the selected algorithms.
Gender recognition from the text data
Mačát, Jakub ; Burda, Karel (referee) ; Červenec, Radek (advisor)
This bachelor's thesis focuses on gender identification from text in the form of an e-mail, and on contemporary techniques of data mining and text mining, including their advantages, disadvantages and possible uses. A program for recognizing gender was implemented in Java. The processing of various learning methods is demonstrated in the program RapidMiner. For both programs, the basic attributes, the methods used and the operators used in the implementation are described. The programs were tested on real data. Methods for extending the programs are then mentioned. Finally, examples are given of how the programs process the stated assignment.
Metadata Extraction from Scientific Papers
Lokaj, Tomáš ; Dytrych, Jaroslav (referee) ; Otrusina, Lubomír (advisor)
This work deals with metadata extraction from scientific papers. It gives a general description of information extraction, focusing on the processing of text documents. It also presents the program clanky2meta.py, created by the author and designed to search for relevant information in scientific publications. The work concludes with a comparison of systems dealing with the same issue, especially the CiteSeerX system.
