keywords:"Text mining" - Search Results - Digital Repository


	Automatic Creation of the Publication Index Strachota, Tomáš ; Černocký, Jan (referee) ; Smrž, Pavel (advisor) The goal of this thesis is to survey the potential of common language processing methods for text indexing. A prototype of an automatic index-building system will be built and tested on gathered data. The direction of further development will be set based on the test results.
	Representation of Text and Its Influence on Categorization Šabatka, Ondřej ; Chmelař, Petr (referee) ; Bartík, Vladimír (advisor) The thesis deals with machine processing of textual data. The theoretical part describes issues related to natural language processing and introduces different ways of pre-processing and representing text. The thesis also focuses on the use of N-grams as features for document representation and describes some algorithms used for their extraction. The next part outlines the classification methods used. In the practical part, an application for pre-processing and creating different textual data representations is designed and implemented. The experiments analyse the influence of these representations on the accuracy of classification algorithms.
	Text Classification with the SVM Method Synek, Radovan ; Burget, Radek (referee) ; Bartík, Vladimír (advisor) This thesis deals with text mining. It focuses on the problem of document classification and related techniques, mainly data preprocessing. The project also introduces the SVM method, which was chosen for classification, and covers the design and testing of the implemented application.
	Text Data Clustering Leixner, Petr ; Burgetová, Ivana (referee) ; Bartík, Vladimír (advisor) Text data clustering can be used to analyse, navigate, and structure large sets of text or hypertext documents. The basic idea is to group the documents into a set of clusters on the basis of their similarity. The well-known methods of text clustering, however, do not really solve the specific problems of text clustering, such as the high dimensionality of the input data, the very large size of the databases, and the understandability of the cluster descriptions. This work addresses these problems and describes a modern method of text data clustering based on frequent term sets, which tries to overcome the deficiencies of other clustering methods.
	Mining of Textual Data from the Web for Speech Recognition Kubalík, Jakub ; Plchot, Oldřich (referee) ; Mikolov, Tomáš (advisor) The initial goal of this project was to study language modeling for speech recognition and techniques for obtaining textual data from the Web. The text presents the basic techniques of speech recognition and describes in more detail language models based on statistical methods. In particular, the work deals with criteria for evaluating the quality of language models and speech recognition systems. The text also describes models and techniques of data mining, especially information retrieval. It then presents the problems associated with obtaining data from the Web and, in contrast, introduces the Google search engine. Part of the project was the design and implementation of a system for obtaining text from the Web, which is described in detail. However, the main goal of the work was to verify whether data obtained from the Web can bring any benefit to speech recognition. The described techniques therefore try to find the optimal way to use data obtained from the Web to improve both sample language models and models deployed in real recognition systems.
	Textual Data Clustering Methods Miloš, Roman ; Burgetová, Ivana (referee) ; Bartík, Vladimír (advisor) Clustering of text data is one of the tasks of text mining. It divides documents into categories based on their similarities, and these categories make the documents easier to search. This thesis describes the current methods used for text document clustering. From these methods we chose Simultaneous keyword identification and clustering of text documents (SKWIC), which should achieve better results than standard clustering algorithms such as k-means. An application implementing this algorithm is designed and implemented. Finally, we compare SKWIC with the k-means algorithm.
	Derivation of Dictionary for Process Inspector Tool on SharePoint Platform Pavlín, Václav ; Masařík, Karel (referee) ; Kreslíková, Jitka (advisor) This master's thesis presents methods for mining important pieces of information from text. It analyses the problem of term extraction from a large document collection and describes an implementation in C# with Microsoft SQL Server. The system uses stemming and a number of statistical methods for term extraction. The project also compares the methods used and proposes a process for dictionary derivation.
	Keyword Extraction from Documents Matička, Jiří ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor) This thesis deals with the automated extraction of keywords from documents. Its goal is to design and implement an application that can extract an appropriate set of keywords related to the contents of a document. The major requirements for the application are speed and accuracy. The first part of the thesis therefore covers existing approaches and classifies them in detail according to various criteria. The second part focuses on selecting one of these methods for keyword extraction and describing in detail how it works. The following parts contain a detailed design of the application and its implementation. The last chapter tests the application on a set of text documents and evaluates the results of the extraction process.
	Using of Data Mining Method for Analysis of Social Networks Novosad, Andrej ; Očenášek, Pavel (referee) ; Bartík, Vladimír (advisor) The thesis discusses data mining of social media. It introduces the topic of data mining and possible mining methods. It also explores social media and social networks, what they can offer, and what problems they bring. The APIs of three social networking sites are examined along with the opportunities they provide for data mining. Techniques of text mining and document classification are explored. The implementation of a web application that mines data from the social networking site Twitter using the SVM algorithm is described. The application classifies tweets based on their text, where the classes represent the tweets' continents of origin. Several experiments, run both in RapidMiner and in the implemented web application, are then proposed and their results examined.
	Extraction of Semantic Relations from Text Schmidt, Marek ; Burget, Radek (referee) ; Smrž, Pavel (advisor) This thesis deals with the extraction of semantic relations from English text. It focuses on exploiting a dependency parser. A method based on syntactic patterns is proposed and evaluated, along with several statistical methods over syntactic features. The methods are applied to the extraction of the hypernymy relation and evaluated against the WordNet thesaurus. A system for extracting semantic relations from text is designed and implemented based on these methods.
