National Repository of Grey Literature 77 records found  previous11 - 20nextend  jump to record: Search took 0.01 seconds. 
Gender recognition from the text data
Mačát, Jakub ; Burda, Karel (referee) ; Červenec, Radek (advisor)
This bacheor`s work is focused on gender identification from a text just from an e-mail`s form and also contemporary techniques of data mining and text mining. The technique`s advantages and disadvantages and options of use. There was realized a program for recognizing gender in Java. In a program Rapid Miner is demostrated processing various learning methods. By both programs thete are described their basic attributes, used methods and operators used in the implementation. The programs were tested ona real data. Then there are mentioned methods for program`s extends. eventually there are given examples as the programs process stated assignment.
Automatic recognition of meaning in texts
Jeleček, Jiří ; Dvořák, Pavel (referee) ; Povoda, Lukáš (advisor)
As part of this work it was designed and implemented a system using data mining techniques from the text in order to detect emotions in Czech, English and German language texts. Because the system is built mostly on machine learning techniques, was designed and created training set, which was later used to build the model classifier using the selected algorithms.
Metadata Extraction from Scientific Papers
Lokaj, Tomáš ; Dytrych, Jaroslav (referee) ; Otrusina, Lubomír (advisor)
This work deals with the Metadata Extraction from Scienti c Papers. There is generally described issue of information extraction, focusing on the processing of text documents. There is also presented programme clanky2meta.py designed to search for relevant  information in scienti c publication, created by the author. At the end of this work is a comparsion of systems dealing with the same issue, especially with the CiteSeerX system.
Mining of Textual Data from the Web for Speech Recognition
Kubalík, Jakub ; Plchot, Oldřich (referee) ; Mikolov, Tomáš (advisor)
Prvotním cílem tohoto projektu bylo prostudovat problematiku jazykového modelování pro rozpoznávání řeči a techniky pro získávání textových dat z Webu. Text představuje základní techniky rozpoznávání řeči a detailněji popisuje jazykové modely založené na statistických metodách. Zvláště se práce zabývá kriterii pro vyhodnocení kvality jazykových modelů a systémů pro rozpoznávání řeči. Text dále popisuje modely a techniky dolování dat, zvláště vyhledávání informací. Dále jsou představeny problémy spojené se získávání dat z webu, a v kontrastu s tím je představen vyhledávač Google. Součástí projektu byl návrh a implementace systému pro získávání textu z webu, jehož detailnímu popisu je věnována náležitá pozornost. Nicméně, hlavním cílem práce bylo ověřit, zda data získaná z Webu mohou mít nějaký přínos pro rozpoznávání řeči. Popsané techniky se tak snaží najít optimální způsob, jak data získaná z Webu použít pro zlepšení ukázkových jazykových modelů, ale i modelů nasazených v reálných rozpoznávacích systémech.
Automatic Creation of the Publication Index
Strachota, Tomáš ; Černocký, Jan (referee) ; Smrž, Pavel (advisor)
The goal of this thesis is to survey potential of common language processing methods for text indexing. A prototype of automatic index-building system will be made and tested on gathered data. A direction for the next developement will be set based on the results of the tests.
Identifying Entity Types and Attributes Across Languages
Švub, Daniel ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
The target of this thesis is to analyze articles on the Wikipedia internet encyclopedia and to convert their text written in natural language into a structured database of persons, places and other entities. The essence of the implemented program is the determination of the type of entity based on its typical characteristics, and the extraction of the most important attributes of this entity in the Czech and Slovak languages. The result of this task is a knowledge base allowing simple searching and sorting of information. Thanks to its easy extensibility, it is possible to add identification of other types of entities and other features to the program, as well as a support of other languages.
Textual Data Clustering Methods
Miloš, Roman ; Burgetová, Ivana (referee) ; Bartík, Vladimír (advisor)
Clustering of text data is one of tasks of text mining. It divides documents into the different categories that are based on their similarities. These categories help to easily search in the documents. This thesis describes the current methods that are used for the text document clustering. From these methods we chose Simultaneous keyword identification and clustering of text documents (SKWIC). It should achieve better results than the standard clustering algorithms such as k-means. There is designed and implemented an application for this algorithm. In the end, we compare SKWIC with a k-means algorithm.
Extraction of Semantic Relations from Text
Schmidt, Marek ; Burget, Radek (referee) ; Smrž, Pavel (advisor)
Extraction of semantic relations from English text is the topic of this thesis. It focuses on exploitation of a dependency parser. A method based on syntactic patterns is proposed and evaluated in addition to evaluation of several statistical methods over syntactic features. It applies the methods for extraction of a hypernymy relation and evaluates it on the WordNet thesaurus. A system for extraction of semantic relations from text is designed and implemented based on these methods.
Text Mining Based on Artificial Intelligence Methods
Povoda, Lukáš ; Tučková,, Jana (referee) ; Brezany, Peter (referee) ; Burget, Radim (advisor)
This work deals with the problem of text mining which is becoming more popular due to exponential growth of the data in electronic form. The work explores contemporary methods and their improvement using optimization methods, as well as the problem of text data understanding in general. The work addresses the problem in three ways: using traditional methods and their optimizations, using Big Data in train phase and abstraction through the minimization of language-dependent parts, and introduction of the new method based on the deep learning which is closer to how human reads and understands text data. The main aim of the dissertation was to propose a method for machine understanding of unstructured text data. The method was experimentally verified by classification of text data on 5 different languages – Czech, English, German, Spanish and Chinese. This demonstrates possible application to different languages families. Validation on the Yelp evaluation database achieve accuracy higher by 0.5% than current methods.
Text mining focused on clustering and fuzzy clustering methods
Zubková, Kateřina ; Karpíšek, Zdeněk (referee) ; Žák, Libor (advisor)
This thesis is focused on cluster analysis in the field of text mining and its application to real data. The aim of the thesis is to find suitable categories (clusters) in the transcribed calls recorded in the contact center of Česká pojišťovna a.s. by transferring these textual documents into the vector space using basic text mining methods and the implemented clustering algorithms. From the formal point of view, the thesis contains a description of preprocessing and representation of textual data, a description of several common clustering methods, cluster validation, and the application itself.

National Repository of Grey Literature : 77 records found   previous11 - 20nextend  jump to record:
Interested in being notified about new results for this query?
Subscribe to the RSS feed.