keywords:"text preprocessing" - Search Results - Digital Repository

	Stemming Methods Used in Text Mining Adámek, Tomáš ; Chmelař, Petr (referee) ; Bartík, Vladimír (advisor) This master's thesis describes text mining, focusing on English texts and their automatic preprocessing. The main part analyses several stemming algorithms (Lovins, Porter and Paice/Husk). Stemming is a procedure that automatically conflates semantically related terms by applying rule sets. The next part describes the design of an application for the various stemming algorithms; the application is built on the Java platform, using the Swing graphics library and an MVC architecture. A further chapter describes the implementation of the application and of the stemming algorithms. The last part describes experiments with the stemming algorithms and compares them by their effect on text classification results.
	Knowledge Discovery from Text Data in the Python Language Homola, Ján ; Hynek, Jiří (referee) ; Bartík, Vladimír (advisor) This bachelor thesis deals with knowledge discovery from text data, specifically the classification of text-based user reviews. Through experiments, the thesis examines methods for preprocessing text data and compares different classification methods on selected datasets. It concludes with an evaluation of the results of the experiments, which were performed using the implemented application.
	Scala Programming Language and Its Use for Data Analysis Kohout, Tomáš ; Bartík, Vladimír (referee) ; Zendulka, Jaroslav (advisor) This thesis compares the Scala programming language with other languages commonly used for data analysis. The languages are evaluated in the following categories: data manipulation and visualization, machine learning, and concurrent processing capabilities. The evaluation shows the strengths and weaknesses of Scala. The strengths are demonstrated on an application for email categorization.
	Processing of User Reviews Cihlářová, Dita ; Burget, Radek (referee) ; Bartík, Vladimír (advisor) People very often buy goods on the Internet that they cannot see or try, so they rely on the reviews of other customers. However, there may be too many reviews for a person to go through quickly and comfortably. The aim of this work is to offer an application that recognizes, in Czech reviews, which features of a product are commented on most and whether the commentary is positive or negative. The results can save e-shop customers a lot of time and provide useful feedback to the manufacturers of the products.
	Estimation of Emotions from a Text Dufková, Aneta ; Fajčík, Martin (referee) ; Szőke, Igor (advisor) This thesis describes the process of estimating emotions from text using machine learning. The process starts with a survey of existing methods and continues with the selection of a suitable method and experiments. It uses several datasets, combines them, and tests different text preprocessing techniques. The result is a web interface that uses the pretrained model to estimate emotions from Twitter posts.
	Assessment and implementation of text data preprocessing in neural network models Ratnasari, Febiyanti In the realm of text data processing, text preprocessing has traditionally played a significant role. However, with the growing prominence of neural network models and novel representations of textual data, the importance of text preprocessing has been relatively understated. To address this, the present research endeavors to investigate the potential benefits of employing a composite of multiple text data preprocessing techniques in conjunction with a neural network-based text processing model.
	Text Analysis in Specialized Translation: Accuracy and Error Rate Parobková, Alžbeta ; Marcoň, Petr (referee) ; Dohnal, Přemysl (advisor) This thesis surveys and applies methods of text analysis and machine translation to evaluate the quality of technical texts that were themselves translated by automatic machine translation. The practical part uses these methods to implement an algorithm for identifying and classifying errors. It also covers the application and training of a neural model for correcting these errors. The error rate and accuracy of translation by different translators are then compared not only qualitatively but also quantitatively, using standard metrics.
	Fast and Trainable Tokenizer for Natural Languages (Rychlý a trénovatelný tokenizér pro přirozené jazyky) Maršík, Jiří ; Bojar, Ondřej (advisor) ; Spousta, Miroslav (referee) In this thesis, we present a data-driven system for disambiguating token and sentence boundaries. The implemented system is highly configurable and versatile enough that it can segment unbroken Chinese text. The tokenizer relies on maximum entropy classifiers and requires a sample of tokenized and segmented text as training data. The program is accompanied by a tool that reports tokenization performance, which helps to develop and tune the tokenization process rapidly. The system was built with multi-platform libraries only, with an emphasis on speed and correctness. After a necessary survey of other tools for text tokenization and segmentation and a short introduction to maximum entropy modelling, a large part of the thesis focuses on the particular implementation we developed and its evaluation.
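
The first result in this list describes stemming as conflating semantically related terms through rule sets. As a rough illustration of that idea only, here is a minimal Python sketch of a rule-based suffix stripper. It is not the Lovins, Porter or Paice/Husk algorithm and is not taken from any of the listed theses; the rule list, the minimum stem length and the names toy_stem and conflate are all assumptions made up for this example.

```python
# Toy rule-based suffix stripper. This is NOT the Lovins, Porter or
# Paice/Husk algorithm; the rules below are illustrative assumptions.
RULES = [
    ("sses", "ss"),  # classes -> class
    ("ies", "i"),    # ponies  -> poni
    ("ing", ""),     # connecting -> connect
    ("ed", ""),      # connected  -> connect
    ("s", ""),       # connects   -> connect
]

# Assumed guard: never reduce a word to fewer than this many characters.
MIN_STEM_LEN = 3


def toy_stem(word):
    """Apply the first rule whose suffix matches and leaves a long enough stem."""
    word = word.lower()
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)] + replacement
            if len(stem) >= MIN_STEM_LEN:
                return stem
    # No rule applied: the word is returned unchanged (lowercased).
    return word


def conflate(words):
    """Group words that share the same toy stem."""
    groups = {}
    for w in words:
        groups.setdefault(toy_stem(w), []).append(w)
    return groups


if __name__ == "__main__":
    print(conflate(["connected", "connecting", "connects", "classes", "is"]))
```

Real stemmers use far larger rule sets and extra conditions on the remaining stem; the sketch only shows how a handful of rules can map several surface forms to one term.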
