National Repository of Grey Literature: 23 records found (showing records 14 - 23)
Automated Detection of Hate Speech and Offensive Language
Štajerová, Alžbeta ; Žmolíková, Kateřina (referee) ; Fajčík, Martin (advisor)
This thesis discusses the phenomena of hate speech and offensive language, their respective definitions, and their occurrence in natural language. It describes methods previously used for their detection and evaluates the available data sets suitable for the detection task. The thesis aims to provide additional detection methods and compares their results. Five models were selected in total: two are based on feature extraction and the remaining three are neural network models. I have experimentally evaluated the performance of the implemented models. The results allow a comparison of typical approaches with methods that leverage the latest findings in machine learning for the classification of hate speech and offensive language.
Deep Neural Networks Used for Customer Support Cases Analysis
Marušic, Marek ; Ryšavý, Ondřej (referee) ; Pluskal, Jan (advisor)
Artificial intelligence is remarkably popular nowadays because it can handle a variety of very complex tasks in fields such as image processing, audio processing, and natural language processing. Red Hat has so far resolved a huge number of customer requests while supporting its various products. The idea was therefore proposed to apply artificial intelligence to this data and thereby improve and speed up the process of resolving customer requests. This thesis describes the techniques used to process the data and the tasks that can be solved with deep neural networks. It also describes the various models that were created during the work to address different tasks. Their performance is compared on those tasks.
Clustering of text documents and their parts
Zápotocký, Radoslav ; Kopecký, Michal (advisor) ; Skopal, Tomáš (referee)
This thesis analyses the use of the vector-space model and data clustering approaches on parts of a single document (chapters, paragraphs, and sentences) to allow simple navigation between similar parts. A simulation application (SimDIS), written in the C# programming language, is also part of this thesis. The application implements the described model and provides tools for the visualization of vectors and clusters.
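As a rough illustration of the vector-space idea mentioned in this abstract, the sketch below represents each part of a document as a term-frequency vector and finds the most similar other part by cosine similarity. It is not taken from SimDIS (which is written in C#); the function names `term_vector`, `cosine_similarity`, and `most_similar`, the whitespace tokenization, and the use of raw term frequencies are all assumptions made for the example.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for one document part (assumed tokenization: whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty part is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(parts, index):
    """Return the index of the part most similar to parts[index]."""
    vectors = [term_vector(p) for p in parts]
    scores = [
        (cosine_similarity(vectors[index], v), i)
        for i, v in enumerate(vectors)
        if i != index
    ]
    return max(scores)[1]
```

A clustering step, as described in the abstract, would typically group parts whose pairwise similarity is high; that step is omitted here.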
Clustering of text documents and their parts
Zápotocký, Radoslav ; Kopecký, Michal (advisor) ; Skopal, Tomáš (referee)
This thesis analyses the use of the vector-space model and data clustering approaches on parts of a single document (chapters, paragraphs, and sentences). A simulation application (SimDIS), written in the C# programming language, is also part of this thesis. The application implements the adjusted model and provides tools for the visualization of vectors and clusters.
Recognition of emotions in text using artificial intelligence
Vylíčil, Radek ; Karásek, Jan (referee) ; Mašek, Jan (advisor)
This thesis deals with the recognition of emotions in text using machine learning. The text describes methods for training and testing recognition models. The main contribution of this thesis is the creation of a decision tree in the Java programming language. The resulting algorithm was integrated as a plugin into the RapidMiner tool. The thesis contains several examples prepared for execution in RapidMiner. The functionality of the decision tree was demonstrated on a purpose-built database.
Classification Framework
Koroncziová, Dominika ; Otrusina, Lubomír (referee) ; Kouřil, Jan (advisor)
The goal of this work is the design and implementation of machine learning software based on the RapidMiner library. The finished application integrates the most commonly used algorithms and processes implemented in RapidMiner into an easily usable program. The application contains a simple command line interface, as well as a graphical interface to simplify the selection of multiple parameters. The program also provides a tool to create standalone programs that can be used for classification with a pre-trained model. Beyond the original requirements, support for textual data from Wikipedia was also implemented, providing a tool for downloading and preprocessing the data for use as training input. This text focuses on the specifics of the algorithms and classifiers used and on their features and uses, and describes the design and implementation of the system. As part of this work, several tests were run in order to validate the efficiency and functionality of the program. The test results are included at the end of the thesis.
Determination of basic form of words
Šanda, Pavel ; Burget, Radim (referee) ; Karásek, Jan (advisor)
Lemmatization is an important preprocessing step for many text mining applications. The lemmatization process is similar to stemming, with the difference that it determines not only the word stem but tries to determine the basic form of the word, using the Brute Force and Suffix Stripping methods. The main aim of this paper is to present methods for the algorithmic improvement of Czech lemmatization. The created training data set is part of this paper and can be freely used in student and academic work dealing with similar problems.
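To make the suffix-stripping idea concrete, here is a minimal sketch that applies the longest matching suffix rule to a word. It is only an illustration under stated assumptions: the rule table `RULES`, the function name `strip_suffix`, and the `min_stem` parameter are hypothetical and are not taken from the paper, and the three Czech rules shown are far too few for real use.

```python
# Hypothetical (suffix, replacement) rules; illustrative only, not from the paper.
RULES = [("ami", "a"), ("ech", ""), ("y", "a")]


def strip_suffix(word, rules=RULES, min_stem=2):
    """Apply the longest matching suffix rule, keeping at least min_stem characters of stem."""
    for suffix, replacement in sorted(rules, key=lambda r: len(r[0]), reverse=True):
        stem = word[: len(word) - len(suffix)]
        if word.endswith(suffix) and len(stem) >= min_stem:
            return stem + replacement
    return word  # no rule applies: return the word unchanged
```

With these assumed rules, `strip_suffix("rybami")` and `strip_suffix("ryby")` both give `"ryba"`, while a very short word such as `"my"` is left unchanged because the remaining stem would be too short.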
Options of automated categorization of contracts
Bereš, Miroslav ; Jelínek, Ivan (advisor) ; Oškera, Radek (referee)
My bachelor thesis focuses on automatic categorization. The main goal is to examine current approaches to automatic categorization, propose a methodology for an experiment, and perform the experiment. The experiment measures the success rate of automatic categorization using machine learning. It is performed on contracts obtained from public administration web pages. The thesis is divided into two parts: a theoretical part and the experiment. The first part analyses the theory behind the subject matter and describes current approaches to automatic categorization. The second part describes the proposed methodology and the execution of the experiment. During the experiment, models are created and applied to a control group. The outputs of the experiment are categorized documents, which are used to monitor the success rate of automatic categorization. The software Apache OpenNLP is used in the experiment to measure the success rate. The theoretical part and the proposed methodology are based on a study of foreign professional literature, mostly obtained from electronic and information sources.
And the winner is... The presence of political slant in the movie production
Selep, Ján ; Stroukal, Dominik (advisor) ; Dušek, Libor (referee)
I study movie studio profit maximization based on optimizing the political language in dialogues. I explore the flexibility with which a rational firm slants the language of its movies in order to get closer to either a Democratic or a Republican customer. Using computational linguistics, I construct vectors of phrase frequency distributions based on the text of almost a decade of U.S. Congress transcripts and 457 randomly chosen movie subtitles. To measure the distance between the phrase vectors, I use the chi-square statistic and its Monte Carlo approximation. I find no evidence of political slant in movies, either in a comparison of movie studios or in a time-varying comparison of movies from different years. In addition, I construct a slant index capturing the level of political language in a movie. Using the index, I find no evidence of an impact of political language on movie revenues.
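The abstract does not spell out the exact form of the statistic, so the following is only a sketch of one common choice: the Pearson chi-square statistic for a two-row contingency table of phrase counts, which is zero when both sources use phrases in the same proportions and grows as they diverge. The function name `chi_square` and the dictionary-based input format are assumptions made for the example, and the Monte Carlo approximation mentioned in the abstract is not shown.

```python
def chi_square(counts_a, counts_b):
    """Pearson chi-square statistic for two phrase-count vectors given as dicts (phrase -> count)."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    grand_total = total_a + total_b
    if grand_total == 0:
        return 0.0  # nothing observed in either source

    statistic = 0.0
    for phrase in set(counts_a) | set(counts_b):
        observed_a = counts_a.get(phrase, 0)
        observed_b = counts_b.get(phrase, 0)
        column_total = observed_a + observed_b
        for observed, row_total in ((observed_a, total_a), (observed_b, total_b)):
            # Expected count if both sources shared one phrase distribution.
            expected = row_total * column_total / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic
```

In this sketch, `counts_a` could hold phrase counts from congressional transcripts and `counts_b` counts from one movie's subtitles; how the thesis actually groups the texts is not stated in the abstract.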
Experiments with Czech linguistic data and ILP
Dědek, Jan ; Eckhardt, Alan ; Vojtáš, Peter
In this paper we present basic experiments that we have carried out in connection with our research in the domain of the Semantic Web. These experiments should demonstrate the possibilities of employing the ILP technique for the acquisition of semantic information from the text of Czech Web pages. The experiments are preceded by a complex linguistic analysis of the texts, and the output of the linguistic tools is processed in the ILP procedure.
