National Repository of Grey Literature: 149 records found (records 21 - 30)
Syntactic Analyzer for Czech Language
Beneš, Vojtěch ; Otrusina, Lubomír (referee) ; Kouřil, Jan (advisor)
This master's thesis describes the theoretical basics, solution design, and implementation of a constituency (phrasal) parser for the Czech language, based on grouping parts of speech into phrases. The program uses a manually built and annotated sample Czech corpus to generate a probabilistic context-free grammar at runtime through machine learning. The parser implementation, based on an extended CKY algorithm, decides whether an input Czech sentence can be generated by the grammar and, if so, constructs the most probable derivation tree. The result is then compared with the expected parse to evaluate the parser's success rate.
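As an illustration of the parsing step summarized in this record, the following is a minimal sketch of plain probabilistic CKY over a grammar in Chomsky normal form. It is not the thesis's extended algorithm or its learned grammar: the toy English rules, their probabilities, and the names `BINARY`, `LEXICAL`, and `cky_parse` are assumptions made for the example.

```python
import math

# Hypothetical toy PCFG in Chomsky normal form (not taken from the thesis).
# (left child, right child) -> list of (parent, rule probability)
BINARY = {
    ("NP", "VP"): [("S", 1.0)],
    ("DET", "N"): [("NP", 1.0)],
    ("V", "NP"): [("VP", 1.0)],
}
# word -> list of (tag, probability of the tag producing that word)
LEXICAL = {
    "the": [("DET", 1.0)],
    "dog": [("N", 0.5)],
    "cat": [("N", 0.5)],
    "sees": [("V", 1.0)],
}


def cky_parse(words, start="S"):
    """Return (probability, tree) of the best parse, or None if there is none."""
    n = len(words)
    # chart[i][j] maps a nonterminal to (log probability, backpointer)
    # for the best derivation of words[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag, p in LEXICAL.get(word, []):
            chart[i][i + 1][tag] = (math.log(p), word)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = chart[i][j]
            for k in range(i + 1, j):  # split point between the two children
                for left, (lp, _) in chart[i][k].items():
                    for right, (rp, _) in chart[k][j].items():
                        for parent, p in BINARY.get((left, right), []):
                            score = math.log(p) + lp + rp
                            if parent not in cell or score > cell[parent][0]:
                                cell[parent] = (score, (k, left, right))
    if start not in chart[0][n]:
        return None  # the grammar cannot generate this sentence

    def build(i, j, label):
        # Follow backpointers to rebuild the most probable tree.
        _, back = chart[i][j][label]
        if isinstance(back, str):
            return (label, back)
        k, left, right = back
        return (label, build(i, k, left), build(k, j, right))

    return math.exp(chart[0][n][start][0]), build(0, n, start)
```

Calling `cky_parse("the dog sees the cat".split())` returns the probability of the best derivation together with a nested tuple tree, while a sentence the toy grammar cannot generate yields `None`, mirroring the accept-or-reject decision the abstract describes.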
Word Sense Disambiguation
Kraus, Michal ; Glembek, Ondřej (referee) ; Smrž, Pavel (advisor)
This master's thesis deals with word sense disambiguation for Czech. It reviews the history of the task and introduces the algorithms used: the naive Bayes classifier, the AdaBoost classifier, the maximum entropy method, and decision trees. The methods are demonstrated clearly. Subsequent parts describe the data used. The final part presents the results achieved, and the thesis closes with ideas for improving the system.
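To make the first of the listed methods concrete, here is a minimal sketch of a naive Bayes sense classifier with add-one smoothing. It is an illustration only, not the thesis's system: the functions `train` and `disambiguate`, the two senses of the Czech word "zámek" (castle, lock), and the tiny training set are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Build counts from (sense, context_words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)  # sense -> counts of context words
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
        vocab.update(context)
    return sense_counts, word_counts, vocab


def disambiguate(model, context):
    """Return the sense with the highest posterior log probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed likelihood of the context word.
                score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical training data for the ambiguous word "zámek".
model = train([
    ("castle", ["king", "tower", "garden"]),
    ("castle", ["tour", "tower", "history"]),
    ("lock", ["door", "key", "broken"]),
    ("lock", ["key", "bicycle", "steel"]),
])
```

With this toy model, `disambiguate(model, ["key", "door"])` selects the "lock" sense and `disambiguate(model, ["tower", "history"])` selects "castle".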
Data mining
Mrázek, Michal ; Sehnalová, Pavla (referee) ; Bednář, Josef (advisor)
The aim of this master's thesis is the analysis of multidimensional data. Three dimensionality reduction algorithms are introduced. The thesis shows how to process text documents using basic methods of natural language processing. The goal of the practical part is to process real-world data from an internet forum. Posted messages are transformed into a numerical representation, then projected into two-dimensional space and visualized. Topics of the messages are then discovered. In the last part, a few selected algorithms are compared.
Similarity Search in Document Collections
Jordanov, Dimitar Dimitrov ; Plchot, Oldřich (referee) ; Smrž, Pavel (advisor)
The main goal of this thesis is to evaluate the performance of the freely available Semantic Vectors package and the MoreLikeThis class from the Apache Lucene package. The thesis compares these two approaches and introduces methods that may lead to improved search quality.
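The general idea behind document similarity search can be sketched with TF-IDF vectors and cosine similarity. This is a generic illustration in plain Python, not the code of Semantic Vectors or of Lucene's MoreLikeThis; the functions `tfidf_vectors`, `cosine`, and `most_similar` and the sample documents are assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn whitespace-tokenized documents into sparse TF-IDF dictionaries."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(docs, query_index):
    """Index of the document closest to docs[query_index] (needs >= 2 docs)."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return max(scores)[1]


# Hypothetical sample collection.
docs = [
    "czech parser grammar",
    "grammar parser for czech sentences",
    "twitter sentiment analysis",
]
```

Here `most_similar(docs, 0)` returns the index of the second document, which shares several terms with the first, while the third shares none.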
Entity Knowledge Base Creation from Czech Wikipedia
Sychra, Martin ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
The aim of this thesis is to propose and implement a system for automatic extraction of named entities from the Czech Wikipedia, to create a knowledge base consisting of these entities, and to evaluate the results of the created system. The first part explains basic notions of this field and discusses related work. The main part proposes several methods of extraction and details their implementation. The following types of entities are extracted: people, places, events, and organizations. The final part of the thesis presents results, i.e., the success of the individual methods for each entity type and statistics on the extraction of individual entities across the whole Czech Wikipedia.
Pragmatic aspects of communication with chatbots
Kopecký, Michal ; Krhutová, Milena (referee) ; Haupt, Jaromír (advisor)
Chatbots, programs capable of communicating with humans, have become increasingly popular in recent years. However, because artificial intelligence is a very complex scientific discipline, it is difficult to build a robot that resembles a human in communication. This thesis provides a brief introduction to the theory of chatbots, where and how they are used, and to natural language processing technology. Several chatbots are briefly described, together with sample conversations. The main emphasis is on the pragmatic side of conversation with chatbots, in particular on adherence to the conversational maxims and to the cooperative and politeness principles. The findings are illustrated through analyses of dialogues with a chatbot in the second part of the thesis.
Twitter data analysis tool
Rýdl, Pavel ; Komosný, Dan (referee) ; Galáž, Zoltán (advisor)
This work deals with the creation of an application for automatically downloading and analyzing Twitter data using natural language processing techniques. The application is written in the Python programming language. It was developed in the Jupyter Notebook environment, where the entire application, including the GUI, was implemented. The theoretical section describes issues of data downloading and of data analysis by natural language processing. The implementation section describes the solution in several steps: creating the application on Twitter's side, downloading, preprocessing, data analysis with natural language processing techniques, and subsequent visualization. A technique that does not use natural language processing was also implemented. Testing was run on tweets that referred to US president Donald Trump.
XML Databases for Dictionary Data Management
Samia, Michel ; Dytrych, Jaroslav (referee) ; Smrž, Pavel (advisor)
This diploma thesis deals with the processing of dictionary data, especially data in XML-based formats. First, the reader is acquainted with the linguistic and lexicographical terms used in the work. Then particular types of lexicographical data formats and specific formats are introduced, and their advantages and disadvantages are discussed. Based on previously set criteria, the LMF format was chosen for the design and implementation of a Python application that focuses on intelligent merging of multiple dictionaries into one. After passing all unit tests, the application was used to process LMF dictionaries located on the faculty server of the research group for natural language processing. Finally, the advantages and disadvantages of the application are discussed, and ways of further use and extension are suggested.
Library for Support of ReReSearch System Development
Heller, Stanislav ; Otrusina, Lubomír (referee) ; Šperka, Svatopluk (advisor)
At present, the development of the ReReSearch system is significantly slowed down by mutual incompatibility of system modules, by developers repeating already known mistakes, and by poor communication between developers in general. To solve this problem, a component was needed that would implement and unify frequently performed tasks in the development of the ReReSearch system and thereby save developers' time. The result of this effort is "rrslib", a Python library intended to help everyone who works on parts of the ReReSearch project: the database, data extractors, web-based agents, crawlers, XML processing, etc. The library should make development of the ReReSearch system more consistent, faster, and more reliable.
Sentiment Analysis of Czech and Slovak Social Networks and Web Discussions
Sojka, Matěj ; Dočekal, Martin (referee) ; Smrž, Pavel (advisor)
Thanks to digitalization, the spread of opinions in the population has accelerated sharply in recent years; however, the need to understand them has not changed. The goal of this thesis was to create a system for automatic data collection from social media and web discussions and for sentiment analysis in the Czech and Slovak languages. The system has a web interface for visualizing results and configuring data analysis. It can suggest topics that it considers to occur in the selected data and can group posts based on user-defined opinions.
