National Repository of Grey Literature: 142 records found (records 11 - 20)
Word2vec Models with Added Context Information
Šůstek, Martin ; Rozman, Jaroslav (referee) ; Zbořil, František (advisor)
This thesis is concerned with explaining the word2vec models. Even though word2vec was introduced recently (2013), many researchers have already tried to extend, understand, or at least use the model because it provides surprisingly rich semantic information. This information is encoded in an N-dimensional vector representation and can be recalled by performing algebraic operations on the vectors. In addition, I suggest model modifications in order to obtain different word representations. To achieve that, I use public picture datasets. The thesis also includes parts dedicated to a word2vec extension based on a convolutional neural network.
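The idea of recalling semantic information through vector algebra can be illustrated with a minimal sketch. The vectors below are hypothetical, hand-made 3-dimensional values chosen for illustration; they are not trained word2vec embeddings and are not taken from the thesis. The helper names `cosine` and `analogy` are likewise assumptions.

```python
import math

# Hypothetical toy embeddings (hand-made, NOT trained word2vec vectors).
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding a, b, c."""
    target = [vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# "man" is to "king" as "woman" is to ...?
result = analogy("man", "king", "woman", VECTORS)
print(result)
```

With real embeddings the same arithmetic is usually run over the whole vocabulary, and the quality of the answer depends on the training corpus.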
Information Extraction from Biomedical Texts
Knoth, Petr ; Burget, Radek (referee) ; Smrž, Pavel (advisor)
Recently, there has been much effort in making biomedical knowledge, typically stored in scientific articles, more accessible and interoperable. However, the unstructured nature of such texts makes it difficult to apply knowledge discovery and inference techniques. Annotating information units in these texts with semantic information is the first step toward making the knowledge machine-analyzable. In this work, we first study methods for automatic information extraction from natural language text. We then discuss the main benefits and disadvantages of state-of-the-art information extraction systems and, as a result, adopt a machine learning approach to automatically learn extraction patterns in our experiments. Unfortunately, machine learning techniques often require a huge amount of training data, which can be laborious to gather. To address this problem, we investigate weakly supervised (bootstrapping) techniques. Finally, we show in our experiments that our machine learning methods performed reasonably well and significantly better than the baseline. Moreover, in the weakly supervised learning task we were able to substantially reduce the amount of labeled data needed to train the extraction system.
Brno Communication Agent
Křištof, Jiří ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor)
The aim of this thesis is the implementation of a communication agent that provides information about Brno. The agent uses a three-tier architecture. Question answering relies on machine learning and neural network techniques. User tests showed a success rate of 84%, and 58% of the primary users were satisfied with the system. The main benefit of the work is that it makes information about Brno easier to retrieve for the city's residents and visitors.
Syntactic Analyzer for Czech Language
Beneš, Vojtěch ; Otrusina, Lubomír (referee) ; Kouřil, Jan (advisor)
This master's thesis describes the theoretical basics, solution design, and implementation of a constituency (phrasal) parser for the Czech language, which is based on grouping parts of speech into phrases. The program works with a manually built and annotated sample Czech corpus to generate a probabilistic context-free grammar through machine learning at runtime. The parser implementation, based on an extended CKY algorithm, then decides whether an input Czech sentence can be generated by this grammar and, if so, constructs the most probable derivation tree. This result is then compared with the expected parse to evaluate the success rate of the constituency parser.
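The parsing step can be sketched with a plain probabilistic CKY routine. This is a minimal illustration, not the extended algorithm or the grammar from the thesis: the toy grammar in Chomsky normal form, the tag names, and the function names `cky_chart`, `build_tree`, and `parse` are all assumptions. The input is a sequence of part-of-speech tags.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (NOT the thesis grammar).
BINARY = {  # (left child, right child) -> [(parent, rule probability)]
    ("NP", "VP"): [("S", 1.0)],
    ("ADJ", "N"): [("NP", 0.4)],
    ("V", "NP"): [("VP", 1.0)],
}
LEXICAL = {  # POS tag -> [(nonterminal, rule probability)]
    "noun": [("N", 1.0), ("NP", 0.6)],
    "adj": [("ADJ", 1.0)],
    "verb": [("V", 1.0)],
}


def cky_chart(tags):
    """Fill the chart: (i, j) -> {symbol: (best probability, backpointer)}."""
    n = len(tags)
    chart = defaultdict(dict)
    for i, tag in enumerate(tags):
        for sym, p in LEXICAL.get(tag, []):
            chart[(i, i + 1)][sym] = (p, tag)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point
                for lsym, (lp, _) in chart[(i, k)].items():
                    for rsym, (rp, _) in chart[(k, j)].items():
                        for parent, p in BINARY.get((lsym, rsym), []):
                            prob = p * lp * rp
                            best = chart[(i, j)].get(parent, (0.0, None))[0]
                            if prob > best:  # keep only the most probable derivation
                                chart[(i, j)][parent] = (prob, (k, lsym, rsym))
    return chart


def build_tree(chart, sym, i, j):
    """Follow backpointers to rebuild the best tree as nested tuples."""
    _, back = chart[(i, j)][sym]
    if isinstance(back, str):  # lexical rule: backpointer is the POS tag
        return (sym, back)
    k, lsym, rsym = back
    return (sym, build_tree(chart, lsym, i, k), build_tree(chart, rsym, k, j))


def parse(tags, start="S"):
    """Return (probability, tree) of the best parse, or None if none exists."""
    chart = cky_chart(tags)
    if start not in chart[(0, len(tags))]:
        return None
    return chart[(0, len(tags))][start][0], build_tree(chart, start, 0, len(tags))


best = parse(["noun", "verb", "adj", "noun"])
print(best)
```

A sequence that the grammar cannot generate, such as `["verb", "verb"]`, makes `parse` return `None`, which corresponds to the parser rejecting the sentence.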
Word Sense Disambiguation
Kraus, Michal ; Glembek, Ondřej (referee) ; Smrž, Pavel (advisor)
This master's thesis deals with word sense disambiguation for Czech. It informs the reader about the history of the task and introduces the algorithms used: the naive Bayes classifier, the AdaBoost classifier, the maximum entropy method, and decision trees. The methods used are clearly demonstrated. The following parts describe the data used, and the last part presents the results achieved. The thesis closes with some ideas for improving the system.
Data mining
Mrázek, Michal ; Sehnalová, Pavla (referee) ; Bednář, Josef (advisor)
The aim of this master's thesis is the analysis of multidimensional data. Three dimensionality reduction algorithms are introduced. The thesis shows how to process text documents using basic methods of natural language processing. The goal of the practical part is to process real-world data from an internet forum. Posted messages are transformed into a numerical representation, then projected into two-dimensional space and visualized. Later on, the topics of the messages are discovered. In the last part, a few selected algorithms are compared.
Similarity Search in Document Collections
Jordanov, Dimitar Dimitrov ; Plchot, Oldřich (referee) ; Smrž, Pavel (advisor)
The main goal of this thesis is to evaluate the performance of the freely distributed Semantic Vectors package and the MoreLikeThis class from the Apache Lucene package. The thesis compares these two approaches and introduces methods that may improve search quality.
Entity Knowledge Base Creation from Czech Wikipedia
Sychra, Martin ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
The aim of this thesis is to propose and implement a system for automatic extraction of named entities from Czech Wikipedia, to create a knowledge base consisting of these entities, and to evaluate the results of the created system. The first part explains basic notions of this field and discusses related work. The main part proposes several methods of extraction and details their implementation. The following types of entities are extracted: people, places, events, and organizations. The final part of the thesis presents results, i.e., the success of the individual methods for each entity type and statistics on the extraction of the individual entities in the context of the whole Czech Wikipedia.
Pragmatic aspects of communication with chatbots
Kopecký, Michal ; Krhutová, Milena (referee) ; Haupt, Jaromír (advisor)
Chatbots, programs capable of communicating with humans, have become increasingly popular in recent years. However, because artificial intelligence is a very complex scientific discipline, it is difficult to build a robot that resembles a human in communication. This thesis provides a brief introduction to the theory of chatbots, to where and how they are used, and to natural language processing technology. Several chatbots are briefly described, together with example conversations. The main emphasis is placed on the pragmatic side of conversation with chatbots, in particular on adherence to the conversational maxims and to the cooperative and politeness principles. The findings are demonstrated through analyses of dialogues with a chatbot in the second part of the thesis.
Twitter data analysis tool
Rýdl, Pavel ; Komosný, Dan (referee) ; Galáž, Zoltán (advisor)
This work deals with the creation of an application for automatically downloading Twitter data and analyzing it with natural language processing techniques. The application is written in the Python programming language. The Jupyter Notebook development environment was used, and the entire application, including the GUI, was implemented in it. The theoretical section describes the issues of downloading data and of analyzing data with natural language processing. The implementation section describes the solution in several steps: creating the application on Twitter's side, downloading, preprocessing, data analysis with natural language processing techniques, and subsequent visualization. One technique that does not use natural language processing was also implemented. Testing was run on tweets that contained a reference to US president Donald Trump.
