National Repository of Grey Literature: 21 records found (showing 1 - 10)
Automatic detection and attribution of quotes
Ustinova, Evgeniya ; Hana, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
Quotation extraction and attribution are important practical tasks for the media, but most existing solutions are monolingual. In this work, I present a complex machine learning-based system for the extraction and attribution of direct and indirect quotations, which is trained on English and tested on Czech and Russian data. The Czech and Russian test datasets were manually annotated as part of this study. The system is compared against a rule-based baseline model. The baseline model demonstrates better precision in the extraction of quotation elements, but low recall. The machine learning-based model is better overall at extracting both separate elements of quotations and full quotations.
Natural Language Generation system writing football articles
Raffl, Dan ; Hana, Jiří (advisor) ; Holeňa, Martin (referee)
Journalism can become a tedious job when its main concern is to produce as many articles as possible, usually prioritising quantity over quality. Some articles are quite routine and need to exist only because most of the population prefers text over raw data. The idea is to ease this job by generating articles, particularly about football in the Czech language, automatically from non-linguistic data. This thesis analyses the implementation of such linguistic software and offers a brief overview of the Natural Language Generation (NLG) process. The overview focuses on the benefits and drawbacks of different approaches to NLG and describes the NLG tasks and the challenges that must be overcome in order to build a similar program producing human language (not only Czech).
Mathematical aspects of the Van der Waals equation (Matematické aspekty Van der Waalsovy rovnice)
HÁNA, Jiří
The thesis deals with cubic equations and their application in physics. The first part clarifies some basic terms, which are then used in the following chapter, which is focused on solving the cubic equations using analytical and numerical methods. The third part of the thesis presents the Van der Waals equation, shows the possibilities of calculating the critical values of the pressure, the thermodynamic temperature and the molar volume, as well as the creation of a p-v diagram using a computer. The last part of the thesis is focused on a teaching unit draft on the thesis topic using the concept of STEM. The last part also presents interesting curves on the thermodynamic surface and shows the advantages of the STEM approach.
Automatic acquisition of paradigms with minimal supervision (Automatické osvojení vzorů s minimální supervizí)
Klíč, Radoslav ; Hana, Jiří (advisor) ; Hlaváčová, Jaroslava (referee)
The thesis presents a semi-supervised morphology learner developed by extending Paramor (Monson, 2009), an unsupervised system, to accept easy-to-obtain, manually provided data in the form of inflections with a marked morpheme boundary. In addition, a hierarchical clustering framework that allows multiple sources of information to be combined was developed as part of the thesis. The approach was tested on Czech, Slovene, German and Catalan and showed an increased F-measure in comparison with the Paramor baseline.
Language Modelling for German
Tlustý, Marek ; Bojar, Ondřej (advisor) ; Hana, Jiří (referee)
The thesis deals with language modelling for German. The main concerns are the specifics of the German language that are troublesome for standard n-gram models. First, the statistical methods of language modelling are described and the relevant language phenomena of German are explained. The thesis then proposes its own variants of n-gram language models with the aim of addressing these problems. The models are trained using the standard n-gram methods as well as the maximum entropy method with n-gram features. The two approaches are compared using correlation metrics between the hand-evaluated fluency of sentences and an automatic evaluation measure, perplexity. The computational requirements are also compared. Next, the thesis presents a set of its own features that represent the counts of grammatical errors for chosen phenomena. Their success rate is verified by their ability to predict the hand-evaluated fluency. Maximum entropy models are used, along with custom models that classify using only the medians of phenomena values computed from the training data.
An HPSG-based Formal Grammar of a Core Fragment of Georgian Implemented in TRALE
Abzianidze, Lasha ; Rosen, Alexandr (advisor) ; Hana, Jiří (referee)
Georgian is remarkably different from Indo-European languages. The language has several linguistic phenomena that are challenging both from theoretical and computational points of view. In addition, it is low-resourced and insufficiently studied from the computational point of view. In the thesis, we model morphology and syntax of a core fragment of the language in a formal grammar. Namely, the formal grammar is written in the HPSG framework - one of the most powerful grammar frameworks nowadays. We also implement the grammar in TRALE - a grammar implementation platform, which is faithful to "hand-written" HPSG-based grammars. Note that this is the first application of HPSG to Georgian.
Semantic disambiguation using Distributional Semantics
Prodanovic, Srdjan ; Hana, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
In statistical models of semantics, word meanings are based solely on their distributional properties. The basic resource here is a single lexicon that can be used for various tasks, in which word meanings are represented as vectors in a vector space and word similarities as distances between their vector representations. Using these similarities, the suitability of terms in a given context can be computed and used for a range of tasks, one of which is word sense disambiguation. In this thesis, several different approaches to vector space models were investigated and implemented in order to cross-evaluate their performance on the word sense disambiguation task in the Prague Dependency Treebank.
Detection of suspicious annotations (Detekce podezřelých anotací)
Václ, Jan ; Vidová Hladká, Barbora (advisor) ; Hana, Jiří (referee)
This work describes a machine learning approach for checking part-of-speech annotation and presents its implementation - a system called MissTagger. The checking procedure covers both error detection and error correction. MissTagger employs a simplified instance-based learning algorithm in which the words in the text are treated as instances. Part-of-speech tags from a context of fixed length are selected as features; no lexical information is included. The words whose tags comprise this context are chosen based on either a linear or a dependency-tree structure of the sentence. Two languages, Czech and English, are examined in the evaluation experiments.
Text classification with limited training data
Laitoch, Petr ; Hana, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
The aim of this thesis is to minimize the manual work needed to create training data for text classification tasks. Various research areas, including weak supervision, interactive learning and transfer learning, explore how to minimize the effort of creating training data. We combine ideas from the available literature in order to design a comprehensive text classification framework that employs keyword-based labeling instead of traditional text annotation. Keyword-based labeling aims to label texts based on keywords contained in the texts that are highly correlated with individual classification labels. As noted repeatedly in previous work, coming up with many new keywords is challenging for humans. To address this issue, we propose an interactive keyword labeler that uses word similarity to guide a user in keyword labeling. To verify the effectiveness of our novel approach, we implement a minimum viable prototype of the designed framework and use it to perform a user study on a restaurant review multi-label classification problem.
