National Repository of Grey Literature: 64 records found (records 55-64)
Automatic Resolution of Pronoun Coreference in Czech
Košarko, Ondřej ; Mírovský, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
Title: Automatic Resolution of Pronoun Coreference in Czech Author: Ondřej Košarko Department: ÚFAL MFF UK Supervisor: RNDr. Jiří Mírovský, Ph.D. Supervisor's e-mail address: mirovsky@ufal.mff.cuni.cz Abstract: The aim of this thesis is to introduce a procedure for automatic pronominal coreference resolution in Czech texts. The text is morphologically and analytically annotated according to the system of the Prague Dependency Treebank. The procedure uses a machine learning method; it is trained on a set of manually annotated data from the Prague Dependency Treebank. Evaluation of the results is also part of this thesis. Keywords: pronominal coreference, automatic resolution, machine learning
Deep Automatic Analysis of English
Dušek, Ondřej ; Hajič, Jan (advisor) ; Vidová Hladká, Barbora (referee)
This thesis contains an account of our studies of deep or semantic analysis of English, particularly as described using predicate-argument structure description. Our main goal is to create a system for automatic inference of semantic relations between predicates and arguments - semantic role labeling. We developed a framework for parallel processing of our experiments, integrating third-party machine learning tools and implementing well-known as well as novel procedures. We investigated the current approaches to the problem and proposed several improvements, such as new classification features, separate handling of adverbial modifiers or special treatment for rare predicates. Based on our research, we designed and implemented our own semantic analysis system, consisting of predicate disambiguation and argument classification subtasks. We evaluated our solution using the CoNLL 2009 Shared Task English corpus.
Functional Arabic Morphology: Formal System and Implementation
Smrž, Otakar ; Vidová Hladká, Barbora (advisor) ; Hajič, Jan (referee) ; Habash, Nizar Y. (referee)
Functional Arabic Morphology is a formulation of the Arabic inflectional system seeking the working interface between morphology and syntax. ElixirFM is its high-level implementation that reuses and extends the Functional Morphology library for Haskell. Inflection and derivation are modeled in terms of paradigms, grammatical categories, lexemes and word classes. The computation of analysis or generation is conceptually distinguished from the general-purpose linguistic model. The lexicon of ElixirFM is designed with respect to abstraction, yet is no more complicated than printed dictionaries. It is derived from the open-source Buckwalter lexicon and is enhanced with information sourced from the syntactic annotations of the Prague Arabic Dependency Treebank. MorphoTrees is the idea of building effective and intuitive hierarchies over the information provided by computational morphological systems. MorphoTrees are implemented for Arabic as an extension to the Perl-based TrEd annotation environment. Encode Arabic libraries for Haskell and Perl serve for processing the non-trivial and multi-purpose ArabTeX notation that encodes Arabic orthographies and phonetic transcriptions in parallel.
Enhanced HMM Tagger and Its Application for Czech Morphological Tagging
Kypta, Tomáš ; Mírovský, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
In the present work I study the possibilities of Czech morphological tagging using a statistical tagger based on hidden Markov models (HMM tagger). In particular, I examine how the success rate of tagging is influenced by the size of the training data, the length of the tagging history, the setting of the n parameter in the n-best variant, and the reduction of the tag set in the tag history. The text is accompanied by tables of the tagger's results, including a comparison with previous results of other taggers. A supplementary CD contains the test data and the program whose results are presented here.
Syntax-based classification of meaningful Czech sentences
Rovenský, Vladimír ; Bojar, Ondřej (referee) ; Vidová Hladká, Barbora (advisor)
This thesis tries to formulate a knowledge-based algorithm for meaningful sentence classification. This is a very interesting task for applications of natural language processing, such as web search engines. "To-be-meaningful" is a feature that cannot be defined in an absolute way - we try to respect the natural language description layer system. In this approach, we pursue a layer system that goes from the morphological layer through the syntactical layer to the semantic layer - the bachelor thesis will cover the first two of the three layers. Czech will be used as the object language.
Natural Language Interface for online webcasts
Macošek, Jan ; Vidová Hladká, Barbora (referee) ; Hajič, Jan (advisor)
This text describes the development of a natural language interface for online webcasts. These webcasts are transformed from text to speech and then played by the electronic rabbit Nabaztag. The user can control it by voice commands, so the text also focuses on training acoustic models with the HTK Toolkit and on using these models to recognize speech with the Julius speech recognizer. Searching for the webcasts and their processing is also described, along with some problems that occurred during speech synthesis of sport-oriented texts.
Automatic combinations of feature templates
Dubovský, Jakub ; Vidová Hladká, Barbora (referee) ; Novák, Václav (advisor)
Searching for useful combinations of features and feature templates is not a simple task, although such combinations are a valuable tool for increasing the accuracy of machine learning. This paper suggests an algorithm for the automatic search for useful combinations of categorical features and their templates. The use of simulated annealing and a modified genetic algorithm for the search process is studied. The construction of an evaluation function for assessing categorical feature templates is presented as well. Features and feature templates are combined separately and together. The best increase in accuracy achieved by the suggested procedures on the datasets used is around 0.1 percentage points. Experiments were carried out on only two datasets, so further testing on other datasets is needed to verify the algorithm's usefulness in general. However, the experiments indicate that it can be considered a basis for a usable algorithm. A simple command-line application, developed and used for the experiments, is part of the work.
Syntactically-based classification of Czech sentences
Kríž, Vincent ; Mírovský, Jiří (referee) ; Vidová Hladká, Barbora (advisor)
Classification of syntactically meaningful sentences is a very useful task for applications of natural language processing, for example machine translation, search engines and question answering systems. Theoretical linguistic research considers language to be a system of layers. In our project, the term 'to-be-meaningful' will be specified with respect to this point of view. Namely, the morphological and syntactic layers will be considered. A knowledge-based algorithm classifying a string of Czech words as either meaningful or meaningless will be proposed and implemented. Before being classified, strings will be pre-processed by external modules. Czech will be used as the object language.
Disambiguation of Czech Morphology Using Markov Models
Dufková, Kateřina ; Podveský, Petr (advisor) ; Vidová Hladká, Barbora (referee)
In my bachelor thesis I focus on the disambiguation of Czech morphology. This task is particularly important in natural language translation, where it is part of preprocessing the text intended for translation in order to eliminate ambiguity in part of speech and other morphological categories. This ambiguity would cause problems in subsequent phases of translation or an unacceptable growth in translation time. I chose a statistical approach to this problem, which, in comparison with other possible methods, is faster, more universal and able to select a word category in all cases. My application KDTagger, which I created within the framework of this bachelor thesis, is founded on the theory of Hidden Markov Models. My aim was to create a program that would be universal with respect to operating system and manner of use. KDTagger allows experts to adjust every important linguistic parameter while remaining comfortable to use for beginners. My work also includes extensive testing of KDTagger, which I performed on Czech newspaper texts from the Prague Dependency Treebank version 2.0. The program can, however, be applied to an arbitrary natural language without even the smallest change.
Prague Dependency Treebank as an exercise book of Czech
Kučera, Ondřej ; Vidová Hladká, Barbora (advisor) ; Panevová, Jarmila (referee)
The Prague Dependency Treebank (PDT) is one of the top language corpora in the world. The aim of this work is to introduce a software system that builds an exercise book of Czech using the PDT data. Two kinds of exercises are provided: morphology (selecting correct parts of speech and their morphological categories) and sentence parsing (selecting analytical functions and dependencies between them). The PDT data cannot be used directly, though, because of the differences between the academic approach to sentence parsing and the approach used in schools. Some of the sentences have to be discarded completely, and several transformations have to be applied to the others in order to convert the original representation to the form that students are used to from school.
