National Repository of Grey Literature: 53 records found (records 34 - 43 shown)
Development Environment Extending the Dialog Management Options of AIML
Brodec, Václav ; Kuboň, Vladislav (advisor) ; Plátek, Ondřej (referee)
The AIML language was created for authoring simple chatbots and therefore lacks some features of advanced dialog systems. One of them is support for dialog management, which would be beneficial in many of the applications the language has spread into thanks to its popularity. This thesis solves the problem of implementing dialog management in pure AIML by using augmented transition networks for design and code generation. The result is a development environment that supports the chosen solution, facilitating the design of more complex bots while maintaining compatibility with standard interpreters.
Joining Segments in Czech Complex Sentences
Čech, Josef ; Kuboň, Vladislav (advisor) ; Krůza, Oldřich (referee)
Title: Joining segments in Czech sentences Author: Bc. Josef Čech Department: Institute of Formal and Applied Linguistics Supervisor: doc. RNDr. Vladislav Kuboň, Ph.D. E-mail: vk@ufal.mff.cuni.cz Abstract: This thesis builds on the segmentation of complex sentences into linguistically motivated objects (segments) and on the mutual relations between them. These relations can be used in further work with segments. The main purpose of mapping the relations is to join segments into a higher-level unit, the clause. In theory, it should then be possible to analyze each clause of a complex sentence separately, and analyzing a set of clauses should be quicker than analyzing the whole complex sentence. Segments are identified using linguistic separators and a rule-based approach. The rule-based approach proves useful for problematic relations between neighbouring segments. The thesis aims to show that the rule-based approach is the best solution for joining segments into clauses. A position tag for segments was also designed as part of this thesis; methods dealing with segments should use this tag in place of the segment itself. Keywords: segment, clause, tag, joining segments, syntactic analysis
Methods for Creating Subjectivity Lexicon for Indonesian
Franky, ; Bojar, Ondřej (advisor) ; Kuboň, Vladislav (referee)
In this work, we created subjectivity lexicons of positive and negative expressions for the Indonesian language by automatically translating English lexicons, and by intersecting and unioning the translation results. We compared the performance of the resulting lexicons using a simple prediction method that compares the number of occurrences of positive and negative expressions in a sentence. We also experimented with weighting the expressions by their frequency and relative frequency in unannotated data. A modification of the prediction method using machine learning was later used to better incorporate information that cannot be captured by the simple prediction. We showed that the lexicons were able to reach high recall but low precision when predicting whether a sentence is evaluative (positive or negative) or not (neutral). Scoring the expressions improves recall or precision, but with a comparable decrease in the other measure. The machine learning prediction was able to minimize the sensitivity of the performance to the size of the lexicon, but further experiments are required to explore the best choice of prediction method.
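The simple prediction method mentioned in this abstract (comparing how many positive and negative expressions occur in a sentence) could be sketched roughly as below. This is only an illustration, not the thesis's implementation: the toy lexicons, the whitespace tokenizer, the treatment of ties as neutral, and the function name `predict_polarity` are all assumptions made for the example.

```python
def predict_polarity(sentence, positive, negative):
    """Label a sentence by comparing counts of positive and negative lexicon hits."""
    # Naive lowercase + whitespace tokenization (an assumption of this sketch).
    tokens = sentence.lower().split()
    pos_hits = sum(1 for token in tokens if token in positive)
    neg_hits = sum(1 for token in tokens if token in negative)
    if pos_hits > neg_hits:
        return "positive"
    if neg_hits > pos_hits:
        return "negative"
    # No lexicon hits, or a tie: treated as neutral here (an assumption).
    return "neutral"


# Hypothetical toy lexicons; the thesis derives its lexicons by translating English ones.
POSITIVE = {"bagus", "senang"}
NEGATIVE = {"buruk", "sedih"}

if __name__ == "__main__":
    for text in ["film ini bagus", "cuaca buruk dan saya sedih", "ini buku"]:
        print(text, "->", predict_polarity(text, POSITIVE, NEGATIVE))
```

With single-word lexicon entries, a small lexicon tends to miss evaluative sentences, while a large automatically translated one can match many neutral sentences, which is consistent with the high-recall, low-precision behaviour the abstract reports.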
Hybrid Machine Translation Approaches for Low-Resource Languages
Kamran, Amir ; Popel, Martin (advisor) ; Kuboň, Vladislav (referee)
In recent years, corpus-based machine translation systems have produced significant results for a number of language pairs. However, for low-resource languages like Urdu, purely statistical or purely example-based methods do not perform well. On the other hand, rule-based approaches require a huge amount of time and resources for the development of rules, which makes them difficult to apply in most scenarios. Hybrid machine translation systems might be one of the solutions to overcome these problems, since they can combine the best of different approaches to achieve quality translation. The goal of the thesis is to explore different combinations of approaches and to evaluate their performance against the standard corpus-based methods currently in use. This includes: 1. Use of syntax-based and dependency-based reordering rules with Statistical Machine Translation. 2. Automatic extraction of lexical and syntactic rules using statistical methods to facilitate Transfer-Based Machine Translation. The novel element in the proposed work is the development of an algorithm that automatically learns reordering rules for English-to-Urdu statistical machine translation. Moreover, this approach can be extended to learn lexical and syntactic rules to build a rule-based machine translation system.
An Implementation of Methods of Structural Analysis of Czech Complex Sentences
Dutkevič, Jiří ; Kuboň, Vladislav (advisor) ; Holan, Tomáš (referee)
Title: An Implementation of Methods of Structural Analysis of Czech Complex Sentences Author: Jiří Dutkevič Department: Institute of Formal and Applied Linguistics Supervisor: doc. RNDr. Vladislav Kuboň, Ph.D., Institute of Formal and Applied Linguistics Abstract: This paper discusses automated analysis of complex sentences in the Czech language. It summarizes the results of preceding research, uses the method described therein for splitting complex sentences into segments using a well-defined set of separators, and proposes three methods for automatically assigning levels to segments (which also describe the relations between the segments) in sentences, based on rules presented in that research. The first method directly applies the rules presented in the referenced research papers, the second uses a genetic algorithm, and the third makes use of a neural network. The paper includes an implementation of these methods and an analysis of the results using manually annotated data from the Prague Dependency Treebank.
Searching Czech Structured Data using Stemming
Tattermusch, Jan ; Hlaváčová, Jaroslava (advisor) ; Kuboň, Vladislav (referee)
This work describes and implements a component for fulltext searching with Czech diacritics restoration and stemming support. Diacritics restoration is based on statistical principles and is context dependent. This work presents five stemmers ready for immediate use (two algorithmic stemmers and three hybrid stemmers) and discusses their properties. The component is implemented using the Apache Lucene library and provides a simple interface for querying and for insertions, deletions and updates of indexed documents. Stored documents consist of named fields with predefined data types. Besides regular fulltext queries, the component also supports non-trivial queries with additional constraints and provides a way to customize how the query result score is computed. The component's performance is sufficient for medium-load applications: approximately 50 queries per second with a repository that contains 2.7 million documents. The contribution of stemming and diacritics restoration to the quality of fulltext searching was measured using MAP and is significant.
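For readers unfamiliar with the MAP measure mentioned in this abstract, the following sketch computes mean average precision in its commonly used form. The abstract does not say which variant the thesis used, so this definition, the function names, and the toy document identifiers are assumptions for illustration only.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list against a set of relevant documents."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant document appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical queries: each pairs a ranked result list with its relevant documents.
    runs = [(["a", "b", "c"], {"a", "c"}), (["x", "y"], {"y"})]
    print(round(mean_average_precision(runs), 4))
```

Comparing this score for the same query set with and without stemming or diacritics restoration is one way such a contribution could be quantified.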
Machine Translation of Related Asian Languages
Larasati, Septina Dian ; Kuboň, Vladislav (advisor) ; Petkevič, Vladimír (referee)
This thesis presents the development of an MT system between Indonesian and Malaysian. The system uses a method of almost direct translation that exploits the similarity of the two languages. This method was previously used on a number of European language pairs. The thesis also describes the attempts to build language resources from scratch, since the languages are under-resourced.
Automatic Checking of Translation
Šimlovič, Juraj ; Kuboň, Vladislav (advisor) ; Dědek, Jan (referee)
Translation memories are becoming more and more popular with professional translators, especially in the fields of software localization and the translation of technical and official documents. Although commercial systems that employ translation memories provide some limited capabilities for automatic checking of translations, these are mostly of the simple search-and-replace type, and none of these systems provides reasonable means of applying Czech morphology while checking. Professional translators could benefit from an automatic tool that would provide more advanced rule-based checking capabilities, taking Czech and even English morphology into account. Checking not only for correct use of terminology, but also for illicit translations and the use of forbidden terms, would be useful. This thesis investigates the types of mistakes translators tend to make and reviews existing solutions for automatic translation checking for different languages. An application is then proposed and developed that attempts to find some of the most frequent mistakes made in translations into Czech, taking morphology into account while searching.
Clause analysis in Czech complex sentences
Krůza, Oldřich ; Lopatková, Markéta (referee) ; Kuboň, Vladislav (advisor)
This Master's thesis deals with the identification of clauses in morphologically annotated Czech sentences and with finding the inter-clausal relations. The task is approached as a machine-learning problem. An annotation scheme for clauses in Czech text is presented, along with a method for deriving clause-annotated data from the analytical layer of Functional Generative Description coded in the Prague Markup Language. The gathered data are used for training and evaluating a system for automated identification of clauses and their relations. A method for evaluating the results is suggested, and separate software applications created during the development are presented.
