National Repository of Grey Literature: 53 records found (records 44 - 53)
Segmentation analysis of Czech sentences
Procházka, Jan ; Holan, Tomáš (referee) ; Kuboň, Vladislav (advisor)
The objective of this work is to implement a method of segmentation analysis for the Czech language, including the creation of a list of separators. A method for dividing long sentences into clauses is also proposed and implemented. The implementation uses the Czech "Free" Morphology by Jan Hajič. The program is written in Python. The method was debugged on a corpus of 62 sentences and tested on a corpus of 80 sentences.
Automatic simplification of texts for translation (Automatické zjednodušování textů pro překlad)
Prokopová, Magdalena ; Zeman, Daniel (referee) ; Kuboň, Vladislav (advisor)
This thesis describes one of the areas where automatic simplification can be used: simplification of texts for machine translation. We start by comparing methods of automatic simplification and controlled language, describing their similarities and differences. Further on we focus only on automatic simplification used as a preprocessing step for machine translation. We describe what issues can be solved and address some of them using our own system ASOFT. A text preprocessed by ASOFT is intended to be translated by the machine translation system PC Translator. We evaluate the output of PC Translator using two automatic methods, BLEU and NIST scores, and one method of human evaluation. In the end we propose other issues that can be addressed by means of automatic simplification.
The Exploitation of Linguistic Information in EBMT
Týnovský, Miroslav ; Žabokrtský, Zdeněk (referee) ; Kuboň, Vladislav (advisor)
Example-based machine translation (EBMT) is a corpus-driven method of machine translation. It builds the translation using analogy of the input text with a translation already made. The benefit of using linguistic knowledge within EBMT is the subject of this thesis. Two language pairs are covered: Czech-English and Czech-German. The thesis covers gathering annotated parallel Czech-German data, design and implementation process of an experimental EBMT system, and the effort to improve it using linguistic knowledge. Detailed evaluation and comparison of both the baseline EBMT and the linguistically enhanced system are described. Evaluation has been done using machine and human evaluation methods. The three automatic evaluation methods are BLEU, NIST and METEOR. The linguistic enhancement of the baseline EBMT system includes comparisons of the input sentence with the examples in the translation memory based on morphology and syntax.
Symbolic derivations of functions with a single real variable
Skotnický, Stanislav ; Mírovský, Jiří (referee) ; Kuboň, Vladislav (advisor)
The main goal of this work is to create a program for the symbolic differentiation of functions of a single real variable. The program will be able to simplify arithmetic expressions and functions, and to plot functions using the improved Epsilon method. It will check that the input is correct and will support the binary operators +, -, *, / and ^, unary operators, and trigonometric functions. It may also include exponential and logarithmic functions and some basic constants such as e or pi. The program will be comfortable and easy to use.
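As an illustration of the task this abstract describes, here is a minimal sketch of symbolic differentiation with basic simplification. It is not the thesis's implementation. The tuple-based expression format and the function names `diff`, `simplify` and `evaluate` are assumptions made for this example, and division, exponential and logarithmic functions, and plotting are left out.

```python
import math
import operator

# Hypothetical expression format (an assumption of this sketch):
# a number, the string "x", or a tuple such as ("+", a, b) or ("sin", a).
BINARY = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "^": operator.pow}
UNARY = {"sin": math.sin, "cos": math.cos}


def is_number(e):
    return isinstance(e, (int, float))


def simplify(e):
    """Apply a few local rewrite rules to the top node of e."""
    if not isinstance(e, tuple) or e[0] not in BINARY:
        return e
    op, a, b = e
    # Fold constants for +, -, * (powers are left unevaluated).
    if op != "^" and is_number(a) and is_number(b):
        return BINARY[op](a, b)
    if op == "+":
        if a == 0:
            return b
        if b == 0:
            return a
    if op == "-" and b == 0:
        return a
    if op == "*":
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    if op == "^":
        if b == 0:
            return 1
        if b == 1:
            return a
    return e


def diff(e):
    """Return the derivative of e with respect to x."""
    if is_number(e):
        return 0
    if e == "x":
        return 1
    op = e[0]
    if op in ("+", "-"):  # sum and difference rule
        return simplify((op, diff(e[1]), diff(e[2])))
    if op == "*":  # product rule
        u, v = e[1], e[2]
        return simplify(("+", simplify(("*", diff(u), v)),
                         simplify(("*", u, diff(v)))))
    if op == "^":  # power rule, constant numeric exponents only
        u, n = e[1], e[2]
        if not is_number(n):
            raise ValueError("only constant exponents are supported")
        power = simplify(("*", n, simplify(("^", u, n - 1))))
        return simplify(("*", power, diff(u)))
    if op == "sin":  # chain rule
        return simplify(("*", ("cos", e[1]), diff(e[1])))
    if op == "cos":
        return simplify(("*", ("*", -1, ("sin", e[1])), diff(e[1])))
    raise ValueError(f"unsupported operator: {op!r}")


def evaluate(e, x):
    """Numerically evaluate expression e at the point x."""
    if is_number(e):
        return e
    if e == "x":
        return x
    if e[0] in BINARY:
        return BINARY[e[0]](evaluate(e[1], x), evaluate(e[2], x))
    return UNARY[e[0]](evaluate(e[1], x))
```

For example, `diff(("^", "x", 3))` yields the tree for 3 * x^2, and `evaluate` can be used to spot-check a derivative numerically at a chosen point.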
Software localization and translation tools
Dolejš, Jan ; Nemejovský, Jan (advisor) ; Kuboň, Vladislav (referee)
The theoretical part gives an overview of the history of the localisation industry and defines basic terms before going on to cover the localisation tools and companies available. It then defines the localisation process and its individual phases and provides for a classification of the translation tools available. Finally, it outlines their potential development. The practical part sets the theory against the Internet browser Mozilla Firefox v2.0 localisation case study. It deals with the practical aspects unique for localisation, i.e. the definition of text strings to be localized, data recycling from previous versions and the application of translation tools. It subsequently looks at the phases that follow localisation, i.e. the testing of the localised application and the evaluation of the localisation process. The analysis proves that an open-source community is in all respects able to provide for a product localisation on the same quality level offered by established software producers. The thesis also includes a Glossary of terms, List of relevant Internet links, Microsoft and Apple Product glossaries, Code-pages with Czech characters, a Mozilla Firefox v2.0 Product Glossary and a DVD-ROM containing trial versions of selected translation tools and Firefox browser resource files.
Deep analysis in IQA: evaluation on real users dialogues.
Ratkovic, Zorana ; Kuboň, Vladislav (advisor) ; Hoffmannová, Petra (referee)
Interactive Question Answering (IQA) is a natural and cohesive way for a user to obtain information by interacting with a system using natural language. With the advancement in Natural Language Processing, research in the field of IQA has started to focus on the role of semantics and the discourse structure in these systems. The need for a deeper analysis, which examines the syntax and semantics of the questions and the answers, is evident. Using this deeper analysis allows us to model the context of the interaction. I will look at a current closed-domain IQA system which is based on Linear Regression modeling. This system uses superficial and non-semantically motivated features. I propose adding deep analysis and semantic features in order to improve the system and show the need for such analysis. Particular attention will be placed on the so-called follow-up questions (questions that the user poses after having received some answer from the system) and the role of context. I propose that adding the linguistically heavy features will prove beneficial, thereby showing the need for such analysis in IQA systems.
The resemblance analysis of Czech texts
Cvengroš, Petr ; Kuboň, Vladislav (referee) ; Holan, Tomáš (advisor)
In the present work we study means of comparing two Czech texts. The design and the implementation principles of a program for comparing two Czech texts are described. The program compares texts using a set of various criteria, and new criteria are easy to add. The program is able to learn: it configures itself on texts given by the user. In the first part of the work we give a description of the algorithms for comparing texts and for learning. The next part is dedicated to some interesting parts of the implementation. The conclusion discusses the use of the program on real texts.
Automatic Evaluation of Parallel Bilingual Data Quality
Kolovratník, David ; Kuboň, Vladislav (advisor) ; Pecina, Pavel (referee)
Statistical machine translation is an approach that depends particularly on a huge amount of parallel bilingual data, which is used to train a translation model. The translation model works in place of a rule-based transfer, in some systems even a lexical one. It is believed that the quality of the translation may be improved with more training data. I have tried the contrary: to supply less data and observe how the translation score changes. I selected the sentence pairs that remain in the corpus according to some key: first randomly, then according to sentence length ratio, and finally according to the number of word pairs that a dictionary lists as translation pairs. I show that selection according to a suitable criterion slows the fall of the NIST and BLEU scores as the corpus size decreases, and in some cases may even lead to a better score. Decreasing the corpus size also leads to faster evaluation and lower space requirements. This may be useful when implementing a machine translation system on small devices with limited system resources.
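The two non-random selection keys named in this abstract, sentence length ratio and the number of dictionary-confirmed word pairs, can be sketched as follows. This is only an illustration under assumptions: whitespace tokenisation, the function names `length_ratio`, `dictionary_hits` and `select_pairs`, and the dictionary format (a source word mapped to a set of translations) are all hypothetical and are not taken from the thesis.

```python
def length_ratio(src, tgt):
    """Token-count ratio of the shorter to the longer sentence, in [0, 1]."""
    a, b = len(src.split()), len(tgt.split())
    if a == 0 or b == 0:
        return 0.0
    return min(a, b) / max(a, b)


def dictionary_hits(src, tgt, dictionary):
    """Count source words that have a listed translation in the target."""
    tgt_words = set(tgt.lower().split())
    return sum(
        1
        for word in src.lower().split()
        if dictionary.get(word, set()) & tgt_words
    )


def select_pairs(pairs, key, keep):
    """Keep the `keep` sentence pairs that score highest under `key`."""
    return sorted(pairs, key=key, reverse=True)[:keep]


# Toy corpus and dictionary, invented for this example.
corpus = [
    ("a b c", "x y z"),
    ("a", "x y z w v"),
    ("dům je velký", "the house is big"),
]
toy_dictionary = {"dům": {"house"}, "velký": {"big", "large"}}

by_ratio = select_pairs(corpus, lambda p: length_ratio(*p), keep=2)
by_dictionary = select_pairs(
    corpus, lambda p: dictionary_hits(p[0], p[1], toy_dictionary), keep=1
)
```

A reduced corpus chosen this way would then be used to retrain the translation model, so that the NIST and BLEU scores can be compared against those of a randomly reduced corpus of the same size.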
Context-Dependent Dictionary for Translators
Fanta, Petr ; Bojar, Ondřej (advisor) ; Kuboň, Vladislav (referee)
During the manual translation of short texts, such as texts occurring on social networks or microblogs (e.g., Twitter), translators are often forced to gather additional information from various sources. This can concern less common words, domain-specific terms, or numerous abbreviations. The aim of this thesis is to design and implement a system which automatically creates a minimal context-dependent dictionary for the given short message. The system identifies suitable dictionary entries in the translated text and searches for their definitions, translations, and examples from available open sources, or extracts them automatically from a parallel corpus. The resulting dictionary is ideally sufficient for human translators to understand the message and to choose an appropriate translation equivalent (including technical terms). An empirical evaluation is based on statistics that track how often users were satisfied with the proposed entries, how often the entries were incorrect, and to what extent the system correctly identified the relevance to the input text.
