National Repository of Grey Literature: 40 records found (records 11 - 20)
Tool for comparison and evaluation of machine translation
Klejch, Ondřej ; Popel, Martin (advisor) ; Tamchyna, Aleš (referee)
This bachelor thesis describes the development of MT-ComparEval, a tool for the comparison and evaluation of machine translation. The tool compares translations according to several criteria: automatic metrics of machine translation quality computed on whole documents or on single sentences; quality comparison of single-sentence translations with highlighting of confirmed, improving and worsening n-grams; and summaries of the most improving and worsening n-grams for the whole document. When comparing two translations, MT-ComparEval also plots a chart of the absolute differences of metrics computed on single sentences and a chart of the values obtained from paired bootstrap resampling.
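The paired bootstrap resampling mentioned above can be illustrated with a minimal sketch. It is not taken from MT-ComparEval: the function name `paired_bootstrap` and its parameters are hypothetical, and it assumes that each system is described by one score per sentence and that a document score is the sum of the sentence scores (corpus-level metrics such as BLEU would instead need to be recomputed on every resample).

```python
import random


def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Estimate how often system A beats system B (and vice versa).

    scores_a, scores_b: per-sentence metric scores for the same test
    sentences (assumption: higher is better, document score = sum).
    Returns the fraction of resampled test sets won by A and by B.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same sentences")
    rng = random.Random(seed)
    n = len(scores_a)
    wins_a = wins_b = 0
    for _ in range(n_samples):
        # Draw n sentence indices with replacement; the SAME indices are
        # used for both systems, which is what makes the test "paired".
        idx = [rng.randrange(n) for _ in range(n)]
        total_a = sum(scores_a[i] for i in idx)
        total_b = sum(scores_b[i] for i in idx)
        if total_a > total_b:
            wins_a += 1
        elif total_b > total_a:
            wins_b += 1
    return wins_a / n_samples, wins_b / n_samples
```

A system that wins in, say, at least 95% of the resampled test sets is commonly reported as significantly better at that level; the threshold is a convention, not something stated in the abstract.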
Word prediction using language models
Koutný, Michal ; Popel, Martin (advisor) ; Novák, Michal (referee)
The thesis uses n-gram language models to improve text entry on a QWERTY keyboard by means of word prediction. Related solutions are briefly introduced, followed by the theoretical background for the work. The analysis in the next part divides the problem into four tasks: language model training, incorporating the model into word prediction, a GUI component and an evaluation framework. The implementation combines Python and C++. The corpora used come from Czech (19 M words) and English (84 M words) Wikipedia articles. A small corpus of Czech educational texts was used to test domain adaptation. Quality metrics are defined and various configurations are measured. On the test data, the best solutions reduced keystrokes per character to 0.44 for English and 0.55 for Czech.
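To make the keystrokes-per-character figure concrete, here is a small sketch of how such a metric could be simulated. It is an illustration under stated assumptions, not the thesis's evaluation framework: the predictor is a plain unigram frequency model rather than an n-gram model, accepting a suggestion is assumed to cost one keystroke and to insert the following space, and all function names are hypothetical.

```python
from collections import Counter


def build_unigram_model(training_words):
    """Count word frequencies; a stand-in for a real n-gram model."""
    return Counter(training_words)


def predict(model, prefix, k=1):
    """Return the k most frequent known words starting with prefix."""
    candidates = [w for w in model if w.startswith(prefix)]
    candidates.sort(key=lambda w: (-model[w], w))
    return candidates[:k]


def keystrokes_per_character(model, text_words, k=1):
    """Simulate typing text_words with k suggestions on screen."""
    keystrokes = 0
    characters = 0
    for word in text_words:
        characters += len(word) + 1  # the word plus its trailing space
        typed = 0
        # Type letter by letter until the word shows up among suggestions.
        while typed < len(word):
            if word in predict(model, word[:typed], k):
                break
            typed += 1
        if typed < len(word):
            # Letters typed so far plus one key to accept the suggestion
            # (assumed to insert the space as well).
            keystrokes += typed + 1
        else:
            # Never predicted: the whole word plus the space was typed.
            keystrokes += len(word) + 1
    return keystrokes / characters if characters else 0.0
```

A value of 1.0 means prediction saved nothing; lower values mean fewer key presses per character of output text.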
Hybrid Machine Translation Approaches for Low-Resource Languages
Kamran, Amir ; Popel, Martin (advisor) ; Kuboň, Vladislav (referee)
In recent years, corpus-based machine translation systems have produced significant results for a number of language pairs. However, for low-resource languages like Urdu, purely statistical or purely example-based methods do not perform well. On the other hand, rule-based approaches require a huge amount of time and resources for the development of rules, which makes them impractical in most scenarios. Hybrid machine translation systems might be one way to overcome these problems, since they can combine the best of different approaches to achieve quality translation. The goal of the thesis is to explore different combinations of approaches and to evaluate their performance against the standard corpus-based methods currently in use. This includes: 1. Use of syntax-based and dependency-based reordering rules with Statistical Machine Translation. 2. Automatic extraction of lexical and syntactic rules using statistical methods to facilitate Transfer-Based Machine Translation. The novel element in the proposed work is an algorithm that automatically learns reordering rules for English-to-Urdu statistical machine translation. Moreover, this approach can be extended to learn lexical and syntactic rules to build a rule-based machine translation system.
Peacekeeping Operations within the System of Collective Security
Popel, Martin ; Hýbnerová, Stanislava (advisor) ; Faix, Martin (referee)
Abstract of the Thesis: Peacekeeping Operations within the System of Collective Security. The thesis is based on an empirical-analytical approach; content analysis of documents and the comparative method are also used. The thesis is divided into three main parts. The first deals in turn with the concept of collective security and the development of universal international organizations, and is devoted to peacekeeping operations within the UN, in particular the development of these missions from the beginning to the present. The first part also covers regional organizations and their contributions within the framework of collective security with regard to the guiding role of the UN. The intent of this section is to show the development of the concept of collective security and to highlight current issues related to security policy. The emphasis is on recent developments in the European area, especially the development of the common foreign and security policy. The second part of the thesis presents a case study of the conflict in Libya, which erupted in February 2011 and created an interesting precedent. The third section describes the role of the Czech Republic in the system of collective security, legal and non-legal reasons for the deployment of Czech...
Metrics for Optimizing Statistical Machine Translation
Macháček, Matouš ; Bojar, Ondřej (advisor) ; Popel, Martin (referee)
State-of-the-art MT systems use a so-called log-linear model, which combines several components to predict the probability of the translation of a given sentence. Each component has its own weight in the log-linear model. These weights are generally trained to optimize BLEU, but there are many alternative automatic metrics, and some of them correlate better with human judgments than BLEU. We explore various metrics (PER, WER, CDER, TER, BLEU and SemPOS) in terms of correlation with human judgments. The SemPOS metric is examined in more detail and we propose some approximations and variants. We use the examined metrics to train a Czech-to-English MT system using the MERT method and explore how optimizing toward various automatic evaluation metrics affects the resulting model.
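For orientation, the log-linear model referred to above is usually written as follows. This is the standard textbook formulation, not a formula quoted from the thesis; the symbols are assumptions introduced here: $f$ is the source sentence, $e$ a candidate translation, $h_i$ the $i$-th component (feature function) and $\lambda_i$ its weight.

```latex
p(e \mid f) =
  \frac{\exp\left(\sum_{i=1}^{n} \lambda_i \, h_i(e, f)\right)}
       {\sum_{e'} \exp\left(\sum_{i=1}^{n} \lambda_i \, h_i(e', f)\right)},
\qquad
\hat{e} = \operatorname*{arg\,max}_{e} \sum_{i=1}^{n} \lambda_i \, h_i(e, f)
```

Because the denominator does not depend on $e$, decoding only needs the weighted sum on the right. Tuning methods such as MERT search for the weights $\lambda_i$ that maximize a chosen automatic metric on a development set.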
Japanese-Czech Machine Translation
Variš, Dušan ; Bojar, Ondřej (advisor) ; Popel, Martin (referee)
Machine translation (MT) using deep sentence analysis is not as widespread as other MT methods; however, we believe that some of its aspects can contribute to the overall translation quality. It is also important to try out deep MT methods with various language pairs. In our case, we experiment with the language pair Japanese-Czech. As a part of this task, we also had to collect and process the necessary parallel data. Because only a very small amount of such data is available, we were forced to devise approaches tackling this problem. Our system is based on the same principles as the TectoMT translation system; therefore, it was implemented within the same platform. In the process, we tried to capture at least some basic linguistic phenomena characteristic of Japanese. As a part of our research, we also compared our system with a simple phrase-based baseline.
Web Interface for the Treex Framework
Sedlák, Michal ; Popel, Martin (advisor) ; Rosa, Rudolf (referee)
This work deals with a web application called Treex::Web, which serves as a web interface for the NLP framework Treex. The work addresses several Treex issues (e.g. the absence of a graphical user interface and complicated installation) and offers Treex::Web as a possible solution. At the beginning of this work we introduce the Treex framework itself. The following chapters describe Treex::Web's user interface (chapter 3) and the implementation of the whole web application (chapter 4). The conclusion of this work includes a comparison of NLP frameworks similar to Treex and their web interfaces.
Feature Selection for Factored Phrase-Based Machine Translation
Tamchyna, Aleš ; Bojar, Ondřej (advisor) ; Popel, Martin (referee)
In the presented work we investigate factored models for machine translation. We provide a thorough theoretical description of this machine translation paradigm. We describe a method for evaluating the complexity of factored models and verify its usefulness in practice. We present a software tool for the automatic creation of machine translation experiments and for search in the space of possible configurations. In the experimental part of the work we verify our analyses and give some insight into the potential of factored systems. We indicate some of the possible directions that lead to improvement in translation quality; however, we conclude that it is not possible to explore these options in a fully automatic way.
Popularity Meter
Hajič, Jan ; Bojar, Ondřej (advisor) ; Popel, Martin (referee)
Having the possibility of automatically tracking a person's popularity in the newspapers is an idea appealing not just to those in the media spotlight. While sentiment (subjectivity) analysis is a rapidly growing subfield of computational linguistics, no data from the news domain are yet available for Czech. We have therefore started building a manually annotated polarity corpus of sentences from Czech news texts; however, these texts have proven rather unwieldy for such processing. We have also designed a classifier which should be able to track popularity based on this corpus; the classifier has been tested on a corpus of product reviews of domestic appliances, and some introductory testing has been done on the nascent news corpus. As a model, we simply extract a unigram polarity lexicon from the data. We then use three related methods for identifying lemma polarity and a number of simple filters for feature selection. On the domestic appliance data, our simplest model has achieved results comparable to the state of the art; however, the properties of Czech news texts and preliminary results hint that a more linguistically oriented approach might be preferable.
Converting prose into poetry using neural networks
Gokirmak, Memduh ; Popel, Martin (advisor) ; Dušek, Ondřej (referee)
Title: Converting Prose into Poetry with Neural Networks. Author: Memduh Gokirmak. Institute: Institute of Formal and Applied Linguistics. Supervisor: Martin Popel, Institute of Formal and Applied Linguistics. Abstract: We present here our attempts to create a system that generates poetry based on a sequence of text provided to it by a user. We explore the use of machine translation and language model technologies based on the neural network architecture. We use different types of data across three languages in our research, and employ and develop metrics to track the quality of the output of the systems we develop. We find that combining machine translation techniques to generate training data to this end with fine-tuning of pre-trained language models provides the most satisfactory generated poetry. Keywords: poetry, machine translation, language models
