National Repository of Grey Literature: 129 records found (records 120 - 129)
Machine Translation to Czech via Lemmatized Text
Viktorinová, Eva ; Zeman, Daniel (referee) ; Bojar, Ondřej (advisor)
This work aims to improve machine translation from English to Czech. It describes the problem of rich Czech morphology and suggests several methods of addressing it by lemmatizing the Czech text. Translation with alternative decoding paths, where the first path translates from English to Czech and the second from English to lemmatized Czech, is studied more closely.
A Tool for Collecting Parallel Texts from the Web
Klempová, Hana ; Ježek, Pavel (referee) ; Bojar, Ondřej (advisor)
The aim of the thesis is to develop a complex tool for creating parallel corpora for a pair of languages (Czech and English) from a given list of websites. A parallel corpus is a set of pairs of documents which are translations of each other. The thesis describes the chosen algorithm in detail and compares all implemented methods. We use document structure as well as most frequent words and their translations to find matching documents in the collection. Finally, we confirm the applicability of the whole system by aligning texts from one bigger and a few smaller websites.
Movie subtitles as a source of parallel texts
Beňa, Peter ; Bojar, Ondřej (referee) ; Žabokrtský, Zdeněk (advisor)
After learning the basic principles of building parallel corpora, the student will focus on the Czech-English parallel corpus CzEng. The main goal of the work is to improve the quality of the part of CzEng created from Czech/English movie and series subtitles. Above all, it is necessary to design and implement methods for detecting wrongly aligned (or otherwise problematic) subtitle files or their parts. The impact of the cleaning methods on corpus quality will be evaluated quantitatively.
Clients Management Database for Tax Advisors
Štěrba, Jan ; Novák, Václav (referee) ; Bojar, Ondřej (advisor)
The goal of this project is to implement a system for managing a tax advisor's clients. The purpose of this system is to manage all relevant data about clients and work assignments and to simplify time management. An important feature of the system is generating summaries of assignments, schedules and deadlines, as well as statistical data about payments to the advisor. The system will export to external data formats and support simultaneous work by multiple users.
Hierarchical reactive planning with transitions
Mikula, Tomáš ; Bojar, Ondřej (referee) ; Brom, Cyril (advisor)
Hierarchical reactive planning (HRP) is a popular method for controlling virtual beings' behaviour. The advantage of HRP is that complex behaviours can be described relatively easily. However, in particular situations problems with biological plausibility arise. This is partially caused by the fact that so-called transition behaviours and postponement of behaviour are hard to express in HRP. Transition behaviours are short actions that the simulated beings should engage in between two main behaviours in order to ensure a smooth transition between them. Postponement of behaviour is desirable when the running behaviour is approaching its end and therefore should not be interrupted. In the present work we incorporate transition behaviours and postponement of behaviour into the model of HRP and describe the implemented prototype.
Automatic Extraction of Lexico-Syntactic Information from Corpora
Bojar, Ondřej
The presented work investigates methods for semi-automatic extraction of lexico-syntactic information from corpora, particularly the information on subcategorization and valency frames. We document that at present, the PDT and CNC corpora are not sufficient for this task. We describe a simple method for a selective extension of corpora based on texts from the Internet. We evaluate three parsers available for Czech with respect to the task of extracting verb frames. We have implemented a linguistically motivated filtration of input sentences to identify "very simple sentences", which helps the parsers to achieve better accuracy. The system AX designed in this work is more generic: any kind of linguistic filtration can be employed. The system is also suitable for creating partial or full parsers of natural languages. The thesis also presents a user's guide to the system AX. Furthermore, we compare methods for extraction of subcategorization frames from observed frames. We classify observed frames into a hierarchy suitable for human annotators. Finally, several problems of automatic extraction of valency frames are discussed.
Automatic Modifications of Context in Text Fields
Dřínek, Vratislav ; Bojar, Ondřej (advisor) ; Horký, Vojtěch (referee)
Title: Automatic Modifications of Context in Text Fields Author: Vratislav Dřínek Department: Institute of Formal and Applied Linguistics Supervisor: RNDr. Ondřej Bojar Ph.D., Institute of Formal and Applied Linguistics Abstract: The topic of this bachelor thesis is a text editor assistant. The program tries to anticipate the user's intentions and proposes a quick way to accomplish them. The aim of this thesis is completely new, and this function is not available even in advanced text editors. The topic is inspired by the user interface of the programming environment Visual Studio, which sometimes proposes the program code the user is probably going to write. The assistant uses morphological analyses provided by MorphoDiTa to analyse sentences morphologically. Keywords: sentence analysis, Czech dictionary, MorphoDiTa tagger, autocomplete
Context-Dependent Dictionary for Translators
Fanta, Petr ; Bojar, Ondřej (advisor) ; Kuboň, Vladislav (referee)
During manual translation of short texts, such as texts occurring on social networks or microblogs (e.g., Twitter), translators are often forced to gather additional information from various sources. This can concern less common words, domain-specific terms, or numerous abbreviations. The aim of this thesis is to design and implement a system which automatically creates a minimal context-dependent dictionary for the given short message. The system identifies suitable dictionary entries in the translated text and searches for their definitions, translations, and examples in available open sources, or extracts them automatically from a parallel corpus. The resulting dictionary is ideally sufficient for human translators to understand the message and to choose an appropriate translation equivalent (including technical terms). An empirical evaluation is based on statistics that track how often users were satisfied with the proposed entries, how often the entries were incorrect, and to what extent the system correctly identified the relevance for the input text.
