National Repository of Grey Literature: 30 records found, showing 1 - 10
Predicting Stock Market Trends from News Articles
Serebryannikova, Anastasia ; Kuboň, Vladislav (advisor) ; Vidová Hladká, Barbora (referee)
In this work we attempted to predict the upward or downward movement of the S&P 500 index from news articles published by Bloomberg and Reuters. We employed an SVM classifier and conducted multiple experiments aimed at better understanding the shape of the data and the specifics of the task. As a result, we established common evaluation settings for all our subsequent experiments. We then tried incorporating various features into the model and also replicated several approaches previously suggested in the literature. We were able to identify some non-trivial dependencies in the data, which helped us achieve high accuracy on the development set. However, none of the models that we built showed comparable performance on the test set. We have come to the conclusion that, whereas some trends or patterns can be identified in a particular dataset, such findings are usually barely transferable to other data. The experiments that we conducted support the idea that the stock market changes at random and that high-quality prediction may only be achieved on particular sets of data and under very special settings, but not for the task of stock market prediction in general.
Comparing Machine Translation Output (and the Way it Changes over Time)
Kyselová, Soňa ; Svoboda, Tomáš (advisor) ; Kuboň, Vladislav (referee)
This diploma thesis focuses on machine translation (MT), which has been studied for a relatively long time in linguistics (and later also in translation studies) and which in recent years has come to the attention of the broader public as well. The thesis aims to explore the quality of machine translation output and the way it changes over time. The theoretical part first deals with machine translation in general, namely basic definitions, a brief history and approaches to machine translation, and then describes online machine translation systems and evaluation methods. Finally, this part provides a methodological model for the empirical part. Using a set of texts translated with MT, the empirical part seeks to check how online machine translation systems deal with the translation of different text types and whether the quality of MT output improves over time. To this end, an analysis of text type, semantics, lexicology, stylistics and pragmatics is carried out, as well as a rating of the general applicability of the translation. The final part of the thesis compares the results of the analysis. Based on this comparison, conclusions are drawn and the general tendencies that emerged from the empirical part of the thesis are stated.
Artificial Neural Network for Opinion Target Identification in Czech
Glončák, Vladan ; Kuboň, Vladislav (advisor) ; Mírovský, Jiří (referee)
The main focus of this thesis is the use of neural networks, specifically long short-term memory cells, for identifying opinion targets in Czech data. A side product is a new version of a dataset for opinion target identification. For comparison, previously obtained results for other languages, and results obtained with probabilistic methods, are listed. The experiment was successful: the achieved results are above trivial baseline models and comparable with the results achieved previously.
Linguistic Issues in Machine Translation between Czech and Russian
Klyueva, Natalia ; Kuboň, Vladislav (advisor) ; Panevová, Jarmila (referee) ; Strossa, Petr (referee)
In this thesis we analyze machine translation between Czech and Russian from the perspective of a linguist. We work with two types of machine translation systems - rule-based (TectoMT) and statistical (Moses). We experiment with different setups of these two systems in order to achieve the best possible quality. One of the questions we address in our work is whether the relatedness of the two languages has some impact on machine translation. We explore the output of our two experimental systems and two commercial systems: PC Translator and Google Translate. We propose a linguistically motivated classification of errors for the language pair and describe each type of error in detail, analyzing whether it stems from some difference between Czech and Russian or from the system architecture. We then compare the usage of some specific linguistic phenomena in the two languages and state how the individual systems cope with mismatches. For some errors, we suggest ways to correct them, and in several cases we implement those suggestions. In particular, we focus on one specific error type - surface valency. We research the mismatches between Czech and Russian valency, extract a lexicon of surface valency frames, incorporate the lexicon into the TectoMT translation pipeline and present...
Semantic relation extraction from unstructured data in the business domain
Rampula, Ilana ; Pecina, Pavel (advisor) ; Kuboň, Vladislav (referee)
Text analytics in the business domain is a growing field in both research and practical applications. We chose to concentrate on relation extraction from unstructured data provided by a corporate partner. Analyzing text from this domain requires a different approach, one that accounts for irregularities and domain-specific attributes. In this thesis, we present two methods for relation extraction. The Snowball system and the Distant Supervision method were both adapted to this unique data. The methods were implemented to use both structured and unstructured data from the company's database. Keywords: Information Retrieval, Relation Extraction, Text Analytics, Distant Supervision, Snowball
Measures of Machine Translation Quality
Macháček, Matouš ; Bojar, Ondřej (advisor) ; Kuboň, Vladislav (referee)
Title: Measures of Machine Translation Quality Author: Matouš Macháček Department: Institute of Formal and Applied Linguistics Supervisor: RNDr. Ondřej Bojar, Ph.D. Abstract: We explore both manual and automatic methods of machine translation evaluation. We propose a manual evaluation method in which annotators rank only translations of short segments instead of whole sentences. This results in easier and more efficient annotation. We have conducted an annotation experiment and evaluated a set of MT systems using this method. The obtained results are very close to the official WMT14 evaluation results. We also use the collected database of annotations to automatically evaluate new, unseen systems and to tune parameters of a statistical machine translation system. The evaluation of unseen systems, however, does not work and we analyze the reasons. To explore the automatic methods, we organized Metrics Shared Task held during the Workshop of Statistical Machine Translation in years 2013 and 2014. We report the results of the last shared task, discuss various metaevaluation methods and analyze some of the participating metrics. Keywords: machine translation, evaluation, automatic metrics, annotation
A Comparison of Methods of Czech-Russian Machine Translation
Bílek, Karel ; Kuboň, Vladislav (advisor) ; Bojar, Ondřej (referee)
In this thesis, I present several methods of Czech-to-Russian machine translation, including both historical approaches and more modern ones, and including both phrase-based and rule-based systems. I first briefly describe the linguistic background of Czech and Russian, their common history and their differences. Then, I describe automating, building and improving some of the machine translation systems, together with their comparison, using both an automated metric and a limited human annotation. I also describe the creation of several corpora of Czech-Russian parallel data and Russian monolingual data.
Joining Segments in Czech Complex Sentences
Čech, Josef ; Kuboň, Vladislav (advisor) ; Krůza, Oldřich (referee)
Title: Joining segments in Czech sentences Author: Bc. Josef Čech Department: Institute of Formal and Applied Linguistics Supervisor: doc. RNDr. Vladislav Kuboň Ph.D. e-mail: vk@ufal.mff.cuni.cz Abstract: This thesis follows up on the segmentation of complex sentences into linguistically motivated objects, called segments, and on their mutual relations. These relations can be used in further work with segments. The main purpose of mapping the relations is to join segments into a higher-level unit, the clause. In theory, it should be possible to analyse each clause of a complex sentence separately, and the analysis of a set of clauses should be quicker than the analysis of the whole complex sentence. Segments are identified by means of linguistic separators and a rule-based approach. The rule-based approach proves useful for problematic relations between neighbouring segments. This thesis aims to show that the rule-based approach is the best solution for joining segments into clauses. A position tag for segments was also designed as part of this thesis. This tag should be used in methods dealing with segments in place of the segment itself. Keywords: segment, clause, tag, joining segments, syntactic analysis
