National Repository of Grey Literature: 35 records found
Light verb constructions and their exploitation for morphological annotation
Vyskočilová, Karolína ; Petkevič, Vladimír (advisor) ; Radimský, Jan (referee) ; Kettnerová, Václava (referee)
This Ph.D. thesis deals with light verb constructions (LVCs), such as provádět kontrolu (to perform a check) or chovat úctu (to show respect). It demonstrates how to put theoretical knowledge of these constructions into practice, exploiting it during morphological disambiguation and thus potentially improving syntactic analysis. The theoretical part of the thesis covers three areas: light verb constructions, corpus annotation, and LanGr rule-based morphological disambiguation tagging. First, LVCs are characterized, including their identification criteria, followed by a description of the current state of research on LVCs and a summary of papers published on the topic over the last fifteen years, with a particular emphasis on the syntactic approach to these constructions. A compilation of existing LVC inventories is also provided. Furthermore, the tagging process for the written corpora of the Czech National Corpus is outlined, as it is closely related to the LanGr tool. Finally, LanGr rule creation and code implementation are covered. The practical part of the thesis addresses nominative-accusative case homonymy. New rules for the LanGr system are also developed to improve morphological annotation. In a case study, the most frequent forms of direct object LVCs are retrieved using data from the SYNv10 corpus....
Words that matter. Towards a Swedish-Czech colligational lexicon of basic verbs
Cinková, Silvie ; Petkevič, Vladimír (advisor) ; Malmgren, Sven-Göran (referee) ; Panevová, Jarmila (referee)
Basic verbs, i.e. very common verbs that typically denote physical movements, locations, states or actions, undergo various semantic shifts and acquire different secondary uses. In extreme cases, the distribution of secondary uses grows so general that they are regarded as auxiliary verbs (go and to be going to), phase verbs (turn, grow), etc. These uses are usually well-documented by grammars and language textbooks, and so are idiomatic expressions (phraseologisms) in dictionaries. There is, however, a grey area in between, which is extremely difficult to learn for non-native speakers. This consists of secondary uses with limited collocability, in particular light verb constructions, and secondary meanings that only get activated under particular morphosyntactic conditions. The basic-verb secondary uses and constructions are usually semantically transparent, such that they do not pose understanding problems, but they are generally unpredictable and language-specific, such that they easily become an issue in non-native text production. In this thesis, Swedish basic verbs are approached from the contrastive point of view of an advanced Czech learner of Swedish. A selection of Swedish constructions with basic verbs is explored. The observations result in a proposal for the structure of a machine-readable Swedish-Czech...
Quantitative view on the Arabic text structure
Milička, Jiří ; Zemánek, Petr (advisor) ; Petkevič, Vladimír (referee)
The thesis proposes several general, falsifiable hypotheses in quantitative linguistics and tests them on corpora of Modern Standard Arabic, medieval Arabic and some European languages, including Czech and English. The hypotheses deal with structures built by word lengths and word frequencies within sentences and supra-sentential elements, with the connection between the sentence length - constituent frequency relation and the Menzerath-Altmann Law, and with a view of text via so-called combinatorial mapping.
Machine Translation of Related Asian Languages
Larasati, Septina Dian ; Kuboň, Vladislav (advisor) ; Petkevič, Vladimír (referee)
This thesis presents the development of an MT system between Indonesian and Malaysian. The system uses a method of almost direct translation that exploits the similarity of the two languages. This method was previously used on a number of European language pairs. The thesis also elaborates on the attempts to build language resources from scratch, since the languages are under-resourced.
On the Linguistic Structure of Emotional Meaning in Czech
Veselovská, Kateřina ; Hajičová, Eva (advisor) ; Petkevič, Vladimír (referee) ; Smrž, Pavel (referee)
Title: On the Linguistic Structure of Emotional Meaning in Czech Author: Mgr. Kateřina Veselovská Department: Institute of Formal and Applied Linguistics Supervisor: Prof. PhDr. Eva Hajičová, DrSc., Institute of Formal and Applied Linguistics Keywords: emotional meaning, linguistic structure, sentiment analysis, opinion mining, evaluative language Abstract: This thesis has two main goals. First, we provide an analysis of the language means which together form the emotional meaning of written utterances in Czech. Second, we employ the findings concerning emotional language in computational applications. We provide a systematic overview of lexical, morphosyntactic, semantic and pragmatic aspects of emotional meaning in Czech utterances. Also, we propose two formal representations of emotional structures within the framework of the Prague Dependency Treebank and Construction Grammar. Regarding the computational applications, we focus on sentiment analysis, i.e. automatic extraction of emotions from text. We describe the creation of manually annotated emotional data resources in Czech and perform two main sentiment analysis tasks, polarity classification and opinion target identification, on Czech data. In both of these tasks, we reach state-of-the-art results.
Valency frames of Czech nouns: corpus-driven study
Čermáková, Anna ; Petkevič, Vladimír (advisor) ; Panevová, Jarmila (referee) ; Kopřivová, Marie (referee)
This thesis aims at providing a lexicological framework for systematic description of valency of Czech nouns. Valency is seen here as a lexicological property of words. Valency is an abstract relation with concrete textual realizations and the term "valency" is used here for both: the abstract notion and the concrete valency exponents and realisations. The analysis is corpus-driven and as such it is based on a rather loose notion of valency, devoid of any pre-conceived ideas, concentrating on typical structural patterns of occurrence on the right side of the noun under investigation. For the analysis, the corpus SYN2000, a part of the Czech National Corpus, has been used. The analysis is based on random selections of concordance lines of 99 randomly chosen nouns from the middle frequency range. In some cases, where the data proved insufficient, we have carried out additional specialized corpus queries. For high frequency nouns we assume highly differentiated valency profiles; to confirm this hypothesis we have carried out an additional brief analysis of several high frequency nouns. The most frequent valency of Czech nouns is genitive complementation, which we find occurring with more than 90% of the analysed nouns. For some of the nouns, the genitive valency is a very dominant valency pattern (in some cases...
Form and function of nouns in Czech: relation between nominal case and syntactic function. Based on a synchronic written corpus of Czech (SYN2005)
Jelínek, Tomáš ; Petkevič, Vladimír (advisor) ; Lopatková, Markéta (referee) ; Uličný, Oldřich (referee)
Case in Czech is the basic morphological means by which nouns express their function in a sentence. The objective of this thesis is to describe, from a frequency point of view, the relation between form and function of nouns, or, more precisely, how frequently cases (both simple and prepositional) are used to realise syntactic functions in sentences. The thesis is based on one of the largest corpora of written synchronic Czech: the 100-million-token corpus SYN2005. In order to obtain data on frequencies of syntactic functions of nouns in relation to their cases, we annotated the corpus SYN2005 with a dependency syntactic annotation. For this annotation, we adopted the format of the analytical layer of the Prague Dependency Treebank. The syntactic annotation has been performed by a stochastic parser: the MST parser. Since the reliability of this annotation was not high enough, we have built an automatic correction module, which identifies errors of syntactic annotation in the output of the stochastic parser and corrects these errors by means of linguistic rules. We have implemented 26 different rules, but annotation errors have been reduced by merely 6-8%. However, this correction module can be further developed. It can be used to correct the output of any dependency parser trained on the data from...
Evaluation of Error Mark-Up in a Learner Corpus of Czech
Štindlová, Barbora ; Šebesta, Karel (advisor) ; Petkevič, Vladimír (referee) ; Šindelářová, Jaromíra (referee)
Title: Evaluation of Error Mark-Up in a Learner Corpus of Czech Author: Barbora Štindlová Department: Institute of Czech Language and Theory of Communication, Faculty of Arts, Charles University in Prague Supervisor: prof. PhDr. Karel Šebesta, CSc. Abstract: The thesis deals with the topic of Czech as a second language, while introducing methods of corpus linguistics as applied to texts produced by language learners. The context is the process of building and exploiting a learner corpus, with a focus on its error mark-up and options for evaluating the annotation scheme. Learner corpora have become a major resource for investigating a learner interlanguage and a significant incentive for many different types of research and teaching of second/foreign languages. They are used mainly for contrastive studies of native and non-native speakers, i.e. for contrastive interlanguage analysis, and for computer-aided error analysis of the learner language. This kind of analysis is crucially dependent on the type and quality of the error mark-up. In every error-annotated corpus the error annotation is based on an error typology, which is necessarily problematic from a number of theoretical aspects. Evaluation of the reliability and validity of the annotation scheme design is therefore an important step in the build-up...
