National Repository of Grey Literature: 35 records found (records 26 - 35)
Quantitative view on the Arabic text structure
Milička, Jiří ; Zemánek, Petr (advisor) ; Petkevič, Vladimír (referee)
The thesis proposes several general, falsifiable hypotheses in quantitative linguistics and tests them on corpora of Modern Standard Arabic, medieval Arabic and some European languages, including Czech and English. The hypotheses concern structures formed by word lengths and word frequencies within sentences and supra-sentential units, the connection between the Menzerath-Altmann law and the relation between sentence length and the frequency of its constituents, and a view of text through so-called combinatorial mapping.
Evaluation of Error Mark-Up in a Learner Corpus of Czech
Štindlová, Barbora ; Šebesta, Karel (advisor) ; Petkevič, Vladimír (referee) ; Šindelářová, Jaromíra (referee)
Department: Institute of Czech Language and Theory of Communication, Faculty of Arts, Charles University in Prague. Supervisor: prof. PhDr. Karel Šebesta, CSc. Abstract: The thesis deals with Czech as a second language and introduces methods of corpus linguistics as applied to texts produced by language learners. The context is the process of building and exploiting a learner corpus, with a focus on its error mark-up and options for evaluating the annotation scheme. Learner corpora have become a major resource for investigating learner interlanguage and a significant incentive for many different types of research on, and teaching of, second and foreign languages. They are used mainly for contrastive studies of native and non-native speakers, i.e. for contrastive interlanguage analysis, and for computer-aided error analysis of learner language. This kind of analysis depends crucially on the type and quality of the error mark-up. In every error-annotated corpus the error annotation is based on an error typology, which is necessarily problematic from a number of theoretical perspectives. Evaluation of the reliability and validity of the annotation scheme design is therefore an important step in the build-up...
On lexical relations among synonymous adjectives (a corpus-driven research)
Marková, Věra ; Vachková, Marie (advisor) ; Petkevič, Vladimír (referee) ; Paradis, Carita (referee)
The doctoral thesis concerns lexical relations among a sample of German adjectives known as "near-synonyms". The thesis also deals in part with other paradigmatic relations. The author considers the traditional approach to semantic relations and modifies it by means of a corpus analysis of four pairs of synonymous German adjectives (kalt/kühl, schön/hübsch, nett/angenehm, unschön/hässlich). The author proposes a methodology which makes use of corpus-linguistic tools developed by the Department of Corpus Linguistics at the Institute for the German Language in Mannheim, Germany (IDS Mannheim) and which can be used, with some modifications, for research on other synonymous pairs. The corpus-linguistic tools explore in particular the similar collocation profiles of the analyzed words and the use of semantically close words in certain contexts. The results of the analyses are documented on texts from the corpus DeReKo (Mannheim German Reference Corpus). The analyses undertaken in this thesis help to illustrate semantic relations in language and to build up a new, usage-based view of language.
Morphemic structure of contemporary Czech: from linguistic theory to automatic language processing
Lebeda, Jiří ; Petkevič, Vladimír (advisor) ; Kučera, Karel (referee)
Morphemic research on the Czech language witnessed its largest boom in the 1960s and 1970s. Since the appearance of Retrográdní morfematický slovník češtiny (1975) by Eleonora Slavíčková and of Komárek's Příspěvky k české morfologii (1978), the interest of researchers in this area of linguistics has been waning. The leitmotifs of all nine chapters of this monograph are an attempt to defend morphemics as a stand-alone discipline, an evaluation of the theoretical and empirical knowledge gathered so far, and the justification of formal computer processing as the only promising approach for the future. The interdisciplinary character of the present work manifests itself in the search for impulses for an algorithmic approach to morphemic analysis and synthesis, which is the culmination of the monograph, e.g. in the cognitive sciences and general semiotics. An examination of the main principles of how the mental lexicon works, including the theory of activation of language units, shows that a computational approach to traditional linguistic topics and methods can borrow inspiration not only from theoretical fields. The central term, namely the morpheme, which is realized as a morph in language usage, is widely believed to have been introduced by Jan Baudouin de Courtenay in the 1880s. Using the great...
Machine Translation of Related Asian Languages
Larasati, Septina Dian ; Kuboň, Vladislav (advisor) ; Petkevič, Vladimír (referee)
This thesis presents the development of an MT system between Indonesian and Malaysian. The system uses a method of almost direct translation that exploits the similarity of the two languages. This method has previously been used on a number of pairs of European languages. The thesis also describes the attempts to build language resources from scratch, since both languages are under-resourced.
Valency frames of Czech nouns: corpus-driven study
Čermáková, Anna ; Petkevič, Vladimír (advisor) ; Panevová, Jarmila (referee) ; Kopřivová, Marie (referee)
This thesis aims to provide a lexicological framework for the systematic description of the valency of Czech nouns. Valency is seen here as a lexicological property of words. Valency is an abstract relation with concrete textual realizations, and the term "valency" is used here for both the abstract notion and the concrete valency exponents and realisations. The analysis is corpus-driven and as such is based on a rather loose notion of valency, devoid of any preconceived ideas, concentrating on typical structural patterns occurring to the right of the noun under investigation. The analysis uses the corpus SYN2000, a part of the Czech National Corpus. It is based on random selections of concordance lines for 99 randomly chosen nouns from the middle frequency range. In some cases, where the data proved insufficient, we carried out additional specialized corpus queries. For high-frequency nouns we assume highly differentiated valency profiles; to confirm this hypothesis we carried out an additional brief analysis of several high-frequency nouns. The most frequent valency of Czech nouns is genitive complementation, which occurs with more than 90% of the analysed nouns. For some of the nouns, the genitive valency is a very dominant valency pattern (in some cases...
Communication problems with the computer in natural language
Sirůčková, Hana ; Petkevič, Vladimír (referee) ; Jirků, Petr (advisor)
In this thesis I try to show how complex the understanding of natural language is. Although we use it every day, it is not easy to describe it simply, let alone to define it with mathematical precision. But if we want to talk to a computer in natural language, we need some system that converts natural language into computer commands. Artificial languages could serve as an intermediary: simple and free of ambiguity, yet sufficiently universal. Besides the introduction and conclusion, the thesis contains four chapters. In the second chapter I try to describe what natural language actually is and where information comes from. The third chapter presents the complications in understanding natural language, that is, the problems that machines have with understanding, unlike humans, who are assumed to have a standard understanding of the world. The fourth chapter outlines an attempt to solve the problems from the third chapter by introducing artificial languages, and the fifth chapter discusses options for knowledge representation.
Formalization of the Czech morphology system with respect to automatic processing of Czech texts
Hlaváčová, Jaroslava ; Petkevič, Vladimír (advisor) ; Oliva, Karel (referee) ; Osolsobě, Klára (referee)
Detailed morphological description of word forms is one of the most important conditions for successful automatic processing of linguistic data. The system of categories and their values used for the description is the subject of the first part of the thesis. The basic principle, the so-called Golden rule of morphology, states that every word form has to be described by the system unambiguously. The existence of variants of word forms and of whole paradigms, however, complicates the fulfilment of this rule. We introduce so-called mutations as an extension of the variants in order to include other sets of word forms with the same description (for instance multiple word forms of Czech personal pronouns). We divide mutations into two types: global ones, describing all word forms of a paradigm, and inflectional ones, for description at the word form level. This division enables us to express their various combinations. We do not use features of style for the mutation division, for they are subjective. With consistent use of the categories called Inflectional Mutation and Global Mutation, the Golden rule of morphology will always be valid. The concept of multiple lemma is introduced in a chapter dealing with lemmatization. It describes lemma variants. We give a detailed description of so-called...
Aspects of the Slavic verb "to have" - in grammar. Its auxiliation and the grammaticalization of the European analytic perfect type structures "have" + ppp - Bulgarian, Macedonian, and a corpus based study of Czech
Marvanová, Mira ; Čermák, František (advisor) ; Damborský, Jiří (referee) ; Petkevič, Vladimír (referee)
The verb "to have" is one of the significant elements of the ELA (European Linguistic Area). It belongs to the first stage of europeisms, yet it is not part of the IE heritage and its origin is different in the majority of European languages. Another, later phenomenon which can be considered a europeism of the structural type is the emergence and development of the analytical perfect habere-tenses, or constructions of this type, in most of the European so-called habere-languages, including Slavic. We are evidently dealing here with an expanding process, the original epicenter of which was Vulgar Latin, with some participation of Greek, under conditions of centuries-long mutual bilingualism. The subsequent process of diffusion and induction developed several euro-zones, of which Slavic represents the last two. The process of europeanization, i.e. its spread ("euro-diffusion") from this epicenter to the European Linguistic Area (ELA), took place first in Romance and later in Germanic, eventually reaching Slavic through two channels at the threshold of the 20th century. In the South, this process became one of the components of the Balkan linguistic integration. It proved to be the most intense in Macedonian as well as in some South Bulgarian dialects which were in direct contact with non-Slavic languages of...
