National Repository of Grey Literature: 30 records found (records 21 - 30)
Valency of Verbs in the Prague Dependency Treebank
Urešová, Zdeňka ; Hajičová, Eva (advisor) ; Lopatková, Markéta (referee) ; Ondrejovič, Slavo (referee)
Title: Valency of verbs in the Prague Dependency Treebank Author: PhDr. Zdeňka Urešová Department: Institute of Formal and Applied Linguistics MFF UK Supervisor: Prof. PhDr. Eva Hajičová, DrSc. Abstract: This dissertation describes PDT-Vallex, a valency lexicon of Czech verbs, and its relation to the annotation of the Prague Dependency Treebank (PDT). The PDT-Vallex lexicon was created during the annotation of the PDT and it is a valuable source of verbal valency information available both for linguistic research and for computerized natural language processing. In this thesis, we describe not only the structure and design of the lexicon (which is closely related to the notion of valency as developed in the Functional Generative Description of language) but also the relation between the PDT-Vallex and the PDT. The explicit and full-coverage linking of the lexicon to the treebank prompted us to pay special attention to diatheses; we propose formal transformation rules for diatheses to handle their surface realization even when the canonical forms of verb arguments as captured in the lexicon do not correspond to the forms of these arguments actually appearing in the corpus.
Form and function of nouns in Czech: relation between nominal case and syntactic function. Based on a synchronic written corpus of Czech (SYN2005)
Jelínek, Tomáš ; Petkevič, Vladimír (advisor) ; Lopatková, Markéta (referee) ; Uličný, Oldřich (referee)
Case in Czech is the basic morphological means by which nouns express their function in a sentence. The objective of this thesis is to describe, from a frequency point of view, the relation between the form and function of nouns, or, more precisely, how frequently cases (both simple and prepositional) are used to realise syntactic functions in sentences. The thesis is based on one of the largest corpora of written synchronic Czech: the 100-million-token corpus SYN2005. In order to obtain data on the frequencies of syntactic functions of nouns in relation to their cases, we annotated the corpus SYN2005 with a dependency syntactic annotation. For this annotation, we adopted the format of the analytical layer of the Prague Dependency Treebank. The syntactic annotation was performed by a stochastic parser, the MST parser. Since the reliability of this annotation was not high enough, we built an automatic correction module, which identifies errors of syntactic annotation in the output of the stochastic parser and corrects them by means of linguistic rules. We implemented 26 different rules, but annotation errors were reduced by merely 6-8%. However, this correction module can be further developed. It can be used to correct the output of any dependency parser trained on the data from...
Lexical-semantic Conversions in the Valency Lexicon
Kettnerová, Václava ; Lopatková, Markéta (advisor) ; Panevová, Jarmila (referee) ; Karlík, Petr (referee)
In this thesis, we provide an adequate lexicographic representation of lexical-semantic conversion. The term lexical-semantic conversion refers to the relation between semantically similar syntactic structures which are based on separate lexical units of the same verb lexeme. These relations are associated with various changes in the valency structure of verbs: they may involve the number of valency complementations, their type, obligatoriness, and morphemic forms. These changes arise from differences in the mapping of situational participants onto valency complementations. On the basis of a semantic and syntactic analysis of two types of Czech lexical-semantic conversions, the locative conversion and the conversion Bearer of action-Location, we propose to represent lexical units creating syntactic variants in the relation of lexical-semantic conversion by separate valency frames stored in the data component of the lexicon. The special attribute -conv, whose value is a type of lexical-semantic conversion, is assigned to the relevant valency frames. The rule component of the lexicon then consists of general rules determining changes in the correspondence between situational participants and valency complementations. This proposal is primarily designed for the valency lexicon of Czech verbs, VALLEX....
Automatic Identification of Semantic Preferences for Verb Valency Complementations (Automatické určování sémantických preferencí pro slovesná valenční doplnění)
Vandas, Karel ; Lopatková, Markéta (advisor) ; Vidová Hladká, Barbora (referee)
Verb valency plays an important role in the description of the behaviour of verbs and connects the surface realisation of language with its semantics. A verb itself usually encodes several readings, and its complementations help to identify the correct one. So far, valency complementations of verbs have mostly been studied from a morphological and syntactic point of view. The purpose of this thesis is to examine the possibilities of automatic identification of semantic preferences for valency complementations of verbs. The thesis discusses the performance of a system with different levels of available verb valency information in connection with cluster analysis. It also contains an evaluation section that compares the available methods.
Typical Usage Patterns of English Verbs
Smejkalová, Lenka ; Holub, Martin (advisor) ; Lopatková, Markéta (referee)
Corpus Pattern Analysis (CPA) is a corpus-based method that explores typical usage patterns of verbs in a text corpus and describes the meaning of verbs by means of contextual preferences defined both syntactically and semantically [1]. CPA in conjunction with the British National Corpus (BNC) is currently used to create The Pattern Dictionary of English Verbs (PDEV) [1, 2]. The thesis describes the current status of the PDEV, presents a thorough analysis of available data on typical usage patterns, and explores possible applications of the PDEV for automatic lexical analysis. Procedures usable in further PDEV development have been designed and implemented in this thesis. The first automatically extracts arguments of verbs from the output of an English syntactic analysis. The second uses the extracted arguments to create lists of lexical units that realize semantic types. The last uses these lists to automatically recognize typical usage patterns of verbs. The thesis also evaluates inter-annotator agreement, the automatic extraction of verb arguments from English sentences, and the effectiveness of the proposed procedures in the extraction of lexical units that realize semantic types and in the automatic recognition of typical usage patterns.
Clause analysis in Czech complex sentences
Krůza, Oldřich ; Lopatková, Markéta (referee) ; Kuboň, Vladislav (advisor)
This master's thesis deals with the identification of clauses in Czech morphologically annotated sentences and with finding the inter-clausal relations. The task is approached as a machine-learning problem. An annotation scheme for clauses in Czech text is presented along with a method for deriving clause-annotated data from the analytical layer of Functional Generative Description coded in the Prague Markup Language. The gathered data are used for training and evaluating a system for the automated identification of clauses and their relations. A method for evaluating the results is suggested, and separate software applications created during the development are presented.
Verb Valency Frames Disambiguation
Semecký, Jiří ; Hajič, Jan (advisor) ; Krbec, Pavel (referee) ; Lopatková, Markéta (referee)
Semantic analysis has become a bottleneck of many natural language applications. Machine translation, automatic question answering, dialog management, and others rely on high-quality semantic analysis. Verbs are central elements of clauses with a strong influence on the realization of whole sentences. Therefore, the semantic analysis of verbs plays a key role in the analysis of natural language. We believe that solid disambiguation of verb senses can boost the performance of many real-life applications. In this thesis, we investigate the potential of statistical disambiguation of verb senses. Each verb occurrence can be described by diverse types of information. We investigate which information is worth considering when determining the sense of verbs. Different types of classification methods are tested with regard to the topic. In particular, we compared the Naive Bayes classifier, decision trees, a rule-based method, maximum entropy, and support vector machines. The proposed methods are thoroughly evaluated on two different Czech corpora, VALEVAL and the Prague Dependency Treebank. Significant improvement over the baseline is observed.
Mapping the Prague Dependency Treebank Annotation Scheme onto Robust Minimal Recursion Semantics
Jakob, Max ; Štěpánek, Jan (referee) ; Lopatková, Markéta (advisor)
This thesis investigates the correspondence between two semantic formalisms, namely the tectogrammatical layer of the Prague Dependency Treebank 2.0 (PDT) and Robust Minimal Recursion Semantics (RMRS). It is a first attempt to relate the dependency-based annotation scheme of PDT to a compositional semantics approach like RMRS. An iterative mapping algorithm that converts PDT trees into RMRS structures is developed that associates RMRSs to each node in the dependency tree. Therefore, composition rules are formulated and the complex relation between dependency in PDT and semantic heads in RMRS is analyzed in detail. It turns out that structure and dependencies, morphological categories and some coreferences can be preserved in the target structures. Furthermore, valency and free modifications are distinguished using the valency dictionary of PDT as an additional resource. The evaluation result of 81% recall shows that systematically correct underspecified target structures can be obtained by a rule-based mapping approach, which is an indicator that RMRS is capable of representing Czech data. This finding is novel as Czech, with its free word order and rich morphology, is typologically different from the languages that have used RMRS thus far.
Question and Answer Classifier for closed domain Interactive Question Answering
Dinh, Le Thanh ; Schlesinger, Pavel (referee) ; Lopatková, Markéta (advisor)
Natural language processing has made great progress thanks to the application of statistical approaches and to the large amount of data available to train the systems. This progress is driven by several evaluation campaigns, which allow systems to be compared and progress to be measured. These evaluations are mostly based on data sets artificially developed by the organizers of such evaluation campaigns. In our work we show that, though useful, these data sets are biased and that there is a need to develop data generated in a more natural setting by real users. We take the classification of questions as a case study. In particular, we look at the classification of question types needed in Question Answering systems, and the classification of follow-up questions into topic continuation and topic shift needed in Interactive Question Answering. We evaluate classifiers first on TREC data and then on a corpus of real users' data. In both cases the performance of the classifiers drops significantly, showing the need to work on more user-centered systems. The results also show that the classifiers could be better fine-tuned by taking into account the new challenges that real users' data pose to NLP systems. We leave this for future research.
