National Repository of Grey Literature: 15 records found (records 1 - 10 shown)
Methods of Information Extraction
Adamček, Adam ; Smrž, Pavel (referee) ; Kouřil, Jan (advisor)
The goal of information extraction is to retrieve relational data from texts written in natural language. Such information has a wide range of applications, from text summarization through ontology creation to question answering by QA systems. This work describes the design and implementation of a system that runs on a computer cluster and transforms a dump of Wikipedia articles into a set of extracted facts, which are stored in a distributed RDF database and can be queried through a purpose-built user interface.
Consistency Checking of Relations Extracted from Text
Stejskal, Jakub ; Otrusina, Lubomír (referee) ; Smrž, Pavel (advisor)
This bachelor thesis deals with automated techniques used in natural language processing and information extraction from text. It surveys general methods, starting with the processing of raw text and continuing to the extraction of relations from the processed language constructs, and it outlines uses of the resulting relational data, as seen for example in the DBpedia project. The thesis further covers the design and implementation of an automated system for extracting information about entities that do not have their own article in the English version of Wikipedia. It also presents the algorithms developed for extracting named entities, for verifying whether articles exist for the extracted entities, and for extracting information about individual entities, which can be used in information consistency checking. It concludes with the results and suggestions for further development of the system.
Tracking the Activation of Objects in Texts (Sledování aktivovanosti objektů v textech)
Václ, Jan ; Vidová Hladká, Barbora (advisor) ; Novák, Michal (referee)
The notion of salience in discourse analysis models how the activation of referred objects evolves in the flow of a text. The salience algorithm was defined and briefly tested in earlier research; we present a reproduction of its results on a larger scale using data from the Prague Discourse Treebank 1.0. The results are then collected into an accessible form and analyzed both visually and quantitatively in the context of the two main sources of salience: coreference relations and topic-focus articulation. Finally, we attempt to use the salience information in the machine learning NLP tasks of document clustering and topic modeling.
Coreference from the Cross-lingual Perspective
Novák, Michal ; Žabokrtský, Zdeněk (advisor) ; Stede, Manfred (referee) ; Rosen, Alexandr (referee)
The subject of this thesis is the study of the properties of coreference using cross-lingual approaches. The work is motivated by research on coreference-related linguistic typology. Another motivation is to explore whether differences in how languages express coreference can be exploited to build better models for coreference resolution. We design two cross-lingual methods: bilingually informed coreference resolution and coreference projection. The results of our experiments with these methods on Czech-English data suggest that, with respect to coreference, English is more informative for Czech than vice versa. Furthermore, the bilingually informed resolution applied to parallel texts outperformed the monolingual resolver in both languages. In the experiments, we employ a monolingual coreference resolver and an improved method for aligning coreferential expressions, both of which we also designed within the thesis.
Coreference resolution for Universal Dependencies
Faryad, Ján ; Novák, Michal (advisor) ; Rosa, Rudolf (referee)
Department: Institute of Formal and Applied Linguistics. Abstract: Coreference is an important means of maintaining text coherence. Until now, there has been no way to mark it in Universal Dependencies (UD), a project for the universal description of morphology and dependency syntax. This work presents a way to mark coreference in the UD project. It also includes a conversion of coreference-annotated data from the PDT 3.0 and OntoNotes 5.0 corpora, using the UDPipe tool for automatic analysis of text in the UD style. The work further aims to implement a system for the automatic resolution of pronoun coreference using machine learning. Finally, the quality of the system is evaluated in a simple way. The design of the program emphasizes language independence and compatibility with the Udapi interface, which is used for processing UD data. Keywords: coreference resolution, coreference, anaphora, Universal Dependencies, UD
Tracking the Activation of Objects in Texts (Sledování aktivovanosti objektů v textech)
Václ, Jan ; Vidová Hladká, Barbora (advisor) ; Žabokrtský, Zdeněk (referee)
The notion of salience in discourse analysis models how the activation of referred objects evolves in the flow of a text. The salience algorithm was defined and briefly tested in earlier research; we present a reproduction of its results on a larger scale using data from the Prague Dependency Treebank 3.0. The results are then collected into an accessible form and analyzed both visually and quantitatively in the context of the two main sources of salience: coreference relations and topic-focus articulation. Furthermore, the possibility of modeling the degree of salience with a machine learning algorithm (decision trees and random forest) is examined. Finally, we attempt to use the salience information in the machine learning NLP task of document clustering visualization.
Anaphoric Nominal Phrases in the Sport Journalism
PROVÁZKOVÁ, Martina
The aim of this work is to describe and analyse one type of anaphoric nominal syntagma found in articles in French sports newspapers. The work is divided into two parts, theoretical and practical. The theoretical part describes the basic concepts of text linguistics. It then focuses on the concepts of anaphora and nominal syntagma, with a brief reference to journalistic style. The practical part analyses anaphoric nominal syntagmas and describes their semantico-pragmatic relations.
Cataphora and its Functioning in the Journalistic texts
KRČKOVÁ, Martina
The theme of this thesis, written in French, is cataphora, one of the ways of referring within a text. The work begins with a theoretical part covering text linguistics, which is closely connected with the analyzed phenomenon. The term cataphora is then defined on the basis of the literature and subsequently analyzed in a selected corpus of journalistic texts.
Coreference in Text
Pecsők, Ján ; Vidová Hladká, Barbora (advisor) ; Novák, Michal (referee)
The goal of this bachelor thesis is to explore the possibilities of finding coreference with a rule-based approach that employs morphological and syntactic information. Coreference visualization and rule evaluation form a key part of the thesis. The application Koreferencie was developed to provide an environment for visualizing coreference and for creating and evaluating rules. A number of rules have been formulated and evaluated, and they are described in detail in the thesis. The last part contains user and programmer documentation.
