Original title: Automatická identifikace citátů
Translated title: Automatic detection and attribution of quotes
Authors: Ustinova, Evgeniya ; Hana, Jiří (advisor) ; Vidová Hladká, Barbora (referee)
Document type: Master’s theses
Year: 2023
Language: eng
Abstract: Quotation extraction and attribution are important practical tasks for the media, but most existing solutions are monolingual. In this work, I present a complex machine learning-based system for extraction and attribution of direct and indirect quotations, which is trained on English data and tested on Czech and Russian data. The Czech and Russian test datasets were manually annotated as part of this study. The system is compared against a rule-based baseline model. The baseline model demonstrates better precision in the extraction of quotation elements, but low recall. The machine learning-based model is better overall at extracting both separate elements of quotations and full quotations.
Keywords: NLP|quotation extraction|quotation attribution|CRFs|article|annotation

Institution: Charles University Faculties (theses) (web)
Document availability information: Available in the Charles University Digital Repository.
Original record: http://hdl.handle.net/20.500.11956/181574

Permalink: http://www.nusl.cz/ntk/nusl-528862


 Record created 2023-07-09, last modified 2023-12-31


No fulltext