National Repository of Grey Literature: 3 records found
Classification of Relations between Named Entities in Text
Ondřej, Karel ; Doležal, Jan (referee) ; Smrž, Pavel (advisor)
This master's thesis deals with the extraction of relationships between named entities in text. The theoretical part discusses how natural language is represented for machine processing. It then defines two subtasks of relationship extraction, namely named entity recognition and classification of the relationships between entities, and summarizes state-of-the-art solutions for each. The practical part designs a system for the automatic extraction of relationships between named entities from downloaded pages. The classification of relationships between entities is based on pre-trained transformers. Four pre-trained transformers are compared: BERT, XLNet, RoBERTa and ALBERT.
Document Information Extraction
Janík, Roman ; Špaňhel, Jakub (referee) ; Hradiš, Michal (advisor)
With the growth of digitization comes the need to analyze historical documents. Named entity recognition is an important task for information extraction and data mining. The goal of this thesis is to develop a system for extracting information from Czech historical documents such as newspapers, chronicles, and registry books. An information extraction system was designed whose input is scanned historical documents processed by an OCR algorithm. The system is based on a modified RoBERTa model. Extracting information from Czech historical documents poses challenges, in particular the need for a suitable corpus of historical Czech. The system was trained on the Czech Named Entity Corpus (CNEC) and the Czech Historical Named Entity Corpus (CHNEC), together with a corpus I created myself. The system achieves an F1 score of 88.85 on CNEC and 87.19 on CHNEC. This is an improvement of 1.36 F1 on CNEC and 5.19 F1 on CHNEC, and therefore the best known results.
