National Repository of Grey Literature 1 records found  Search took 0.01 seconds. 
Czech NLP with Contextualized Embeddings
Vysušilová, Petra ; Straka, Milan (advisor) ; Hajič, Jan (referee)
With the increasing amount of digital data in the form of unstructured text, the importance of natural language processing (NLP) increases. The most suc- cessful technologies of recent years are deep neural networks. This work applies the state-of-the-art methods, namely transfer learning of Bidirectional Encoders Representations from Transformers (BERT), on three Czech NLP tasks: part- of-speech tagging, lemmatization and sentiment analysis. We applied BERT model with a simple classification head on three Czech sentiment datasets: mall, facebook, and csfd, and we achieved state-of-the-art results. We also explored several possible architectures for tagging and lemmatization and obtained new state-of-the-art results in both tagging and lemmatization with fine-tunning ap- proach on data from Prague Dependency Treebank. Specifically, we achieved accuracy 98.57% for tagging, 99.00% for lemmatization, and 98.19% for joint accuracy of both tasks. Best models for all tasks are publicly available. 1

Interested in being notified about new results for this query?
Subscribe to the RSS feed.