National Repository of Grey Literature: 35 records found (records 11 - 20 shown)
Computational analysis and synthesis of song lyrics
Březinová, Patrícia ; Popel, Martin (advisor) ; Rosa, Rudolf (referee)
We explore a dataset of almost half a million English song lyrics through three different processes: automatic evaluation, visualization, and generation. We create our own rhyme detector, using the EM algorithm with several improvements and adjustable parameters. In some cases, this detector may replace human evaluators where they cannot be used, for example to evaluate the improvement of the lyrics generator after each iteration. By creating a web-page visualization of the results with matrix rhyme highlighting, we make our evaluation accessible to the public. We discuss genre differences discovered by applying our automatic evaluation to the entire dataset. Finally, we explore lyrics generation using the state-of-the-art GPT-2 model.
Automatic generation of crosswords
Jurášová, Daniela ; Rosa, Rudolf (advisor) ; Novák, Michal (referee)
The aim of this work is to present a program for automatic generation of thematic crosswords. The program generates a specific type of crossword puzzle, where a few words in columns placed right next to each other create a secret word in a specified row. The resulting program is in the form of a computer application with a simple graphical environment. The user initially chooses a crossword puzzle topic from the available list. The program then uses a freely accessible WordNet database to generate a crossword puzzle containing words related to the topic. At the same time, it will create a legend for solving the crossword puzzle. Compared to the original assignment, the current proposal is extended by the possibility to define your own topic. The word database is extended by Slovak WordNet, therefore the user can choose between two languages of words filled in the crossword puzzle.
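The layout described in this abstract, where each column holds one word and the secret word is read along a single row, can be sketched as follows. This is only an illustrative Python sketch, not the thesis's implementation: the function name `build_secret_crossword`, the greedy word choice, and the plain word list standing in for WordNet lookups are all assumptions.

```python
def build_secret_crossword(secret, vocabulary):
    """Pick one word per secret letter and align the columns on one row.

    Returns (grid, secret_row), where grid is a list of rows and each row
    is a list of single characters (" " marks an empty cell).
    """
    used = set()
    columns = []  # pairs of (word, index of the secret letter in that word)
    for letter in secret:
        for word in vocabulary:
            # Greedy choice (an assumption): take the first unused word
            # that contains the required letter.
            if word not in used and letter in word:
                used.add(word)
                columns.append((word, word.index(letter)))
                break
        else:
            raise ValueError(f"no unused word contains {letter!r}")

    # The secret row must sit low enough for the word whose secret letter
    # is furthest from its first letter.
    secret_row = max(pos for _, pos in columns)
    height = max(secret_row - pos + len(word) for word, pos in columns)

    grid = [[" "] * len(columns) for _ in range(height)]
    for col, (word, pos) in enumerate(columns):
        top = secret_row - pos  # shift the word so its letter hits the row
        for i, ch in enumerate(word):
            grid[top + i][col] = ch
    return grid, secret_row


if __name__ == "__main__":
    # Hypothetical topic vocabulary; a real system would query WordNet.
    grid, row = build_secret_crossword("cat", ["ocean", "lamp", "tree"])
    for r, cells in enumerate(grid):
        marker = "<" if r == row else ""
        print(" ".join(cells), marker)
```

A real generator would also need to rank candidate words by their relation to the chosen topic and produce the clues (the legend); both are left out of this sketch.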
End-to-end dialogue systems with pretrained language models
Kulhánek, Jonáš ; Dušek, Ondřej (advisor) ; Rosa, Rudolf (referee)
Current dialogue systems typically consist of separate components, which are manually engineered to a large extent and need extensive annotation. End-to-end trainable systems exist but produce lower-quality, unreliable outputs. The recent transformer-based pre-trained language models such as GPT-2 brought considerable progress to language modelling, but they rely on huge amounts of textual data, which are not available for common dialogue domains. Therefore, training these models runs a high risk of overfitting. To overcome these obstacles, we propose a novel end-to-end dialogue system called AuGPT. We add auxiliary training objectives to use training data more efficiently, and we use massive data augmentation via back-translation and pretraining on multiple datasets to increase data volume and diversity. We evaluate our system using automatic methods (corpus-based metrics, user simulation), human evaluation as part of the DSTC 9 shared task challenge (where our system placed 3rd out of 10), as well as extensive manual error analysis. Our method substantially outperforms the baseline on the MultiWOZ benchmark and shows competitive results with state-of-the-art end-to-end dialogue systems.
Generating text from structured data
Trebuňa, František ; Rosa, Rudolf (advisor) ; Kasner, Zdeněk (referee)
In this thesis, we examine ways of conditionally generating document-scale natural language text given structured input data. Specifically, we train Deep Neural Network models on the RotoWire dataset, which contains statistical data about basketball matches paired with descriptive summaries. First, we analyse the dataset and propose several preprocessing methods (e.g. Byte Pair Encoding). Next, we train a baseline model based on the Encoder-Decoder architecture on the preprocessed dataset. We discuss several problems of the baseline and explore advanced Deep Neural Network architectures that aim to solve them (Copy attention, Content Selection, Content Planning). We hypothesize that our models are not able to learn the structure of the input data and we propose a method reducing its complexity. Our best model trained on the simplified data manages to outperform the baseline by more than 5 BLEU points.
Quote Attribution and Character Networks in Novels
Urbanová, Zuzana ; Rosa, Rudolf (advisor) ; Kyjánek, Lukáš (referee)
This thesis focuses on extracting information from literary works using tools for language analysis. Our goal is to automatically extract a conversational network of the characters in a novel. We divide the work into three subproblems and solve them separately: Character Extraction, Quote Attribution and Network Creation. The result is an end-to-end tool that gets a text of a novel in English and outputs a visual representation of the character network. Our work is based on existing literature. It presents new ideas and compares the accuracy of various methods for each subproblem.
Biblical paraphrasing
Michálek, Ondřej ; Rosa, Rudolf (advisor) ; Barančíková, Petra (referee)
In this bachelor's thesis, I deal with the paraphrasing of text at the word level, namely the paraphrasing of biblical texts in Czech. Paraphrasing should also include modernising the text to make it easier for the average reader to understand. For this purpose, I have developed a desktop user application in which I use tools to analyze and generate language: Word2vec and MorphoDiTa. In my work, I compare the results of each approach to paraphrasing. The evaluation of the results follows several criteria and compares the effectiveness of the methods used on verses from two Czech translations: Bible kralická and Český ekumenický překlad.
Generator of computer descriptions
Matějka, Jan ; Rosa, Rudolf (advisor) ; Dušek, Ondřej (referee)
This thesis deals with the problem of generating coherent and well-formed sentences from structured data. The goal of the thesis is to create a tool that makes it easier to generate brief descriptions of electronics from parameters given as structured data. The tool can be useful, for example, to e-shops selling such electronics. The first part of the thesis introduces possible solutions to this problem. The thesis next describes the data needed for solving the problem, including the ways of acquiring such data and the structure of the data. Two selected solutions are then described, including their implementation. The thesis then examines the advantages and disadvantages of the selected solutions and evaluates texts generated by the created tool.
Generating text descriptions of journeys in a map
Svobodová, Zuzana ; Rosa, Rudolf (advisor) ; Mareček, David (referee)
This thesis aims to present the key aspects of a program developed for the purpose of improving orientation in maps by generating text descriptions of routes. Even though such tools are already available and integrated within the most widely used internet map engines (e.g. maps.google.com and mapy.cz), they are not particularly user-friendly, as they rely on directions and distances. People are, on the other hand, more inclined to use landmarks such as significant buildings for orientation and to synthesize simple pieces of information into more complex ones. The program presented in this thesis addresses this issue and offers more intuitive route descriptions, enabling its user to reach his/her destination potentially faster and more reliably.
Eye-tracking features in syntactic parsing
Agrawal, Abhishek ; Rosa, Rudolf (advisor) ; Straková, Jana (referee)
In this thesis, we explore the potential benefits of leveraging eye-tracking information for dependency parsing on the English part of the Dundee corpus. To achieve this, we cast dependency parsing as a sequence labelling task and then augment the neural model for sequence labelling with eye-tracking features. We also augment a graph-based parser with eye-tracking features and parse the Dundee Corpus to corroborate our findings from the sequence labelling parser. We then experiment with a variety of parser setups ranging from parsing with all features to a delexicalized parser. Our experiments show that, for a parser with all features, the improvements in the LAS score are positive but not significant, whereas our delexicalized parser significantly outperforms the baseline we established. We also analyze the contribution of various eye-tracking features to the different parser setups and find that eye-tracking features contain information which is complementary in nature, thus implying that augmenting the parser with various gaze features grouped together provides better performance than any individual gaze feature.
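One way to picture casting dependency parsing as sequence labelling is to give every token a single label that encodes its head and relation, so that an ordinary tagger can predict the tree. The Python sketch below assumes a relative-offset encoding (head position minus token position, with 0 reserved for the root). The abstract does not say which encoding the thesis uses, and the names `encode_tree` and `decode_labels` are hypothetical.

```python
def encode_tree(heads, relations):
    """Turn a dependency tree into one label per token.

    heads[i] is the 1-based position of the head of token i+1 (0 = root);
    relations[i] is the dependency relation of token i+1.
    """
    labels = []
    for position, (head, relation) in enumerate(zip(heads, relations), start=1):
        # Assumed encoding: signed distance to the head. Offset 0 is free
        # to mark the root because no token can be its own head.
        offset = 0 if head == 0 else head - position
        labels.append(f"{offset:+d}@{relation}")
    return labels


def decode_labels(labels):
    """Recover (heads, relations) from the per-token labels."""
    heads, relations = [], []
    for position, label in enumerate(labels, start=1):
        offset_text, relation = label.split("@", 1)
        offset = int(offset_text)
        heads.append(0 if offset == 0 else position + offset)
        relations.append(relation)
    return heads, relations


if __name__ == "__main__":
    # "She reads books": "reads" is the root, the other tokens attach to it.
    heads = [2, 0, 2]
    relations = ["nsubj", "root", "obj"]
    labels = encode_tree(heads, relations)
    print(labels)
    print(decode_labels(labels) == (heads, relations))
```

In a setup like the one in the abstract, gaze measurements would enter as extra input features of the tagger that predicts these labels; that part is not modelled here.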
