National Repository of Grey Literature: 35 records found, showing records 1 - 10
Object layout in a 2D room based on text description
Pavelka, Jan ; Rosa, Rudolf (advisor) ; Kasner, Zdeněk (referee)
This thesis presents a solution for generating a structured description of a 2D map of a room, seen from a bird's-eye view, based on a textual description in Czech. It focuses on identifying the physical objects in the description and their relative positions. It describes the linguistic phenomena involved in the information extraction and how they are used in the implementation, and it shows how syntactic parsing can be used for this task. It then uses a genetic algorithm to find a feasible layout of the extracted objects that respects the spatial constraints constructed from the extracted information.
Automatic inflection in Czech language
Sourada, Tomáš ; Rosa, Rudolf (advisor) ; Vidra, Jonáš (referee)
This thesis focuses on the task of automatic morphological inflection of Czech nouns, specifically in out-of-vocabulary (OOV) conditions (inflecting previously unseen words). We automatically extracted a large dataset suitable for training and evaluation in the OOV conditions. We also manually built a real-world OOV dataset of neologisms. We developed three different systems: a retrograde model performing a variation of the kNN algorithm, and two sequence-to-sequence (seq2seq) models based on LSTM and Transformer. Compared to an available rule-based inflection system, sklonuj.cz, and the standard SIGMORPHON shared task baselines, our seq2seq model reaches the best results in the standard OOV conditions. Moreover, it achieves state-of-the-art results for 6 out of 16 development languages from the SIGMORPHON 2022 shared task data in the OOV evaluation (feature overlap) in the large data condition. On the real-world OOV dataset, the retrograde model outperforms all neural models and is competitive with a non-neural SIGMORPHON baseline. We release the inflection system with the seq2seq model as a ready-to-use Python library. It could complement the state-of-the-art dictionary-based inflection system MorphoDiTa as a back-off for OOV words, especially once extended to other parts of speech.
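The abstract does not spell out how the retrograde model works, so the following is only a minimal sketch of one plausible reading: a nearest-neighbour lookup in which the "nearest" known lemma is the one sharing the longest word ending with the unseen lemma, and that neighbour's ending change is copied over. The function names, the tiny training dictionary (nominative to genitive singular), and the fallback behaviour are all assumptions made for illustration, not the thesis implementation.

```python
def common_suffix_len(a: str, b: str) -> int:
    """Number of characters the two words share, counted from the end."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def suffix_rule(lemma: str, form: str) -> tuple[str, str]:
    """Strip the shared prefix and return (lemma_ending, form_ending)."""
    i = 0
    while i < min(len(lemma), len(form)) and lemma[i] == form[i]:
        i += 1
    return lemma[i:], form[i:]


def inflect(lemma: str, train: dict[str, str]) -> str:
    """Inflect an unseen lemma by copying the ending change of its nearest neighbour."""
    # Nearest neighbour = known lemma with the longest common suffix (assumed metric).
    neighbour = max(train, key=lambda known: common_suffix_len(lemma, known))
    old, new = suffix_rule(neighbour, train[neighbour])
    if old and not lemma.endswith(old):
        return lemma  # the copied rule does not apply; fall back to the lemma itself
    stem = lemma[: len(lemma) - len(old)]
    return stem + new


# Hypothetical toy training data: nominative singular -> genitive singular.
TRAIN = {"žena": "ženy", "hrad": "hradu", "kost": "kosti", "město": "města"}

print(inflect("vakcína", TRAIN))   # neighbour "žena", rule a -> y
print(inflect("selfíčko", TRAIN))  # neighbour "město", rule o -> a
```

With only one neighbour and four training words the sketch is easily fooled (a word ending in "-st" would copy the pattern of "kost" regardless of gender), which hints at why a real system would need a large dataset and more than one neighbour.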
Development of a mobile application and question generator for the game Smart10.
Tomiška, Tadeáš ; Mareček, David (advisor) ; Rosa, Rudolf (referee)
This bachelor's thesis focuses on creating a mobile application for Android that allows playing an online version of the game Smart10. The work also includes generating questions for the game from Wikipedia pages, using web page parsing to obtain the necessary data. The application will be written in Java and will target Android versions 10 and higher. A client-server architecture will be used for communication between devices, with communication over Wi-Fi. The application will follow the rules of the game Smart10 and will support two game modes: an online mode with other players and a friend mode with friends.
Tackling Hallucinations in Chart Summarization
Obaid ul Islam, Saad ; Dušek, Ondřej (advisor) ; Rosa, Rudolf (referee)
Affiliation: Charles University, Saarland University. Information visualizations like bar charts, line charts, and pie charts are a common way of communicating quantitative data. They are used to gain important insights and make well-informed decisions. Automatic chart summarization is the task of explaining and summarizing the key takeaways from a chart. Like other natural language generation (NLG) systems, chart summarization systems suffer from a phenomenon called hallucination. Hallucinations occur when the system generates text that is not grounded in the input. In this research work, we try to tackle the problem of hallucinations in chart summarization. Our analysis shows that the training data contains a lot of additional information that leads to hallucinations during inference. We also found that reducing long-distance dependencies and adding chart-related information such as titles and legends improve the overall performance of the system. Furthermore, we propose a natural language inference (NLI) based method to clean the training data and show that our method produces faithful summaries.
Automatic extraction of the main characters from books and their interactions
Brezinová, Viktória ; Mareček, David (advisor) ; Rosa, Rudolf (referee)
The goal of this work is to automatically find named characters in books, detect all occurrences of these characters, and determine the places in the text where two or more characters interact. One of the outputs of this work is a tool for displaying interactive graphs that show the occurrences and interactions of the characters throughout the book. Because the graphs are linked to the text of the book, the tool can be used to search and analyze the places where occurrences and interactions take place. We also evaluated our methods on unseen texts, analyzed errors, and proposed improvements that could be explored in future work.
Automatic generation of medical reports from chest X-rays in Czech
Chaloupský, Lukáš ; Rosa, Rudolf (advisor) ; Libovický, Jindřich (referee)
This thesis deals with the problem of automatically generating medical reports in Czech from input chest X-ray images using deep neural networks. The first part analyzes the problem itself, including a comparison of existing solutions from several common points of view. In order to interpret medical images in Czech, we present a fine-tuned Czech GPT-2 model specialized in medical texts, based on the original pre-trained English GPT-2 model, along with its evaluation. In the second part, the resulting Czech GPT-2 is used to train a neural network model for generating medical reports. The training was conducted on freely available data, which were preprocessed and adjusted for the Czech language. Finally, the model results are discussed and evaluated using standard natural language processing metrics.
Analysis and visualization of the GPT-2 language model
Šipoš, Daniel ; Mareček, David (advisor) ; Rosa, Rudolf (referee)
Visualization of deep neural network models with the Transformer architecture is generally a very demanding task, which is usually solved by visualizing attention blocks and monitoring which words these blocks focus on. However, Transformer models have many layers, and there are multiple attention heads on each layer, so each head may attend to different linguistic features. In this work, we focus on developing an application designed to visualize the behaviour of GPT-2 language models more clearly. We propose four visualization methods that examine the dependencies of generated words on previous words in the text. We monitor these dependencies by removing one of the words in the previously generated text, or replacing it with a similar word, and then observing changes in the probability of the generated word. We show the results of our methods on the GPT-2 Medium model and formulate hypotheses that aim to explain them.
Automatic post-editing of phrase-based machine translation outputs
Rosa, Rudolf ; Mareček, David (advisor) ; Žabokrtský, Zdeněk (referee)
We present Depfix, a system for automatic post-editing of phrase-based English-to-Czech machine translation outputs, based on linguistic knowledge. First, we analyzed the types of errors that a typical machine translation system makes. Then, we created a set of rules and a statistical component that correct errors which are common or serious and have the potential to be corrected by our approach. We use a range of natural language processing tools to provide us with analyses of the input sentences. Moreover, we reimplemented the dependency parser and adapted it in several ways to parsing statistical machine translation outputs. We performed both automatic and manual evaluations, which confirmed that our system improves the quality of the translations.
Web Interface for the Treex Framework
Sedlák, Michal ; Popel, Martin (advisor) ; Rosa, Rudolf (referee)
This work deals with a web application called Treex::Web, which serves as a web interface for the NLP framework Treex. The work addresses several Treex issues (e.g. the absence of a graphical user interface and a complicated installation) and offers Treex::Web as a possible solution. At the beginning of this work we introduce the Treex framework itself. The following chapters describe Treex::Web's user interface (chapter 3) and the implementation of the whole web application (chapter 4). The conclusion includes a comparison of NLP frameworks similar to Treex and their web interfaces.
Generation of tennis singles results
Prokop, Dominik ; Rosa, Rudolf (advisor) ; Kasner, Zdeněk (referee)
This thesis deals with the problem of generating short articles about past tennis singles matches from structured data. Articles are generated using templates heuristically obtained from original news articles. The result of this thesis is a template database and a graphical application that uses these templates to generate short articles corresponding to a given input. The thesis also includes an evaluation of the application's outputs, which shows that 64 % of the generated texts are correct.
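As a rough illustration of template-based generation from structured data, the sketch below fills the first template whose slots are all present in a match record. The templates, the field names (`winner`, `loser`, `score`, `round`, `tournament`), and the "most specific template first" selection rule are hypothetical; the abstract does not describe the thesis's template format or how templates are chosen.

```python
from string import Formatter

# Hypothetical templates, ordered from most to least specific.
TEMPLATES = [
    "{winner} beat {loser} {score} in the {round} of the {tournament}.",
    "{winner} beat {loser} {score}.",
]


def required_fields(template: str) -> set:
    """Names of all slots that must be filled for this template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def generate(match: dict):
    """Fill the first template whose slots are all available in the match data."""
    for template in TEMPLATES:
        if required_fields(template).issubset(match):
            return template.format(**match)
    return None  # no template fits the available data


# Hypothetical input record with placeholder names.
full = {
    "winner": "Player A",
    "loser": "Player B",
    "score": "6-4, 6-3",
    "round": "final",
    "tournament": "Example Open",
}
print(generate(full))
```

Checking the slots before filling avoids emitting a sentence with missing facts, which is one simple way a template system can trade coverage for correctness.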
