National Repository of Grey Literature: 46 records found, showing records 1 - 10.
Project Financing by Cohesion Fund
Jurenková, Eva ; Dušek, Ondřej (referee) ; Koleňák, Jiří (advisor)
This bachelor thesis deals with the issue of drawing subsidies from the Cohesion Fund through the Operational Programme Environment, focusing primarily on the programme's first priority theme. The aim of the thesis is to prepare a model subsidy application for a project to construct a sewerage system and water purification plant in Sloupnice.
Low-resource methods for dialogue systems applications
Hudeček, Vojtěch ; Dušek, Ondřej (advisor) ; Skantze, Gabriel (referee) ; Schwarz, Petr (referee)
This thesis focuses on developing and improving the design of task-oriented dialogue systems in the rapidly growing landscape of artificial intelligence and natural language processing. We propose techniques that can substantially decrease development and deployment costs, motivated by the desire to make these systems more adaptable and scalable. We introduce multiple novel approaches to achieving these goals. First, we present a weakly supervised automatic data annotation pipeline that can transform raw dialogue transcripts into a refined set of semantically coherent concepts, bypassing the need for exhaustive manual annotations in natural language understanding for a given domain and significantly streamlining the development process. We also explore the largely uninvestigated field of latent variable models in task-oriented dialogue system modeling. These models have the potential to uncover the structure of behavioral patterns seen in the dialogue through inspection of the latent space and comparison with actions taken by the model. Furthermore, we explore the potential of these models to form hierarchical representations using our proposed architecture. Following recent progress in the field, we harness the power of pre-trained large language models using in-context learning. We...
Learning capabilities in Transformer Neural Networks
Variš, Dušan ; Bojar, Ondřej (advisor) ; Sennrich, Rico (referee) ; Dušek, Ondřej (referee)
Title: Learning Capabilities of the Transformer Neural Networks Author: Dušan Variš Department: Institute of Formal and Applied Linguistics Supervisor: doc. RNDr. Ondřej Bojar, Ph.D., Institute of Formal and Applied Linguistics Abstract: Although the contemporary neural networks, inspired by biological neurons, were able to reach human-like performance on many tasks in recent years, their optimization (learning) process is still very far from the one observed in humans. This thesis investigates various aspects of learning in the current state-of-the-art Transformer neural networks, the dominant architecture in the current neural language processing. Firstly, we measure the level of generalization in Transformers using several probing experiments based on the idea of adversarial evaluation. Secondly, we explore their potential for incremental learning when combined with regularization using the elastic weight consolidation approach. Lastly, we propose a modular extension of the existing Transformer architecture enabling subnetwork selection conditioned on the intermediate hidden layer outputs and analyze the attributes of this network modularization. We investigate our hypotheses mainly within the scope of neural machine translation and multilingual translation showing the limitations of the...
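For orientation, elastic weight consolidation is usually written as a quadratic penalty added to the loss of a new task. The form below is the commonly cited one, not a formula quoted from the thesis; the symbols (old task A, new task B, regularization strength \lambda, diagonal Fisher information F_i, and parameters \theta^{*}_{A,i} learned on task A) are notation assumed here, and the thesis may use a variant.

```latex
\mathcal{L}(\theta) \;=\; \mathcal{L}_{B}(\theta)
  \;+\; \sum_{i} \frac{\lambda}{2}\, F_{i}\, \bigl(\theta_{i} - \theta^{*}_{A,i}\bigr)^{2}
```

Parameters with a large F_i were important for the old task, so moving them away from their old values is penalized more heavily.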
Practical neural dialogue management using pretrained language models
Šafář, Jaroslav ; Dušek, Ondřej (advisor) ; Bojar, Ondřej (referee)
Task-oriented dialogue systems pose a significant challenge due to their complexity and the need to handle components such as language understanding, state tracking, action selection, and language generation. In this work, we explore the improvements in dialogue management using pretrained language models. We propose three models that incorporate pretrained language models, aiming to provide a practical approach to designing dialogue systems capable of effectively addressing the language understanding, state tracking, and action selection tasks. Our dialogue state tracking model achieves a joint goal accuracy of 74%. We also identify limitations in handling complex or multi-step user requests in the action selection task. This research underscores the potential of pretrained language models in dialogue management while highlighting areas for further improvement.
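For readers unfamiliar with the metric quoted in this abstract: joint goal accuracy is commonly defined as the fraction of dialogue turns whose predicted state matches the reference state on every slot. The sketch below illustrates that common definition only. The function name, the dictionary representation of a state, and the example slots are assumptions made for illustration and are not taken from the thesis.

```python
from typing import Dict, List

# A dialogue state is modelled here as a slot -> value mapping (an assumption).
State = Dict[str, str]


def joint_goal_accuracy(predicted: List[State], reference: List[State]) -> float:
    """Fraction of turns whose predicted state equals the reference state exactly."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must cover the same turns")
    if not reference:
        return 0.0
    # A turn counts as correct only if every slot and value matches.
    correct = sum(1 for p, r in zip(predicted, reference) if p == r)
    return correct / len(reference)


# Hypothetical four-turn dialogue.
reference = [
    {"food": "italian"},
    {"food": "italian", "area": "centre"},
    {"food": "italian", "area": "centre", "price": "cheap"},
    {"food": "italian", "area": "north", "price": "cheap"},
]
predicted = [
    {"food": "italian"},
    {"food": "italian", "area": "centre"},
    {"food": "italian", "area": "centre"},  # price slot missing: whole turn is wrong
    {"food": "italian", "area": "north", "price": "cheap"},
]
jga = joint_goal_accuracy(predicted, reference)  # 3 of 4 turns match
```

Because one missing slot invalidates the whole turn, the metric is stricter than per-slot accuracy.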
Data-to-text generation with text-editing models
Grajcar, Peter ; Dušek, Ondřej (advisor) ; Variš, Dušan (referee)
We explore the use of different model extensions of the FELIX neural transformer-based text-editing model for data-to-text generation. Our approach is based on iterative text editing: we transform the individual items of the input data into short sentences using trivial templates and then iteratively improve the text by fusing the sentences using a text-editing model. Our extensions include replacing FELIX's non-autoregressive decoder with an autoregressive transformer decoder, extending the decoding so that it can preserve the input data in the output text, and adding a pointer network-based clause-level reordering mechanism. Furthermore, we propose our own new versions of the WebNLG and DiscoFuse datasets for training the text-editing models. We evaluate our models on the WebNLG dataset with automatic metrics and manually analyse the outputs of selected models.
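The template-then-fuse loop described in this abstract can be sketched roughly as follows. This is only a toy illustration: the fusion step is a hand-written rule standing in for the neural text-editing model, and all names (`verbalize`, `fuse_same_subject`, `generate`) and the example triples are hypothetical, not from the thesis or from FELIX.

```python
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate phrase, object)


def verbalize(triple: Triple) -> str:
    """Trivial template: one short sentence per data item."""
    subj, pred, obj = triple
    return f"{subj} {pred} {obj}."


def fuse_same_subject(text: str, sentence: str) -> str:
    """Rule-based stand-in for a learned text-editing model.

    If both inputs start with the same word, drop the repeated subject and
    join with 'and'; otherwise just concatenate the two.
    """
    left = text.rstrip(".").split()
    right = sentence.rstrip(".").split()
    if left and right and left[0] == right[0]:
        return " ".join(left + ["and"] + right[1:]) + "."
    return f"{text} {sentence}"


def generate(triples: List[Triple], fuse: Callable[[str, str], str]) -> str:
    """Verbalize every item, then fold the sentences together one at a time."""
    sentences = [verbalize(t) for t in triples]
    if not sentences:
        return ""
    text = sentences[0]
    for sentence in sentences[1:]:
        text = fuse(text, sentence)
    return text


triples = [("Alice", "works at", "Acme"), ("Alice", "lives in", "Brno")]
output = generate(triples, fuse_same_subject)
```

Passing the fusion step in as a callable keeps the loop independent of how fusion is done, so a trained model could replace the rule without changing `generate`.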
Gender stereotypes in neural sentence representations
Al Ali, Adnan ; Libovický, Jindřich (advisor) ; Dušek, Ondřej (referee)
Neural networks have seen a spike in popularity in natural language processing in recent years. They consistently outperform the traditional methods and require less human labor to perfect as they are trained unsupervised on large text corpora. However, these corpora may contain unwanted elements such as biases. We inspect multiple language models, primarily focusing on a Czech monolingual model, RobeCzech. In the first part of this work, we present a dynamic benchmarking tool for identifying gender stereotypes in a language model. We present the tool to a group of annotators to create a dataset of biased sentences. In the second part, we introduce a method of measuring the model's perceived political values of men and women and compare them to real-world data. We argue that our proposed method provides significant advantages over other methods known to us. We find no strong systematic beliefs or gender biases in the measured political values. We include all the code and created datasets in the attachment.
Neural Concept-to-text Generation with Knowledge Graphs
Szabová, Kristína ; Dušek, Ondřej (advisor) ; Libovický, Jindřich (referee)
Modern language models are strong at generating grammatically correct, natural language. However, they still struggle with commonsense reasoning - a task involving making inferences about common everyday situations without explicitly stated information. Prior research into the topic has shown that providing additional information from external sources helps language models generate better outputs. In this thesis, we explore methods of extracting information from knowledge graphs and using it as additional input for a pre-trained generative language model. We do this by either extracting a subgraph relevant to the context or by using graph neural networks to predict which information is relevant. Moreover, we experiment with a post-editing approach and with a model trained in a multi-task setup (generation and consistency classification). Our methods are evaluated on the CommonGen benchmark for generative commonsense reasoning using both automatic metrics and a detailed error analysis on a small sample of outputs. We show that the methods improve over a simple language model fine-tuning baseline, although they do not set a new state of the art.
Normalization of numbers into spoken form for text-to-speech systems
Růžička, Jakub ; Dušek, Ondřej (advisor) ; Peterek, Nino (referee)
Title: Normalization of numbers into spoken form for text-to-speech systems Author: Jakub Růžička Institute: Institute of Formal and Applied Linguistics Supervisor: Mgr. et Mgr. Ondřej Dušek, Ph.D., Institute of Formal and Applied Linguistics Abstract: A necessary part of any text-to-speech system is the normalization of numbers and words containing numbers. The accuracy of this process can significantly affect the quality of the resulting speech. The main goal of this work is the design and implementation of a number normalization module for Czech. Words containing digits are first assigned to one of the predefined categories. Based on the category given, possible spoken forms are subsequently generated. For the selection of the contextually correct variant, an existing language model is used. The system is distributed as a Python package and can run on Linux or in a Docker container whose configuration is part of the project. Moreover, a specialized data annotation application has been designed and written for creating the datasets for the Czech text normalization task. Two datasets with 1,882 sentences and 3,185 words requiring normalization were obtained using the data annotation service. The system achieved a sentence-level accuracy of over 80% on both datasets. We perform a detailed error...
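The three stages named in this abstract (categorize the token, generate candidate spoken forms, pick the variant that fits the context) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two categories, the tiny variant table, and the bigram-counting scorer that stands in for a real language model. None of it reflects the actual package described in the thesis.

```python
import re
from typing import Callable, Dict, List

# Hypothetical table of spoken variants per category and written form.
VARIANTS: Dict[str, Dict[str, List[str]]] = {
    "cardinal": {"5": ["pět", "pěti"]},
    "ordinal": {"5.": ["pátý", "pátého", "pátém"]},
}

# Hypothetical stand-in for language-model knowledge: bigrams that sound right.
PREFERRED_BIGRAMS = {("bez", "pěti"), ("mám", "pět")}


def categorize(token: str) -> str:
    """Stage 1: assign a token containing digits to a predefined category."""
    if re.fullmatch(r"\d+\.", token):
        return "ordinal"
    if re.fullmatch(r"\d+", token):
        return "cardinal"
    return "other"


def candidates(token: str) -> List[str]:
    """Stage 2: generate possible spoken forms; unknown tokens pass through."""
    return VARIANTS.get(categorize(token), {}).get(token, [token])


def toy_score(sentence: List[str]) -> float:
    """Stage 3 scorer: counts preferred bigrams instead of querying a real model."""
    return float(sum((a, b) in PREFERRED_BIGRAMS for a, b in zip(sentence, sentence[1:])))


def normalize(tokens: List[str], score: Callable[[List[str]], float]) -> List[str]:
    """Replace each numeric token with its highest-scoring spoken variant."""
    out = list(tokens)
    for i, token in enumerate(tokens):
        options = candidates(token)
        if len(options) > 1:
            out[i] = max(options, key=lambda o: score(out[:i] + [o] + list(tokens[i + 1:])))
    return out


genitive = normalize(["bez", "5", "minut"], toy_score)
accusative = normalize(["mám", "5", "knih"], toy_score)
```

The example shows why context matters in an inflected language: the same written "5" is read as "pěti" after "bez" and as "pět" after "mám".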

See also: similar author names
19 DUŠEK, Ondřej
7 Dušek, Otakar
4 Dušek, Oto