National Repository of Grey Literature: 132 records found, showing 1-10.
Speech-Informed Inverse Text Normalization
Stankov, Vladislav ; Bojar, Ondřej (advisor) ; Plátek, Ondřej (referee)
In the domain of Automatic Speech Recognition (ASR), Inverse Text Normalization (ITN) is applied after the speech recognition step to transform recognized verbalized text into written form. This process includes converting verbalized numbers into digits, formatting dates and monetary amounts, applying correct capitalization, and inserting punctuation marks. Since ITN systems serve as post-processing modules for ASR outputs, the original audio can also be integrated into the ITN system as an additional signal. In this thesis, we explore the impact of the speech signal on the performance of neural ITN models and create a dataset for training and evaluating speech-informed ITN models. Our best model demonstrates a significant improvement in the precision and recall of inserting periods, commas, and question marks, as well as in adding letter casing, compared to the text-only baseline. Improvements are also observed for less frequent punctuation symbols, though they are not statistically significant.
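To illustrate the text-only ITN step the abstract describes, here is a minimal rule-based sketch. The word-to-digit map and the `inverse_normalize` function are illustrative assumptions, not the thesis's actual system, which is neural and also consumes the audio signal:

```python
# Illustrative verbalized-number map; a real ITN system uses a full grammar.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def inverse_normalize(text: str) -> str:
    """Convert simple verbalized tokens in ASR output to written form."""
    tokens = text.split()
    out = [NUMBER_WORDS.get(tok.lower(), tok) for tok in tokens]
    result = " ".join(out)
    # Capitalize the first character as a crude stand-in for truecasing.
    return result[:1].upper() + result[1:] if result else result

# inverse_normalize("call me at five five five") -> "Call me at 5 5 5"
```

A speech-informed variant would additionally condition such decisions (e.g. whether "five five five" is a number or a list) on prosodic cues from the audio.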
Self-Supervised Summarization via Reinforcement Learning
Kripner, Matěj ; Bojar, Ondřej (advisor) ; Straka, Milan (referee)
In deep learning, summarization models are traditionally trained using a maximum likelihood objective with reference summaries. Another line of work explores self-supervised approaches that do not require and are not limited by references. In this thesis, we opt for the latter approach. Our main contributions include the design of a novel dense reward function for summarization and its application for fine-tuning a sequence-to-sequence model via reinforcement learning. We build the whole training pipeline in a modular fashion, separately evaluating and tuning a supervised pre-training module, the reinforcement learning algorithm, and the reward function. After connecting all these components together, we also tune our self-learning approach as a whole. We evaluate the final checkpoints using 12 automatic and 3 manual metrics, revealing an improvement in reference-free metrics in nearly all cases.
Multi-Source Simultaneous Speech Translation
Macháček, Dominik ; Bojar, Ondřej (advisor) ; Yvon, Francois (referee) ; Niehues, Jan (referee)
Neural machine translation has the capability to translate from several parallel inputs into different languages. Current simultaneous speech translation sometimes faces issues with quality, especially when the source is noisy. We investigate the opportunity to use multiple parallel speech signals - the original and simultaneous interpreting - as sources for translation to achieve higher quality. We create an evaluation set ESIC (Europarl Simultaneous Interpreting Corpus). We analyze the challenges of simultaneous interpreting when used as an additional parallel source. Then, we investigate the robustness of multi-sourcing to transcription errors and assess the reliability of machine translation metrics when evaluating simultaneous speech translation. Last but not least, we implement Whisper-Streaming, a tool that enables real-time processing of large offline speech-to-text models and demonstrates the state of the art.
Algonauts challenge 2023: predicting human fMRI activity in response to visual stimulation
Petliak, Nataliia ; Antolík, Ján (advisor) ; Bojar, Ondřej (referee)
In this thesis, we investigate the application of pretrained Deep Neural Networks, particularly Vision Transformers (ViT), for predicting human fMRI activity in response to visual stimulation. The Algonauts Challenge 2023 dataset, serving as a large-scale benchmark of human fMRI data, allows us to assess the performance of ViT in comparison with established CNN architectures like VGG and ResNet. Our study highlights the complexity of this task, especially in accurately modeling the diverse regions of the full visual cortex. We identify specific ViT layers that align with the brain's hierarchical processing and prove to be the most predictive. However, one of the limitations we encounter with pretrained ViT is its reduced adaptability due to inherent subject variability. This limitation underscores the challenge in developing a single model that is universally effective across different individuals. To address this, we implement an iterative training strategy, starting with the layers that perform best across all subjects, followed by fine-tuning for specific visual areas in individual subjects. Despite these efforts, the effectiveness of ViT varies; it performs satisfactorily in some subjects but struggles in others, particularly in word-selective regions. The incorporation of textual data...
Learning capabilities in Transformer Neural Networks
Variš, Dušan ; Bojar, Ondřej (advisor) ; Sennrich, Rico (referee) ; Dušek, Ondřej (referee)
Title: Learning Capabilities of the Transformer Neural Networks Author: Dušan Variš Department: Institute of Formal and Applied Linguistics Supervisor: doc. RNDr. Ondřej Bojar, Ph.D., Institute of Formal and Applied Linguistics Abstract: Although contemporary neural networks, inspired by biological neurons, have been able to reach human-like performance on many tasks in recent years, their optimization (learning) process is still very far from the one observed in humans. This thesis investigates various aspects of learning in the current state-of-the-art Transformer neural networks, the dominant architecture in current natural language processing. Firstly, we measure the level of generalization in Transformers using several probing experiments based on the idea of adversarial evaluation. Secondly, we explore their potential for incremental learning when combined with regularization using the elastic weight consolidation approach. Lastly, we propose a modular extension of the existing Transformer architecture enabling subnetwork selection conditioned on the intermediate hidden layer outputs and analyze the attributes of this network modularization. We investigate our hypotheses mainly within the scope of neural machine translation and multilingual translation showing the limitations of the...
Towards Machine Translation Based on Monolingual Texts
Kvapilíková, Ivana ; Bojar, Ondřej (advisor) ; Espana-Bonet, Cristina (referee) ; Čmejrek, Martin (referee)
Title: Towards Machine Translation Based on Monolingual Texts Author: Ivana Kvapilíková Institute: Institute of Formal and Applied Linguistics Supervisor: doc. RNDr. Ondřej Bojar, Ph.D., Institute of Formal and Applied Linguistics Abstract: The current state of the art in machine translation (MT) heavily relies on parallel data, i.e. texts that have been previously translated by humans. This type of resource is expensive and only available for several language pairs in limited domains. A new line of research has emerged to design models capable of learning to translate from monolingual texts which are significantly easier to obtain, e.g. by web-crawling. While it is impressive that such models achieve translation capabilities, the translation quality of the output they produce is still low for practical applications. This dissertation thesis strives to improve their performance. We explore the existing approaches of using monolingual resources to train translation models and propose a new technique to generate pseudo-parallel training data artificially without expensive human input. We automatically select similar sentences from monolingual corpora in different languages and we show that using them in the initial stages of MT training leads to a significant enhancement in translation quality. We also...
Practical neural dialogue management using pretrained language models
Šafář, Jaroslav ; Dušek, Ondřej (advisor) ; Bojar, Ondřej (referee)
Task-oriented dialogue systems pose a significant challenge due to their complexity and the need to handle components such as language understanding, state tracking, action selection, and language generation. In this work, we explore the improvements in dialogue management using pretrained language models. We propose three models that incorporate pretrained language models, aiming to provide a practical approach to designing dialogue systems capable of effectively addressing the language understanding, state tracking, and action selection tasks. Our dialogue state tracking model achieves a joint goal accuracy of 74%. We also identify limitations in handling complex or multi-step user requests in the action selection task. This research underscores the potential of pretrained language models in dialogue management while highlighting areas for further improvement.
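Joint goal accuracy, the state-tracking metric reported above, counts a dialogue turn as correct only when every slot-value pair in the predicted state matches the gold state. A minimal sketch of how it is typically computed (the slot names below are illustrative, not from the thesis):

```python
def joint_goal_accuracy(predicted: list[dict], gold: list[dict]) -> float:
    """Fraction of turns whose entire predicted state equals the gold state."""
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)

pred = [
    {"area": "north", "food": "thai"},
    {"area": "north", "food": "indian"},  # "area" is wrong here
]
gold = [
    {"area": "north", "food": "thai"},
    {"area": "south", "food": "indian"},
]
# joint_goal_accuracy(pred, gold) -> 0.5: the second turn fails on "area".
```

Because a single wrong slot invalidates the whole turn, joint goal accuracy is a strict, all-or-nothing metric, which makes 74% a reasonably strong result for multi-slot dialogue state tracking.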
Methods of User-Assisted Summarization of Meetings
Kmječ, František ; Bojar, Ondřej (advisor) ; Kasner, Zdeněk (referee)
Automated minuting, or meeting summarization, is the task of accurately capturing the contents of a meeting in a short text or in bullet points. Recently, a lot of progress has happened in this area, largely due to the rise of large language models. However, most fully automated approaches have severe limitations; either their outputs are vague or they are prone to hallucinations. We explore the possibility of user-assisted minuting to provide factual accuracy as well as coverage. We introduce a novel open-source tool, Minuteman, integrated with Jitsi Meet, to explore the methods by which users can interact with summarization models. We then analyze data gathered from multiple experiments with users and show how similar means of interaction can be of use in increasing summary quality.
German Compounds in Transformer Models
Neumannová, Kristýna ; Bojar, Ondřej (advisor) ; Zeman, Daniel (referee)
German is known for its highly productive word formation processes, particularly in the area of compounding and derivation. In this thesis, we focus on German nominal compounds and their representation in machine translation (MT) outputs. Despite their importance in German text, commonly used metrics for MT evaluation, such as BLEU, do not adequately capture the usage of compounds. The aim of this thesis was to investigate the generation of German compounds in Transformer models and to explore the conditions that lead to their production. Our analysis revealed that MT systems tend to produce fewer compounds than humans. However, we found that due to the highly productive nature of German compounds, it is not feasible to identify them based on a fixed list. Therefore, we manually identified novel compounds, and even then, human translations still contained more compounds than MT systems. We trained our own Transformer model for English-German translation and conducted experiments to examine various factors that influence the production of compounds, including word segmentation and the frequency of compounds in the training data. Additionally, we explored the use of forced decoding and the impact of providing the model with the first words of a sentence during translation. Our findings highlight the...
