National Repository of Grey Literature: 70 records found, showing records 1 - 10
Generating Music Symbols Using Neural Networks (Generování hudebních symbolů pomocí neuronových sítí)
Havelka, Jonáš ; Pecina, Pavel (advisor) ; Hajič, Jan (referee)
We create more training data for the optical music recognition (OMR) task by generating artificial images of music symbols. We follow up on Mashcima and the model that J. Mayer trained on it. We take the Rebelo dataset (a dataset of music symbol images), adjust it with computer vision methods, and train generative neural networks (above all, variational and adversarial autoencoders) on it. By replacing some of the original images in the Mashcima input with ones generated by those networks, we obtain a model that generalizes better: at the cost of slightly worse results on the original dataset (CVC-MUSCIMA), we get much better results on the PrIMuS dataset. We also create very realistic synthetic images of music symbols.
Unsupervised segmentation of Gregorian chant melodies for exploring chant modality
Lanz, Vojtěch ; Hajič, Jan (advisor) ; Mareček, David (referee)
Gregorian chant, as an oral musical tradition, was performed by singers who had to memorize thousands of melodies. Each melody has a set of properties, one of which is the mode it belongs to within the modal system. To understand the principles by which chants were learned, it may be helpful to decompose melodies into smaller units and analyze their relationship to modality. In this work, we compare Bayesian and neural network unsupervised segmentation methods. We measure their performance on evaluation metrics that we design to examine the chant's properties with respect to the memorization challenge, taking modality into account. For this purpose, we have two datasets, one with over thirteen thousand antiphons and the other with over seven thousand responsories. We find the Pitman-Yor process to be a more fitting model than BERT for this particular task, especially the conditional Pitman-Yor process model that we propose to segment each mode independently. We provide several clear arguments that modality and chant segmentation are closely connected. We also dispute the claim by Cornelissen et al. [2020] that the natural segmentation by chant words or syllables is best in terms of mode classification, and we provide a new state-of-the-art performance on the mode classification task.
Automatic generation of Einstein's puzzles in natural language
Hubená, Michaela ; Mareček, David (advisor) ; Hajič, Jan (referee)
This bachelor thesis presents a command-line application for generating Einstein's riddles in natural language using the GPT-3 language model (third-generation Generative Pre-trained Transformer). The riddles are generated with the few-shot method: in addition to the description of the required task, the language model is given several solved examples of that task, from which it is supposed to learn the task directly during generation. The application allows the user to generate Einstein's riddles of various sizes and difficulties on any topic in Czech or English. During generation, emphasis is placed on the creativity and originality of the riddles.
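The few-shot method described in this abstract can be illustrated with a minimal sketch that only assembles the prompt text. It is not the thesis's implementation: the function name `build_few_shot_prompt`, the "Request:"/"Puzzle:" labels, and the sample puzzle are all hypothetical, and no language-model API is called.

```python
def build_few_shot_prompt(task, solved_examples, new_request):
    """Assemble a few-shot prompt: task description, solved examples, new request.

    All labels ("Example", "Request", "Puzzle") are illustrative assumptions,
    not the format used in the thesis.
    """
    parts = [task.strip(), ""]
    for i, (request, puzzle) in enumerate(solved_examples, start=1):
        parts.append(f"Example {i}")
        parts.append(f"Request: {request}")
        parts.append(f"Puzzle: {puzzle}")
        parts.append("")
    # The prompt ends with an open "Puzzle:" slot for the model to complete.
    parts.append(f"Request: {new_request}")
    parts.append("Puzzle:")
    return "\n".join(parts)


# Hypothetical usage with one made-up solved example.
prompt = build_few_shot_prompt(
    task="Write an Einstein's riddle in natural language.",
    solved_examples=[
        (
            "2 houses, topic: pets",
            "Two neighbours keep a cat and a dog. Anna does not keep the cat. "
            "Who keeps the dog?",
        ),
    ],
    new_request="3 houses, topic: tea",
)
print(prompt)
```

Ending the prompt on an unfilled "Puzzle:" line is one common way to signal where the model should continue; the solved examples before it are what makes the prompt few-shot rather than zero-shot.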
Non-Autoregressive Neural Machine Translation
Helcl, Jindřich ; Hajič, Jan (advisor) ; Duh, Kevin (referee) ; Popel, Martin (referee)
In recent years, a number of methods for improving the decoding speed of neural machine translation systems have emerged. One approach that proposes fundamental changes to the model architecture is non-autoregressive modeling. In standard autoregressive models, the output token distributions are conditioned on the previously decoded outputs. This conditional dependence allows the model to keep track of the state of the decoding process, which improves the fluency of the output. On the other hand, it requires the neural network computation to be run sequentially, and thus it cannot be parallelized. Non-autoregressive models impose conditional independence on the output distributions, which means that the decoding process is parallelizable and hence the decoding speed improves. A major drawback of this approach is lower translation quality compared to autoregressive models. The goal of non-autoregressive translation research is to find methods that improve the translation quality while retaining high decoding speed. In this thesis, we explore the research progress so far and identify flaws in the generally accepted evaluation methodology. We experiment with non-autoregressive models trained with connectionist temporal classification. We find that even though our models...
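The contrast between the two model families in this abstract can be written as two factorizations of the output probability. This is the usual textbook formulation, not notation taken from the thesis; the symbols are assumptions introduced here: $x$ is the source sentence, $y = (y_1, \dots, y_T)$ the target tokens, and $T$ the target length.

```latex
\begin{align}
  % Autoregressive: each token is conditioned on all previously decoded tokens,
  % so decoding has to proceed one position at a time.
  p_{\mathrm{AR}}(y \mid x)  &= \prod_{t=1}^{T} p(y_t \mid y_{<t}, x) \\
  % Non-autoregressive: tokens are conditionally independent given the source,
  % so all positions can be computed in parallel.
  p_{\mathrm{NAR}}(y \mid x) &= \prod_{t=1}^{T} p(y_t \mid x)
\end{align}
```

Dropping the $y_{<t}$ term is what removes the sequential dependency, and it is also why the model can no longer track what it has already produced.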
Speech Interface for Corpus Annotation Tools
Přikryl, Leoš ; Hajič, Jan (advisor) ; Peterek, Nino (referee)
The thesis considers the design and implementation of a natural-language (speech) interface for the corpus annotation tools used at the Institute of Formal and Applied Linguistics (TrEd and its additional modules). Existing modules for automatic speech recognition from the University of West Bohemia in Pilsen are used.
New Methods in Statistical Speech Recognition
Klusáček, David ; Hajič, Jan (advisor) ; Psutka, Josef (referee) ; Černocký, Jan (referee)
Title: New Methods in Statistical Speech Recognition. Author: David Klusáček. Department: Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics in Prague, Malostranské náměstí 25, 118 00 Praha 1. Advisor: Prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics. Abstract: This work aims to identify the limits of contemporary speech recognizers and tries to come up with methods that could push back the frontiers. After describing the state of the art, it identifies the acoustic front-end as the weakest link of the chain, especially in harsh acoustic conditions. The proposed solution, the NUFIBA front-end, includes reverb compensation and speaker/background segmentation as well as continuous SNR monitoring, which, through cooperation with the acoustic model, prevents recognition errors from spreading like an avalanche. Owing to lack of time, only a phoneme recognizer was finally implemented, although large blocks of the originally intended word-based continuous speech recognizer were implemented and tested (such as the MMI-class based language model).
Analytical and Tectogrammatical Analysis of a Natural Language
Klimeš, Václav ; Hajič, Jan (advisor) ; Pala, Karel (referee) ; Ribarov, Kiril (referee)
The thesis presents tools for analysis at the analytical and tectogrammatical layers on which the Prague Dependency Treebank is based. The tools for analytical annotation consist of two parsers and a tool for assigning syntactic tags. Although the performance of the parsers is far below that of state-of-the-art parsers, both can be considered a contribution to parsing, since the methods they are based on are novel. The tool for assigning syntactic tags makes 15% fewer errors than the tool previously used for this purpose. The tool developed for tectogrammatical annotation is the only one that can currently perform this task at such breadth. Although other, specialized tools may perform better on some of its particular subtasks, my tool makes 29% and 47% fewer errors for the Czech language than the combination of existing tools for annotating the tectogrammatical structure and deep functors, respectively, which are the core of the tectogrammatical layer. The proposed tools are designed so that they can be used for other languages as well.
Neural Network Based Named Entity Recognition
Straková, Jana ; Hajič, Jan (advisor) ; Černocký, Jan (referee) ; Konopík, Miloslav (referee)
Title: Neural Network Based Named Entity Recognition. Author: Jana Straková. Institute: Institute of Formal and Applied Linguistics. Supervisor of the doctoral thesis: prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics. Abstract: Czech named entity recognition (the task of automatic identification and classification of proper names in text, such as names of people, locations and organizations) has become a well-established field since the publication of the Czech Named Entity Corpus (CNEC). This doctoral thesis presents the author's research on named entity recognition, mainly in the Czech language. It presents work and research carried out during the publication of CNEC and its evaluation. It further covers the author's research results, which improved Czech state-of-the-art results in named entity recognition in recent years, with special focus on solutions based on artificial neural networks. Starting with a simple feed-forward neural network with a softmax output layer and a standard set of classification features for the task, the thesis presents methodology and results that were later used in NameTag, an open-source software solution for named entity recognition. The thesis concludes with a recurrent neural network based recognizer with word embeddings and character-level word embeddings,...
Speech Recognition of Czech Using Finite-State Machines
Podveský, Petr ; Hajič, Jan (advisor) ; Psutka, Josef (referee) ; Krbec, Pavel (referee)
Speech recognition has become a thriving field with many real-life applications. Voice dialing in cell phones, voice control in embedded devices, speech-driven interactive manuals and many other utilities rely on solid speech recognition software. We believe that research in speech recognition can boost the performance of many applications related to the area. The thesis concentrates on automatic large-vocabulary continuous-speech recognition of Czech. Czech differs from English in a few aspects. We focus on these differences and propose new language-dependent techniques. In particular, rich morphology is investigated and its impact on speech recognition is studied. Out-of-vocabulary (OOV) words are identified as one of the major sources of degraded recognition performance. New language modeling techniques are proposed to alleviate the problem of OOV words. The proposed language models are tested in speech recognition systems on diverse speech corpora. The obtained results validate the original approach to language modeling. A significant overall improvement in speech recognition is observed.
