National Repository of Grey Literature: 72 records found (showing records 41 - 50)
Melody Extraction with Deep Learning
Balhar, Jiří ; Hajič, Jan (advisor) ; Maršík, Ladislav (referee)
Melody extraction is arguably one of the most important and challenging problems in Music Information Retrieval. The melody is what we are most likely to recall after listening to a song, which makes it one of the most relevant aspects of music. However, the presence of accompaniment in songs makes the task hard to address using rule-based methods. In recent years, data-driven methods based on deep learning have started to outperform the methods traditionally used in the field. In this thesis we continue these efforts and propose three new methods for melody extraction. Among these, an architecture called Harmonic Convolutional Neural Network, which modifies convolutional neural networks to better capture harmonically related information in an input spectrogram with a logarithmic frequency axis, achieved state-of-the-art performance on several publicly available melody datasets.
Feature Evaluation for Scalable Cover Song Identification Using Machine Learning
Martišek, Petr ; Maršík, Ladislav (advisor) ; Hajič, Jan (referee)
Cover song identification is a field of music information retrieval where the task is to determine whether two different audio tracks represent different versions of the same underlying song. Since covers might differ in tempo, key, instrumentation and other characteristics, many clever features have been developed over the years. We perform a rigorous analysis of 32 features used in related works while distinguishing between exact and scalable features. The former are based on a harmonic descriptor time series (typically chroma vectors) and offer better performance at the cost of computation time. The latter have a small constant size and only capture global phenomena in the track, making them fast to compute and suitable for use with large datasets. We then select 7 scalable and 3 exact features to build our own two-level system, with the scalable features used on the first level to prune the dataset and the exact features on the second level to refine the results. Two distinct machine learning models are used to combine the scalable and the exact features, respectively. We perform the analysis and the evaluation of our system on the Million Song Dataset. The experiments show the exact features being outperformed by the scalable ones, which led us to the decision to use only the 7 scalable features in our system. The...
Neural Network Based Named Entity Recognition
Straková, Jana ; Hajič, Jan (advisor) ; Černocký, Jan (referee) ; Konopík, Miloslav (referee)
Title: Neural Network Based Named Entity Recognition. Author: Jana Straková. Institute: Institute of Formal and Applied Linguistics. Supervisor of the doctoral thesis: prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics. Abstract: Czech named entity recognition (the task of automatic identification and classification of proper names in text, such as names of people, locations and organizations) has become a well-established field since the publication of the Czech Named Entity Corpus (CNEC). This doctoral thesis presents the author's research on named entity recognition, mainly in the Czech language. It presents work and research carried out during the publication of CNEC and its evaluation. It further covers the author's research results, which improved the Czech state of the art in named entity recognition in recent years, with a special focus on solutions based on artificial neural networks. Starting with a simple feed-forward neural network with a softmax output layer and a standard set of classification features for the task, the thesis presents the methodology and results that were later used in NameTag, an open-source software solution for named entity recognition. The thesis concludes with a recurrent neural network based recognizer with word embeddings and character-level word embeddings,...
Generating polyphonic music using neural networks
Židek, Marek ; Hajič, Jan (advisor) ; Maršík, Ladislav (referee)
The aim of this thesis is to explore new ways of generating unique polyphonic music using neural networks. Music generation, either in raw audio waveforms or in a discrete representation, is a very interesting topic that has been heavily explored in recent years. This thesis works with polyphonic classical piano music represented in MIDI as training data. We introduce the problem, show relevant neural network architectures and describe our numerous ideas. Of these, we consider one, our experiment with three versions of skip residual LSTM connections for music composition, a good contribution to the field. In related work, skip-connections were explored mostly for classification tasks; however, our results show a solid improvement for music composition (e.g. 47% of respondents considered our samples real). We also show that skip-connections have a rather diverse hyperparameter space for future tuning. Apart from standard automated test set evaluation, which is hard to design and interpret for creativity-mimicking models, we also carried out a complex evaluation through surveys. The evaluation was specifically designed not only to show results for our samples, but also to reveal information about the expectations, preconceptions and influence of personal characteristics of the respondents. We consider this a valuable...
Music composition based on a programming language
Pavlín, Tomáš ; Maršík, Ladislav (advisor) ; Hajič, Jan (referee)
Computer music composition brings a lot of problems which can be solved using a variety of approaches. The existing music composition programs either do not provide enough flexibility to composers or are considerably complicated for users who do not have a technical background. In this thesis, we introduce an intuitive programming language designed for music composition, along with an interpreter of this language presented through a user-friendly graphical interface. The interface can be used for music composition and production even by users without technical and musical skills. The program provides a new approach to music composition and allows effortless music creation that can be used e.g. in the game industry. In addition, the program can be used for musical accompaniment.
Multilingual Collocation Database
Helcl, Jindřich ; Hajič, Jan (advisor) ; Mareček, David (referee)
Collocations are groups of words that co-occur more often than they appear separately. They also include phrases that give a new meaning to a group of unrelated words. This thesis aims to find collocations in large data and to create a database that allows their retrieval. Pointwise Mutual Information (PMI), a value based on word frequency, is computed to find the collocations. Word groups with the highest PMI values are considered candidates for good collocations. The chosen collocations are stored in a database in a format that allows searching with Apache Lucene. A part of the thesis is to create a Web user interface as a quick and easy way to search for collocations. If this service is fast enough and the collocations are good, translators will be able to use it to find proper equivalents in the target language. Students of a foreign language will also be able to use it to extend their vocabulary. Such a database will be created independently in several languages, including Czech and English.
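As a rough illustration of the PMI scoring mentioned in the abstract above, the following Python sketch scores adjacent word pairs by log2(P(x, y) / (P(x) P(y))). It is not the thesis implementation: the function name `bigram_pmi`, the `min_count` filter, the restriction to adjacent pairs and the toy sentence are all assumptions made for this example.

```python
import math
from collections import Counter

def bigram_pmi(tokens, min_count=1):
    """Return {(w1, w2): PMI} for adjacent word pairs in a token list.

    Hypothetical helper for illustration; min_count drops rare pairs,
    whose PMI estimates tend to be unreliable.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_xy = count / n_bi          # relative frequency of the pair
        p_x = unigrams[w1] / n_uni   # relative frequency of the first word
        p_y = unigrams[w2] / n_uni   # relative frequency of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Toy corpus (invented for this example).
tokens = "new york is a big city and new york is busy".split()
scores = bigram_pmi(tokens, min_count=2)
# Pairs sorted from the strongest to the weakest collocation candidate.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

On real data the counts would come from a large corpus, and the top-ranked pairs would be the candidates stored for retrieval.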
Matching Images to Texts
Hajič, Jan ; Pecina, Pavel (advisor) ; Průša, Daniel (referee)
We build a joint multimodal model of text and images for automatically assigning illustrative images to journalistic articles. We approach the task as an unsupervised representation learning problem of finding a common representation that abstracts from individual modalities, inspired by the multimodal Deep Boltzmann Machine of Srivastava and Salakhutdinov. We use state-of-the-art image content classification features obtained from the Convolutional Neural Network of Krizhevsky et al. as input "images" and entire documents instead of keywords as input texts. A deep learning and experiment management library, Safire, has been developed. We have not been able to create a successful retrieval system because of difficulties with training neural networks on the very sparse word observations. However, we have gained substantial understanding of the nature of these difficulties and thus are confident that we will be able to improve in future work.
Popularity Meter
Hajič, Jan ; Bojar, Ondřej (advisor) ; Popel, Martin (referee)
Having the possibility of automatically tracking a person's popularity in the newspapers is an idea appealing not just to those in the media spotlight. While sentiment (subjectivity) analysis is a rapidly growing subfield of computational linguistics, no data from the news domain are yet available for Czech. We have therefore started building a manually annotated polarity corpus of sentences from Czech news texts; however, these texts have proven rather unwieldy for such processing. We have also designed a classifier which should be able to track popularity based on this corpus; the classifier has been tested on a corpus of product reviews of domestic appliances, and some introductory testing has been done on the nascent news corpus. As a model, we simply extract a unigram polarity lexicon from the data. We then use three related methods for identifying lemma polarity and a number of simple filters for feature selection. On the domestic appliance data, our simplest model has achieved results comparable to the state of the art; however, the properties of Czech news texts and preliminary results hint that a more linguistically oriented approach might be preferable.
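To make the idea of a unigram polarity lexicon more concrete, here is a minimal Python sketch under stated assumptions. The abstract does not specify the three lemma polarity methods or the filters, so the scoring rule (normalized difference of positive and negative counts), the `min_count` filter, the function names and the toy data below are all hypothetical stand-ins, not the author's method.

```python
from collections import Counter, defaultdict

def build_lexicon(labeled_sentences, min_count=2):
    """Map each lemma to a polarity score in [-1, 1].

    labeled_sentences: iterable of (list_of_lemmas, label) pairs with
    label "pos" or "neg". The min_count frequency filter is an assumed
    stand-in for the feature-selection filters.
    """
    counts = defaultdict(Counter)
    for lemmas, label in labeled_sentences:
        for lemma in lemmas:
            counts[lemma][label] += 1
    lexicon = {}
    for lemma, by_label in counts.items():
        if sum(by_label.values()) < min_count:
            continue  # too rare to trust
        pos, neg = by_label["pos"], by_label["neg"]
        if pos != neg:  # skip lemmas with no polarity preference
            lexicon[lemma] = (pos - neg) / (pos + neg)
    return lexicon

def classify(lemmas, lexicon):
    """Label a sentence by the sum of its lemma polarity scores."""
    score = sum(lexicon.get(lemma, 0.0) for lemma in lemmas)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return "neutral"

# Toy training data (invented for this example).
train = [
    (["good", "vacuum"], "pos"),
    (["good", "quiet", "fridge"], "pos"),
    (["bad", "noisy", "vacuum"], "neg"),
    (["bad", "fridge"], "neg"),
]
lexicon = build_lexicon(train)
```

With this toy data only "good" and "bad" survive the filter, so a sentence such as `["good", "fridge"]` is labelled positive and one without any lexicon lemma is labelled neutral.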
New Methods in Statistical Speech Recognition
Klusáček, David ; Hajič, Jan (advisor) ; Psutka, Josef (referee) ; Černocký, Jan (referee)
Title: New Methods in Statistical Speech Recognition. Author: David Klusáček. Department: Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics in Prague, Malostranské náměstí 25, 118 00 Praha 1. Advisor: Prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics. Abstract: This work aims to identify the limits of contemporary speech recognizers and tries to come up with methods that could push back the frontiers. After describing the state of the art, the weakest link of the chain has been identified in the acoustic front-end, especially when working in harsh acoustic conditions. The NUFIBA front-end, the proposed solution, includes reverb compensation and speaker/background segmentation as well as continuous SNR monitoring which, through cooperation with the acoustic model, prevents recognition errors from spreading like an avalanche. Owing to the lack of time, only a phoneme recognizer was finally implemented, although large blocks of the originally intended word-based continuous speech recognizer were implemented and tested (such as the MMI-class based language model).
