National Repository of Grey Literature: 67 records found (records 21 - 30)
Deep learning based sound event recognition
Bajzík, Jakub ; Kiska, Tomáš (referee) ; Přinosil, Jiří (advisor)
This paper deals with the processing and recognition of events in audio signals. It explores visualizing the audio signal and then using convolutional neural networks as a classifier for recognition in real-world use. The recognized audio events are gunshots placed in a sound background such as street noise, human voice, animal sounds, and other forms of random noise. Before the implementation, a large database is created with varied parameters, especially reverberation and the time position of the event within the processed segment. The freely available platforms Keras and TensorFlow are used for work with neural networks.
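The time positioning of an event within a background segment can be illustrated with a minimal sketch. The function name `place_event`, the `gain` parameter, and the sample values are assumptions made for illustration; the abstract does not describe how the database was actually generated, and reverberation is not modelled here.

```python
def place_event(background, event, start, gain=1.0):
    """Return a copy of `background` with `event` added from sample index `start`."""
    if start < 0 or start + len(event) > len(background):
        raise ValueError("event does not fit inside the background segment")
    mixed = list(background)
    for i, sample in enumerate(event):
        mixed[start + i] += gain * sample
    return mixed


# Hypothetical stand-ins: a silent background and a three-sample "gunshot".
background = [0.0] * 8
event = [1.0, -0.5, 0.25]
mixed = place_event(background, event, start=3, gain=0.5)
```

Varying `start` over many copies of the same background would give the classifier examples of the event at different positions in the segment.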
De-identification of speakers with hypokinetic dysarthria
Kárník, Radoslav ; Kiska, Tomáš (referee) ; Mekyska, Jiří (advisor)
This paper discusses the design and implementation of a system that de-identifies speech recordings of patients suffering from Parkinson's disease. The paper describes the causes and symptoms of Parkinson's disease and the effects of hypokinetic dysarthria on speech. Part of the paper is devoted to speech features that can be used for diagnosing hypokinetic dysarthria from speech. It also describes methods of speech de-identification and a system for evaluating the results using speaker recognition and patient recognition. The de-identification system uses vocal tract length normalization (VTLN), and the evaluation system uses Gaussian mixture models (GMM). The PARCZ database, which contains speech recordings of patients affected by Parkinson's disease and of control speakers, was used for testing.
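VTLN works by warping the frequency axis of the speech spectrum. The abstract does not say which warping function the system uses; the sketch below assumes the bilinear (all-pass) warp, one common choice, and the name `bilinear_warp` and the parameter `alpha` are illustrative.

```python
import math


def bilinear_warp(omega, alpha):
    """Warp a normalized angular frequency omega in [0, pi]; requires |alpha| < 1."""
    # The denominator 1 - alpha*cos(omega) is positive for |alpha| < 1,
    # so atan2 gives the same result as atan of the ratio.
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


# With alpha = 0 the axis is unchanged; in this formula a positive alpha
# moves interior frequencies upward while keeping 0 and pi fixed.
warped_mid = bilinear_warp(math.pi / 2, 0.1)
```

Applying a different `alpha` to each recording changes the apparent vocal tract length, which is what makes the speaker harder to identify.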
Recognition of music cover versions using Music Information Retrieval techniques
Martinek, Václav ; Zvončák, Vojtěch (referee) ; Kiska, Tomáš (advisor)
This master’s thesis deals with the design and implementation of systems for music cover recognition. The introductory part is devoted to calculating parameters from the audio signal using Music Information Retrieval techniques. Subsequently, various forms of cover versions and the musical aspects that cover versions share are defined. The thesis also deals in detail with the creation and distribution of a database of cover versions. Furthermore, the work presents methods and techniques for comparing and processing the calculated parameters. Attention is then paid to the OTI method, CSM calculation and methods dealing with parameter selection. The next part of the thesis is devoted to the design of systems for recognizing cover versions, and existing systems for recognizing cover versions are compared. Furthermore, the thesis describes machine learning techniques and methods for evaluating the classification, with a special emphasis on artificial neural networks. The last part of the thesis deals with the implementation of two systems in MATLAB and Python. These systems are then tested on the created database of cover versions.
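The OTI (optimal transposition index) step can be sketched as follows, assuming each recording is summarized by a single 12-bin chroma profile. The function name and the example profiles are illustrative and are not taken from the thesis.

```python
def optimal_transposition_index(chroma_a, chroma_b):
    """Return the circular shift of chroma_b that maximizes its dot product with chroma_a."""
    n = len(chroma_a)
    best_shift, best_score = 0, float("-inf")
    for shift in range(n):
        # Rotate chroma_b to the right by `shift` bins (semitones).
        shifted = [chroma_b[(i - shift) % n] for i in range(n)]
        score = sum(a * b for a, b in zip(chroma_a, shifted))
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Hypothetical profiles: a C major triad and the same triad two semitones higher.
c_major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
d_major = [c_major[(i - 2) % 12] for i in range(12)]
oti = optimal_transposition_index(c_major, d_major)  # 10 bins up, i.e. 2 semitones down
```

Shifting one recording by its OTI before comparison makes the similarity measure insensitive to a cover being played in a different key.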
Music mood and emotion recognition using Music information retrieval techniques
Smělý, Pavel ; Mucha, Ján (referee) ; Kiska, Tomáš (advisor)
This work focuses on the scientific area called Music Information Retrieval, more precisely on its subdivision concerned with the recognition of emotions in music, called Music Emotion Recognition (MER). The beginning of the work gives a general overview and definition of MER and a categorization of individual methods, offering a comprehensive view of this discipline. The thesis also concentrates on the selection and description of suitable parameters for the recognition of emotions, using the tools openSMILE and MIRtoolbox. The freely available DEAM database was used to obtain the set of music recordings and their subjective emotional annotations. The practical part deals with the design of a static dimensional regression system for numerical prediction of emotions in music recordings, more precisely of their position in the AV emotional space. The thesis presents and comments on the results obtained from the analysis of the significance of individual parameters and from the overall analysis of the predictions of the proposed model.
Reconstruction of signal modified by fade-in/fade-out
Bača, Petr ; Kiska, Tomáš (referee) ; Rajmic, Pavel (advisor)
This thesis contains the theory needed to solve a special case of the bit-depth expansion problem. The goal is to reconstruct a signal that was degraded by the application of a fade-in or fade-out effect. The theory covers analog-to-digital conversion and sparse representations. The thesis formulates the task of bit-depth expansion and proposes an algorithm to solve it. Furthermore, the implementation is discussed and the results are given.
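Why a fade leads to a bit-depth expansion problem can be shown with a small sketch. It assumes a uniform quantizer with step `step` applied after the fade gain `gain`; the function names and numbers are illustrative, and the thesis's actual formulation is not given in the abstract.

```python
def fade_and_quantize(x, gain, step):
    """Apply a fade gain to sample x, then round to the nearest quantization level."""
    return step * round(gain * x / step)


def feasible_interval(q, gain, step):
    """Interval of original sample values consistent with the quantized value q."""
    # The faded sample lies within half a step of q; dividing by the gain
    # maps that interval back to the original amplitude scale.
    return ((q - step / 2) / gain, (q + step / 2) / gain)


x = 0.3                                   # hypothetical original sample
q = fade_and_quantize(x, gain=0.25, step=0.125)
lo, hi = feasible_interval(q, gain=0.25, step=0.125)
```

The smaller the gain, the wider the interval: samples near the start of a fade-in keep fewer effective bits, and a reconstruction method has to choose a plausible signal inside these intervals.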
Music genre recognition using Music information retrieval techniques
Zemánková, Šárka ; Zvončák, Vojtěch (referee) ; Kiska, Tomáš (advisor)
This diploma thesis deals with music genre recognition using the techniques of Music Information Retrieval. It contains a brief description of the principles of this research area and of its subfield called Music Genre Recognition. The following chapter covers the selection of the most suitable parameters for describing music genres. The work further characterizes the machine learning methods used in this field of research. The next chapter describes music datasets created for genre classification studies. Subsequently, a system for music genre recognition is designed and evaluated. The last part of this work describes the results of a partial parameter analysis and the dependence of genre classification accuracy on the number of parameters, and discusses the causes of the classification accuracy achieved for the individual genres.
Platform for subjective evaluation of video-sequences
Srnec, Tomáš ; Kiska, Tomáš (referee) ; Číka, Petr (advisor)
This bachelor thesis focuses on subjective video quality assessment. The modern codecs used, H.264, H.265, VP8 and VP9, are described in the first chapter. The next part of the thesis presents four methods of subjective video assessment according to Recommendation ITU-T P.910. The practical part includes encoding three videos into four resolutions with four codecs. The output of the thesis is a JavaFX application that plays the videos to test participants, who rate them. Their ratings are sent in real time to a MySQL server and evaluated directly in the application as bar charts. According to our results, VP9 is the best codec, followed by H.265, H.264 and VP8.
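How participant ratings turn into a ranking can be sketched with a mean opinion score. This assumes the Absolute Category Rating five-level scale, one of the methods in ITU-T P.910; the abstract does not say which method the application uses, and the condition names and ratings below are made up for illustration.

```python
from statistics import mean


def mean_opinion_scores(ratings):
    """Map each condition name to the mean of its 1-5 ratings."""
    return {condition: mean(scores) for condition, scores in ratings.items()}


# Hypothetical ratings from four participants for two test conditions.
ratings = {
    "condition_a": [4, 5, 4, 3],
    "condition_b": [2, 3, 3, 2],
}
scores = mean_opinion_scores(ratings)
ranking = sorted(scores, key=scores.get, reverse=True)
```

A bar chart of `scores` per codec and resolution would correspond to the kind of in-application evaluation the abstract mentions.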
An alternative JPEG coder/decoder
Jirák, Jakub ; Kiska, Tomáš (referee) ; Rajmic, Pavel (advisor)
JPEG is currently the most widely used image format. This work deals with the design and implementation of an alternative JPEG codec that uses proximal algorithms in combination with fixing selected points from the original image in order to suppress the artifacts created in common JPEG coding. To solve the problem, the prox_TV algorithm and then the Douglas-Rachford algorithm were used, for which special functions using the l_1-norm for image reconstruction were derived. The proposed solution suppresses the artifacts effectively, and the result corresponds to an image with a higher quality factor setting. The method achieves very good results for both simple images and photos, but images of 1024 × 1024 px and larger require a large amount of computing time, so the method is more suitable for smaller images.
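The Douglas-Rachford iteration can be sketched on a toy problem: minimize the l_1-norm of a vector whose entries must stay inside given intervals, loosely analogous to coefficients that must stay within their quantization bins. This is an illustrative assumption, not the functions derived in the thesis; no image transform is involved, and all names and values are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v| for a scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def clip(v, lo, hi):
    """Projection of v onto the interval [lo, hi]."""
    return min(max(v, lo), hi)


def douglas_rachford(lower, upper, gamma=1.0, relax=1.0, iterations=200):
    """Minimize sum |x_i| subject to lower_i <= x_i <= upper_i."""
    y = [0.0] * len(lower)
    x = list(y)
    for _ in range(iterations):
        # Prox of the l1 term, then projection applied to the reflected point.
        x = [soft_threshold(v, gamma) for v in y]
        z = [clip(2 * xi - yi, lo, hi)
             for xi, yi, lo, hi in zip(x, y, lower, upper)]
        y = [yi + relax * (zi - xi) for yi, xi, zi in zip(y, x, z)]
    return x


solution = douglas_rachford([1.0, -2.0, -4.0], [3.0, 5.0, -1.5])
```

Each entry ends up at the point of its interval closest to zero, which is the minimizer for this separable problem.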
Research of dynamics features comparing audio records
Zemánková, Šárka ; Smékal, Zdeněk (referee) ; Kiska, Tomáš (advisor)
This work deals with the analysis of parameters related to the dynamics of sound recordings. It contains a brief description of the history of sound processing in analogue and digital form and of present-day audio signal processing. The following chapter covers the selection of the most suitable parameters for describing an audio recording, especially those describing its dynamics. The work further characterizes the methods used in similar research worldwide. A system that calculates 43 dynamics parameters is designed, and the possibilities for analysing them are outlined. 35 different interpretations of one musical work were compared. Finally, the calculated parameters were plotted in scatter plots and evaluated using visual cluster analysis.
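The abstract does not list the 43 parameters. As an assumed example of a dynamics-related parameter, the sketch below computes the crest factor, the ratio of peak level to RMS level in decibels; the function names and sample values are illustrative.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def crest_factor_db(samples):
    """Peak-to-RMS ratio in decibels."""
    return 20.0 * math.log10(peak(samples) / rms(samples))


square_like = [0.5, -0.5, 0.5, -0.5]   # peak equals RMS
sine_like = [0.0, 1.0, 0.0, -1.0]      # peak exceeds RMS
crest_square = crest_factor_db(square_like)
crest_sine = crest_factor_db(sine_like)
```

Computing such values for each interpretation of a work gives the points that a scatter plot could then compare.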
Acoustic analysis of gender-related patterns in Parkinson's disease
Herinek, Denis ; Kiska, Tomáš (referee) ; Galáž, Zoltán (advisor)
This bachelor's thesis deals with the acoustic analysis of gender-related patterns in Parkinson's disease, based on a single speech task: reading a passage. Parkinson's disease manifests in all subsystems involved in speech production (respiration, phonation, articulation and prosody). The aim of the thesis is to become familiar with the symptoms of this disorder and with the speech parameters it influences. The thesis covers preprocessing and parametrization of the speech signal and statistical analysis of the parameters. The speech signal processing system is implemented in MATLAB.
