National Repository of Grey Literature: 145 records found (records 67 - 76)
Learning the Face Behind a Voice
Kyjonka, Mojmír ; Matějka, Pavel (referee) ; Plchot, Oldřich (advisor)
This thesis deals with reconstructing a face from a voice. It surveys the state of the art for this problem and trains a model for it. The model is based on the work "Reconstructing faces from voices", whose architecture is based on a Generative Adversarial Network (GAN). We used the VGGFace and VoxCeleb datasets and additionally created a small audiovisual dataset of Czech speakers. The work was implemented in Python using the PyTorch library.
Analysis of Interview Audio
Polok, Alexander ; Plchot, Oldřich (referee) ; Matějka, Pavel (advisor)
The aim of this thesis is the analysis of psychotherapeutic sessions. Classifiers describing the therapy are extracted from the audio recordings. These are then aggregated, compared with other sessions, and graphically presented in a report summarizing the conversation. In this way, therapists are provided with feedback that can serve for professional growth and better psychotherapy in the future.
Detection of Pre-Recorded Messages in Speech
Boboš, Dominik ; Matějka, Pavel (referee) ; Černocký, Jan (advisor)
Detecting pre-recorded messages in speech (colloquially "plechové huby") is useful for any subsequent information mining in speech data. This thesis summarizes the theory of searching for similar utterances in speech and efficient approaches to comparing two sequences. Studying the identification of repeated information in audio requires a large amount of data with exactly repeating segments. We generated such a dataset by mixing pre-recorded messages into telephone calls with changes in speed, volume, and repetition. Our system handles the "known messages" and "unknown messages" scenarios using clustering or block-wise detection. We compared dynamic time warping (DTW), approximate string matching, and recurrence quantification analysis, and finally combined all of these techniques to obtain an accurate and efficient system.
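As an illustration of the dynamic time warping mentioned in this abstract, the following is a minimal sketch of the textbook DTW recurrence on one-dimensional sequences. It is not the thesis's implementation: the function name `dtw_distance`, the absolute-difference local cost, and the toy sequences are assumptions chosen for illustration, and a real system would compare frames of acoustic features rather than single numbers.

```python
from math import inf


def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences (textbook recurrence)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy data: a "message", the same message played at half speed,
# and an unrelated sequence.
message = [0.0, 1.0, 2.0, 1.0, 0.0]
slowed = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
other = [2.0, 2.0, 0.0, 0.0, 2.0]

print(dtw_distance(message, slowed))  # time-stretched copy aligns at zero cost
print(dtw_distance(message, other))   # unrelated sequence has a positive cost
```

Because the warping path may repeat elements of either sequence, a copy that differs only in speed still matches with zero cost, which is why DTW suits messages replayed at a changed speed.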
Analysis of the Condition of Norway Spruce (Picea abies) Stands in Relation to the Pedological Characteristics of the Environment at LS Vsetín
Matějka, Petr
The bachelor thesis compares a prosperous and a non-prosperous habitat in the cadastre of Velké Karlovice according to their pedological characteristics, with respect to the growth and life of Norway spruce. The area of interest is located in the flysch zone of the Western Carpathians. The specific placement of claystone and sandstone layers has a significant impact on the total soil water regime at the sites under investigation. Both sites form ecologically unstable spruce monocultures, which amplifies the negative effects of water deficit. In the non-prosperous habitat, the physical and hydrophysical properties of the soil were assessed as unsuitable for growing spruce. An optimal tree species composition with a predominance of beech was proposed, supplemented with larch, ate and cherry.
Institutional fight against doping in sport
Matějka, Petr ; Balaš, Vladimír (advisor) ; Lipovský, Milan (referee)
The diploma thesis deals with the use of banned substances and banned methods in sport, and with other violations of anti-doping rules, from the point of view of establishing international institutions with a worldwide scope of activity that aim to eliminate doping in sport. After a general introduction to doping and a description of the historical roots of this unfair sporting practice, one part focuses on the basic instruments of the fight against doping. It describes the principle of strict liability of an athlete for an anti-doping rule violation, the list of banned substances and banned methods, the testing process, the therapeutic use exemption, the whereabouts rules, and the athlete biological passport. The following part analyses the instruments of public international law concluded by the Council of Europe and UNESCO. Through these international conventions, the fight against doping in sport is carried over to the level of intergovernmental cooperation, which reflects the important non-governmental institutions and commits itself to international coordination. The main part of the thesis is devoted to...
Robust Speaker Verification with Deep Neural Networks
Profant, Ján ; Rohdin, Johan Andréas (referee) ; Matějka, Pavel (advisor)
The objective of this work is to study state-of-the-art speaker verification systems based on deep neural networks, called x-vectors, under various conditions, such as wideband and narrowband data, and to develop a system that is robust to unseen languages, specific noise, or speech codecs. The system takes a variable-length audio recording and maps it to a fixed-length embedding, which is then used to represent the speaker. We compared our systems to BUT's submission to the 2016 Speakers in the Wild Speaker Recognition Challenge (SITW), which used the previously popular statistical models, i-vectors. When comparing single best systems, we observed that the recently published x-vectors gave a more than 4.38 times lower Equal Error Rate on the SITW core-core condition than BUT's SITW submission. Moreover, we found that diarization substantially reduces the error rate when there are multiple speakers in the SITW core-multi condition, but we did not see the same trend on the NIST SRE 2018 VAST data.
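For readers unfamiliar with the Equal Error Rate reported in this abstract, the sketch below shows one simple way to approximate it from verification scores: sweep a decision threshold and take the point where the miss rate and the false-alarm rate are closest. The function name `equal_error_rate` and the toy scores are assumptions for illustration only; they are not data or code from the thesis, and evaluation toolkits typically interpolate the error curves rather than pick the nearest threshold.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping thresholds over all observed scores."""
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a same-speaker (target) trial scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a different-speaker trial scored at or above it.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


# Hypothetical similarity scores for same-speaker and different-speaker trials.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]

print(equal_error_rate(targets, nontargets))  # 0.25
```

At the threshold 0.6, one of four target trials is missed and one of four non-target trials is accepted, so both error rates equal 25 %.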
Robust Speech Activity Detection
Popková, Anna ; Plchot, Oldřich (referee) ; Matějka, Pavel (advisor)
The aim of this work is to design and build a robust speech activity detector that can detect speech in different languages, in noisy environments, and with background music. I decided to solve this problem with a neural network used as a classification model that assigns the input audio recording to one of four classes: silence, speech, music, or noise. The resulting tool is able to detect speech in at least 12 languages. Speech with a musical background is detected with up to 88 % accuracy, and accuracy on noisy data ranges from 84 % (5 dB SNR) to 88 % (20 dB SNR). The tool can be used for speech activity detection in various research areas of speech processing. The main contribution is the elimination of music, which, when not eliminated, significantly increases the error rate of systems for speaker identification or speech recognition.
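To make the SNR figures in this abstract concrete, the sketch below scales a noise signal so that adding it to a speech signal gives a chosen signal-to-noise ratio in dB. The abstract does not say how its noisy data was prepared, so this is only a generic illustration: the function name `mix_at_snr` and the toy sample values are assumptions.

```python
import math


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mixture has the requested SNR in dB."""
    # Solve 10 * log10(P_speech / (gain**2 * P_noise)) = snr_db for gain.
    gain = math.sqrt(mean_power(speech) / (mean_power(noise) * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


# Hypothetical toy signals of equal length.
speech = [0.5, -0.4, 0.3, -0.2, 0.1, -0.6]
noise = [0.05, 0.02, -0.07, 0.01, -0.03, 0.04]

noisy = mix_at_snr(speech, noise, snr_db=5.0)

# Check: recover the scaled noise and measure the resulting SNR.
scaled_noise = [y - s for y, s in zip(noisy, speech)]
measured = 10 * math.log10(mean_power(speech) / mean_power(scaled_noise))
print(round(measured, 6))
```

A lower SNR means relatively louder noise, so 5 dB is a harder condition than 20 dB.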
Agreements and Disagreements between Automatic and Human Speaker Recognition
Valenta, Jakub ; Matějka, Pavel (referee) ; Rohdin, Johan Andréas (advisor)
This thesis deals with the problem of speaker recognition. The term is defined and supplemented with the individual methods related to it. The aim of the thesis is to point out the agreements and differences between the human and the automatic speaker recognition process. The introduction describes the theoretical background of both areas, i.e. which aspects of human speech a human listener and an automatic system, respectively, focus on. Several experiments are then carried out to compare the two methods. The experiments are evaluated so that it is possible to observe which test tasks a human solves better, so that these findings can then be used to improve the automatic system. At the end of the thesis, such a proposal for improving the automatic system is presented and tested. The testing was successful, and higher accuracy in evaluation was recorded. This result can therefore be used in further research and enable further development in automatic speaker recognition.
Acoustic Scene Classification from Speech
Grepl, Filip ; Beneš, Karel (referee) ; Matějka, Pavel (advisor)
This thesis deals with creating a system that recognizes the type of location in which a recording was made by analyzing the audio signal. The classifier is a multi-layer, fully connected neural network whose topology is based on the baseline system provided for the DCASE competition. A dataset from this competition is also used for training and evaluating the neural network. The experiments focus mainly on the representation of the audio recordings and on the format of the neural network's input data. For this purpose, Mel-filter bank, block Mel-filter bank, and MFCC features are used. The experiments improved classification accuracy by 6.5 % compared to the DCASE baseline system. The overall accuracy of the system reached 67.5 %.
