National Repository of Grey Literature: 37 records found (records 11 - 20)
Neural networks in speaker classification
Svoboda, Libor ; Atassi, Hicham (referee) ; Míča, Ivan (advisor)
This work focuses on neural networks for speaker recognition. It addresses speech signal processing and introduces several types of neural networks. As part of the work, a database of recordings was created from speakers of various sexes and ages, and training and test sets were drawn from it. Four classifiers were then proposed: one based on a Gaussian mixture model and three based on neural networks. The system was tested and analyzed with respect to age, gender, and both criteria combined. Attention is also paid to choosing suitable features for each classification task. The work concludes with the results of the analysis for the individual groups and features, identifying the most suitable features for each classification task and the most successful classifier.
Identification of persons via voice imprint
Mekyska, Jiří ; Atassi, Hicham (referee) ; Smékal, Zdeněk (advisor)
This work deals with text-dependent speaker recognition in systems where only a few training samples exist. For this purpose, a voice imprint based on different features (e.g. MFCC, PLP, ACW) is proposed. The work first describes how the speech signal is produced and mentions some speech characteristics important for speaker recognition. The next part deals with speech signal analysis, covering preprocessing and feature extraction methods. The following part describes the process of speaker recognition and the evaluation of the methods used: speaker identification and verification. The last theoretical part deals with classifiers suitable for text-dependent recognition, namely those based on fractional distances, dynamic time warping, dispersion matching and vector quantization. The work continues with the design and implementation of a system that evaluates all the described classifiers for voice imprints based on different features.
Text Dependent Speaker Verification
Fux, Jan ; Glembek, Ondřej (referee) ; Matějka, Pavel (advisor)
The goal of this Bachelor's thesis was to design a text-dependent speaker recognition system. Several systems were tested on the MIT database, which contains recordings with an average length of 0.46 s. The best recognition results come from combining a DTW system, which uses posterior probability estimates (posteriorgrams) output by a phoneme recognizer, with an acoustic SID system based on i-vectors and PLDA (Probabilistic Linear Discriminant Analysis). Fusing the two with a neural network gives the best equal error rate (EER): 17.84% for women and 16.38% for men. This is a relative improvement of 49.9% for women and 54.2% for men over acoustic recognition alone.
Adaptation of Speaker Recognition Systems
Novotný, Ondřej ; Pešán, Jan (referee) ; Plchot, Oldřich (advisor)
In this paper, we propose techniques for adapting speaker recognition systems. The aim of this work is to create an adaptation for Probabilistic Linear Discriminant Analysis. Special attention is given to unsupervised adaptation. Our tests identify suitable clustering techniques for estimating speaker identity and the number of speakers in the adaptation dataset. The tests use the NIST and Switchboard corpora.
Computer Graphics and Video Features for Speaker Recognition
Fér, Radek ; Matějka, Pavel (referee) ; Černocký, Jan (advisor)
We describe a non-traditional method for speaker recognition that uses features and algorithms used mainly in computer vision. Important theoretical knowledge of computer recognition is summarized first. Boosted Binary Features, a previously proposed method with roots in computer vision, are then described and explored. This method is evaluated on the standard speaker recognition databases TIMIT and NIST SRE 2010. Experimental results are given and compared to standard methods. Possible directions for future work are proposed at the end.
Speaker Recognition on Mobile Phone
Pešán, Jan ; Glembek, Ondřej (referee) ; Černocký, Jan (advisor)
This thesis focuses on implementing a computer speaker recognition system in a mobile phone environment. It describes the principle, function, and implementation of the recognizer on the Nokia N900 mobile phone.
Modelling Prosodic Dynamics for Speaker Recognition
Jančík, Zdeněk ; Fapšo, Michal (referee) ; Matějka, Pavel (advisor)
Most current automatic speaker recognition systems extract speaker-dependent features by looking at short-term spectral information. This approach ignores long-term information. I explored an approach that uses the fundamental frequency and energy trajectories of each speaker. This approach models prosody dynamics over single phonemes or syllables. It is known from the literature that prosodic systems do not work as well as acoustic ones, but that they improve the overall system when fused with it. I verified this assumption by fusing my results with the state-of-the-art acoustic system from BUT. Data from standard evaluation campaigns organized by the National Institute of Standards and Technology are used for all experiments.
Microphone Arrays for Speaker Recognition
Mošner, Ladislav ; Plchot, Oldřich (referee) ; Černocký, Jan (advisor)
This master's thesis deals with distant speaker recognition. When data are captured by a distant microphone, the accuracy of standard recognition drops considerably, so I proposed two approaches to improve the results. The first is the use of a microphone array (a deliberately arranged set of microphones), which can steer a virtual "beam" toward the speaker's position. The second is adaptation of system components (PLDA scoring and the i-vector extractor). Using a simulation of room conditions, I synthesized training and test data from the standard NIST 2010 dataset. I showed that both techniques and their combination lead to a significant improvement in results. I also addressed joint estimation of the speaker's identity and position. While the results in a simulated outdoor environment (without echoes) are promising, the results for an indoor environment (with echoes) are mixed and require further investigation. Finally, I was able to evaluate the system on a limited amount of real data obtained by playing back and re-recording recordings in a real room. While the results for male recordings agree with the simulation, the results for female recordings are inconclusive and require further analysis.
Resilience of Biometric Authentication of Voice Assistants against Deepfakes
Šandor, Oskar ; Firc, Anton (referee) ; Malinka, Kamil (advisor)
With the development of deepfake technology, imitating other people's voices has become much easier. A professional impersonator is no longer needed to imitate a person's voice and potentially deceive a human or a machine. A few recordings of the person's voice, regardless of their content, are enough for attackers to create a voice clone using online tools. The attacker can then create synthetic recordings with content that the person may never have said. These recordings can be misused, for example, for unauthorized use of voice assistants. The aim of this thesis is to determine whether voice assistants can recognize such recordings. The experiments performed show that deepfakes created within a few minutes can bypass the speaker recognition capability of voice assistants and can be used to carry out several attacks.
Exploring New Paths in Neural-Network-Based Speaker Recognition
Sova, Damián ; Matějka, Pavel (referee) ; Glembek, Ondřej (advisor)
Since the assignment of this work is very broad, it was necessary to focus on a single area. This work therefore aims to apply the Stochastic Weight Averaging optimization method to the training of a Deep Neural Network. The first part presents the necessary theoretical knowledge, and the second part describes the course of the experiments. The theoretical part focuses on the complete lifecycle of the training and evaluation process, including a description of each component. The practical part provides a detailed look at each experiment, intended to demonstrate how effectively the performance of the overall speaker recognition system is enhanced. The improvement is achieved by gradually applying various training configurations, taking into account the experience from previous experiments. The key ingredient for successful Stochastic Weight Averaging in the experiments was a sufficiently high learning rate, applied either with a gradual transition or following a cyclic schedule.
