|
Evaluation of Sources of Human Speech for Deepfake Creation
Frič, Michal ; Malinka, Kamil (referee) ; Firc, Anton (advisor)
Voice deepfakes, driven by rapid advances in artificial intelligence and machine learning, are a dual-use technology that brings both significant benefits and risks. These synthetic voice outputs are becoming increasingly realistic, thanks to easy access to large amounts of human speech from various sources. This thesis examines the suitability of these sources for creating voice deepfakes. We identified and evaluated several speech sources and developed methodologies for assessing their quality, availability, diversity, and frequency of content updates. The evaluation also included an analysis of how source characteristics affect deepfake quality and the effectiveness of detection by both software and human evaluators. The findings show that all identified sources can provide recordings of sufficient quality to create high-quality, often undetectable deepfakes. They also highlight the specific strengths and weaknesses (measured properties) of the individual sources. During testing, an anomaly was discovered in the detection software that allows deepfakes to be modified so that they evade detection. In addition, it was found that less than 10 seconds of human speech can be enough to create a high-quality deepfake, with the length and quality of the input recordings directly linked to the quality of the deepfake.
|
|
Voice Conversion
Hodaň, David ; Novotný, Ondřej (referee) ; Černocký, Jan (advisor)
Voice conversion is the process of transforming the speech parameters of one speaker so that his/her speech sounds as if it were spoken by someone else. This thesis presents a short summary of several techniques currently used for conversion. First, the theory of voice creation is described, with an emphasis on the key attributes that characterize and identify a speaker’s voice. Methods for voice modification are discussed, together with the advantages and pitfalls that determine the use cases for which each method is suitable. A high-level overview of how speech is transformed between the source and the target speakers is presented. This description is subsequently used to design a voice conversion system intended to demonstrate one of the possible approaches to the conversion problem. The process of conversion consists of two phases: training and synthesis. As part of this project, a computer program for voice conversion based on the MATLAB programming environment has been developed. Its design, implementation and results are discussed.
|