Fire and smoke detection in video
Buzovský, Viktor ; Říha, Kamil (referee) ; Přinosil, Jiří (advisor)
The thesis deals with the detection of fire and smoke in video from real environments. The main task is to choose a suitable model, train it, and then improve its detection capabilities. The first part of the thesis summarizes the theoretical knowledge needed to understand the technical topics discussed. The second, practical part presents the trained model and the subsequent attempts to improve it, first using optical flow and then using additional classification networks. The work concludes with a final implementation of the fire and smoke detector and a proposal for its further improvement. The work also includes the datasets used and created, among other things.
Artificial Intelligence for Video Sonification
Dobrocký, Filip ; Burget, Radim (referee) ; Říha, Kamil (advisor)
This thesis deals with video sonification, the transformation of image into sound. It aims to use state-of-the-art computer vision techniques based on artificial intelligence to create a system capable of algorithmic sound creation applicable in an artistic context. The focus is on the fields of sound art, algorithmic composition and generative music. The thesis includes an implementation of a modular sonification system, written in Python, which uses the modern object detector YOLOv7 along with a multiple object tracking algorithm (implemented in the library Norfair). The fundamentals of the system lie in the systematic assignment of sound objects to objects tracked in the video. The sound creation relies on the SuperCollider platform using the Python API Supriya, incorporating various methods of sound synthesis along with a programmatically created sound database.
Audio Data Interpretation in Image and Video
Dobiášovský, Jiří ; Balík, Miroslav (referee) ; Říha, Kamil (advisor)
The thesis addresses the interpretation of sound data as image and video. The theoretical part first covers the sound signal, its properties, and its representation in the time and frequency domains. It then describes the digital processing of the signal and the individual communication protocols that can be used to transmit audio data or to communicate with audio and video devices. Based on the findings of the theoretical part, a suitable system is chosen for 3D generative visualization of the sound signal based on MIDI data or the signal's envelope. A system is created for processing sound data in the DAW and sending it over the local network using a suitable protocol. In the environment chosen for visualization, a program is created that receives the sent signals and sorts them appropriately. The process of creating visual effects is described, and their use in visualizing the audio signal, together with the communication between the two programs, is demonstrated on suitable examples and a pilot project. The system resulting from this work can be used for real-time 3D generative visualization of music data.
Synthesis of Sound from Video
Lazorčák, Daniel ; Smékal, Zdeněk (referee) ; Říha, Kamil (advisor)
This thesis surveys methods for synthesizing audio from image and video data and describes the implementation of three new synthesis methods. The first part of the thesis provides an overview of existing approaches to sound-from-image synthesis, identifying their advantages, limitations and possible extensions. The second part describes the implementation of VSyntha, an application that synthesizes audio from video in real time with the ability to control musical parameters. The third part describes the ReAmper application, which performs soundscaping using sound objects and musical cues based on the detection and tracking of objects in the image. The fourth part describes the SegMentor application, which creates MIDI files from video using various image segmentation techniques. The implemented methods provide new tools for creating audio and multimedia works and for interacting with visual data in the form of audio, and they open the way for further research and development in the field of sound-from-image synthesis. The results of this work provide an overview of the current state of research and practice in this area and offer opportunities for further development and practical application.
Multi-shuttle control system
Surový, Roman ; Říha, Kamil (referee) ; Přinosil, Jiří (advisor)
The aim of this work is to design a partial concept and technical solution for a control system for carts that move both horizontally and vertically within a racking system. The work mainly addresses communication between the individual carts and planning of the optimal route with safety in mind. The proposed system is implemented and verified in laboratory conditions. As part of the diploma thesis, the overall concept of the system is designed, and a method of inter-cart communication is proposed and verified in laboratory conditions.
Helmet for Virtual and Augmented Reality
Buš, Ondřej ; Přinosil, Jiří (referee) ; Říha, Kamil (advisor)
The thesis focuses on the construction of a functional headset for multimedia content consumption. The acquired theoretical knowledge was used in the design and construction of the resulting headset. This knowledge was consolidated and extended by practical tests on currently available commercial AR/VR headsets. In the practical design, particular effort was made to keep the resulting solution as simple as possible. The components used in the work can be obtained relatively easily via the internet. With the help of advanced techniques, the thesis tries to optimize the visual experience. The most suitable parameters of the optical lenses are selected. The optical system is enhanced to provide an extended horizontal field of view. The thesis discusses the fabrication of the headset using 3D printing. Subsequently, the individual parts are assembled into a single functional unit. The result of the thesis is a working prototype headset suitable for multimedia content consumption.
Image and Video Interpretation of Sound Data
Milota, Vojtěch ; Přinosil, Jiří (referee) ; Říha, Kamil (advisor)
This work deals with image and video interpretation of sound data. It focuses on the time-variant parameters of sound and musical signals and on their analysis. Different approaches to image and video interpretation of musical data and their evolution are investigated. The work includes the realisation of image and video interpretation using different software tools and methods. The possible development of audio-visual production using the latest tools is predicted.
System for preventing unauthorised persons from entering platform areas
Jílek, Michal ; Říha, Kamil (referee) ; Přinosil, Jiří (advisor)
The work deals with the design of a not yet existing security system that secures the station against unauthorized persons, which will increase the level and quality of travel for passengers. The system uses computer vision and software developed in C++ with the OpenCV library. The practical part describes the development of the ticket detector, comparison with the model and validation of the ticket using OCR.
Standard baffle - laboratory device
Hodinka, Tomáš ; Říha, Kamil (referee) ; Balík, Miroslav (advisor)
The purpose of this project is the design and construction of a laboratory device for loudspeaker measurements, a standard baffle. A standard baffle is a rectangular panel with a loudspeaker placed eccentrically. Measurements with this baffle are then done in an anechoic chamber. The main purpose of a standard baffle is to allow loudspeakers to be measured without the effect caused by the pressure properties of a closed baffle and without an acoustic short circuit. The standard baffle is designed to be self-supporting, transferable by 2 people, easy to store, modular and able to support loudspeakers up to 12 inches in diameter. The designed baffle can be disassembled into 3 parts: the base, the panel, and the inserts. The base is a big, heavy board with a key on the upper side. This key fits into a key groove in the panel, which allows the two parts to be assembled. The panel is made of four layers of different types of plywood, which allows the design to strike a compromise between toughness and weight. The inserts are designed so that the replacement and manipulation of loudspeakers during measurement are as fast and as simple as possible. During measurement the loudspeakers are mounted in their respective inserts, which can be mounted on and dismounted from the standard baffle without any tools. The test measurements demonstrated that the standard baffle is unsuitable for reference measurements. Resonance of the panel, reflections from the base, diffraction at the outer edges of the baffle and potentially other causes unduly distort the results of measurements. These distortions appear with considerable consistency even between different loudspeakers. This indicates that they could be filtered out by, for example, using specialized software.
Spatial Function Estimation with Uncertain Sensor Locations
Ptáček, Martin ; Říha, Kamil (referee) ; Poměnková, Jitka (advisor)
This thesis deals with the task of spatial function estimation by means of Gaussian process regression (GPR) under uncertainty in the training positions (sensor positions). First, the theory behind the GPR method working with known training positions is described. This theory is then applied to derive expressions for the GPR predictive distribution at a test position while accounting for the uncertainty of the training positions. Because these expressions have no analytical solution, they were approximated using the Monte Carlo method. The derived method was shown to improve the quality of the spatial function estimate compared with the standard use of GPR and also compared with a simplified solution presented in the literature. The thesis further examines the possibility of using GPR with uncertain training positions in combination with expressions that have an analytical solution. It turns out that obtaining these expressions requires considerable assumptions, which makes the predictive distribution inaccurate from the outset. It also turns out that the resulting method uses the standard GPR expressions in combination with a modified covariance function. Simulations show that this method produces estimates very similar to those of the basic GPR method that assumes known training positions. On the other hand, the predictive variance (the uncertainty of the estimate) is increased for this method, which is the desired effect of accounting for the uncertainty of the training positions.

