National Repository of Grey Literature: 54 records found, showing records 1 - 10
Re-identification of Objects in Video Stream using Data Analytics
Smrž, Dominik ; Skopal, Tomáš (advisor) ; Lokoč, Jakub (referee)
The widespread use of surveillance cameras provides data that can be used in various areas, such as security and urban planning. An important stepping stone for extracting useful information is matching an observed object across different points in time or different cameras. In this work, we focus specifically on this part of video processing, usually referred to as re-identification. We split our work into two stages. In the first part, we focus on the spatial and temporal information regarding the detected objects. In the second part, we combine this metadata with the visual information. To extract useful descriptors from the images, we use methods based on the color distribution as well as state-of-the-art deep neural networks. We also annotate a dataset to provide a comprehensive evaluation of our approaches. Additionally, we provide the custom tool we used to annotate the dataset.
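As a rough illustration of what a descriptor "based on the color distribution" can look like, the following minimal sketch builds a normalized joint RGB histogram and compares two histograms by histogram intersection. It is not the thesis's implementation: the function names, the bin count, and the choice of histogram intersection as the similarity measure are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized joint RGB histogram with bins_per_channel**3 bins.

    Hypothetical helper for illustration; the bin count is an assumption.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts: Counter = Counter()
    for r, g, b in pixels:
        # Quantize each channel to 0..bins_per_channel-1, then flatten the 3-D index.
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        counts[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1
    total = len(pixels)
    return [counts[i] / total for i in range(bins_per_channel ** 3)]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Usage: compare the crops of two detections given as flat pixel lists.
red_crop = [(255, 0, 0)] * 4
blue_crop = [(0, 0, 255)] * 4
mixed_crop = red_crop[:2] + blue_crop[:2]
similarity_red_mixed = histogram_intersection(
    color_histogram(red_crop), color_histogram(mixed_crop)
)
```

In a re-identification setting, such a score would typically be only one signal among several, combined with the spatial and temporal metadata that the abstract describes.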
Shot detection and known-scene search in video using deep learning methods
Souček, Tomáš ; Lokoč, Jakub (advisor) ; Peška, Ladislav (referee)
Video retrieval represents a challenging problem with many caveats and sub-problems. This thesis focuses on two of these sub-problems, namely shot transition detection and text-based search. In the case of shot detection, many solutions have been proposed over the last decades. Recently, deep learning-based approaches improved the accuracy of shot transition detection using 3D convolutional architectures and artificially created training data, but one hundred percent accuracy is still an unreachable ideal. In this thesis, we present TransNet V2, a deep network for shot transition detection that reaches state-of-the-art performance on respected benchmarks. In the second case, text-based search, deep learning models that project a textual query and video frames into a joint space have proved effective for text-based video retrieval. We investigate these query representation learning models in a known-item search setting and propose improvements to the text encoding part of the model.
Searching Image Collections Using Deep Representations of Local Regions
Bátoryová, Jana ; Lokoč, Jakub (advisor) ; Fink, Jiří (referee)
In a known-item search (KIS) task, the goal is to find a previously seen image in a multimedia collection. In this thesis, we discuss two different approaches based on the visual description of the image. In the first one, the user creates a collage of images (using images from an external search engine), based on which we provide the most similar results from the dataset. Our results show that preprocessing the images in the dataset by splitting them into several parts is a better way to work with the spatial information contained in the user input. We compared this approach to a baseline that does not utilize the spatial information and to an approach that alters a layer in a deep neural network. We also present an alternative approach to the KIS task: search by faces. In this approach, we work with the faces extracted from the images. We investigate face representations for their ability to sort faces by similarity. Then we present a structure that allows easy exploration of the set of faces. We provide a demo implementing all presented techniques.
Known-item search with relevance to SOM feedback
Veselý, Patrik ; Lokoč, Jakub (advisor) ; Vomlelová, Marta (referee)
Multimedia searching is usually realized by means of text search, where a large dataset is sorted by relevance to a given text query. However, if users search for just one scene or image, sequential browsing of a larger result set is often necessary, without a guarantee that the object is found in a reasonable time. This work focuses on methods relying on relevance feedback for more effective searching in a large collection of one million images. Several relevance update and display selection approaches are compared using simulations of relevance feedback. Our experiments reveal that the investigated models can benefit modern multimedia search engines.
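To make the idea of simulated relevance feedback more concrete, here is a minimal sketch of one possible simulation loop. It is not one of the models evaluated in the thesis: the Gaussian similarity, the multiplicative score update, the top-k display selection, and all names and parameters are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Optional, Sequence


def similarity(a: Sequence[float], b: Sequence[float], sigma: float = 0.5) -> float:
    """Gaussian kernel on the Euclidean distance between two feature vectors."""
    return math.exp(-math.dist(a, b) ** 2 / (2 * sigma ** 2))


def simulate_search(features: List[Sequence[float]], target: int,
                    display_size: int = 8, max_rounds: int = 25) -> Optional[int]:
    """Return the number of displays shown until the target appears, else None."""
    n = len(features)
    scores = [1.0 / n] * n
    shown = set()
    for round_no in range(1, max_rounds + 1):
        # Display selection (assumed): highest-scored images not shown before.
        candidates = sorted((i for i in range(n) if i not in shown),
                            key=lambda i: -scores[i])
        display = candidates[:display_size]
        if not display:
            return None
        if target in display:
            return round_no
        shown.update(display)
        # Simulated user: picks the displayed image most similar to the target.
        picked = max(display, key=lambda i: similarity(features[i], features[target]))
        # Relevance update (assumed): reweight by similarity to the pick, renormalize.
        scores = [s * similarity(features[i], features[picked])
                  for i, s in enumerate(scores)]
        total = sum(scores)
        scores = [s / total for s in scores] if total > 0 else [1.0 / n] * n
    return None


# Usage: 200 random 2-D "images"; count displays needed to reach image 150.
rng = random.Random(0)
features = [(rng.random(), rng.random()) for _ in range(200)]
rounds_needed = simulate_search(features, target=150)
```

Running such a loop over many randomly chosen targets and averaging the number of displays is one way to compare different update and display selection strategies without recruiting real users.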
Automatic recognition of musical notation from audio data
Čermák, Marek ; Lokoč, Jakub (advisor) ; Hajič, Jan (referee)
Title: Automatic recognition of musical notation from audio data Author: Marek Čermák Department: Department of Software Engineering Supervisor: doc. RNDr. Jakub Lokoč, Ph.D. Abstract: The goal of this thesis is the design and implementation of an application that uses convolutional neural networks to generate musical notation from audio data. The application is able to train a neural network using input files in the MIDI (Musical Instrument Digital Interface) format and pair all sections of the music with their audio form. The neural network can be trained on a user-specified collection of MIDI files or on randomly generated music. Each instrument in the MIDI standard can be assigned a network whose output is the set of notes playing in the given time section. Iterating continuously over the audio data, the network generates sections of active notes, which are then concatenated into the output file. The application is also capable of recognizing words from audio using an external service. Keywords: musical notation, neural network, deep learning, audio recognition, MIDI
Evaluation of Keyword-Based Search Models for Known-Item Search
Mejzlík, František ; Lokoč, Jakub (advisor) ; Skopal, Tomáš (referee)
Video retrieval over large datasets is still a very challenging task, which is becoming even more relevant with the rapidly growing volume of unannotated data available. Known-item search, as one of the video retrieval tasks, is constrained primarily by the limited ability of users to formulate a suitable query and by the low effectiveness of search models. This thesis focuses mainly on selected search models based on image classification, which we also compare with a commercial solution. We examine how to transform the network output and which models to use. The effect of iterative user query reformulation on overall search effectiveness is also investigated. We also present a simple simulated user model for generating artificial queries, and supporting software for data collection and model evaluation in a web interface.
Effective known-item search in an initial query result set in the VIRET tool
Škrhák, Vít ; Lokoč, Jakub (advisor) ; Čech, Přemysl (referee)
Modern methods for effective video retrieval combine several research areas, especially similarity search, machine learning and data visualization. Selected approaches from these areas are integrated into complex search systems, which are tested and compared at international video search competitions. An example of such a system is VIRET, developed at KSI MFF UK. Although VIRET represents a state-of-the-art system, it is necessary to further analyze and develop ranking models and variants of interfaces for result set browsing. This bachelor thesis focuses on the implementation and testing of a method for result set visualization in a 2D grid using self-organizing maps and (hierarchical) browsing. The implemented method is experimentally compared with sequential browsing in the VIRET tool.
Known-item search in image datasets using simple color sketches
Dräxler, Peter ; Lokoč, Jakub (advisor) ; Iser, Tomáš (referee)
With the growing amount of multimedia content, the availability of quality search tools is becoming increasingly important. Without suitable sample queries, it is difficult to evaluate the quality of any search algorithm. In this work, we search for a known image in a database. We describe an experiment in which 2,500 simple color drawings were collected from real users. Using these drawings, we evaluate how accurately users are able to remember the colors in an image. We use the obtained data to evaluate the accuracy of various search models. The work also includes a web application for searching in images.
Content-based exploration of unstructured data
Čech, Přemysl ; Lokoč, Jakub (advisor) ; Barthel, Kai Uwe (referee) ; Gudmundsson, Gylfi Thor (referee)
Effective analysis, searching and browsing of arbitrary multimedia collections is still a challenging task. To perform a search among multimedia objects, a similarity model first has to be defined. Such a model establishes methods describing how the content of individual objects is processed and how the key features and descriptors that are used for modeling similarity between objects are formed. This task is not trivial, since there can be many ways of determining how to comprehend the content of multimedia data. Furthermore, with the growing size of contemporary database collections, multimedia retrieval and exploration are extremely computationally intensive. Hence, researchers investigate supporting index structures that can evaluate similarity queries and respond to users' queries in almost real time, even on datasets containing billions of objects. Another very important aspect of a retrieval system is the user interface for defining queries as well as presenting retrieved results. A multimedia system should offer various inputs for formulating users' queries, especially for situations in which a user cannot provide an ideal query example. Finally, a well-arranged and easy-to-read interface for visualization of retrieved results is essential for the success of a multimedia exploration and...
Object detection for video surveillance using the SSD approach
Dobranský, Marek ; Lokoč, Jakub (advisor) ; Božovský, Petr (referee)
Surveillance cameras serve various purposes ranging from security to traffic monitoring and marketing. However, with the increasing number of cameras in use, manual video monitoring has become too laborious. In recent years, a lot of development in artificial intelligence has been focused on processing the video data automatically and then outputting the desired notifications and statistics. This thesis studies state-of-the-art deep learning models for object detection in surveillance video and takes an in-depth look at the SSD architecture. We aim to enhance the performance of SSD by updating its underlying feature extraction network. We propose to replace the originally used VGG model with a selection of modern ResNet, Xception and NASNet classification networks. The experiments show that the ResNet50 model offers the best trade-off between speed and precision, while significantly outperforming VGG. With a series of modifications, we improved the Xception model to match the ResNet performance. On top of the architecture-based improvements, we analyze the relationship between SSD and the number of detected classes and their selection. We also designed and implemented a new detector that uses the temporal context provided by the video frames. This detector delivers enhanced precision while...
