National Repository of Grey Literature: 38 records found (showing 1 - 10)
Distributed Processing of IP flow Data
Krobot, Pavel ; Kořenek, Jan (referee) ; Žádník, Martin (advisor)
This thesis deals with distributed processing of IP flow data. The main goal is to implement a software collector that can store and process very large amounts of network data. The thesis examines Hadoop, an open-source framework for distributed processing of large data sets based on the MapReduce paradigm. Experiments with this system provided a comparison with current systems and revealed weaknesses of the framework. Based on these findings, a specification and a scheme for extending the current software collector were created. Following this scheme, a query framework was implemented for the resulting collector, since querying is considered the most critical part of distributed processing of IP flow data. Experiments with the implementation show significant performance growth and linear scalability for some types of queries.
Distributed Forensic Digital Data Repository
Josefík, Martin ; Burget, Radek (referee) ; Rychlý, Marek (advisor)
This work deals with the design of a distributed repository for storing digital forensic data. The theoretical part describes digital forensics and its purpose. It also explains Big Data and suitable storage systems, including their properties, advantages, and disadvantages. The main part of the thesis deals with the design and implementation of distributed storage for digital forensic data. The design also focuses on suitable indexing of the stored data and on supporting new types of digital forensic data. The performance of the implemented system was evaluated on one chosen type of digital forensic data, PCAP files.
Processing and Visualization of Military Sensor Data
Boychuk, Maksym ; Burget, Radek (referee) ; Rychlý, Marek (advisor)
This thesis deals with the creation, visualization, and processing of data in a military environment. The task is to design and implement a system that enables the creation, visualization, and processing of ESM data. The result is the ESMBD application, which supports both a classical approach (a relational database) and Big Data technologies for data storage and manipulation. The thesis also compares data processing speed between the classical approach (a Postgres database) and Big Data technologies (Cassandra databases and Hadoop).
Scalable machine learning using Hadoop and Mahout tools
Kryške, Lukáš ; Atassi, Hicham (referee) ; Burget, Radim (advisor)
This bachelor’s thesis compares several tools for building a scalable machine learning platform and describes their advantages and disadvantages. It also demonstrates in practice the functionality of such a platform based on the Apache Hadoop and Apache Mahout tools, and measures the performance of the K-Means algorithm for a total of five computing nodes.
Implementation of Regular Expression Grouping in MapReduce Paradigm
Šafář, Martin ; Dvořák, Milan (referee) ; Kaštil, Jan (advisor)
The main contribution of this thesis is the design and implementation of a program that uses the MapReduce paradigm and Apache Hadoop to accelerate regular expression grouping. The thesis also describes the algorithms used for regular expression grouping and proposes some improvements to them. Experiments carried out in this thesis show that a cluster of 20 computers can speed up the grouping tenfold.
Big Data Processing from Large IoT Networks
Benkő, Krisztián ; Podivínský, Jakub (referee) ; Krčma, Martin (advisor)
The goal of this diploma thesis is to design and develop a system for collecting, processing, and storing data from large IoT networks. The developed system introduces a complex solution able to process data from various IoT networks using the Apache Hadoop ecosystem. The data are processed in real time and stored in a NoSQL database, and are also stored in the file system for potential later processing. The system is optimized and tested using data from an IQRF network. The data stored in the NoSQL database are visualized, and the system periodically generates derived predictions. Users connect to the system via an information system, which can automatically generate notifications when monitored values are out of range.
Big Data
Bútora, Matúš ; Bartík, Vladimír (referee) ; Hruška, Tomáš (advisor)
The aim of this bachelor's thesis is to describe the Big Data issue and OLAP aggregate operations. These operations are applied using Apache Hadoop technology. Most of the work focuses on describing this technology. The last chapter presents the application and implementation of the aggregate operations, followed by the conclusion of the work and possibilities for future development.
Distributed Big Data Processing on the Java Platform
Tutko, Jakub ; Rychlý, Marek (referee) ; Burget, Radek (advisor)
This thesis focuses on distributed Big Data processing on the Java platform, together with graph databases. It analyses several graph database distributions and the possibilities for connecting them to the Apache Hadoop system for distributed data processing. To test the effectiveness of the database solutions, the thesis delivers an application that downloads data from the social networks Twitter and Facebook. The application can write and analyse data using two different database frameworks, Halyard and HGraphDB.
Application for Big Data
Blaho, Matúš ; Bartík, Vladimír (referee) ; Hruška, Tomáš (advisor)
This work describes and analyses the Big Data concept, its processing, and its use in decision support. The proposed processing is based on the MapReduce concept designed for Big Data processing. The theoretical part is largely about the Hadoop system, which implements this concept. Understanding it is key to properly designing applications that run within it. The work also contains the design of specific Big Data processing applications. The implementation part describes Hadoop system management, the implementation of MapReduce applications, and their testing on data sets.
Scalable preprocessing of data using Hadoop tool
Marinič, Michal ; Šmirg, Ondřej (referee) ; Burget, Radim (advisor)
The thesis is concerned with scalable pre-processing of data using the Hadoop tool, which is used for processing large volumes of data. The first, theoretical part explains the functioning and structure of the basic elements of the Hadoop Distributed File System and the MapReduce methods for parallel processing. The second, practical part describes the implementation of a basic Hadoop cluster in pseudo-distributed mode for easy program debugging, and of a Hadoop cluster in fully distributed mode for simulating practical use.
