keywords:"MapReduce" - Search Results - Digital Repository

	Automatic Image Labelling Lukáč, Michal ; Řezníček, Ivo (referee) ; Hradiš, Michal (advisor) This thesis focuses on automatic labelling of images with semantic categories. It describes the theory of classification and local feature detection. It explains the fundamental machine learning models used for image tagging and how such models can be trained with gradient descent. It proposes a solution that uses a hierarchy for ImageNet and tags images with attributes. The MapReduce computing model is considered for learning on big data sets. The last part describes the implementation and the experimental and test results.
	Optimization of the Hadoop Platform for Distributed Computation Čecho, Jaroslav ; Smrčka, Aleš (referee) ; Letko, Zdeněk (advisor) This thesis focuses on the possibilities of improving the Apache Hadoop framework by offloading some computation to a graphics card using NVIDIA CUDA technology. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model called MapReduce. NVIDIA CUDA is a platform that allows a graphics card to be used for general-purpose computation. This thesis contains a description and experimental implementations of suitable computations inside the Hadoop framework that can benefit from being executed on a graphics card.
	BigData Approach to Management of Large Netflow Datasets Melkes, Miloslav ; Ráb, Jaroslav (referee) ; Ryšavý, Ondřej (advisor) This master's thesis focuses on distributed processing of big data from network communication. It begins by exploring network communication based on the TCP/IP model, with a focus on the data units at each layer that must be processed during analysis. For the actual processing of big data, it describes the MapReduce programming model, the architecture of Apache Hadoop, and its use for processing network flows on a computer cluster. The second part of the thesis deals with the design and subsequent implementation of an application for processing network flows from network communication. This part discusses the main and problematic parts of the implementation. The thesis ends with a comparison with available applications for network analysis and an evaluation of a set of tests, which confirmed linear growth of acceleration.
	Implementation of Regular Expression Grouping in MapReduce Paradigm Šafář, Martin ; Dvořák, Milan (referee) ; Kaštil, Jan (advisor) The main contribution of this thesis is the design and implementation of a program that uses the MapReduce paradigm and Apache Hadoop to accelerate regular expression grouping. The thesis also describes the algorithms used for regular expression grouping and proposes some improvements to them. Experiments carried out in this thesis show that a cluster of 20 computers can speed up the grouping tenfold.
	Web Page Visitor Monitoring Jelič, Martin ; Očenášek, Pavel (referee) ; Burget, Radek (advisor) This bachelor's thesis deals with web analytics, its terms, principles, related problems and their solutions. Several web analytics tools are described in detail. The focus of the work is the design and implementation of a new web analytics tool that makes it possible to monitor web page traffic and evaluate the obtained data for internet project management purposes. The results of testing the tool on real traffic are presented and compared with the results of existing tools, which differ from the new tool in some characteristics. The work also discusses the advantages of using the MongoDB document-oriented database for website traffic monitoring.
	Hadoop NoSQL database Švagr, Lukáš ; Palovská, Helena (advisor) ; Tomášková, Barbora (referee) The theme of this work is the Hadoop HBase database storage. The main goal is to demonstrate the principles of its operation and show its main uses. The entire text assumes that the reader is already familiar with the basic principles of NoSQL databases. The theoretical part briefly describes the basic concepts of databases and then mostly covers Hadoop and its properties. The work also includes a practical part, which describes how to install a database repository and illustrates basic database operations in two simple programs. The practical part also contains case studies that report the current use of Hadoop in world-famous companies.
	Comparison of CouchDB and MarkLogic databases Sapegina, Evgeniya ; Palovská, Helena (advisor) ; Chlapek, Dušan (referee) This bachelor's thesis deals with NoSQL concepts and with the database management systems CouchDB and MarkLogic in particular. The aim of the work is to present the reasons for the emergence of NoSQL DBMSs, describe selected properties of these systems and provide the generally accepted classification of these DBMSs based on their data model. Another aim is to describe the basic principles of CouchDB and MarkLogic, to demonstrate on selected use cases some of the reasons companies chose these databases, and then to compare the two DBMSs and draw attention to important aspects that may be crucial when deciding whether to deploy either of them. I consider the contribution of this work to be the comparison of CouchDB and MarkLogic, two DBMSs that are popular and interesting for their properties, which creates a basis for selecting one of them for a specific project. The work is divided into four chapters. The second and third chapters form the theoretical part. The second chapter is devoted to the properties of NoSQL DBMSs. The third chapter discusses the characteristics of CouchDB and MarkLogic. The fourth chapter forms the practical part, in which the two systems are compared according to various aspects.
	Hadoop: HDFS, MapReduce and computing in IBM BigInsights Fessl, Adam ; Řezáč, Miroslav (advisor) ; Novotný, Ota (referee) This undergraduate thesis belongs to the field of Big Data. In particular, it concerns Hadoop, an open-source tool for distributed processing and storage of data. The objective of the thesis is to provide the reader with theoretical knowledge and basic principles of Apache Hadoop, concentrating on the HDFS file system and the MapReduce model for distributed computing. The theoretical knowledge and principles are illustrated on a modified WordCount application in IBM InfoSphere BigInsights. The work consists of three parts. The first part deals with Hadoop and its basic modules. The second provides information about the prominent Hadoop distributors, with special attention given to IBM. The last part presents practical computing. The thesis offers a comprehensive view of Hadoop that combines a technical point of view with practical application. Both are illustrated with particular examples and supplemented with methods for operating Hadoop.
	Document-oriented open source database systems Regner, Tomáš ; Chlapek, Dušan (advisor) ; Tomášková, Barbora (referee) One of the objectives of this bachelor's thesis is to introduce readers to the motives that led developers to seek alternatives to traditional relational database systems, which gradually resulted in the emergence of the NoSQL movement, and to familiarize them with the milestones and most important projects in its history. It then introduces some basic characteristics common to NoSQL systems, such as issues of scalability and distributed data processing, and the generally accepted categorization of NoSQL systems based on their data model. It focuses in more detail on document-oriented database systems, summarizes the situation in this field and discusses its two currently most widely used representatives, MongoDB and CouchDB. It describes the basic mechanisms of their operation and demonstrates their use on examples. It then defines evaluation criteria for comparing these products and evaluates how well the currently available versions of these systems fulfil them.
	Practical possibilities of using Apache CouchDB Pultera, Ondřej ; Palovská, Helena (advisor) ; Strossa, Petr (referee) This bachelor's thesis focuses on the practical possibilities of using Apache CouchDB, a document-oriented database system. In the first chapter I explain the basic theoretical terms and principles related to Apache CouchDB. I also briefly introduce database systems based on the relational model. The second chapter describes the architecture and properties of Apache CouchDB. In this chapter I also try to explain the principles of running Apache CouchDB in a distributed system and consider the need for new database systems. In the third chapter I review case studies of successful Apache CouchDB implementations and point out scenarios for which Apache CouchDB is a good candidate. In the next chapter I focus on practical usage of the system. I mention the tool for administering Apache CouchDB and describe some settings. I also show examples of how to perform basic operations through the HTTP interface. The examples are written in the scripting languages PHP and JavaScript. This chapter introduces Apache CouchDB from the point of view of an administrator or developer. The reader of this work should understand the basic concepts of Apache CouchDB and be able to determine the usability of the system for a concrete purpose.
