Title: Datawarehouse
Authors: Ragab Negm, Hussein Mohamed Abdelhaq ; Merunka, Vojtěch (thesis supervisor) ; Martin, Martin (opponent)
Document type: Master's theses
Language: eng
Publisher: Česká zemědělská univerzita v Praze
Abstract: Firms produce data at ever-increasing rates and are finding new ways to use it to create business value. The generated volumes of data create a need for better and cheaper storage options that also allow the data to be utilized. Data warehouses have emerged as the most appropriate tool for this task. However, data warehouses come with significant costs, both human and financial. The pool of technologies for implementing data warehouses is varied. This project aims to provide a comparative implementation using two of these technologies, namely Microsoft SQL Server and Apache Hadoop. The project covers the different phases of building a data warehouse: the requirements specification phase; the design phase, including a compact comparison between the entity-relationship and dimensional modeling design techniques and the process of building a dimensional model based on the application data sources; and the extract-transform-load phase. The two technologies are then compared on data capacity, data loading, connectivity and data querying. The project concludes that the choice between Microsoft SQL Server and Apache Hadoop is not a matter of recommending one over the other but should be based on the needs, resources and the existing ecosystem. Hadoop would be the choice for larger amounts of data, unstructured or irregular data formats, and when licensing fees are unaffordable. On the other hand, Microsoft SQL Server would be the better choice when the data is structured, the anticipated data volumes are suitable and the rest of the ecosystem is Microsoft based. Future development of this project should cover new ways to make Hadoop more efficient with smaller data volumes.

Institution: Česká zemědělská univerzita
Document availability information: Available in the ČZU repository.
Original record: https://is.czu.cz/zp/index.pl?podrobnosti_zp=202160

NUŠL permanent link: http://www.nusl.cz/ntk/nusl-257745


The record appears in these collections:
Education > Public universities > Česká zemědělská univerzita
University qualification theses > Master's theses
Record created 2016-09-21, last modified 2022-03-03.


No document is attached.