Column-oriented and Image Data Format Benchmarks
Tarageľ, Marián ; Bartl, Vojtěch (reviewer) ; Špaňhel, Jakub (supervisor)
This bachelor's thesis evaluates different data formats for storing tabular and image data. To this end, a new benchmark of data formats was designed and divided into three benchmark suites: uncompressed tabular formats, compressed tabular formats, and image storage. The overall tabular benchmark results suggest that Feather is the fastest tabular format for both saving and reading, and that Parquet is the most memory-efficient. The results of the image storage benchmark show that SQLite is the fastest image storage option and that the PNG format requires the least space. These results can contribute to a better understanding of how different data formats behave and help in choosing the right format for tabular and image data.

