National Repository of Grey Literature
Assisted Code Vectorization and Parallelization Using the OpenMP 4.0 Standard
Slouka, Lukáš ; Nikl, Vojtěch (referee) ; Jaroš, Jiří (advisor)
The subject of this bachelor's thesis is code optimization using the OpenMP 4.0 standard, which provides tools for assisted parallelization and vectorization. In addition to describing the OpenMP 4.0 standard, the thesis offers insight into the architecture of modern computers, specifically the cache memory hierarchy and the SSE/AVX vector units, both of which play a major role in optimization. The thesis demonstrates the advantages of optimized code over an unoptimized version on a set of benchmarks, each aimed at a different aspect of optimization.
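As a rough illustration of what assisted parallelization and vectorization look like in OpenMP 4.0, the sketch below annotates a simple loop with the combined `parallel for simd` construct. The function name `saxpy` and its parameters are chosen here for illustration and are not taken from the thesis; the code assumes a compiler with OpenMP 4.0 support (for example, GCC or Clang with `-fopenmp`). Without that flag the pragma is ignored and the loop runs serially with the same result.

```c
#include <stddef.h>

/* Hypothetical example kernel: y[i] = a * x[i] + y[i].
   The pragma asks the compiler to distribute iterations across threads
   and to vectorize each thread's chunk (e.g. with SSE/AVX instructions).
   The restrict qualifiers promise that x and y do not overlap. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Whether such a loop actually runs faster depends on the array size, memory bandwidth, and cache behaviour, which is the kind of effect a benchmark set like the one described above would measure.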
Neural Network Implementation without Multiplication
Slouka, Lukáš ; Baskar, Murali Karthick (referee) ; Szőke, Igor (advisor)
The subject of this thesis is neural network acceleration, with the goal of reducing the number of floating-point multiplications. The theoretical part surveys current trends and methods in neural network acceleration, focusing on binarization techniques, which allow multiplications to be replaced with logical operators. The theory is put into practice in two ways: first, a GPU implementation of the crucial binary operators in the TensorFlow framework, together with a performance benchmark; second, an application of these operators in a simple image classifier. The results are encouraging: the implemented operators achieve a speed-up by a factor of 2.5 compared to highly optimized cuBLAS operators. The last chapter compares the accuracies achieved by binarized models and their full-precision counterparts on various architectures.
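To make the idea of replacing multiplications with logical operators concrete, the following is a minimal CPU sketch of one common formulation, the XNOR-and-popcount dot product. It is an assumption for illustration, not the thesis's GPU implementation. Values in {-1, +1} are packed 32 per word (bit 1 encodes +1, bit 0 encodes -1), and the names `popcount32` and `binary_dot` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable population count: clears the lowest set bit on each iteration. */
static int popcount32(uint32_t v)
{
    int count = 0;
    while (v) {
        v &= v - 1;
        ++count;
    }
    return count;
}

/* Dot product of two {-1,+1} vectors packed as bits (1 = +1, 0 = -1).
   Each vector holds 32 * n_words elements. XNOR marks positions where the
   signs agree (product +1); the remaining positions contribute -1, so
   dot = matches - (total - matches) = 2 * matches - total. */
int binary_dot(const uint32_t *a, const uint32_t *b, size_t n_words)
{
    int matches = 0;
    for (size_t i = 0; i < n_words; ++i)
        matches += popcount32(~(a[i] ^ b[i]));
    return 2 * matches - (int)(32 * n_words);
}
```

In this sketch, 32 multiply-accumulate steps collapse into one XOR, one NOT, and one population count per word, which is where the potential speed-up of binarized operators comes from.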
