National Repository of Grey Literature: 62 records found (showing 1 - 10)
The GPU Accelerated Optimisation of the Water Management Systems
Marek, Jan ; Petrlík, Jiří (referee) ; Jaroš, Jiří (advisor)
The subject of this thesis is the optimization of the storage function of a water management system. The work is based on the dissertation of Ing. Pavel Menšík, Ph.D., Automatization of storage function of water management system. Differential evolution was chosen as the optimization method. A sequential version of the method will be implemented as a first step, followed by CPU accelerated and GPU accelerated versions.
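As a rough illustration of the optimization method named in this abstract, the following is a minimal sequential sketch of differential evolution in its common DE/rand/1/bin form. It is not the thesis code: the function names, the parameter defaults, and the `sphere` test function (a stand-in for the real storage-function objective, which the abstract does not specify) are all assumptions made for the example.

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, f=0.6, cr=0.9,
                           generations=200, seed=0):
    """Minimise `fitness` inside the box `bounds` using DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clamp to the search box
                else:
                    v = pop[i][d]
                trial.append(v)
            s = fitness(trial)
            if s <= scores[i]:  # greedy replacement: keep the better vector
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

def sphere(x):
    """Hypothetical placeholder objective; its minimum is 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_score = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

The trial vectors within one generation are evaluated independently of each other, which is what makes this loop a natural candidate for the CPU and GPU parallel versions the abstract mentions.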
Interactive Cloth Simulation Accelerated by GPU
Melichar, Vojtěch ; Klepárník, Petr (referee) ; Jaroš, Jiří (advisor)
This master's thesis deals with interactive cloth simulation accelerated by GPU. The first part describes the technologies used in the implementation. The second part discusses various simulation methods, focusing mainly on particle systems as the most widely used method. These parts are followed by the design of the program implemented as part of this thesis. The program was implemented in four variants. The first variant is a CPU implementation, which was then optimized with OpenMP. The CUDA implementation is based on these two, and the last variant is an optimized CUDA implementation. All implementations are evaluated in terms of computational complexity and suitability for real-time graphics.
The Parallel Genetic Algorithm for Multicore Systems
Vrábel, Lukáš ; Šimek, Václav (referee) ; Jaroš, Jiří (advisor)
The genetic algorithm is an optimization method aimed at efficiently finding solutions to a wide range of problems. It is based on the principle of evolution and natural selection of the fittest individuals in nature. Because the method is computationally demanding, many ways of parallelizing it have been devised. For historical reasons, however, most of these methods target supercomputers or large-scale computer systems. Modern developments in information technology are bringing ever cheaper and more powerful multicore systems to the personal computer market. This thesis deals with the design of new methods for parallelizing the genetic algorithm that try to make full use of the capabilities of precisely these systems. The methods are then implemented in the C programming language using the OpenMP library for parallelization. The implementation is subsequently used to experimentally evaluate various characteristics of each of the presented methods (speedup over the sequential version, dependence of the convergence of the resulting values on the degree of parallelization or on processor load, ...). The last part of the thesis presents comparisons of the measured values and the conclusions drawn from these measurements. Possible improvements to the methods that follow from these conclusions are then discussed, as well as the possibility of processing a larger number of characteristics for a more precise evaluation of the efficiency of parallelizing genetic algorithms.
Acceleration of Lattice-Boltzmann Algorithms for Bloodflow Modeling
Kompová, Radmila ; Kešner, Filip (referee) ; Jaroš, Jiří (advisor)
This thesis aims to explore possible implementations and optimizations of the lattice-Boltzmann method. This method allows modeling of fluid flow by simulating fictitious particles. The thesis focuses on possible improvements to the existing tool HemeLB, which is designed and optimized for bloodflow modeling. Several vectorization and parallelization approaches that could be included in this tool are explored. An application for comparing selected algorithms, including optimizations for the lattice-Boltzmann method, was implemented as part of the thesis. A set of tests comparing these algorithms by performance, cache usage and overall memory usage was performed. The best performance achieved was 150 million lattice site updates per second.
Simulation of Heat Diffusion in the Brain Using High-Level GPGPU Techniques
Krbila, Martin ; Kadlubiak, Kristián (referee) ; Jaroš, Jiří (advisor)
This master's thesis deals with accelerating heat diffusion simulation using graphics cards. It describes an approach to accelerating an existing MATLAB implementation that is part of the k-Wave package. Various high-level and low-level libraries for GPU programming are introduced and their strengths and weaknesses compared. A complete GPU implementation of the simulation was created as part of this work. It achieves a speedup of around a hundredfold over the existing CPU solution in MATLAB. A module for computing discrete trigonometric transforms on the graphics card was created to accelerate simulation with various boundary conditions. This module achieves a speedup of around tenfold over the best CPU implementation. Another output of this thesis is a performance comparison of several implementations of basic diffusion simulation, each using a different GPGPU technique.
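To make the "basic diffusion simulation" concrete, here is a minimal sketch of one way to step the 1D heat equation u_t = alpha * u_xx, using an explicit finite-difference (FTCS) update on a periodic grid. This is a swapped-in textbook scheme chosen for brevity, not the method used in k-Wave or in the thesis; the function name, grid size, and parameter values are assumptions.

```python
def diffuse_step(u, alpha, dx, dt):
    """One explicit (FTCS) step of u_t = alpha * u_xx on a periodic 1D grid."""
    r = alpha * dt / (dx * dx)
    if r > 0.5:
        # The explicit scheme is only stable for alpha*dt/dx^2 <= 0.5.
        raise ValueError("unstable: need alpha*dt/dx^2 <= 0.5")
    n = len(u)
    # u[i - 1] wraps around at i == 0; (i + 1) % n wraps at the other end.
    return [u[i] + r * (u[i - 1] - 2.0 * u[i] + u[(i + 1) % n]) for i in range(n)]

u0 = [0.0] * 32
u0[16] = 100.0  # a single hot cell in an otherwise cold domain
u = u0
for _ in range(50):
    u = diffuse_step(u, alpha=1.0, dx=1.0, dt=0.25)
```

Every cell is updated from the previous time level only, so all cells of one step can be computed in parallel, which is the property a GPU implementation exploits.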
Implementation of 2D Ultrasound Simulations
Šimek, Dominik ; Vaverka, Filip (referee) ; Jaroš, Jiří (advisor)
This work deals with the design and implementation of 2D ultrasound simulation. Ultrasound simulation has applications in medicine, biophysics and image reconstruction. One example is High Intensity Focused Ultrasound, which is used for diagnosing and treating cancer. The program is part of the k-Wave toolbox and is designed for supercomputer systems, specifically machines with a shared memory architecture. It is implemented in C++ and accelerated with OpenMP. The designed solution makes it possible to run large-scale simulations in 2D space. The work also deals with merging and unifying the 2D and 3D simulations using modern C++. A realistic use case is ultrasound simulation for transcranial neuromodulation and neurostimulation in large domains with more than 16384x16384 grid points. A simulation of this size may take several days with the original MATLAB 2D k-Wave. The new implementation achieves a speedup of up to 8 on the Anselm and Salomon supercomputers.
High Performance Applications on Intel Xeon Phi Cluster
Kačurik, Tomáš ; Hrbáček, Radek (referee) ; Jaroš, Jiří (advisor)
The main topic of this thesis is the implementation and subsequent optimization of high performance applications on a cluster of Intel Xeon Phi coprocessors. Two approaches to solving the N-Body problem are used to demonstrate program execution on a cluster of processors, coprocessors, or both device types. Two particular versions of the N-Body problem were chosen: the naive version and Barnes-Hut. Both were implemented and optimized. To make the results easier to compare, speedup is reported only relative to single-node runs using processors only. For the naive version, a 15-fold speedup was achieved when using a combination of processors and coprocessors on 8 computational nodes. The performance in this case was 9 TFLOP/s. Based on the obtained results, the thesis identifies the advantages and disadvantages of program execution in distributed environments using processors, coprocessors, or both.
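For readers unfamiliar with the naive N-Body version mentioned here, the sketch below computes gravitational accelerations by visiting every pair of bodies, which costs O(N^2) work per time step. It is an illustrative sketch only, not the thesis implementation; the function name, the gravitational constant `g=1.0`, the softening term `eps`, and the sample bodies are assumptions.

```python
import math

def accelerations(pos, mass, g=1.0, eps=1e-3):
    """All-pairs gravitational accelerations; the cost grows as O(N^2)."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            # Softening keeps the denominator away from zero for close bodies.
            r2 = sum(c * c for c in d) + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += g * mass[j] * d[k] * inv_r3
    return acc

pos = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
mass = [1.0, 2.0, 3.0]
acc = accelerations(pos, mass)
```

The outer loop over `i` has no dependencies between iterations, so bodies can be distributed across threads or devices. Barnes-Hut reduces the pair count by approximating distant groups of bodies with a tree, at the price of a less regular workload.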
The Efficient Implementation of the Genetic Algorithm Using Multicore Processors
Kouřil, Miroslav ; Žaloudek, Luděk (referee) ; Jaroš, Jiří (advisor)
This diploma thesis deals with the acceleration of an advanced genetic algorithm. Discrete and continuous versions of the UMDA genetic algorithm were chosen for implementation. The main part of the acceleration is the use of the SSE instruction set, which was applied in particular to the functions for calculating fitness and sampling the new population. A pseudorandom number generator that also uses the SSE instruction set was then implemented. With these changes the discrete algorithm reached a speedup of up to 4.6. Finally, the algorithms were modified to use OpenMP, which allows blocks of code to run in multiple threads. The continuous version of the algorithm is not well suited to parallelization because its computational complexity is low, whereas the discrete version is well suited to it. The two implemented versions reached total speedups of up to 4.9 and 7.2.
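The two steps this abstract singles out for acceleration, fitness calculation and sampling of the new population, can be seen in the following minimal sketch of a discrete UMDA. It is plain scalar Python rather than the SSE and OpenMP code the thesis describes; the OneMax fitness (count of one-bits), the function name, the parameter defaults, and the probability clamping are assumptions chosen to keep the example short.

```python
import random

def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=40, seed=1):
    """Discrete UMDA: sample from per-bit probabilities, re-estimate from the best."""
    rng = random.Random(seed)
    p = [0.5] * n_bits  # univariate model: one independent probability per bit
    best, best_fit = None, -1
    for _ in range(generations):
        # Sampling step: draw every bit of every individual independently.
        pop = [[1 if rng.random() < p[k] else 0 for k in range(n_bits)]
               for _ in range(pop_size)]
        # Fitness step: OneMax fitness is simply the number of ones.
        pop.sort(key=sum, reverse=True)
        if sum(pop[0]) > best_fit:
            best, best_fit = pop[0], sum(pop[0])
        selected = pop[:n_select]  # truncation selection
        for k in range(n_bits):
            freq = sum(ind[k] for ind in selected) / n_select
            # Keep probabilities off 0 and 1 so no bit value is lost for good.
            p[k] = min(max(freq, 1.0 / n_bits), 1.0 - 1.0 / n_bits)
    return best, best_fit

best, best_fit = umda_onemax()
```

Both the sampling loop and the fitness evaluation apply the same operation to many independent bits and individuals, which is why they map well onto SIMD instructions and threads.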
Acceleration of Photoacoustic Imaging
Nedeljković, Sava ; Bordovský, Gabriel (referee) ; Jaroš, Jiří (advisor)
The main goal of this thesis is to design a new method for reconstructing images from photoacoustic imaging data. Photoacoustic imaging is a very popular non-invasive imaging method based on detecting ultrasound waves induced by a laser beam. The imaging process generates a large amount of data, which makes image reconstruction very time-consuming. This thesis demonstrates image reconstruction by back-projection, an algorithm simple enough to be adapted to modern processor architectures that allow various forms of optimized computation. Two variants of the algorithm were designed: one from the point of view of the pixel and one from the point of view of the sensor that detects the ultrasound waves. Both variants were implemented in three different ways: using vector parallelism, thread parallelism, and parallelism on a graphics card (GPU). All three implementations of both variants were tested and the results were compared with those of time-reversal reconstruction, a more accurate but many times slower algorithm. The results showed that GPU parallelism offers the fastest computation, approximately 200 times faster than the time-reversal algorithm, and can therefore also be used in real-time applications.
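The pixel-oriented variant of back-projection can be sketched as a simple delay-and-sum loop: for each pixel, add up the sample each sensor would have recorded if a wave had started at that pixel. The code below is a hedged illustration under simplifying assumptions (2D geometry, constant speed of sound, nearest-sample lookup, no weighting), not the algorithm as implemented in the thesis; all names and the synthetic test data are hypothetical.

```python
import math

def back_project(signals, sensors, pixels, c, fs):
    """Pixel-driven back-projection: each pixel sums the sample that a wave
    emitted at that pixel would produce at every sensor."""
    image = []
    for px, py in pixels:
        total = 0.0
        for (sx, sy), trace in zip(sensors, signals):
            delay = math.hypot(px - sx, py - sy) / c  # time of flight
            idx = int(round(delay * fs))              # nearest recorded sample
            if idx < len(trace):
                total += trace[idx]
        image.append(total)
    return image

# Synthetic data: one point source seen by a line of 8 sensors (arbitrary units).
sensors = [(float(x), 0.0) for x in range(8)]
source = (3.0, 4.0)
c, fs, n_samples = 1.0, 10.0, 200
signals = []
for sx, sy in sensors:
    trace = [0.0] * n_samples
    trace[int(round(math.hypot(source[0] - sx, source[1] - sy) / c * fs))] = 1.0
    signals.append(trace)

pixels = [(float(x), float(y)) for y in range(1, 8) for x in range(8)]
image = back_project(signals, sensors, pixels, c, fs)
brightest = pixels[image.index(max(image))]
```

In this pixel-driven form each output pixel is written by exactly one loop iteration, so pixels can be processed in parallel without write conflicts. A sensor-driven variant would instead loop over sensors and scatter each sample into the pixels it could have come from.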
Modern Libraries for GPGPU Programming
Šuba, Patrik ; Kadlubiak, Kristián (referee) ; Jaroš, Jiří (advisor)
The main goal of this thesis is to survey libraries for graphics cards and to use these libraries to create a set of test cases. The test cases consist of mathematical operations on matrices and vectors. Two applications were created for the test cases. The first was implemented in C++ using the OpenMP library. The second was implemented in C++ using the cuBLAS and CUDA libraries. The implementation part of this work gives the reader an insight into GPGPU programming and shows its practical use. The outcome of this work is a verification of the performance and throughput of the graphics cards provided by the IT4Innovations group. The results of the applications are then compared with the reference values from the graphics card manufacturer and across the libraries used.
