National Repository of Grey Literature
Simulation of the Heat Diffusion with a Time-Varying Source on GPUs
Hála, Pavel ; Záň, Drahoslav (referee) ; Jaroš, Jiří (advisor)
This bachelor's thesis deals with the simulation of heat transfer inside human tissue heated by an external time-varying heat source. The implemented simulation is based on a finite-difference time-domain method that is 4th order in space and 1st order in time. First, a multithreaded CPU version was implemented. Subsequently, several GPU-accelerated versions were implemented, taking the architecture of the GPU into account. The experimental results showed that the fastest GPU kernel was the naive one, which uses only the GPU global memory. Next, the usefulness of the Gauss-Seidel method was investigated. The CPU implementation of the method was judged usable: it was only 13% slower while saving up to 50% of memory resources. The GPU implementation, however, was twice as slow as the naive version, mainly due to shared memory size limits. Peak performance reached 32 GFLOPS on the CPU and 135 GFLOPS on the GPU, which corresponds to 10% and 9% of the theoretical peak of the respective architectures.
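As a rough illustration of the kind of scheme the abstract describes, the sketch below advances a 1D heat equation with a 4th-order central difference in space and a forward-Euler (1st-order) step in time. It is not the thesis code: the 1D domain, the fixed boundary cells, the dimensionless parameters, the sin² source shape, and the names `laplacian_4th`, `step` and `simulate` are all assumptions chosen for illustration.

```python
import math


def laplacian_4th(u, i, dx):
    """4th-order central difference approximation of d2u/dx2 at index i."""
    return (-u[i - 2] + 16.0 * u[i - 1] - 30.0 * u[i]
            + 16.0 * u[i + 1] - u[i + 2]) / (12.0 * dx * dx)


def step(u, alpha, dx, dt, source):
    """One forward-Euler step (1st order in time); returns a new list.

    Two buffers are used (u is read, new is written). The two cells at
    each end are kept fixed because the 5-point stencil needs them.
    """
    new = list(u)
    for i in range(2, len(u) - 2):
        new[i] = u[i] + dt * (alpha * laplacian_4th(u, i, dx) + source[i])
    return new


def simulate(n=41, steps=100, alpha=1.0, dx=1.0, dt=0.1,
             t0=37.0, amplitude=1.0, omega=0.5):
    """Run the scheme with a hypothetical time-varying point source.

    All parameters are dimensionless illustration values. For this
    stencil in 1D, forward Euler is stable for dt <= 3*dx**2/(8*alpha).
    """
    u = [t0] * n
    centre = n // 2
    for k in range(steps):
        t = k * dt
        source = [0.0] * n
        # Assumed source shape: non-negative, oscillating in time.
        source[centre] = amplitude * math.sin(omega * t) ** 2
        u = step(u, alpha, dx, dt, source)
    return u


if __name__ == "__main__":
    result = simulate()
    print("centre temperature:", result[len(result) // 2])
```

The inner loop of `step` is the part a GPU kernel would parallelize, one grid point per thread. Reading every neighbour directly from the input array corresponds loosely to the global-memory-only variant mentioned in the abstract.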
