keywords:"MPI" - Search Results - Digital Repository


	Parallelization of Ultrasound Simulations Using 2D Decomposition Nikl, Vojtěch ; Dvořák, Václav (referee) ; Jaroš, Jiří (advisor) This thesis is part of the k-Wave project, a toolbox for the simulation and reconstruction of acoustic wave fields; one of its main contributions is the planning of focused ultrasound surgeries (HIFU). A single simulation can take tens of hours, and about 60% of the simulation time is spent computing 3D fast Fourier transforms. Until now, the 3D FFT has been computed purely by the FFTW library with its 1D decomposition, whose major limitation is the maximum number of cores that can be employed. We therefore introduce a new approach, the 2D hybrid decomposition of the 3D FFT (HybridFFT), which combines MPI processes and OpenMP threads to achieve the best possible performance. On a low number of cores, on the order of a few hundred, it is about as fast as or slightly faster than FFTW and the pure-MPI 2D decomposition libraries (PFFT and P3DFFT). One of the best results was achieved on a 512^3 FFT using 512 cores, where our hybrid version ran in 31 ms, FFTW in 39 ms, and PFFT in 44 ms. The most significant performance advantage should appear when employing around 8 to 16 thousand cores; however, we did not have access to a machine with such resources. Almost linear scalability has been demonstrated for up to 2048 cores. Detailed record
	Techniques for parallel computing Vodák, René ; Hasmanda, Martin (referee) ; Lattenberg, Ivo (advisor) This thesis deals with techniques for parallel computation. It analyzes the most important parallelization libraries, including libraries for parallelization on GPUs, and compares their computing speed in Visual Studio 2010 using a simple prime-searching application on three different hardware configurations. With the OpenCL library, which achieved the best result, two applications were created: an improved program for finding prime numbers using the sieve of Eratosthenes and a program for computing an integral with the trapezoidal rule. Detailed record
	Acceleration of Ultrasound Neurostimulation Using Multi-GPU Systems Bayer, David ; Kadlubiak, Kristián (referee) ; Jaroš, Jiří (advisor) This thesis focuses on extending the k-Wave toolbox's accelerated implementation of acoustic wave propagation in a medium so that the computation can use multiple GPUs. It first describes multi-GPU systems in general and the tools that can be used to work with them. It continues with a description of the k-Wave toolbox and an analysis of existing accelerated implementations. Selected technologies are then tested on a simulation of heat diffusion in a medium, and the results are used to select tools for the design of the resulting implementation. Finally, it summarizes the results obtained. Detailed record
	Optimization of magnetic nanoparticles for hyperthermia in viscous environments Sojková, Tereza ; Fabián, Martin (referee) ; Hovorka, Ondrej (referee) ; Gröger, Roman (advisor) Single-domain superparamagnetic iron oxide nanoparticles play an important role in magnetic hyperthermia, a promising therapeutic method that can potentially treat any type of tumor. It is generally known that cancer cells are more sensitive to elevated temperature than healthy cells, and hyperthermia-based cancer treatment relies on this fact. Applying an alternating magnetic field with frequencies of hundreds of kHz causes energy to dissipate from the nanoparticles (10-50 nm) into the surrounding tissue. The key parameter determining the efficiency of the nanoparticles is the specific absorption rate (SAR), which is a complex function of the shape, size, and coating of the particles. In addition, the duration of exposure to the AC field is limited by the tendency of nanoparticles to aggregate when used in vivo. The aim of this thesis is to develop a synthesis protocol for preparing uniformly sized iron oxide nanoparticles that exhibit high SAR values and good colloidal stability. The nanoparticles were prepared by two types of chemical synthesis, precipitation and thermal decomposition, and the influence of reaction conditions on their size, shape, and magnetic properties was carefully examined. Thermal decomposition proved to be the more suitable route for preparing uniformly sized iron oxide nanoparticles, and core-shell nanocubes in particular were studied in more detail. Their size, degree of polydispersity, colloidal stability, and morphology were studied by dynamic light scattering combined with transmission and scanning electron microscopy. The phase composition of the nanoparticles was characterized by powder X-ray diffraction, Mössbauer spectroscopy, and electron energy loss spectroscopy. X-ray diffraction was also used to study phase transformations in the core-shell nanoparticles. Their magnetic properties were investigated using vibrating-sample magnetometry and electron holography. For the core-shell nanoparticles, the application potential for magnetic hyperthermia, for MPI imaging, and for use as a contrast agent in magnetic resonance imaging was also assessed. This thesis extends knowledge of the size dependence of iron oxide nanoparticles for biomedical applications. The results for 20 nm nanocubes after complete phase transformation show very good heating capability for magnetic hyperthermia and a three times higher MPI signal compared with the commercially used tracer Vivotrax™. Detailed record
	Zhroucený stát Somálsko - Analýza vývoje Somálska po pádu režimu Siyaada Barre [The Failed State of Somalia: An Analysis of Somalia's Development after the Fall of the Siyaad Barre Regime] Štěpánek, Karel This bachelor's thesis analyzes the development of one of the poorest countries in the world. The study examines the political failure of one individual, General Barre, as the main cause of the state's collapse, but above all it analyzes and compares central Somalia with Puntland and Somaliland. The aim of the work is to analyze the three main regions of the country using the MPI, the Well-being Index, and similar indices. The work also defines the concept of failed states and the theoretical approaches to it, describes the historical roots of the current conflict, and tries to suggest possible solutions to the current situation. Detailed record
	Non-Blocking Input/Output for the k-Wave Toolbox Kondula, Václav ; Vaverka, Filip (referee) ; Jaroš, Jiří (advisor) This thesis deals with the implementation of a non-blocking I/O interface for the k-Wave project, which is designed for time-domain simulation of ultrasound propagation. The main focus is on large-domain simulations that, because of their high computing power requirements, must run on supercomputers and produce tens of GB of data in a single simulation step. In this thesis, I designed and implemented a non-blocking interface that stores data using dedicated threads, which allows simulation calculations to overlap with disk operations and thus speeds up the simulation. A speedup of up to 33% was achieved compared with the current k-Wave implementation, which, among other things, also reduces the cost of the simulation. Detailed record
	Parallel genetic algorithm Trupl, Jan ; Kobliha, Miloš (referee) ; Jaroš, Jiří (advisor) The thesis describes the design and implementation of various evolutionary algorithms, enhanced to exploit parallelism on multiprocessor systems and to run computations on different machines in a computer network. The purpose of these algorithms is to find the global extremum of a function of $n$ variables. The thesis demonstrates various optimization problems and their effective solution using evolutionary algorithms. The MPI (Message Passing Interface) and OpenMP interface libraries are also described, to the extent needed to understand the issues of parallel evolutionary algorithms. Detailed record
	Efficient Communication in Multi-GPU Systems Špeťko, Matej ; Jaroš, Jiří (referee) ; Vaverka, Filip (advisor) After Nvidia introduced CUDA, GPUs became devices capable of accelerating general-purpose computation. GPUs are designed as parallel processors that possess huge computational power, and modern supercomputers are often equipped with GPU accelerators. Sometimes the performance or memory capacity of a single GPU is not enough for a scientific application, and the application needs to be scaled across multiple GPUs. During the computation, the GPUs need to exchange partial results, and this communication represents overhead. It is therefore important to research methods for efficient communication between GPUs, which means less CPU involvement, lower latency, and shared system buffers. Both inter-node and intra-node communication are examined. The main focus is on Nvidia's GPUDirect technologies and CUDA-Aware MPI. Subsequently, the k-Wave toolbox for simulating the propagation of acoustic waves is introduced, and this application is accelerated using CUDA-Aware MPI. Detailed record
	Influence of Network Infrastructure on Distributed Password Cracking Eisner, Michal ; Zobal, Lukáš (referee) ; Hranický, Radek (advisor) Password cracking is a process used to obtain the key that gives access to encrypted data. It normally works by repeatedly generating candidate passwords and verifying them by computing cryptographic algorithms. The complexity of the algorithms affects the time spent on these calculations. Despite various acceleration methods, it is often necessary to distribute the problem among several nodes interconnected via a local network or the internet. The aim of this thesis is to analyze the influence of network infrastructure on speed, scalability, and utilization during different attacks on cryptographic hashes. For this purpose, an automated experimental environment was created, consisting of distinct topologies, scripts, and sets of test tasks. The analysis, carried out with the Fitcrack and Hashtopolis tools, made it possible to observe this influence. Detailed record
