keywords:"superpočítač" - Search Results - Digital Repository

	Parallelization of Ultrasound Simulations Using 2D Decomposition
	Nikl, Vojtěch ; Dvořák, Václav (referee) ; Jaroš, Jiří (advisor)
	This thesis is part of the k-Wave project, a toolbox for the simulation and reconstruction of acoustic wave fields; one of its main contributions is the planning of focused ultrasound surgeries (HIFU). One simulation can take tens of hours, and about 60% of the simulation time is spent computing 3D Fast Fourier transforms. Until now, the 3D FFT has been computed purely by the FFTW library and its 1D decomposition, whose major limitation is the maximum number of employable cores. We therefore introduce a new approach, the 2D hybrid decomposition of the 3D FFT (HybridFFT), which combines MPI processes and OpenMP threads to reach the best possible performance. On a low number of cores, on the order of a few hundred, we are about as fast as or slightly faster than FFTW and the pure-MPI 2D decomposition libraries (PFFT and P3DFFT). One of the best results was achieved on a 512^3 FFT using 512 cores, where our hybrid version ran in 31 ms, FFTW in 39 ms, and PFFT in 44 ms. The most significant performance advantage should appear when employing around 8-16 thousand cores; however, we have not had access to a machine with such resources. Almost linear scalability has been demonstrated for up to 2048 cores.

	Estimation of Algorithm Execution Time Using Machine Learning
	Buchta, Martin ; Chlebík, Jakub (referee) ; Jaroš, Jiří (advisor)
	This work aims to predict the execution time of k-Wave ultrasound simulations on supercomputers based on a given domain size. The program uses MPI and can be run on multiple nodes. Prediction models were developed using symbolic regression and neural networks, both of which were trained on captured data and compared against each other. The results demonstrate that the models outperform existing solutions. Specifically, the symbolic regression model achieved an average error of 5.64% for suitable tasks, while the neural network model achieved an average error of 8.25% on unseen domain sizes and across all tasks, including those not optimized for k-Wave simulations. This work contributes a new, more accurate model for predicting execution time and compares the effectiveness of neural networks and symbolic regression for this specific type of regression problem. Overall, these findings suggest that the new models will have important practical applications in the field of k-Wave ultrasound simulations.

	Optimization of Run Configurations of k-Wave Jobs
	Sasák, Tomáš ; Jaroš, Marta (referee) ; Jaroš, Jiří (advisor)
	This thesis focuses on scheduling, i.e. the correct approximation of configurations used to run k-Wave simulations on supercomputers of the IT4Innovations infrastructure, in particular the Salomon and Anselm clusters. A single job consists of a set of many simulations, each executed by a code from the k-Wave toolbox. To run a simulation, it is necessary to select a suitable configuration, meaning the amount of supercomputer resources (number of nodes, i.e. cores) and the duration of the rental. Creating an ideal configuration is complicated, and even harder for an inexperienced user. The approximation is based on empirical data obtained from multiple executions of different sets of simulations on the given clusters. This data is stored and used by a set of approximators, which perform the actual approximation by interpolation and regression. The text describes the implementation of the final scheduler. Experiments showed that the most efficient methods for this problem are the Akima spline, PCHIP interpolation, and the cubic spline. The main contribution of this work is a tool that can find a suitable configuration for a k-Wave simulation without requiring knowledge of the code or extensive experience with its usage.

	Analysis of Operational Data and Detection of Anomalies during Supercomputer Job Execution
	Stehlík, Petr ; Nikl, Vojtěch (referee) ; Jaroš, Jiří (advisor)
	In recent years, supercomputers have become ever larger and more complex, which makes it harder to exploit the full potential of the system. The problem is aggravated by the lack of monitoring tools tailored specifically to the users of these systems. The goal of this thesis is to create a tool, called Examon Web, for analyzing and visualizing supercomputer operational data, and to perform in-depth analysis of these data using neural networks. The networks determine whether a given job ran correctly or showed signs of suspicious and undesirable behavior, such as unaligned memory access or low utilization of allocated resources. The user is informed of these findings through the GUI. Examon Web is built on the Examon framework, which collects and processes metric data from the supercomputer and stores them in a KairosDB database. The implementation spans disciplines from GUI design and implementation, through data analysis, data mining, and neural networks, to the implementation of the server-side interface. Examon Web is aimed primarily at users but can also be used by administrators. The GUI is built with the Angular framework and the Dygraphs and Bootstrap libraries. This lets users analyze time series of various metrics of their jobs and, like administrators, check the current state of the supercomputer. This state is shown either as several globally aggregated metrics over the last 30 minutes or as a 3D (or 2D) model of the supercomputer, which receives data from the nodes themselves via the MQTT protocol. Continuous data retrieval uses the WebSocket interface with a custom mechanism for subscribing to and unsubscribing from the specific metrics displayed in the model. When analyzing a job that has run, the user has three different views of it. The first gives an overall summary of the job, reporting the resources used, the run time, and the load on the part of the supercomputer the job occupied, together with the neural networks' assessment of whether the job is suspicious. The other two views show metrics from the performance and energy perspectives. Training the neural networks required creating a new dataset from the Galileo supercomputer. The dataset contains over 1100 jobs monitored on this supercomputer, of which 500 were manually annotated and then used to train the networks. The neural networks use the back-propagation model, which is suitable for annotating fixed-length time series. In total, 12 networks were created for metrics including the utilization of the processor, memory, and other components, as well as, for example, the share of total processor time spent in the C6 power-saving state. These networks are independent of one another, and in experiments their final 80-20-4-3-1 configuration (from 80 input neurons down to 1 output neuron) gave the best results. A final network (in a 12-4-3-1 configuration) annotated the outputs of the preceding networks. The overall accuracy of the system when classifying into 2 classes is 84%, which is very good for the model used. This work has two outputs. The first is the Examon Web user interface and its server side, which, as an extension layer of the Examon system, will help spread the system to more users or to other supercomputing centers. The second is a partially annotated dataset, which can help others in their research and is the result of a collaboration between VUT, UNIBO, and CINECA. Both outputs will be released as open source. Examon Web was presented at the 1st Users' Conference in Ostrava, organized by IT4Innovations. Future work may include further annotation of the dataset and extending Examon Web with decision trees that identify the exact cause of a job's misbehavior.

	Implementation of 2D Ultrasound Simulations
	Šimek, Dominik ; Vaverka, Filip (referee) ; Jaroš, Jiří (advisor)
	This work deals with the design and implementation of a 2D ultrasound simulation. Ultrasound simulation has applications in medicine, biophysics, and image reconstruction. One example is High Intensity Focused Ultrasound, which is used for diagnosing and treating cancer. The program is part of the k-Wave toolbox and is designed for supercomputer systems, specifically for machines with a shared-memory architecture. It is implemented in C++ and accelerated with OpenMP. The designed solution makes it possible to solve large-scale simulations in 2D space. The work also deals with merging and unifying the 2D and 3D simulations using modern C++. A realistic use case is ultrasound simulation for transcranial neuromodulation and neurostimulation in large domains of more than 16384x16384 grid points. A simulation of this size may take several days with the original MATLAB 2D k-Wave. The new implementation achieves a speedup of up to 8 on the Anselm and Salomon supercomputers.

	System for Supercomputer Automation Operation
	Strečanský, Peter ; Hrbáček, Radek (referee) ; Jaroš, Jiří (advisor)
	The main goal of this thesis is to extend the existing FabSim software with a module that allows automated supercomputer operation, especially with the OpenPBS scheduler. The module was implemented in Python, using the Fabric module as its backbone. The scripts executed by OpenPBS are stored as templates, which are dynamically modified to suit users' needs. This solution provides a comprehensive set of methods that allow full-featured operation of supercomputers, integration with git, and data management on clusters. The module saves time and makes working with supercomputers much easier.

	Development and Programming of Low Power Cluster
	Hradecký, Michal ; Nikl, Vojtěch (referee) ; Jaroš, Jiří (advisor)
	This thesis deals with the building and programming of a low-power cluster composed of Hardkernel Odroid XU4 kits based on ARM Cortex A15 and Cortex A7 chips. The goal was to design a simple cluster composed of multiple kits and run a set of benchmarks to analyze performance and power consumption. The test set consisted of the HPL and Stream benchmarks and various tests of the MPI interface. The overall performance of a cluster of four kits in the HPL benchmark was measured at 23 GFLOP/s in double precision. During this test, the cluster showed a power efficiency of about 0.58 GFLOP/W. The work also describes the installation of the PBS Torque scheduler and the EasyBuild HPC software build and installation framework on a 32-bit ARM platform. A comparison with the Anselm supercomputer showed that the Odroid cluster is as efficient as a large supercomputer, but at a slightly higher price.

	Optimization of the Distributed I/O Subsystem of the k-Wave Project
	Vysocký, Ondřej ; Klepárník, Petr (referee) ; Jaroš, Jiří (advisor)
	This thesis deals with an efficient solution for the parallel I/O of the k-Wave tool, which is designed for time-domain acoustic and ultrasound simulations. k-Wave is a supercomputer application: it runs on a Lustre file system, must be implemented with MPI, and stores its data in a suitable format (HDF5). I designed three optimization methods that fit k-Wave's needs; they use accumulation and redistribution techniques. Compared with the native write, every optimization method led to a better write speed, up to 13.6 GB/s. These methods can be used to optimize any distributed-data application that suffers from slow writes.

	Scalable machine learning using Hadoop and Mahout tools
	Kryške, Lukáš ; Atassi, Hicham (referee) ; Burget, Radim (advisor)
	This bachelor’s thesis compares several tools for building a scalable machine learning platform and describes their advantages and disadvantages. It also demonstrates in practice the functionality of such a platform based on the Apache Hadoop and Apache Mahout tools and measures the performance of the K-Means algorithm on a total of five computing nodes.
