National Repository of Grey Literature: 23 records found, showing records 11 - 20
Parallel Processing of Huge Astronomical Data
Haas, František ; Zavoral, Filip (advisor) ; Kruliš, Martin (referee)
This master's thesis focuses on the analysis and implementation of the Random Forests algorithm, a machine learning algorithm for data classification. The goal of the thesis is to implement Random Forests using parallel programming techniques and technologies for CPU and GPGPU, together with a reference serial implementation for CPU. The functional and performance attributes of these implementations will be compared and evaluated. Various data sets will be used for the comparison, with an emphasis on real-world data obtained from astronomical observations of stellar spectra. The usefulness of these implementations for stellar spectra classification will be assessed from both the functional and the performance point of view.
Artificial Intelligence Approach to Credit Risk
Říha, Jan ; Baruník, Jozef (advisor) ; Vošvrda, Miloslav (referee)
This thesis focuses on the application of artificial intelligence techniques in credit risk management. These modern tools are compared with the current industry standard, Logistic Regression. We introduce the theory underlying Neural Networks, Support Vector Machines, Random Forests and Logistic Regression. In addition, we present a methodology for the statistical and business evaluation and comparison of these models. We find that models based on the Neural Networks approach (specifically the Multi-Layer Perceptron and the Radial Basis Function Network) outperform Logistic Regression in both the standard statistical metrics and the business metrics. The performance of Random Forests and Support Vector Machines is not satisfactory, and these models do not prove superior to Logistic Regression in our application.
Modern regression methods in data mining
Kopal, Vojtěch ; Holeňa, Martin (advisor) ; Gemrot, Jakub (referee)
The thesis compares several non-linear regression methods on synthetic data sets generated using standard benchmarks for continuous black-box optimization. For the comparison, we have chosen the following regression methods: radial basis function networks, Gaussian processes, support vector regression and random forests. We have also included polynomial regression, which we use to explain the basic principles of regression. The comparison of these methods is discussed in the context of black-box optimization problems, where the selected methods can be applied as surrogate models. The methods are evaluated based on their mean-squared error and on Kendall's rank correlation coefficient between the ordering of function values according to the model and according to the function used to generate the data.
Construction of classifiers suitable for segmentation of clients
Hricová, Jana ; Antoch, Jaromír (advisor) ; Zvára, Karel (referee)
Title: Construction of classifiers suitable for segmentation of clients Author: Bc. Jana Hricová Department: Department of Probability and Mathematical Statistics Supervisor: prof. RNDr. Jaromír Antoch, CSc., Department of Probability and Mathematical Statistics Abstract: The master's thesis discusses classification, a group of methods that form part of data analysis. The thesis presents classification methods used to construct tree-like classifiers suitable for customer segmentation. The core methodology discussed is CART (Classification and Regression Trees), followed by ensemble methodologies that use historical data to construct classification and regression forests, namely Bagging, Boosting, Arcing and Random Forest. The methods described were applied to real data from the field of customer segmentation and also to simulated data, both processed with RStudio software. Keywords: classification, tree-like classifiers, random forests
Building credit scoring models using selected statistical methods in R
Jánoš, Andrej ; Bašta, Milan (advisor) ; Pecáková, Iva (referee)
Credit scoring is an important and rapidly developing discipline. The aim of this thesis is to describe the basic methods used for building and interpreting credit scoring models, with an example of applying these methods to design such models using the statistical software R. The thesis is organized into five chapters. Chapter one explains the term credit scoring, gives the main examples of its application and presents the motivation for studying this topic. The next chapters introduce the three methods most often used in financial practice for building credit scoring models. Chapter two discusses the most developed one, logistic regression. The main emphasis is put on the logistic regression model, which is characterized from a mathematical point of view, and various ways of assessing the quality of the model are presented. The other two methods, decision trees and Random Forests, are covered in chapters three and four. An important part of the thesis is a detailed application of the described models to a specific data set, Default, using R. The fifth and final chapter is a practical demonstration of building credit scoring models, their diagnostics and the subsequent evaluation of their applicability in practice using R. The appendices include the R code used throughout the thesis and the functions developed for testing the final model. The key aim of the work is to provide enough theoretical knowledge and practical skill for a reader to fully understand the models discussed and to be able to apply them in practice.
SSH Attacks Detection on NetFlow Layer
Marek, Marcel ; Barabas, Maroš (referee) ; Michlovský, Zbyněk (advisor)
This bachelor's thesis briefly describes the basic principles of the SSH protocol, its architecture and the encryption it uses. The thesis focuses mainly on mining information from low-level network communication and using the results for attack detection. It also describes dictionary attacks against the SSH service and, using NetFlow, shows further possibilities for increasing network security.
Classification Methods for Microarrays Data
Hudec, Vladimír ; Bartík, Vladimír (referee) ; Burgetová, Ivana (advisor)
This paper discusses data obtained from gene chips and methods for their analysis. It analyzes several of these methods, focusing on the Random Forests method, and presents the data set used for the experiments. The methods are implemented in the R language environment, then tested, and the results are presented and compared. The results obtained with Random Forests are also compared with other experiments on the same data set.
A Brief Comparison of Two Weighting Strategies for Random Forests (Stručné porovnání dvou strategií vážení pro Random Forests)
Kotrč, Emil
This paper is concerned with a theoretical comparison of two different modifications of the Random Forests method based on the weighting of leaves.
