National Repository of Grey Literature: 23 records found (showing records 4 - 13)
Statistical models for prediction of project duration
Oberta, Dušan ; Žák, Libor (referee) ; Hübnerová, Zuzana (advisor)
The aim of this bachelor's thesis is to derive statistical models suitable for data analysis and to apply them to real data on the time demands of projects as a function of project characteristics. The opening chapter studies linear regression models based on the least squares method, including their properties and prediction intervals. The next chapter deals with generalized linear models based on the maximum likelihood method, their properties, and the construction of asymptotic confidence intervals for mean values. A further chapter deals with regression trees, where the least squares method and the maximum likelihood method are presented again. The basic principles of pruning regression trees and the derivation of confidence intervals for mean values are shown. The maximum likelihood method for regression trees and the derivation of the confidence intervals are largely the author's own derivation. The last model studied is random forests, including their basic properties and confidence intervals for mean values. These chapters also present methods for assessing model quality, selecting an optimal submodel, and, where applicable, determining optimal values of various parameters. Finally, the models and algorithms are implemented in Python and applied to real data.
Modern regression methods in data mining
Kopal, Vojtěch ; Holeňa, Martin (advisor) ; Gemrot, Jakub (referee)
The thesis compares several non-linear regression methods on synthetic data sets generated using standard benchmarks for a continuous black-box optimization. For that comparison, we have chosen the following regression methods: radial basis function networks, Gaussian processes, support vector regression and random forests. We have also included polynomial regression which we use to explain the basic principles of regression. The comparison of these methods is discussed in the context of black-box optimization problems where the selected methods can be applied as surrogate models. The methods are evaluated based on their mean-squared error and on the Kendall's rank correlation coefficient between the ordering of function values according to the model and according to the function used to generate the data.
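The two evaluation criteria named in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function names and the sample values are hypothetical, and the rank correlation is computed in its simplest form (tau-a, where tied pairs count as neither concordant nor discordant).

```python
from itertools import combinations


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        sign = (a[i] - a[j]) * (b[i] - b[j])
        if sign > 0:
            concordant += 1      # both sequences order the pair the same way
        elif sign < 0:
            discordant += 1      # the sequences disagree on the pair's order
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical data: values of the generating function and of a surrogate model.
true_values = [1.0, 2.0, 3.0, 4.0]
surrogate_values = [1.5, 1.0, 3.5, 5.0]

mse = mean_squared_error(true_values, surrogate_values)
tau = kendall_tau(true_values, surrogate_values)
print(f"MSE = {mse:.3f}, Kendall tau = {tau:.3f}")
```

A surrogate model can have a noticeable mean-squared error and still rank candidate points almost correctly, which is why an ordering-based measure is a natural companion to the error measure in an optimization setting.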
Data Analysis and Clasification from the Brain Activity Detector
Jileček, Jan ; Černocký, Jan (referee) ; Szőke, Igor (advisor)
This thesis aims to implement methods for recording EEG data obtained with the neural activity sensor OpenBCI Ultracortex IV headset. It also describes neurofeedback and methods of obtaining data from the motor cortex for further analysis, and takes a look at the machine learning algorithms best suited for the presented problem. Multiple training and testing datasets are created, as well as a tool for recording the brain activity of a headset-wearing test subject who is visually presented with cognitive challenges on a screen in front of them. A neurofeedback demo app has been developed, presented and later used for calibration of new test subjects. The next part is the data analysis, which aims to discriminate the left and right hand movement intention signatures in the brain motor cortex. Multiple classification methods are used and their utility is reviewed.
Pixel-Wise Segmentation Of The Blood Vessels Using Random Forests
Hesko, Branislav
This paper presents segmentation of the blood vessels in retinal images. First, a series of feature detectors is applied in the form of multiple filters. Then, each pixel is classified using a random forest that was trained on labeled images. Promising results have been achieved so far.
Neural networks and tree-based credit scoring models
Turlík, Tomáš ; Krištoufek, Ladislav (advisor) ; Fanta, Nicolas (referee)
The most basic task in credit scoring is to classify potential borrowers as "good" or "bad" based on the probability that they would default if they were accepted. In this thesis we compare widely used logistic regression, neural networks and tree-based ensemble models. During the construction of neural network models we utilize recent techniques and advances in the field of deep learning, while for the tree-based models we use popular bagging, boosting and random forests ensembling algorithms. Performance of the models is measured by the ROC AUC metric, which should provide better information value than average accuracy alone. Our results suggest small or even no difference between models: in the best-case scenario, neural networks, boosted ensembles and stacked ensembles result in only approximately 1%−2% larger ROC AUC value than logistic regression. Keywords: credit scoring, neural networks, decision tree, bagging, boosting, random forest, ensemble, ROC curve
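The ROC AUC metric mentioned in this abstract can be sketched as follows. The example is illustrative only and does not come from the thesis: the function name, the labels (1 for a defaulting borrower, 0 otherwise) and the scores are hypothetical. It uses the pairwise interpretation of AUC, the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = borrower defaulted, score = predicted default probability.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.3f}")
```

Because the value depends only on how the scores order the borrowers, it does not require choosing a classification threshold, unlike accuracy.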
Comparison of statistical methods for the scoring models development
Mrázková, Adéla ; Vitali, Sebastiano (advisor) ; Kopa, Miloš (referee)
The aim of this thesis is to introduce and summarize the process of scoring model development in general and then the basic statistical approaches used to solve this problem, in particular logistic regression, neural networks and decision trees (random forests). An application of the described methods to a real dataset provided by PROFI CREDIT Czech, a.s. follows, including a discussion of some implementation issues and their resolution. The results obtained are discussed and compared.
Valuation of real estates using statistical methods
Funiok, Ondřej ; Pecáková, Iva (advisor) ; Řezanková, Hana (referee)
The thesis deals with the valuation of real estate in the Czech Republic using statistical methods. The work focuses on a complex task based on data from an advertising web portal. The aim of the thesis is to create a prototype of a statistical prediction model for the valuation of residential properties in Prague and to further evaluate the possibilities of its wider use. The structure of the work follows the CRISP-DM methodology. Regression trees and random forests are tested on the pre-processed data and used to predict the price of real estate.
Parallel Processing of Huge Astronomical Data
Haas, František ; Zavoral, Filip (advisor) ; Kruliš, Martin (referee)
This master's thesis focuses on the analysis and implementation of the Random Forests algorithm. Random Forests is a machine learning algorithm targeting data classification. The goal of the thesis is an implementation of the Random Forests algorithm using techniques and technologies of parallel programming for CPU and GPGPU, together with a reference serial implementation for CPU. A comparison and evaluation of the functional and performance attributes of these implementations will be performed. Various data sets will be used for the comparison, with an emphasis on real-world data obtained from astronomical observations of stellar spectra. The usefulness of these implementations for stellar spectra classification will be evaluated from the functional and performance points of view.
Artificial Intelligence Approach to Credit Risk
Říha, Jan ; Baruník, Jozef (advisor) ; Vošvrda, Miloslav (referee)
This thesis focuses on the application of artificial intelligence techniques in credit risk management. Moreover, these modern tools are compared with the current industry standard, Logistic Regression. We introduce the theory underlying Neural Networks, Support Vector Machines, Random Forests and Logistic Regression. In addition, we present a methodology for statistical and business evaluation and comparison of the aforementioned models. We find that models based on the Neural Network approach (specifically Multi-Layer Perceptron and Radial Basis Function Network) outperform Logistic Regression in the standard statistical metrics and in the business metrics as well. The performance of Random Forests and Support Vector Machines is not satisfactory, and these models do not prove to be superior to Logistic Regression in our application.
