
Gini coefficient maximization in binary logistic regression
Říha, Samuel ; Hanzák, Tomáš (advisor) ; Hlávka, Zdeněk (referee)
This bachelor thesis describes the binary logistic regression model. Parameter estimation for the model is derived using the concept of a loss function. A "rich" set of "proper" loss functions, the beta family of Fisher-consistent loss functions, is defined. In the second part of the thesis, four basic goodness-of-fit criteria are defined: the Gini coefficient, the C-statistic, the Kolmogorov-Smirnov statistic and the coefficient of determination R². The possibility of estimating the parameters by maximizing the Gini coefficient is then analysed. Several algorithms are designed for this purpose and compared with existing methods on one simulated data set and three real ones.


Regression trees
Masaila, Aleh ; Hanzák, Tomáš (advisor) ; Zvára, Karel (referee)
Title: Regression trees Author: Aleh Masaila Department: Department of Probability and Mathematical Statistics Supervisor: Mgr. Tomáš Hanzák Abstract: Although regression and classification trees have been used for data analysis for several decades, they are still in the shadow of more traditional methods such as linear or logistic regression. This paper aims to describe several of the best-known regression tree methods and to introduce a new direction in this area: a combination of regression trees and committee methods, the so-called regression forests. In the practical part of the work we test the properties, strengths and weaknesses of the examined methods on real data sets. Keywords: regression tree, CART, MARS, regression forest, bagging, boosting, random forest


Regression goodness-of-fit criteria according to dependent variable type
Šimsa, Filip ; Hanzák, Tomáš (advisor) ; Hlubinka, Daniel (referee)
This work describes linear, logistic, ordinal and multinomial regression models and the interpretation of their parameters. It then introduces a variety of quality indicators for mathematical models and the relations between them, focusing mainly on the Gini coefficient and the coefficient of determination R². The former is established by modifying the Lorenz curve for ordinal and continuous variables and by comparing the estimated probabilities for a nominal variable. The coefficient of determination R² is newly defined for a nominal variable, and its relationship with the Gini coefficient is examined. Assuming normally distributed scores and model errors, the relation between the Gini coefficient and the coefficient of determination is derived numerically for different distributions of the continuous dependent variable. The theoretical calculations and definitions are illustrated on two real data sets.


Decomposition methods for time series with irregular observations
Hanzák, Tomáš ; Cipra, Tomáš (advisor) ; Prášková, Zuzana (referee)
This work deals with extensions of classical exponential smoothing methods to univariate time series with irregular observations. Previously developed extensions of simple exponential smoothing, the Holt method, the Holt-Winters method and double exponential smoothing are presented. An alternative to Wright's modification of simple exponential smoothing for irregular data, based on the corresponding ARIMA process, is suggested. Exponential smoothing of order m for irregular data is derived as a generalization of simple and double exponential smoothing. A similar method using DLS (discounted least squares) estimation of a polynomial trend of order m is derived as well. In all cases the recursive character of these methods is preserved, which makes them easy to implement and computationally highly efficient. A program implementing most of the methods presented here is a part of the work. Some numerical examples of their application are also included.


Capital Requirement for Operational Risk Modeling
Poláchová, Kateřina ; Orsáková, Martina (advisor) ; Hanzák, Tomáš (referee)
Operational risk is an important concept for financial institutions. It needs to be managed, measured and minimized. A bank has to hold capital to cover potential losses from this risk. The aim of this work is to find, describe and apply a model that determines how much capital is needed. The work is dedicated to the Loss Distribution Approach, which models the severity and frequency of losses separately for each business line and operational risk event type. Using the Monte Carlo method, the total loss model is obtained by aggregating the specific distribution functions. The resulting capital requirement is the sum of the partial capital requirements for the individual business line/event type combinations, each of which is the 99.9% VaR of the total loss. Keywords: Operational Risk, Loss Distribution Approach, Extreme Value Theory, Monte Carlo Simulation, Value-at-Risk
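The aggregation step described in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the Poisson frequency, the lognormal severity, the two cells and all parameter values are hypothetical assumptions chosen only to show the mechanics of simulating the yearly total loss per business line/event type cell and summing the 99.9% quantiles.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_annual_losses(rng, lam, mu, sigma, n_years):
    """Simulate the total yearly loss of one business line / event type cell.

    Assumption: Poisson(lam) loss count, lognormal(mu, sigma) loss sizes.
    """
    totals = []
    for _ in range(n_years):
        n_events = sample_poisson(rng, lam)
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return totals

def empirical_var(losses, level=0.999):
    """Empirical quantile: smallest loss with at least `level` of the sample at or below it."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, math.ceil(level * len(ordered)) - 1)
    return ordered[index]

# Hypothetical cells: (Poisson rate, lognormal mu, lognormal sigma).
cells = {
    "retail banking / external fraud": (5.0, 8.0, 1.2),
    "trading and sales / execution errors": (2.0, 9.0, 1.5),
}

rng = random.Random(42)  # fixed seed so the sketch is reproducible
cell_var = {
    name: empirical_var(simulate_annual_losses(rng, lam, mu, sigma, n_years=20000))
    for name, (lam, mu, sigma) in cells.items()
}
# Capital requirement as the sum of the per-cell 99.9% VaR figures.
capital_requirement = sum(cell_var.values())
```

Summing the per-cell quantiles follows the rule stated in the abstract. The number of simulated years is an arbitrary choice, and the 99.9% quantile is estimated from only a few tail observations, so in practice a larger sample would be needed for a stable figure.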


Step by step credit risk model construction
Rychnovský, Michal ; Charamza, Pavel (advisor) ; Hanzák, Tomáš (referee)
Title: Step by step credit risk model construction Author: Michal Rychnovský Department: Department of Probability and Mathematical Statistics Supervisor: RNDr. Pavel Charamza, CSc. Abstract: The aim of this work is to explain the principles of scoring model construction. We describe the method of logistic regression, the estimation of its parameters and the testing of their significance. Based on the odds ratio variables, we then introduce the independence model as an estimate of the conditional odds of a client's repayment. We then generalize this ... by adding weights to the individual groups and categories of client characteristics. In this way we arrive at the WOE model and the full logistic model. We also deal with measuring the diversification power of the models using the Lorenz curve and Somers' d statistic as an estimate of the Gini coefficient. Finally, we apply the described methods to the practical construction of scoring models and compare the suitability and diversification power of the presented models on real data. The work also includes an output on the internet encyclopedia Wikipedia. Keywords: credit risk, scoring models, logistic regression.


Estimation and goodness-of-fit criteria in logistic regression model
Ondrušková, Markéta ; Hanzák, Tomáš (advisor) ; Zvára, Karel (referee)
In this bachelor thesis we describe the binary logistic regression model and the estimation of its parameters by the maximum likelihood method. We then propose an algorithm for the least squares method. In the part on goodness-of-fit criteria we define the Lorenz curve, the Gini coefficient, the C-statistic, the Kolmogorov-Smirnov statistic and the coefficient of determination R². We derive their relation to different sample correlation coefficients. We also derive the typical relation between the Gini coefficient, the Kolmogorov-Smirnov statistic and, newly, the coefficient of determination R² via a model of normally distributed scores of bad and good clients. These theoretical results are verified on three real data sets. Keywords: binary logistic regression, maximum likelihood, ordinary least squares, Gini coefficient, coefficient of determination.
