National Repository of Grey Literature: 37 records found (records 4 - 13)
New Methods in Credit Underwriting (Nové metody ve schvalování úvěrů)
Rychnovský, Michal ; Arlt, Josef (advisor) ; Pecáková, Iva (referee) ; Veselý, Petr (referee)
This thesis contributes to the field of applied statistics and financial modeling by analyzing mathematical models used in retail credit underwriting processes. Specifically, it has three goals. First, the thesis aims to challenge the performance criteria used by established statistical approaches and proposes focusing on predictive power instead. Second, it compares the analytical leverage of the established and other suggested methods according to the newly proposed criteria. Third, the thesis seeks to develop and specify a new comprehensive profitability-based underwriting model and to reflect critically on its strengths and weaknesses. In the first chapter, I look into the area of probability of default modeling and argue for comparing the predictive power of the models over time rather than focusing only on a random testing sample, as is typically suggested in the scholarly literature. For this purpose, I use the concept of survival analysis, and the Cox model in particular, and apply it to a real Czech banking data sample alongside the commonly used logistic regression model, comparing the results using the Gini coefficient and lift characteristics. The Cox model performs comparably on the randomly chosen validation sample and clearly outperforms the logistic regression approach in predictive power. In the second chapter, in the area of loss given default modeling, I introduce two Cox-based models and compare their predictive power with the standard approaches using linear and logistic regression on a real data sample. Based on the modified coefficient of determination, the Cox model shows better predictions. The third chapter focuses on estimating the expected profit as an alternative to the risk estimation itself. Building on the probability of default and loss given default models, I construct a comprehensive profitability model for fixed-term retail loan underwriting. The model also incorporates various related risk-adjusted revenues and costs, allowing more precise results. Moreover, I propose four measures of profitability, including the risk-adjusted expected internal rate of return and return on equity, and simulate the impact of the model on each of the measures. Finally, I discuss some weaknesses of these approaches and solve the problem of finding default or fraud concentrations in the portfolio. For this purpose, I introduce a new statistical measure based on a pre-defined expert critical default rate and compare the GUHA method with the classification tree method on a real data sample. Drawing on the comparison of different methods, this work contributes to the debates about survival analysis models used in financial modeling and profitability models used in credit underwriting.
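The abstract compares models by their Gini coefficient and lift. As a rough illustration of what these two measures capture, here is a minimal Python sketch. It assumes the common convention Gini = 2 * AUC - 1 and defines lift as the default rate in the top-scored fraction divided by the overall default rate; the function names and these exact definitions are assumptions for illustration, not taken from the thesis.

```python
def gini_coefficient(scores, defaults):
    """Gini = 2 * AUC - 1, with AUC computed by pairwise comparison.

    scores: predicted risk, higher means riskier (assumed convention).
    defaults: 1 for a defaulted loan, 0 otherwise.
    """
    bad = [s for s, d in zip(scores, defaults) if d == 1]
    good = [s for s, d in zip(scores, defaults) if d == 0]
    if not bad or not good:
        raise ValueError("need at least one default and one non-default")
    concordant = 0.0
    for b in bad:
        for g in good:
            if b > g:
                concordant += 1.0   # defaulted loan ranked riskier: correct order
            elif b == g:
                concordant += 0.5   # ties count as half
    auc = concordant / (len(bad) * len(good))
    return 2.0 * auc - 1.0


def lift_at(scores, defaults, fraction=0.1):
    """Default rate among the top-scored fraction divided by the overall rate."""
    overall = sum(defaults) / len(defaults)
    if overall == 0:
        raise ValueError("lift is undefined without any defaults")
    n_top = max(1, int(round(len(scores) * fraction)))
    ranked = sorted(zip(scores, defaults), key=lambda p: p[0], reverse=True)
    top_rate = sum(d for _, d in ranked[:n_top]) / n_top
    return top_rate / overall
```

Evaluating the same two functions on samples from successive time periods, rather than on a single random hold-out sample, is one simple way to look at predictive power over time in the sense the abstract describes.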
Valuation of real estates using statistical methods
Funiok, Ondřej ; Pecáková, Iva (advisor) ; Řezanková, Hana (referee)
The thesis deals with the valuation of real estate in the Czech Republic using statistical methods. The work focuses on a complex task based on data from an advertising web portal. The aim of the thesis is to create a prototype of a statistical prediction model for the valuation of residential properties in Prague and to evaluate the possibilities for its wider use. The structure of the work follows the CRISP-DM methodology. Regression trees and random forests are tested on the pre-processed data and used to predict real estate prices.
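To give a sense of the building block behind regression trees (and, by averaging many such trees, random forests), the following Python sketch finds the single best split of one predictor by minimizing the sum of squared errors. It is a toy illustration under assumptions: the function name, the single-feature setting, and the made-up areas and prices are hypothetical and not taken from the thesis.

```python
def best_split(xs, ys):
    """Find the threshold on one predictor that minimizes the total SSE.

    Returns (threshold, left_mean, right_mean); each side of the split
    predicts the mean of its target values.
    """
    def mean(values):
        return sum(values) / len(values)

    def sse(values):
        m = mean(values)
        return sum((v - m) ** 2 for v in values)

    candidates = sorted(set(xs))
    if len(candidates) < 2:
        raise ValueError("need at least two distinct predictor values")

    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[0]:
            best = (total, threshold, mean(left), mean(right))
    return best[1], best[2], best[3]


# Made-up toy data: floor area in square metres and price in millions.
areas = [30, 40, 50, 90, 100, 110]
prices = [3.0, 3.2, 3.1, 8.0, 8.3, 7.9]
threshold, small_flat_price, large_flat_price = best_split(areas, prices)
```

A full regression tree applies this search recursively over all predictors, and a random forest averages the predictions of many trees grown on resampled data.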
Evaluation of Fuzzy Clustering Results (Hodnocení Výsledků Fuzzy Shlukování)
Říhová, Elena ; Pecáková, Iva (advisor) ; Řezanková, Hana (referee) ; Žambochová, Marta (referee)
Cluster analysis is a multivariate statistical classification method comprising a variety of methods and procedures. Clustering methods can be divided into hard and fuzzy; the latter provide a more precise picture of the information in the clustered objects than hard clustering. In practice, however, the optimal number of clusters is not known a priori and must therefore be determined. Validity indices help solve this problem, but there are many different validity indices to choose from. One of the goals of this work is to create a structured overview of existing validity indices and techniques for evaluating fuzzy clustering results in order to find the optimal number of clusters. The main aim was to propose a new index for evaluating fuzzy clustering results, especially in cases with a large number of clusters (defined as more than five). The newly designed coefficient is based on the degrees of membership and on the (Euclidean) distance between the objects, i.e. on principles from both fuzzy and hard clustering. The suitability of selected validity indices was tested on real and generated data sets for which the optimal number of clusters was known a priori. These data sets have different sizes, different numbers of variables, and different numbers of clusters. The aim of the work is regarded as fulfilled. A key contribution of this work is a new coefficient (E), which is appropriate for evaluating situations with both large and small numbers of clusters. Because the new validity index is based on the principles of both fuzzy clustering and hard clustering, it is able to correctly determine the optimal number of clusters on both small and large data sets. A second contribution of this research is a structured overview of existing validity indices and techniques for evaluating fuzzy clustering results.
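The abstract does not give the formula of the new coefficient E, so it is not reproduced here. As an illustration of the two ingredients it mentions (membership degrees and Euclidean distances), the Python sketch below computes two well-known existing indices instead: Bezdek's partition coefficient, which uses memberships only, and the Xie-Beni index, which combines memberships with distances. The function names and the data layout (one membership row per object) are assumptions.

```python
from itertools import combinations
from math import dist


def partition_coefficient(memberships):
    """Bezdek's partition coefficient: mean of squared membership degrees.

    memberships[i][k] is the degree to which object i belongs to cluster k.
    Ranges from 1/c (fully fuzzy) to 1 (crisp); higher is better.
    """
    n = len(memberships)
    return sum(u ** 2 for row in memberships for u in row) / n


def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni index: fuzzy within-cluster compactness over separation.

    Lower values indicate compact, well-separated clusters.
    """
    if len(centers) < 2:
        raise ValueError("need at least two cluster centers")
    compactness = sum(
        (memberships[i][k] ** m) * dist(points[i], centers[k]) ** 2
        for i in range(len(points))
        for k in range(len(centers))
    )
    separation = min(dist(a, b) ** 2 for a, b in combinations(centers, 2))
    return compactness / (len(points) * separation)
```

To choose the number of clusters, one would typically run the fuzzy clustering for several candidate values and pick the one with the best index value (highest partition coefficient, lowest Xie-Beni).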
Building credit scoring models using selected statistical methods in R
Jánoš, Andrej ; Bašta, Milan (advisor) ; Pecáková, Iva (referee)
Credit scoring is an important and rapidly developing discipline. The aim of this thesis is to describe the basic methods used for building and interpreting credit scoring models, with an example of applying these methods to design such models in the statistical software R. The thesis is organized into five chapters. Chapter one explains the term credit scoring, gives the main examples of its application, and presents the motivation for studying this topic. The following chapters introduce the three methods most often used in financial practice for building credit scoring models. Chapter two discusses the most developed of them, logistic regression. The main emphasis is on the logistic regression model, which is characterized from a mathematical point of view; various ways of assessing the quality of the model are also presented. The other two methods presented in this thesis, decision trees and random forests, are covered in chapters three and four. An important part of this thesis is a detailed application of the described models to a specific data set, Default, using R. The final, fifth chapter is a practical demonstration of building credit scoring models, their diagnostics, and the subsequent evaluation of their applicability in practice using R. The appendices include the R code used throughout the thesis as well as functions developed for testing the final model. The key aim of the work is to provide the reader with enough theoretical knowledge and practical skills to fully understand the models discussed and to apply them in practice.
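As a rough illustration of the logistic regression model at the centre of this thesis, the following Python sketch fits a one-predictor model by plain gradient descent and turns the fitted coefficients into a default probability. The thesis itself works in R; the function names, the learning rate, the number of iterations, and the made-up debt-ratio data are all assumptions for illustration only.

```python
from math import exp


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, epochs=5000):
    """Fit P(default | x) = sigmoid(b0 + b1 * x) by gradient descent
    on the average negative log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


def default_probability(x, b0, b1):
    """Predicted probability of default for a client with predictor value x."""
    return sigmoid(b0 + b1 * x)


# Made-up toy data: debt ratio and whether the client defaulted (1) or not (0).
debt_ratio = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(debt_ratio, defaulted)
```

In practice one would rely on an established fitting routine (in R, typically `glm` with a binomial family) rather than hand-written gradient descent; the sketch only shows what the model estimates.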
Clustering and regression analysis of micro panel data
Sobíšek, Lukáš ; Pecáková, Iva (advisor) ; Komárek, Arnošt (referee) ; Brabec, Marek (referee)
The main purpose of panel studies is to analyze changes in the values of studied variables over time. In micro panel research, a large number of elements are periodically observed within a relatively short time period of just a few years, so the number of repeated measurements is small. This dissertation deals with contemporary approaches to the regression and clustering analysis of micro panel data. One approach to micro panel analysis is to take multivariate statistical models originally designed for cross-sectional data and modify them to account for the within-subject correlation. The thesis summarizes the available tools for the regression analysis of micro panel data. The known and currently used linear mixed effects models for a normally distributed dependent variable are recapitulated. In addition, new approaches for analyzing a response variable with a non-normal distribution are presented. These approaches include the generalized marginal linear model, the generalized linear mixed effects model, and the Bayesian modelling approach. Besides describing these models, the thesis gives a brief overview of their implementation in the R software. The difficulty with regression models adjusted for micro panel data is the ambiguity of their parameter estimation. This thesis proposes a way to improve the estimates through cluster analysis. For this reason, the thesis also describes methods for the cluster analysis of micro panel data. Because the supply of such methods is limited, the main goal of this thesis is to devise its own two-step approach to clustering micro panel data. In the first step, the panel data are transformed into a static form using a set of proposed characteristics of dynamics. These characteristics represent different features of the time course of the observed variables. In the second step, the elements are clustered by conventional spatial clustering techniques (agglomerative clustering and C-means partitioning). The clustering is based on a dissimilarity matrix of the values of the clustering variables calculated in the first step. Another goal of this thesis is to find out whether the suggested procedure improves the quality of the regression models for this type of data. By means of a simulation study, the proposed procedure is compared with the procedure applied in the kml package of the R software, as well as with the clustering characteristics proposed by Urso (2004). The simulation study demonstrated better results for the proposed combination of clustering variables than for the other combinations currently used. A corresponding script written in the R language is another benefit of this thesis. It is available on the attached CD and can be used for analyses of readers' own micro panel data.
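The two-step idea described above can be sketched in Python as follows. The abstract does not list the proposed characteristics of dynamics, so the three used here (mean level, linear trend, mean absolute change) are hypothetical stand-ins, and average-linkage agglomerative clustering on Euclidean dissimilarities is just one of the conventional techniques that could fill the second step. All function names are assumptions.

```python
from itertools import combinations
from math import dist


def dynamics_features(series):
    """Step 1: summarize one short time series by static characteristics.

    Returns (mean level, least-squares slope, mean absolute change).
    These particular characteristics are illustrative assumptions.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two measurements per element")
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    mean_abs_change = sum(abs(b - a) for a, b in zip(series, series[1:])) / (n - 1)
    return (y_mean, sxy / sxx, mean_abs_change)


def agglomerative(points, k):
    """Step 2: average-linkage agglomerative clustering into k clusters."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise Euclidean dissimilarity between two clusters.
            d = sum(
                dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            ) / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def cluster_panel(panel, k):
    """Transform each element's series to features, then cluster the features."""
    features = [dynamics_features(series) for series in panel]
    return agglomerative(features, k)
```

The sketch leaves the features unscaled for brevity; with characteristics on different scales, standardizing them before computing dissimilarities would usually be advisable.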
