
Neighborhood components analysis and machine learning
Hanousek, Jan ; Antoch, Jaromír (advisor) ; Maciak, Matúš (referee)
In this thesis we focus on the NCA algorithm, which is a modification of the k-nearest neighbors algorithm. Following a brief introduction to classification algorithms, we review the KNN algorithm, its strengths and flaws, and what led to the creation of NCA. Then we discuss two of the most widely used modifications of NCA, called Fast NCA and Kernel (fast) NCA, the latter of which implements the so-called kernel trick. An integral part of this thesis is also a proposed algorithm based on KNN (/NCA) and linear discriminant analysis, titled TSKNN (/TSNCA), respectively. We conclude the thesis with a detailed study of two real-life financial problems and compare all the algorithms introduced here based on their performance in these tasks.
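As a reference point for NCA, plain k-nearest neighbors classification can be written in a few lines. This is a generic sketch (not code from the thesis); NCA's contribution is to learn a linear transformation of the metric that this baseline takes as fixed.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points
    (Euclidean metric; NCA instead learns a linear transform of this metric)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```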


Bayesian factor analysis
Vávra, Jan ; Komárek, Arnošt (advisor) ; Maciak, Matúš (referee)
Factor analysis is a method which enables a high-dimensional random vector of measurements to be approximated by linear combinations of a much smaller number of hidden factors. The classical estimation procedure for this model relies on the choice of the number of factors, the decomposition of the variance matrix while keeping the identification conditions satisfied, and an appropriate choice of rotation for better interpretation of the model. This model is transferred into the Bayesian framework, which, unlike the classical approach, allows the use of prior information. The number of hidden factors can be treated as a random parameter, and the dependency of each measurement on at most one factor can be enforced by a suitable specification of the prior distribution. Estimates of the model parameters are based on the posterior distribution, which is approximated by Markov chain Monte Carlo methods. The Bayesian approach solves the selection of the number of factors, the model estimation, and the ensuring of identifiability and interpretability at the same time. The ability to estimate the true number of hidden factors is tested in a simulation study.
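The factor decomposition described above, X ≈ Λf + ε with implied covariance ΛΛᵀ + Ψ, can be checked by simulation. This is only a sanity check of the model structure, not the Bayesian MCMC procedure itself; all dimensions and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
p, k, n = 6, 2, 100_000            # 6 measurements, 2 hidden factors
Lam = rng.normal(size=(p, k))      # factor loadings Lambda
psi = rng.uniform(0.5, 1.0, p)     # unique (error) variances
F = rng.normal(size=(n, k))        # latent factors
X = F @ Lam.T + rng.normal(size=(n, p)) * np.sqrt(psi)

# the factor model implies cov(X) = Lam Lam^T + diag(psi)
S = np.cov(X, rowvar=False)
```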


Joinpoint Regression
Lain, Michal ; Maciak, Matúš (advisor) ; Hlávka, Zdeněk (referee)
The theme of this thesis is joinpoint regression: the description of the model, its properties, and its construction. We are interested in methods of estimating its parameters and show practical uses of the model. In the first chapter we define the model and describe its alternative forms and properties. In the second chapter we focus on estimating the model parameters, briefly covering Hudson's method, profile likelihood, grid search, and LASSO, and we mention the likelihood ratio for testing hypotheses about parameter values. The third chapter deals with comparing models with different numbers of break points by permutation tests and information criteria. In the fourth chapter we deal with practical examples, showing diverse applications of the model; we compare the methods using simulations and demonstrate the model in applications.
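The grid-search idea mentioned above can be sketched for a single break point: for each candidate location, fit a broken-line model by least squares and keep the candidate with the smallest residual sum of squares. A minimal illustration, not the thesis's implementation:

```python
import numpy as np

def fit_one_joinpoint(x, y):
    """Grid search for a single break point tau: fit
    y ~ b0 + b1*x + b2*(x - tau)_+ by least squares for each
    candidate tau and keep the one minimizing the RSS."""
    best_rss, best_tau, best_beta = np.inf, None, None
    for tau in x[1:-1]:                       # interior points as candidates
        Z = np.column_stack([np.ones_like(x), x, np.maximum(x - tau, 0.0)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        rss = float(((y - Z @ beta) ** 2).sum())
        if rss < best_rss:
            best_rss, best_tau, best_beta = rss, tau, beta
    return best_tau, best_beta
```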


Structural Equation Models with Application in Social Sciences
Veselý, Václav ; Pešta, Michal (advisor) ; Maciak, Matúš (referee)
We investigate the possible usage of the errors-in-variables (EIV) estimator when estimating structural equation models (SEM). Structural equation modelling provides a framework for analysing complex relations among a set of random variables, where, for example, the response variable in one equation plays the role of a predictor in another equation. First, an overview of SEM and some common covariance-based estimators is provided. The special case of the linear regression model is investigated, showing that the covariance-based estimators yield the same results as ordinary least squares. A compact review of EIV models follows; errors-in-variables models are regression models where not only the response but also the predictors are assumed to be measured with error. The main contribution of this paper then lies in defining modifications of the EIV estimator to fit the SEM framework. A general optimization problem to estimate the parameters of a structural equation model with errors in variables is postulated. Several modifications of two-stage least squares are also proposed for future research. An equation-wise errors-in-variables estimator is proposed to estimate the coefficients of a structural equation model: the coefficients of every structural equation are estimated separately using the EIV estimator. Some theoretical conditions...
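A classical errors-in-variables estimator in the single-equation case is total least squares, computable from the SVD of the augmented data matrix. This generic sketch (not the equation-wise estimator proposed in the thesis) shows how errors in the predictors and the response are treated symmetrically:

```python
import numpy as np

def tls_fit(X, y):
    """Total least squares: the coefficient vector comes from the right
    singular vector of [X, y] belonging to the smallest singular value,
    so measurement error in X and y is penalized symmetrically."""
    Z = np.column_stack([X, y])
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    v = Vt[-1]                     # singular vector for the smallest sigma
    return -v[:-1] / v[-1]
```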


Statistical inference in varying coefficient models
Splítek, Martin ; Maciak, Matúš (advisor) ; Pešta, Michal (referee)
This thesis deals with varying coefficient models, with a focus on statistical inference. The main idea of these models is to use regression coefficients that change depending on some effect modifier, instead of the constant coefficients of classical linear regression. First we define these models and their estimation procedures, of which several variants have been published so far. Estimation uses local regression or various kinds of splines: smoothing, polynomial, or penalized. The statistical inference then follows from the chosen estimation method; we present the derived bias, variance, asymptotic normality, confidence bands, and hypothesis tests. The main goal of our work is to compactly summarize the selected methods and their inference. Finally, a variable selection procedure is proposed.


Regularization and variable selection in regression models
Lahodová, Kateřina ; Komárek, Arnošt (advisor) ; Maciak, Matúš (referee)
This diploma thesis focuses on regularization and variable selection in regression models. The basics of penalized likelihood, generalized linear models, and their evaluation and comparison based on prediction quality and variable selection are described. The LASSO and LARS methods for variable selection in normal linear regression are briefly introduced. The main topic of this thesis is the method called Boosting. The general Boosting algorithm is introduced, including functional gradient descent, followed by the selection of a base procedure, especially the componentwise linear least squares method. Two specific applications of the general Boosting algorithm are introduced, with derivations of some important characteristics: AdaBoost for data with a conditional binomial distribution, and L2Boosting for a conditional normal distribution. Finally, a simulation study comparing the LASSO, LARS, and L2Boosting methods was conducted. It shows that LASSO and LARS are more suitable for variable selection, whereas L2Boosting is better suited to prediction on new data.
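The componentwise linear least squares base procedure mentioned above can be sketched as follows: at each boosting step, fit every (centered) column to the current residual, pick the best-fitting one, and move its coefficient a small step toward the least-squares slope. A minimal sketch assuming centered columns, not the thesis's derivation:

```python
import numpy as np

def l2_boost(X, y, steps=200, nu=0.1):
    """Componentwise linear least squares L2Boosting: repeatedly fit
    each centered column to the current residual and shrink the best
    column's coefficient toward its least-squares slope by factor nu."""
    beta = np.zeros(X.shape[1])
    resid = y - y.mean()
    for _ in range(steps):
        slopes = X.T @ resid / (X ** 2).sum(axis=0)          # per-column LS slopes
        rss = ((resid[:, None] - X * slopes) ** 2).sum(axis=0)
        j = int(np.argmin(rss))                              # best-fitting column
        beta[j] += nu * slopes[j]
        resid -= nu * slopes[j] * X[:, j]
    return y.mean(), beta
```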


Varying coefficient models
Sekera, Michal ; Maciak, Matúš (advisor) ; Komárek, Arnošt (referee)
The aim of this thesis is to provide an overview of varying coefficient models: a class of regression models that allow the coefficients to vary as functions of random variables. This concept is described for independent samples, longitudinal data, and time series. Estimation methods include polynomial spline, smoothing spline, and local polynomial methods for models of a linear form, and the local maximum likelihood method for models of a generalized linear form. The statistical properties focus on the consistency and asymptotic distribution of the estimators. A numerical study compares the finite sample performance of the estimators of the coefficient functions.
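The local estimation idea can be illustrated in the simplest case: a kernel-weighted least squares (local constant) estimate of the coefficient function at one point. A toy sketch, not an implementation from the thesis:

```python
import numpy as np

def beta_hat(x, u, y, u0, h=0.2):
    """Local (kernel-weighted) least squares estimate of beta(u0) in the
    varying coefficient model y = beta(u) * x + noise, using a Gaussian
    kernel in the effect modifier u with bandwidth h."""
    w = np.exp(-0.5 * ((u - u0) / h) ** 2)
    return float((w * x * y).sum() / (w * x * x).sum())
```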


Problem of the nearest correlation matrix
Sotáková, Martina ; Pešta, Michal (advisor) ; Maciak, Matúš (referee)
This work deals with the problem of finding the correlation matrix closest to a given symmetric matrix, where the distance is measured in the Frobenius norm. The theoretical part of the thesis describes a method for finding the solution to this problem based on the dual approach and an application of Newton's method. The method is further modified for other cases. In the practical part we apply the theory to simple mathematical problems.
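The same problem can be attacked by Higham's alternating projections with Dykstra's correction, projecting back and forth between the positive semidefinite cone and the matrices with unit diagonal. This is a simpler (and slower) alternative to the dual Newton method studied in the thesis, shown here only to make the problem concrete:

```python
import numpy as np

def nearest_correlation(A, n_iter=200):
    """Dykstra-corrected alternating projections between the PSD cone
    and the set of unit-diagonal matrices (Higham's method; a simpler
    alternative to the dual Newton approach studied in the thesis)."""
    Y, dS = A.astype(float).copy(), np.zeros_like(A, dtype=float)
    for _ in range(n_iter):
        R = Y - dS                                # apply Dykstra correction
        w, V = np.linalg.eigh((R + R.T) / 2)
        X = (V * np.clip(w, 0, None)) @ V.T       # project onto PSD cone
        dS = X - R
        Y = X.copy()
        np.fill_diagonal(Y, 1.0)                  # restore unit diagonal
    return Y
```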


Causes of Effects and Effects of Causes
Zemánková, Lucie ; Maciak, Matúš (advisor) ; Antoch, Jaromír (referee)
The thesis deals with associative and causal relationships between two different random phenomena and presents basic statistical methods for investigating these relationships. First, it focuses on demonstrating the association between phenomena and shows that finding a causal relation between them requires appropriate randomization of the system or an intervention in the system. After intervening in the system, it is no longer possible to observe all situations (the so-called counterfactual observations), but the causal relationship can still be demonstrated using appropriate technical procedures and theoretical assumptions. The thesis further summarizes different ways of representing causal structures, first by means of graphs, where basic methods of estimating the causal structure are presented, and later by structural equations, which already capture the quantitative measure of causal relations.
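The gap between association and causation can be demonstrated with a tiny simulated structural model. All numbers here are hypothetical: a confounder z drives both x and y, so the observational regression slope overstates the causal effect, while randomizing x (an intervention, do(x)) recovers it:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# structural model: z confounds x and y; the causal effect of x on y is 2
z = rng.normal(size=n)
x_obs = z + rng.normal(size=n)
y_obs = 2 * x_obs + 3 * z + rng.normal(size=n)

# observational regression slope is biased upward by the confounder z
slope_obs = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs)

# intervention do(x): x is randomized, breaking the arrow z -> x
x_do = rng.normal(size=n)
y_do = 2 * x_do + 3 * z + rng.normal(size=n)
slope_do = np.cov(x_do, y_do)[0, 1] / np.var(x_do)
```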


Confidence bands for regression curves
Zavřelová, Adéla ; Hlávka, Zdeněk (advisor) ; Maciak, Matúš (referee)
This thesis deals with the construction of confidence bands for a linear regression model. The basic characteristics of a linear model are given, and constructions of different confidence bands are described for models where the regression relationship is given by a function of one variable. The main focus is on bands for polynomial models.
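One classical construction in this family is the Working-Hotelling simultaneous band for the simple regression line. A sketch under standard normal-error assumptions (not code from the thesis):

```python
import numpy as np
from scipy import stats

def working_hotelling_band(x, y, grid, alpha=0.05):
    """Simultaneous (Working-Hotelling) confidence band for the line
    E[y] = b0 + b1*x, evaluated at the points in `grid`."""
    n, xbar = len(x), x.mean()
    Sxx = ((x - xbar) ** 2).sum()
    b1 = ((x - xbar) * (y - y.mean())).sum() / Sxx
    b0 = y.mean() - b1 * xbar
    s2 = ((y - b0 - b1 * x) ** 2).sum() / (n - 2)        # residual variance
    W = np.sqrt(2 * stats.f.ppf(1 - alpha, 2, n - 2))    # simultaneous multiplier
    half = W * np.sqrt(s2 * (1.0 / n + (grid - xbar) ** 2 / Sxx))
    fit = b0 + b1 * grid
    return fit - half, fit, fit + half
```

Note the band is narrowest at the mean of x and widens toward the edges of the design, unlike a constant-width band.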
