
Durbin-Watson test
Lipták, Patrik ; Zvára, Karel (advisor) ; Anděl, Jiří (referee)
The bachelor thesis deals with the Durbin-Watson test, which is used to test the independence of residuals in a normal linear regression model. The test is applicable when data are collected sequentially and the values of the dependent variable form a time series. In the first part, the thesis provides a detailed derivation of the distribution of the test statistic (or of its bounds), together with a conclusion describing how to reach the right decision when testing the hypothesis that the correlation coefficient equals 0. In the second part, three practical examples with real data are used to demonstrate this theory. The calculations are supplemented by illustrative graphs and are also carried out in the R computing environment for comparison.
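As an illustration of the statistic discussed in this abstract, the following is a minimal sketch in Python (not the thesis's own R code). It assumes the standard definition d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2 computed from ordinary least squares residuals; the function names `ols_residuals` and `durbin_watson` are hypothetical and chosen only for this example.

```python
def ols_residuals(x, y):
    """Residuals of the simple linear regression y = a + b*x fitted by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def durbin_watson(residuals):
    """Durbin-Watson statistic: squared successive differences over squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


if __name__ == "__main__":
    # Made-up illustrative data ordered in time, not data from the thesis.
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    y = [1.1, 2.3, 2.9, 4.4, 4.8, 6.3, 6.9, 8.2]
    print(durbin_watson(ols_residuals(x, y)))
```

The statistic always lies between 0 and 4. Values near 2 are consistent with uncorrelated residuals, values near 0 point to positive first-order autocorrelation, and values near 4 point to negative autocorrelation. The decision itself requires the bounds or the exact distribution that the thesis derives.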


Regression trees
Masaila, Aleh ; Hanzák, Tomáš (advisor) ; Zvára, Karel (referee)
Title: Regression trees Author: Aleh Masaila Department: Department of Probability and Mathematical Statistics Supervisor: Mgr. Tomáš Hanzák Abstract: Although regression and classification trees have been used for data analysis for several decades, they are still in the shadow of more traditional methods such as linear or logistic regression. This work aims to describe several of the best-known regression tree methods and to introduce a newer direction in this area: a combination of regression trees and committee methods, the so-called regression forests. In the practical part of the work we examine the properties, strengths and weaknesses of the studied methods on real data sets. Keywords: regression tree, CART, MARS, regression forest, bagging, boosting, random forest


Interval estimates for binomial proportion
Borovský, Marko ; Zvára, Karel (advisor) ; Sečkárová, Vladimíra (referee)
The subject of this thesis is the point estimate and interval estimates of the binomial proportion. Interval estimation of the probability of success in a binomial distribution is one of the most basic and crucial problems in statistical practice. The thesis is divided into three chapters. The first chapter covers maximum likelihood estimation of a binomial proportion. Furthermore, we describe several methods for constructing confidence intervals. Finally, we compare all the intervals in terms of actual coverage probability and expected length.


Confidence regions in nonlinear regression
Marcinko, Tomáš ; Zvára, Karel (advisor) ; Komárek, Arnošt (referee)
The aim of this thesis is a comprehensive description of the properties of the nonlinear least squares estimator for a nonlinear regression model with normally distributed errors and a thorough development of various methods for constructing confidence regions and confidence intervals for the parameters of the nonlinear model. Because, unlike in the case of linear models, there is no easy way to construct an exact confidence region for the parameters, most of these methods are only approximate. A short simulation study comparing the observed coverage of various confidence regions and confidence intervals for models with different curvatures and sample sizes is also included. In the case of negligible intrinsic curvature, the use of likelihood-ratio confidence regions seems the most appropriate.


Construction of classifiers suitable for segmentation of clients
Hricová, Jana ; Antoch, Jaromír (advisor) ; Zvára, Karel (referee)
Title: Construction of classifiers suitable for segmentation of clients Author: Bc. Jana Hricová Department: Department of Probability and Mathematical Statistics Supervisor: prof. RNDr. Jaromír Antoch, CSc., Department of Probability and Mathematical Statistics Abstract: The master thesis discusses methods belonging to the part of data analysis called classification. The thesis presents classification methods used to construct tree-like classifiers suitable for customer segmentation. The core methodology discussed in the thesis is CART (Classification and Regression Trees), followed by ensemble methodologies that use historical data to construct classification and regression forests, namely Bagging, Boosting, Arcing and Random Forest. The methods described here were applied to real data from the field of customer segmentation and also to simulated data, both processed with RStudio software. Keywords: classification, tree-like classifiers, random forests


Maximum likelihood estimators and their approximations
Tyuleneva, Anastasia ; Omelčenko, Vadim (advisor) ; Zvára, Karel (referee)
Title: Maximum likelihood estimators and their approximations Author: Anastasia Tyuleneva Department: Department of Probability and Mathematical Statistics Supervisor: Mgr. Vadym Omelchenko Abstract: The maximum likelihood method is one of the most effective and accurate methods used for estimating distributions and their parameters. In this work we identify the pros and cons of this method and compare it with other estimation methods. In the theoretical part we review the theorems and definitions needed to build general solution algorithms and to process real data. In the practical part we apply the MLE to case-study distributions to estimate the unknown parameters. In the final part we apply the method to real price data from EEX AG, Germany. We also compare it with other common methods of estimating distributions and parameters and choose the best distribution. All tests and estimates are computed in Mathematica. Keywords: parameter estimates, maximum likelihood estimators, MLE, stable distribution, characteristic function, Pearson's chi-squared test, Rao-Cramér


Expectation-Maximization Algorithm
Vichr, Jaroslav ; Pešta, Michal (advisor) ; Zvára, Karel (referee)
The EM (Expectation-Maximization) algorithm is an iterative method for finding maximum likelihood estimates in cases where either the data include missing values or assuming the existence of additional unobserved data points leads to a simpler formulation of the model. Each iteration consists of two parts. During the E step (expectation) we calculate the expected value of the log-likelihood function of the complete data, conditional on the observed data and the current estimate of the parameter. The M step (maximization) then finds a new estimate that maximizes the function obtained in the previous step and is used in the E step of the next iteration. The EM algorithm has important applications, e.g. in pricing and in managing the risk of a portfolio.
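The two steps described in this abstract can be sketched on a small example. The following Python sketch is an illustration, not code from the thesis. It assumes a mixture of two normal distributions with a known common standard deviation of 1, where the unobserved data are the component labels. In this model the E step reduces to computing, for each observation, the posterior probability of belonging to the first component, and the M step has a closed form. The function name `em_two_normals` and the data are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(data, w, mu1, mu2):
    """Observed-data log-likelihood of the two-component mixture."""
    return sum(
        math.log(w * normal_pdf(x, mu1) + (1.0 - w) * normal_pdf(x, mu2))
        for x in data
    )


def em_two_normals(data, mu1, mu2, w=0.5, n_iter=30):
    """EM for a mixture of N(mu1, 1) and N(mu2, 1) with mixing weight w."""
    history = []
    for _ in range(n_iter):
        # E step: posterior probability that each point comes from component 1,
        # given the observed data and the current parameter estimates.
        resp = []
        for x in data:
            p1 = w * normal_pdf(x, mu1)
            p2 = (1.0 - w) * normal_pdf(x, mu2)
            resp.append(p1 / (p1 + p2))
        # M step: maximize the expected complete-data log-likelihood,
        # which here gives weighted proportions and weighted means.
        n1 = sum(resp)
        n2 = len(data) - n1
        w = n1 / len(data)
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        history.append(log_likelihood(data, w, mu1, mu2))
    return w, mu1, mu2, history


if __name__ == "__main__":
    # Made-up data: five points near -2 and six points near 2.
    data = [-2.3, -1.9, -2.1, -1.7, -2.0, 1.8, 2.2, 2.0, 2.4, 1.6, 2.1]
    w, mu1, mu2, history = em_two_normals(data, mu1=-1.0, mu2=1.0)
    print(w, mu1, mu2, history[-1])
```

The returned `history` holds the observed-data log-likelihood after each iteration. A general property of EM is that this sequence never decreases, which makes it a convenient check on an implementation.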


Examination of k regression lines
Drozen, Alan ; Zvára, Karel (advisor) ; Omelka, Marek (referee)
In the present work we study the problem of k regression lines in the general linear model. First we describe the general linear model with a multivariate normal distribution of errors and show some of its basic characteristics. Then we introduce a model with k regression lines. Further, we describe a test of the hypothesis that two regression lines are parallel and another test of the hypothesis that all or some of the k regression lines are parallel or identical. Then we derive the test of a submodel of the general linear model and analyze issues such as the power of this test, a submodel of another submodel, orthogonality and reparametrization. We also show geometric interpretations of the general linear model and of the submodel test. In the subsequent part, we focus on nonparametric tests. We present four permutation tests for testing a submodel in the general linear model. Finally, we perform a numerical simulation to find out whether the tests maintain the required size and to determine their power.


Nonlinearity in time series models
Kalibán, František ; Anděl, Jiří (advisor) ; Zvára, Karel (referee)
The thesis concentrates on the property of linearity in time series models, its definitions and the possibilities of testing it. The presented tests focus mainly on the time domain and are based on various statistical methods such as regression, neural networks and random fields. Their implementation in R software is described. For tests implemented in more than one package, the advantages and disadvantages of the implementations are discussed. The second topic of the thesis is additivity in nonlinear models. Its definition is introduced, as well as tests developed for detecting its presence. Several tests (of both linearity and additivity) have been implemented in R for the purposes of simulations. The last chapter deals with the application of the tests to real data.


Estimators and tests in panel data models
Zvejšková, Magdalena ; Hušková, Marie (advisor) ; Zvára, Karel (referee)
This work investigates mainly panel data models in which cross-sections can be considered independent. In the first part, we summarize results in the field of pooled models and one-way error component models with fixed and random effects. We focus especially on the ways of estimating unknown parameters and on tests of the significance of effects. We also briefly describe issues of the two-way error component model. In the second part, estimators of the parameters of the first-order autoregressive panel data model are derived, for both the fixed and the random parameter case. The work proves unbiasedness, consistency and asymptotic normality of selected estimators. Using these properties, hypothesis tests about the corresponding parameters are derived. Application of the models is illustrated using real and simulated data examples.
