National Repository of Grey Literature: 37 records found (records 21 - 30)
Algorithms for anomaly detection in data from clinical trials and health registries
Bondarenko, Maxim ; Blaha, Milan (referee) ; Schwarz, Daniel (advisor)
This master's thesis deals with anomaly detection in data from clinical trials and medical registries. The purpose of the work is to review the literature on data quality in clinical trials and to design a custom algorithm, based on machine learning methods, for detecting anomalous records in real clinical data from current or completed clinical trials or medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: import of data from the information system; preprocessing and transformation of the imported records, whose variables have different data types, into numerical vectors; application of well-known statistical methods for outlier detection; and evaluation of the quality and accuracy of the algorithm. The output of the algorithm is a vector of parameters containing anomalies, which is intended to make the data manager's work easier. The algorithm is designed to extend the functions of the CLADE-IS information system with automatic monitoring of data quality through the detection of anomalous records.
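The steps listed in the abstract (encoding mixed-type records as numerical vectors, then applying a statistical outlier rule) might look roughly like the sketch below. It is not the thesis's algorithm: the field names, the one-hot encoding, the robust z-score based on the median absolute deviation, and the threshold of 3.5 are all assumptions chosen for illustration.

```python
from statistics import median


def encode_records(records, numeric_fields, categorical_fields):
    """Turn dict records into numeric vectors (numeric values, then one-hot categories)."""
    # Sorted set of levels observed for each categorical field.
    levels = {f: sorted({r[f] for r in records}) for f in categorical_fields}
    vectors = []
    for r in records:
        vec = [float(r[f]) for f in numeric_fields]
        for f in categorical_fields:
            vec.extend(1.0 if r[f] == lvl else 0.0 for lvl in levels[f])
        vectors.append(vec)
    return vectors


def robust_z_scores(values):
    """Median/MAD-based z-scores; 0.6745 rescales the MAD to match a normal std."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [0.0] * len(values)  # no spread: nothing can be scored
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(records, numeric_fields, categorical_fields, threshold=3.5):
    """Return sorted indices of records with an extreme value in any numeric column."""
    vectors = encode_records(records, numeric_fields, categorical_fields)
    flagged = set()
    # Only the numeric columns are scored here; the one-hot columns are
    # produced so that a multivariate detector could use them later.
    for j in range(len(numeric_fields)):
        column = [v[j] for v in vectors]
        for i, z in enumerate(robust_z_scores(column)):
            if abs(z) > threshold:
                flagged.add(i)
    return sorted(flagged)


# Hypothetical registry records; "age" and "sex" are illustrative field names.
records = [
    {"age": 34, "sex": "F"}, {"age": 36, "sex": "M"}, {"age": 35, "sex": "F"},
    {"age": 33, "sex": "M"}, {"age": 37, "sex": "F"}, {"age": 120, "sex": "M"},
]
suspicious = flag_anomalies(records, ["age"], ["sex"])
```

The median and MAD are used instead of the mean and standard deviation so that a single extreme value does not inflate the scale against which it is judged.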
Robust regression - outlier detection
Hradilová, Lenka ; Blatná, Dagmar (advisor) ; Černý, Jindřich (referee)
This master's thesis focuses on methods of outlier detection. The aim of the work is to assess the suitability of robust methods for real data from EKO-KOM, a.s. The first part of the thesis provides an overview and a theoretical treatment of classical and robust methods of outlier detection. In the practical part, these methods are applied to the data file obtained from EKO-KOM, a.s. The thesis concludes with recommendations on the suitability of the methods, based on a comparison of the classical and robust approaches.
Stable distributions and their applications
Volchenkova, Irina ; Klebanov, Lev (advisor) ; Beneš, Viktor (referee)
The aim of this thesis is to show that the use of heavy-tailed distributions in finance is theoretically unfounded and may cause significant misunderstandings and fallacies in model interpretation. The main reason seems to be a wrong understanding of the concept of the distributional tail. Moreover, in models based on real data it seems more reasonable to concentrate on the central part of the distribution rather than on the tails.
Robustification of statistical and econometrical regression methods
Jurczyk, Tomáš ; Víšek, Jan Ámos (advisor) ; Hlávka, Zdeněk (referee) ; Malý, Marek (referee)
Department: Department of Probability and Mathematical Statistics
Supervisor: prof. RNDr. Jan Ámos Víšek CSc., IES FSV UK Praha
Abstract: Multicollinearity and the presence of outliers are two data problems that can occur in regression analysis. In this thesis we are mainly interested in situations where the combined outlier-multicollinearity problem is present. We first show the behavior of classical methods developed for overcoming one of these problems. We also investigate the functionality of methods proposed as robust multicollinearity detectors. We prove that the proposed two-step procedures (with one step typically based on robust regression methods) fail in outlier detection, and therefore also in multicollinearity detection, if strong multicollinearity is present in the majority of the data. We propose a new one-step method as a candidate for a robust detector of multicollinearity as well as a robust ridge regression estimate. We derive its properties and behavior and propose diagnostic tools derived from the method.
Keywords: multicollinearity, outliers, robust detector of multicollinearity, robust ridge regression
Outliers
Kudrnáč, Vojtěch ; Zvára, Karel (advisor) ; Anděl, Jiří (referee)
This paper is concerned with methods of identifying outliers in an otherwise normally distributed data set. Several significant tests and criteria designed for this purpose are described: Peirce's criterion, Chauvenet's criterion, Grubbs' test, Dixon's test and Cochran's test. The derivation of the tests and criteria is outlined, and the results of applying them to simulated normally distributed data with an inserted outlier are then examined. Code in the R programming language implementing these tests and criteria with existing functions is included.
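As an illustration of one of the criteria named above, Chauvenet's criterion can be sketched in a few lines. The thesis itself provides R code; the Python version below is an independent sketch, and the function name and sample data are assumptions. A value is flagged when the expected number of observations at least as far from the mean, under a fitted normal distribution, is below one half.

```python
import math
from statistics import mean, stdev


def chauvenet_outliers(data):
    """Return indices of values rejected by Chauvenet's criterion."""
    n = len(data)
    m = mean(data)
    s = stdev(data)  # sample standard deviation
    if s == 0:
        return []  # all values identical: nothing to reject
    flagged = []
    for i, x in enumerate(data):
        z = abs(x - m) / s
        # Two-sided normal tail probability P(|Z| >= z).
        p = math.erfc(z / math.sqrt(2))
        # Reject when fewer than 0.5 such observations are expected among n.
        if n * p < 0.5:
            flagged.append(i)
    return flagged


# Hypothetical measurements with one inserted outlier (the last value).
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.9, 10.1, 10.0, 14.0]
print(chauvenet_outliers(sample))  # flags index 9
```

Because the mean and standard deviation are computed from the full sample, including the suspect value, the criterion is usually applied to one suspected outlier at a time.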
Robust Regression Estimators: A Comparison of Prediction Performance
Kalina, Jan ; Peštová, Barbora
Regression represents an important methodology for solving numerous tasks of applied econometrics. This paper is devoted to robust estimators of parameters of a linear regression model, which are preferable whenever the data contain or are believed to contain outlying measurements (outliers). While various robust regression estimators are nowadays available in standard statistical packages, the question remains how to choose the most suitable regression method for a particular data set. This paper aims at comparing various regression methods on various data sets. First, the prediction performance of common robust regression estimators is compared on a set of 24 real data sets from public repositories. Further, the results are used as input for a metalearning study over 9 selected features of individual data sets. On the whole, the least trimmed squares estimator turns out to be superior to least squares or M-estimators in the majority of the data sets, while the metalearning process does not succeed in reliably predicting the most suitable estimator for a given data set.
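To make the contrast between least squares and least trimmed squares (LTS) concrete, the sketch below fits a simple regression line both ways. It is not the paper's code. LTS minimizes the sum of the h smallest squared residuals, which is equivalent to taking the least squares fit on the best subset of h observations; the exhaustive subset search used here is exact but only feasible for very small data sets, and practical implementations rely on approximate algorithms. The data are made up.

```python
from itertools import combinations


def ols_fit(xs, ys):
    """Least squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def lts_fit(xs, ys, h):
    """Exact LTS by exhaustive search over all subsets of size h (tiny data only)."""
    best = None
    for idx in combinations(range(len(xs)), h):
        sub_x = [xs[i] for i in idx]
        sub_y = [ys[i] for i in idx]
        if len(set(sub_x)) < 2:
            continue  # a line cannot be fitted to a single distinct x value
        a, b = ols_fit(sub_x, sub_y)
        rss = sum((y - a - b * x) ** 2 for x, y in zip(sub_x, sub_y))
        if best is None or rss < best[0]:
            best = (rss, a, b)
    return best[1], best[2]


# Hypothetical data on the line y = 1 + 2x, with the last response shifted by 30.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 5, 7, 9, 11, 13, 15, 47]
ols_intercept, ols_slope = ols_fit(xs, ys)       # pulled towards the outlier
lts_intercept, lts_slope = lts_fit(xs, ys, h=6)  # ignores the worst-fitting points
```

With h = 6 of 8 observations, the LTS fit can disregard the single contaminated point entirely, whereas the least squares slope is distorted by it.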
Robust Regularized Discriminant Analysis Based on Implicit Weighting
Kalina, Jan ; Hlinka, Jaroslav
In bioinformatics, regularized linear discriminant analysis is commonly used as a tool for supervised classification problems, tailor-made for high-dimensional data with the number of variables exceeding the number of observations. However, its various available versions are too vulnerable to the presence of outlying measurements in the data. In this paper, we exploit principles of robust statistics to propose new versions of regularized linear discriminant analysis suitable for high-dimensional data contaminated by (more or less) severe outliers. The work exploits a regularized version of the minimum weighted covariance determinant estimator, which is one of the highly robust estimators of multivariate location and scatter. The performance of the novel classification methods is illustrated on real data sets, with a detailed analysis of data from brain activity research.
Diagnostics for Robust Regression: Linear Versus Nonlinear Model
Kalina, Jan
Robust statistical methods represent important tools for estimating parameters in linear as well as nonlinear econometric models. In contrast to least squares, they do not suffer from vulnerability to the presence of outlying measurements in the data. Nevertheless, they need to be accompanied by diagnostic tools for verifying their assumptions. In this paper, we propose the asymptotic Goldfeld-Quandt test for the regression median. It allows us to formulate a natural procedure for models with heteroscedastic disturbances, which is again based on the regression median. Further, we turn to the nonlinear regression model. We focus on the nonlinear least weighted squares estimator, which is one of the recently proposed robust estimators of parameters in nonlinear regression. We study the residuals of the estimator and use a numerical simulation to reveal that they can be severely heteroscedastic even for data generated from a model with homoscedastic disturbances. Thus, we give a warning that standard residuals of the robust nonlinear estimator may produce misleading results if used with the standard diagnostic tools.
On Exact Heteroscedasticity Testing for Robust Regression
Kalina, Jan ; Peštová, Barbora
The paper is devoted to the least weighted squares estimator, which is one of the highly robust estimators for the linear regression model. Novel permutation tests of heteroscedasticity are proposed. The asymptotic behavior of the permutation test statistics of the Goldfeld-Quandt and Breusch-Pagan tests is also investigated. A numerical experiment on real economic data is presented, which also shows how to build a robust prediction model under heteroscedasticity.
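For orientation, the classical Goldfeld-Quandt statistic mentioned in the abstract can be sketched as below. This is the ordinary least squares version for a single regressor, not the robust or permutation variants proposed in the paper; the function names, the `drop` parameter, and the data are assumptions. Observations are sorted by the regressor, the central ones are dropped, separate lines are fitted to the low and high groups, and the ratio of the two residual variances is returned.

```python
def _residual_sum_of_squares(pairs):
    """RSS of a least squares line fitted to a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in pairs)


def goldfeld_quandt(xs, ys, drop=0):
    """Ratio of residual variances: high-x group over low-x group."""
    pairs = sorted(zip(xs, ys))  # order observations by the regressor
    n = len(pairs)
    size = (n - drop) // 2       # equal group sizes; may drop one extra point
    if size < 3:
        raise ValueError("each group needs at least 3 observations")
    low = pairs[:size]
    high = pairs[n - size:]
    df = size - 2                # two fitted parameters per group
    return (_residual_sum_of_squares(high) / df) / (_residual_sum_of_squares(low) / df)


# Hypothetical data: y = 2x plus disturbances ten times larger for large x.
xs = list(range(1, 13))
errors = [0.1, -0.1, 0.1, -0.1, 0.1, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0, 1.0]
ys = [2 * x + e for x, e in zip(xs, errors)]
gq = goldfeld_quandt(xs, ys, drop=2)
```

Values far above 1 suggest that the disturbance variance grows with the regressor; under homoscedastic normal disturbances the classical statistic follows an F distribution with (df, df) degrees of freedom, which is what a formal test would compare it against.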
