Národní úložiště šedé literatury (National Repository of Grey Literature): 6 records found.
A Nonparametric Bootstrap Comparison of Variances of Robust Regression Estimators.
Kalina, Jan ; Tobišková, Nicole ; Tichavský, Jan
While various robust regression estimators are available for the standard linear regression model, performance comparisons of individual robust estimators over real or simulated datasets still seem to be lacking. In general, a reliable robust estimator of regression parameters should be consistent and at the same time should have relatively small variability, i.e. the variances of the individual parameter estimates should be small. The aim of this paper is to compare the variability of S-estimators, MM-estimators, least trimmed squares, and least weighted squares estimators. While they are all consistent under general assumptions, the asymptotic covariance matrix of the least weighted squares estimator remains infeasible to compute, because the only available formula for its computation depends on the unknown random errors. Thus, we resort to a nonparametric bootstrap comparison of the variability of different robust regression estimators. It turns out that the best results are obtained either with MM-estimators or with the least weighted squares with suitable weights. The latter estimator is especially recommended for small sample sizes.
How to down-weight observations in robust regression: A metalearning study
Kalina, Jan ; Pitra, Z.
Metalearning is becoming an increasingly important methodology for extracting knowledge from a database of available training data sets and applying it to a new (independent) data set. The concept of metalearning is becoming popular in statistical learning, and the number of metalearning applications in the analysis of economic data sets is also increasing. Still, not much attention has been paid to its limitations and disadvantages. To examine them, we use various linear regression estimators (including highly robust ones) over a set of 30 data sets with economic background and perform a metalearning study over them as well as over the same data sets after an artificial contamination.
Robust Metalearning: Comparing Robust Regression Using A Robust Prediction Error
Peštová, Barbora ; Kalina, Jan
The aim of this paper is to construct a classification rule for predicting the best regression estimator for a new data set based on a database of 20 training data sets. The estimators considered include some popular methods of robust statistics. The methodology used for constructing the classification rule can be described as metalearning. Nevertheless, standard approaches to metalearning should be robustified when working with data sets contaminated by outlying measurements (outliers). Therefore, our contribution can also be described as a robustification of the metalearning process by using a robust prediction error. In addition to performing the metalearning study by means of both standard and robust approaches, we search for a detailed interpretation in two particular situations. The results of the detailed investigation show that the knowledge obtained by a metalearning approach based on standard principles is prone to great variability and instability, which makes it hard to believe that the results are not just a consequence of mere chance. This aspect of metalearning seems not to have been previously analyzed in the literature.
How to down-weight observations in robust regression: A metalearning study
Kalina, Jan ; Pitra, Zbyněk
Metalearning is becoming an increasingly important methodology for extracting knowledge from a database of available training data sets and applying it to a new (independent) data set. The concept of metalearning is becoming popular in statistical learning, and the number of metalearning applications in the analysis of economic data sets is also increasing. Still, not much attention has been paid to its limitations and disadvantages. To examine them, we use various linear regression estimators (including highly robust ones) over a set of 30 data sets with economic background and perform a metalearning study over them as well as over the same data sets after an artificial contamination. We focus on comparing the prediction performance of the least weighted squares estimator with various weighting schemes. A broader spectrum of classification methods is applied, and a support vector machine turns out to yield the best results. Because the results of leave-one-out cross-validation differ greatly from those of autovalidation, we conclude that metalearning is highly unstable and that its results should be interpreted with care. We also discuss the possible limitations of the metalearning methodology in general.
Robust Regression Estimators: A Comparison of Prediction Performance
Kalina, Jan ; Peštová, Barbora
Regression represents an important methodology for solving numerous tasks of applied econometrics. This paper is devoted to robust estimators of parameters of a linear regression model, which are preferable whenever the data contain or are believed to contain outlying measurements (outliers). While various robust regression estimators are nowadays available in standard statistical packages, the question remains how to choose the most suitable regression method for a particular data set. This paper aims at comparing various regression methods on various data sets. First, the prediction performance of common robust regression estimators is compared on a set of 24 real data sets from public repositories. Further, the results are used as input for a metalearning study over 9 selected features of individual data sets. On the whole, the least trimmed squares estimator turns out to be superior to least squares and M-estimators in the majority of the data sets, while the process of metalearning does not succeed in reliably predicting the most suitable estimator for a given data set.
Selection of Relevant Rules for Clinical Decision Support (Výběr relevantních pravidel pro podporu klinického rozhodování)
Kalina, Jan ; Zvárová, Jana
Clinical decision support systems are important telemedicine tools capable of assisting physicians in the decision-making process when determining a patient's diagnosis, therapy, or prognosis. We have designed and implemented a prototype of a diagnostic decision support system in the form of an internet classification service. A distinctive feature of this system is a sophisticated statistical component that makes it possible to work even with a large number of features. It optimizes the selection of those features that are most important for determining the diagnosis. We verified its behavior in an analysis of gene expression data from a cardiovascular genetic study. The article discusses the principles of multivariate statistical reasoning and shows the difficulties of analyzing high-dimensional data, where the number of observed variables (features) exceeds the number of observations (patients).
