National Repository of Grey Literature: 26 records found, showing records 7 - 16
Weighted Halfspace Depths and Their Properties
Kotík, Lukáš ; Hlubinka, Daniel (advisor) ; Omelka, Marek (referee) ; Mosler, Karl (referee)
Statistical depth functions have become a well-known nonparametric tool of multivariate data analysis. The halfspace depth is among the best known of them. Although it has many desirable properties, some of them may lead to biased and misleading results, especially when the data are not elliptically symmetric. The thesis introduces two new classes of depth functions, both of which generalize the halfspace depth. They keep some of its properties and, because they better respect the geometric structure of the data, they usually lead to better results for non-elliptically symmetric, multimodal or mixed distributions. The idea is to replace the indicator of a halfspace by a more general weight function. Especially when conic-section weight functions are used, this yields a continuum between a local view of the data (e.g. a kernel density estimate) and a global view of the data (e.g. the halfspace depth). The rate of localization is determined by the choice of the weight functions and their parameters. Properties of the proposed depth functions, including uniform strong consistency, are proved in the thesis. The limit distribution is also discussed, together with other topics related to data depth (regression depth, functional data depth)...
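As a rough illustration of the idea in the abstract above, the sketch below approximates the empirical halfspace depth of a 2-D point over a grid of directions and lets the halfspace indicator be swapped for another weight function. The function names, the direction grid, and the `band_weight` example are assumptions made for illustration only; the thesis's actual weighted depth definitions and its conic-section weights may differ.

```python
import math

def halfspace_indicator(along, across):
    # Indicator of the closed halfspace {y : u.(y - x) >= 0}.
    return 1.0 if along >= 0 else 0.0

def band_weight(along, across, half_width=0.5):
    # Hypothetical localized weight (NOT taken from the thesis):
    # counts only points in a band of the given half-width around the ray from x along u.
    return 1.0 if along >= 0 and abs(across) <= half_width else 0.0

def depth(x, sample, weight=halfspace_indicator, n_directions=360):
    """Approximate depth of the 2-D point x: minimum weighted mass over a grid of directions."""
    best = float("inf")
    for k in range(n_directions):
        angle = 2 * math.pi * k / n_directions
        u = (math.cos(angle), math.sin(angle))
        total = 0.0
        for y in sample:
            dx, dy = y[0] - x[0], y[1] - x[1]
            along = dx * u[0] + dy * u[1]    # coordinate of y - x along u
            across = -dx * u[1] + dy * u[0]  # coordinate orthogonal to u
            total += weight(along, across)
        best = min(best, total / len(sample))
    return best

square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
center_depth = depth((0, 0), square)              # global view: halfspace depth
outside_depth = depth((10, 10), square)           # a point far outside the data
local_depth = depth((0, 0), square, band_weight)  # more local view of the data
```

With the indicator weight this reduces to the usual empirical halfspace depth restricted to the chosen directions; a narrower weight makes the measure more local.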
Some Robust Distances for Multivariate Data
Kalina, Jan ; Peštová, Barbora
Numerous methods of multivariate statistics and data mining suffer from the presence of outlying measurements in the data. This paper presents new distance measures suitable for continuous data. First, we consider a Mahalanobis distance suitable for high-dimensional data with the number of variables (largely) exceeding the number of observations. We propose its doubly regularized version, which combines a regularization of the covariance matrix with replacing the means of multivariate data by their regularized counterparts. We formulate explicit expressions for some versions of the regularization of the means, which can be interpreted as a denoising (i.e. a robust version) of standard means. Further, we propose a robust cosine similarity measure, which is based on implicit weighting of individual observations. We derive properties of the newly proposed robust cosine similarity, including a proof of its high robustness in terms of the breakdown point.
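The double regularization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' estimator: the covariance matrix is shrunk towards the identity with a hypothetical parameter `lam`, and the mean is shrunk towards the grand mean of its coordinates with a hypothetical parameter `delta`. The paper's explicit expressions may differ.

```python
import math

def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def sample_covariance(rows, center):
    n, p = len(rows), len(center)
    cov = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][k] * z[k] for k in range(i + 1, n))) / m[i][i]
    return z

def regularized_mahalanobis(x, rows, lam=0.5, delta=0.0):
    """Distance of x from the data, with assumed shrinkage of covariance and mean."""
    p = len(x)
    mean = mean_vector(rows)
    cov = sample_covariance(rows, mean)
    # Assumed covariance regularization: (1 - lam) * S + lam * I, invertible for lam > 0.
    reg_cov = [[(1 - lam) * cov[i][j] + (lam if i == j else 0.0) for j in range(p)]
               for i in range(p)]
    # Assumed mean regularization: shrink each coordinate towards the grand mean.
    grand = sum(mean) / p
    reg_mean = [(1 - delta) * m + delta * grand for m in mean]
    d = [x[j] - reg_mean[j] for j in range(p)]
    z = solve(reg_cov, d)
    return math.sqrt(sum(d[j] * z[j] for j in range(p)))

# Two observations in three dimensions: the sample covariance alone is singular.
rows = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
dist = regularized_mahalanobis([1.0, 2.0, 2.0], rows, lam=0.5)
```

The example uses more variables than observations, where the unregularized sample covariance cannot be inverted; any `lam > 0` makes the shrunk matrix positive definite.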
Volatility of selected separators/classifiers wrt. data sets from field of particle physics
Jiřina, Marcel ; Hakl, František
We study the volatility, i.e. the influence of random changes in data sets on the overall separation/classification behavior of separators/classifiers. This is motivated by the fact that simulated data and true data from the ATLAS experiment may differ, which raises the question of what happens when separators or cuts are optimized for simulated data and then used on true data from the experiment. This behavior was studied using simulated data modified by artificial distortions of known size. We found that even a slight change in the data sets gives slightly worse results than expected but, surprisingly, even relatively large distortions then give nearly the same results. Only very large variations degrade the separation quality of the separator/classifier as well as of the cuts method.
Full text (PDF): content.csg, v1126-11
Testing Random Forests for Unix and Windows
Jiřina, Marcel ; Jiřina jr., M.
Full text (PDF): content.csg, v1075-10
Klasifikátor založený na inverzních hodnotách indexů II. teorie a příloha [Classifier Based on Reciprocal Index Values II: Theory and Appendix]
Jiřina, Marcel ; Jiřina jr., M.
A theory of a new method for the classification of data into classes is presented. The method is based on the sum of reciprocals of neighbors' indexes. We show that neighbors' indexes are closely related to an approximate polynomial transform of the neighbors' distances. The sum of the reciprocals of indexes for all neighbors forms a truncated harmonic series, because the number of its elements is finite. For the neighbors of one class, the sum runs over selected elements of this truncated series. It is proved that the ratio of these sums gives exactly the probability that the point to be classified (the query point) belongs to that class.
Full text (PDF): content.csg, v1041-08
Klasifikátor založený na inverzních hodnotách indexů [Classifier Based on Reciprocal Index Values]
Jiřina, Marcel ; Jiřina jr., M.
A new method for the classification of data into classes is presented. The method is based on the sum of reciprocals of neighbors' indexes. We show that neighbors' indexes are closely related to a polynomial transform of the neighbors' distances. The sum of the reciprocals of indexes for all neighbors forms a truncated harmonic series, because the number of its elements is finite. For the neighbors of one class, the sum runs over selected elements of this truncated series. It is proved that the ratio of these sums gives exactly the probability that the point to be classified (the query point) belongs to that class. The classification ability is demonstrated on real-life data from the Machine Learning Repository, and the results are compared with published results obtained through other methods.
Full text (PDF): content.csg, v1034-08
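The classification rule summarized in the abstract above might be sketched as follows, reading it literally: all training points are ranked by distance to the query point, each contributes the reciprocal of its rank (its index), and the class probability is the class's share of the truncated harmonic sum. The function name and the use of Euclidean distance are assumptions; the report may use a different metric or tie handling.

```python
import math
from collections import defaultdict

def reciprocal_index_probabilities(query, points, labels):
    """Estimate class probabilities from sums of reciprocals of neighbor indexes."""
    # Rank all training points by (assumed Euclidean) distance to the query point.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    sums = defaultdict(float)
    for index, i in enumerate(order, start=1):
        sums[labels[i]] += 1.0 / index  # selected elements of the harmonic series
    total = sum(sums.values())          # truncated harmonic series 1 + 1/2 + ... + 1/N
    return {label: s / total for label, s in sums.items()}

points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
labels = ["a", "b", "a"]
probs = reciprocal_index_probabilities((0.1, 0.0), points, labels)
predicted = max(probs, key=probs.get)
```

In this toy example class "a" holds indexes 1 and 3 and class "b" holds index 2, so the estimates are (1 + 1/3) / (1 + 1/2 + 1/3) = 8/11 and 3/11.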
Analysis of Decay Processes Separation
Jiřina, Marcel ; Hakl, František
Full text (PDF): content.csg, v1035-08
Metoda váhované metriky s nehladkým procesem učení [Weighted Metric Method with a Non-Smooth Learning Process]
Jiřina, Marcel ; Jiřina jr., M.
A new approach to the Learning Weighted Metrics method for optimized classification of data with the 1-NN rule (Vidal) is proposed. The new approach is based on an updating rule similar to that of the Madaline neural network, and on dynamic optimization of the step size similar to Runge's half-step method. A short theory is given and the classification ability is demonstrated.
Full text (PDF): content.csg, v1026-08
