Řezanková, H. - Search Results - Digital Repository

	Measures for Classification Results Evaluation Řezanková, Hana ; Húsek, Dušan Full text: content.csg - PDF; v1220-15 - PDF
	Biologically inspired prototype-based models and the application of Gompertzian dynamics in cluster analysis [Biologicky inspirované modely založené na prototypech a aplikace gompertzovské dynamiky ve shlukové analýze] Pastorek, Lukáš ; Řezanková, Hana (advisor) ; Húsek, Dušan (referee) ; Nánásiová, Oľga (referee) The thesis deals with the analysis of clustering and mapping techniques derived from the principles of neural and statistical learning and growth theory. The selected branch of unsupervised bio-inspired prototype-based models is described in terms of a proposed logical framework, which highlights the continuity of these methods with the classical "pure" statistical methods. Moreover, because these methods are broadly regarded as "black boxes" with unpredictable, unclear and especially hidden behavior, examples of spatial computational and organizational patterns in two-dimensional space are provided. Additionally, the thesis presents a novel concept based on the non-linear, non-Gaussian Gompertzian function, which has been widely used as a universal law in dynamic growth models but has not yet been applied in the field of computational intelligence. The essence of Gompertzian dynamics is analyzed mathematically, and a novel simple version of the normalized Gompertzian function is introduced. Furthermore, the function is modified for use in artificial intelligence, and its neural implications are discussed. Novel neural networks are proposed, derived from the topological principles of Kohonen's self-organizing maps and the neural gas algorithm. The Gompertzian networks were evaluated using several indicators on various generated and real datasets. Gompertzian neural networks with a fixed grid and an integrated neighborhood ranking principle generally show lower mean squared errors than the original SOM algorithms. Likewise, the unconstrained Gompertzian networks demonstrated overall low error rates comparable to those of the neural gas algorithm, and more stable, lower-error solutions than the sequential k-means procedure. In conclusion, the Gompertzian function has been shown to be a viable concept and an effective computational tool for multidimensional data analysis.
	Analysis of Relations between Financial and Social Indicators in the Microfinance Sector Kovář, Radoslav ; Řezanková, Hana (advisor) ; Vrabec, Michal (referee) Quantitative reporting of both the financial and the social performance of microfinance institutions has progressed significantly in the last decade. The compatibility of financial and social goals in this sector has been evaluated in a growing number of papers, including this bachelor thesis. A description of the data and of the statistical methods (linear regression analysis and ANOVA) is followed by a detailed presentation of each model. The central part of the thesis consists of fitting the corresponding regressions and interpreting the results. The influence of certain social indicators on a typical cost structure and on financial sustainability, as well as the influence of certain financial indicators on social targeting, was verified. Investors and donors could base their decisions on these results, provided that the results are thoroughly cross-checked on larger data samples. The thesis also proves that all of the linear regression tests used share a common origin in the Wald test.
	Classical and recent approaches in cluster analysis Řezanková, Hana The paper focuses on the development of selected approaches in cluster analysis. It covers recently proposed similarity measures for objects characterized by nominal variables, the development of algorithms for k-clustering, and the development of methods for clustering large data files and categorical data. Among algorithms for k-clustering, attention is paid to those that take into account the uncertainty in classifying objects into clusters, namely the FCM (fuzzy k-means), PCM, FPCM, RCM, RFCM and RFPCM algorithms. For large data files, the CURE, ROCK, CLARA, CLARANS and BIRCH algorithms are included; for categorical data clustering, the COOLCAT and ROCK algorithms. Two-step cluster analysis, which clusters large data sets with variables of different types, is also mentioned.
	The analysis of dependence of the material deprivation of the households in the Czech Republic on the selected indicators Cafourková, Magdalena ; Řezanková, Hana (advisor) ; Pecáková, Iva (referee) The aim of this thesis is to analyse the material deprivation of households with regard to selected indicators: the costs that the household spends on housing, the region where the household is located, the number of members and of dependent children in the household, the age and sex of the head of the household, and the economic activity and education level of the members of the household. The thesis aims not only to demonstrate the dependence on the selected indicators but also to quantify it using the odds ratio. An individual effect was demonstrated for all variables except the number of dependent children. It was also shown that the factors that put households at risk of material deprivation vary across age groups. However, it can be concluded that across all age groups the material deprivation rate is determined by the sex of the head of the household, the education level of the members of the household, and the costs that the household spends on housing.
	Influence of unisex rates on the life insurance market Vondráčková, Hana ; Řezanková, Hana (advisor) ; Zemánek, Martin (referee) The diploma thesis deals with the possible reaction of life insurance companies to the legislative decree in which the European Union, in the name of gender equality, forbade the use of sex as a factor in setting different premiums for men and women. An analysis of the product pricing of five important insurance companies helps to show which risks this change covers, and to what extent. Because the ability to set the premium according to one of the most important parameters will be limited, this decision can be expected to have a significant influence on the whole insurance market and to indicate the future direction of life insurance. Decisions about business strategy are also influenced by other inputs, such as the position of the insurance company, its business network, etc. The most likely direction of market development is considered here to be a new or more distinctive segmentation of clients; hence, proposed new criteria that could help determine a fairer premium are tested on accessible data using hypothesis tests about relative frequencies.
	Comparison of selected classification methods for multivariate data Stecenková, Marina ; Řezanková, Hana (advisor) ; Berka, Petr (referee) The aim of this thesis is to compare selected classification methods: logistic regression (binary and multinomial), the multilayer perceptron, and the classification trees CHAID and CRT. The first part reviews the theoretical basis of these methods and explains the nature of the model parameters. The next section applies these classification methods to six data sets and compares their outputs. Particular emphasis is placed on evaluating the discriminatory power of the models, to which a separate chapter is devoted. The discriminatory power of a model is assessed by overall accuracy, the F-measure and the area under the ROC curve. The contribution of this work is not only a comparison of the selected classification methods based on statistical measures of discriminatory power, but also an overview of the strengths and weaknesses of each method.
	Evaluation of Cluster Analysis Methods Löster, Tomáš ; Řezanková, Hana (advisor) ; Berka, Petr (referee) ; Dohnal, Gejza (referee) Cluster analysis includes a range of methods and practices that are used primarily for the classification of objects. It plays an important role in many areas. Since the resulting assignment of objects to clusters may vary depending on the selected methods and specifications, it is appropriate to assess the results obtained. This work proposes new ways of evaluating these results when objects are characterized by qualitative variables or by variables of different types. The proposed coefficients can be used either to compare different methods (in terms of better outcomes) or to find the optimal number of clusters. All of them are based on measuring variability, which is also used for measuring the dissimilarity of objects and clusters. The newly proposed evaluation methods are applied to real data sets (of different sizes, with different numbers of variables, including variables of different types), and the behavior of the coefficients under different conditions is examined. The data sets include both known and unknown classifications of objects into clusters. Based on the analysis carried out, the modified CHF coefficient can be considered the best coefficient for evaluating clustering results with variables of different types. A local maximum, according to which the clustering results are evaluated, almost always exists. The analysis showed that in most cases this value agrees with the known classification of objects into clusters. The existence of local extremes of the other coefficients depends on the specific data set and is not guaranteed.
	Quality measures of classification models and their conversion Hanusek, Lubomír ; Hebák, Petr (advisor) ; Řezanková, Hana (referee) ; Skalská, Hana (referee) The predictive power of classification models can be evaluated by various measures. The most popular measures in data mining (DM) are the Gini coefficient, the Kolmogorov-Smirnov statistic and lift. Each of these measures is calculated in a completely different way. An analyst who is used to one of these measures may find it difficult to assess the predictive power of a model evaluated by another. The aim of this thesis is to develop a method for converting one performance measure into another. Although the thesis focuses mainly on the above-mentioned measures, it also deals with other measures such as sensitivity, specificity, total accuracy and the area under the ROC curve. When developing DM models, it may be necessary to work with a sample stratified by the values of the target variable Y instead of the whole population, which may contain millions of observations. If a model developed on stratified data is evaluated, these measures may need to be converted to the whole population. The thesis describes how to carry out this conversion. A software application (CPM) enabling all these conversions forms part of the thesis. With this application one can not only convert one performance measure to another, but also convert measures calculated on a stratified sample to the whole population. Besides the above-mentioned performance measures (sensitivity, specificity, total accuracy, Gini coefficient, Kolmogorov-Smirnov statistic), CPM also generates a confusion matrix and performance charts (lift chart, gains chart, ROC chart and KS chart). The thesis includes the user manual for this application as well as the web address from which the application can be downloaded. The theory described in the thesis was verified on real data.
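
As a small illustration of how such measures relate to one another (a sketch only, not the CPM application described in the abstract), the Gini coefficient of a scoring model and the area under the ROC curve are linked by the standard identity Gini = 2 * AUC - 1. The function names and the toy scores below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve as the share of (positive, negative) pairs
    ranked correctly; tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini_from_auc(a):
    """Convert AUC to the Gini coefficient."""
    return 2.0 * a - 1.0


def auc_from_gini(g):
    """Convert the Gini coefficient back to AUC."""
    return (g + 1.0) / 2.0


def ks_statistic(scores, labels):
    """Kolmogorov-Smirnov statistic: the largest gap between the empirical
    score distributions of the positive and the negative class."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    gaps = []
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        gaps.append(abs(cdf_pos - cdf_neg))
    return max(gaps)


# Toy data (hypothetical): higher score should mean "more likely positive".
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
a = auc(scores, labels)
g = gini_from_auc(a)
ks = ks_statistic(scores, labels)
```

Unlike the Gini/AUC pair, the KS statistic cannot be recovered from AUC by a fixed formula without further assumptions about the score distributions, which is why it is computed from the scores directly here.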
	Cluster analysis of large data sets: new procedures based on the method k-means Žambochová, Marta ; Řezanková, Hana (advisor) ; Húsek, Dušan (referee) ; Antoch, Jaromír (referee) Cluster analysis has become one of the main tools used in extracting knowledge from data, which is known as data mining. In this area of data analysis, large data sets are often processed, large both in the number of objects and in the number of variables that characterize the objects. Many methods for data clustering have been developed. One of the most widely used is the k-means method, which is suitable for clustering data sets containing a large number of objects. It is based on finding the best clustering relative to the initial distribution of objects into clusters, followed by a step-by-step redistribution of objects among the clusters according to the optimization function. The aim of this Ph.D. thesis was a comparison of selected variants of existing k-means methods, a detailed characterization of their positive and negative characteristics, the proposal of new alternatives of this method, and experimental comparisons with existing approaches. These objectives were met. In my work I focused on modifications of the k-means method for clustering a large number of objects, specifically on the BIRCH k-means, filtering, k-means++ and two-phase algorithms. I monitored the time complexity of the algorithms, the effect of the initial distribution and of outliers, and the validity of the resulting clusters. Two real data files and several generated data sets were used. The common and differing features of the methods under investigation are summarized at the end of the work. The main aim and contribution of the work is to devise my own modifications that address the bottlenecks of the basic procedure and of the existing variants, and to program and verify them. Some modifications accelerate the processing. Applying the main ideas of the k-means++ algorithm to other variants of the k-means method yielded better clustering results. The most significant of the proposed changes is a modification of the filtering algorithm, which adds an entirely new feature to the algorithm: the detection of outliers. An accompanying CD is enclosed. It includes the source code of programs written in the MATLAB development environment. The programs were created specifically for the purpose of this work and are intended for experimental use. The CD also contains the data files used for the various experiments.
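
To make the "main idea of k-means++" mentioned in this abstract concrete, here is a minimal sketch of the standard k-means++ seeding step, in which each new initial center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is written in Python for illustration and is not the thesis's MATLAB code; the function name `kmeans_pp_init` and the sample points are hypothetical.

```python
import random


def kmeans_pp_init(points, k, rng=None):
    """Pick k initial centers from `points` (tuples of floats) using
    k-means++ seeding. Assumes at least k distinct points are supplied."""
    rng = rng or random.Random(0)
    centers = [rng.choice(points)]          # first center: uniform at random
    while len(centers) < k:
        # Squared distance from every point to its nearest chosen center.
        d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
              for p in points]
        # Sample the next center with probability proportional to d2.
        r = rng.uniform(0.0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if w > 0 and acc >= r:          # skip points that are already centers
                centers.append(p)
                break
    return centers


# Hypothetical toy data: three loose groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0),
          (10.0, 11.0), (20.0, 0.0), (21.0, 0.0)]
centers = kmeans_pp_init(points, 3, random.Random(1))
```

The chosen centers would then be passed to an ordinary k-means iteration; spreading the seeds out in this way is what tends to improve the final clustering compared with purely uniform initialization.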
