National Repository of Grey Literature: 5 records found
Analysis of the Quality of life using cluster analysis and comparison with the Human Development Index
Pánková, Barbara ; Miskolczi, Martina (advisor) ; Langhamrová, Jana (referee)
Quality of life is a frequently discussed topic today. The term is defined ambiguously and inconsistently, since there is neither a universally accepted definition nor a theoretically sophisticated model. Despite this, the level of quality of life is currently one of the most discussed topics. Several international organizations monitor quality of life using a variety of indicators, among them the United Nations Development Programme. This organization annually publishes the Human Development Index, which divides the world's countries into four groups according to their level of development: low, medium, high and very high. The aim of this thesis is to analyze the quality of life in 125 countries using cluster analysis, specifically Ward's method. Quality of life is evaluated on the basis of 19 demographic and economic indicators, which include life expectancy, literacy rate, access to drinking water and infant mortality rate. The cluster analysis divided the countries into clusters according to their similarities. The analysis produced six clusters, which were then compared with the results of the Human Development Index. The clusters reflect very well the division commonly used to characterize developing and developed countries. Each of the six clusters can be described and characterized well in terms of quality of life. The clusters can also be labelled as the poorest developing, low developed, moderately developed, medium development, high development and very high development countries. Based on the results, it can be stated that this analysis is consistent with other indicators of quality of life and that the resulting clusters are identical to the commonly used division of countries.
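The abstract names Ward's method but does not show it. The following is a minimal sketch of the idea, assuming a tiny made-up data set of two standardized indicators per country (the values and the function names centroid, ward_cost and ward_clusters are hypothetical and are not taken from the thesis). At each step it merges the pair of clusters whose merger increases the within-cluster sum of squares the least.

```python
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * d2


def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's criterion, stopping at k clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Pick the pair of clusters that is cheapest to merge.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Hypothetical standardized indicator values for six countries.
countries = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0), (10.0, 0.0), (10.1, 0.2)]
groups = ward_clusters(countries, 3)
```

This naive version recomputes every merge cost on each pass, which is fine for a few hundred objects. For larger data a library implementation would be the more realistic choice.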
Clustering and regression analysis of micro panel data
Sobíšek, Lukáš ; Pecáková, Iva (advisor) ; Komárek, Arnošt (referee) ; Brabec, Marek (referee)
The main purpose of panel studies is to analyze changes in the values of studied variables over time. In micro panel research, a large number of elements are observed periodically within a relatively short period of just a few years, and the number of repeated measurements is small. This dissertation deals with contemporary approaches to the regression and clustering analysis of micro panel data. One approach to micro panel analysis is to take multivariate statistical models originally designed for cross-sectional data and modify them to account for within-subject correlation. The thesis summarizes the available tools for the regression analysis of micro panel data. The known and currently used linear mixed effects models for a normally distributed dependent variable are recapitulated. In addition, new approaches for the analysis of a response variable with a non-normal distribution are presented. These approaches include the generalized marginal linear model, the generalized linear mixed effects model and the Bayesian modelling approach. Besides describing these models, the thesis also gives a brief overview of their implementation in the R software. The difficulty with regression models adjusted for micro panel data is the ambiguity of their parameter estimation. This thesis proposes a way to improve the estimates through cluster analysis. For this reason, the thesis also describes methods for the cluster analysis of micro panel data. Because the supply of such methods is limited, the main goal of the thesis is to devise an original two-step approach for clustering micro panel data. In the first step, the panel data are transformed into a static form using a set of proposed characteristics of dynamics. These characteristics represent different features of the time course of the observed variables.
In the second step, the elements are clustered by conventional spatial clustering techniques (agglomerative clustering and C-means partitioning). The clustering is based on a dissimilarity matrix of the values of the clustering variables calculated in the first step. Another goal of the thesis is to find out whether the suggested procedure improves the quality of the regression models for this type of data. By means of a simulation study, the procedure proposed here is compared with the procedure applied in the kml package of the R software, as well as with the clustering characteristics proposed by Urso (2004). The simulation study demonstrated better results for the proposed combination of clustering variables than for the other combinations currently used. A corresponding script written in the R language is another benefit of the thesis. It is available on the attached CD and can be used for analyses of readers' own micro panel data.
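The two-step procedure described in this abstract can be sketched roughly as follows. The abstract does not list the proposed characteristics of dynamics, so the three used here (mean level, linear trend slope, mean absolute change) are illustrative assumptions only, as are the function names and the toy panel. Average-linkage agglomeration stands in for the "conventional" clustering step, and Python is used purely for illustration (the thesis script itself is written in R).

```python
def dynamics_features(series):
    """Step 1: collapse one short trajectory into static characteristics.

    The three characteristics are assumptions for illustration:
    mean level, least-squares slope over time, mean absolute change.
    """
    n = len(series)
    mean = sum(series) / n
    t_mean = (n - 1) / 2
    slope = sum((t - t_mean) * (y - mean) for t, y in enumerate(series)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    diffs = [b - a for a, b in zip(series, series[1:])]
    volatility = sum(abs(d) for d in diffs) / len(diffs)
    return (mean, slope, volatility)


def dissimilarity_matrix(features):
    """Euclidean distances between all pairs of feature vectors."""
    return [
        [sum((a - b) ** 2 for a, b in zip(f, g)) ** 0.5 for g in features]
        for f in features
    ]


def agglomerate(dist, k):
    """Step 2: average-linkage agglomerative clustering on a distance matrix."""
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical micro panel: four subjects, four waves each.
panel = [
    [1.0, 2.0, 3.0, 4.0],  # rising
    [1.1, 2.0, 3.1, 4.0],  # rising
    [5.0, 5.0, 5.0, 5.0],  # flat
    [5.1, 4.9, 5.0, 5.1],  # flat
]
features = [dynamics_features(s) for s in panel]
groups = agglomerate(dissimilarity_matrix(features), 2)
```

In practice the characteristics would usually be put on a common scale before the distances are computed, so that no single characteristic dominates the dissimilarity matrix.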
Cluster analysis as a tool for classification of objects
Budilová, Šárka ; Löster, Tomáš (advisor) ; Šulc, Zdeněk (referee)
Cluster analysis is a popular method of multivariate statistics. Based on mutual similarities between objects, this method classifies and divides objects into several groups or clusters. The results of clustering can differ depending on the methods, distance measures and procedures used. The main aim of this thesis is to compare the results of several methods of cluster analysis with the known classification of classes from the original data file. In total, 15 data files were analyzed, each of which contained known information about the correct allocation of objects to groups. The success of each clustering method was calculated by comparing the known classification with the resulting clusters. In addition to comparing individual methods of cluster analysis, the thesis compares the impact of standardization and correlation on the success of each method. To reflect the distance between the objects within each cluster, squared Euclidean distance was used. The results of this thesis indicate that clustering was more successful when correlated variables were left in the data file. The success of clustering was about 2 percentage points higher than when correlated variables were deleted from the data set. The methods correctly classified 69.8% of objects before standardization and 70.8% of objects after standardization. The results also show that standardization is very important for Ward's method. After standardization, this method assigned the most objects to the correct classes and was about nine percentage points more successful. With correlated variables left in the data, the success of the method is 76.4%. Standardization also positively influences the centroid method and the farthest neighbour method. The median method, the nearest neighbour method and the average linkage method achieve higher clustering success with the original, non-standardized variables (uneven variables).
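The abstract does not say exactly how the "success" of a clustering was computed. One common convention, assumed here, is to credit each cluster with its majority class and report the share of objects that fall in their cluster's majority class. The sketch below also shows z-score standardization of a single variable. Both function names and the sample labels are hypothetical.

```python
from collections import Counter


def clustering_success(true_labels, cluster_ids):
    """Share of objects that belong to the majority class of their cluster.

    This majority-class matching rule is an assumption; the thesis may
    define success differently.
    """
    by_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        by_cluster.setdefault(cluster, []).append(label)
    correct = sum(
        Counter(members).most_common(1)[0][1] for members in by_cluster.values()
    )
    return correct / len(true_labels)


def standardize(column):
    """Z-score standardization using the sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((x - mean) ** 2 for x in column) / (n - 1)) ** 0.5
    return [(x - mean) / sd for x in column]


# Hypothetical known classes and cluster assignments for six objects.
known = ["a", "a", "a", "b", "b", "b"]
assigned = [0, 0, 1, 1, 1, 1]
rate = clustering_success(known, assigned)  # five of six objects match
```

Under this convention, a comparison like the one in the abstract would run each method on the raw and on the standardized variables and compare the two rates.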
Discriminant and cluster analysis as a tool for classification of objects
Rynešová, Pavlína ; Löster, Tomáš (advisor) ; Řezanková, Hana (referee)
Cluster and discriminant analysis are among the basic classification methods. Cluster analysis can organize a disordered group of objects into several internally homogeneous classes or clusters. Discriminant analysis uses knowledge of the objects' membership in existing classes to create a classification rule, which can then be used to classify units with unknown group membership. The aim of this thesis is to compare discriminant analysis with different methods of cluster analysis. To reflect the distances between objects within each cluster, squared Euclidean and Mahalanobis distances are used. In total, 28 datasets are analyzed in this thesis. When correlated variables were left in the set and squared Euclidean distance was applied, Ward's method classified objects into clusters the most successfully (42.0%). After the metric was changed to the Mahalanobis distance, the furthest neighbor method became the most successful (37.5%). After highly correlated variables were removed and methods with the Euclidean metric were applied, Ward's method was again the most successful in classifying objects (42.0%). The results imply that cluster analysis is more precise when correlated variables are excluded than when they are left in the dataset. The average result of discriminant analysis, both for data with correlated variables and for data without them, is 88.7%.
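To illustrate the difference between the two distance measures named in this abstract, here is a minimal sketch restricted to two variables, so that the covariance matrix can be inverted by hand. The function names are hypothetical and the numbers are made up. The Mahalanobis distance rescales differences by the sample covariance, so it reduces to the ordinary Euclidean distance when the covariance matrix is the identity.

```python
def squared_euclidean(x, y):
    """Sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def covariance_2d(data):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    sxx = sum((p[0] - mx) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in data) / (n - 1)
    return sxx, sxy, syy


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance sqrt(d^T S^-1 d) for a 2x2 covariance S."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2  # must be non-zero for S to be invertible
    dx, dy = x[0] - y[0], x[1] - y[1]
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 ** 0.5


# With an identity covariance, Mahalanobis equals Euclidean distance.
same_as_euclid = mahalanobis_2d((0, 0), (3, 4), (1.0, 0.0, 1.0))
# A variable with variance 4 counts half as much per unit of difference.
rescaled = mahalanobis_2d((0, 0), (2, 0), (4.0, 0.0, 1.0))
```

Because the covariance term already accounts for correlation between variables, the Mahalanobis distance treats correlated variables differently from the squared Euclidean distance.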
Evaluating the success of cluster analysis methods
Maršálková, Kateřina ; Löster, Tomáš (advisor) ; Makhalova, Elena (referee)
Cluster analysis is one of the classification methods of multivariate statistical analysis. Its task is to classify objects into clusters so that the objects inside each cluster are as similar as possible. The aim of this study is to evaluate the success of classifying objects using six hierarchical cluster analysis methods. To reflect the distance between the objects, squared Euclidean and Mahalanobis distances are used. The success of the methods is evaluated using information about which cluster each object belongs to, which is already contained in the data files. The thesis shows that Ward's method is one of the most successful hierarchical methods for classifying objects into clusters. This method was more successful in sorting objects than the other hierarchical methods, both when the correlated variables were left in the data file and when they were removed. The results of this work show that the classification of objects into clusters is most successful when the data set is cleaned of correlated variables. If the data file is not cleaned, the methods achieve better results when the distance between objects is measured by the Euclidean metric.