National Repository of Grey Literature
Cluster analysis as a tool for classification of objects
Budilová, Šárka ; Löster, Tomáš (advisor) ; Šulc, Zdeněk (referee)
Cluster analysis is a popular method of multivariate statistics. Based on mutual similarities between objects, it classifies objects into several groups, or clusters. The results of clustering can differ depending on the method, distance measure and procedure used. The main aim of this thesis is to compare the results of several cluster analysis methods with the known classification of objects in the original data files. In total, 15 data files were analyzed, each containing known information about the correct allocation of objects to groups. The success of each method was calculated by comparing the known classes with the resulting clusters. In addition to comparing the individual methods, the thesis examines the impact of standardization and of correlation between variables on the success of each method. Squared Euclidean distance was used to measure the distance between objects within each cluster. The results indicate that clustering was more successful when correlated variables were kept in the data file: the success rate was about 2 percentage points higher than when correlated variables were deleted from the data set. The methods correctly classified 69.8% of objects before standardization and 70.8% after standardization. The results also show that standardization is very important for Ward's method. After standardization, this method assigned the most objects to the correct classes and was about nine percentage points more successful. With correlated variables, the success rate of the method is 76.4%. Standardization also has a positive effect on the centroid method and the farthest neighbour method. The median method, the nearest neighbour method and the average linkage method achieve higher clustering success with the original, non-standardized variables (uneven variables).
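The abstract does not say exactly how the success of clustering was computed, so the following is only a minimal sketch of the three ingredients it names: standardization, squared Euclidean distance, and a comparison of known classes with resulting clusters. The function names (`standardize`, `squared_euclidean`, `clustering_success`) are hypothetical, and the success measure shown (share of correctly placed objects under the best one-to-one mapping of clusters to classes) is an assumption, not necessarily the thesis's definition.

```python
from itertools import permutations
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every variable (column); constant columns are set to 0."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, mus, sds)]
        for row in rows
    ]


def squared_euclidean(a, b):
    """Squared Euclidean distance between two objects."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def clustering_success(true_classes, cluster_labels):
    """Share of objects placed correctly under the best one-to-one
    mapping of cluster labels to known classes (assumed measure)."""
    classes = sorted(set(true_classes))
    clusters = sorted(set(cluster_labels))
    if len(clusters) > len(classes):
        raise ValueError("more clusters than known classes")
    best_hits = 0
    # Try every assignment of distinct classes to the clusters.
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[c] == t
                   for c, t in zip(cluster_labels, true_classes))
        best_hits = max(best_hits, hits)
    return best_hits / len(true_classes)


# Toy data, invented purely for illustration.
known = ["a", "a", "a", "b", "b", "b"]
found = [2, 2, 1, 1, 1, 1]
print(clustering_success(known, found))
```

The exhaustive search over label mappings is only practical for a small number of classes; it is used here because it needs nothing beyond the standard library.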
