National Repository of Grey Literature: 511 records found (records 470 - 479)
Practical applications of data mining technologies in health insurance companies
Kulhavý, Lukáš ; Pour, Jan (advisor) ; Kučera, Petr (referee)
This thesis focuses on data mining technology and its possible practical use in health insurance companies. The thesis defines the term data mining and its relation to the term knowledge discovery in databases. Data mining is explained, inter alia, through the methodologies that describe the individual phases of the knowledge discovery process (CRISP-DM, SEMMA). The thesis also covers possible practical applications and the technologies and products available on the market (both free and commercial). An introduction to the main data mining methods and specific algorithms (decision trees, association rules, neural networks and other methods) serves as the theoretical basis on which the practical applications to real data from real health insurance companies are built. These applications seek the causes of increased remittances and predict churn. I solved these tasks in the freely available systems Weka and LISP-Miner. The objective is to introduce and demonstrate data mining capabilities on this type of data, and to demonstrate the capabilities of Weka and LISP-Miner in solving tasks according to the CRISP-DM methodology. The last part of the thesis is devoted to cloud and grid computing in conjunction with data mining. It offers an insight into the possibilities of these technologies and their benefits for data mining. The possibilities of cloud computing are presented on the Amazon EC2 system; grid computing can be used through the Weka Experimenter interface.
Data Mining Applications in Marketing
Ďurkovský, Jaroslav ; Novotný, Ota (advisor) ; Maryška, Miloš (referee)
In my work I deal with data mining and its use in the commercial sphere. Specifically, I focus on marketing and sales forecasting. The aim of my work was first to gather knowledge about data mining and then to use it to create sales forecasts with the data mining add-on for Excel. In the first part I gather theoretical information about data mining, focusing on its definition, methodology, algorithms and most frequent uses. The second part consists of the practical application of the acquired knowledge. I focus on producing a sales forecast for the company HERO CZECH, using the data mining add-on for Microsoft Excel 2007. The results are compared with real forecasts prepared by the Key Account Manager. The results showed that forecasts from the data mining add-on for Excel were not more accurate than the existing ones from the Key Account Manager. Nevertheless, I believe that data mining methods have a place in preparing the forecast, at least as a means of support.
Post-processing of association rules by multicriterial clustering method
Kejkula, Martin ; Rauch, Jan (advisor) ; Berka, Petr (referee) ; Máša, Petr (referee)
Association rule mining is one of several approaches to knowledge discovery in databases. Paradoxically, data mining itself can produce such large numbers of association rules that a new knowledge management problem arises: thousands or even more association rules can easily hold in a data set. The goal of this work is to design a new method for post-processing association rules. The method should be software and domain independent. Its output should be a structured description of the whole set of discovered association rules, and it should help the user work with the discovered rules. The approach I took to reach this goal is to split the association rules into clusters. Each cluster should contain rules that are more similar to each other than to rules from another cluster. The output of the method is the definition and description of these clusters. The main contribution of this Ph.D. thesis is the new Multicriterial clustering association rules method described here. A secondary contribution is the discussion of already published methods for post-processing association rules. The output of the new method is a set of rule clusters that cannot be obtained by any of the earlier post-processing methods. According to user expectations, the clusters are more relevant and more effective than any earlier association rule clustering results. The method is based on two orthogonal clusterings of the same set of association rules. One clustering is based on interestingness measures (confidence, support, interest, etc.). The second clustering is inspired by document clustering in information retrieval. Representing rules as vectors, in the same way as documents, is fundamental to this thesis. The thesis is organized as follows. Chapter 2 identifies the role of association rules in the KDD (knowledge discovery in databases) process, using KDD methodologies (CRISP-DM, SEMMA, GUHA, RAMSYS). Chapter 3 defines the association rule and introduces characteristics of association rules (including interestingness measures). Chapter 4 introduces current methods for post-processing association rules. Chapter 5 is an introduction to cluster analysis. Chapter 6 describes the new Multicriterial clustering association rules method. Chapter 7 consists of several experiments. Chapter 8 discusses possible uses and further development of the new method.
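For readers unfamiliar with the interestingness measures named in the abstract above, the following is a minimal sketch of how support, confidence and lift (the last is often called "interest" in the literature) can be computed for a single rule. It is not taken from the thesis: the function names and the toy transactions are assumptions made purely for illustration.

```python
from typing import Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimated P(consequent | antecedent); assumes the antecedent occurs at least once."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions: List[Set[str]],
         antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """Confidence divided by the baseline frequency of the consequent."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Hypothetical toy data: four market baskets.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]

print(support(transactions, {"a", "b"}))       # support of the rule a -> b
print(confidence(transactions, {"a"}, {"b"}))  # confidence of a -> b
print(lift(transactions, {"a"}, {"b"}))        # lift (interest) of a -> b
```

A vector of such measures per rule is one plausible input for a clustering of rules by interestingness; how the thesis actually builds its vectors is not stated in the abstract.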
Quality measures of classification models and their conversion
Hanusek, Lubomír ; Hebák, Petr (advisor) ; Řezanková, Hana (referee) ; Skalská, Hana (referee)
The predictive power of classification models can be evaluated by various measures. The most popular measures in data mining (DM) are the Gini coefficient, the Kolmogorov-Smirnov statistic and lift. Each of these measures is calculated in a completely different way. An analyst who is used to one of these measures may find it difficult to assess the predictive power of a model evaluated by another. The aim of this thesis is to develop a method for converting one performance measure into another. Although the thesis focuses mainly on the measures mentioned above, it also deals with other measures such as sensitivity, specificity, total accuracy and area under the ROC curve. When developing DM models, you may need to work with a sample stratified by the values of the target variable Y instead of the whole population containing millions of observations. If you evaluate a model developed on stratified data, you may need to convert these measures to the whole population. This thesis describes a way to carry out this conversion. A software application (CPM) that performs all these conversions is part of the thesis. With this application you can not only convert one performance measure to another, but also convert measures calculated on a stratified sample to the whole population. Besides the performance measures mentioned above (sensitivity, specificity, total accuracy, Gini coefficient, Kolmogorov-Smirnov statistic), CPM also generates a confusion matrix and performance charts (lift chart, gains chart, ROC chart and KS chart). The thesis includes the user manual for this application as well as the web address from which the application can be downloaded. The theory described in the thesis was verified on real data.
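To make the measures in the abstract above concrete, here is a minimal sketch that computes the Gini coefficient and the Kolmogorov-Smirnov statistic from the same ROC curve, and re-weights total accuracy for a different share of positives. It relies only on standard identities (Gini = 2·AUC − 1; KS = the largest gap between true-positive and false-positive rate; accuracy = sensitivity·p + specificity·(1 − p)). It is not the conversion method or the CPM application from the thesis, and all names and data are illustrative assumptions.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """(FPR, TPR) points obtained by lowering the threshold; tied scores move together."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def gini(points: List[Tuple[float, float]]) -> float:
    return 2 * auc(points) - 1


def ks(points: List[Tuple[float, float]]) -> float:
    return max(abs(tpr - fpr) for fpr, tpr in points)


def accuracy_for_prevalence(sensitivity: float, specificity: float, p: float) -> float:
    """Total accuracy in a population where a share p of cases is positive."""
    return sensitivity * p + specificity * (1 - p)


# Hypothetical scores and true classes (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
print(gini(pts), ks(pts))
print(accuracy_for_prevalence(0.8, 0.9, 0.1))
```

Sensitivity and specificity are each computed within one class of the target, so stratifying by the target leaves them unchanged in expectation; accuracy depends on the class mix, which is why it has to be re-weighted for the whole population.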
An Empirical Comparison of Commercial Data Mining Tools
Faruzel, Petr ; Berka, Petr (advisor) ; Máša, Petr (referee)
The presented work "An Empirical Comparison of Commercial Data Mining Tools" deals with data mining tools from the world's leading software providers of statistical solutions. The aim of this work is to compare the commercial packages IBM SPSS Modeler and SAS Enterprise Miner in terms of their specification and utility against a chosen set of evaluation criteria. This goal is pursued through a detailed analysis of selected features of the surveyed software packages as well as through their application to real data. The comparison is founded on 29 component criteria which reflect users' requirements regarding functionality, usability and flexibility of the system. The pivotal part of the comparison is based on applying the surveyed data mining tools to data concerning meningoencephalitis. The results are used to evaluate their performance when analyzing small and large data sets. The quality of the developed models and the time needed to derive them are reported for six comparable data mining techniques for classification. IBM SPSS Modeler is better suited to small data: although it produces slightly less accurate models, their development times are much shorter. Increasing the amount of data changes the situation in favor of its competitor. SAS Enterprise Miner achieves better results when analyzing large data: considerably more accurate models are accompanied by slightly shorter development times. The functionality of the surveyed data mining tools is comparable, whereas their usability and flexibility differ. IBM SPSS Modeler offers apparently better usability and learnability. Users of SAS Enterprise Miner have a slightly more flexible data mining tool at hand.
Implementation of social network services in company
Chernenko, Nina ; Šedivá, Zuzana (advisor) ; Žid, Norbert (referee)
The aim of this work is to introduce the reader to existing social networks, to analyze them by category, and to indicate the possibilities of using social network services for businesses. To reach this aim, each social network described in this work was analyzed in detail and the services it offers were tested. The benefit of the work is a survey analysis of social networks according to their types and of the possibilities of using social network services for commercial purposes. The work is complemented by a case study in which the company Regabus is introduced on the social network Facebook.
Business Intelligence principles and their use in questionnaire investigation
Hanuš, Václav ; Maryška, Miloš (advisor) ; Novotný, Ota (referee)
This thesis is oriented toward the practical use of tools for data mining and business intelligence. The main goals are to process source data into a suitable form and to try out a chosen tool on a test case. As input data I used a database created by processing forms from a survey that examined the level of IT and economics knowledge at Czech universities. These data were modified into a form that allows them to be processed by the data mining tools included in Microsoft SQL Server 2008. I chose two cases to verify the potential of these tools. The first case focused on clustering using the Microsoft Clustering algorithm. The main task was to sort the universities into clusters by comparing their attributes, which were the numbers of credits in each knowledge group. I had to deal with two problems. It was necessary to reduce the number of subject groups, otherwise there was a danger of creating too many clusters that I could not name. The other problem was the unequal value of credits in each group, which in turn caused a problem with the weights of these groups. The solution was in the end quite simple. I merged similar groups into larger ones with a more general category. For the unequal values, I used a parameter for each new group and transformed it to a scale of 0-5. The second case focused on a prediction task using the Microsoft Logistic Regression algorithm and the Microsoft Neural Network algorithm. Here the goal was to predict the number of currently enrolled students. I had historical data from the years 2001-2009. A predictive model was built on them and I could compare the prediction with real data. In this case it was also necessary to transform the source data, otherwise the tested tool could not process it. The original data were stored in a view instead of a table and contained not only the desired objects but other types as well, for example objects divided by sex. The solution was to create a new table in the database containing only the objects relevant to the test case. The last problem came up when I tried to use the prediction model to predict data for the year 2010, for which there was no real data in the table. The software reported an error and could not make the prediction. While searching Microsoft technical support, I found some threads that refer to a similar problem, so it is possible that this is a system error that will be fixed in a forthcoming update. Completing these cases gave me enough information to determine the abilities of these Microsoft tools. Given my earlier school experience with data mining tools from IBM (formerly SPSS) and SAS, I can judge whether the tested tools match the software from the major data mining suppliers on the market and whether they can be used for serious deployment.
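The abstract above mentions transforming unequal credit totals to a 0-5 scale but does not say how. One common way to do this is min-max scaling; the sketch below assumes that technique, and the function name and sample values are hypothetical rather than taken from the thesis.

```python
from typing import List


def rescale_0_5(values: List[float]) -> List[float]:
    """Min-max scale `values` linearly onto the interval [0, 5].

    Assumption: if all values are equal there is no spread to scale,
    so every value is mapped to 0.0.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) * 5 for v in values]


# Hypothetical credit totals of three universities in one subject group.
print(rescale_0_5([10, 20, 30]))
```

Scaling every group to the same range keeps a group with large raw credit totals from dominating the distance computation in clustering.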
Approaches to document digitalization solutions.
Novotný, Vladimír ; Šedivá, Zuzana (advisor) ; Benáčanová, Helena (referee)
The objective of this thesis is to provide a survey of document digitalization and to analyse the market of companies that provide document digitalization as an outsourced service. The first part of the thesis describes scanning technology and the methods of document recognition and data mining. It also describes the barcode systems used to identify documents. Furthermore, the thesis covers the principles of document storage and electronic (digital) signature issues from the viewpoint of Czech legislation. Its contribution lies in analysing the companies that provide outsourced document digitalization and in presenting the viewpoint of a company using these services. A brief outlook on the future of this topic is included as well.
Business Intelligence analysis of pharmacy Alfa in the city of Nymburk and analysis of trends in sales of over-the-counter medicine.
Vítek, Pavel ; Novotný, Ota (advisor) ; Červinková, Miroslava (referee)
This thesis deals with the business analysis of a real company, a pharmacy operating in the market environment of the city of Nymburk. The main focus is on examining the current position of the pharmacy in the local market in the context of a new competitor entering the market. The thesis is divided into two parts. The first part is a short theoretical introduction to the methods used and the general background of the market in the city of Nymburk. The practical part that follows analyzes the business of the company and the development of sales of over-the-counter medicine in the context of a new competitor entering the market during the period examined. The following methods are applied to achieve the main goals of the thesis. SWOT analysis is used to discover the strengths and weaknesses of the company itself and to define threats and opportunities based on the market environment. The Balanced Scorecard method is then used to design the Key Performance Indicators for measuring and observing the company's performance and development. Finally, the data mining methods of shopping basket analysis, segmentation and forecasting are used to analyse trends in over-the-counter medicine sales. All these methods are keystones for formulating a basic concept for future strategic and tactical decisions.
Data preprocessing system for knowledge discovery in databases
Kotinová, Hana ; Berka, Petr (advisor) ; Šimůnek, Milan (referee)
The aim of this diploma thesis was to create an application for data preprocessing. The application works with files in CSV format and is useful for preparing data when solving data mining tasks. The application was written in the Java programming language. This text discusses the problems, solutions and algorithms associated with data preprocessing, and discusses similar systems such as Mining Mart and SumatraTT. A complete user guide for the application is provided in the main part of this text.
