National Repository of Grey Literature: 55 records found, showing records 21 - 30
Datamining in data from financial institution
Fedorko, Michal ; Rauch, Jan (advisor) ; Kotlář, Ondřej (referee)
The main purpose of this paper is to create a data mining analysis of voluntary terminations in a financial institution based in the Czech Republic, using data stored by the HR department. Only the hard data currently stored were used as input. The analysis followed the CRISP-DM methodology, and the LISp-Miner system was used for the modeling itself. Association rules were the main approach to solving the task. Several interesting association rules were found and interpreted. The outcome of the paper is intended for the customer's internal campaigns; it also motivates further predictive modeling, for which this work was the first step.
Datamining on publicly accessible data
Pangrác, Jiří ; Rauch, Jan (advisor) ; Chudán, David (referee)
This bachelor thesis deals with data mining methods applied to publicly accessible data. Data mining is a technique for mining potentially interesting relations from data. The analysis is carried out on publicly accessible data provided by Česká obchodní inspekce, the Czech office for trade inspection. I try to find possible answers to several analytical questions. For the analysis itself, the LISp-Miner system was used, focusing on the 4ft-Miner and CF-Miner procedures. Besides the actual analysis, this thesis includes a brief description of the LISp-Miner system and of data mining in general. The main goal of this work is to present the results for possible practical use.
Creating a weather prediction model using a neural network and association rules (Vytvoření predikčního modelu předpovědi počasí pomocí neuronové sítě a asociačních pravidel)
Kadlec, Jakub ; Rauch, Jan (advisor) ; Berka, Petr (referee)
This diploma thesis introduces three different methods of creating a neural network binary classifier for automated weather prediction, with attribute pre-selection based on association rules and correlation pattern mining in the LISp-Miner system. The first part of the thesis collects the theoretical knowledge needed to create such a predictive model, whereas the second part describes the creation of the model itself using the CRISP-DM methodology. The final part of the thesis analyses the performance of the created classifiers and draws conclusions about the proposed methods and their possible benefits over training the network without attribute pre-selection.
Options of presentation of KDD results on Web
Koválik, Tomáš ; Rauch, Jan (advisor) ; Šimůnek, Milan (referee)
This diploma thesis covers KDD analysis of data and options for presenting KDD results on the Web. The paper is divided into three main sections, which follow the course of the work. The first section presents the theoretical basics needed to understand the problem. It describes the notions of data matrix and domain knowledge, the concept of the CRISP-DM methodology, the GUHA method, the LISp-Miner system, and the implementation of the GUHA method in LISp-Miner, including a description of the core procedures 4ft-Miner and CF-Miner. The second section is dedicated to the first goal of this paper. It briefly summarizes the analysis made during the pre-analysis phase and then describes the process of analysing domain knowledge in a given data set. The third section focuses on the second goal of this thesis, the problem of presenting KDD results on the Web. It covers a brief theoretical basis for the technologies used. It then describes the development of an export script that automatically generates a website from results found using the LISp-Miner system, including a description of the structure of the output and recommendations for work in the LISp-Miner system.
Automation of a data mining process in the road accidents data from London by the LISp-Miner system
Soukup, Tomáš ; Rauch, Jan (advisor) ; Vojíř, Stanislav (referee)
This thesis focuses on automated data mining and describes the steps involved in solving analytical questions with the LISp-Miner system on data containing road accident records. The analytical tasks were created primarily on the basis of domain knowledge from road accident statistics in Great Britain and of a previous analysis in my semestral project. The aim of this thesis is to create an automated data mining process that analyzes the input data by applying the 4ft-Miner, Ac4ft-Miner and SD4ft-Miner procedures and looks for new knowledge for every single year of the analyzed period. The implementation language is LMCL, which enables the functionality of the LISp-Miner system to be used in an automated way. The created scripts could be used to analyse another dataset with the same structure or, with some manual changes to the initial parameters, quite different data.
Use of data mining techniques for open data
Prokůpek, Miroslav ; Rauch, Jan (advisor) ; Chudán, David (referee)
This diploma thesis examines applications of data mining methods to open data. It does so by solving analytical questions using the LISp-Miner system. The analytical questions are examined in data from the Czech Trade Inspection Authority from the perspective of the data owner. The procedure used to solve the analytical questions is 4ft-Miner. Four analytical questions are presented and resolved; their answers are the results of the work. The work includes a detailed description of the transformation of the relational database into a format suitable for data mining. A detailed description of the data is also included. The theoretical part deals with the GUHA method and the CRISP-DM methodology.
Towards Complex Data and Information Quality Management
Pejčoch, David ; Rauch, Jan (advisor) ; Máša, Petr (referee) ; Novotný, Ota (referee) ; Kordík, Pavel (referee)
This work deals with the issue of Data and Information Quality. It critically assesses the current state of knowledge within various methods used for Data Quality Assessment and Data (Information) Quality improvement. It proposes new principles where this critical assessment revealed gaps. The main idea of this work is the concept of Data and Information Quality Management across the entire universe of data. This universe represents all data sources that the respective subject comes into contact with and that are used within its existing or planned processes. For all these data sources, the approach considers setting a consistent set of rules, policies and principles with respect to the current and potential benefits of these resources, while also taking into account the potential risks of their use. A common thread running through the text is the importance of additional knowledge within the process of Data (Information) Quality Management. The introduction of a knowledge base oriented to support the Data (Information) Quality Management (QKB) is therefore one of the fundamental principles of the author's proposed set of best
Analysis of the real data from the restaurant sector
Šimeček, Petr ; Rauch, Jan (advisor) ; Šimůnek, Milan (referee)
The aim of this thesis is to analyze real data from the restaurant sector in the center of Prague, verify assumptions based on existing knowledge, and explore hidden relations. The MySQL database management system was used for the initial transformation of the original data structure. After the transformation, the data were converted into a form that could be processed with the LMDataSource procedure of the LISp-Miner system. The 4ft-Miner procedure of the LISp-Miner system was used to analyze association relations. The MySQL database system was used for the frequency analysis, and Microsoft Word and Excel were used to interpret the results. Some of the assumptions in the research were confirmed. Furthermore, an interesting combination of relations was discovered. The output of this work allows the owner of the data to use some of the analysis results to optimize internal processes. In addition, this study points out other possible ways to analyze these data.
Analysis of data concerning suicide risk among the mentally ill (Analýza dat týkajících se risku sebevraždy u mentálně nemocných)
Hron, Jiří ; Rauch, Jan (advisor) ; Malá, Ivana (referee)
The three goals of this thesis are to present a coherent overview of the research on suicide in both the general population and among the mentally ill; to analyse records of hospitalisations of the mentally ill from 2006 to 2012 while looking for patterns that either lead to the identification of suicide risk factors or are useful for predicting the probability of suicide at the time of discharge; and finally to compare a selected subset of statistical, data mining and machine learning methods with respect to their applicability to the second goal. The overview is based on information from over 40 published articles. The analysis and the comparison make use of association rule mining, visual and stepwise methods for exploration, standard and conditional logistic regression models for inference, and variations of random forests for prediction. To the best of the author's knowledge, none of the three goals was previously pursued by any other researcher in the Czech Republic, certainly not using the data set provided for the purposes of this thesis. A new modification of random forest, combined with a set of logistic regression models in order to refine prediction accuracy, is also briefly explored. The structure closely follows the above-stated goals, starting with the chapters on related work and on the theoretical basis of the methods used, and concluding with the analysis itself and a discussion of its results.
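Comparison of selected tools for knowledge discovery in databases with respect to their implementation of association rules (Srovnání vybraných nástrojů dobývání znalostí z databází z hlediska implementace asociačních pravidel)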
Lízler, Robert ; Nekvapil, Viktor (advisor) ; Rauch, Jan (referee)
This bachelor thesis compares two selected data mining software tools: LISp-Miner, developed at the Department of Information and Knowledge Engineering at the Faculty of Informatics and Statistics of the University of Economics, Prague, and RapidMiner, a globally popular software suite. The focus of the comparison is mining for association rules. The aim of this work is first to provide a user-oriented evaluation of how the software tools compare in the selected area, and second to attempt to discover interesting differences in how the software tools implement association mining procedures and in the results they produce. To reach these goals, the author tests and evaluates the software tools according to a selected set of criteria, grouped into 3 categories: functionality, usability and performance. The work is structured as follows: chapter 1 sets out some theoretical background and introduces the software tools, chapter 2 evaluates the available functionality of the tools for the various steps of the overall association mining procedure, chapter 3 rates the software tools based on their usability and user-friendliness, and chapter 4 summarises the results of testing the software tools on a selected data set.
