National Repository of Grey Literature: 11 records found (1 - 10)
Analysis of Stock Market Indices and Regimes on Commodity Markets
Kuchina, Elena ; Cahlík, Tomáš (advisor) ; Máša, Petr (referee) ; Lukáčik, Martin (referee)
The thesis focuses on identifying typical scenarios of the mutual relations among stock markets under different regimes on the commodity markets, and investment recommendations are suggested for the identified scenarios. Considering the different regimes the commodity markets go through and the mutual linkage among the stock markets during different situations on the commodity markets, six scenarios of the stock markets' mutual relations have been analyzed. It was shown that during the most unstable period, when a highly volatile regime prevails simultaneously on the energy, precious metals and non-energy commodity markets, the whole economy becomes more tightly linked: the stock market indices demonstrate stronger interdependence, and as a consequence the benefits of diversification begin to fail. During the simultaneous presence of low volatility on all three analyzed commodity markets, the agreement between occurrences of a highly volatile state across most stock markets, apart from the indices within the European region (DAX, CAC 40, IBEX 35), is rather weak. Similarly, the correlation within regions and with other regions is weaker compared with other situations on the commodity markets, so the standard investment strategy can be kept. It was also shown that the interdependence among the stock markets during periods of high volatility on the energy market differs depending on the source of the oil price shocks causing the higher volatility. The regimes prevailing on the commodity and stock markets during different time periods have been detected by applying Hidden Markov Model methodology. To examine the similarity between the stock market indices in terms of occurrences of highly volatile regimes, Jaccard's similarity coefficient is employed. The correlation among the stock markets was computed using the Spearman correlation coefficient. The final part of the research is devoted to a model-based approach used to analyze the dependence of the movement direction of the SSEC index on other stock market indices between two trading days during different situations on the commodity markets. The dependency analysis was performed by applying Stochastic Gradient Boosting methodology.
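As an illustration of the methods named in this abstract (regime detection with a Hidden Markov Model, Jaccard similarity of high-volatility occurrences, Spearman correlation), here is a minimal Python sketch. It is not taken from the thesis: it assumes the hmmlearn and scipy packages, and the two synthetic return series merely stand in for real index data.

```python
# Minimal sketch (not from the thesis): detect volatility regimes with a
# 2-state Gaussian HMM, then compare two indices via Jaccard similarity of
# their high-volatility occurrences and Spearman correlation of returns.
import numpy as np
from hmmlearn.hmm import GaussianHMM      # assumed dependency
from scipy.stats import spearmanr

def high_vol_states(returns, n_states=2):
    """Label each day 1 if it falls into the higher-variance HMM state."""
    X = np.asarray(returns).reshape(-1, 1)
    model = GaussianHMM(n_components=n_states, covariance_type="full",
                        n_iter=200, random_state=0).fit(X)
    states = model.predict(X)
    high = np.argmax([model.covars_[s].item() for s in range(n_states)])
    return (states == high).astype(int)

def jaccard(a, b):
    """Jaccard similarity of two binary regime-indicator series."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0

# Illustrative use with synthetic return series standing in for two indices
rng = np.random.default_rng(0)
dax = np.concatenate([rng.normal(0, 0.01, 500), rng.normal(0, 0.03, 200)])
cac = np.concatenate([rng.normal(0, 0.01, 500), rng.normal(0, 0.03, 200)])
print("Jaccard:", jaccard(high_vol_states(dax), high_vol_states(cac)))
rho, _ = spearmanr(dax, cac)
print("Spearman:", rho)
```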
Association Rule Mining as a Support for OLAP
Chudán, David ; Svátek, Vojtěch (advisor) ; Máša, Petr (referee) ; Novotný, Ota (referee) ; Kléma, Jiří (referee)
The aim of this work is to identify the possibilities of the complementary usage of two analytical methods of data analysis: OLAP analysis and data mining represented by GUHA association rule mining. The usage of these two methods in the context of the proposed scenarios on one dataset is presumed to yield a synergistic effect, surpassing the knowledge acquired by each of the two methods independently. This is the main contribution of the work. Another contribution is the original use of GUHA association rules, where the mining is performed on aggregated data. In their abilities, GUHA association rules outperform the classic association rules referred to in the literature. The experiments on real data demonstrate the finding of unusual trends in data that would be very difficult to acquire using the standard method of OLAP analysis, i.e. the time-consuming manual browsing of an OLAP cube. On the other hand, with association rules alone the analyst loses the general overview of the data. It is possible to declare that these two methods complement each other very well. Part of the solution is also the use of the LMCL scripting language, which automates selected parts of the data mining process. The proposed recommender system would shield the user from the association rules themselves, thereby enabling common analysts unfamiliar with association rules to use their possibilities. The thesis combines quantitative and qualitative research. Quantitative research is represented by experiments on a real dataset, the proposal of a recommender system and the implementation of selected parts of the association rule mining process in the LISp-Miner Control Language. Qualitative research is represented by structured interviews with selected experts from the fields of data mining and business intelligence, who confirm the meaningfulness of the proposed methods.
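To illustrate the idea of mining association rules over aggregated, OLAP-style data, here is a minimal sketch. It uses plain one-attribute rules with support and confidence rather than the thesis' GUHA procedures and LISp-Miner, and the cube cells are invented.

```python
# Minimal sketch (not the thesis' LISp-Miner/GUHA workflow): trivial
# one-attribute association rules mined over aggregated, OLAP-style cells,
# as an analogy to running GUHA procedures on aggregated cube data.
import itertools
import pandas as pd

# Toy aggregated cells: each row is a cube cell described by binary flags
cells = pd.DataFrame({
    "region_EU":       [1, 1, 0, 0, 1, 0],
    "product_A":       [1, 0, 1, 1, 1, 0],
    "sales_above_avg": [1, 0, 1, 1, 1, 0],
}).astype(bool)

MIN_SUPPORT, MIN_CONFIDENCE = 0.3, 0.8
n = len(cells)

for antecedent, consequent in itertools.permutations(cells.columns, 2):
    both = (cells[antecedent] & cells[consequent]).sum()
    support, confidence = both / n, both / cells[antecedent].sum()
    if support >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE:
        print(f"{antecedent} -> {consequent}: "
              f"support={support:.2f}, confidence={confidence:.2f}")
```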
Towards Complex Data and Information Quality Management
Pejčoch, David ; Rauch, Jan (advisor) ; Máša, Petr (referee) ; Novotný, Ota (referee) ; Kordík, Pavel (referee)
This work deals with the issue of Data and Information Quality. It critically assesses the current state of knowledge within the various methods used for Data Quality Assessment and Data (Information) Quality improvement, and it proposes new principles where this critical assessment revealed gaps. The main idea of this work is the concept of Data and Information Quality Management across the entire universe of data. This universe represents all data sources which the respective subject comes into contact with and which are used within its existing or planned processes. For all these data sources this approach considers setting a consistent set of rules, policies and principles with respect to the current and potential benefits of these resources, while also taking into account the potential risks of their use. An imaginary red thread that runs through the text is the importance of additional knowledge within the process of Data (Information) Quality Management. The introduction of a knowledge base oriented to support Data (Information) Quality Management (QKB) is therefore one of the fundamental principles of the author's proposed set of best practices.
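As a loose illustration of applying one consistent rule set across all data sources in the "universe of data", a minimal sketch follows. The rule set, source names and columns are hypothetical and not taken from the work.

```python
# Minimal illustrative sketch (not from the thesis): one shared set of
# data-quality rules applied across every data source the subject works with,
# echoing the idea of managing quality over the whole universe of data.
import pandas as pd

RULES = {                                  # hypothetical shared rule set
    "customer_id": lambda s: s.notna(),
    "email":       lambda s: s.str.contains("@", na=False),
}

def audit(source_name, df):
    """Report rule violations for one data source using the shared rule set."""
    for column, rule in RULES.items():
        if column in df.columns:
            failed = int((~rule(df[column])).sum())
            print(f"{source_name}: {column}: {failed} violating rows")

audit("crm", pd.DataFrame({"customer_id": [1, None], "email": ["a@b.cz", "x"]}))
audit("billing", pd.DataFrame({"customer_id": [3, 4], "email": ["c@d.cz", None]}))
```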
Data Quality Tools Benchmark
Černý, Jan ; Pejčoch, David (advisor) ; Máša, Petr (referee)
Companies all around the world are wasting their funds due to poor data quality. As the volume of processed data increases, the volume of erroneous data increases too. This diploma thesis explains what data quality is about, what the causes of data quality errors are, what the impact of poor data is, and how data quality can be measured. If you can measure it, you can improve it. This is where data quality tools are used. There are vendors that offer commercial data quality tools and vendors that offer open-source ones. By comparing DataCleaner (an open-source tool) with DataFlux (a commercial tool) against defined criteria, this diploma thesis proves that the two tools can be considered equal in terms of data profiling, data enhancement and data monitoring. DataFlux is slightly better in standardization and data validation. Data deduplication is not included in the tested version of DataCleaner, although DataCleaner's vendor claimed it should be. One of the biggest obstacles preventing companies from buying data quality tools could be their price. At this moment, it is possible to consider DataCleaner an inexpensive solution for companies looking for a data profiling tool. If Human Inference added data deduplication to DataCleaner, it could also be considered an inexpensive solution covering the whole data quality process.
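The kind of data profiling and duplicate detection such tools automate can be sketched in a few lines of pandas. This is only an illustration, not the functionality of DataCleaner or DataFlux, and the sample records are invented.

```python
# Minimal sketch (not DataCleaner or DataFlux): basic column profiling and
# duplicate detection of the sort data quality tools automate, using pandas.
import pandas as pd

df = pd.DataFrame({
    "name":  ["Jan Černý", "Jan Cerny", "Petr Novák", None],
    "email": ["jan@x.cz", "jan@x.cz", "petr@y.cz", "petr@y.cz"],
})

profile = pd.DataFrame({
    "nulls":    df.isna().sum(),
    "distinct": df.nunique(),
})
print(profile)                                         # column-level profiling
print(df[df.duplicated(subset="email", keep=False)])   # candidate duplicates
```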
The real application of knowledge discovery in databases methods on practical data
Mansfeldová, Kateřina ; Máša, Petr (advisor) ; Kliegr, Tomáš (referee)
This thesis deals with a complete analysis of real data from free-to-play multiplayer games. The analysis is based on the CRISP-DM methodology, using the GUHA method and the LISp-Miner system. The goal is to define player churn in the pool game from Geewa Ltd. The practical part shows the whole process of knowledge discovery in databases: from theoretical background concerning player churn and its definition, through data understanding, data extraction and modeling, to the final task results. Hypotheses depending on various factors of the game are formulated in the thesis.
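One common way a churn definition like the one sought here is formalised is an inactivity window after the last session. The sketch below is only an illustration with a hypothetical 14-day threshold, not Geewa's actual definition or the thesis' GUHA/LISp-Miner tasks.

```python
# Minimal sketch (not Geewa's definition): label a player as churned when no
# activity is observed for a fixed number of days before the observation end.
import pandas as pd

CHURN_WINDOW_DAYS = 14          # hypothetical inactivity threshold
OBSERVATION_END = pd.Timestamp("2015-06-30")

sessions = pd.DataFrame({
    "player_id": [1, 1, 2, 3],
    "played_at": pd.to_datetime(
        ["2015-06-01", "2015-06-25", "2015-05-10", "2015-06-29"]),
})

last_seen = sessions.groupby("player_id")["played_at"].max()
churned = (OBSERVATION_END - last_seen).dt.days > CHURN_WINDOW_DAYS
print(churned)  # player_id -> True if considered churned
```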
Actual role of knowledge discovery in databases
Pešek, Jiří ; Berka, Petr (advisor) ; Máša, Petr (referee)
The thesis "Actual role of knowledge discovery in databases˝ is concerned with churn prediction in mobile telecommunications. The issue is based on real data of a telecommunication company and it covers all steps of data mining process. In accord with the methodology CRISP-DM, the work looks thouroughly at the following stages: business understanding, data understanding, data preparation, modeling, evaluation and deployment. As far as a system for knowledge discovery in databases is concerned, the tool IBM SPSS Modeler was selected. The introductory chapter of the theoretical part familiarises the reader with the issue of so called churn management, which comprises the given assignment; the basic concepts related to data mining are defined in the chapter as well. The attention is also given to the basic types of tasks of knowledge discovery of databasis and algorithms that are pertinent to the selected assignment (decision trees, regression, neural network, bayesian network and SVM). The methodology describing phases of knowledge discovery in databases is included in a separate chapter, wherein the methodology of CRIPS-DM is examined in greater detail, since it represents the foundation for the solution of our practical assignment. The conclusion of the theoretical part also observes comercial or freely available systems for knowledge discovery in databases.
Post-processing of association rules by multicriterial clustering method
Kejkula, Martin ; Rauch, Jan (advisor) ; Berka, Petr (referee) ; Máša, Petr (referee)
Association rules mining is one of several ways of knowledge discovery in databases. Paradoxically, data mining itself can produce such great amounts of association rules that a new knowledge management problem arises: there can easily be thousands or even more association rules holding in a data set. The goal of this work is to design a new method for association rules post-processing. The method should be software and domain independent. The output of the new method should be a structured description of the whole set of discovered association rules, helping the user to work with the discovered rules. The path taken to reach the goal is to split the association rules into clusters, where each cluster contains rules that are more similar to each other than to rules from other clusters. The output of the method is such a cluster definition and description. The main contribution of this Ph.D. thesis is the new Multicriterial clustering association rules method described here. A secondary contribution is the discussion of already published association rules post-processing methods. The output of the introduced method consists of clusters of rules that cannot be reached by any of the former post-processing methods. According to user expectations, the clusters are more relevant and more effective than any former association rules clustering results. The method is based on two orthogonal clusterings of the same set of association rules. One clustering is based on interestingness measures (confidence, support, interest, etc.). The second clustering is inspired by document clustering in information retrieval. The representation of rules as vectors, in the manner of documents, is central to this thesis. The thesis is organized as follows. Chapter 2 identifies the role of association rules in the KDD (knowledge discovery in databases) process, using KDD methodologies (CRISP-DM, SEMMA, GUHA, RAMSYS). Chapter 3 defines the association rule and introduces characteristics of association rules (including interestingness measures). Chapter 4 introduces current association rules post-processing methods. Chapter 5 is an introduction to cluster analysis. Chapter 6 describes the new Multicriterial clustering association rules method. Chapter 7 consists of several experiments. Chapter 8 discusses possibilities of usage and development of the new method.
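One of the two orthogonal clusterings, grouping rules by their interestingness measures, can be sketched as follows. This is only an illustration with k-means on invented support, confidence and lift values, not the thesis' multicriterial method itself.

```python
# Minimal sketch (not the thesis' multicriterial method): clustering
# association rules by their interestingness measures with k-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy rules described only by their interestingness measures
rules = np.array([
    # support, confidence, lift
    [0.40, 0.90, 1.8],
    [0.35, 0.85, 1.7],
    [0.05, 0.95, 3.2],
    [0.04, 0.92, 3.0],
    [0.20, 0.30, 0.9],
])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(rules))
print(labels)   # cluster id per rule; similar measure profiles group together
```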
An Empirical Comparison of Commercial Data Mining Tools
Faruzel, Petr ; Berka, Petr (advisor) ; Máša, Petr (referee)
The presented work "An Empirical Comparison of Commercial Data Mining Tools" deals with data mining tools from world's leading software providers of statistical solutions. The aim of this work is to compare commercial packages IBM SPSS Modeler and SAS Enterprise Miner in terms of their specification and utility considering a chosen set of evaluation criteria. I would like to achieve the appointed goal by a detailed analysis of selected features of the surveyed software packages as well as by their application on real data. The comparison is founded on 29 component criteria which reflect user's requirements regarding functionality, usability and flexibility of the system. The pivotal part of the comparative process is based on an application of the surveyed data mining tools on data concerning meningoencephalitis. Results predestinate evaluation of their performance while analyzing small and large data. Quality of developed data models and duration of their derivation are stated in reference to the use of six comparable data mining techniques for classification. Small data more likely comply with IBM SPSS Modeler. Although it produces slightly less accurate models, their development times are much shorter. Increasing the amount of data changes the situation in favor of competition. SAS Enterprise Miner manages better results while analyzing large data. Considerably more accurate models are accompanied by slightly shorter times of their development. Functionality of the surveyed data mining tools is comparable, whereas their usability and flexibility differentiate. IBM SPSS Modeler offers apparently better usability and learnability. Users of SAS Enterprise Miner have a slightly more flexible data mining tool at hand.
