National Repository of Grey Literature : 33 records found
Real application of knowledge discovery in databases methods on practical data
Mansfeldová, Kateřina ; Máša, Petr (advisor) ; Kliegr, Tomáš (referee)
This thesis presents a complete analysis of real data from free-to-play multiplayer games. The analysis follows the CRISP-DM methodology, using the GUHA method and the LISp-Miner system. The goal is to define player churn in pool games from Geewa Ltd. The practical part shows the whole process of knowledge discovery in databases: theoretical background on player churn, a definition of player churn, data understanding, data extraction, modeling, and finally obtaining the results of the tasks. The thesis formulates hypotheses depending on various factors of the game.
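The churn definition discussed above can be made concrete in many ways. As a minimal sketch (the 30-day threshold and function name here are invented for illustration, not Geewa's actual definition), a player might be flagged as churned when no session occurs within a fixed inactivity window before the observation cutoff:

```python
from datetime import date, timedelta

def is_churned(last_session: date, observed_until: date,
               inactivity_days: int = 30) -> bool:
    """Flag a player as churned when their last session is more than
    `inactivity_days` before the observation cutoff (illustrative rule)."""
    return (observed_until - last_session) > timedelta(days=inactivity_days)

# Last played 45 days before the cutoff -> churned
print(is_churned(date(2013, 1, 1), date(2013, 2, 15)))  # True
```

Such a binary flag is what a CRISP-DM modeling phase would then try to predict from behavioral attributes.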
Implementation of data preparation procedures for RapidMiner
Černý, Ján ; Berka, Petr (advisor) ; Kliegr, Tomáš (referee)
Knowledge Discovery in Databases (KDD) is gaining importance with the rising amount of data being collected; despite this, analytic software systems often provide only the basic and most frequently used procedures and algorithms. The aim of this thesis is to extend RapidMiner, one of the most frequently used systems, with new procedures for data preprocessing. To understand and develop the procedures, it is important to be acquainted with KDD, with emphasis on the data preparation phase, and to describe the analytical procedures themselves. To develop an extension for RapidMiner, it is also necessary to become acquainted with the process of creating an extension and with the tools that are used. Finally, the resulting extension is introduced and tested.
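As a hedged illustration of the kind of data preparation procedure such an extension might provide (a Python sketch, not the actual RapidMiner operator API), min-max normalization rescales a numeric attribute into the [0, 1] range:

```python
def min_max_normalize(values):
    """Rescale a numeric attribute to [0, 1], a common data
    preparation step; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([10, 20, 30]))  # [0.0, 0.5, 1.0]
```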
Pragmatic lemmatizer of Czech language
Vacek, Matěj ; Strossa, Petr (advisor) ; Kliegr, Tomáš (referee)
This thesis focuses on the lemmatization of Czech nouns and adjectives, based on the morphology of the Czech language. The goal is to create a lemmatizer that stems words with a success rate of at least 90%. At the same time, the lemmatizer should be very simple, consisting of as few rules as possible. The lemmatizer is designed to work with real estate adverts, especially houses for sale; the thesis therefore analyzes the specific characteristics of this domain, and the lemmatizer is built according to the results of that analysis. The lemmatizer was written in Java. Only three types of rules were used, and the lemmatizer produced correct stems for 96.4% of all words.
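The rule-based approach can be sketched as a short ordered list of suffix-replacement rules where the first match wins (a minimal Python illustration; the actual lemmatizer is implemented in Java, and the example rules below are invented, not the thesis's rule set):

```python
# Ordered (suffix, replacement) rules; the first matching rule wins.
# These example rules are illustrative only, not real Czech morphology.
RULES = [
    ("ová", "á"),   # hypothetical adjective ending
    ("ech", ""),    # hypothetical locative plural ending
    ("u", ""),      # hypothetical singular case ending
]

def lemmatize(word: str) -> str:
    """Apply the first matching suffix rule; otherwise return the word as-is."""
    for suffix, repl in RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + repl
    return word

print(lemmatize("domu"))  # "dom"
print(lemmatize("dům"))   # no rule matches -> "dům"
```

Keeping the rule list this small is what makes the reported 96.4% accuracy on a narrow domain (real estate adverts) plausible: domain-specific text uses a restricted vocabulary and a limited set of inflections.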
Generating data using the LM Reverse-Miner
Stluka, Jakub ; Šimůnek, Milan (advisor) ; Kliegr, Tomáš (referee)
In recent years, great attention has been paid to evolutionary algorithms, and they have been utilized in a wide range of industries, including data mining, which nowadays represents a highly demanded product for many commercial institutions. This thesis combines both topics. Its main subject is the testing of the new Reverse-Miner module, which can generate data with hidden properties using evolutionary algorithms, while also using other modules of the LISp-Miner system commonly employed for data mining. The main goal is to generate two databases with the module in such a way that they meet explicitly set requirements. Further goals include understanding the domain necessary for the subsequent modeling. The result of the practical part is represented not only by the two successfully generated databases, but also by a description of the steps, methods, and techniques used. Based on the modeling experience, common recommendations for data preparation with the Reverse-Miner module are then summarized. The conclusion analyzes the technical means used for the generation and provides several suggestions for possible future extensions.
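The idea of generating data that meets an explicitly set requirement can be sketched as a tiny evolutionary loop (a generic toy illustration, not the Reverse-Miner implementation): candidate datasets are mutated, and a candidate is kept when it moves closer to a target property, here a target column mean:

```python
import random

def evolve_column(target_mean, size=20, generations=200, seed=42):
    """Evolve a list of integers toward a target mean by point mutation
    and greedy selection (a toy stand-in for evolutionary data generation)."""
    rng = random.Random(seed)
    best = [rng.randint(0, 100) for _ in range(size)]
    fitness = lambda col: abs(sum(col) / len(col) - target_mean)
    for _ in range(generations):
        cand = best[:]
        cand[rng.randrange(size)] = rng.randint(0, 100)  # mutate one value
        if fitness(cand) < fitness(best):                # keep improvements
            best = cand
    return best

col = evolve_column(target_mean=50.0)
print(abs(sum(col) / len(col) - 50.0))  # small residual error
```

Real systems evolve whole databases against many simultaneous constraints (hidden patterns, attribute distributions), but the mutate-evaluate-select cycle is the same.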
Classification of entities using Wikipedia and WordNet
Kliegr, Tomáš ; Rauch, Jan (advisor) ; Berka, Petr (referee) ; Smrž, Pavel (referee) ; Žabokrtský, Zdeněk (referee)
This dissertation addresses the problem of classification of entities in text represented by noun phrases. The goal of this thesis is to develop a method for automated classification of entities appearing in datasets consisting of short textual fragments. The emphasis is on unsupervised and semi-supervised methods that will allow for fine-grained character of the assigned classes and require no labeled instances for training. The set of target classes is either user-defined or determined automatically. Our initial attempt to address the entity classification problem is called Semantic Concept Mapping (SCM) algorithm. SCM maps the noun phrases representing the entities as well as the target classes to WordNet. Graph-based WordNet similarity measures are used to assign the closest class to the noun phrase. If a noun phrase does not match any WordNet concept, a Targeted Hypernym Discovery (THD) algorithm is executed. The THD algorithm extracts a hypernym from a Wikipedia article defining the noun phrase using lexico-syntactic patterns. This hypernym is then used to map the noun phrase to a WordNet synset, but it can also be perceived as the classification result by itself, resulting in an unsupervised classification system. SCM and THD algorithms were designed for English. While adaptation of these algorithms for other languages is conceivable, we decided to develop the Bag of Articles (BOA) algorithm, which is language agnostic as it is based on the statistical Rocchio classifier. Since this algorithm utilizes Wikipedia as a source of data for classification, it does not require any labeled training instances. WordNet is used in a novel way to compute term weights. It is also used as a positive term list and for lemmatization. A disambiguation algorithm utilizing global context is also proposed. We consider the BOA algorithm to be the main contribution of this dissertation. 
Experimental evaluation of the proposed algorithms is performed on the WordSim353 dataset, which is used for evaluation in the Word Similarity Computation (WSC) task, and on the Czech Traveler dataset, the latter being specifically designed for the purpose of our research. On WordSim353, BOA achieves a Spearman correlation of 0.72 with human judgment, which is close to the 0.75 correlation of the ESA algorithm, to the author's knowledge the best-performing algorithm on this gold-standard dataset that does not require training data. The advantage of BOA over ESA is that it has smaller requirements on the preprocessing of the Wikipedia data. While SCM underperforms on the WordSim353 dataset, it overtakes BOA on the Czech Traveler dataset, which was designed specifically for our entity classification problem. This discrepancy requires further investigation. In a standalone evaluation of THD on the Czech Traveler dataset, the algorithm returned a correct hypernym for 62% of the entities.
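The Rocchio-style classification underlying BOA can be illustrated with a minimal nearest-centroid sketch over bag-of-words vectors (a generic illustration: the WordNet-based term weighting, positive term list, and Wikipedia article bags of the actual BOA algorithm are omitted, and the tiny classes below are invented):

```python
from collections import Counter
from math import sqrt

def centroid(docs):
    """Average bag-of-words vector of one class's documents."""
    total = Counter()
    for doc in docs:
        total.update(doc.split())
    return {t: c / len(docs) for t, c in total.items()}

def cosine(u, v):
    """Cosine similarity of two sparse term-weight vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(text, class_docs):
    """Assign the class whose centroid is most similar to the text."""
    vec = Counter(text.split())
    cents = {c: centroid(d) for c, d in class_docs.items()}
    return max(cents, key=lambda c: cosine(cents[c], vec))

classes = {
    "city": ["prague is a city with streets and squares",
             "a city has districts streets and a mayor"],
    "river": ["vltava is a river with water and banks",
              "a river has water a current and banks"],
}
print(classify("the vltava has water and banks", classes))  # river
```

In BOA the "documents" of each class are Wikipedia articles retrieved for the target class, which is why no labeled training instances are needed.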
Search Engine Marketing of Nonprofit Organizations
Slavík, Michal ; Kliegr, Tomáš (advisor) ; Nemrava, Jan (referee)
The goal of this thesis is to design a Search Engine Marketing (SEM) methodology for nonprofit organizations (NPOs) that takes advantage of their specifics. Other goals include a practical evaluation of the methodology and an analysis of the current state of NPO websites. These goals are reached by merging theoretical background from relevant literature with knowledge gained during field research and with the author's experience. The designed methodology is built on the following hypotheses: NPOs are able to negotiate better trade terms than commercial companies, and NPOs can delegate some SEM activities to their volunteers. Field research confirmed both hypotheses. The hypothesis that NPO websites are static because NPOs see no profit in regular publishing was disproved. The methodology consists of four phases and includes recommended tools, metrics, topics for publishing, and a list of linkbaiting activities. The thesis consists of five chapters. The first chapter summarizes the necessary theoretical background, while the second defines terms and premises. The methodology itself is presented in chapter three. The fourth chapter contains a current-state analysis based on an examination of 31 websites. The last chapter compares the methodology's hypotheses and activities against the experience of 21 NPO representatives and 3 experts in the field of SEO; the opinions of both groups of respondents are also compared. Based on the respondents' judgments of the costs and utility of the methodology's activities, a ranking of these activities is finally created. The main contribution of this thesis is the translation of universal SEM theory into the specific conditions and language of NPO practitioners, together with an analysis of the current state of this field.
Possibilities of XSLT in the current web browsers
Bittner, Ondřej ; Kosek, Jiří (advisor) ; Kliegr, Tomáš (referee)
The aim of this thesis is to analyse the possibilities of current web browsers in performing XSLT transformations. Individual browsers are compared by a variety of aspects, such as the level of implementation of XSLT 1.0 and XSLT 2.0, the possibilities for invoking transformations, and performance during XSLT processing. The analysis focuses on flaws, special features, and non-standard behavior of individual browsers. Possibilities for performing XSLT transformations in browsers using non-native means are researched as well. The conclusion of the thesis is dedicated to a description of a sample application that uses XSLT transformations.
The gathering of semantically enriched clickstreams
Bača, Roman ; Kliegr, Tomáš (advisor) ; Kuchař, Jaroslav (referee)
The aim of this thesis is to introduce readers to the area of web mining and to familiarize them with tools for data mining on the web. The main emphasis is placed on the analytical software Piwik, which is compared with other currently available analytical tools. The thesis also aims to create compact documentation of Piwik; the largest part of this documentation is devoted to a newly programmed plugin. The principle of information retrieval based on user behavior on the web is first described from a general viewpoint and then leads to a more concrete description of information retrieval using this new plugin.
Analysis of imperfections of the popular content management systems and a suggestion how to solve them
Kohout, Vojtěch ; Kliegr, Tomáš (advisor) ; Nekvasil, Marek (referee)
The main goal of this thesis is to reveal significant imperfections of popular open-source Content Management Systems (CMS) and to design and implement the core of a custom extensible CMS. Such a system, based on a modern PHP framework, could be a better choice than open-source CMS systems for web designers using it for commercial purposes. The first chapter analyzes three open-source CMS systems from various perspectives: Drupal, CMS Made Simple, and TYPOlight. The second chapter introduces the benefits of using a modern PHP framework and designs a custom CMS, which is completely implemented in the appendix of the thesis. The most important and most original contribution of this thesis is the design and implementation of the core of a custom CMS in Nette Framework.
Creation of an interface for creating apriori association rules
Balhar, Jakub ; Kliegr, Tomáš (advisor) ; Hazucha, Andrej (referee)
This work aims at finding or creating a web application that allows users to easily create and edit constructs similar to association rules. In the first part, I summarize and compare existing solutions and conclude that none of them meets the requirements of the SEWEBAR project. In the second part, I focus on the formats used for communication with the world outside the application and for communication between parts of the application. This is followed by a definition of the web application I created, and at the end of this part I explain its architecture in depth. The resulting application is part of the SEWEBAR project, but it is written in a way that makes it easy to deploy in any other web project that needs to create or edit constructs similar to association rules. I also cover possible configurations for such a deployment.
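For context, the apriori principle behind association rules can be sketched in a few lines (a generic level-wise frequent-itemset miner, not the SEWEBAR application's code; the grocery transactions are invented):

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support=2):
    """Return itemsets contained in at least `min_support` transactions,
    grown level-wise via the apriori principle: every subset of a
    frequent itemset must itself be frequent."""
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = {i for t in transactions for i in t}
    frequent = []
    level = [frozenset([i]) for i in items
             if support(frozenset([i])) >= min_support]
    k = 1
    while level:
        frequent.extend(level)
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        seen = set(frequent)
        level = [c for c in candidates
                 if all(frozenset(s) in seen for s in combinations(c, k - 1))
                 and support(c) >= min_support]
    return frequent

tx = [frozenset(s) for s in ({"beer", "chips"},
                             {"beer", "chips", "soda"},
                             {"chips", "soda"})]
for itemset in frequent_itemsets(tx):
    print(sorted(itemset))
```

Association rules are then read off the frequent itemsets (e.g. {beer} → {chips}), which is the kind of construct the described interface lets users compose by hand.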
