National Repository of Grey Literature : 33 records found (records 24 - 33)
Utilization of XML databases for retrieval of data-mining specifications
Marek, Tomáš ; Kliegr, Tomáš (advisor) ; Kosek, Jiří (referee)
The aim of this work is to create a system for querying analytical reports stored as PMML documents. Because PMML documents are structured as XML, they are kept in a native XML database; the selected database is freely available and its capabilities match the proposed solution. A search algorithm is built on the XQuery language, the natural choice given that the searched data are XML. The structure of PMML documents was analysed and the data links inside them are exploited so that the search returns correct results. The results of a search are the association rules contained in the analytical reports stored as PMML; a query specifies the attributes that should appear in the rules, their values and further constraints. To make the system complete and usable, a communication layer is provided through which the stored data are accessed; it is implemented in Java using the REST(ful) architectural style.
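The abstract above describes the search flow only in prose; purely as an illustration, a minimal sketch of that flow in Java could look as follows. The XmlDbClient interface is a hypothetical stand-in for the (unnamed) native XML database's client API, and the resource path, parameter name and XQuery are generic examples over PMML AssociationRule elements, not taken from the thesis itself.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;

// Assumed minimal abstraction over the chosen native XML database;
// a concrete implementation would call the database's own client API.
interface XmlDbClient {
    String executeXQuery(String xquery);
}

// Hypothetical JAX-RS resource sketching the described flow: the client asks
// for rules containing a given attribute, the server builds an XQuery over
// the stored PMML reports and returns the matching association rules.
@Path("/rules")
public class RuleSearchResource {

    private final XmlDbClient db;

    public RuleSearchResource(XmlDbClient db) {
        this.db = db;
    }

    @GET
    @Produces("application/xml")
    public String findRules(@QueryParam("attribute") String attribute) {
        // Simplified query: a production version would bind the attribute as an
        // external variable and follow the Item/Itemset links inside the PMML.
        String xquery =
            "for $r in collection('reports')//AssociationRule " +
            "where $r/../Item[@value = '" + attribute + "'] " +
            "return $r";
        return db.executeXQuery(xquery);
    }
}
```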
Indexing and searching XML documents with Lucene
Beránek, Lukáš ; Kliegr, Tomáš (advisor) ; Pinkas, Otakar (referee)
An analytical report captures the results of data-mining tasks so that they can be reused later. The next step is to present them in a user-friendly, accessible form, for example as an online HTML document within the SEWEBAR project. The growing number of such documents creates the need to search structured data, namely XML documents conforming to the PMML standard in which the reports are currently saved. The main goal is to survey the available means for indexing and full-text searching of XML documents, focused on searching the association rules found in the output documents produced by LISp-Miner or Ferda. After an initial analysis of the current state, an extension for CMS Joomla! is created to cover indexing and searching of the indexed data. The source files for the resulting Jucene extension are analytical reports in PMML format stored in the Joomla! content-management-system database. Each stored PMML document is simplified, optimized and transformed with an XSL transformation into a structure better suited for indexing, while preserving the logical order of the data-mining task; the transformed document is then inserted into the Zend Lucene index. The DOMDocument library is used for this processing in the PHP environment. The resulting workflow provides a user interface for working with the indexed rules and lets users search association rules with queries processed by the Zend Search Lucene framework; rules matching a query are scored and displayed to the user. Besides the Jucene component itself, the thesis also provides step-by-step guidance for its users, whether they are site administrators or ordinary visitors.
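The thesis works with the PHP port Zend_Search_Lucene; purely for orientation, the same index-and-search cycle is sketched below using Apache Lucene's Java API instead. The field names ("antecedent", "consequent"), the sample rule and the query are invented for the example and do not come from the Jucene extension.

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Sketch of the index-and-search cycle: one association rule is stored as a
// Lucene document with invented field names, then retrieved by a user query.
public class RuleIndexSketch {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(Paths.get("rule-index"));
        StandardAnalyzer analyzer = new StandardAnalyzer();

        // Indexing: one simplified rule extracted from a transformed PMML report.
        try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
            Document rule = new Document();
            rule.add(new TextField("antecedent", "district(Prague) loan(yes)", Field.Store.YES));
            rule.add(new TextField("consequent", "rating(good)", Field.Store.YES));
            writer.addDocument(rule);
        }

        // Searching: parse a user query against the antecedent field, score and list hits.
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            Query query = new QueryParser("antecedent", analyzer).parse("loan AND consequent:rating");
            TopDocs hits = searcher.search(query, 10);
            for (ScoreDoc hit : hits.scoreDocs) {
                System.out.println(searcher.doc(hit.doc).get("consequent") + " score=" + hit.score);
            }
        }
    }
}
```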
CMS Joomla! and Ontopia Knowledge Suite Integration
Hazucha, Andrej ; Kliegr, Tomáš (advisor) ; Nekvasil, Marek (referee)
The aim of this thesis is to outline the issues involved in integrating content management systems with knowledge bases built on semantic web technologies. The work begins with a survey of semantic technologies and their use cases, and discusses options for integrating them into CMSs and collaborative wikis. Since most open-source CMSs are built on the PHP platform, tools written in PHP are preferred. The practical part of the thesis demonstrates the integration of CMS Joomla! with the Ontopia Knowledge Suite, and also shows communication with other systems that accept HTTP requests. Joomla! communicates with OKS through the RESTful TMRAP protocol implemented in OKS; the query language used in this case is tolog. Communication with a SPARQL endpoint or an XML database is demonstrated as well. Raw XML returned from the knowledge-base data source is transformed by user-defined XSLT into (X)HTML fragments. The resulting demo application is part of the SEWEBAR project and makes it possible to incorporate the results of semantically rich queries into analytical reports of data-mining tasks within the Joomla! interface.
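The XML-to-(X)HTML step mentioned above can be sketched with the standard Java XSLT (JAXP) API; this is only an illustration under invented file names, since the actual integration runs inside PHP/Joomla! and the stylesheets are supplied by the user.

```java
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Minimal sketch: apply a user-defined XSLT stylesheet to the raw XML
// returned by the knowledge base and write out an (X)HTML fragment.
public class KnowledgeBaseRenderer {
    public static void main(String[] args) throws Exception {
        // Placeholder file names; in the integration the XML would come from a
        // TMRAP/tolog (or SPARQL) query and the stylesheet from the user.
        StreamSource stylesheet = new StreamSource(new File("rules-to-xhtml.xsl"));
        StreamSource kbResponse = new StreamSource(new File("kb-response.xml"));
        StreamResult fragment = new StreamResult(new File("fragment.html"));

        Transformer transformer =
                TransformerFactory.newInstance().newTransformer(stylesheet);
        transformer.transform(kbResponse, fragment);
    }
}
```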
Using semantic technologies in markup languages
Štencek, Jiří ; Nekvasil, Marek (advisor) ; Kliegr, Tomáš (referee)
This bachelor thesis analyses the use of semantic technologies on today's web portals. The aim is to map the major web servers and services; it does not attempt to cover every site (blogs, corporate sites, etc.) that uses semantic technologies, since such a survey would have little value. The contribution of the work is an analysis of how semantic technologies are deployed on the Internet: how far the vision of the Semantic Web has spread, how many web sites use the technology, and which sites we visit every day offer capabilities we may not even be aware of. Further benefits may include wider use of Semantic Web tools (the Operator plug-in, Semantic Radar), raising awareness among Internet users who have never heard the term, and providing a basis for further, more detailed mapping of semantic sites, for example a statistically oriented study of the usage rates of ontological vocabularies. The work begins with an introduction to the World Wide Web from its beginnings to the present, outlining its basic ideas, the pitfalls of the current WWW and possible lines of further development. The chapter Understanding the Semantic Web describes the basic building blocks and architecture of this vision, covering the RDF framework, ontologies and the security of the Semantic Web. The chapter on integrating semantics into the current WWW then explains where the necessary metadata can be found, the principles of Linked Data, and how metadata can be embedded in (X)HTML, specifically microformats, RDFa and eRDF; it concludes by comparing these technologies and showing practical examples of their implementation. The final chapter, Analysis of the use of knowledge technologies, gives an overview of servers that use one of these technologies, describing open-source databases, semantic search engines, ontological vocabularies, and community and information portals; it closes with a summary of the findings and a reflection on the real benefits of, and possible incentives for, the Semantic Web.
Content and structure of web document
Vaněk, Vladimír ; Pinkas, Otakar (advisor) ; Kliegr, Tomáš (referee)
This bachelor thesis focuses on the factors that affect ranking in full-text search engines. All the important factors influencing the success of web pages are described, with emphasis on on-page and off-page factors and on the use of correct HTML tags. The findings are supported by statistics, technical studies, graphs and practical experience.
Knowledge Processing within the GUHA Method
Šťastný, Daniel ; Rauch, Jan (advisor) ; Kliegr, Tomáš (referee)
This study presents an introduction to the CRISP-DM data-mining methodology (CRoss-Industry Standard Process for Data Mining). It gives a fundamental description of association rules and the GUHA method (General Unary Hypotheses Automaton), together with the related 4ft-Miner and SD4ft-Miner procedures and action rules, with examples shown on real data. The study then describes the role of domain knowledge and the SEWEBAR project (SEmantic WEb and Analytical Reports) run at UEP. The practical output of the work is an XML Schema definition for the BKEF markup language (Background Knowledge Exchange Format) designed within SEWEBAR, and an XSL transformation that renders the content of any BKEF file.
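For orientation only, and not taken from the thesis: the association rules handled by 4ft-Miner are evaluated over a four-fold contingency table, and one commonly used GUHA quantifier, founded implication, can be written as follows.

```latex
% Four-fold table frequencies of a rule "phi => psi" on the analysed data:
%   a = #(phi and psi),      b = #(phi and not psi),
%   c = #(not phi and psi),  d = #(not phi and not psi).
% Founded implication with thresholds p (confidence) and Base (support):
\[
  \varphi \Rightarrow_{p,\mathrm{Base}} \psi
  \quad\Longleftrightarrow\quad
  \frac{a}{a+b} \ge p \;\;\wedge\;\; a \ge \mathrm{Base}.
\]
```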
Extension development for CMS Joomla!
Vojíř, Stanislav ; Nemrava, Jan (advisor) ; Kliegr, Tomáš (referee)
Writing analytical reports in natural language is important but not simple. Currently most analytical reports are written in Microsoft Word and saved as ordinary files. The goal of this bachelor thesis is to make writing analytical reports easier in the online application CMS Joomla! by means of Joomla! extensions. After defining the requirements, an editors-xtd plug-in and a component for CMS Joomla! 1.5 are developed. Using the extensions created in this thesis, authors of analytical reports can select part of the data produced by a data-mining application and insert it into the report being edited; the data source for inclusion is an XHTML presentation of the data-mining output. The thesis describes how the XHTML document is split, how a part is selected, and how it is included in the edited text. A secondary feature of the extensions is updating the included blocks in a selected analytical report: users can pick a finished report and refresh the included blocks with newer data. At the University of Economics, Prague, all informatics students learn the fundamentals of data mining in the course 4iz210 - Information and Knowledge Processing, where they will use CMS Joomla! with the extensions from this thesis; the first deployment is planned for the end of the summer semester of the academic year 2008/2009. The thesis can also serve as an inspiration for further extension development for CMS Joomla!.
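The block-selection step can be illustrated with the standard Java DOM and XPath APIs; the file name and id value below are placeholders, and the actual extension performs this step in PHP inside Joomla!.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Minimal sketch: pick one block of a data-mining XHTML presentation by its
// id attribute so that it can be inserted into the edited analytical report.
public class BlockSelector {
    public static void main(String[] args) throws Exception {
        Document page = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File("mining-output.xhtml")); // placeholder file

        // Placeholder id; the real extension lets the author choose the block.
        Node block = (Node) XPathFactory.newInstance().newXPath()
                .evaluate("//*[@id='task-42-results']", page, XPathConstants.NODE);

        if (block != null) {
            System.out.println(block.getTextContent());
        }
    }
}
```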
Web Analytics: Identification of new trends
Slavík, Michal ; Kliegr, Tomáš (advisor) ; Nekvasil, Marek (referee)
The goal of this thesis is to identify the main trends in tools used to analyse web traffic. The necessary theoretical background is extracted from the relevant literature, and a field survey is used to gather the knowledge of practitioners. The following trends have been identified: growing demand for Web Analytics software, increasing interest in Web Analytics courses, expanding measurement of Web 2.0 and social networks, and the use of semantic information as the most promising area of academic research. The thesis also presents the main techniques of Web Usage Mining: association rules, sequential patterns and clustering, together with a section on query categorization; according to the field survey, practitioners are most interested in clustering. The first two chapters present Web Analytics in general and introduce the main aspects of current applications; the third chapter covers the theoretical research, the fifth presents the results of the field survey, and the fourth chapter points out that the terminology of Web Analytics is not unified.
Content and structure of web documents and their accessibility
Kalous, Martin ; Pinkas, Otakar (advisor) ; Kliegr, Tomáš (referee)
This bachelor thesis deals with search engine optimization of web pages and with traffic analysis. By search engine optimization (SEO) we mean programming, configuring and positioning web pages correctly, so that they not only respect the rules of the search engines but also stand a better chance in the search itself. The first part of the thesis describes the basic rules of SEO and the factors that influence page ranking (on-page and off-page factors). On-page factors are all the objects on a web page; off-page factors are tactics for attracting as much attention to the pages as possible and making them more visible without touching the source code. Besides ethical tactics there are also unethical, or rather forbidden, ones. The presented methods and tactics are demonstrated on short examples and on a sample SEO audit of a selected company, which is included in the appendix of the thesis. The second chapter deals with further ways of increasing traffic, mainly the advertising systems available on the major Internet portals. The third chapter is essentially a theoretical introduction to the case study, a traffic analysis: traffic indicates how successful online advertising is, and several methods of measuring it and various statistical tools are described. In the fourth chapter the traffic analysis of the selected company is carried out, and its pages are evaluated in terms of traffic using various statistical tools.
Clickstream Analysis
Kliegr, Tomáš ; Rauch, Jan (advisor) ; Berka, Petr (referee)
The thesis introduces current research trends in clickstream analysis and proposes a new heuristic that can be used for dimensionality reduction of semantically enriched data in Web Usage Mining (WUM). Click fraud and conversion fraud are identified as key prospective application areas for WUM. The thesis documents a conversion-fraud vulnerability of Google Analytics and proposes a defence: new clickstream acquisition software that collects data in sufficient granularity and structure to allow data-mining approaches to fraud detection. Three variants of the K-means clustering algorithm and three association rule mining systems are evaluated and compared on real-world web usage data.
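As a rough, self-contained illustration of the clustering side only: a bare-bones K-means over numeric session vectors might look like the sketch below. The features (page views, minutes on site, conversions) and the data are invented, and the thesis itself compares three more elaborate K-means variants on real usage data.

```java
import java.util.Arrays;

// Bare-bones K-means over numeric session vectors. Illustrative only: the
// feature choice and the sample sessions are invented for this sketch.
public class SessionKMeans {

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    static int[] cluster(double[][] points, double[][] centroids, int iterations) {
        int[] assignment = new int[points.length];
        for (int it = 0; it < iterations; it++) {
            // Assignment step: each session goes to its nearest centroid.
            for (int p = 0; p < points.length; p++) {
                int best = 0;
                for (int c = 1; c < centroids.length; c++) {
                    if (distance(points[p], centroids[c])
                            < distance(points[p], centroids[best])) {
                        best = c;
                    }
                }
                assignment[p] = best;
            }
            // Update step: move each centroid to the mean of its sessions.
            for (int c = 0; c < centroids.length; c++) {
                double[] mean = new double[points[0].length];
                int count = 0;
                for (int p = 0; p < points.length; p++) {
                    if (assignment[p] == c) {
                        for (int d = 0; d < mean.length; d++) mean[d] += points[p][d];
                        count++;
                    }
                }
                if (count > 0) {
                    for (int d = 0; d < mean.length; d++) mean[d] /= count;
                    centroids[c] = mean;
                }
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Invented session vectors: {page views, minutes on site, conversions}.
        double[][] sessions = {
            {3, 1.5, 0}, {25, 12.0, 1}, {4, 2.0, 0}, {30, 15.0, 1}, {2, 0.5, 0}
        };
        // Two clusters, seeded with the first two sessions as initial centroids.
        double[][] centroids = { sessions[0].clone(), sessions[1].clone() };
        System.out.println(Arrays.toString(cluster(sessions, centroids, 10)));
    }
}
```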
