National Repository of Grey Literature
Advanced Web Crawler
Činčera, Jaroslav ; Jirák, Ota (referee) ; Trchalík, Roman (advisor)
This Master's thesis describes the design and implementation of an advanced web crawler. The crawler can be configured by the user and is designed to browse the web according to specified parameters. It can acquire and evaluate the content of web pages. It is configured by creating projects, which consist of different types of steps. The user can create a simple action such as downloading a page or submitting a form, or can build larger, more complex projects.
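To make the project-and-steps configuration concrete, the following is a minimal Python sketch, not the thesis code: the Step and Project classes, their fields, and the example URLs are assumptions chosen only to illustrate a "download page" step followed by a "form submission" step.

```python
# Illustrative sketch of a crawler project built from typed steps.
# All names and fields here are assumptions, not the thesis implementation.
from dataclasses import dataclass, field
from typing import Dict, List

import requests  # assumed HTTP client


@dataclass
class Step:
    kind: str                                # e.g. "download" or "submit_form"
    url: str
    form_data: Dict[str, str] = field(default_factory=dict)


@dataclass
class Project:
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self) -> List[str]:
        """Execute the steps in order and collect page bodies for evaluation."""
        pages = []
        with requests.Session() as session:
            for step in self.steps:
                if step.kind == "download":
                    pages.append(session.get(step.url, timeout=10).text)
                elif step.kind == "submit_form":
                    pages.append(session.post(step.url, data=step.form_data, timeout=10).text)
        return pages


# Example configuration: download a page, then submit a search form.
project = Project("demo", [
    Step("download", "https://example.org/"),
    Step("submit_form", "https://example.org/search", {"q": "crawler"}),
])
```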
Incremental Web Crawling with BUbiNG System
Ondřej, Karel ; Fajčík, Martin (referee) ; Škoda, Petr (advisor)
This bachelor's thesis deals with extending the BUbiNG system for incremental crawling. It describes the main problems of incremental web crawling and how other open-source systems approach it. As a result, BUbiNG supports re-visiting pages using two commonly used strategies: the first always re-visits a page after the same interval, while the second adjusts the interval between visits according to how frequently the page changes.
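The two re-visit strategies can be illustrated with a short sketch; this is not BUbiNG's implementation, and the interval bounds and the halving/doubling factors are assumptions.

```python
# Sketch of the two re-visit scheduling strategies from the abstract.

def fixed_interval(last_interval_s: float) -> float:
    """Strategy 1: always re-visit the page after the same interval."""
    return last_interval_s


def adaptive_interval(last_interval_s: float, page_changed: bool,
                      min_s: float = 3600, max_s: float = 30 * 86400) -> float:
    """Strategy 2: shorten the interval when the page changed since the last
    visit, lengthen it when it did not, clamped to [min_s, max_s]
    (the bounds and factors are illustrative assumptions)."""
    factor = 0.5 if page_changed else 2.0
    return min(max(last_interval_s * factor, min_s), max_s)
```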
Web API Blocking
Frandel, Martin ; Hranický, Radek (referee) ; Polčák, Libor (advisor)
The aim of this work is to capture the web APIs used on the top 1 000 000 pages of the Tranco ranking, together with their subpages, using the Web API Manager extension; to analyze and categorize the obtained data; to design a mechanism for the JShelter extension that blocks individual web APIs evaluated as tracking or advertising; and to implement and test that mechanism. In total, 2 973 276 web pages were analyzed. The captured data were aggregated with respect to web API insecurity and analyzed, and the results are described in the thesis, with some API calls being blocked up to 93.33 % of the time. I developed a method for identifying problematic APIs and, using polynomial regression, found polynomials that describe the blocking behavior towards individual web APIs and their methods. I implemented the blocking functionality in the JShelter extension and successfully tested the solution.
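As an illustration of the regression step, the sketch below fits a polynomial to made-up blocking rates with NumPy; only the fitting technique reflects the abstract, while the data points, the x-axis, and the choice of degree are assumptions.

```python
# Illustrative polynomial fit to the share of blocked calls for one API method.
# The numbers below are invented; only the technique (polynomial regression) is shown.
import numpy as np

pages_crawled = np.array([1e4, 1e5, 5e5, 1e6, 2e6, 2.97e6])
blocked_share = np.array([0.40, 0.55, 0.70, 0.80, 0.88, 0.9333])  # fraction blocked

# Fit a degree-2 polynomial describing the blocking behavior.
coeffs = np.polyfit(np.log10(pages_crawled), blocked_share, deg=2)
model = np.poly1d(coeffs)
print(model(np.log10(1.5e6)))  # predicted blocked share at 1.5 M pages
```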
Automated Collection and Structuring of Data from Web Sources
Zahradník, Roman
This diploma thesis deals with the creation of a solution for continuous data acquisition from web sources. The application automatically navigates web pages, extracts data using dedicated selectors, and then standardizes it for further processing in data mining.
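A minimal sketch of selector-driven extraction in Python follows; the selectors, field names, and example URL are assumptions, not taken from the thesis.

```python
# Sketch: download a page and pull out one value per named CSS selector,
# normalizing whitespace so downstream mining sees clean strings.
import requests
from bs4 import BeautifulSoup  # assumed HTML parser


def extract(url: str, selectors: dict) -> dict:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for name, css in selectors.items():
        node = soup.select_one(css)
        record[name] = node.get_text(strip=True) if node else None
    return record


row = extract("https://example.org/product/1",
              {"title": "h1.title", "price": "span.price"})
```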
Evaluation of the quality of IT services through the analysis of unstructured data
Zimmermann, Radim ; Vencovský, Filip (advisor) ; Karkošková, Soňa (referee)
The aim of this work is to obtain and analyze unstructured data about ISPs in the UK from the website http://www.ispreview.co.uk using the KNIME Analytics Platform. The first seven chapters describe the theoretical tools that enable me to reach this goal. In the practical part, I downloaded the data using a web crawler, analyzed the indexed search keywords, and compared two providers from the customer's perspective. The results were visualized using graphs and described. The contribution of my work takes the form of feedback and reports for business executives.
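The keyword comparison step could look roughly like the following; the thesis itself works in KNIME, so this Python fragment with placeholder review texts is only an illustration of counting and comparing keywords per provider.

```python
# Illustrative keyword-frequency comparison for two providers.
# The review texts are placeholders, not data from the thesis.
from collections import Counter
import re

reviews = {
    "provider_a": ["slow connection in the evening", "great customer support"],
    "provider_b": ["fast fibre, fair price", "support slow to respond"],
}

for provider, texts in reviews.items():
    words = re.findall(r"[a-z]+", " ".join(texts).lower())
    print(provider, Counter(words).most_common(3))
```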
Monitoring of the Internet and its benefits to business using tools from SAS Institute
Moravec, Petr ; Pour, Jan (advisor) ; Rott, Ondrej (referee)
This thesis focuses on ways of obtaining information from the World Wide Web. The introduction covers theoretical approaches to data collection; its main part is devoted to web crawlers as a means of collecting data from the Internet, followed by alternative methods such as the Google Search API. The next part of the thesis is dedicated to SAS products and their role in reporting and Internet monitoring. The SAS Intelligence Platform is presented as the company's core platform, within which concrete SAS solutions can be found; the SAS Web Crawler and Semantic Server are described as part of the SAS Content Categorization solution. While the first two parts of the thesis are theoretical, the third and closing part presents practical examples of Internet data collection, realized mainly in SAS. The practical part builds directly on the theoretical one and cannot be separated from it.
