keywords:"information retrieval" - Search Results - Digital Repository


	Designing a Multilingual Fact-Checking Dataset from Existing Question-Answering Data Kamenický, Daniel ; Aparovich, Maksim (referee) ; Fajčík, Martin (advisor) This thesis addresses the lack of multilingual fact-checking datasets that contain evidence supporting or refuting a claim. It therefore deals with converting an existing question-answering dataset into a fact-checking dataset. Two approaches to the conversion are studied. The first approach builds the dataset with the monolingual pre-trained seq-2-seq model T5. The model is trained on an English dataset, and its inputs and outputs are translated into the desired languages. The second approach uses the multilingual model mT5, which takes the input and generates output in the desired language. The multilingual model requires the training datasets to be translated. Translation proved to be the main problem of this work, reaching a success rate of about 30 % in the low-resource language. Experiments showed better results for claims generated by the monolingual model with machine translation: claims generated by the multilingual model reached a success rate of 73 %, compared with 88 % for claims from the monolingual model. The models were evaluated with a TF-IDF-based fact-verification model. The accuracy of this model on both datasets approaches 0.5, which suggests that the resulting datasets may be challenging for fact-verification models.
	Matching Images to Texts Hajič, Jan ; Pecina, Pavel (advisor) ; Průša, Daniel (referee) We build a joint multimodal model of text and images for automatically assigning illustrative images to journalistic articles. We approach the task as an unsupervised representation learning problem of finding a common representation that abstracts from individual modalities, inspired by the multimodal Deep Boltzmann Machine of Srivastava and Salakhutdinov. We use state-of-the-art image content classification features obtained from the Convolutional Neural Network of Krizhevsky et al. as input "images" and entire documents instead of keywords as input texts. A deep learning and experiment management library, Safire, has been developed. We have not been able to create a successful retrieval system because of difficulties with training neural networks on very sparse word observations. However, we have gained a substantial understanding of the nature of these difficulties and are thus confident that we will be able to improve in future work.
	Automatic suggestion of illustrative images Odcházel, Ondřej ; Pecina, Pavel (advisor) ; Holub, Martin (referee) The objective of this thesis is to implement a web application for recommending stock photos. The application takes newspaper articles in Czech or English as input and, based on the text itself, suggests appropriate stock photos. The implemented application also searches for images by visual similarity. The thesis deals with theoretical aspects of keyword extraction and text language detection. It further analyzes possibilities for efficient search for similar vectors, which are used in the search component for visually similar images. It also describes options for developing a modern web frontend and backend. The quality of the algorithm for recommending stock photos is tested on users.
	Searching relevant articles in extensive collections Vojt, Ján ; Novák, Jiří (advisor) ; Bartoš, Tomáš (referee) Searching text in articles is usually implemented with full-text search. Using more advanced techniques, however, it is possible to achieve significantly better results. The subject of this work is to create a universal library for searching extensive collections, specialized in the Czech language. The library makes use of tools capable of working with morphology while considering the importance of words. It also conducts an experiment with word pairs, which adds context to the search process. The success rate of this experiment is tested on an extensive collection of data. The resulting library is a unique tool for processing extensive collections of Czech text, and it is ready for further extension with new languages and methods.
	Analysis of Information Sources for Development of Software Application for Printing Industry Urbánek, Matyáš ; Basl, Josef (advisor) ; Lipková, Helena (referee) This diploma thesis analyzes and describes information sources for software applications and management information systems in the printing industry. First, information needs and information overload are briefly defined. Next, the concepts of the printing industry and of software for the printing industry are described, together with the related concepts of management information systems and workflow systems. In the following chapter, keywords for information retrieval are arranged in hierarchical tree structures. Core specialised journals and useful information sources were found through information retrieval. The work covers both deep web and surface web information sources. The information objects found are described and cited according to ISO 690. The thesis concludes with categorised bibliographic lists and a simple web page project for a specialised information portal.
	Semantic relation extraction from unstructured data in the business domain Rampula, Ilana ; Pecina, Pavel (advisor) ; Kuboň, Vladislav (referee) Text analytics in the business domain is a growing field in research and practical applications. We chose to concentrate on Relation Extraction from unstructured data provided by a corporate partner. Analyzing text from this domain requires a different approach that accounts for irregularities and domain-specific attributes. In this thesis, we present two methods for relation extraction. The Snowball system and the Distant Supervision method were both adapted for the unique data. The methods were implemented to use both structured and unstructured data from the company's database. Keywords: Information Retrieval, Relation Extraction, Text Analytics, Distant Supervision, Snowball
	Intelligent information retrieval and its trends Pačísková, Jana ; Papík, Richard (advisor) ; Ivánek, Jiří (referee) This thesis focuses on information retrieval in the context of its historical development. It presents trends in the integration of intelligent features into information retrieval, and thus the emergence of intelligent information retrieval. Individual intelligent elements are described in a separate chapter; the following chapter then introduces their use, including specific examples. The thesis also traces research on intelligent information retrieval in selected institutions both in the Czech Republic and abroad; the results of this survey for the Czech Republic are presented in the enclosed search.
	Brno Communication Agent Křištof, Jiří ; Fajčík, Martin (referee) ; Smrž, Pavel (advisor) The aim of this thesis is the implementation of a communication agent that provides information about Brno. The communication agent uses a three-tier architecture. Machine learning and neural network techniques are used for question answering. User tests determined a success rate of 84 %, and 58 % of the primary users were satisfied with the system. The main benefit of the work is that it makes information about Brno easier to retrieve for the city's residents and visitors.
	Multilingual Open-Domain Question Answering Slávka, Michal ; Dočekal, Martin (referee) ; Fajčík, Martin (advisor) This thesis deals with automatic multilingual open-domain question answering and proposes approaches to this little-explored domain. Specifically, it investigates whether (i) using translation from English is sufficient, (ii) multilingual systems can make use of translating the question into other languages, or (iii) it is better to use no translation at all. I compare an English system based on the T5 model, which uses machine translation, with natively multilingual systems based on the multilingual MT5 model. The English system with machine translation slightly outperforms its monolingual counterparts on several tasks. Although this model was trained on a larger amount of data, the improvement is not sufficiently significant. This shows that using natively multilingual systems is a promising approach for future research. I also present a method for retrieving documents in different languages using the BM25 algorithm and compare it with English retrieval. Using multilingual evidence appears to be beneficial and improves the performance of the system.
	Syntax in methods for information retrieval Straková, Jana Title: Information Retrieval Using Syntax Information Author: Bc. Jana Kravalová Department: Institute of Formal and Applied Linguistics Supervisor: Mgr. Pavel Pecina, Ph.D. Supervisor's e-mail address: pecina@ufal.mff.cuni.cz Abstract: In recent years, the application of language modeling in information retrieval has been studied quite extensively. Although language models of any type can be used with this approach, only traditional n-gram models based on surface word order have been employed and described in published experiments (often only unigram language models). The goal of this thesis is to design, implement, and evaluate (on Czech data) a method which would extend a language model with syntactic information, automatically obtained from documents and queries. We attempt to incorporate syntactic information into language models and experimentally compare this approach with unigram and bigram models based on surface word order. We also empirically compare methods for smoothing, stemming and lemmatization, and the effectiveness of using stopwords and pseudo relevance feedback. We perform a detailed analysis of these retrieval methods and describe their performance. Keywords: information retrieval, language modelling, dependency syntax, smoothing
