National Repository of Grey Literature: 2 records found
High-performance inverted index database
Javorský, Dávid ; Kratochvíl, Miroslav (advisor) ; Peška, Ladislav (referee)
The goal of this thesis is to implement inverted-index database software that improves the handling of raw non-textual data, which is beneficial for several areas of research. The main internal structures of the library are designed to be cache-oblivious and also aim to reduce the size of stored data. The thesis includes an overview of common inverted index implementation methods and describes related structures in a suitable cache-based model. This design resulted in an improved compression ratio and performance similar to that of currently available, highly optimized databases. A benchmark conducted on cheminformatic data has shown that the resulting software is applicable as an immediate, efficient replacement for the storage back-ends of specialized molecule databases.
Accelerating structure search in small-molecule databases
Kratochvíl, Miroslav ; Bednárek, David (advisor) ; Hoksza, David (referee)
Structure search is one of the valuable capabilities of small-molecule databases. Available chemical cartridges typically provide acceptable search performance for processing user queries, but do not scale satisfactorily with dataset size. This thesis presents Sachem, a new open-source chemical cartridge that implements a novel method of substructure search, which employs newly designed fingerprints stored in inverted indexes. The performance of the method was assessed on datasets that contain tens of millions of molecules. Comparison of the performance to that of other available cartridges revealed improvements in overall search speed, scaling potential and screen-out efficiency. Additionally, the thesis presents an application of Sachem: a SPARQL service that augments existing semantic services by including results of substructure and similarity searches in small-molecule databases. The result offers new possibilities for simpler querying of the interoperable heterogeneous data sources.
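The general idea of screening substructure queries with fingerprints stored in an inverted index can be illustrated with a minimal Python sketch. It is not Sachem's implementation: the class `FingerprintIndex`, its methods, and the string features such as "C-C" are hypothetical stand-ins for real fingerprint bits. The sketch assumes that every feature of a query must also occur in any molecule containing it, so intersecting the posting lists of the query features yields a candidate set. Such a screen typically only narrows the search; candidates would still need exact verification by a subgraph-matching step, which is not shown.

```python
from collections import defaultdict


class FingerprintIndex:
    """Toy inverted index: fingerprint feature -> set of molecule ids."""

    def __init__(self):
        self.postings = defaultdict(set)  # feature -> ids of molecules having it
        self.ids = set()                  # all indexed molecule ids

    def add(self, mol_id, features):
        """Index one molecule under each of its (hypothetical) features."""
        self.ids.add(mol_id)
        for feature in features:
            self.postings[feature].add(mol_id)

    def candidates(self, query_features):
        """Return ids of molecules that contain every query feature."""
        query_features = list(query_features)
        if not query_features:
            # An empty query screens nothing out.
            return set(self.ids)
        # Intersect the shortest posting lists first to keep the
        # intermediate result small.
        lists = sorted(
            (self.postings.get(f, set()) for f in query_features), key=len
        )
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
            if not result:
                break  # nothing can match any more
        return result


# Hypothetical toy data; real fingerprints encode far richer features.
index = FingerprintIndex()
index.add("ethanol", {"C-C", "C-O"})
index.add("acetic_acid", {"C-C", "C-O", "C=O"})
index.add("methylamine", {"C-N"})

print(index.candidates({"C-C", "C=O"}))
```

In this toy setting, the fraction of molecules removed before the exact verification step corresponds to what the abstract calls screen-out efficiency.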
