keywords:"Cosine similarity" - Search Results - Digital Repository


	Comparison of similarities of mass spectra and structures of small molecules Malíčková, Viktorie ; Galgonek, Jakub (advisor) ; Škrhák, Vít (referee) Methods for measuring the similarity of mass spectra and the structures of small molecules are crucial for advancements in medicinal chemistry, pharmacology, and metabolomics. One commonly used method for comparing the mass spectra of molecules is cosine similarity, which measures the similarity between two non-zero vectors by calculating the cosine of the angle between them. Comparing the mass spectra of molecules enables searching in molecular databases, clustering of spectra, and exploration of spectral libraries. Structural similarity is measured using various molecular fingerprints, such as Daylight, RDKit, Atom-Pair, Topological Torsion, and Extended-Connectivity fingerprints. These fingerprints are compared using similarity coefficients. These methods can be applied using bioinformatic libraries such as RDKit and CDK for generating and analyzing structural fingerprints, and the MatchMS library for comparing mass spectra. The work provides a theoretical overview of molecular descriptors, including various types of molecular fingerprints and techniques for measuring structural similarity, as well as the principles of mass spectrometry and approaches to comparing mass spectra. The practical part of the work focuses on...
	Data Mining Methods for Text Analysis Kozák, Ondřej ; Marcoň, Petr (referee) ; Dohnal, Přemysl (advisor) This bachelor thesis explores the current methodology and possibilities of text mining and applies a selection of the methods. The thesis describes methods for preprocessing, methods for converting text to a vector space, and methods for text analysis, and discusses their possible applications. Different preprocessing methods were applied to the text, and the conversion to a vector space was then demonstrated using simple methods such as BOW, bag of n-grams, and TF-IDF, as well as the machine learning methods FastText and GloVe. LSA, LDA, TextRank, and cosine similarity were applied to the resulting vectors to extract information from the text.
	Web Application of Recommender System Koníček, Igor ; Bartík, Vladimír (referee) ; Zendulka, Jaroslav (advisor) This master's thesis describes the creation of a recommender system that is used on the production server cbdb.cz. A fully operational recommender system was developed using collaborative and content-based filtering techniques. Thanks to extensive user feedback, we were able to evaluate users' opinions of the system: many of the recommended books were tagged as desirable. The thesis extends the current functionality of cbdb.cz with a recommender system that draws on the site's extensive database of ratings, users, and books.
	Automatic Topic Detection, Segmentation and Visualization of On-Line Courses Řídký, Josef ; Beran, Vítězslav (referee) ; Szőke, Igor (advisor) The aim of this work is to create a web application for automatic topic detection and segmentation of on-line courses. During playback of processed recordings, the application should be able to offer recordings from thematically related on-line courses. This document contains a description of the problem, a list of the tools used, a description of the implementation, the principle of operation, and a description of the final user interface.
	Quality Analysis of Electronic Dictionaries Transformation Stehlíková, Petra ; Škoda, Petr (referee) ; Kouřil, Jan (advisor) This bachelor's thesis deals with electronic dictionaries, their formats, and quality analysis of their conversions. The thesis describes the Lexical Markup Framework format in detail. It also discusses the capabilities of advanced algorithms such as LSA for conversion quality analysis, and the tools that can be used for the analysis. Based on this theoretical knowledge, Python scripts were created to analyze dictionaries in the Lexical Markup Framework format.
	Content Based Photo Search Bařinka, Radek ; Přibyl, Bronislav (referee) ; Španěl, Michal (advisor) This thesis deals with content-based search of photographs and with existing applications in this area. The aim is a locally running application that searches photographs by content, given a query pattern. The solution consists of a simple graphical interface, support for saving data, and reading of data from a portable local database. The application searches a given set for photographs that are similar to the given pattern, and the results are presented visually to the user. Feature detection and extraction from photo content are handled by the SURF algorithm, a visual vocabulary created with the k-means method, and a description of each photograph as a bag of words. In addition, photographs are searched by cosine similarity of vectors, enriched with an independent homography calculation and the selection of searched regions in the example photograph. The results of testing are presented at the end of the technical report.
	Algorithms for anomaly detection in data from clinical trials and health registries Bondarenko, Maxim ; Blaha, Milan (referee) ; Schwarz, Daniel (advisor) This master's thesis deals with anomaly detection in data from clinical trials and medical registries. The purpose of the work is to review the literature on data quality in clinical trials and to design a custom algorithm, based on machine learning methods, for detecting anomalous records in real clinical data from current or completed clinical trials or medical registries. The practical part describes the implemented detection algorithm, which consists of several parts: import of data from the information system; preprocessing and transformation of imported records with variables of different data types into numerical vectors; use of well-known statistical methods for outlier detection; and evaluation of the quality and accuracy of the algorithm. The algorithm outputs a vector of parameters containing anomalies, which is intended to make the data manager's work easier. The algorithm is designed to extend the functions of the information system (CLADE-IS) with automatic data quality monitoring through the detection of anomalous records.
	Automatic Testing of JavaScript Restrictor Project Bednář, Martin ; Pluskal, Jan (referee) ; Polčák, Libor (advisor) The aim of the thesis was to design, implement, and evaluate automatic tests for the JavaScript Restrictor project, which is being developed as a web browser extension. The tests are divided into three levels: unit, integration, and system. The unit tests verify the behavior of individual features, the integration tests verify the correct wrapping of browser API endpoints, and the system tests check that the extension does not suppress the desired functionality of web pages. The system tests are implemented for parallel execution in a distributed environment, which achieved an almost directly proportional reduction in run time with respect to the number of nodes. The contribution of this work is the detection of previously unknown errors in the JavaScript Restrictor extension and the provision of the information needed to fix some of the detected bugs.
	Analysis and Data Extraction from a Set of Documents Merged Together Jarolím, Jordán ; Bartík, Vladimír (referee) ; Kreslíková, Jitka (advisor) This thesis deals with mining relevant information from documents and with automatic splitting of multiple documents that have been merged together. It describes the design and implementation of software for both tasks. The thesis covers methods for acquiring textual data from scanned documents, named entity recognition, document clustering, their supporting algorithms, and metrics for automatic splitting of documents. It then explains the algorithm of the implemented software and describes the tools and techniques the software uses. Lastly, the success rate of the implemented software is evaluated, and possible extensions and further development are discussed.
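
The records above share the keyword cosine similarity, which the first abstract describes as the cosine of the angle between two non-zero vectors. The following Python sketch illustrates that definition only; it is not taken from any of the listed theses. The function names `cosine_similarity` and `bag_of_words_vectors` are hypothetical, and the whitespace tokenizer is a simplifying assumption.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The measure is only defined for non-zero vectors.
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def bag_of_words_vectors(text_a, text_b):
    """Hypothetical helper: term-count vectors over the joint vocabulary.

    Assumes naive lowercasing and whitespace tokenization.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    vocabulary = sorted(set(counts_a) | set(counts_b))
    vec_a = [counts_a[term] for term in vocabulary]
    vec_b = [counts_b[term] for term in vocabulary]
    return vec_a, vec_b


if __name__ == "__main__":
    a, b = bag_of_words_vectors(
        "cosine similarity of vectors",
        "similarity of mass spectra",
    )
    print(cosine_similarity(a, b))
```

The same function applies to any pair of equal-length numeric vectors, such as binned mass spectra or term-count vectors; orthogonal vectors score 0 and parallel vectors score 1.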
