Similarity Search in Protein Databases

Hoksza, David

One of the principal operations in bioinformatics is similarity assessment at the levels of protein sequence (a string of characters) and protein structure (a 3D shape). It is employed in a wide range of applications, such as protein structure prediction, protein function assessment, and automatic classification. Protein databases have been growing exponentially in recent years, making existing similarity retrieval methods inadequate for the volume of protein-related data. In this thesis, we focus on similarity retrieval at the protein sequence and structure levels. At both levels, we propose improvements to existing methods, as well as novel methods for managing proteins from the similarity perspective. In the first part of the thesis, we approach the problem of similarity retrieval at the protein sequence level. First, we evaluate the possibilities of using metric access methods for efficient storage and retrieval of protein sequences. Then, we focus on the protein similarity measure itself. Since the similarity computation of protein sequences is based on dynamic programming, we introduce an improvement that increases the efficiency (response time) of retrieval by reusing parts of the dynamic programming matrix, while maintaining the original effectiveness (quality of...
