National Repository of Grey Literature: 30 records found, records 1 - 10 shown
Bioinformatic Tool for Classification of Bacteria into Taxonomic Categories Based on the Sequence of 16S rRNA Gene
Valešová, Nikola ; Hon, Jiří (referee) ; Smatana, Stanislav (advisor)
This thesis deals with the automated classification and identification of bacteria after their DNA has been obtained by sequencing. It proposes and describes a new classification method based on the 16S rRNA segment. The presented principle follows the tree structure of taxonomic categories and uses well-known machine learning algorithms to classify bacteria into one of the classes at a lower taxonomic level. The thesis also includes an implementation of the described algorithm and an evaluation of its prediction accuracy. The classification accuracy of different classifier types and their settings is examined, and the setting that achieves the best results is determined. The accuracy of the implemented algorithm is also compared with several existing methods. During validation, the implemented application KTC achieved more than 45% accuracy in genus prediction on both the BLAST 16S and BLAST V4 datasets. Finally, several options for improving and extending the current implementation of the algorithm are mentioned.
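The tree-structured scheme in this abstract can be illustrated with a minimal sketch. It is not the KTC implementation: the abstract does not name the classifiers used, so a toy k-mer overlap score stands in for the machine learning model at each node, and the names `TaxonNode`, `kmer_profile`, `similarity` and `classify`, as well as the taxa and sequences, are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


def kmer_profile(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def similarity(a: Counter, b: Counter) -> int:
    """Number of k-mers shared by two profiles (multiset intersection)."""
    return sum((a & b).values())


@dataclass
class TaxonNode:
    """One taxonomic category; children are categories one level lower."""
    name: str
    profile: Counter = field(default_factory=Counter)
    children: List["TaxonNode"] = field(default_factory=list)


def make_leaf(name: str, sequences: List[str], k: int = 3) -> TaxonNode:
    """Build a leaf whose profile sums the k-mers of its reference sequences."""
    profile = Counter()
    for s in sequences:
        profile += kmer_profile(s, k)
    return TaxonNode(name, profile)


def make_parent(name: str, children: List[TaxonNode]) -> TaxonNode:
    """Build an inner node whose profile is the sum of its children's profiles."""
    profile = Counter()
    for child in children:
        profile += child.profile
    return TaxonNode(name, profile, list(children))


def classify(root: TaxonNode, seq: str, k: int = 3) -> List[str]:
    """Walk down the tree, choosing the best-scoring child at each level."""
    query = kmer_profile(seq, k)
    node, path = root, []
    while node.children:
        # Stand-in for a trained per-node classifier (assumption).
        node = max(node.children, key=lambda c: similarity(query, c.profile))
        path.append(node.name)
    return path


# Toy reference data; taxa and sequences are invented for illustration.
genus_a1 = make_leaf("GenusA1", ["AAAAAACCCC"])
genus_a2 = make_leaf("GenusA2", ["AAAAAAGGGG"])
genus_b1 = make_leaf("GenusB1", ["TTTTTTTTCG"])
root = make_parent("Bacteria", [
    make_parent("PhylumA", [genus_a1, genus_a2]),
    make_parent("PhylumB", [genus_b1]),
])

print(classify(root, "AAAAACCCC"))  # ['PhylumA', 'GenusA1']
```

Deciding one level at a time keeps each decision small, since a node only has to separate its own children. The cost is that an error made near the root cannot be corrected further down.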
Detection of Enzymes in Metagenomic Data
Smatana, Stanislav ; Martínek, Tomáš (referee) ; Hon, Jiří (advisor)
This thesis presents the specification and implementation of a system for the detection of enzymes in metagenomic data. The detection is based on a provided enzyme sequence, and its goal is to search the metagenomic sample for novel variants of that enzyme. To guarantee that the enzymes found truly have the desired catalytic function, the system employs a number of catalytic function verification methods. Their specification, implementation and evaluation are among the main contributions of this thesis. Experiments have shown that the proposed methods reach a sensitivity as high as 89%, a specificity of 95%, AUC values above 0.9 and an average throughput of 1,203 verifications per second on a regular personal computer. Evaluation of the system also led to the discovery of a partial sequence of a novel haloalkane dehalogenase enzyme in a metagenomic soil sample. The implementation can run on a personal computer as well as in a grid computing environment.
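For readers unfamiliar with the metrics quoted in this abstract, the following minimal sketch shows how sensitivity and specificity are computed from binary verification outcomes. The function name `sensitivity_specificity` and the labels are hypothetical toy data, not results from the thesis.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = has function)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    # Sensitivity: share of true enzymes that are accepted.
    # Specificity: share of non-enzymes that are rejected.
    return tp / (tp + fn), tn / (tn + fp)


# Toy outcomes for ten candidate sequences (invented for illustration).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```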
Bioinformatics Tool for Prediction of Protein Solubility
Čermák, Jiří ; Hon, Jiří (referee) ; Martínek, Tomáš (advisor)
To achieve cheaper and more efficient protein production, we must be able to predict protein solubility. In this thesis, we describe the creation of bioinformatic data sets based on the TargetTrack and eSol databases, test the features used in existing protein solubility prediction tools and create a new predictor. Even though we fail to create an effective prediction tool, we find that in most cases the old features, tested on the new data, do not correlate with protein solubility as strongly as others report for older and smaller datasets.
Tool for Classification of Lifestyle Traits Based on Metagenomic Data from the Large Intestine
Kubica, Jan ; Hon, Jiří (referee) ; Smatana, Stanislav (advisor)
This thesis deals with the analysis of the human microbiome using metagenomic data from the large intestine. The main focus is on the bacterial composition of a sample at different taxonomic levels in relation to the lifestyle traits of an individual. For this purpose, a tool for the classification of several attributes was created. It considers attributes such as diet type and eating habits (vegetarian, vegan, omnivore), gluten and lactose intolerance, body mass index, age and sex. From the range of machine learning methods, K Nearest Neighbours (kNN), Random Forest (RF) and Support Vector Machines (SVM) were used. Datasets for training and final evaluation of the classifier were taken from the American Gut project. The thesis also focuses on particular problems of metagenomic datasets, such as their multidimensionality, sparsity, compositional character and class imbalance.
Prediction of the Effect of Mutation on Protein Solubility
Velecký, Jan ; Martínek, Tomáš (referee) ; Hon, Jiří (advisor)
The goal of the thesis is to create a predictor of the effect of a mutation on protein solubility, given the protein's initial 3D structure. Protein solubility prediction is a bioinformatics problem that is still considered unsolved. Prediction from a 3D structure in particular has not yet received much attention. The text includes relevant background on proteins, protein solubility and existing predictors. The principle of the designed predictor is inspired by the Surface Patches article, and the thesis therefore also aims to validate the results achieved by its authors. The designed tool uses changes in positive regions of the electric potential above the protein's surface to make a prediction. The tool has been successfully implemented, and a series of computationally expensive experiments has been performed. It was shown that the electric potential, and hence the predictor itself, can be used successfully only for a limited set of proteins. On top of that, the method used in the article correlates with a much simpler variable - the protein's net charge.
Bacteria Classification Based on Marker Genes
Pelantová, Lucie ; Hon, Jiří (referee) ; Smatana, Stanislav (advisor)
The aim of this work is to propose a new method for bacteria classification based on the sequences of marker genes. Ten marker genes were chosen for this purpose. The resulting MultiGene classifier processes a data set by dividing it into several groups and choosing, for each group, the gene that distinguishes that group with the best results. This work describes the implementation of the MultiGene classifier and compares its results with other bacteria classifiers and with classification based entirely on the 16S rRNA gene.
Optimization of Algorithms for Triplex Detection
Hon, Jiří ; Bendl, Jaroslav (referee) ; Martínek, Tomáš (advisor)
Triplex-forming DNA sequences have been implicated as important players in several key processes, such as transcriptional regulation, DNA recombination and mutagenesis, which underlines their importance for biology, biotechnology and medicine. This bachelor thesis optimizes a recently published dynamic programming algorithm for the identification of triplex-forming sequences on three levels of design: user interface, memory usage and computation time. On the level of user interface, the algorithm was extended with existing visualization functions and rewritten as an R/Bioconductor package. Memory usage optimization and processor cache analysis, combined with a reduction of computation time based on analysis of the current computation state, led to a more than threefold speedup.
Prediction of Protein Solubility
Marušiak, Martin ; Martínek, Tomáš (referee) ; Hon, Jiří (advisor)
Protein solubility is closely related to the usability of proteins in industry and research. Successful prediction of solubility would therefore lead to significant financial savings. This work presents Solpex, a new solubility predictor based on machine learning that achieved better performance on an independent test set than any comparable solubility prediction tool. The implementation of the predictor was preceded by a study of the biological nature of solubility, an evaluation of existing solubility prediction approaches, the building of datasets, many experiments with novel features and the selection of the best features for the predictor. As the most important step in machine learning is building the datasets, this work mainly benefits from its own rigorous processing of the main source of solubility data - the TargetTrack database.
Bacteria Classification into Taxonomic Categories Based on Properties of 16s rRNA
Grešová, Katarína ; Hon, Jiří (referee) ; Smatana, Stanislav (advisor)
The main goal of this thesis was to design and implement a tool able to classify sequences of the 16S rRNA gene into taxonomic categories using the properties of the 16S rRNA gene. The created tool analyzes all input sequences simultaneously, unlike common classification approaches, which classify input sequences individually. The tool relies on the fact that bacteria contain several copies of the 16S rRNA gene, which may differ in sequence. The main contribution of this work is the design, implementation and evaluation of the capabilities of this tool. Experiments have shown that the proposed tool is able to identify the corresponding bacteria for smaller datasets and determine the correct ratios of their abundances. With larger datasets, however, the state space becomes very large and fragmented, and further improvements are required for the tool to search it efficiently.
Search of Related Enzymes
Borko, Simeon ; Smatana, Stanislav (referee) ; Hon, Jiří (advisor)
Millions of new proteins discovered each year cannot be characterized by classical biochemical methods because of the time and cost these methods require. Among the unexplored proteins, there may be enzymes useful in both industry and academia, mostly for the ecological production of chemical compounds. The result of the thesis is a web application which, based on the input proteins, searches the database for similar proteins. The proteins are filtered using essential residues of the input proteins and marked as putative biocatalysts. Finally, the proteins are annotated so that the user can make an informed decision about which proteins to select for experimental laboratory verification. The developed tool facilitates multi-step analysis and recommends proteins for experimental verification of their enzymatic function. The web interface is freely available at https://loschmidt.chemi.muni.cz/enzymeminer/. The tool was published in the international journal Nucleic Acids Research.
