Guan Lab

Department of Computational Medicine & Bioinformatics
Title: Predicting gene function in a hierarchical context with an ensemble of classifiers.
Publication Type: Journal Article
Year of Publication: 2008
Authors: Guan Y, Myers CL, Hess DC, Barutcuoglu Z, Caudy AA, Troyanskaya OG
Journal: Genome Biol
Volume: 9 Suppl 1
Pagination: S3
Date Published: 2008
ISSN: 1465-6914
Keywords: Algorithms, Animals, Bayes Theorem, Mice, Mitochondrial Proteins, Proteins, Saccharomyces cerevisiae, Saccharomyces cerevisiae Proteins
Abstract

BACKGROUND: The wide availability of genome-scale data for several organisms has stimulated interest in computational approaches to gene function prediction. Diverse machine learning methods have been applied to unicellular organisms with some success, but few have been extensively tested on higher level, multicellular organisms. A recent mouse function prediction project (MouseFunc) brought together nine bioinformatics teams applying a diverse array of methodologies to mount the first large-scale effort to predict gene function in the laboratory mouse.

RESULTS: In this paper, we describe our contribution to this project, an ensemble framework based on the support vector machine that integrates diverse datasets in the context of the Gene Ontology hierarchy. We carry out a detailed analysis of the performance of our ensemble and provide insights into which methods work best under a variety of prediction scenarios. In addition, we applied our method to Saccharomyces cerevisiae and have experimentally confirmed functions for a novel mitochondrial protein.

CONCLUSION: Our method consistently performs among the top methods in the MouseFunc evaluation. Furthermore, it exhibits good classification performance across a variety of cellular processes and functions in both a multicellular organism and a unicellular organism, indicating its ability to discover novel biology in diverse settings.

DOI: 10.1186/gb-2008-9-s1-s3
Alternate Journal: Genome Biol.
PubMed ID: 18613947
PubMed Central ID: PMC2447537
Grant List:
P50 GM071508 / GM / NIGMS NIH HHS / United States
R01 GM071966 / GM / NIGMS NIH HHS / United States
T32 HG003284 / HG / NHGRI NIH HHS / United States