Guan Lab

Department of Computational Medicine & Bioinformatics
Thu, 05/05/2016 - 13:24 -- gyuanfan
Title: Genome-wide Functional Annotation of Human Protein-coding Splice Variants Using Multiple Instance Learning.
Publication Type: Journal Article
Year of Publication: 2016
Authors: Panwar B, Menon R, Eksi R, Li H, Omenn GS, Guan Y
Journal: J Proteome Res
Date Published: 2016 May 4
ISSN: 1535-3907
Abstract

The vast majority of human multi-exon genes undergo alternative splicing and produce a variety of splice variant transcripts and proteins, which can perform different functions. These protein-coding splice variants (PCSVs) greatly increase the functional diversity of proteins. Most functional annotation algorithms have been developed at the gene level; the lack of isoform-level gold standards is an important intellectual limitation for currently available machine learning algorithms. The accumulation of a large amount of RNA-seq data in the public domain greatly increases our ability to examine the functional annotation of genes at the isoform level. In the present study, we used a multiple instance learning (MIL)-based approach to predict the function of PCSVs. We used transcript-level expression values and gene-level functional associations from the Gene Ontology database. A support vector machine (SVM)-based five-fold cross-validation technique was applied. Genes with multiple PCSVs performed better than single-PCSV genes, and performance also improved when more examples were available to train the models. We demonstrated our predictions using literature evidence for the ADAM15, LMNA/C, and DMXL2 genes. All predictions have been implemented in a web resource called 'IsoFunc', which is freely available to the global scientific community at http://guanlab.ccmb.med.umich.edu/isofunc.
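
The MIL setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each gene is a "bag" of isoform expression vectors carrying one gene-level label (+1 or -1), that a bag's score is the maximum of its instance scores, and that a small Pegasos-style linear SVM stands in for whatever SVM the study used. All function names, parameters, and the toy data are hypothetical.

```python
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Pegasos-style SGD on the hinge loss; labels are +1 / -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * dot(w, X[i])  # margin under the current weights
            w = [(1 - eta * lam) * wi for wi in w]
            if margin < 1:
                w = [wi + eta * y[i] * xi for wi, xi in zip(w, X[i])]
    return w

def bag_score(w, bag):
    """A gene (bag) scores as high as its best-scoring isoform (instance)."""
    return max(dot(w, x) for x in bag)

def train_mil(bags, labels, rounds=5):
    """Assumed MIL heuristic: start with every instance inheriting its bag
    label, then keep only the top-scoring 'witness' in each positive bag."""
    chosen = {b: list(range(len(bag))) for b, bag in enumerate(bags)}
    w = None
    for _ in range(rounds):
        X = [bags[b][i] for b in chosen for i in chosen[b]]
        y = [labels[b] for b in chosen for _ in chosen[b]]
        w = train_linear_svm(X, y)
        for b, bag in enumerate(bags):
            if labels[b] == 1:
                chosen[b] = [max(range(len(bag)), key=lambda i: dot(w, bag[i]))]
    return w

def k_fold_indices(n_bags, k=5, seed=0):
    """Split bag indices (not instances) into k disjoint folds."""
    idx = list(range(n_bags))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]

def cross_validate(bags, labels, k=5):
    """Score every bag with a model trained on the other folds."""
    scores = {}
    for held_out in k_fold_indices(len(bags), k):
        held = set(held_out)
        train = [b for b in range(len(bags)) if b not in held]
        w = train_mil([bags[b] for b in train], [labels[b] for b in train])
        for b in held_out:
            scores[b] = bag_score(w, bags[b])
    return scores

# Hypothetical toy data: 10 genes with 1-3 isoforms each; the last
# feature is a constant 1.0 acting as a bias term.
rng = random.Random(1)
toy_labels = [1, -1] * 5
toy_bags = [[[rng.gauss(lab, 0.5), rng.gauss(0, 0.5), 1.0]
             for _ in range(rng.randint(1, 3))] for lab in toy_labels]
toy_scores = cross_validate(toy_bags, toy_labels)
```

Splitting folds by gene rather than by isoform keeps all isoforms of a gene on the same side of the train/test boundary, which is the natural choice when only gene-level labels exist.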

DOI: 10.1021/acs.jproteome.5b00883
Alternate Journal: J. Proteome Res.
PubMed ID: 27142340