Using several pair-wise informant sequences for de novo prediction of alternatively spliced transcripts.

Paul Flicek, Michael R. Brent

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

BACKGROUND: As part of the ENCODE Genome Annotation Assessment Project (EGASP), we developed the MARS extension to the Twinscan algorithm. MARS is designed to find human alternatively spliced transcripts that are conserved in only one or a limited number of extant species. MARS is able to use an arbitrary number of informant sequences and predicts a number of alternative transcripts at each gene locus.

RESULTS: MARS uses the mouse, rat, dog, opossum, chicken, and frog genome sequences as pairwise informant sources for Twinscan and combines the resulting transcript predictions into genes based on coding (CDS) region overlap. Based on the EGASP assessment, MARS is one of the more accurate dual-genome prediction programs. Compared to the GENCODE annotation, we find that predictive sensitivity increases, while specificity decreases, as more informant species are used. MARS correctly predicts alternatively spliced transcripts for 11 of the 236 multi-exon GENCODE genes that are alternatively spliced in the coding region of their transcripts. For these genes a total of 24 correct transcripts are predicted.

CONCLUSION: The MARS algorithm is able to predict alternatively spliced transcripts without the use of expressed sequence information, although the number of loci in which multiple predicted transcripts match multiple alternatively spliced transcripts in the GENCODE annotation is relatively small.
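
The step of combining per-informant transcript predictions into genes by CDS overlap can be illustrated with a minimal sketch. This is not the MARS implementation: the function names (`cds_overlap`, `group_transcripts`), the half-open interval representation, the transcript labels, and the rule that any overlap of at least one base merges two transcripts into the same gene are all assumptions made for illustration. The abstract does not specify how overlap is measured, and strand and chromosome handling are omitted here.

```python
from typing import Dict, List, Tuple

# A CDS segment as a half-open interval [start, end) on one strand of one
# chromosome (assumed representation, not taken from the paper).
Interval = Tuple[int, int]


def cds_overlap(a: List[Interval], b: List[Interval]) -> bool:
    """Return True if any CDS segment of a shares at least one base with b."""
    return any(s1 < e2 and s2 < e1 for s1, e1 in a for s2, e2 in b)


def group_transcripts(transcripts: Dict[str, List[Interval]]) -> List[List[str]]:
    """Cluster transcripts into genes: transcripts linked by CDS overlap,
    directly or through other transcripts, end up in the same gene."""
    names = sorted(transcripts)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find root lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Pairwise comparison; fine for a sketch, quadratic in transcript count.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cds_overlap(transcripts[a], transcripts[b]):
                parent[find(a)] = find(b)

    genes: Dict[str, List[str]] = {}
    for name in names:
        genes.setdefault(find(name), []).append(name)
    return sorted(genes.values())


# Hypothetical predictions from three pairwise informant runs.
predictions = {
    "mouse_t1": [(100, 200), (300, 400)],
    "rat_t1": [(150, 200), (300, 380)],
    "dog_t1": [(1000, 1100)],
}
print(group_transcripts(predictions))
```

In this toy input the mouse- and rat-informed transcripts share coding sequence and are reported as alternative transcripts of one gene, while the dog-informed transcript forms its own gene.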

Original language: English
Pages (from-to): S8.1-9
Journal: Genome Biology
Volume: 7 Suppl 1
State: Published - 2006
