Abstract
High-throughput DNA sequencing is now producing collections of genomes from moderately or closely related organisms. Such a collection may be represented as a multiple alignment M of orthologous sequences, which induces a phylogenetic tree τ. Long-range genomic alignments with phylogenies have not yet found a prominent place in BLAST-like similarity search algorithms, though using them directly as databases can potentially yield more accurate and more informative alignments. This work describes how to construct local alignments between a query and a multiple alignment in a way that explicitly uses a phylogenetic tree τ. We give an EM algorithm to find a locally optimal alignment when the location of the query on the tree τ is not known. An initial implementation of the method is tested on a large multiple alignment of sequences from eight vertebrate genomes.
Original language | English |
---|---|
Pages (from-to) | 15-29 |
Number of pages | 15 |
Journal | Lecture Notes in Bioinformatics (Subseries of Lecture Notes in Computer Science) |
Volume | 3388 |
State | Published - 2005 |
Event | RECOMB 2004 International Workshop, RRCG 2004 - Comparative Genomics - Bertinoro, Italy. Duration: Oct 16 2004 → Oct 19 2004 |