High-throughput DNA sequencing is now producing collections of genomes from moderately or closely related organisms. Such a collection may be represented as a multiple alignment M of orthologous sequences, which induces a phylogenetic tree τ. Long-range genomic alignments with phylogenies have not yet found a prominent place in BLAST-like similarity search algorithms, though using them directly as databases could yield more accurate and more informative alignments. This work describes how to construct local alignments between a query and a multiple alignment in a way that explicitly uses the phylogenetic tree τ. We give an EM algorithm to find a locally optimal alignment when the location of the query on the tree τ is not known. An initial implementation of the method is tested on a large multiple alignment of sequences from eight vertebrate genomes.
|Number of pages||15|
|Journal||Lecture Notes in Bioinformatics (Subseries of Lecture Notes in Computer Science)|
|State||Published - 2005|
|Event||RECOMB 2004 International Workshop, RRCG 2004 - Comparative Genomics - Bertinoro, Italy|
|Event duration||Oct 16 2004 → Oct 19 2004|