Abstract
Large-scale comparison of genomic DNA is of fundamental importance in annotating functional elements of genomes. To perform large comparisons efficiently, BLAST and other widely used tools use seeded alignment, which compares only sequences that can be shown to share a common pattern or "seed" of matching bases. The literature suggests that the choice of seed substantially affects the sensitivity of seeded alignment, but designing and evaluating seeds is computationally challenging. This work addresses problems arising in seed design. We give the fastest known algorithm for evaluating the sensitivity of a seed in a Markov model of ungapped alignments, as well as theoretical results on which seeds are good choices. We also describe Mandala, a software tool for seed design, and show that it can be used to improve the sensitivity of alignment in practice.
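To make "sensitivity of a seed" concrete, the sketch below computes the probability that an ungapped alignment of a given length contains at least one seed hit. It is a naive dynamic program, not the algorithm described in this work, and it assumes a simplified model in which each alignment position matches independently with probability `p` (a zero-order special case of a Markov model). The function name `seed_sensitivity`, its parameters, and the encoding of a seed as a string of `1` (must match) and `0` (don't care) are illustrative assumptions.

```python
def seed_sensitivity(seed: str, length: int, p: float) -> float:
    """Probability that a random ungapped alignment contains a seed hit.

    Assumptions (not from the source): positions match independently with
    probability p; the seed is a string over {'0', '1'} where '1' marks a
    position that must match and '0' a don't-care position.
    """
    if not seed or any(c not in "01" for c in seed):
        raise ValueError("seed must be a non-empty string of '0' and '1'")

    span = len(seed)
    mask = int(seed, 2)          # required-match positions, oldest bit first
    full = (1 << span) - 1       # keeps only the most recent `span` bits

    # dist maps "last `span` alignment bits" -> probability of reaching that
    # window without any seed hit so far.
    dist = {0: 1.0}
    for i in range(length):
        nxt = {}
        for window, prob in dist.items():
            for bit, q in ((1, p), (0, 1.0 - p)):
                w2 = ((window << 1) | bit) & full
                # A hit needs a full window and a match at every '1' position.
                if i + 1 >= span and (w2 & mask) == mask:
                    continue  # this path has hit the seed; drop it
                nxt[w2] = nxt.get(w2, 0.0) + prob * q
        dist = nxt

    # Whatever probability mass survived never hit the seed.
    return 1.0 - sum(dist.values())
```

The state space grows as 2 to the power of the seed span, so this sketch is only practical for short seeds. It is meant to show what is being computed, not how to compute it quickly.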
Original language | English |
---|---|
Pages | 67-75 |
Number of pages | 9 |
State | Published - 2003 |
Event | Seventh Annual International Conference on Research in Computational Molecular Biology - Berlin, Germany; Duration: Apr 10 2003 → Apr 13 2003 |
Conference
Conference | Seventh Annual International Conference on Research in Computational Molecular Biology |
---|---|
Country/Territory | Germany |
City | Berlin |
Period | 04/10/03 → 04/13/03 |
Keywords
- Biosequence comparison
- Genomic DNA
- Mandala
- Seed design
- Similarity search