Accelerator design for protein sequence HMM search

Rahul P. Maddimsetty, Jeremy Buhler, Roger D. Chamberlain, Mark A. Franklin, Brandon Harris

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

34 Scopus citations

Abstract

Profile Hidden Markov models (HMMs) are a powerful approach to describing biologically significant functional units, or motifs, in protein sequences. Entire databases of such models are regularly compared to large collections of proteins to recognize motifs in them. Exponentially increasing rates of genome sequencing have caused both protein and model databases to explode in size, placing an ever-increasing computational burden on users of these systems.

Here, we describe an accelerated search system that exploits parallelism in a number of ways. First, the application is functionally decomposed into a pipeline, with distinct compute resources executing each pipeline stage. Second, the first pipeline stage is deployed on a systolic array, which yields significant fine-grained parallelism. Third, for some instantiations of the design, parallel copies of the first pipeline stage are used, further increasing the level of coarse-grained parallelism.

A naïve parallelization of the first stage computation has serious repercussions for the sensitivity of the search. We present a pair of remedies to this dilemma and quantify the regions of interest within which each approach is most effective. Analytic performance models are used to assess the overall speedup that can be attained relative to a single-processor software solution. Performance improvements of 1 to 2 orders of magnitude are predicted.
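The abstract mentions analytic performance models for a pipelined design with optional parallel copies of the first stage. The following is a minimal sketch of that general idea, not the model from the paper: it assumes steady-state throughput is limited by the slowest stage, and that replicating a stage divides its per-item time evenly. The function name `pipeline_speedup` and all numeric values are hypothetical, chosen only for illustration.

```python
def pipeline_speedup(t_software, stage_times, copies=None):
    """Estimate steady-state speedup of a pipeline over a software baseline.

    t_software  -- baseline time per work item on one processor (hypothetical)
    stage_times -- time per work item for one copy of each pipeline stage
    copies      -- number of parallel copies of each stage (default: 1 each)

    Assumption: throughput is set by the slowest (bottleneck) stage, and
    k copies of a stage process items k times faster in aggregate.
    """
    if copies is None:
        copies = [1] * len(stage_times)
    if len(copies) != len(stage_times):
        raise ValueError("copies and stage_times must have the same length")
    if any(c < 1 for c in copies):
        raise ValueError("each stage needs at least one copy")

    # Effective per-item time of each stage after replication.
    effective = [t / c for t, c in zip(stage_times, copies)]
    # The pipeline can go no faster than its slowest stage.
    bottleneck = max(effective)
    return t_software / bottleneck


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from the paper.
    single = pipeline_speedup(1.0, [0.08, 0.01])
    replicated = pipeline_speedup(1.0, [0.08, 0.01], copies=[4, 1])
    print(f"one first-stage copy:    {single:.1f}x")
    print(f"four first-stage copies: {replicated:.1f}x")
```

In this toy model, adding copies of the first stage helps only while that stage remains the bottleneck; once a later stage dominates, further replication yields no additional speedup.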

Original language: English
Title of host publication: Proceedings of the 20th Annual International Conference on Supercomputing, ICS 2006
Pages: 288-296
Number of pages: 9
State: Published - 2006
Event: 20th Annual International Conference on Supercomputing, ICS 2006 - Cairns, Queensland, Australia
Duration: Jun 28 2006 to Jul 1 2006

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 20th Annual International Conference on Supercomputing, ICS 2006
Country/Territory: Australia
City: Cairns, Queensland
Period: 06/28/06 to 07/1/06

Keywords

  • HMMER
  • Hidden Markov model
  • Protein motif
