TY - GEN
T1 - Exploiting coarse-grained parallelism to accelerate protein motif finding with a network processor
AU - Wun, Ben
AU - Buhler, Jeremy
AU - Crowley, Patrick
PY - 2005
Y1 - 2005
N2 - While general-purpose processors have only recently employed chip multiprocessor (CMP) architectures, network processors (NPs) have used heterogeneous multi-core architectures since the late 1990s. NPs differ qualitatively from workstation and server CMPs in that they replicate many simple, highly efficient processor cores on a chip, rather than a small number of sophisticated superscalar CPUs. In this paper, we compare the performance of one such NP, the Intel IXP 2850, to that of the Intel Pentium 4 when executing a scientific computing workload with a high degree of thread-level parallelism. Our target program, HMMer, is a bioinformatics tool that identifies conserved motifs in protein sequences. HMMer represents motifs as hidden Markov models (HMMs) and spends most of its time executing the well-known Viterbi algorithm to align proteins to these models. Our observations of HMMer on the IXP are therefore relevant to computations in many other domains that rely on the Viterbi algorithm. We show that the IXP achieves a speedup of 1.82 over the Pentium, despite the Pentium's 1.85x faster clock. Moreover, we argue that next-generation IXP NPs will likely provide a 10-20x speedup for our workload over the IXP 2850, in contrast to the 5-10x speedup expected from a next-generation Pentium-based CMP.
AB - While general-purpose processors have only recently employed chip multiprocessor (CMP) architectures, network processors (NPs) have used heterogeneous multi-core architectures since the late 1990s. NPs differ qualitatively from workstation and server CMPs in that they replicate many simple, highly efficient processor cores on a chip, rather than a small number of sophisticated superscalar CPUs. In this paper, we compare the performance of one such NP, the Intel IXP 2850, to that of the Intel Pentium 4 when executing a scientific computing workload with a high degree of thread-level parallelism. Our target program, HMMer, is a bioinformatics tool that identifies conserved motifs in protein sequences. HMMer represents motifs as hidden Markov models (HMMs) and spends most of its time executing the well-known Viterbi algorithm to align proteins to these models. Our observations of HMMer on the IXP are therefore relevant to computations in many other domains that rely on the Viterbi algorithm. We show that the IXP achieves a speedup of 1.82 over the Pentium, despite the Pentium's 1.85x faster clock. Moreover, we argue that next-generation IXP NPs will likely provide a 10-20x speedup for our workload over the IXP 2850, in contrast to the 5-10x speedup expected from a next-generation Pentium-based CMP.
UR - http://www.scopus.com/inward/record.url?scp=33746696604&partnerID=8YFLogxK
U2 - 10.1109/PACT.2005.21
DO - 10.1109/PACT.2005.21
M3 - Conference contribution
AN - SCOPUS:33746696604
SN - 076952429X
SN - 9780769524290
T3 - Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT
SP - 173
EP - 184
BT - 14th International Conference on Parallel Architectures and Compilation Techniques, PACT 2005
T2 - 14th International Conference on Parallel Architectures and Compilation Techniques, PACT 2005
Y2 - 17 September 2005 through 21 September 2005
ER -