Design of throughput-optimized arrays from recurrence abstractions

Arpith C. Jacob, Jeremy D. Buhler, Roger D. Chamberlain

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

3 Scopus citations

Abstract

Many compute-bound applications have seen order-of-magnitude speedups using special-purpose accelerators. FPGAs in particular are good at implementing recurrence equations realized as arrays. Existing high-level synthesis approaches for recurrence equations produce an array that is latency-space optimal. We target applications that operate on a large collection of small inputs, e.g. a database of biological sequences, where overall throughput is the most important measure of performance. In this work, we introduce a new design-space exploration procedure within the polyhedral framework to optimize throughput of a systolic array subject to area and bandwidth constraints of an FPGA device. Our approach is to exploit additional parallelism by pipelining multiple inputs on an array and multiple iteration vectors in a processing element. We prove that the throughput of an array is given by the inverse of the maximum number of iteration vectors executed by any processor in the array, which is determined solely by the array's projection vector. We have applied this observation to discover novel arrays for Nussinov RNA folding. Our throughput-optimized array is 2x faster than the standard latency-space optimal array, yet it uses 15% fewer LUT resources. We achieve a further 2x speedup by processor pipelining, with only a 37% increase in resources. Our tool suggests additional arrays that trade area for throughput and are 4-5x faster than the currently used latency-optimized array. These novel arrays are 70-172x faster than a software baseline.

Original languageEnglish
Title of host publicationASAP 10 - 21st IEEE International Conference on Application-Specific Systems, Architectures and Processors, Conference Proceedings
Pages133-140
Number of pages8
DOIs
StatePublished - 2010
Event21st IEEE International Conference on Application-specific Systems, Architectures and Processors, ASAP 2010 - Rennes, France
Duration: Jul 7 2010Jul 9 2010

Publication series

NameProceedings of the International Conference on Application-Specific Systems, Architectures and Processors
ISSN (Print)1063-6862

Conference

Conference21st IEEE International Conference on Application-specific Systems, Architectures and Processors, ASAP 2010
Country/TerritoryFrance
CityRennes
Period07/7/1007/9/10

Keywords

  • Dynamic programming
  • FPGA
  • Recurrences
  • Systolic array
  • Throughput optimization

Fingerprint

Dive into the research topics of 'Design of throughput-optimized arrays from recurrence abstractions'. Together they form a unique fingerprint.

Cite this