Performance modeling for highly-threaded many-core GPUs

  • Lin Ma
  • Roger D. Chamberlain
  • Kunal Agrawal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

17 Scopus citations

Abstract

Highly-threaded many-core GPUs can provide high throughput for a wide range of algorithms and applications. Such machines hide memory latencies via the use of a large number of threads and large memory bandwidth. The achieved performance, therefore, depends on the parallelism exploited by the algorithm, the effectiveness of latency hiding, and the utilization of multiprocessors (occupancy). In this paper, we extend previously proposed analytical models, jointly addressing parallelism, latency-hiding, and occupancy. In particular, the model not only helps to explore and reduce the configuration space for tuning kernel execution on GPUs, but also reflects performance bottlenecks and predicts how the runtime will trend as the problem and other parameters scale. The model is validated with empirical experiments. In addition, the model points to at least one circumstance in which the occupancy decisions automatically made by the scheduler are clearly sub-optimal in terms of runtime.

Original language: English
Title of host publication: ASAP 2014 - Proceedings of the 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 84-91
Number of pages: 8
ISBN (Print): 9781479936090
State: Published - 2014
Event: 25th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2014 - Zurich, Switzerland
Duration: Jun 18 2014 → Jun 20 2014

Publication series

Name: Proceedings of the International Conference on Application-Specific Systems, Architectures and Processors
ISSN (Print): 2160-0511
ISSN (Electronic): 2160-052X

Conference

Conference: 25th IEEE International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2014
Country/Territory: Switzerland
City: Zurich
Period: 06/18/14 → 06/20/14

Keywords

  • All-pairs Shortest Paths (APSP)
  • GPGPU
  • Performance Model
  • Threaded Many-core Memory (TMM) Model
