Stencil Computations on AMD and Nvidia Graphics Processors: Performance and Tuning Strategies
Access rights
openAccess
CC BY
publishedVersion
A1 Original article in a scientific journal
This publication is imported from Aalto University research portal.
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Language
en
Pages
23
Series
Concurrency and Computation: Practice and Experience, Volume 37, issue 12-14, pp. 1-23
Abstract
Over the last ten years, graphics processors have become the de facto accelerator for data-parallel tasks in various branches of high-performance computing, including machine learning and computational sciences. However, with the recent introduction of AMD-manufactured graphics processors to the world's fastest supercomputers, tuning strategies established for previous hardware generations must be re-evaluated. In this study, we evaluate the performance and energy efficiency of stencil computations on modern datacenter graphics processors and propose a tuning strategy for fusing cache-heavy stencil kernels. The studied cases comprise both synthetic and practical applications, which involve the evaluation of linear and nonlinear stencil functions in one to three dimensions. Our experiments reveal that AMD and Nvidia graphics processors exhibit key differences in both hardware and software, necessitating platform-specific tuning to reach their full computational potential.
Description
| openaire: EC/H2020/818665/EU//UniSDyn
Citation
Pekkilä, J, Lappi, O, Robertsen, F & Korpi-Lagg, M 2025, 'Stencil Computations on AMD and Nvidia Graphics Processors: Performance and Tuning Strategies', Concurrency and Computation: Practice and Experience, vol. 37, no. 12-14, e70129, pp. 1-23. https://doi.org/10.1002/cpe.70129