Performance analysis of sparse matrix-vector multiplication (SpMV) on graphics processing units (GPUs)

dc.contributor Aalto-yliopisto fi
dc.contributor Aalto University en
dc.contributor.author Alahmadi, Sarah
dc.contributor.author Mohammed, Thaha
dc.contributor.author Albeshri, Aiiad
dc.contributor.author Katib, Iyad
dc.contributor.author Mehmood, Rashid
dc.date.accessioned 2020-11-30T08:19:10Z
dc.date.available 2020-11-30T08:19:10Z
dc.date.issued 2020-10
dc.identifier.citation Alahmadi, S, Mohammed, T, Albeshri, A, Katib, I & Mehmood, R 2020, 'Performance analysis of sparse matrix-vector multiplication (SpMV) on graphics processing units (GPUs)', Electronics (Switzerland), vol. 9, no. 10, 1675, pp. 1-30. https://doi.org/10.3390/electronics9101675 en
dc.identifier.issn 2079-9292
dc.identifier.other PURE UUID: c35727bc-13b8-4ab8-8d98-2c4aaba45d84
dc.identifier.other PURE ITEMURL: https://research.aalto.fi/en/publications/performance-analysis-of-sparse-matrixvector-multiplication-spmv-on-graphics-processing-units-gpus(c35727bc-13b8-4ab8-8d98-2c4aaba45d84).html
dc.identifier.other PURE LINK: http://www.scopus.com/inward/record.url?scp=85092442639&partnerID=8YFLogxK
dc.identifier.other PURE FILEURL: https://research.aalto.fi/files/53384870/Performance_Analytics.electronics_09_01675_v2.pdf
dc.identifier.uri https://aaltodoc.aalto.fi/handle/123456789/61797
dc.description.abstract Graphics processing units (GPUs) have delivered remarkable performance for a variety of high performance computing (HPC) applications through massive parallelism. One such application is sparse matrix-vector multiplication (SpMV), which is central to many scientific, engineering, and other applications including machine learning. No single SpMV storage or computation scheme provides consistent and sufficiently high performance for all matrices due to their varying sparsity patterns. An extensive literature review reveals that the performance of SpMV techniques on GPUs has not been studied in sufficient detail. In this paper, we provide a detailed analysis of SpMV performance on GPUs using four notable sparse matrix storage schemes (compressed sparse row (CSR), ELLPACK (ELL), hybrid ELL/COO (HYB), and compressed sparse row 5 (CSR5)), five performance metrics (execution time, giga floating point operations per second (GFLOPS), achieved occupancy, instructions per warp, and warp execution efficiency), five matrix sparsity features (nnz, anpr, npr variance, maxnpr, and distavg), and 17 sparse matrices from 10 application domains (chemical simulations, computational fluid dynamics (CFD), electromagnetics, linear programming, economics, etc.). Subsequently, based on the insights gained through this analysis, we propose a technique called the heterogeneous CPU–GPU Hybrid (HCGHYB) scheme. It utilizes both the CPU and GPU in parallel and outperforms the HYB format with an average speedup of 1.7x. Heterogeneous computing is an important direction for SpMV and other application areas. Moreover, to the best of our knowledge, this is the first work in which SpMV performance on GPUs has been discussed in such depth. We believe that this work on SpMV performance analysis and the heterogeneous scheme will open up many new directions and improvements for the SpMV computing field in the future. en
dc.format.extent 30
dc.format.extent 1-30
dc.format.mimetype application/pdf
dc.language.iso en en
dc.publisher MDPI AG
dc.relation.ispartofseries Electronics (Switzerland) en
dc.relation.ispartofseries Volume 9, issue 10 en
dc.rights openAccess en
dc.title Performance analysis of sparse matrix-vector multiplication (SpMV) on graphics processing units (GPUs) en
dc.type A1 Original article in a scientific journal (Alkuperäisartikkeli tieteellisessä aikakauslehdessä) fi
dc.description.version Peer reviewed en
dc.contributor.department Taibah University
dc.contributor.department Department of Computer Science
dc.contributor.department King Abdulaziz University
dc.subject.keyword CSR
dc.subject.keyword CSR5
dc.subject.keyword ELL
dc.subject.keyword Graphics processing units (GPUs)
dc.subject.keyword Heterogeneous computing
dc.subject.keyword High performance computing (HPC)
dc.subject.keyword HYB
dc.subject.keyword Parallelization
dc.subject.keyword Sparse matrix storage
dc.subject.keyword Sparse matrix-vector multiplication (SpMV)
dc.identifier.urn URN:NBN:fi:aalto-2020113020642
dc.identifier.doi 10.3390/electronics9101675
dc.type.version publishedVersion
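
As an illustration of the CSR storage scheme and the row-based sparsity features named in the abstract, the following is a minimal CPU-side sketch in plain Python. It is not the authors' implementation and involves no GPU code. The function names (dense_to_csr, csr_spmv, row_features) are hypothetical, and the feature definitions are assumed readings of the abbreviations: nnz as the number of nonzeros, anpr as the average number of nonzeros per row, npr variance as the population variance of the per-row counts, and maxnpr as the maximum per-row count. The abstract does not define distavg, so it is left out.

```python
from statistics import mean, pvariance


def dense_to_csr(rows):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in rows:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # nonzero value
                col_idx.append(j)  # its column index
        row_ptr.append(len(values))  # offset where the next row starts
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Row i owns the entries values[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def row_features(row_ptr):
    """Row-based sparsity features (assumed definitions, see lead-in)."""
    npr = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
    return {
        "nnz": row_ptr[-1],
        "anpr": mean(npr),
        "npr_variance": pvariance(npr),  # population variance is an assumption
        "maxnpr": max(npr),
    }


if __name__ == "__main__":
    A = [[4, 0, 0],
         [0, 0, 2],
         [1, 3, 0]]
    values, col_idx, row_ptr = dense_to_csr(A)
    print(csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
    print(row_features(row_ptr))
```

In this layout, rows with very different nonzero counts do very different amounts of work in the inner loop, which is one reason per-row statistics such as the variance and maximum are natural features to relate to SpMV performance.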

Files in this item

There are no open access files associated with this item.
