Energy Measurement and Modeling in High Performance Computing with Intel's RAPL
School of Science | Doctoral thesis (article-based) | Defence date: 2018-04-27
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for Your own personal use. Commercial use is prohibited.
Date
2018
Language
en
Pages
72 + app. 78
Series
Aalto University publication series DOCTORAL DISSERTATIONS, 46/2018
Abstract
Significant advancements in the cloud computing paradigm have persuaded service providers to offer new and old services on cloud platforms for advantages such as elasticity, scalability, availability and cost-effectiveness. In addition, the High Performance Computing (HPC) community's goal of achieving exaflops computation by 2020 and the rapid growth in data generated and analyzed in scientific computing have paved the way for unprecedented growth in the number of server systems in data centers. As an example, CERN now produces approximately 30 petabytes of data annually, which need to be stored and analyzed for particle physics. The proliferation of applications such as social networking, video on demand and big data adds further to the total number of server systems in data centers. Such large numbers of power-hungry servers have increased the energy demand of data centers, and as a result energy efficiency in HPC, scientific computing and cloud computing is now a major concern. In this thesis, we investigate the energy consumption of server-based computing systems and propose practical solutions for measuring, modeling and analyzing the energy efficiency of such systems. Throughout the thesis, we have extensively used and analyzed Intel's Running Average Power Limit (RAPL) as an energy measurement tool. Firstly, we have used RAPL to profile the performance and energy consumption of an application. Secondly, we propose two strategies to model the power consumption of computing systems: modeling the power consumption of components inside the CPU, such as instruction decoders and L2 and L3 caches, and modeling the full system power consumption using operating system counters and RAPL. For modeling the power consumption, we have used regression-based models, statistical models and non-linear additive models.
To validate our findings, we have used real production logs from a data center as well as instances from Amazon Elastic Compute Cloud (EC2). The proposed power models predict the power consumption with promising accuracy. Thirdly, we have performed an extensive evaluation of RAPL as a power measurement tool and characterized its performance with respect to measurement overhead, accuracy, granularity, etc. This comprehensive analysis also reveals some open issues with RAPL that might weaken its usability in certain scenarios, for which we also propose solutions. Finally, to show the applicability of RAPL, we analyze the energy efficiency of two large-scale graph processing platforms: Apache Giraph and Spark's GraphX.
Description
Supervising professor
Ylä-Jääski, Antti, Prof., Aalto University, Department of Computer Science, Finland
Thesis advisor
Nurminen, Jukka K., Adj. Prof., Aalto University, Department of Computer Science, Finland
Keywords
RAPL, power modeling, big data, energy efficiency, distributed computing, energy profiling
Parts
- [Publication 1]: Kashif Nizam Khan, Filip Nybäck, Zhonghong Ou, Jukka K. Nurminen, Tapio Niemi, Giulio Eulisse, Peter Elmer, David Abdurachmanov. Energy Profiling Using IgProf. In Proceedings of 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 1115-1118, May 2015. DOI: 10.1109/CCGrid.2015.118
- [Publication 2]: Kashif Nizam Khan, Zhonghong Ou, Mikael Hirki, Jukka K. Nurminen, and Tapio Niemi. How much power does your server consume? Estimating wall socket power using RAPL measurements. Computer Science - Research and Development, Vol 31, Issue 4, pp. 207-214, August 2016. DOI: 10.1007/s00450-016-0325-4
- [Publication 3]: Mikael Hirki, Zhonghong Ou, Kashif Nizam Khan, Tapio Niemi, Jukka K. Nurminen. Empirical study of the power consumption of the x86-64 instruction decoder. In USENIX Workshop on Cool Topics on Sustainable Data Centers (CoolDC’16), Santa Clara, CA, USA, March 2016. https://www.usenix.org/node/195139.
- [Publication 4]: Kashif Nizam Khan, Mohammad A. Hoque, Tapio Niemi, Zhonghong Ou, and Jukka K. Nurminen. Energy efficiency of large scale graph processing platforms. In Proceedings of the 2016 ACM International Joint Conference in Pervasive and Ubiquitous Computing: Adjunct (UbiComp ’16 - HotPlanet Workshop), Heidelberg, Germany, pp. 1287-1294, September 2016. DOI: 10.1145/2968219.2968296
- [Publication 5]: Kashif Nizam Khan, Sanja Scepanovic, Tapio Niemi, Jukka K. Nurminen, Sebastian V. Alfthan and Olli Pekka Lehto. Analyzing the Power Consumption Behavior of a Large Scale Data Center. Accepted for publication in Computer Science - Research and Development, 9 pages, June 2017.
- [Publication 6]: Kashif Nizam Khan, Mikael Hirki, Tapio Niemi, Jukka K. Nurminen, Zhonghong Ou. RAPL in Action: Experience in Using RAPL for Power Measurements. ACM Transactions on Modeling and Performance Evaluation of Computing Systems, Vol 3, Issue 2, Article 9, March 2018. DOI: 10.1145/3177754