Cross-systems multi-level data pipelines optimization for predicting sunspot emergence
School of Science | Master's thesis
Unless otherwise stated, all rights belong to the author. You may download, display and print this publication for your own personal use. Commercial use is prohibited.
Authors
Date
2024-06-17
Department
Major/Subject
Computer Science
Mcode
SCI3042
Degree programme
Master’s Programme in Computer, Communication and Information Sciences
Language
en
Pages
87
Series
Abstract
The proliferation of big data pipelines has spurred collaboration across disciplines, notably between Computer Science and the natural sciences. Domain researchers hold insights that can significantly improve the extraction of novel and impactful findings in their fields. However, using these pipelines effectively often requires the full potential of High Performance Computing (HPC) systems. A significant challenge is that such pipelines are optimized for scientific accuracy and may therefore fail to use the available resources to their full capacity. To address this issue, this thesis explores approaches that separate the scientific development of the pipeline by the domain scientist from the HPC resource optimization by the computer scientist, and that capture the runtime conditions of processes, identify potential imbalances, and explain their underlying causes. The concept is exemplified by applying it to a pipeline proposed by Korpi-Lagg et al. [1]. We conduct a statistical analysis of this pipeline and investigate existing imbalances and areas for optimization within it. Through these efforts, the thesis aims to improve the efficiency and effectiveness of big data pipelines across diverse domains.
[1] M. J. Korpi-Lagg, A. Korpi-Lagg, N. Olspert, and H. L. Truong, “Solar-cycle variation of quiet-Sun magnetism and surface gravity oscillation mode,” Astronomy & Astrophysics, vol. 665, p. A141, Sep. 2022.
Supervisor
Truong, Hong-Linh
Thesis advisor
Korpi-Lagg, Andreas
Keywords
big data, pipelines, high performance computing, observability, monitoring, optimization