TENSAI - Practical and Responsible Observability for Data Quality-aware Large-scale Analytics
Access rights
openAccess
CC BY
preprint
A1 Original article in a scientific journal
This publication is imported from Aalto University research portal.
View publication in the Research portal (opens in new window)
View/Open full text file from the Research portal (opens in new window)
Other link related to publication (opens in new window)
Date
2024-12-19
Language
en
Pages
43
Series
Journal of Data and Information Quality, Volume 16, issue 4, pp. 1-43
Abstract
Given a large-scale mobile network with a variety of equipment and radio access network technologies serving approximately 20 million subscribers, many types of data can be used for big data analytics and machine learning (ML) tasks in network operations, monitoring, and optimization. However, this variety of data is measured, collected, and propagated through numerous complex data and software systems. Thus, people, software components, and data-driven operations for big data and ML pipelines face great challenges in dealing with data quality impacts. Data quality problems occur and propagate through complex operations involving different types of data, people, software components, and analytics, and they cannot be solved purely through data quality engineering. This article discusses our TENSAI framework, which provides practical and responsible observability for ensuring data quality in such a mobile network. TENSAI focuses on methods of communication, strategy specifications, and data quality engineering for diverse types of data and analytics across different types of operations. TENSAI presents techniques for capturing and clearly communicating the causes and effects of data quality problems to all relevant stakeholders, developing data quality-aware adaptation strategies for actions on data that can be integrated into analytics processes, and engineering data quality awareness in software and data pipelines. Thus, TENSAI supports full visibility of data quality problems and their impacts among related systems to empower the utilization and adaptation of data analytics for different types of operations. We illustrate TENSAI with several data types, pipelines, and cases from a real-world mobile network.
Description
Publisher Copyright: © 2024 Copyright held by the owner/author(s).
Keywords
Data analysis, data quality, machine learning, observability, telecommunication operations
Citation
Truong, H L & Nguyen, N N T 2024, 'TENSAI - Practical and Responsible Observability for Data Quality-aware Large-scale Analytics', Journal of Data and Information Quality, vol. 16, no. 4, 25, pp. 1-43. https://doi.org/10.1145/3708014