TENSAI - Practical and Responsible Observability for Data Quality-aware Large-scale Analytics

Access rights

openAccess
CC BY
preprint

A1 Original article in a scientific journal

Date

2024-12-19

Language

en

Pages

43

Series

Journal of Data and Information Quality, Volume 16, issue 4, pp. 1-43

Abstract

In a large-scale mobile network that serves approximately 20 million subscribers with a variety of equipment and radio access network technologies, many types of data can be used for big data analytics and machine learning (ML) tasks in network operations, monitoring, and optimization. However, these data are measured, collected, and propagated through numerous complex data and software systems. As a result, the people, software components, and data-driven operations behind big data and ML pipelines face great challenges in dealing with the impacts of data quality. Data quality problems arise and propagate through complex operations involving different types of data, people, software components, and analytics, and they cannot be solved purely through data quality engineering. This article presents TENSAI, a framework for practical and responsible observability that helps ensure data quality in such a mobile network. TENSAI focuses on methods of communication, strategy specifications, and data quality engineering for diverse types of data and analytics across different types of operations. It presents techniques for (i) capturing and clearly communicating the causes and effects of data quality problems to all relevant stakeholders, (ii) developing data quality-aware adaptation strategies for actions on data that can be integrated into analytics processes, and (iii) engineering data quality awareness into software and data pipelines. TENSAI thus supports full visibility of data quality problems and their impacts across related systems, empowering the use and adaptation of data analytics for different types of operations. We illustrate TENSAI with several data types, pipelines, and cases from a real-world mobile network.

Description

Publisher Copyright: © 2024 Copyright held by the owner/author(s).

Keywords

Data analysis, data quality, machine learning, observability, telecommunication operations

Citation

Truong, H L & Nguyen, N N T 2024, 'TENSAI - Practical and Responsible Observability for Data Quality-aware Large-scale Analytics', Journal of Data and Information Quality, vol. 16, no. 4, 25, pp. 1-43. https://doi.org/10.1145/3708014