Coordination-aware assurance for end-to-end machine learning systems: the R3E approach

Access rights

openAccess
acceptedVersion

A3 Part of a book or other compilation

Language

en

Abstract

Robustness, reliability, resilience, and elasticity are important concerns in Machine Learning (ML) systems, and they must be weighed against efficiency factors. However, they need to be supported and optimized end to end, not just for ML models. In this chapter, we present a conceptual approach to the architectural design and engineering of robustness, reliability, resilience, and elasticity (R3E) for end-to-end big data machine learning (BDML) systems at runtime. We propose quality of analytics as a contractual means for optimizing end-to-end BDML systems. Based on that, we propose to define and abstract diverse types of components as R3E objects and to devise operations and metrics for managing R3E attributes. Through a set of proposed coordination, monitoring, analytics, and testing methods, we identify essential tasks for tackling R3E concerns when developing BDML systems. Finally, we illustrate our approach with an example of an end-to-end BDML system for building object classification.

Citation

Truong, L 2022, Coordination-aware assurance for end-to-end machine learning systems: the R3E approach. in F Batarseh & L Freeman (eds), AI Assurance: Towards Trustworthy, Explainable, Safe, and Ethical AI. Elsevier, pp. 339-367. https://doi.org/10.1016/B978-0-32-391919-7.00024-X