Learning Progress Driven Multi-Agent Curriculum


Access rights

openAccess
CC BY
publishedVersion

A4 Article in a conference publication

Language

en

Pages

17

Series

Proceedings of Machine Learning Research, Volume 267, pp. 77572-77588

Abstract

The number of agents can be an effective curriculum variable for controlling the difficulty of multi-agent reinforcement learning (MARL) tasks. Existing work typically uses manually defined curricula, such as linear schemes. We identify two potential flaws when applying existing reward-based automatic curriculum learning methods to MARL: (1) the expected episode return used to measure task difficulty has high variance; (2) credit assignment difficulty can be exacerbated in tasks where increasing the number of agents yields higher returns, which is common in many MARL tasks. To address these issues, we propose to control the curriculum using a TD-error based learning progress measure and to let the curriculum proceed from an initial context distribution to the final task-specific one. Since our approach maintains a distribution over the number of agents and measures learning progress rather than absolute performance, which often increases with the number of agents, it alleviates problem (2). Moreover, the learning progress measure naturally alleviates problem (1) by aggregating returns. On three challenging sparse-reward MARL benchmarks, our approach outperforms state-of-the-art baselines.
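The idea in the abstract — maintaining a distribution over the number of agents and preferring contexts with high TD-error-based learning progress — can be illustrated with a minimal sketch. This is not the paper's implementation; all class and method names are illustrative, and learning progress is approximated here as the gap between a fast and a slow moving average of the per-episode mean absolute TD error.

```python
import random


class LearningProgressCurriculum:
    """Sketch: keep a distribution over agent counts and sample the counts
    whose aggregated |TD error| statistic is changing fastest, as a proxy
    for learning progress. Names and constants are illustrative."""

    def __init__(self, agent_counts, smoothing=0.9):
        self.agent_counts = list(agent_counts)
        self.smoothing = smoothing
        # Fast and slow exponential moving averages of the mean |TD error|
        # per context; their gap approximates learning progress.
        self.fast = {n: 0.0 for n in self.agent_counts}
        self.slow = {n: 0.0 for n in self.agent_counts}

    def update(self, n_agents, td_errors):
        # Aggregate |TD error| over an episode to reduce variance
        # (cf. problem (1) in the abstract).
        mean_abs_td = sum(abs(d) for d in td_errors) / len(td_errors)
        self.fast[n_agents] = 0.5 * self.fast[n_agents] + 0.5 * mean_abs_td
        self.slow[n_agents] = (self.smoothing * self.slow[n_agents]
                               + (1 - self.smoothing) * mean_abs_td)

    def progress(self, n_agents):
        # Learning progress: how quickly the TD-error statistic is moving,
        # rather than absolute performance (cf. problem (2)).
        return abs(self.fast[n_agents] - self.slow[n_agents])

    def sample(self, rng=random):
        # Sample the next context (number of agents) proportionally to
        # learning progress, with a small uniform floor for exploration.
        weights = [self.progress(n) + 1e-3 for n in self.agent_counts]
        return rng.choices(self.agent_counts, weights=weights, k=1)[0]
```

In a training loop one would call `update` with the episode's TD errors after each rollout and `sample` to pick the number of agents for the next episode; annealing the sampling distribution toward the final task-specific agent count, as the abstract describes, is omitted here for brevity.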

Description

Publisher Copyright: © 2025 by the author(s).

Citation

Zhao, W, Li, Z & Pajarinen, J 2025, 'Learning Progress Driven Multi-Agent Curriculum', Proceedings of Machine Learning Research, vol. 267, pp. 77572-77588. <https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhao25o/zhao25o.pdf>