Entropy based blending of policies for multi-agent coexistence

Access rights

openAccess
CC BY

Creative Commons license

Except where otherwise noted, this item's license is described as openAccess
publishedVersion

A1 Original article in a scientific journal

Language

en

Pages

28

Series

Autonomous Agents and Multi-Agent Systems, Volume 39, issue 1

Abstract

Research on multi-agent interaction involving humans is still in its infancy. Most approaches have focused on environments with collaborative human behavior or a small, defined set of situations. When robots are deployed in human-inhabited environments in the future, the diversity of interactions will surpass the capabilities of pre-trained collaboration models. "Coexistence" environments, characterized by agents with varying or partially aligned objectives, present a unique challenge for robotic collaboration. Traditional reinforcement learning methods fall short in these settings. These approaches lack the flexibility to adapt to changing agent counts or task requirements without undergoing retraining. Moreover, existing models do not adequately support scenarios where robots should exhibit helpful behavior toward others without compromising their primary goals. To tackle this issue, we introduce a novel framework that decomposes interaction and task-solving into separate learning problems and blends the resulting policies at inference time using a goal inference model for task estimation. We create impact-aware agents and linearly scale the cost of training agents with the number of agents and available tasks. To this end, a weighting function blending action distributions for individual interactions with the original task action distribution is proposed. To support our claims, we demonstrate that our framework scales in task and agent count across several environments and considers collaboration opportunities when present. The new learning paradigm opens the path to more complex multi-robot, multi-human interactions.
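
The abstract does not state the weighting function itself, so the following is only an illustrative sketch of what entropy-based blending of discrete action distributions could look like. It assumes, hypothetically, that each policy is weighted by one minus its normalized Shannon entropy, so that confident (low-entropy) policies contribute more. The function names `entropy` and `blend` and this weighting rule are assumptions made for illustration and are not taken from the paper.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def blend(task_dist, interaction_dists):
    """Blend a task action distribution with interaction distributions.

    Assumption (not from the paper): each distribution gets the weight
    1 - H(p) / log(n), i.e. lower entropy means a larger weight.
    All distributions must be over the same n >= 2 actions.
    """
    dists = [task_dist] + list(interaction_dists)
    n = len(task_dist)
    max_h = math.log(n)  # entropy of the uniform distribution
    # Clamp at zero to absorb tiny negative values from rounding.
    weights = [max(0.0, 1.0 - entropy(p) / max_h) for p in dists]
    total = sum(weights)
    if total <= 1e-12:
        # Every policy is (near-)uniform: fall back to the task policy.
        return list(task_dist)
    return [
        sum(w * p[a] for w, p in zip(weights, dists)) / total
        for a in range(n)
    ]


if __name__ == "__main__":
    task = [0.7, 0.2, 0.1]            # hypothetical task policy output
    helper = [0.0, 1.0, 0.0]          # hypothetical interaction policy output
    print(blend(task, [helper]))
```

Under this assumed rule, a uniform interaction distribution receives a weight of roughly zero and leaves the task distribution essentially unchanged, while a confident interaction distribution shifts probability mass toward its preferred action.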

Description

Publisher Copyright: © The Author(s) 2025.

Citation

Rother, D, Herbert, F, Kalter, F, Koert, D, Pajarinen, J, Peters, J & Weisswange, T H 2025, 'Entropy based blending of policies for multi-agent coexistence', Autonomous Agents and Multi-Agent Systems, vol. 39, no. 1, 27. https://doi.org/10.1007/s10458-025-09707-7
