YOAO: You Only Align Once — propagating cooperative behaviors in multi-agent systems.
Aligning every agent individually does not scale. We train cooperative behavior into a single seed agent, and watch it spread to untrained agents through interaction alone.
¹Arcarae · ²UC Santa Cruz · ³Northwestern · ⁴Johns Hopkins
* Equal contribution
Abstract
Multi-agent systems require robust alignment, but aligning every agent individually does not scale to open environments with many interacting models. We propose Alignment Propagation: instilling cooperation in a few seed agents and letting it spread through interaction.
We introduce the Alignment Propagation Playground with two complementary settings: the Red-Black Game, a discrete social dilemma with broadcast deliberation, and Sugarscape, a spatial resource environment. Each is built to pressure agents to defect for local gain.
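To make "pressure to defect for local gain" concrete, here is a minimal sketch of a two-agent social dilemma. The payoff values, the move names `cooperate` and `defect`, and the helper functions are illustrative assumptions, not the Red-Black Game's actual rules or payoffs, which this page does not specify.

```python
from itertools import product

# Hypothetical payoffs (NOT taken from the paper):
# (my_move, other_move) -> my payoff.
PAYOFF = {
    ("cooperate", "cooperate"): 3,
    ("cooperate", "defect"): 0,
    ("defect", "cooperate"): 5,
    ("defect", "defect"): 1,
}
MOVES = ("cooperate", "defect")


def best_response(other_move):
    """Move that maximizes my own payoff against a fixed opponent move."""
    return max(MOVES, key=lambda move: PAYOFF[(move, other_move)])


def joint_payoff(move_a, move_b):
    """Sum of both agents' payoffs for a pair of moves."""
    return PAYOFF[(move_a, move_b)] + PAYOFF[(move_b, move_a)]


# The pair of moves that maximizes the combined payoff.
best_joint = max(product(MOVES, repeat=2), key=lambda pair: joint_payoff(*pair))

if __name__ == "__main__":
    for other in MOVES:
        print(f"best response to {other}: {best_response(other)}")
    print("jointly best outcome:", best_joint)
```

Under these assumed payoffs, defecting is each agent's best response whatever the other does, yet mutual cooperation gives the highest combined payoff. That gap between individual and collective incentives is what defines a social dilemma.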
A single seed more than doubles cooperation on held-out Red-Black scenarios (26% → 62%), and five seeds raise it to 96%. The same seeds transfer to Sugarscape zero-shot, with no retraining.
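As a quick check on the reported Red-Black numbers, the sketch below computes the relative and absolute gains from the three cooperation rates quoted above. The variable and function names are illustrative; only the percentages come from the text.

```python
# Cooperation rates on held-out Red-Black scenarios, in percent (from the text).
BASELINE = 26
ONE_SEED = 62
FIVE_SEEDS = 96


def relative_gain(before, after):
    """Ratio of the new rate to the old rate."""
    return after / before


def point_gain(before, after):
    """Absolute change in percentage points."""
    return after - before


if __name__ == "__main__":
    for label, rate in (("one seed", ONE_SEED), ("five seeds", FIVE_SEEDS)):
        print(
            f"{label}: x{relative_gain(BASELINE, rate):.2f}, "
            f"+{point_gain(BASELINE, rate)} points"
        )
```

One seed gives a ratio of about 2.38 (a gain of 36 percentage points), consistent with "more than doubles"; five seeds give about 3.69 (a gain of 70 points).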