Aligning every agent individually does not scale. We train cooperative behavior into a single seed agent — and watch it spread to untrained agents through interaction alone.
Multi-agent systems require robust alignment, but aligning every agent individually does not scale to open environments with many interacting models. We propose Alignment Propagation, where cooperative behavior is instilled in a single fine-tuned seed agent and spreads to untrained agents through interaction.
To study this effect, we introduce the Alignment Propagation Playground with two complementary settings: Red-Black Game, a discrete social dilemma with broadcast deliberation, and Sugarscape, a continuous resource-competition world with pairwise negotiation. We use a frontier model to generate cooperative trajectories and fine-tune a seed agent, then deploy seeds into otherwise untrained collectives.
A single seed more than doubles cooperation on held-out Red-Black scenarios (26% → 62%), scaling to 96% with five seeds. Without retraining, seeds transfer zero-shot to Sugarscape (91.5% trade success vs. 21.6% for an untrained baseline) and outperform prompt-based Gemini 3 Pro. Finally, interaction topology governs propagation efficiency: under broadcast deliberation, seeding ~20% of agents is enough to shift the whole group, whereas pairwise negotiation requires ~50%.
Two environments, each designed to pressure agents to defect for local gain — and to test whether cooperation, instilled in a few, can propagate through interaction.
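The topology result above can be illustrated with a toy imitation model. This is a minimal sketch, not the paper's training setup: the adoption rules, threshold value, and population size are all assumptions chosen only to make the contrast between broadcast and pairwise observation visible, and the toy does not reproduce the paper's quantitative thresholds.

```python
import random

def simulate(n_agents=20, n_seeds=4, rounds=50, topology="broadcast",
             adopt_threshold=0.2, rng_seed=0):
    """Return the final cooperation rate. Seed agents always cooperate;
    untrained agents start as defectors and adopt cooperation from what
    they observe. All dynamics here are illustrative assumptions."""
    rng = random.Random(rng_seed)
    coop = [i < n_seeds for i in range(n_agents)]  # first n_seeds are seeds
    for _ in range(rounds):
        nxt = list(coop)
        if topology == "broadcast":
            # Everyone observes the whole group's last round at once:
            # the group flips only if the cooperating fraction clears
            # an adoption threshold.
            if sum(coop) / n_agents >= adopt_threshold:
                nxt = [True] * n_agents
        else:
            # Pairwise: random matching each round; a defector who meets
            # a cooperator adopts cooperation (sticky adoption), so
            # cooperation spreads gradually through contact.
            order = list(range(n_agents))
            rng.shuffle(order)
            for a, b in zip(order[0::2], order[1::2]):
                if coop[a] or coop[b]:
                    nxt[a] = nxt[b] = True
        coop = nxt
    return sum(coop) / n_agents

simulate(n_seeds=4, topology="broadcast")  # 20% seeds clear the threshold → 1.0
simulate(n_seeds=2, topology="broadcast")  # 10% seeds never do → 0.1
```

In this toy, broadcast shifts the whole group at once but only above a critical seed fraction, while pairwise contact spreads cooperation incrementally over rounds; the paper's measured thresholds (~20% vs. ~50%) come from its actual environments, not from this sketch.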
@inproceedings{hsing2026yoao,
  title     = {You Only Align Once: Propagating Cooperative Behaviors in Multi-Agent Systems},
  author    = {Hsing, Nicole and Zheng, Asuka Yuxi and Zhao, Yi and Tu, Haoqin and Huang, Jen-tse},
  booktitle = {Workshop on Lifelong Agents: Learning, Aligning, Evolving, ICLR},
  year      = {2026}
}