A cognitive architecture that gives language models an inner monologue — parallel reflection across goals, reasoning, and memory, synthesized between turns into a persistent self-narrative. Built for AI safety and personalization: preserves user-specific safety context and reduces sycophancy in multi-turn conversation.
Multiple cognitive theories — Global Workspace Theory, reconstructive episodic memory, inner speech, and complementary learning systems — converge on a shared set of architectural principles: parallel specialized processing, integrative synthesis into a bounded unified representation, and reconstructive rather than accumulative maintenance. We test whether these converging principles provide computational advantages when implemented in AI systems.
MIRROR operationalizes each principle as a concrete mechanism: an Inner Monologue Manager generates parallel cognitive threads (Goals, Reasoning, Memory); a Cognitive Controller synthesizes these into a bounded first-person narrative that is fully reconstructed each turn; and a temporal separation between fast response generation and slow deliberative consolidation mirrors complementary learning dynamics.
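The per-turn cycle described above can be sketched in a few lines. This is an illustrative sketch only, not the reference implementation: the thread names, prompts, and the `llm` callable are assumptions, and the key point is that the narrative is reconstructed from scratch each turn rather than appended to.

```python
# Illustrative sketch of MIRROR's reflection cycle (names and prompts
# are assumptions, not the actual implementation).
from concurrent.futures import ThreadPoolExecutor

THREADS = ("goals", "reasoning", "memory")

def reflect(llm, transcript, prior_narrative):
    """One deliberation cycle: parallel cognitive threads, then bounded synthesis."""
    # 1. Inner Monologue Manager: generate the three cognitive threads in parallel.
    with ThreadPoolExecutor(max_workers=len(THREADS)) as pool:
        futures = {
            t: pool.submit(llm, f"Reflect on the {t} evident in:\n{transcript}")
            for t in THREADS
        }
        thoughts = {t: f.result() for t, f in futures.items()}

    # 2. Cognitive Controller: reconstruct (not append to) the first-person
    #    narrative, keeping it bounded instead of growing turn after turn.
    prompt = (
        "Rewrite a first-person narrative of this conversation from scratch.\n"
        f"Previous narrative: {prior_narrative}\n"
        + "\n".join(f"{t.title()} thread: {v}" for t, v in thoughts.items())
    )
    return llm(prompt)
```

Because the controller rewrites the narrative each turn, stale or superseded constraints decay naturally instead of accumulating in an ever-growing context.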
Evaluated on multi-turn dialogue requiring constraint maintenance under attentional interference, MIRROR yields a 21% relative improvement across seven architecturally diverse language models. Ablation studies test the theoretical predictions directly: reconstructive synthesis improves all seven models (+5–20%); the integrated system outperforms either component alone for six of seven models — confirming that parallel exploration and integrative synthesis are complementary; and gains concentrate where the theories predict, under high attentional load where global availability of integrated information is most needed.
These results demonstrate that converging principles from human cognition provide architecture-general computational advantages, and generate testable behavioral predictions about working memory, inner speech, and memory consolidation.
MIRROR separates the Talker (responsible for generating immediate responses using the latest narrative) from the Thinker (responsible for asynchronously processing turns through parallel threads). This temporal decoupling enables sophisticated reasoning without adding response latency.
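A minimal sketch of the Talker/Thinker split might look like the following. All class and method names here are hypothetical; the sketch only illustrates the decoupling: the Talker replies immediately from the most recent consolidated narrative, while the Thinker updates that narrative in a background thread between turns.

```python
# Hypothetical sketch of the Talker/Thinker temporal decoupling.
# Names are illustrative, not the actual MIRROR API.
import threading

class MirrorAgent:
    def __init__(self, llm, reflect_fn):
        self.llm = llm              # fast response generation (Talker)
        self.reflect = reflect_fn   # slow deliberative consolidation (Thinker)
        self.narrative = ""         # bounded first-person self-narrative
        self._lock = threading.Lock()
        self._worker = None

    def respond(self, user_turn):
        # Talker: answer now, using whatever narrative has been consolidated.
        with self._lock:
            narrative = self.narrative
        reply = self.llm(f"Narrative: {narrative}\nUser: {user_turn}")

        # Thinker: consolidate asynchronously so the reply is not delayed.
        self._worker = threading.Thread(
            target=self._consolidate, args=(user_turn, reply), daemon=True
        )
        self._worker.start()
        return reply

    def _consolidate(self, user_turn, reply):
        new_narrative = self.reflect(user_turn, reply)
        with self._lock:
            self.narrative = new_narrative

    def join(self):
        # Wait for any pending consolidation (e.g., before the next turn).
        if self._worker is not None:
            self._worker.join()
```

The lock is the only coordination needed: the Talker reads a snapshot of the narrative, so a consolidation finishing mid-turn never corrupts an in-flight response.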
@article{hsing2025mirror,
title = {MIRROR: Cognitive Inner Monologue Between Conversational Turns
for Persistent Reflection and Reasoning in Conversational LLMs},
author = {Hsing, Nicole S.},
journal = {arXiv preprint arXiv:2506.00430},
year = {2025}
}