SYS/03 · ARCHITECTURE RESEARCH

ARIA

Recurrent reasoning architecture with adaptive pondering, grafted onto a frozen transformer backbone.

PyTorch · bfloat16 · Colab

TOTAL: ~146M params
TRAINABLE: ~22M
BACKBONE: frozen GPT-2
PRECISION: bfloat16

ARIA (Adaptive Recurrent Intelligence Architecture) tests a single thesis: that a recurrent reasoning loop over frozen transformer knowledge can add test-time reasoning depth without retraining the backbone.

KEY FACTS

Frozen GPT-2 Small (124M) + recurrent reasoning core (~20M) + halting controller (~2M): ~146M total, ~22M trainable.
Adaptive halting: the model decides when it has reasoned enough before answering.
Trained across multiple Colab runs on ARC, GSM8K, and PIQA.
Produced concrete lessons on ponder-cost tuning and output-head design.
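
As a quick check of the parameter budget above, the following sketch adds up the approximate component sizes quoted in the text. The dictionary layout and variable names are illustrative only; the counts are the rounded figures from this page, not exact measurements.

```python
# Approximate parameter counts quoted in the text, in millions.
# Each entry is (count_in_millions, is_trainable).
components = {
    "backbone (frozen GPT-2 Small)": (124, False),
    "recurrent reasoning core": (20, True),
    "halting controller": (2, True),
}

total_m = sum(count for count, _ in components.values())
trainable_m = sum(count for count, trainable in components.values() if trainable)
trainable_share = trainable_m / total_m

print(f"total ~{total_m}M, trainable ~{trainable_m}M ({trainable_share:.1%})")
# prints: total ~146M, trainable ~22M (15.1%)
```

So roughly 15% of the weights receive gradients; the rest stay fixed in the backbone.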

Model architecture

A frozen GPT-2 Small (124M) acts as the knowledge store. A trainable Recurrent Reasoning Core (~20M) iterates over it step by step, and a Halting Controller (~2M) decides when to stop thinking. That comes to roughly 146M parameters in total, ~22M of them trainable. The design borrows a loose brain-region mapping: neocortex, prefrontal cortex, basal ganglia.
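
The page does not say which halting scheme ARIA uses. The sketch below assumes an Adaptive Computation Time (ACT) style rule: each step emits a halting probability, the loop stops once the accumulated probability reaches 1 - eps (or a step cap is hit), and the output is the probability-weighted mix of the intermediate states. It is framework-free Python so it runs without dependencies; `ponder`, `toy_step`, and `toy_halt` are hypothetical stand-ins for the reasoning core and halting controller, and the initial `state` stands in for whatever the frozen backbone would supply.

```python
from typing import Callable, List, Tuple

State = List[float]


def ponder(
    state: State,
    step_fn: Callable[[State], State],
    halt_fn: Callable[[State], float],
    max_steps: int = 8,
    eps: float = 0.01,
) -> Tuple[State, int, float]:
    """ACT-style loop (assumed scheme). Returns (mixed state, steps used, remainder)."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    cumulative = 0.0
    output = [0.0] * len(state)
    for n in range(1, max_steps + 1):
        state = step_fn(state)   # one pass of the recurrent core
        h = halt_fn(state)       # halting probability in [0, 1]
        # Stop when the probability budget is nearly spent or the cap is hit.
        last = cumulative + h >= 1.0 - eps or n == max_steps
        # The final step receives the leftover mass so the weights sum to 1.
        weight = 1.0 - cumulative if last else h
        output = [o + weight * s for o, s in zip(output, state)]
        if last:
            return output, n, weight
        cumulative += h
    raise AssertionError("unreachable: the loop returns at max_steps")


def toy_step(state: State) -> State:
    # Stand-in for the reasoning core: move each value halfway toward 1.0.
    return [s + 0.5 * (1.0 - s) for s in state]


def toy_halt(state: State) -> float:
    # Stand-in for the halting controller: more willing to stop as values near 1.0.
    return min(1.0, max(0.0, sum(state) / len(state)))


mixed, steps, remainder = ponder([0.0, 0.0], toy_step, toy_halt)
print(mixed, steps, remainder)  # prints: [0.625, 0.625] 2 0.5
```

In the toy run the first step halts with probability 0.5, the second pushes the total past the threshold, and the loop stops after two steps with the remaining 0.5 of the weight assigned to the final state.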

Training and findings

Trained across multiple Google Colab runs in bfloat16 on ARC, GSM8K, and PIQA. The work surfaced practical lessons on tuning the ponder cost (the penalty that controls how long the loop runs) and on output-head design, both of which materially affect whether adaptive depth helps or hurts.
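
To illustrate why the ponder cost matters, here is a minimal sketch assuming an ACT-style objective: total loss = task loss + ponder_cost * (steps + remainder), where the remainder is the leftover halting probability assigned to the final step. The task-loss curve `toy_task_loss` is a made-up function with diminishing returns, not a measurement from ARIA, and all function names are hypothetical.

```python
def total_loss(task_loss: float, steps: int, remainder: float, ponder_cost: float) -> float:
    """Task loss plus an ACT-style ponder penalty (assumed form)."""
    return task_loss + ponder_cost * (steps + remainder)


def toy_task_loss(steps: int) -> float:
    # Made-up curve with diminishing returns; NOT a result from ARIA.
    return 1.0 / steps


def preferred_depth(ponder_cost: float, max_steps: int = 16) -> int:
    """Depth that minimises the total loss under the toy curve (remainder set to 0)."""
    return min(
        range(1, max_steps + 1),
        key=lambda n: total_loss(toy_task_loss(n), n, 0.0, ponder_cost),
    )


for ponder_cost in (0.0, 0.04, 0.25, 1.0):
    print(ponder_cost, preferred_depth(ponder_cost))
# prints:
# 0.0 16
# 0.04 5
# 0.25 2
# 1.0 1
```

Under this toy curve, a zero penalty lets the loop run to its step cap, while a large penalty collapses it to a single step, which is one way adaptive depth can end up hurting instead of helping.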
