Issue #100 · 44 min read · 22 stories

Anthropic adds /goal to Claude; Google ships Googlebook

OpenAI Daybreak ships, Linux killswitch after Dirty Frag, $127 Cloudflare D1 bill.

Anthropic added /goal to Claude Code, mirroring the command Codex shipped two weeks ago: set a completion condition in natural language and Claude keeps running turns until a separate model says you're done. Google's pre-I/O Android Show landed Googlebook, a new AI-native laptop class built around Gemini Intelligence as a system-wide layer. OpenAI's Daybreak turns Codex into a defender's cyber harness, while Linux maintainers propose a kernel killswitch after CopyFail's sequel left distros exposed for a decade.
NEWS

OpenAI unveiled Daybreak, an agentic cyber-defence stack that pairs OpenAI models with Codex as the harness. Defenders can run secure code review, threat modelling, patch validation, dependency risk analysis, and remediation inside the development loop instead of as a separate audit phase. Daybreak lands days after Google's GTIG report on adversaries automating vulnerability discovery; OpenAI positions it as the defender-first counterweight to the same capabilities.

CME Group and Silicon Data are rolling out a futures market for GPU compute, letting AI teams hedge rental-rate exposure on a standardised contract instead of swallowing spot-price volatility. The contracts settle against Silicon Data's GPU price indexes. The move normalises compute as a tradable commodity and hands procurement teams the same hedging primitive that oil and gas buyers have used for decades.
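The hedging primitive is worth making concrete. A minimal sketch, with all rates and quantities as illustrative assumptions (not actual CME or Silicon Data contract terms): a buyer who pays spot for compute but holds a long futures position locks in the futures price, because the futures payoff offsets any spot move.

```python
# Toy sketch of hedging GPU rental-rate exposure with a futures contract.
# All prices and quantities here are illustrative assumptions, not actual
# CME / Silicon Data contract terms.

def hedged_cost(spot_at_settlement: float, futures_price: float, gpu_hours: float) -> float:
    """Net cost for a buyer who longs futures against a spot purchase.

    The buyer pays the spot rate for compute, but the long futures
    position pays out (spot - futures_price) per GPU-hour, so the net
    cost is pinned at the futures price wherever spot settles.
    """
    spot_bill = spot_at_settlement * gpu_hours
    futures_payoff = (spot_at_settlement - futures_price) * gpu_hours
    return spot_bill - futures_payoff

# Whether spot spikes to $4.10/GPU-hour or collapses to $1.50, the hedged
# buyer pays the locked-in $2.40/GPU-hour on 10,000 GPU-hours.
print(round(hedged_cost(4.10, 2.40, 10_000), 2))
print(round(hedged_cost(1.50, 2.40, 10_000), 2))
```

Both calls print 24000.0: the spot leg and the futures leg cancel, which is exactly the certainty procurement teams are buying.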

Google's pre-I/O Android Show debuted Googlebook, a new AI-native laptop class built with Asus, Dell, HP, Lenovo, and Acer, alongside Gemini Intelligence as a system-wide layer across Android and ChromeOS. Vibe-coded widgets, Android XR glasses, and refreshed Android Auto rounded out the event. Release windows and prices for Googlebook were held back. The main I/O keynote on 19 May is expected to centre on Gemini frontier work.

Linux stable co-maintainer Sasha Levin submitted Killswitch, a kernel feature that lets admins disable a vulnerable function at runtime instead of waiting for a patched kernel to be built, distributed, and rebooted. The proposal follows CopyFail, the local privilege escalation covered in edition 92, and the newly disclosed Dirty Frag, which left Linux exposed for nearly a decade. If merged, fleet operators would no longer have to choose between exposure and downtime.

Adrien Baranes and Rob Marchant published four interaction principles for an AI-enabled pointer that understands what it points at and why it matters, with experimental demos running in Google AI Studio. The pitch is that AI should meet users where they work rather than forcing context-dragging into a chat window. The Googlebook laptops announced the same day are the concrete productisation of these principles.

The Wall Street Journal reports Google and SpaceX are negotiating a rocket-launch deal as part of Google's Project Suncatcher, which targets prototype orbital data centres by 2027. SpaceX is selling investors on orbital compute as the cheapest place for AI workloads ahead of its $1.75 trillion IPO. TechCrunch notes terrestrial data centres remain cheaper today once satellite construction and launch costs are factored in.

TECHNICAL

ByteByteGo walks through Pinterest's production MCP rollout: pre-built connectors for the tools internal teams already live in, OAuth and token-lifecycle plumbing handled centrally, and consistent server patterns so each new integration isn't a fresh weeks-long project. A concrete reference for teams who've stalled at the stage where every agent integration is a different auth flow and a different token refresh story.

WorkOS engineer Matt Dzwonczyk details Project Horizon, an internal code factory where every agent run can both ship work and feed improvements back into the platform. The framing is orchestration as the real constraint: deciding what to work on next, wiring context, running verification, getting changes safely into production. Useful template for teams weighing whether to build an internal coding-agent platform or buy one.

Justin Ahinon opened his April Cloudflare invoice to find $127.60 of the total came from 127.6 billion D1 row reads on a database with 765,000 rows. The post-mortem walks the four fixes that cut 95% of the cost: composite indexes via Drizzle, ANALYZE to wake the query planner, a KV cache layer for nav data, and generalising the pattern across read-heavy routes. Required reading for serverless-SQLite teams.
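The numbers in the post reconcile almost exactly. A back-of-envelope check, assuming a rate of $0.001 per million rows read (the per-million price is our assumption; the read and row counts are from the post), also shows why the bill is about row reads, not table size: 127.6 billion reads against a 765,000-row table is the equivalent of scanning the whole table roughly 167,000 times.

```python
# Back-of-envelope check on the D1 bill. ROWS_READ and ROWS_IN_TABLE are
# from the post-mortem; the per-million-reads price is an assumed rate.

ROWS_IN_TABLE = 765_000
ROWS_READ = 127_600_000_000          # 127.6 billion row reads in the month
PRICE_PER_MILLION_READS = 0.001      # assumed $/million rows read

bill = ROWS_READ / 1_000_000 * PRICE_PER_MILLION_READS
full_scans = ROWS_READ / ROWS_IN_TABLE  # implied full-table scans

print(f"bill: ${bill:.2f}")                      # bill: $127.60
print(f"implied full scans: {full_scans:,.0f}")  # implied full scans: 166,797
```

That implied-scan count is the tell: without a usable index, every read-heavy route pays the full table, which is why the composite indexes and the KV cache layer did most of the work.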

Lewis Lin's What Breaks at Scale works the failure math: even at 99.9% per-GPU uptime, the probability of a 30-day 1,000-GPU training run completing with zero failures is essentially zero. The piece sizes checkpoint cadence and recovery budget from first principles rather than vibes. Useful for any team about to spin up a multi-week multi-GPU run who hasn't yet had a midnight pager.
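The headline claim is easy to verify under a simple independence assumption (our reading, not necessarily Lin's exact model): treat 99.9% uptime as a 0.1% chance of any given GPU failing on any given day, and the clean-run probability for 1,000 GPUs over 30 days collapses.

```python
# Failure math under an assumed model: each GPU independently has a
# 0.1% chance of failing on any given day (one reading of "99.9%
# per-GPU uptime"). 1,000 GPUs x 30 days = 30,000 GPU-days of exposure.

per_gpu_daily_uptime = 0.999
gpus = 1_000
days = 30

# Probability every GPU survives every day of the run.
p_clean_run = per_gpu_daily_uptime ** (gpus * days)

# Expected failure count over the run, the number that actually sizes
# checkpoint cadence and recovery budget.
expected_failures = (1 - per_gpu_daily_uptime) * gpus * days

print(f"P(zero failures): {p_clean_run:.2e}")       # ~9e-14
print(f"expected failures: {expected_failures:.0f}")  # ~30
```

Zero-failure completion is on the order of 10^-14, and you should expect roughly 30 failures over the run; checkpoint cadence follows from how much recompute each of those 30 interruptions is allowed to cost.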

ANALYSIS

Michael Nygard argues microservices were always a technical fix to an organisational problem from the early VC-funded land-grab era: letting product teams ship independently while founders raced for first-mover advantage. With AI flattening team sizes and absorbing coordination overhead, the case for splitting into ever-smaller services weakens. The essay reads as permission to consolidate, not a takedown of the architecture itself.

A senior engineering leader at a large financial institution shipped AI coding agents into the dev workflow. Velocity climbed, then internal audit asked who approved an agent-opened MR updating a payment dependency, what inputs and prompts the agent used, what policy checks ran at MR time, and how to reproduce it. The team couldn't answer. The piece is the playbook for what agentic CI/CD compliance actually requires.

curl maintainer Daniel Stenberg revisits April's Anthropic Mythos hype, when the company said the model was so good at finding security flaws that release would be staggered to selected companies first. His tally: Mythos found a single curl vulnerability. The post is a calibration check on bug-finding model marketing and the gap between vendor capability claims and what shows up in well-tested open-source code.

Jeff Gothelf builds on Steve Blank's recent Stanford cohort, where eight teams arrived day one with MVPs that would have taken months a year ago, built with Claude Code, Replit, v0, Granola, and Perplexity. Blank's verdict: AI-built MVPs are an accidental denial of service attack on the search for a repeatable business model. Velocity outran validation; teams outsourced analysis to AI until the output became slop.

Tom Tunguz logged five weeks of personal work and found 41.8% of his 1,476 tasks (email, scheduling, summarisation, admin) ran fine on a local 35B model. Harder reasoning still needs cloud frontier models. Pairs with Armin Ronacher's note last week that local models are runnable but not finished. The practical question for builders is which workflows to route locally first, and Tunguz's taxonomy is a starting point.

The New Stack reports juniors are completing tasks 55% faster with AI assistance while 73% of organisations have reduced junior hiring over the past two years. The author traces a pattern: tests pass, review looks clean, then a timing bug surfaces and the junior who shipped it cannot explain why because they didn't write it. Sits uncomfortably next to Anthropic's claim that coding is solved.

Ed Zitron interrogates the gigawatts-coming-online narrative against what is actually being absorbed into the market. CBRE logged net absorption of 2,497 megawatts across primary markets for all of 2025, with quarterly figures between 700MW and 2GW, even as press releases described a 25GW funnel. His exhibit A: Anthropic taking on xAI's years-old Colossus-1, full of H100s and H200s from a competitor whose CEO called the company evil.

TOOLS

OpenClaw shipped Peekaboo 3, a macOS automation toolkit with pixel-accurate screen capture (windows, menus, Retina 2x), a natural-language agent that chains see, click, type, scroll, hotkey, menu, window, app, and dock primitives, plus an MCP server for delegating to Claude or other agents. Action-first design with synthetic-input fallback for apps that don't expose accessibility APIs. Bridges the gap between AI talking about your screen and AI using your screen.

Cactus Compute open-sourced Needle, a 26M-parameter Simple Attention Network distilled from Gemini 3.1 specifically for tool calling. It clocks 6,000 tokens per second prefill and 1,200 decode on Cactus, with weights and the dataset generation pipeline on Hugging Face. Targeted at builders shipping on-device agents that need fast structured tool calls without round-tripping every request to a frontier model.

Cognition rolled out Devin for Terminal, a CLI agent that begins on your machine and hands off to a remote sandbox when the work runs long. The pitch is that Devin keeps working while you don't: the local session resumes in the cloud with its own computer, then syncs back. Positions Cognition against Claude Code and Codex CLI for engineers who don't want their laptop tied up overnight.

Nvidia's Jensen Huang and ServiceNow's Bill McDermott unveiled OpenShell, an Apache 2.0 sandbox runtime for enterprise AI agents. The argument behind it: the lowest level of the stack should be a sandbox, and the agent should not be interacting directly with the operating system, host, or network. Hands security review a vendor-backed isolation primitive instead of bespoke per-team containers.