Issue #78 · 30 min read · 15 stories

OpenAI Reserves IPO Shares for Retail + Meta Ships Muse Spark

Meta ships Muse Spark. GLM-5.1 goes long-horizon. DHH barely writes code now. Tokenmaxxing.

Meta Superintelligence Labs shipped Muse Spark, its first model in the Muse series, with multimodal reasoning and parallel sub-agents now powering Meta AI. Zhipu AI released GLM-5.1, an agentic model that stays productive over hundreds of iterations where previous models plateau. Also in this edition: 29% of Fortune 500 are now live paying customers of AI startups, and DHH went from typing every line to an agent-first workflow in six months.

NEWS

Zhipu AI's GLM-5.1 targets long-horizon agentic tasks, reaching state-of-the-art on SWE-Bench Pro and leading on NL2Repo and Terminal-Bench 2.0. Where previous models exhaust their repertoire early and plateau, GLM-5.1 stays productive over hundreds of rounds and thousands of tool calls, revising strategy and running experiments. The longer it runs, the better the result.

OpenAI CFO Sarah Friar told CNBC the company will hold a slice of its IPO for individual investors, citing strong retail demand during the latest funding round. She pointed to SpaceX's approach of reserving nearly 30% for retail buyers as a model. Enterprise revenue now makes up 40% of OpenAI's business and is on track to match consumer revenue by the end of 2026.

Founded by two former DeepMind researchers who helped create AlphaGo, Reflection AI raised $2 billion at an $8 billion valuation, a 15x jump from seven months ago. The 60-person team is positioning itself as an open-source Western counterpart to DeepSeek, challenging closed labs like OpenAI and Anthropic. A frontier model trained on tens of trillions of tokens is planned for next year.

Meta Superintelligence Labs shipped Muse Spark, the first model in a new series, built on an AI stack the lab rebuilt over nine months. The model powers Meta AI with dual modes for instant answers and deeper reasoning, handles multimodal tasks like identifying products from photos, and launches parallel sub-agents for complex questions. It is rolling out to WhatsApp, Instagram, and Messenger in the coming weeks.

TECHNICAL

The third instalment in the Agentic Design Systems series walks through a self-healing architecture. Claude Code sits at the centre, connected via MCP to Figma, GitHub, Storybook, PostHog, and Sentry. The system detects design drift automatically, generates PRs for fixes, and learns from corrections. The author tested multiple AI tools and found Claude Code delivered the strongest results for design system reasoning.

Every Friday, a Spotify team packages hundreds of code changes from dozens of engineering teams into a single update reaching 675 million users. Over 95% of releases ship without issues. The piece walks through a release's two-week journey from code merge to production, showing how trunk-based development combined with a release architecture makes speed and safety reinforce each other.

Published on martinfowler.com, this piece proposes a structured practice for harvesting signal from AI coding sessions and feeding it back into shared team artefacts. Most teams plateau with AI tools because individual intuition about effective prompts stays personal. The feedback flywheel identifies four types of signal worth capturing and outlines how to convert scattered individual experience into collective improvement.

Linux malware often hides in Berkeley Packet Filter programs, remaining dormant until a specific "magic" packet arrives. Reverse-engineering these filters by hand takes hours. Cloudflare built a tool using the Z3 theorem prover to work backward from malicious filters and generate trigger packets in seconds. The approach treats BPF bytecode as constraints rather than instructions, turning a manual security bottleneck into an automated one.
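To make the "constraints rather than instructions" idea concrete, here is a minimal sketch. It is not Cloudflare's tool and does not use Z3: the three-opcode instruction set, the `FILTER` program, and the `find_trigger` and `run_filter` helpers are all hypothetical, and a small hand-rolled solver for byte equalities stands in for the theorem prover. Instead of executing the filter on a packet, it walks every branch, records what each branch demands of the packet bytes, and builds a packet that satisfies an accepting path.

```python
from typing import Optional

# Toy instruction set loosely modelled on classic BPF (hypothetical encoding):
#   ("ldb", offset)        load the packet byte at offset into the accumulator
#   ("jeq", value, jt, jf) skip jt instructions if accumulator == value, else jf
#   ("ret", accept)        stop; accept is True when the packet matches
FILTER = [
    ("ldb", 0),
    ("jeq", 0x7F, 0, 3),  # byte 0 must be 0x7f, otherwise jump to reject
    ("ldb", 3),
    ("jeq", 0x42, 0, 1),  # byte 3 must be 0x42, otherwise jump to reject
    ("ret", True),
    ("ret", False),
]


def solve(eq, ne, length) -> Optional[bytes]:
    """Build a packet meeting the equality and inequality constraints."""
    packet = bytearray(length)
    for offset in range(length):
        if offset in eq:
            packet[offset] = eq[offset]
            continue
        banned = ne.get(offset, set())
        value = next((v for v in range(256) if v not in banned), None)
        if value is None:  # every byte value is excluded: path is infeasible
            return None
        packet[offset] = value
    return bytes(packet)


def find_trigger(program) -> Optional[bytes]:
    """Explore every branch, collecting constraints until a path accepts."""
    length = 1 + max(op[1] for op in program if op[0] == "ldb")
    # Work item: (pc, offset held in accumulator, required bytes, forbidden bytes)
    stack = [(0, None, {}, {})]
    while stack:
        pc, acc, eq, ne = stack.pop()
        op = program[pc]
        if op[0] == "ldb":
            stack.append((pc + 1, op[1], eq, ne))
        elif op[0] == "jeq":
            _, value, jt, jf = op
            # Branch taken: packet[acc] == value must hold on this path.
            if eq.get(acc, value) == value and value not in ne.get(acc, set()):
                stack.append((pc + 1 + jt, acc, {**eq, acc: value}, ne))
            # Branch not taken: packet[acc] != value must hold on this path.
            if eq.get(acc) != value:
                forbidden = {**ne, acc: ne.get(acc, set()) | {value}}
                stack.append((pc + 1 + jf, acc, eq, forbidden))
        elif op[0] == "ret" and op[1]:
            packet = solve(eq, ne, length)
            if packet is not None:
                return packet
    return None


def run_filter(program, packet) -> bool:
    """Plain interpreter, used only to check the generated packet."""
    pc, acc = 0, 0
    while True:
        op = program[pc]
        if op[0] == "ldb":
            acc, pc = packet[op[1]], pc + 1
        elif op[0] == "jeq":
            pc += 1 + (op[2] if acc == op[1] else op[3])
        else:
            return op[1]


trigger = find_trigger(FILTER)
print(trigger.hex(), run_filter(FILTER, trigger))
```

Real filters also use arithmetic, bit masks, and wider loads, which is where a general-purpose solver such as Z3 earns its place over a hand-written one like this.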

ANALYSIS

Battery Ventures argues that AI coding agents are reshaping how developer tools get distributed. With agents writing most of the code, the bottleneck shifts to teaching them how to use your tool. "Agent skills" are installable context packages that equip coding agents with tool-specific knowledge, acting as programmable solutions engineers at the point of code generation.

Meta employees compete on an internal "Claudeonomics" leaderboard, with 60 trillion tokens consumed in 30 days. Jensen Huang says any $500K engineer spending less than $250K a year on tokens would "deeply alarm" him. Token budgets are becoming a fourth component of compensation alongside salary, equity, and bonus. The piece argues this spending culture is the industry's most expensive mistake so far.

Andreessen Horowitz compiled hard data on enterprise AI adoption, finding 29% of the Fortune 500 and 19% of the Global 2000 have signed top-down contracts with AI startups and gone live in production. This penetration in just over three years is remarkable for enterprises not known as early adopters. The report pushes back on the MIT study claiming 95% of generative AI pilots fail.

Six months ago, DHH told Lex Fridman he typed out all his code by hand. Now he takes an agent-first approach and barely writes code directly. In this Pragmatic Engineer episode, he discusses how AI agents changed his workflow while his quality standards stayed the same. 37signals' design philosophy and the role of taste in software also feature.

TOOLS

OpenOwl connects to Claude Code, Codex, or any MCP-compatible AI and automates desktop tasks through screen observation, clicking, and typing. Install with one command, describe what you need in plain language, and watch it navigate apps and browsers. Use cases range from LinkedIn lead scraping to Shopify price updates. Over 400 subscriptions sold, with early pricing at $3.99 per month.

Like BrowserUse but for the terminal. TUI-use spawns any program in a PTY, renders the screen as clean text via a headless xterm emulator, and lets agents send keystrokes. It fills the gap where AI agents stall the moment a program requests input. Works with REPLs, database CLIs, SSH sessions, and full-screen apps like vim and lazygit.
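The core mechanism, running a program in a PTY and feeding it keystrokes, can be sketched with Python's standard library on a Unix-like system. This is not TUI-use's code or API: the `drive` and `read_until` helpers and the tiny child program are hypothetical, and the sketch skips the headless xterm rendering step, so it returns raw terminal bytes rather than a clean screen.

```python
import os
import pty
import select
import subprocess
import sys

# Hypothetical interactive program: it blocks until someone types a name.
CHILD = "name = input('name? '); print('hello ' + name)"


def read_until(fd, marker, timeout=5.0):
    """Collect PTY output until marker appears, the child closes, or we time out."""
    buf = b""
    while marker not in buf:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            break
        try:
            chunk = os.read(fd, 1024)
        except OSError:  # Linux reports EIO once the child side is closed
            break
        if not chunk:
            break
        buf += chunk
    return buf


def drive(child_code, wait_for, keystrokes, until):
    """Run child_code in a PTY, wait for a prompt, type keystrokes, read the reply."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdin=slave, stdout=slave, stderr=slave, close_fds=True,
    )
    os.close(slave)  # the child keeps its own copy of the slave end
    try:
        prompt = read_until(master, wait_for)
        os.write(master, keystrokes)  # what an agent would "type"
        reply = read_until(master, until)
        proc.wait(timeout=5)
    finally:
        if proc.poll() is None:
            proc.kill()
        os.close(master)
    return prompt, reply


prompt, reply = drive(CHILD, b"name? ", b"agent\n", b"hello agent")
print(prompt, reply)
```

Because the child sees a real terminal on stdin and stdout, it behaves as it would for a human user, which is what lets this approach work with REPLs and full-screen programs that refuse to run over a plain pipe.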

A developer who kept solving the same problems manually in Claude Code turned repeating workflows into armory, a collection of 106 production-grade packages. It includes 11 orchestrator agents for multi-phase work, 60+ skills across development and research, hooks for backup and cost tracking, and presets for different coding styles. Install with a single npx command.