Issue #94·Tuesday, May 5, 2026·38 min read·19 stories

Anthropic and OpenAI Launch Dueling Enterprise Ventures

Cerebras prices IPO at $26.6B. Uber burned its token budget by April. Anthropic red-teams Jupiter.

Anthropic and OpenAI both announced enterprise joint ventures within hours of each other, deploying a combined $5.5 billion to embed engineers inside companies and redesign workflows around agents. Meanwhile GitHub is processing 275 million commits a week (up from 1 billion in all of 2025) and actively losing availability. Cerebras priced the largest tech IPO of 2026, and Anthropic is red-teaming a new model called Jupiter ahead of tomorrow's Code with Claude event.

NEWS

Chinese Court Rules Companies Cannot Fire Workers Just Because AI Replaced Them

The Hangzhou Intermediate People's Court ruled that a company illegally fired a quality assurance supervisor after automating his role. The worker refused a demotion and salary cut from 25,000 to 15,000 yuan per month when AI took over his monitoring duties. The court held that AI automation is a business choice, not legal grounds for dismissal. The ruling has no direct precedent outside China but signals where employment law may head.

Anthropic and OpenAI Both Launch Enterprise Joint Ventures on the Same Day

Anthropic partnered with Blackstone, Goldman Sachs, and Hellman & Friedman on a $1.5 billion venture to deploy Claude inside portfolio companies. Hours earlier, OpenAI announced The Development Company, raising $4 billion from 19 investors at a $10 billion valuation. Both ventures embed engineers inside mid-sized companies to redesign workflows around AI agents, targeting the same talent bottleneck from opposite sides.

Cerebras Prices IPO at Up to $26.6 Billion, the Largest Tech Offering of 2026

The AI chipmaker will sell 28 million shares at $115 to $125 each, raising $3.5 billion. Cerebras claims its Wafer-Scale Engine 3 is faster for inference while using less power than GPU-based alternatives. Sam Altman, Greg Brockman, and Ilya Sutskever are among the angel investors who stand to gain. If successful, the offering proves appetite for even larger AI IPOs still in the wings.

Uber Burned Through Its Annual Token Budget by April, Now Rethinking Hiring

In a Verge interview, CEO Dara Khosrowshahi confirmed Uber exhausted its entire 2026 token budget in four months as AI coding tools consumed compute faster than planned. The company is rethinking how fast it hires as it spends more on tokens. Uber is also expanding into hotels via Expedia and positioning itself as an everything app, but the token admission is the signal worth watching.

AWS Brings OpenAI Models to Bedrock, Ending the Exclusivity Era in Cloud AI

Amazon launched GPT-5.4 on Bedrock in limited preview, with GPT-5.5 following within weeks. The move came 24 hours after OpenAI and Microsoft restructured their exclusive partnership. AWS also unveiled an agentic developer framework, a desktop productivity tool called Amazon Quick, and expanded Amazon Connect into four AI solutions targeting supply chains, hiring, healthcare, and customer experience.

Anthropic Is Red-Teaming a New Model Called Jupiter Ahead of Tomorrow's Event

TestingCatalog spotted Anthropic running safety probes on an internal build tagged Claude Jupiter V1. The codename follows the company's pattern of using planet names for pre-release models, with Neptune preceding Claude 4 under similar timing last year. Anthropic's Code with Claude developer conference in San Francisco is May 6, with London and Tokyo dates following later in spring.

GitHub Is Losing Nines: 275 Million Commits a Week as Agents Flood the Platform

GitHub processed 1 billion commits in all of 2025. Now it handles 275 million in a single week, putting it on track for 14 billion this year. COO Kyle Daigle confirmed the platform is actively losing availability as agentic coding tools generate commit volumes the infrastructure was never designed to handle. The numbers make GitHub's recent admission that their 10x scaling plan was insufficient feel conservative.

TECHNICAL

How OpenAI Rearchitected WebRTC to Serve Voice AI to 900 Million Users

OpenAI's real-time team rebuilt their WebRTC stack to solve three constraints colliding at scale: one-port-per-session media termination, stateful ICE/DTLS sessions needing stable ownership, and global routing for low first-hop latency. The solution splits relay handling from transceivers, preserving standard WebRTC behaviour for clients while changing how packets route internally. The post details the full split relay plus transceiver architecture.

Why Figma Replaced PgBouncer With a Custom Connection Manager

Figma outgrew PgBouncer as feature growth and traffic pushed novel workloads onto their horizontally and vertically sharded Postgres fleet. PGKeeper replaces it with a scalable connection and load management service sitting between their query router (DBProxy) and Postgres instances. Each database machine gets a dedicated set of pooler replicas in an n-to-1 relationship. The post walks through the design constraints and rollout.

antirez Spent Four Months Building a Redis Array Type. AI Changed How, Not Whether.

Salvatore Sanfilippo paired with Opus and then GPT 5.x to design and implement a new sparse array data type for Redis. The specification evolved through back-and-forth challenges about the right level of indirection. When two directory levels proved insufficient, he added a third because AI made the extra complexity tractable. Four months of work that would have taken the same duration without AI, just with less ambition.

Inside Stripe's AI Prototyping Tool That Turned PMs Into Designers

Design manager Owen Williams built Protodash, an internal platform using Cursor rules, MCP integrations, and Stripe's design system to let anyone create high-fidelity dashboard prototypes without writing code. What started as a bundle of Cursor rules evolved into a full web-based studio running in dev boxes. PMs now use it as much as designers, replacing memos with live prototypes that double as engineering handoffs.

ANALYSIS

Armin Ronacher Tracked 90 Days of AI Coding Output. Some Words Are Suspiciously Inflating.

The Flask creator analysed medium-frequency words across three months of local coding sessions and compared them against historical baselines from wordfreq. Words like capability, robust, and comprehensive showed pronounced inflation in agent output. A Google Trends cross-reference confirmed the pattern extends beyond his sessions. Something is shifting baseline language, and the obvious suspect is LLM-generated text saturating developer tooling.

Jack Clark: 60% Chance of Fully Automated AI Research by End of 2028

The former OpenAI policy director makes his case that no-human-involved AI R&D is likely within two years. He points to all pieces being in place for automating the production of today's AI systems, citing public papers and deployed products as evidence. Clark calls it a Rubicon into a nearly-impossible-to-forecast future and plans to spend most of 2026 working through the implications.

GitHub's CTO Admitted 10x Scaling Wasn't Enough. They Need 30x.

When GitHub's CTO says a scaling plan already in flight has to be torn up because code volume is growing faster than anticipated, that is not a routine capacity announcement. The article argues the bottleneck is validation, not production. Code review, test suites, and staging environments were never designed to absorb output at agent rates. Incremental fixes to existing SDLC pipelines will not hold.

Nathan Lambert: 'Distillation Attack' Is a Term That Will Poison AI Policy

Interconnects.ai argues that calling Chinese labs' API extraction attempts distillation attacks will permanently associate a legitimate training technique with corporate espionage. Distillation is how labs create smaller, cheaper models for customers. The piece draws a parallel to how open source vs open weights collapsed into confusion, and warns that policy responses targeting the method risk collateral damage to academic research.

You Are Not Immune to Mode Collapse

A LessWrong post reframes mode collapse beyond its machine learning origins into a general phenomenon affecting organisations, creative work, and hiring. Grant-making bodies converge on the same bets. Bands sound identical after the third album. The argument is that mode collapse happens whenever a system optimises against its own outputs without external correction, making diversity load-bearing even when there are no gains from trade.

TOOLS

Rapid-MLX: Local Inference on Apple Silicon at 4.2x Ollama's Speed

A new MLX-based engine claims 0.08 second cached time-to-first-token and 4.2x the throughput of Ollama on Apple Silicon hardware. It includes 17 tool parsers, prompt caching, reasoning separation, and cloud routing as a drop-in OpenAI-compatible replacement. Works with Claude Code, Cursor, and Aider out of the box. 1,041 stars with 161 added today.

Token Optimizer Finds Ghost Tokens Before Compaction Destroys Your Context

A Python tool that scans prompts and system messages for wasted tokens, ghost references, and patterns that degrade output quality during context window compaction. It targets tokens that served a purpose in earlier turns but now consume budget without contributing signal. The goal is surviving compaction gracefully rather than watching quality decay as conversations grow longer. 861 stars with 50 added today.

Browser Use Ships a Desktop App for Running Teams of Browser Agents

The open-source browser automation project now offers a native Mac, Windows, and Linux app that ports your Chrome cookies into a fresh Chromium instance so agents arrive logged in everywhere you are. Spawn tasks from anywhere with a keyboard shortcut. Supports Anthropic and OpenAI as providers. The philosophy: keep your normal browser for browsing, use this one purely as the agent half.