Back to archive
Issue #81 · 38 min read · 19 stories

Hyperscalers Own 67% of Compute. Allbirds Joins Them.

Apple picks Amazon over Starlink. Microsoft Surface +$500 from RAM crunch. Codex roots Samsung TV.

Epoch AI's latest data puts five hyperscalers in control of 67% of global AI compute, and Wednesday's news shows what that concentration looks like in practice. Allbirds renamed itself NewBird AI and watched its market cap jump 582% on a $50 million plan to lease GPUs, while Microsoft raised Surface prices by up to $500 because AI demand is eating consumer DRAM, and absorbed a second Stargate site OpenAI walked away from. Apple picked Amazon Leo over Starlink for iPhone satellite service, and Cursor's multi-agent harness beat human kernel engineers across 235 Blackwell GPU kernels.
NEWS

Amazon spent $11.6 billion to acquire Globalstar and signed a separate agreement with Apple to power satellite features on current and future iPhones and Apple Watches. The deal turns Amazon Leo into a direct rival to Starlink in the Direct-to-Device satellite market. Apple had previously rejected a Starlink offer, making this a pointed bet against SpaceX.

Microsoft agreed to rent 30,000 Nvidia Vera Rubin chips at a Narvik, Norway data centre that OpenAI had branded as Stargate capacity, marking the second OpenAI-abandoned site Microsoft has picked up in three weeks. OpenAI has cut its long-range infrastructure target from $1.4 trillion to roughly $600 billion by 2030. Each retreat strengthens Microsoft's compute position over its 27%-owned partner.

Allbirds, once valued above $4 billion as a sustainable shoe brand, announced a pivot to AI compute infrastructure and watched its market cap climb from $21 million to $148 million in a day. The renamed NewBird AI plans to lease GPU capacity to customers underserved by spot markets and hyperscalers. Allbirds sold its shoe IP for $39 million two weeks ago.

Microsoft raised prices across the entire Surface lineup by up to $500, citing sustained DRAM costs from AI training infrastructure consuming global memory production. A Surface Laptop that launched at $999 in 2024 now starts at $1,499, a 50% jump in under two years. Apple's MacBook Air now undercuts the comparable Surface on price, and memory producers warn the shortage extends into 2027.

Google launched Skills in Chrome, letting users save any AI prompt as a reusable workflow that runs against the current page or selected tabs by typing a forward slash. The library ships with prebuilt Skills for protein macros, side-by-side spec comparisons, and document scanning. Skills inherit the same confirmation flow as Gemini in Chrome before taking actions like calendar writes.

Apple is sending much of its Siri team to a multi-week bootcamp on AI-assisted coding ahead of the smarter Siri it plans to unveil at WWDC. Per The Information, the Siri group is considered a laggard inside Apple, where other teams have already routed large budgets to Claude Code. Mike Rockwell now leads Siri under Craig Federighi, who replaced John Giannandrea late last year.

TECHNICAL

Researchers at Calif gave OpenAI's Codex a browser foothold and matching firmware source on a Samsung TV, then watched it work the chain to root by itself. Codex enumerated the surface, audited Samsung's vendor driver code, validated a physical-memory primitive on the live device, and adapted to static-only execution constraints. No bug or recipe was provided, only the environment.

Cursor and NVIDIA put a multi-agent harness on 235 Blackwell GPU kernels and ran it for three weeks. The system optimised down to assembly level and posted a 38% geomean speedup over baselines, performance the post says typically takes months or years of human kernel work. The harness builds, maintains, and deploys complex software autonomously across long-running optimisation problems.
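A geomean (geometric mean) speedup averages per-kernel ratios multiplicatively, so a handful of large wins cannot inflate the headline number the way an arithmetic mean would. A minimal sketch of the metric, with made-up per-kernel speedups (the actual 235-kernel results are not public):

```python
import math

def geomean_speedup(speedups):
    """Geometric mean of per-kernel speedups (baseline_time / optimised_time)."""
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))

# Hypothetical results: one big win among modest ones. The arithmetic
# mean would be 1.55x; the geomean stays closer to the typical kernel.
speedups = [1.05, 1.2, 3.0, 1.4, 1.1]
print(f"{geomean_speedup(speedups):.2f}x geomean")
```

This is why a 38% geomean across 235 kernels is a strong claim: it implies broad improvement, not a few outliers.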

A consultant sized a 64-H100 cluster for an enterprise LLM product and found utilisation flipping between 92% on prefill and 28% on decode every few milliseconds, paying for 64 GPUs to get the work of 20. The piece walks through disaggregated inference, the DistServe pattern Perplexity, Meta, LinkedIn, and Mistral now run in production, and when the added complexity is not worth it.
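The "64 GPUs doing the work of 20" arithmetic falls out of time-weighted utilisation when prefill and decode share the same hardware. A rough sketch using the piece's 92%/28% figures; the phase time fractions here are assumptions chosen for illustration, since decode dominates wall-clock time on most chat workloads:

```python
def effective_gpus(n_gpus, phases):
    """Time-weighted utilisation. phases: list of (fraction_of_time, utilisation)."""
    util = sum(frac * u for frac, u in phases)
    return util, n_gpus * util

# Assumed split: 5% of wall-clock in prefill at 92%, 95% in decode at 28%.
util, useful = effective_gpus(64, [(0.05, 0.92), (0.95, 0.28)])
print(f"{util:.0%} average utilisation ≈ {useful:.0f} GPUs of useful work")
```

Disaggregation attacks exactly this: separate prefill and decode pools can each run near their own ceiling instead of averaging down together.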

Slack's second post on its security investigation agent details how a Director, specialist Experts, and a Critic stay aligned across long investigations without drowning in shared context. Staff engineer Dominic Marks walks through the techniques that balance continuity against creative reasoning, including phase-scoped context windows. The architecture is the closest thing yet to a production blueprint for multi-phase agent teams.
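The phase-scoped idea can be sketched in a few lines: each phase opens with a fresh transcript seeded only by earlier phases' summaries, so agents keep continuity without inheriting every prior message. This is a hypothetical reconstruction of the pattern, not Slack's code; the class and method names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class Phase:
    name: str
    messages: list = field(default_factory=list)
    summary: str = ""  # distilled when the phase closes

class Investigation:
    """Phase-scoped context: new phases see prior summaries, not transcripts."""
    def __init__(self):
        self.closed = []
        self.active = None

    def open_phase(self, name):
        # Continuity comes from the summaries; creative headroom comes
        # from the near-empty transcript each phase starts with.
        self.active = Phase(name)
        self.active.messages = [f"[{p.name}] {p.summary}" for p in self.closed]
        return self.active.messages

    def close_phase(self, summary):
        self.active.summary = summary
        self.closed.append(self.active)
        self.active = None
```

The trade-off the post wrestles with is visible here: whatever the summary omits is gone for later phases, which is why a Critic reviewing phase outputs matters.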

ANALYSIS

Dustin Alper used Perplexity Computer to wire up his brokerage accounts into a working portfolio tracker without writing code, and uses the exercise to reframe which software survives. Pure-code products like Jira and Figma have switching costs but no intrinsic moat. Software with two-sided markets, proprietary data, hardware, or messy real-world integrations like Uber, Bloomberg, and Plaid keeps its bridges expensive to rebuild.

Kyle Kingsbury sketches the job categories forming around production LLM systems: Incanters who specialise in prompting, Process Engineers who build error-catching workflows around model output, Meat Shields who absorb accountability when AI fails, and Haruspices who interpret model behaviour. The lawyer-citing-confabulations problem, he predicts, will spawn entire roles dedicated to deliberately seeded errors and provenance tracking.

Pete Koomen argues AI features feel useless because product teams bolt them onto interfaces designed before the model existed. His Gmail example: Gemini drafts a four-paragraph reply when the user wanted 'Hey garry, my daughter has the flu, won't make it.' The fix is letting users write the system prompt themselves, treating models as engines rather than as features inside legacy app shells.

Dwarkesh Patel asked Jensen Huang directly whether TPUs threaten Nvidia, why Nvidia stays out of the hyperscaler business, and whether the US should sell AI chips to China. Huang argued the supply chain itself, from TSMC to ODM packaging in Taiwan, is the actual moat, and pushed the case that letting China buy Nvidia chips is better than ceding the market to Huawei.

GitHub Next's Maggie Appleton calls the 'one person plus a wall of Claudes' vision a single-player fantasy that ignores how real software gets built. Her team's prototype, Ace, is a multiplayer agent workspace where engineers share chats, context, and cloud machines like Slack crossed with GitHub. The argument: scaling individual output cannot fix problems that need teams to coordinate.

Ahrefs analysed 1.4 million ChatGPT prompts to figure out which retrieved pages get cited and which get ignored. Title relevance to ChatGPT's internal 'fanout queries' is the dominant signal, with cited pages scoring 0.602 versus 0.484 for non-cited. Reddit makes up 67.8% of all non-cited URLs despite heavy retrieval, and natural-language URL slugs hit an 89.78% citation rate.
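Ahrefs does not publish its exact scoring method, but a toy proxy for "title relevance to fanout queries" makes the signal concrete: score a page title by its best token overlap with any of the model's expanded sub-queries. Everything below (the Jaccard choice, the sample queries) is an illustrative assumption:

```python
def title_relevance(title, fanout_queries):
    """Toy proxy: best Jaccard token overlap between title and any fanout query."""
    t = set(title.lower().split())
    def jaccard(query):
        q = set(query.lower().split())
        return len(t & q) / len(t | q)
    return max(jaccard(q) for q in fanout_queries)

# Hypothetical fanout queries a model might expand a user question into.
fanout = ["best running shoes 2025", "carbon plate shoes review"]
print(title_relevance("Best Running Shoes of 2025", fanout))
```

The practical takeaway is the same as the study's: titles that read like the sub-questions a model generates score high, regardless of how the page ranks for the original query.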

Epoch AI calculates that Amazon, Google, Meta, Microsoft, and Oracle collectively hold 67% of the world's AI compute as of Q4 2025, measured in H100-equivalents, up from 60% in Q1 2024. The dataset tracks chip acquisitions across Nvidia, Google, AMD, Huawei, and Amazon silicon. AI labs including OpenAI and Anthropic remain dependent on these five for both R&D and inference compute.
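"H100-equivalents" is a normalisation: each chip's stock is converted to the number of H100s delivering equal throughput before shares are computed. A minimal sketch of the share calculation, with hypothetical fleet sizes chosen only to reproduce the 67% headline figure:

```python
def compute_share(holdings_h100e):
    """Share of global compute, with each fleet in H100-equivalents."""
    total = sum(holdings_h100e.values())
    return {owner: qty / total for owner, qty in holdings_h100e.items()}

# Hypothetical stocks (H100-equivalents), not Epoch AI's actual numbers.
fleet = {"five_hyperscalers": 6.7e6, "rest_of_world": 3.3e6}
shares = compute_share(fleet)
print(f"hyperscaler share: {shares['five_hyperscalers']:.0%}")
```

The normalisation is what lets Epoch compare Nvidia, TPU, AMD, Huawei, and Trainium fleets on one axis, and it is also where the estimate's main uncertainty lives.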

TOOLS

Kelet connects to your AI agent's stack via OpenTelemetry, Langfuse, OpenAI, Anthropic, LangChain, or your harness of choice, and clusters production failures into evidence-backed root causes with a prompt patch attached. Median time from trace ingestion to fix is 14.3 minutes across design-partner deployments. The pilot cohort found that 73% of teams had agent failures nobody had noticed.

LangAlpha treats investing as Bayesian rather than one-shot, persisting research workspaces the way coding harnesses persist a codebase. The agent dispatches parallel subagents to gather market data, news, and macro context, then writes Python against MCP servers instead of dumping raw data into the LLM context. Built-in skills include DCF models and morning-note generation with inline visualisations.