Sora Is Dead + Two Supply Chain Attacks Hit Open Source
Arm ships its first chip. Claude Code gets auto mode. And the AI wealth window is closing fast.
CFO Sarah Friar confirmed on CNBC that OpenAI raised an additional $10 billion, pushing its record funding round past $120 billion. New participants include a16z, D.E. Shaw, MGX, TPG, T. Rowe Price, and Microsoft, which is rejoining. The initial tranche, announced in February, featured Amazon ($50B), NVIDIA ($30B), and SoftBank ($30B). Friar described it as possibly the company's last private raise before a potential IPO.
On the same day it confirmed another $10 billion in funding, OpenAI announced it is shutting down Sora just 15 months after launch. Both the consumer app and developer API will be discontinued, with no plans to integrate the feature into ChatGPT. Disney's $1 billion licensing deal, signed in December 2025, is cancelled as a result. CEO Sam Altman is redirecting compute towards core products including Codex.
Arm launched its first in-house production chip, a 136-core data centre CPU co-developed with Meta as the debut customer. The chip targets CPU-side orchestration for agentic AI workloads, marking Arm's shift from licensing IP to shipping its own silicon. OpenAI has also committed to deploying it. CEO Rene Haas projects revenue reaching $25 billion by 2031, up from $4 billion today.
Malicious versions of the LiteLLM Python package (1.82.7 and 1.82.8) were pushed to PyPI by the TeamPCP group, injecting credential-stealing logic that mimics legitimate authentication flows. The backdoor harvested API keys, auth tokens, and cloud credentials. Both compromised versions have been pulled from PyPI, but anyone who installed them should rotate all affected secrets immediately.
A supply chain attack on Aqua Security's Trivy scanner has infected over 1,000 cloud environments with secret-stealing malware. Attackers exploited a GitHub Actions misconfiguration to distribute malicious scanner versions; the campaign has since been linked to the Lapsus$ extortion group. Open source security tooling has itself become a high-value supply chain target, and this incident landed on the same day as the LiteLLM compromise.
Anthropic shipped auto mode for Claude Code, a permissions middle ground between approving every individual action and the risky skip-all-permissions flag. A classifier evaluates each tool call before execution, auto-approving safe operations while blocking destructive ones like mass deletions or data exfiltration. If Claude repeatedly attempts a blocked action, it escalates to the user for explicit approval. Available as a research preview on Team plans.
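The approve/block/escalate flow is easy to picture as a toy gate. This is a rule-based stand-in for what is actually a learned classifier, and every pattern and tool name below is hypothetical:

```python
from dataclasses import dataclass, field

# Crude pattern lists standing in for the real classifier (all hypothetical).
DESTRUCTIVE_PATTERNS = ("rm -rf", "drop table", "scp ", "curl ")
SAFE_TOOLS = {"read_file", "list_dir", "grep", "run_tests"}

@dataclass
class AutoModeGate:
    escalate_after: int = 2                       # repeated blocks before asking the user
    _blocked: dict = field(default_factory=dict)  # (tool, arg) -> times blocked

    def decide(self, tool: str, arg: str) -> str:
        """Return 'approve', 'block', or 'ask_user' for a proposed tool call."""
        if tool in SAFE_TOOLS:
            return "approve"                      # read-only operations pass through
        if any(p in arg.lower() for p in DESTRUCTIVE_PATTERNS):
            key = (tool, arg)
            self._blocked[key] = self._blocked.get(key, 0) + 1
            if self._blocked[key] >= self.escalate_after:
                return "ask_user"                 # repeated attempts escalate to the human
            return "block"
        return "approve"                          # default: treat as low-risk
```

So `decide("read_file", "src/main.py")` auto-approves, a first `decide("bash", "rm -rf /")` blocks, and a repeat of the same call escalates to the user, mirroring the behaviour described above.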
Every major chipmaker is shipping NPUs into phones and edge devices, but the on-device AI experience still falls short. The real bottleneck is memory bandwidth, not compute power. Liquid AI's LFM2 model attacks this with a hybrid gated convolution and attention architecture that cuts memory requirements. Their STAR system then profiles specific hardware to optimise model deployment per device.
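The bandwidth bottleneck is easy to quantify: autoregressive decoding reads every weight roughly once per token, so decode speed is capped at bandwidth divided by weight bytes. The hardware numbers below are illustrative assumptions, not vendor specs:

```python
def max_tokens_per_sec(params_billion: float, bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when every weight is read once per token."""
    weight_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / weight_gb

# Assumed figures: a 7B model at 4-bit (0.5 bytes/param) on a phone-class
# memory system (~60 GB/s) vs a data centre GPU (~1000 GB/s).
phone_ceiling = max_tokens_per_sec(7, 0.5, 60)     # roughly 17 tok/s
gpu_ceiling = max_tokens_per_sec(7, 0.5, 1000)     # roughly 286 tok/s
```

Adding NPU compute does nothing to move these ceilings, which is exactly why architectures like LFM2 that shrink the bytes moved per token matter more on-device than raw TOPS.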
In 2024, a researcher duplicated seven middle layers in Qwen2-72B with no retraining and produced the top model on the HuggingFace Open LLM Leaderboard. The method, called RYS (Repeat Your Self), has now been validated across Qwen3.5-27B and other architectures using 3,024 beam search candidates. Results suggest LLMs organise reasoning into discrete, reusable circuits in their middle layers, and copying them adds thinking time at zero training cost.
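Mechanically, the trick amounts to splicing copies of a contiguous block of layers back into the stack. A minimal sketch over a generic list of layer objects (the indices are illustrative, not the ones used in the RYS work):

```python
def repeat_middle_layers(layers: list, start: int, count: int, repeats: int = 2) -> list:
    """Return a new layer stack with layers[start:start+count] repeated in place.

    Mirrors the RYS idea: no retraining, just a deeper forward pass through
    duplicated middle blocks, trading inference time for extra "thinking".
    """
    block = layers[start:start + count]
    return layers[:start] + block * repeats + layers[start + count:]

# Toy 8-layer stack: duplicating layers 3-4 yields a 10-layer forward pass.
stack = list(range(8))
deeper = repeat_middle_layers(stack, start=3, count=2)
```

In a real model the duplicated blocks share weights, so the memory cost is near zero; only inference latency grows with the extra passes.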
Cloudflare's new Dynamic Worker Loader API creates isolated sandboxes for running AI-generated code on the fly. Built on V8 isolates rather than full OS-level virtualisation, it cuts startup times and memory usage compared to containers like Docker or Firecracker. The API targets builders shipping consumer-scale AI agents who need per-request code execution without container overhead.
Ed Zitron takes apart the AI industry's capacity claims, citing Wood Mackenzie data showing US data centre additions halved between Q3 and Q4 2025. Most announced capacity has not been brought online, creating a growing gap between NVIDIA's reported GPU sales and actual deployed compute. He also targets companies forcing AI tools on reluctant employees, arguing it degrades both productivity and code quality.
Daniel Homola makes a mathematical case that intelligence follows a bell curve while wealth follows a power law, so when the two are multiplied, the power law dominates. AI is severing the historical link that ran from cognitive ability through credentials to high-paying work. Homola argues there is a 5-to-10-year window to convert AI fluency and domain knowledge into capital before this path closes permanently.
Anthropic's latest Economic Index, analysing over one million Claude conversations, found that experienced users achieve a 10% higher success rate than newcomers. The gap widens with time and is not explained by task type, location, or model choice. Axios frames this as a new class divide: the real split is not between AI users and non-users, but between fluent and casual ones.
Two years after Sam Altman called advertising a 'last resort' for ChatGPT, OpenAI is testing ads in the product. The stakes differ from social media: users rely on ChatGPT for decisions about purchases, health, and finances, creating an unusually high trust bar. Anthropic poked at this in its Super Bowl spot, questioning whether ads risk eroding that trust. If users sense bias in responses, the damage compounds fast.
Hypura schedules LLM inference across Apple Silicon's GPU, RAM, and NVMe storage tiers based on access patterns and bandwidth costs. Models too large for physical memory, like Llama 70B, can run on consumer hardware by loading only the necessary tensor slices from disk. It ships with an Ollama-compatible API server, so existing toolchains plug in without changes.
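The placement problem can be sketched as a greedy assignment: the most frequently accessed tensors claim the fastest tier with room, and the rest spill downward. The tier capacities and tensor figures below are assumptions for illustration, not Hypura's actual scheduler:

```python
# Ordered fastest to slowest; (name, capacity in GB) are assumed figures.
TIERS = [("gpu", 24.0), ("ram", 64.0), ("nvme", 2000.0)]

def place_tensors(tensors: list[tuple[str, float, float]]) -> dict[str, str]:
    """tensors: (name, size_gb, accesses_per_token). Returns name -> tier.

    Hot tensors are placed first so they land in the fastest tier that
    still has free capacity; cold bulk weights spill to RAM, then NVMe.
    """
    placement: dict[str, str] = {}
    free = {name: cap for name, cap in TIERS}
    for name, size, _freq in sorted(tensors, key=lambda t: -t[2]):
        for tier, _cap in TIERS:
            if free[tier] >= size:
                free[tier] -= size
                placement[name] = tier
                break
    return placement

# E.g. a KV cache hit every token stays on-GPU, while 120 GB of rarely
# touched expert weights end up streamed from NVMe.
demo = place_tensors([("kv_cache", 8, 10.0),
                      ("hot_layers", 12, 5.0),
                      ("cold_layers", 120, 1.0)])
```

A real scheduler would also weigh per-tier bandwidth costs and prefetch tensor slices ahead of use, but the spill-down shape is the same.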
SentrySearch uses Gemini's native video embedding API for sub-second semantic search across footage. Videos are split into chunks, embedded directly by Gemini, and stored in a local ChromaDB instance. Text queries match against these embeddings and return trimmed clips of relevant segments. Cost optimisations like still-frame skipping keep API bills manageable for longer videos.
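The retrieval step is a plain nearest-neighbour lookup over chunk embeddings. A self-contained sketch with toy vectors standing in for Gemini embeddings and a list standing in for ChromaDB (all names and numbers are ours):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec: list[float], chunks: list[tuple], top_k: int = 1) -> list[tuple]:
    """chunks: (chunk_id, start_sec, end_sec, embedding).

    Ranks video chunks by similarity to the query embedding and returns
    the top_k as (chunk_id, start_sec, end_sec) -- the trimmed-clip
    boundaries the item above describes.
    """
    ranked = sorted(chunks, key=lambda c: -cosine(query_vec, c[3]))
    return [(c[0], c[1], c[2]) for c in ranked[:top_k]]

# Two 10-second chunks with made-up 2-d embeddings; a query near the first
# chunk's embedding should return that chunk's time range.
chunks = [("c0", 0, 10, [1.0, 0.0]), ("c1", 10, 20, [0.0, 1.0])]
```

In the real system the query text is embedded by the same Gemini API as the video chunks, and ChromaDB performs this ranking with an index rather than a full sort.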