Sora Is Dead + Two Supply Chain Attacks Hit Open Source
Arm ships its first chip. Claude Code gets auto mode. And the AI wealth window is closing fast.
CFO Sarah Friar confirmed on CNBC that OpenAI raised an additional $10 billion, pushing its record funding round past $120 billion. New participants include a16z, D.E. Shaw, MGX, TPG, T. Rowe Price, and Microsoft, which is rejoining. The initial tranche, announced in February, featured Amazon ($50B), NVIDIA ($30B), and SoftBank ($30B). Friar described it as possibly the company's last private raise before a potential IPO.
On the same day it confirmed another $10 billion in funding, OpenAI announced it is shutting down Sora just 15 months after launch. Both the consumer app and developer API will be discontinued, with no plans to integrate the feature into ChatGPT. Disney's $1 billion licensing deal, signed in December 2025, is cancelled as a result. CEO Sam Altman is redirecting compute towards core products including Codex.
Arm launched its first in-house production chip, a 136-core data centre CPU co-developed with Meta as the debut customer. The chip targets CPU-side orchestration for agentic AI workloads, marking Arm's shift from licensing IP to shipping its own silicon. OpenAI has also committed to deploying it. CEO Rene Haas projects revenue reaching $25 billion by 2031, up from $4 billion today.
Malicious versions of the LiteLLM Python package (1.82.7 and 1.82.8) were pushed to PyPI by the TeamPCP group, injecting credential-stealing logic that mimics legitimate authentication flows. The backdoor harvested API keys, auth tokens, and cloud credentials. Both compromised versions have been pulled from PyPI, but anyone who installed them should rotate all affected secrets immediately.
A supply chain attack on Aqua Security's Trivy scanner has infected over 1,000 cloud environments with secret-stealing malware. Attackers exploited a GitHub Actions misconfiguration to distribute malicious scanner versions; the campaign has since been linked to the Lapsus$ extortion group. Open source security tooling has itself become a high-value supply chain target, and this incident landed on the same day as the LiteLLM compromise.
Anthropic shipped auto mode for Claude Code, a permissions middle ground between approving every individual action and the risky skip-all-permissions flag. A classifier evaluates each tool call before execution, auto-approving safe operations while blocking destructive ones like mass deletions or data exfiltration. If Claude repeatedly attempts a blocked action, it escalates to the user for explicit approval. Available as a research preview on Team plans.
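The approve/block/escalate flow is easy to picture as a toy gate. This is a rule-based stand-in for what is actually a learned classifier, and every pattern and tool name below is hypothetical:

```python
from dataclasses import dataclass, field

# Crude pattern lists standing in for the real classifier (all hypothetical).
DESTRUCTIVE_PATTERNS = ("rm -rf", "drop table", "scp ", "curl ")
SAFE_TOOLS = {"read_file", "list_dir", "grep", "run_tests"}

@dataclass
class AutoModeGate:
    escalate_after: int = 2                       # repeated blocks before asking the user
    _blocked: dict = field(default_factory=dict)  # (tool, arg) -> times blocked

    def decide(self, tool: str, arg: str) -> str:
        """Return 'approve', 'block', or 'ask_user' for a proposed tool call."""
        if tool in SAFE_TOOLS:
            return "approve"                      # read-only operations pass through
        if any(p in arg.lower() for p in DESTRUCTIVE_PATTERNS):
            key = (tool, arg)
            self._blocked[key] = self._blocked.get(key, 0) + 1
            if self._blocked[key] >= self.escalate_after:
                return "ask_user"                 # repeated attempts escalate to the human
            return "block"
        return "approve"                          # default: treat as low-risk
```

So `decide("read_file", "src/main.py")` auto-approves, a first `decide("bash", "rm -rf /")` blocks, and a repeat of the same call escalates to the user, mirroring the behaviour described above.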
Every major chipmaker is shipping NPUs into phones and edge devices, but the on-device AI experience still falls short. The real bottleneck is memory bandwidth, not compute power. Liquid AI's LFM2 model attacks this with a hybrid gated convolution and attention architecture that cuts memory requirements. Their STAR system then profiles specific hardware to optimise model deployment per device.
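The bandwidth bottleneck is easy to quantify: autoregressive decoding reads every weight roughly once per token, so decode speed is capped at bandwidth divided by weight bytes. The hardware numbers below are illustrative assumptions, not vendor specs:

```python
def max_tokens_per_sec(params_billion: float, bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when every weight is read once per token."""
    weight_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / weight_gb

# Assumed figures: a 7B model at 4-bit (0.5 bytes/param) on a phone-class
# memory system (~60 GB/s) vs a data centre GPU (~1000 GB/s).
phone_ceiling = max_tokens_per_sec(7, 0.5, 60)     # roughly 17 tok/s
gpu_ceiling = max_tokens_per_sec(7, 0.5, 1000)     # roughly 286 tok/s
```

Adding NPU compute does nothing to move these ceilings, which is exactly why architectures like LFM2 that shrink the bytes moved per token matter more on-device than raw TOPS.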
In 2024, a researcher duplicated seven middle layers in Qwen2-72B with no retraining and produced the top model on the HuggingFace Open LLM Leaderboard. The method, called RYS (Repeat Your Self), has now been validated across Qwen3.5-27B and other architectures using 3,024 beam search candidates. Results suggest LLMs organise reasoning into discrete, reusable circuits in their middle layers, and copying them adds thinking time at zero training cost.
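Mechanically, the trick amounts to splicing copies of a contiguous block of layers back into the stack. A minimal sketch over a generic list of layer objects (the indices are illustrative, not the ones used in the RYS work):

```python
def repeat_middle_layers(layers: list, start: int, count: int, repeats: int = 2) -> list:
    """Return a new layer stack with layers[start:start+count] repeated in place.

    Mirrors the RYS idea: no retraining, just a deeper forward pass through
    duplicated middle blocks, trading inference time for extra "thinking".
    """
    block = layers[start:start + count]
    return layers[:start] + block * repeats + layers[start + count:]

# Toy 8-layer stack: duplicating layers 3-4 yields a 10-layer forward pass.
stack = list(range(8))
deeper = repeat_middle_layers(stack, start=3, count=2)
```

In a real model the duplicated blocks share weights, so the memory cost is near zero; only inference latency grows with the extra passes.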
Cloudflare's new Dynamic Worker Loader API creates isolated sandboxes for running AI-generated code on the fly. Built on V8 isolates rather than full OS-level virtualisation, it cuts startup times and memory usage compared to containers like Docker or Firecracker. The API targets builders shipping consumer-scale AI agents who need per-request code execution without container overhead.
Ed Zitron takes apart the AI industry's capacity claims, citing Wood Mackenzie data showing US data centre additions halved between Q3 and Q4 2025. Most announced capacity has not been brought online, creating a growing gap between NVIDIA's reported GPU sales and actual deployed compute. He also targets companies forcing AI tools on reluctant employees, arguing it degrades both productivity and code quality.
Daniel Homola makes a mathematical case that intelligence follows a bell curve while wealth follows a power law, so when the two are multiplied, the power law dominates. AI is severing the historical link that ran from cognitive ability through credentials to high-paying work. Homola argues there is a 5-to-10-year window to convert AI fluency and domain knowledge into capital before this path closes permanently.
Anthropic's latest Economic Index, analysing over one million Claude conversations, found that experienced users achieve a 10% higher success rate than newcomers. The gap widens with time and is not explained by task type, location, or model choice. Axios frames this as a new class divide: the real split is not between AI users and non-users, but between fluent and casual ones.
Two years after Sam Altman called advertising a 'last resort' for ChatGPT, OpenAI is testing ads in the product. The stakes differ from social media: users rely on ChatGPT for decisions about purchases, health, and finances, creating an unusually high trust bar. Anthropic poked at this in its Super Bowl spot, questioning whether ads risk eroding that trust. If users sense bias in responses, the damage compounds fast.
Hypura schedules LLM inference across Apple Silicon's GPU, RAM, and NVMe storage tiers based on access patterns and bandwidth costs. Models too large for physical memory, like Llama 70B, can run on consumer hardware by loading only the necessary tensor slices from disk. It ships with an Ollama-compatible API server, so existing toolchains plug in without changes.
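The placement problem can be sketched as a greedy assignment: the most frequently accessed tensors claim the fastest tier with room, and the rest spill downward. The tier capacities and tensor figures below are assumptions for illustration, not Hypura's actual scheduler:

```python
# Ordered fastest to slowest; (name, capacity in GB) are assumed figures.
TIERS = [("gpu", 24.0), ("ram", 64.0), ("nvme", 2000.0)]

def place_tensors(tensors: list[tuple[str, float, float]]) -> dict[str, str]:
    """tensors: (name, size_gb, accesses_per_token). Returns name -> tier.

    Hot tensors are placed first so they land in the fastest tier that
    still has free capacity; cold bulk weights spill to RAM, then NVMe.
    """
    placement: dict[str, str] = {}
    free = {name: cap for name, cap in TIERS}
    for name, size, _freq in sorted(tensors, key=lambda t: -t[2]):
        for tier, _cap in TIERS:
            if free[tier] >= size:
                free[tier] -= size
                placement[name] = tier
                break
    return placement

# E.g. a KV cache hit every token stays on-GPU, while 120 GB of rarely
# touched expert weights end up streamed from NVMe.
demo = place_tensors([("kv_cache", 8, 10.0),
                      ("hot_layers", 12, 5.0),
                      ("cold_layers", 120, 1.0)])
```

A real scheduler would also weigh per-tier bandwidth costs and prefetch tensor slices ahead of use, but the spill-down shape is the same.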
SentrySearch uses Gemini's native video embedding API for sub-second semantic search across footage. Videos are split into chunks, embedded directly by Gemini, and stored in a local ChromaDB instance. Text queries match against these embeddings and return trimmed clips of relevant segments. Cost optimisations like still-frame skipping keep API bills manageable for longer videos.
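The retrieval step is a plain nearest-neighbour lookup over chunk embeddings. A self-contained sketch with toy vectors standing in for Gemini embeddings and a list standing in for ChromaDB (all names and numbers are ours):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec: list[float], chunks: list[tuple], top_k: int = 1) -> list[tuple]:
    """chunks: (chunk_id, start_sec, end_sec, embedding).

    Ranks video chunks by similarity to the query embedding and returns
    the top_k as (chunk_id, start_sec, end_sec) -- the trimmed-clip
    boundaries the item above describes.
    """
    ranked = sorted(chunks, key=lambda c: -cosine(query_vec, c[3]))
    return [(c[0], c[1], c[2]) for c in ranked[:top_k]]

# Two 10-second chunks with made-up 2-d embeddings; a query near the first
# chunk's embedding should return that chunk's time range.
chunks = [("c0", 0, 10, [1.0, 0.0]), ("c1", 10, 20, [0.0, 1.0])]
```

In the real system the query text is embedded by the same Gemini API as the video chunks, and ChromaDB performs this ranking with an index rather than a full sort.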