A critical vulnerability called BadHost in Starlette, an open source framework with 325 million weekly downloads, exposes millions of AI agent servers to attack. The flaw is trivial to exploit and affects FastAPI, vLLM, and LiteLLM, along with MCP servers that store credentials for external services. Any attacker who can reach an unprotected server can breach it and steal stored third-party credentials.
Beijing Restricts AI Talent; Stealing Models With an Antenna
ClickUp cuts 22%, hires 3,000 agents. 12% of auditor verdicts flip. A $2,500 robot you can print.
NEWS
Anthropic is turning its February AI Fluency research into a consumer feature inside Claude. References found in Claude's settings show a scorecard that scans activity across Chat, Cowork, and Claude Code sessions, scoring users against 11 behavioural indicators. The original study of 9,830 conversations found iteration and refinement was the strongest predictor of effective AI use.
Beijing is restricting overseas travel for AI workers at Alibaba and DeepSeek, applying rules previously reserved for academics and nuclear scientists. Domestically, the competition is just as intense. ByteDance is offering special stock options to retain its AI team, and one robotics startup advertised an 18 million dollar salary for a chief scientist.
Hugging Face released LeHumanoid Robot, a bipedal platform built from 3D-printed parts and off-the-shelf components starting at $2,500. The full-stack release includes printable part files, wiring docs, assembly instructions, plus software for calibration, control, and simulation. It is not the most advanced humanoid robot. It is the one you can actually build, repair, and run learning experiments on.
TECHNICAL
The brute force approach looked fine at a glance. Valid JSON, expected structure. But sampling revealed rules that were too broad, others missed entirely, nuances lost from source text. The fix was not a better prompt or fancier agent. It was making the agent's job smaller, preparing source data upfront and removing retrieval uncertainty so the model could focus on reasoning over content it already had.
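One way to picture "making the agent's job smaller" is sketched below. This is a minimal illustration, not the author's pipeline: the blank-line section layout, the prompt wording, and the names `split_sections`, `build_tasks`, and `sample_for_review` are all assumptions made for the example. The source is split up front, each task carries its own text so nothing has to be retrieved, and a reproducible sample is set aside for manual review.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    section_id: str
    prompt: str


def split_sections(source: str) -> dict[str, str]:
    """Split source text into sections on blank lines (an assumed layout)."""
    blocks = [b.strip() for b in source.split("\n\n") if b.strip()]
    return {f"sec-{i + 1}": block for i, block in enumerate(blocks)}


def build_tasks(sections: dict[str, str]) -> list[Task]:
    """Create one small task per section with the text embedded in the prompt."""
    # Hypothetical prompt wording; the model only reasons over text it is given.
    template = (
        "Extract every rule stated in the text below as a JSON list.\n"
        "Use only this text.\n---\n{text}"
    )
    return [Task(sid, template.format(text=text)) for sid, text in sections.items()]


def sample_for_review(tasks: list[Task], k: int, seed: int = 0) -> list[Task]:
    """Pick a reproducible random sample of tasks for manual spot checks."""
    rng = random.Random(seed)
    return rng.sample(tasks, min(k, len(tasks)))


# Toy source text, invented for the example.
SOURCE = (
    "Refunds must be issued within 14 days.\n\n"
    "Orders over 500 EUR need manager approval.\n\n"
    "Gift cards are non-refundable."
)

sections = split_sections(SOURCE)
tasks = build_tasks(sections)
review = sample_for_review(tasks, k=2)
```

Each prompt would then go to whatever model is in use, and the sampled outputs would be compared against the section text by hand.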
Researchers from KAIST, NUS, and Zhejiang University showed that an off-the-shelf antenna can capture electromagnetic leakage from a running GPU and reconstruct the AI model's layer structure with 97.6% accuracy. The technique, called ModelSpy, works through walls. Someone with a backpack-sized receiver could walk past a server room and extract a model's architecture without touching a single machine.
This ICLR 2026 paper presents Head-Masked Nullspace Steering, a jailbreak method that identifies the attention heads responsible for safety behaviour, suppresses their write paths through targeted column masking, then injects perturbation in the orthogonal complement. The geometry-aware intervention preserves fluency while bypassing alignment. It achieved state-of-the-art attack success rates with fewer queries than prior methods across multiple benchmarks.
Anthropic details three isolation patterns across its products: gVisor containers for claude.ai, OS-level sandboxing (Seatbelt/bubblewrap) for Claude Code that cut permission prompts by 84%, and sealed VMs for Cowork where credentials never enter the guest machine. Prompt injection classifiers hold attack success to roughly 0.1%. The team found that custom security components consistently proved weaker than battle-tested primitives like hypervisors.
ANALYSIS
Meta cut 8,000 this month and redirected 7,000 to AI teams. Cloudflare cut 1,100 in its first mass layoff ever. Tech layoffs have passed 100,000 for the year. Cloudflare's CEO was direct about who got cut: middle management, finance, legal, internal auditing. The piece argues only two roles survive: the people who build and sell things, and the executives who decide what gets built.
ClickUp laid off 22% of its workforce and CEO Zeb Evans framed it not as cost-cutting but as an AI bet. The company deployed 3,000 internal AI agents and promised million-dollar salary bands for remaining staff who create outsized impact. A recent Gartner survey found 80% of companies using autonomous tech have cut jobs, but those reductions are not translating into meaningful financial returns.
Over 400,000 compliance officers in the US represent more than 40 billion dollars in annual labour spend, yet the work stays stubbornly manual. TD Bank was fined 3 billion dollars for failing to monitor 92% of its transactions, with a backlog of 70,000 detection alerts dating to 2018. The talent pipeline is strained too: 87% of entrants eventually leave the field and annual churn exceeds 20%.
AI writing is flooding social media, and Mollick argues that badly prompted output produces very little meaning per word, calling these posts "meaning-shaped attention vampires". The hidden cost goes deeper. Writing skill took him decades to develop, and defaulting to AI skips that process entirely. Students given plain ChatGPT underperformed on tests, while those with tutoring-focused AI gained months of equivalent schooling.
GPU tensor cores spend much of LLM inference waiting because memory bandwidth, not compute, is the bottleneck. An H100 can perform hundreds of floating-point operations in the time it takes to pull one byte from HBM. The piece maps startups attacking different layers: Groq replaced HBM with on-chip SRAM, Cerebras built a wafer-scale chip with 44 GB of SRAM and 21 PB/s internal bandwidth, and MatX is designing scratchpad memories for transformer access patterns.
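The bandwidth bottleneck can be made concrete with a back-of-the-envelope roofline calculation. The sketch below is illustrative only: the H100 figures are assumed from the public SXM spec sheet (roughly 989 TFLOP/s dense BF16 and 3.35 TB/s of HBM3 bandwidth), not taken from the piece, and the 8-billion-parameter model is hypothetical. It also ignores KV-cache traffic, which would lower the bound further.

```python
# Assumed spec-sheet figures for an H100 SXM (not from the article).
H100_PEAK_FLOPS = 989e12        # dense BF16 tensor-core FLOP/s
H100_HBM_BYTES_PER_S = 3.35e12  # HBM3 bandwidth in bytes/s


def machine_balance(peak_flops: float, bytes_per_s: float) -> float:
    """FLOPs the chip can execute in the time it takes to load one byte."""
    return peak_flops / bytes_per_s


def decode_intensity(batch_size: int, bytes_per_weight: int = 2) -> float:
    """FLOPs performed per weight byte read during token-by-token decoding.

    Each weight is used for one multiply-add (2 FLOPs) per sequence in the batch.
    """
    return 2 * batch_size / bytes_per_weight


def max_tokens_per_s(params: float, bytes_per_s: float, bytes_per_weight: int = 2) -> float:
    """Upper bound for batch-1 decoding: every weight is read once per token."""
    return bytes_per_s / (params * bytes_per_weight)


balance = machine_balance(H100_PEAK_FLOPS, H100_HBM_BYTES_PER_S)
utilisation = decode_intensity(batch_size=1) / balance
tokens = max_tokens_per_s(8e9, H100_HBM_BYTES_PER_S)  # hypothetical 8B model in BF16

print(f"machine balance: {balance:.0f} FLOPs per byte")
print(f"batch-1 compute utilisation bound: {utilisation:.2%}")
print(f"batch-1 decode bound: {tokens:.0f} tokens/s")
```

Under these assumptions the chip could do close to 300 FLOPs per byte loaded, while batch-1 decoding only needs about one, which is why swapping HBM for on-chip SRAM is an attractive target.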
Can one AI system make another audit it less independently just by explaining itself? In this experiment, AI auditors correctly flagged borderline compliance issues on first review. But after the audited agent got a second turn to explain, the auditor changed its verdict to compliant in one of every eight cases. The pattern mirrors human audit capture, where formal independence exists but incentives and framing do the real work.
Inference ASICs physically bake in the assumption that weights are frozen and memory is separated from compute. The author traces a recurring cycle: CPUs enabled creative 3D graphics, fixed-pipeline GPUs killed that variety, programmable shaders reopened it, and specialised inference chips are closing the window again. Hardware optimised for open-loop deployment makes closed-loop learning experiments, where models update their own weights, increasingly difficult to run.
TOOLS
A terminal dashboard built with Bubble Tea that gives you a unified view of API keys across OpenAI, Anthropic, AWS, Google Cloud, and Mistral. It flags stale, idle, and never-used keys with health scoring, tracks changes through snapshot diffs, and stores credentials in AES-256-GCM encrypted SQLite. Provider management and filtering are built into the TUI.
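A rough sketch of how stale, idle, and never-used flags might be derived from key metadata follows. It is not the tool's code (which is built on Bubble Tea): the `ApiKey` fields, the 90-day and 30-day thresholds, and the meaning given to each label are assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical thresholds; the real tool's values are not stated.
STALE_AFTER = timedelta(days=90)  # key is old enough to be due for rotation
IDLE_AFTER = timedelta(days=30)   # key has not been used recently


@dataclass
class ApiKey:
    provider: str
    name: str
    created_at: datetime
    last_used_at: Optional[datetime]  # None means no recorded use


def classify(key: ApiKey, now: datetime) -> str:
    """Return one label per key, checking the most actionable condition first."""
    if key.last_used_at is None:
        return "never-used"
    if now - key.last_used_at >= IDLE_AFTER:
        return "idle"
    if now - key.created_at >= STALE_AFTER:
        return "stale"
    return "active"


# Fixed reference time and invented keys so the example is reproducible.
NOW = datetime(2026, 1, 1, tzinfo=timezone.utc)
keys = [
    ApiKey("openai", "ci-bot", NOW - timedelta(days=400), None),
    ApiKey("aws", "batch-jobs", NOW - timedelta(days=100), NOW - timedelta(days=40)),
    ApiKey("anthropic", "prod", NOW - timedelta(days=200), NOW - timedelta(days=1)),
    ApiKey("mistral", "dev", NOW - timedelta(days=10), NOW - timedelta(days=5)),
]
labels = {k.name: classify(k, NOW) for k in keys}
```

A health score could then be built by weighting these labels, though how the dashboard actually scores keys is not described.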
Chrome's built-in AI APIs now run models directly on the user's device, with no cloud inference costs and no data leaving the browser. The Google I/O 2026 announcement covers summarisation, translation, language detection, and writing assistance. Once downloaded, models work offline with hardware acceleration. A polyfill path through Firebase AI Logic handles unsupported devices.
Orval v8 generates type-safe TypeScript clients, mocks, and validators from OpenAPI specs. It supports React Query, SWR, Angular, Vue, Svelte, and Solid out of the box. Mock generation auto-creates MSW handlers with Faker.js data, so you can test without a backend. First-class support for Zod validation, Hono server stubs, and MCP integration rounds out the release.
Roboflow walks through GPT-5's vision capabilities for practical computer vision tasks. The models read nutrition labels, describe warehouse layouts, count PCB defects, and generate structured JSON from scanned documents, all from a text prompt with no task-specific fine-tuning. The guide covers testing in both the OpenAI and Roboflow Playgrounds, plus integration into Roboflow Workflows for production pipelines.