Issue #42 · 16 min read · 8 stories

Pentagon Threatens Anthropic Over AI Safeguards

One team uncovers what fills your AI coding tool's context window. Plus: advanced prompt caching and AI as an exoskeleton.

The Pentagon yesterday threatened to cut off Anthropic, escalating a dispute over AI safeguards and signaling mounting government pressure on major model providers. Separately, one team intercepted thousands of API calls to reveal what actually fills your AI coding tool's context window. This comes as a new analysis frames AI not as a coworker, but as an exoskeleton for builders.

NEWS
6 stories
1

1M Context Window Ships in Claude Sonnet 4.6

Anthropic released Claude Sonnet 4.6, aiming for Opus 4.6 capabilities with a 1M token context window in beta. While Anthropic cites wins in preference evaluations, independent analysis suggests higher token usage for certain tasks, potentially increasing costs. Early reliability issues were reportedly fixed.

2

Gemini 3.1 Pro Doubles ARC-AGI-2 Reasoning

Google announced Gemini 3.1 Pro, an upgraded AI model that more than doubles its predecessor's reasoning performance on the ARC-AGI-2 benchmark. It's rolling out to developer products and can generate animated SVGs from text or synthesize complex data into dashboards. The release is in preview so developers can validate agentic workflows.

3

AI Hardware Development Accelerates at Apple: Glasses, Pendant, Camera AirPods

Apple is reportedly accelerating development on AI-focused smart glasses, a pendant, and enhanced AirPods designed to give Siri visual context for AI-driven actions. Expect Apple to drive new multimodal AI interaction patterns in consumer hardware, potentially influencing future platform APIs and user expectations.

4

$100B Funding Round Reportedly Nears for OpenAI

OpenAI is reportedly close to a $100 billion funding round, the largest in history, boosting its valuation. This investment, with major tech players involved, signals the immense infrastructure costs and focus on rapid growth over immediate profit in advanced AI development.

5

Pentagon Threatens Anthropic Over AI Safeguards

The Pentagon is reportedly threatening to cut ties with Anthropic due to a dispute over AI safeguards. While the Pentagon seeks models for 'all lawful purposes,' Anthropic insists on limits for mass surveillance and autonomous weaponry, a conflict highlighted by Claude's use in the Maduro operation.

6

20VC x SaaStr: Enterprises Will AI Into Existence, ROI or Not

A 20VC x SaaStr discussion argues Corporate America prioritizes AI adoption over proven ROI, effectively 'willing AI into existence.' This sentiment fuels massive investments in AI companies, exemplified by Anthropic's $380 billion post-money valuation, and creates a 'gravity well' for traditional B2B SaaS. The key insight: enterprise AI spending is locked in for the next 1-2 years, irrespective of immediate returns.

TECHNICAL
2 stories
1

AI Coding Agents Waste Context Tokens

An experiment intercepting 3,177 API calls across four AI coding tools found that Gemini uses vastly more tokens, aggressively dumping whole files and conversation history into context, while Claude models pay an "architectural tax" from tool definitions. The takeaway: context-handling strategies vary widely and are often inefficient, and those differences directly affect token costs and agent performance.
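The interception approach boils down to tallying where the tokens in each captured request actually go. A minimal sketch, assuming a Chat Completions-style JSON payload and a rough ~4-characters-per-token heuristic; `context_breakdown` and the sample payload are illustrative, not the experiment's actual tooling:

```python
import json

def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text and code.
    return max(1, len(text) // 4)

def context_breakdown(request_body: str) -> dict:
    """Tally approximate tokens per category in a captured request body."""
    req = json.loads(request_body)
    breakdown = {"system": 0, "tools": 0, "history": 0, "latest_user": 0}

    # Tool/function definitions: the fixed "architectural tax" on every call.
    for tool in req.get("tools", []):
        breakdown["tools"] += approx_tokens(json.dumps(tool))

    messages = req.get("messages", [])
    for i, msg in enumerate(messages):
        text = msg.get("content") or ""
        if msg.get("role") == "system":
            breakdown["system"] += approx_tokens(text)
        elif i == len(messages) - 1 and msg.get("role") == "user":
            breakdown["latest_user"] += approx_tokens(text)
        else:
            # Everything else: prior turns, dumped files, tool results.
            breakdown["history"] += approx_tokens(text)
    return breakdown

# Hypothetical captured request: one dumped file dwarfs the actual question.
captured = json.dumps({
    "model": "example-model",
    "tools": [{"type": "function", "function": {
        "name": "read_file", "description": "Read a file",
        "parameters": {"type": "object"}}}],
    "messages": [
        {"role": "system", "content": "You are a coding agent. " * 10},
        {"role": "user", "content": "src/main.py contents: " + "x = 1\n" * 200},
        {"role": "user", "content": "Fix the bug in main()."},
    ],
})
print(context_breakdown(captured))
```

Run against real intercepted traffic, a tally like this makes "aggressive dumping" visible immediately: the history bucket dwarfs the latest user message.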

2

Prompt Caching Slashes OpenAI Latency, Costs

OpenAI's prompt caching reuses key/value tensors for identical prompt prefixes on prompts of 1,024 tokens or more. Keeping prefixes stable and tool definitions consistent improves cache hit rates, reducing latency and costs. The Responses API offers better cache utilization than Chat Completions.
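One way to put that advice into practice is to keep the stable parts of a request (system prompt, tool definitions) byte-identical across calls and push anything volatile to the end. A minimal sketch, assuming a Chat Completions-style request shape; the model name, `build_request`, and the tool definition are hypothetical:

```python
from datetime import datetime, timezone

# Stable prefix: identical across requests, so the server-side cache
# can reuse the key/value computation for these tokens.
SYSTEM_PROMPT = "You are a support assistant. Answer concisely."
TOOL_DEFS = [{
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Look up an order by id",
        "parameters": {"type": "object",
                       "properties": {"order_id": {"type": "string"}}},
    },
}]

def build_request(history: list, user_msg: str) -> dict:
    """Place stable content first and volatile content last, so the
    shared prefix across requests stays as long as possible."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    messages += history  # prior turns are append-only, so they stay stable
    # Volatile data (e.g. the current time) goes in the LAST message,
    # never in the system prompt, or it would bust the cached prefix.
    now = datetime.now(timezone.utc).isoformat()
    messages.append({"role": "user", "content": f"[{now}] {user_msg}"})
    return {"model": "example-model", "messages": messages, "tools": TOOL_DEFS}
```

Because the prefix is identical across calls, later requests in a session can hit the cache on the system prompt and earlier history; injecting the timestamp into the system prompt instead would invalidate the cached prefix on every call.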

ANALYSIS
3 stories
1

Autonomy Doubles in Claude Code, Users Still Intervene

Anthropic research shows Claude Code's autonomous turn duration nearly doubled at the 99.9th percentile. Experienced users auto-approve more often but also intervene more frequently, implying agent systems require effective monitoring and intervention tools.

2

Kasava: AI Should Be an Exoskeleton, Not a Coworker

Kasava argues AI should amplify human capacity like an 'exoskeleton', not operate as an autonomous 'coworker'. The piece criticizes current agentic AI for often failing due to a lack of human context, advocating instead for tools that assist human decision-making and execution.

3

Octoverse Data Shows AI Drives Devs to Typed Languages

GitHub's analysis suggests AI tools like Copilot are shifting developer language choices, creating a 'convenience loop' in which devs favor AI-integrated tech. Octoverse 2025 data shows TypeScript usage now surpassing Python and JavaScript. The piece argues AI compatibility is a critical factor in future tech stack decisions.

TOOLS
1 story
1

Agent Tracking, Observability Added to MLflow Platform

MLflow, an open-source platform, now offers end-to-end tracking, observability, and evaluation for building AI agents and models. It integrates these features to manage the full AI development lifecycle, with Python support and an active community.