Issue #53 · 26 min read · 13 stories

Paradox: AI Agents Make You Do More Work

Beyond the paradox: explore long context inference breakthroughs and how CLIs are being redesigned for AI agents.

A new argument making the rounds over the weekend suggests AI agents won't actually reduce your workload but increase it, challenging a core assumption for many builders. Meanwhile, advances in long-context inference are reshaping how transformers operate at scale, expanding their practical limits. And one team shipped a Claude Code system that handles 2-million-character novels by directly addressing AI's memory and hallucination issues.

NEWS
1 story

Ben Affleck's AI Studio Acquired by Netflix

Netflix acquired InterPositive, a startup co-founded by Ben Affleck, which builds AI tools for filmmakers. The company trains custom AI models on proprietary footage to assist with post-production tasks like wire removal and shot reframing, not to generate content from scratch.

TECHNICAL
4 stories
1

Monte Carlo: Humanoid Robots Echo Flash Dev Hurdles

Alastair Monte Carlo argues that humanoid robot engineering mirrors Flash development challenges in timing, sequencing, and state synchronization. He emphasizes perceptual continuity and aligning robot motion with human predictive models to build user trust, and argues that trust also rests on architectural integrity, including hardware-rooted identity and secure boot.

2

Poehnelt: CLIs Need Redesign for AI Agents

Justin Poehnelt argues CLIs need a redesign for AI agents, moving beyond human-centric design. He advocates for machine-readable JSON output for direct API schema mapping and runtime schema introspection over static documentation. Key features include context window discipline with field masks, robust input hardening, and safety rails like dry-run and response sanitization.
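To make the pattern concrete, here is a minimal sketch of an agent-friendly CLI; it is an illustration of the ideas (JSON output, a field mask for context-window discipline, a dry-run safety rail), not Poehnelt's actual tool, and all names and fields are invented for the example.

```python
# Illustrative agent-friendly CLI: JSON by default, --fields to trim
# output, --dry-run to describe a call without performing it.
import argparse
import json

RECORD = {"id": 42, "name": "report.pdf", "size": 1834, "owner": "alice"}

def run(argv=None):
    p = argparse.ArgumentParser(prog="fetch")
    p.add_argument("--fields", help="comma-separated field mask, e.g. id,name")
    p.add_argument("--dry-run", action="store_true",
                   help="describe the action without performing it")
    args = p.parse_args(argv)

    if args.dry_run:
        return json.dumps({"would_fetch": "record 42", "mutating": False})
    record = RECORD
    if args.fields:  # field mask: emit only what the agent asked for
        keep = set(args.fields.split(","))
        record = {k: v for k, v in record.items() if k in keep}
    return json.dumps(record)

print(run(["--fields", "id,name"]))  # → {"id": 42, "name": "report.pdf"}
```

Structured output like this maps cleanly onto an API schema, and the field mask keeps an agent's context window from filling with data it never asked for.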

3

Long Context Inference Battles Quadratic Tax, KV Cache

Long-context inference in Transformers faces two main hurdles: the 'quadratic tax' on attention and KV cache memory bottlenecks. The article explores solutions like KV cache compression (MLA, quantization), attention-replacing architectures (Mamba), and distributed attention (Ring Attention). These innovations can dramatically cut inference costs and boost concurrency; for example, DeepSeek-style MLA can increase concurrent users on a 70B model at 128K context from 1 to 27 per H100, reducing costs from $19.84 to $0.73 per million output tokens.
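The KV cache bottleneck is easy to see with back-of-the-envelope arithmetic. The sketch below assumes Llama-70B-like GQA dimensions (80 layers, 8 KV heads, head dim 128, fp16) and a DeepSeek-style ~512-dim compressed latent; the numbers are illustrative, not taken from the article.

```python
# Per-sequence KV cache size under standard GQA vs. an MLA-style latent.
# Dimensions are assumed (Llama-70B-like), purely for illustration.

def kv_cache_bytes(seq_len, n_layers=80, n_kv_heads=8, head_dim=128,
                   dtype_bytes=2):
    """Keys + values (factor of 2) for every layer and position."""
    return 2 * n_layers * n_kv_heads * head_dim * dtype_bytes * seq_len

full = kv_cache_bytes(128_000)          # standard GQA cache at 128K context
print(f"{full / 2**30:.1f} GiB per sequence")       # ≈ 39.1 GiB

# MLA stores one compressed latent per token per layer instead of full
# K/V heads; with an assumed 512-dim fp16 latent the footprint shrinks.
latent = 128_000 * 80 * 512 * 2
print(f"{latent / 2**30:.1f} GiB with a 512-dim latent")   # ≈ 9.8 GiB
```

At ~39 GiB of cache per 128K-context sequence, a single H100's memory supports very few concurrent users; shrinking the per-token footprint is what lets concurrency (and per-token cost) improve by an order of magnitude.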

4

One Builder's Markdown System Drives Parallel Coding Agents

One builder shared their system for running multiple AI coding agents in parallel, using tmux and Markdown 'Feature Designs' (FDs). FDs document project steps from problem to verification, acting as a decision trace to manage agent output quality and track the agent lifecycle.

ANALYSIS
3 stories
1

SaaStr: AI Agents Drive More Work, Not Less

The 'Cowan Paradox' argues that AI agents, much like past labor-saving devices, increase overall work by raising expectations and creating new tasks. This analysis suggests AI intensifies task scope and pace, shifting competitive advantage towards taste, judgment, and relationships. Deploying AI means planning for increased output and proactively managing cognitive load.

2

Daily Use Reveals Codex's Stability, Code Quality Gains

After a year of daily production use, one team reports OpenAI Codex has significantly improved in error handling, stability, multi-turn conversations, and code quality. Model selection remains opaque, but network reliability is much better, and Codex has become an indispensable tool for routine maintenance and feature development, dramatically increasing their feature velocity.

3

Tom Tunguz: AI Agents Drive Software NDR Decline

Tom Tunguz's analysis points to an accelerating decline in Net Dollar Retention (NDR) for public software companies, with contraction below 100% expected by 2026. He attributes this to macro pressures and AI agents replicating simpler products. Founders and PMs building products with easily replicable workflows should expect accelerating NDR decline as AI agents mature.

TOOLS
5 stories
1

Genomic Design Tool Models All Life Domains

ArcInstitute released Evo2, a Jupyter Notebook-based tool for modeling and designing genomes across all domains of life. It offers a way to explore and manipulate genetic information, providing a direct, open-source path for bio-AI teams to prototype genomic models.

2

AI Novel Tool Manages 2M Character Context

The `webnovel-writer` tool uses Claude Code to assist in long-form web novel creation, specifically tackling AI "forgetting" context and "hallucinations." It supports continuous writing for projects up to 2 million characters. The project demonstrates a method for managing long-form context and reducing hallucinations in AI-assisted writing.

3

Share Claude Sessions as Interactive HTML Replays

Claude-replay converts Claude Code session logs into interactive, shareable HTML replays, solving the problem of bulky screen recordings or raw transcripts. The tool generates a single HTML file with speed control, collapsible thinking blocks, and secret redaction, making it useful for agent demos and bug reports.

4

FlashAttention Boosts Transformer Speed, Cuts Memory

FlashAttention optimizes the attention mechanism in transformer models, delivering significant speedups and memory savings through kernel fusion and tiling, which allows larger models to be trained and run on existing hardware. It ships as a CUDA kernel with a Python interface, making it directly usable by builders working with large-scale AI models.
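The tiling relies on the online softmax trick: scores are processed block by block with a running maximum and normalizer, so the full attention matrix is never materialized. This toy 1-D Python version shows only that trick; the real kernel additionally fuses the V-weighted sum on-chip.

```python
# Online softmax over tiles: numerically stable, no full score matrix.
import math

def softmax_online(scores, tile=4):
    m, s = float("-inf"), 0.0            # running max and normalizer
    for i in range(0, len(scores), tile):
        block = scores[i:i + tile]
        m_new = max(m, max(block))
        # rescale the old normalizer to the new max, then add the block
        s = s * math.exp(m - m_new) + sum(math.exp(x - m_new) for x in block)
        m = m_new
    return [math.exp(x - m) / s for x in scores]  # second pass emits probs

probs = softmax_online([0.1, 2.0, -1.0, 0.5, 3.0, 0.0])
assert abs(sum(probs) - 1.0) < 1e-9
```

Because each tile only needs the running `(m, s)` pair, the working set stays in fast on-chip memory regardless of sequence length, which is where the memory savings come from.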

5

Open-Source CLI Unifies Google Workspace for Agents

A new open-source CLI, `gws`, unifies access to Google Workspace APIs like Drive and Gmail. It dynamically builds its command surface from Google's Discovery Service, providing structured JSON output and built-in agent skills for AI agent integration.