Issue #18 · 16 min read · 8 stories

Offline AI on your phone, distillation attacks, and Claude in raids

Generative AI shifts focus to cognitive debt. Publishers restrict AI scraping. Fast LLM inference tricks.

Run text and image generation offline on your phone with the new Off Grid tool. Google and OpenAI are pushing back against cheap model distillation attacks. Plus, generative AI is highlighting cognitive debt over technical debt.

NEWS
3 stories

Publishers Block Internet Archive Over AI Scraping Fears

News publishers are blocking Internet Archive crawlers, citing concerns that the Wayback Machine can serve as a workaround for AI scraping restrictions. Publications like The Guardian and The New York Times are adding robots.txt rules (e.g., blocking archive.org_bot) and other crawler restrictions to keep their content from being scraped without authorization.
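A robots.txt rule of the kind described, targeting the archive.org_bot user-agent named above, would look like this:

```
User-agent: archive.org_bot
Disallow: /
```

This blocks the Wayback Machine's crawler site-wide; publishers can scope the `Disallow` path more narrowly if they only want to shield specific sections.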

TECHNICAL
2 stories

Fast mode economics: low-batch inference vs specialised accelerators

Anthropic speeds up its Opus model with low-batch inference (up to 2.5x faster, around 170 tok/s) at a higher cost per request, while OpenAI reaches over 1,000 tok/s on a less capable model (Spark) using specialised Cerebras chips. The author questions whether trading model capability for speed is worth it.
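The latency gap is easy to feel with a back-of-envelope calculation using the throughput figures from the story (the response length is an assumption for illustration, not from the source):

```python
# Back-of-envelope latency sketch using the tok/s figures above.
def seconds_to_generate(tokens: int, tok_per_s: float) -> float:
    """Time to stream out `tokens` at a given decode throughput."""
    return tokens / tok_per_s

response_tokens = 2_000  # assumed size of a long agent response

opus_fast = seconds_to_generate(response_tokens, 170)   # low-batch Opus
cerebras = seconds_to_generate(response_tokens, 1_000)  # Cerebras-backed Spark

print(f"low-batch Opus: {opus_fast:.1f}s, Cerebras-backed: {cerebras:.1f}s")
```

At these speeds the Cerebras path finishes a long response roughly six times sooner, which is the speed-versus-capability trade-off the author is weighing.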


Context tax: cutting agent spend by shrinking token bloat

LLM agents incur a 'context tax' from excessive tokens, a triple penalty: it costs money, increases latency, and makes the agent dumber. Strategies like optimising KV-cache reuse, storing tool outputs on the filesystem, and designing precise tools can cut these costs and boost agent efficiency.
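The filesystem-storage idea can be sketched in a few lines: instead of pasting a large tool output into the agent's context, write it to a file and hand the model a short reference. The function name, threshold, and payload shape here are illustrative assumptions, not from the article:

```python
import json
import tempfile
from pathlib import Path

MAX_INLINE_CHARS = 2_000  # assumed cutoff; tune per model and pricing

def compact_tool_output(name: str, output: str, workdir: Path) -> str:
    """Return raw output if small, else a short file reference for the context."""
    if len(output) <= MAX_INLINE_CHARS:
        return output
    path = workdir / f"{name}.txt"
    path.write_text(output)
    # The model sees a pointer plus a small preview, not the full blob.
    return json.dumps({
        "stored_at": str(path),
        "bytes": len(output),
        "preview": output[:200],
    })

workdir = Path(tempfile.mkdtemp())
big = "x" * 50_000  # e.g., a verbose search or grep result
ref = compact_tool_output("search_results", big, workdir)
print(len(ref))  # a few hundred chars in context instead of 50 kB
```

The agent (or a follow-up tool call) can read the file back only if it actually needs the full content, so the tokens are spent on demand rather than up front.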

ANALYSIS
2 stories

AI code generation can amplify 'cognitive debt' for devs

Building features fast with AI tools can increase 'cognitive debt'—the mental burden of understanding complex systems. This debt lives in the developers' minds, making future work harder as AI-generated code outpaces comprehension. Counter with executable docs, tests-as-spec, traceable architecture decisions, and smaller agent scopes.
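One of the suggested counters, tests-as-spec, can be sketched as plain assertions that pin down intended behaviour before AI-generated code is accepted. The discount function and its rules here are hypothetical, purely to show the pattern:

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical AI-generated candidate implementation under review."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Tests-as-spec: these assertions ARE the documented intent, written by a
# human, so comprehension stays ahead of the generated code.
assert apply_discount(100.0, 0) == 100.0    # zero discount is identity
assert apply_discount(100.0, 25) == 75.0    # simple percentage case
assert apply_discount(19.99, 10) == 17.99   # result rounds to cents
try:
    apply_discount(100.0, 150)
except ValueError:
    pass
else:
    raise AssertionError("out-of-range discount must be rejected")
```

Because the spec is executable, regenerating or refactoring the implementation later re-verifies the human's understanding automatically instead of leaving it in someone's head.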


AI boom absent from economic data, economist notes

Macroeconomic data shows no clear AI impact on employment, productivity, or wages yet, per a Fortune interview with economist [Economist Name]. This mirrors the 'computer age' paradox. Economists suggest AI will likely enhance labor in some sectors rather than replace workers across the board. Expect ROI scrutiny; instrument productivity internally, don't wait for macro proof.

TOOLS
1 story

Run AI Models Offline on Your Phone with Off Grid

Off Grid is a new mobile app that runs LLMs like Llama 3.2 and image generators like Stable Diffusion entirely offline on smartphones. Inference is on-device and, per the project's claims, nothing is uploaded; model downloads still require a network connection. It uses NPU acceleration on Snapdragon devices and Core ML on iOS.