Back to archive
Issue #41 · 22 min read · 11 stories

OpenAI Deploys on Cerebras, First Non-Nvidia

A single harness improved 15 LLMs' coding, Karpathy builds a transformer, and AI SaaS valuations diverge.

OpenAI deployed GPT-5.3-Codex-Spark on Cerebras chips yesterday, marking its first production move away from Nvidia. The move signals a strategic diversification of the compute landscape that builders should factor into future infrastructure plans. One builder also showed how a single harness improved coding performance across 15 LLMs, underscoring the power of system design, while new analysis highlights the radical re-rating of AI-native SaaS valuations.

NEWS
6 stories
1

AI Mode Identifies Logical Flaws in Research Papers

Google released Gemini 3 Deep Think, a specialized reasoning mode for scientific and engineering problems. Available to Ultra subscribers and select enterprises, early testers used it to identify logical flaws in research papers and optimize crystal growth. The model shows state-of-the-art performance on academic benchmarks in mathematics, physics, and chemistry.

2

CLI Proxy Cuts LLM Tokens 60-90%

RTK (Rust Token Killer) is a CLI tool that filters and compresses command outputs, saving 60-90% of LLM tokens for common operations like `ls` and `git diff`. It offers various installation methods and a `discover` command to identify additional savings opportunities by scanning session history.
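RTK itself is a Rust tool and its internals aren't shown in the story; as a rough illustration of the underlying idea, here is a hypothetical Python sketch of diff filtering, where every name and heuristic is invented for this example rather than taken from RTK:

```python
# Hypothetical sketch of command-output filtering in the spirit of RTK:
# keep only hunk headers and changed lines from a unified diff, then
# report a rough token saving (approximating tokens as words).

def compress_diff(diff_text: str) -> str:
    """Keep only file/hunk headers and +/- lines; drop unchanged context."""
    kept = []
    for line in diff_text.splitlines():
        if line.startswith(("diff ", "@@", "+", "-")):
            kept.append(line)
    return "\n".join(kept)

def token_saving(original: str, compressed: str) -> float:
    """Fraction of 'tokens' saved, crudely counting whitespace-split words."""
    orig = len(original.split())
    comp = len(compressed.split())
    return 1 - comp / orig if orig else 0.0

raw = """diff --git a/app.py b/app.py
@@ -1,5 +1,5 @@
 import os
-DEBUG = True
+DEBUG = False
 def main():
     pass"""
small = compress_diff(raw)
print(f"saved {token_saving(raw, small):.0%}")
```

Even this crude filter drops all unchanged context lines; RTK's reported 60-90% savings presumably come from far more aggressive, command-specific rules.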

3

Coding Model Deploys on Cerebras for Fast Inference

OpenAI launched GPT-5.3-Codex-Spark, a model optimized for fast, interruptible coding tasks, deployed on Cerebras Systems hardware. This marks OpenAI's first production inference deployment off Nvidia, a strategic diversification of its infrastructure toward low-latency workloads. It's available as a research preview for ChatGPT Pro subscribers.

4

Anthropic Raises $30B at $380B Valuation

Anthropic secured $30 billion in Series G funding, pushing its post-money valuation to $380 billion. GIC and Coatue led the investment, which will fund frontier research, product development, and infrastructure. The investment reflects Claude's rapid enterprise adoption, with Claude Code now estimated to author 4% of all GitHub public commits and the platform expanding its multi-cloud infrastructure for enhanced resilience.

5

Hitzig: Ex-OpenAI Researcher Quits Over ChatGPT Ads

A former OpenAI researcher resigned, arguing that testing ads on ChatGPT risks user manipulation: advertisers could leverage the personal information users share with the AI to exploit their fears and desires. The researcher argues that, as with Facebook before it, the move creates a false choice between restricting access and exploiting users, rather than pursuing alternative monetization.

6

Recursive LMs Handle 10M+ Token Contexts

A new post introduces Recursive Language Models (RLMs), an inference strategy addressing unbounded input context lengths and 'context rot'. RLMs let models recursively interact with context using REPL environments to decompose complex queries. One RLM instance outperformed GPT-5 on the OOLONG long-context benchmark, and RLMs showed no performance degradation with over 10 million tokens at inference time. The authors suggest RLMs could be the next milestone beyond CoT and ReAct for inference-time scaling.
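As a toy, model-free illustration of the recursive idea (not the authors' implementation), the sketch below uses a "window budget" to force decomposition: when the input exceeds the budget, it splits the context, recurses on each half, and combines the sub-answers, exactly the shape an RLM gives to a real model via a REPL. The task and all names are invented here.

```python
# Toy sketch of recursive context decomposition in the spirit of RLMs.
# Stand-in task: count the lines mentioning a term, while never
# "looking at" more than `budget` lines at once (a mock context window).

def count_mentions(term: str, lines: list[str], budget: int = 4) -> int:
    if len(lines) <= budget:
        # Base case: the context fits in the window, answer directly.
        return sum(term in line for line in lines)
    mid = len(lines) // 2
    # Recursive case: decompose, solve each half, combine sub-answers.
    return (count_mentions(term, lines[:mid], budget)
            + count_mentions(term, lines[mid:], budget))

docs = ["alpha beta", "gamma", "alpha", "delta alpha", "eps", "alpha"]
print(count_mentions("alpha", docs, budget=2))
```

The key property, mirrored from the post's claim, is that the answer is independent of the budget: arbitrarily long inputs are handled by deeper recursion, not a bigger window.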

TECHNICAL
3 stories
1

243-Line Python Exposes Transformer Mechanics

Andrej Karpathy created microGPT, a 243-line Python Transformer implementation with no external dependencies. It exposes the core mathematical principles of LLMs, including automatic differentiation and the GPT architecture, serving as an educational tool for learning LLM mechanics from scratch.
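microGPT's actual 243 lines are Karpathy's; as a taste of the from-scratch style, here is a minimal scalar reverse-mode autodiff sketch (the `Value` class and all names are illustrative, in the spirit of his earlier micrograd, not code from microGPT):

```python
# Minimal scalar reverse-mode automatic differentiation, dependency-free,
# sketching one of the building blocks a from-scratch GPT needs.

class Value:
    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._backward = lambda: None  # how to push grad to children

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then propagate grads in reverse.
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    visit(c)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

x, y = Value(2.0), Value(3.0)
z = x * y + x        # z = x*y + x, so dz/dx = y + 1, dz/dy = x
z.backward()
print(x.grad, y.grad)  # 4.0 2.0
```

Backprop through a full Transformer is this same mechanism applied to matrix ops, attention, and a loss, which is what makes a 243-line complete implementation plausible.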

2

Thesis: Code Harness, Not LLM, Bottlenecks Coding Performance

One article argues the "harness" for LLMs interacting with code is a bigger bottleneck than the models themselves. The author's "Hashline" tool, which tags code lines with content hashes, dramatically reduces edit failures and token waste for 16 models. This suggests builders can achieve significant coding agent performance gains by optimizing the LLM-code interface, often exceeding benefits from model upgrades at zero training cost.
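Hashline's actual design isn't detailed in the summary; a hypothetical sketch of the hash-tagged-lines idea (function names and hash length invented here) could look like this: an agent addresses a line by the hash of its content rather than by a line number, so edits fail loudly when the file has drifted instead of landing on the wrong line.

```python
# Hypothetical sketch of content-hash line addressing ("Hashline" idea):
# tag each line with a short hash of its content so an editing agent can
# target lines by hash, which stays valid as surrounding lines move.

import hashlib

def tag_lines(source: str) -> list[tuple[str, str]]:
    """Return (hash, line) pairs; the hash identifies the content."""
    return [(hashlib.sha1(line.encode()).hexdigest()[:8], line)
            for line in source.splitlines()]

def edit_by_hash(source: str, line_hash: str, replacement: str) -> str:
    """Replace the line whose hash matches; fail loudly on a stale edit."""
    out, found = [], False
    for h, line in tag_lines(source):
        if h == line_hash:
            out.append(replacement)
            found = True
        else:
            out.append(line)
    if not found:
        raise ValueError(f"no line with hash {line_hash}: stale edit")
    return "\n".join(out)

code = "x = 1\ny = 2\nz = x + y"
h = tag_lines(code)[1][0]          # hash of "y = 2"
print(edit_by_hash(code, h, "y = 20"))
```

The failure mode this removes is the classic one for line-number-based edits: after an earlier edit shifts the file, a later edit silently rewrites the wrong line.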

3

Specialized Agent Teams Ship Features in Hours

Antfarm Patterns orchestrates specialized AI agent teams, assigning distinct roles like planner, developer, verifier, tester, and reviewer. This multi-agent approach overcomes single-agent limitations by using fresh contexts and clear handoffs, resulting in features shipping in hours instead of weeks.
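As a loose sketch of the role-pipeline pattern (not Antfarm's code; every stage here is a trivial stand-in for an LLM call), the essential shape is that each role receives only the previous role's artifact as fresh context:

```python
# Toy role pipeline in the spirit of specialized agent teams:
# planner -> developer -> verifier, with a clean handoff at each step.

def planner(task: str) -> list[str]:
    """Break a task into steps (stand-in for a planning agent)."""
    return [f"step: {part.strip()}" for part in task.split(",")]

def developer(plan: list[str]) -> str:
    """Produce an artifact from the plan only, with no task context."""
    return "\n".join(f"# done {s}" for s in plan)

def verifier(artifact: str) -> bool:
    """Check the artifact against a simple contract."""
    return all(line.startswith("# done") for line in artifact.splitlines())

def run_team(task: str) -> bool:
    plan = planner(task)          # handoff 1: task -> plan
    artifact = developer(plan)    # handoff 2: plan -> artifact
    return verifier(artifact)     # handoff 3: artifact -> verdict

print(run_team("parse input, write output"))  # True
```

The design point the story makes is carried by the narrow interfaces: because each stage starts from a small, explicit artifact rather than a long shared transcript, context stays fresh and responsibilities stay unambiguous.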

ANALYSIS
4 stories
1

SaaStr: AI-Native Valuations Crush Legacy SaaS

The article highlights the valuation gap between CS Disco (~1x ARR) and AI-native legal tech like Harvey (~110x ARR). SaaStr argues that decelerating growth and incumbency in an AI-disrupted market impose an 'incumbency penalty' on older SaaS, while rapid growth and an 'AI-native' narrative drive high multiples. Founders of pre-AI SaaS should expect extreme valuation pressure unless they dramatically accelerate growth, pursue acquisition, or prove competitive AI capabilities.

2

RunLLM: Micro-Task Strategy Built Cursor's Data Moat

One analysis argues Cursor's edge comes from breaking coding tasks into small, incremental steps, similar to autocomplete. This low-stakes interaction allows frequent user feedback, building trust and a powerful data moat. This approach contrasts with agents aiming for full task autonomy, suggesting a focus on minimal, verifiable units maximizes correctness and data collection.

3

Shumer: AI Automates Engineering Work

The author argues current AI capabilities, especially in code generation, are vastly underestimated and poised to disrupt all industries within one to five years. He asserts AI can now perform complex tasks with minimal human input, even exhibiting judgment. This rapid progress, driven by a few key labs, is far beyond what most people perceive from free AI versions.

4

AI COGS Redefine Product Pricing (BVP Thesis)

BVP's analysis argues AI product pricing diverges from traditional SaaS due to high compute and inference costs. Their playbook details three business models—Copilots, Agents, AI-enabled Services—and pricing metrics like consumption or outcome, noting hybrid models offer flexibility for early-stage startups.

TOOLS
1 story
1

Postgres Extension Embeds AI Agents In Rows

Pgclaw is a new open-source Postgres extension that embeds AI agents directly into database rows. It introduces a 'claw' data type, letting agents classify ticket priority, generate summaries, or write and run code via Claude Code. The extension integrates with Postgres features like ACID compliance and JOINs, supporting various LLM providers.