Issue #2·Friday, January 2, 2026·22 min read·11 stories

Your LLM Agents Keep Failing Basic Instructions? Here's Why.

Claude plugin for coding context, sales teams replaced by agents, and xAI's 2 GW power needs

Yesterday, Unsloth released a tool for 2x faster LLM fine-tuning, using 70% less VRAM. This directly impacts iteration speed and compute spend for builders. Separately, new research explains why LLMs miss basic prompt instructions, a critical lesson for designing agentic systems that actually work.

▲NEWS

3 stories

xAI's Colossus Supercomputer Nears 2 GW Power

xAI is expanding its Colossus supercomputer with a third building, "MACROHARDRR," aiming for nearly two gigawatts of power and over one million GPUs. This multi-billion dollar investment boosts xAI's training capacity but forces them to build a natural gas plant due to massive energy demands.

Read full story→

2026 IPOs: Anthropic, OpenAI, SpaceX

Analysts expect 2026 to see major AI IPOs, with Anthropic and OpenAI potentially debuting publicly. These could collectively surpass hundreds of prior US offerings. SpaceX is also reportedly in the mix, alongside three Chinese tech giants planning $1B+ Hong Kong IPOs.

Instagram CEO: Don't Trust Online Content Anymore

Instagram CEO Adam Mosseri declared the 'intimate feed' dead, stating users 'can’t necessarily trust what you see anymore' due to AI. This marks a fundamental shift in digital authenticity assumptions, requiring new verification UX.

⚙TECHNICAL

3 stories

Build Your Own Deep Learning Library with NumPy

A new book teaches how to build a deep learning library from scratch, focusing on an autograd engine and layer modules using only NumPy. You'll create a custom library capable of training models like MNIST, simple CNNs, and simple ResNets.

A practical internal agent playbook: context compaction, subagents, VFS

Imprint details their internal AI agent workflows, sharing methods for context window compaction via summarization, structuring subagents for specific tasks, and using a virtual file system (VFS) for state management. They advocate a 'build to throw away' method for learning.

Optimal Small LLM: 3.8x Faster, More Factual

At 70M, many transformer variants cluster tightly; recipe and data dominate. New research shows a diffusion model, Dhara-70M, sacrifices 1.33% accuracy (on specific benchmarks) for a 3.8x throughput boost and superior factuality (measured by a specific metric).

◈ANALYSIS

1 story

SaaStr Founder Swaps 10 Sales Reps for 20 AI Agents

SaaStr founder Jason Lemkin states, 'We replaced our team of 10 Sales Execs with 20 AI Agents, managed by 1.2 humans.' He discusses effective AI tools for sales and predicts most SDRs/BDRs will be obsolete within a year.

⚒TOOLS

4 stories

Unsloth Speeds LLM Fine-Tuning by 2x, Cuts VRAM 70%

Unsloth claims up to 2x faster fine-tuning with ~70% less VRAM on supported runs for popular LLMs like Llama, Gemma, Qwen, DeepSeek, and OpenAI gpt-oss.

Claude-Mem Gives Claude Persistent Coding Memory

Claude-mem is a Claude Code plugin that captures tool calls, diffs, and summaries during coding sessions. It compresses this data using Claude's agent-sdk and injects relevant context into future sessions, creating a persistent memory. Implement filters, allowlists, or redaction for secret hygiene.

NVIDIA Delivers Reproducible DL Examples

NVIDIA's DeepLearningExamples repository offers deep learning scripts. Pick one model, run the reference script on your target GPU, capture throughput and accuracy, and use it as a baseline for optimization and vendor comparisons.

Tasker Automates Desktop Tasks with AI

Tasker is a free, open-source desktop agent for browser automation. It records actions or takes plain English descriptions, using AI to adapt to website changes and run locally for privacy. It is still brittle on heavy JS apps, CAPTCHA, 2FA, and anti-bot.