Agent safety, LLM randomness, and a new CLI for models. Plus: Google & Apple talk AI.
Yesterday saw the release of `simstudioai/sim`, an open-source platform for building and deploying AI agent workflows, giving builders more control over their agent stack. LLMs also turn out to struggle to generate truly random numbers from statistical distributions, a concrete limitation if you're building probabilistic systems. And yesterday's 'Have-Lots' analysis offers context on how compute and talent concentrate in the AI rush, relevant for infrastructure planning and fundraising.
Apple confirmed a multi-year partnership with Google, stating that the next generation of Apple Foundation Models will be based on Gemini models and Google Cloud technology.
UK regulator Ofcom opened a formal investigation into X under the Online Safety Act following reports about Grok-generated child sexual abuse material. Ofcom will assess X's compliance with content safety duties.
Brooker argues for an external "box" approach to AI agent safety, using a deterministic control layer outside the agent that strictly limits its tool access and actions. This method prioritizes secure execution environments and fine-grained policy enforcement over internal alignment, aiming for verifiable control regardless of agent reasoning or prompt injections.
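Brooker's post doesn't ship code, but the "box" idea is easy to sketch: a deterministic policy check that runs outside the model, so nothing in the prompt can loosen it. The names below (`Policy`, `run_tool`, the toy tools) are ours, purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class Policy:
    """Deterministic control layer: rules live outside the agent."""
    allowed_tools: set
    max_calls: int = 10
    calls: int = 0

    def check(self, tool: str) -> None:
        # Enforced in ordinary code, regardless of agent reasoning
        # or any prompt-injection attempt.
        if tool not in self.allowed_tools:
            raise PermissionError(f"tool {tool!r} not allowed")
        if self.calls >= self.max_calls:
            raise PermissionError("call budget exhausted")
        self.calls += 1

# Toy tool registry standing in for real agent tools.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
    "delete_file": lambda path: f"deleted {path}",
}

def run_tool(policy: Policy, tool: str, **args):
    policy.check(tool)  # the gate the agent cannot talk its way past
    return TOOLS[tool](**args)

policy = Policy(allowed_tools={"read_file"})
print(run_tool(policy, "read_file", path="notes.txt"))
try:
    run_tool(policy, "delete_file", path="notes.txt")  # blocked by the box
except PermissionError as e:
    print("blocked:", e)
```

The point is that the allowlist and budget are plain data checked by plain code; the agent's output only ever selects among pre-approved actions.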
In a Claire Vo / Lenny’s episode, OpenAI's Codex product lead, Alexander Embiricos, shared advanced workflows for maximizing AI coding agents, including parallelizing work with Git worktrees and writing detailed PLANS.md implementation plans. He also revealed how OpenAI built the Sora Android app in 28 days using Codex.
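The worktree trick gives each agent its own checkout of the same repository, so parallel tasks never stomp on each other's working tree. A minimal sketch (paths and branch names are illustrative, not from the episode):

```shell
# Set up a throwaway repo, then give two parallel tasks
# separate checkouts via git worktrees.
cd "$(mktemp -d)"
git init -q demo && cd demo
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "init"

git worktree add -b feature-a ../feature-a   # agent 1 works here
git worktree add -b feature-b ../feature-b   # agent 2 works here
git worktree list                            # one line per checkout
```

Each worktree is a full checkout on its own branch sharing one object store, so two agents can edit, build, and test concurrently without cloning the repo twice.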
A new paper reveals LLMs are terrible at generating random numbers from statistical distributions. Benchmarking 11 frontier models across 15 distributions showed a 13% median pass rate for batch generation, while independent single-sample requests failed almost entirely. Don't roll dice with LLMs for statistical guarantees.
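The paper's harness isn't reproduced here, but the kind of goodness-of-fit check such a benchmark applies is easy to sketch with a hand-rolled one-sample Kolmogorov–Smirnov statistic (the helpers `normal_cdf` and `ks_statistic` are ours, and the "LLM-like" samples are simulated by quantizing real draws):

```python
import math
import random

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def ks_statistic(samples, cdf):
    """One-sample KS statistic: max gap between the empirical
    CDF of the samples and the reference CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d

random.seed(0)
good = [random.gauss(0, 1) for _ in range(500)]           # genuine N(0,1) draws
bad = [round(random.gauss(0, 1)) for _ in range(500)]     # quantized, mode-collapsed samples

print(ks_statistic(good, normal_cdf))  # small: empirical CDF tracks the normal
print(ks_statistic(bad, normal_cdf))   # large: quantization is detected immediately
```

A large KS statistic rejects the claimed distribution; samples that cluster on a few "typical" values, as LLM outputs tend to, fail this test even when each value looks plausible on its own.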
An Axios analysis claims the AI boom is concentrating returns, with vast fortunes gained through compute allocation, exclusive model access, and distribution moats. This trend reportedly exacerbates economic inequality, with disproportionate gains for an AI-connected elite.
Simon Willison says he treats code ported by LLMs as a derivative work, retaining original licenses and copyright. He also suggests labeling AI-generated code as "alpha slop" until it's production-tested, and advises builders to retain LICENSE and NOTICE files, preserve copyright headers, and document provenance.
SimStudioAI released "sim," an open-source platform for building and deploying AI agent workflows. It features a visual canvas to connect agents, tools, and blocks, allowing instant execution. Copilot generates nodes from natural language, and it offers optional vector-store grounding. Builders can self-host via Docker.
Simon Willison's `llm` tool is a CLI and Python library that lets you talk to large language models directly. It features a plugin ecosystem for many providers and local models, stores prompts and responses in SQLite, and supports embeddings and tool execution.
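A taste of the CLI, assuming an API key is already configured with `llm keys set openai` (model and plugin names below are examples):

```shell
pip install llm
llm 'Three names for a pet pelican'        # prompt the default model
llm -m gpt-4o-mini 'Summarize this diff'   # pick a model explicitly
llm logs -n 1                              # prompts/responses are logged to SQLite
llm install llm-ollama                     # plugins add providers and local models
```

Because every prompt and response lands in SQLite, the log doubles as a queryable dataset of your own model interactions.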