2026-05-08 6 min read
Today’s useful pile: Zyphra’s open ZAYA1 preview, OpenAI’s realtime voice push, AWS trying to make short GPU bursts less cursed, AgentCore Browser leaving the DOM, Gemini Flash-Lite going GA, and ChatGPT adding a trusted-contact safety rail.
2026-05-08 5 min read
Petri 3.0 turns Anthropic’s open alignment-testing tool into a more hackable, more realistic eval stack under Meridian Labs. Useful, if buyers treat it as a test harness instead of a trust sticker.
2026-05-07 6 min read
Anthropic says Claude Mythos Preview can find and exploit serious software flaws at a new scale. Project Glasswing is its attempt to put that capability in defenders’ hands before attackers get the same advantage.
2026-05-07 5 min read
Amazon Bedrock AgentCore Payments brings Coinbase, Stripe, x402, budgets, and observability into agent workflows. The useful question is not whether agents can pay — it is who controls when they are allowed to.
2026-05-05 5 min read
OpenAI is replacing GPT-5.3 Instant with GPT-5.5 Instant as ChatGPT’s default. The useful story is not just the claim of fewer hallucinations; it is whether memory, personalization, and model retirement become safer defaults.
2026-04-28 4 min read
Google’s new official Agent Skills repository gives agents compact, task-specific instructions for Cloud products instead of stuffing whole documentation sites into context.
2026-04-28 5 min read
A 13B model trained on pre-1931 text is less a nostalgia demo than a practical test bed for clean data, synthetic tuning, and what language models really learn from the web.
2026-04-25 3 min read
Romain Huet confirmed that OpenAI's dedicated Codex line is dead. The main model and the coding model are now the same system, changing how builders should evaluate GPT-5.5.
2026-04-25 4 min read
OpenAI pushed GPT-5.5 to Chat Completions and Responses with a 1M context window, while putting GPT-5.5-pro behind Responses. The real product is fewer retries — and a nudge off legacy chat endpoints.
2026-04-25 3 min read
OpenAI released detailed guidance on prompting GPT-5.5, and the primary lesson is demolition. Treat it as a new model family, delete your bloated prompt preambles, and keep the users of your tools updated while the model thinks.
2026-04-25 4 min read
The new prompt guidance for GPT-5.5 is an exercise in demolition. The advice isn't to add new magic words; it's to clear out legacy prompt debt and define the destination rather than the path.
2026-04-25 3 min read
API access means teams can stop admiring GPT-5.5 from the showroom and start deciding where it actually deserves production budget.
2026-04-25 3 min read
OpenAI’s workspace agents sound autonomous, but the useful test is much duller: can they take a real workflow, preserve context, and return an artifact that is actually reviewable?
2026-04-24 3 min read
OpenAI pitches its new model as better at complex coding and data analysis. The real test is whether it can navigate messy workflows without requiring constant human cleanup.
2026-04-23 2 min read
The new image model is definitely stronger, but the real lesson is that AI generation only works when teams apply constraints, budgets, and a review process.
2026-04-23 2 min read
OpenAI’s workspace agents aren't just about doing more chores. They are a deliberate march into the enterprise control layer, where permissions and approvals rule the world.
2026-04-23 2 min read
OpenAI is pitching GPT-5.5 as a smarter model, but the practical upgrade is supposed to be less hand-holding. If we don't have to hover over it while it works, that's an actual feature.
2026-04-23 2 min read
OpenAI is wrapping agent language around the most boring parts of enterprise life: shared chores, routing, and approvals. It's not glamorous, but it is unfortunately essential.
2026-04-16 3 min read
With native sandboxes, filesystem tools, and workspace manifests, OpenAI is admitting that agents need unglamorous harnesses to keep them from becoming clever incident generators.
2026-04-03 3 min read
Codex-only seats for Business and Enterprise teams are a pricing move designed to make coding-agent pilots easier to start, measure, and quietly expand without terrifying the finance department.
2026-03-25 3 min read
OpenAI is expanding ChatGPT's commerce capabilities with visual browsing and comparisons. The real battle isn't about owning the checkout button; it's about influencing the shopper before the cart even appears.
2026-03-18 3 min read
OpenAI’s GPT-5.4 mini and nano models are the unglamorous, cost-controlling workhorses that make complex agent systems economically viable.
2026-03-06 3 min read
Putting ChatGPT inside Excel isn't about magical insights. It's about automating the miserable middle of finance work: tracing formulas, building scenarios, and untangling inherited models.