M
Matt Wolfe·TechAI News: Massive Updates From OpenAI and Anthropic
TL;DR
GPT-5.5 tops intelligence benchmarks and ChatGPT Images 2.0 leads image rankings, while Anthropic's Claude Design enables animations and their Mythos model was accessed by unauthorized users.
Key Points
- 1.GPT-5.5 is now the top-ranked model on the Artificial Analysis Intelligence Index. It scores 82.7% on Terminal Bench (beating Anthropic's unreleased Mythos at 82%) and costs $5/1M input tokens and $30/1M output tokens — double GPT-5.4's price, but uses significantly fewer tokens per task.
- 2.GPT-5.5's key practical improvement is doing more with less context. A vague health prompt returned a generic plan from 5.4, while 5.5 pulled prior conversation history to deliver a fully personalized plan with travel-specific routines and schedule-aware workout days.
- 3.ChatGPT Images 2.0 now leads LM Arena's image model rankings with a score of 1,500, far ahead of former leader Nano Banana (Gemini 3.1 Flash Image) at 1,271. It features thinking capabilities, web search, dense text rendering, and can generate scannable barcodes that correctly resolve to real books.
- 4.Anthropic's Claude Design enables polished visual work and basic animations directly in the browser. Using Opus 4.7, it can generate interactive website prototypes, slide decks, bar charts, and After Effects-style animated line graphs from simple prompts, though it defaults to a recognizable dark-mode aesthetic.
- 5.Claude's new Live Artifacts feature in Co-Work creates dashboards that refresh with live data from connected apps and files like Google Calendar, Drive, or CSVs — though currently limited connector availability (e.g., only Figma in the demo) reduces immediate utility.
- 6.Anthropic's Mythos model was accessed by unauthorized users despite Anthropic never releasing it publicly due to safety concerns. Sam Altman publicly mocked the strategy, comparing it to marketing a bomb and selling a shelter, calling it 'clearly incredible marketing.'
- 7.Several new open and proprietary models launched this week beyond OpenAI and Anthropic. Alibaba released Qwen 3.6 Max Preview (proprietary) and Qwen 3.6 27B (open-source); Kimi K2.6 (open-source) supports 300 parallel sub-agents and beats Opus 4.6 and GPT-5.4 on some benchmarks; Google DeepMind released Deep Research Max.
- 8.OpenAI released two additional tools: a Privacy Filter model and ChatGPT for Clinicians. The Privacy Filter is open-weight, runs locally to mask PII without data leaving the device, and is available on Hugging Face and GitHub. ChatGPT for Clinicians is free to verified US clinicians for documentation and medical research.
- 9.Four robots completed a half marathon in China in under one hour, with at least one finishing faster than any human half-marathon record. The event also showcased robots falling, going the wrong direction, and struggling with tape on the ground, highlighting both progress and limitations.
Life's too short for long videos.
Summarize any YouTube video in seconds.
Quit Yapping — Try it Free →