M2.7 just BROKE the Entire Industry...
Wes Roth · Tech
TL;DR
MiniMax's M2.7 model ran 100+ autonomous self-improvement cycles with zero human input and matched Gemini 3.1's MLE-Bench score (66.6) on a single A30 GPU.
Key Points
1. MiniMax M2.7 handles 30–50% of the reinforcement learning team's workflow. An early checkpoint of M2.7 was tasked with building its own research agent harness and with managing data pipelines, bug fixes, log analysis, and experiment launches end-to-end.
2. The model ran 100+ autonomous self-improvement rounds with zero human input. It followed a loop of hypothesis → experiment → benchmark comparison → commit or revert, in effect applying the scientific method to optimize its own scaffold.
3. M2.7 achieved a 30% improvement on internal benchmarks through self-evolution. It tuned variables like model temperature, rewrote its own tools, and updated work guidelines, though the internal benchmark criteria remain unverified.
4. On OpenAI's MLE-Bench, M2.7 scored 66.6, tying Gemini 3.1, while running on a single A30 GPU costing $3,000–$7,000. Top models Claude Opus 4 (75.7) and GPT-5 (71.2) scored higher, but MiniMax matched Google's frontier model at a fraction of the infrastructure cost.
5. M2.7 scores near top-tier models across multiple coding benchmarks: SWE-Pro 56.22 (near Opus), Vibe Pro 55.6 (matches GPT-5 Codex), Terminal-Bench-2 57, and SWE Multilingual 76.5. It also placed second overall on GDP-Val among all models tested.
6. In live production debugging tests, M2.7 reduced recovery time to under 3 minutes by using non-blocking index creation to stop the damage before submitting a fix, demonstrating causal reasoning under time pressure rather than brute-force problem solving.
7. MiniMax explicitly states that M2.7 is restructuring how it operates as a company, calling it 'significantly accelerating our evolution into an AI-native organization.' The company also launched Open Room, an open-source AI agent that interacts with local files, with most of its code written by AI.
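The hypothesis → experiment → benchmark comparison → commit or revert loop from the key points can be illustrated with a minimal Python sketch. This is not MiniMax's implementation: the function names (`self_improve`, `toy_propose`, `toy_evaluate`), the single `temperature` setting, and the toy scoring function are all assumptions made up for illustration. It only shows the general shape of a greedy loop that keeps a change when the score improves and discards it otherwise.

```python
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def self_improve(
    config: Config,
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    rounds: int = 100,
) -> Tuple[Config, float, List[str]]:
    """Greedy loop: keep a proposed change only if the score improves."""
    best = dict(config)                  # copy so the caller's config is untouched
    best_score = evaluate(best)
    history: List[str] = []
    for _ in range(rounds):
        candidate = propose(best)        # hypothesis: a proposed change
        score = evaluate(candidate)      # experiment: run the benchmark
        if score > best_score:           # benchmark comparison
            best, best_score = candidate, score
            history.append("commit")     # keep the change
        else:
            history.append("revert")     # discard the change
    return best, best_score, history


# Hypothetical stand-ins: a real system would edit tools or prompts and run
# an actual benchmark. Here the "benchmark" is a toy function that peaks
# at temperature = 0.7.
rng = random.Random(0)


def toy_propose(cfg: Config) -> Config:
    new = dict(cfg)
    step = rng.uniform(-0.1, 0.1)
    new["temperature"] = min(2.0, max(0.0, cfg["temperature"] + step))
    return new


def toy_evaluate(cfg: Config) -> float:
    return -(cfg["temperature"] - 0.7) ** 2


start = {"temperature": 1.5}
best, best_score, history = self_improve(start, toy_propose, toy_evaluate)
print(best, best_score, history.count("commit"))
```

Because a candidate is committed only when it beats the current best, the score never decreases across rounds. The trade-off is that a purely greedy loop like this one can stall at a local optimum, and its result is only as trustworthy as the benchmark it optimizes against.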