Reviewing Nvidia Nemotron 3 Super and Releasing Our Own Model!
Level1Techs · Tech · 15:55

TL;DR

Nvidia's 120B Nemotron 3 Super and the community-built Kappa model both run locally, enabling powerful multi-agent AI orchestration through the open-source Turnstone platform.

Key Points

  1. Nemotron 3 Super specs: a 120B-parameter mixture-of-experts model with only 12B parameters active at once; runs locally on Nvidia Spark/GB10 hardware with 128GB VRAM, in NVFP4 or FP8 format, with a 131K-token context window
  2. Kappa model release: a fine-tune of GPT-OSS 20B with D&D-style character alignment, strong long-context recall, and tool calling; runs in as little as 10GB VRAM (MXFP4) and is available on HuggingFace
  3. Turnstone platform: an open-source, community-built multi-agent AI orchestration tool (by Patrick) that connects local and cloud models, supports tool calling, and deploys with `docker compose up`
  4. Context engineering over prompt engineering: modern AI workflows are shifting toward feeding models rich context (files, documentation, man pages) rather than crafting clever prompts, especially in multi-agent setups
  5. The "car wash problem": Nemotron 3 Super defaults to recommending walking because alignment training overrides its logical reasoning; Kappa's D&D alignment makes it more likely to push back and reason correctly
  6. Kappa vs. Nemotron on reasoning: in some alignment and tool-calling scenarios, the 20B Kappa model outperforms the 120B Nemotron because it "glazes" less (is less sycophantic), pushes back on bad ideas, and reasons more honestly
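Point 3's `docker compose up` deployment might look roughly like the sketch below. This is a hypothetical compose file, not Turnstone's actual one: the image names, ports, and environment variable are all assumptions, shown only to illustrate pairing an orchestrator with a local model server.

```yaml
# Hypothetical docker-compose.yml pairing an orchestrator with a local
# model server; the real Turnstone compose file will differ.
services:
  turnstone:
    image: turnstone:latest          # assumed image name
    ports:
      - "8080:8080"
    environment:
      - MODEL_ENDPOINT=http://llm:8000/v1   # assumed OpenAI-compatible endpoint
    depends_on:
      - llm
  llm:
    image: vllm/vllm-openai:latest   # example local model server, not from the video
    ports:
      - "8000:8000"
```

With a file like this in place, `docker compose up` starts both services and the orchestrator can route requests to the local model.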
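The "context engineering" idea in point 4 can be sketched as assembling reference files into the prompt itself rather than hand-tuning clever wording. This is a minimal illustrative sketch, not from the video; the function name and prompt layout are assumptions.

```python
# Hedged sketch of context engineering: gather reference material
# (files, docs, man pages) into the model's context window instead of
# crafting a clever prompt. All names here are illustrative.
from pathlib import Path

def build_context_prompt(task: str, context_files: list[str]) -> str:
    """Concatenate reference files into a single prompt preamble."""
    sections = []
    for name in context_files:
        text = Path(name).read_text(encoding="utf-8")
        sections.append(f"--- {name} ---\n{text}")
    context = "\n\n".join(sections)
    return (
        "Use the reference material below to complete the task.\n\n"
        f"{context}\n\n"
        f"Task: {task}"
    )
```

In a multi-agent setup, each agent would typically receive a prompt built this way from whatever files or documentation are relevant to its role.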
