Two Minute Papers · Tech
DeepSeek Just Fixed One Of The Biggest Problems With AI
TL;DR
DeepSeek's Engram adds a memory lookup system to transformers, replacing costly from-scratch reasoning with a simple retrieval mechanism that improves performance everywhere.
Key Points
1. Standard transformers wastefully recompute facts from scratch on every query. Modern models like ChatGPT and Gemini run dense mathematical computation even for simple factual lookups because they lack a cheap memory-retrieval mechanism.
2. DeepSeek's Engram module acts as a 'pantry' of pre-stored facts. It combines n-gram embeddings with multi-head hashing so the model can instantly locate and retrieve memorized information instead of reconstructing it.
3. Replacing 20–25% of the mixture-of-experts layers with Engram made the model smarter, not just faster. Loss curves dropped significantly, and Engram outperformed prior techniques on every benchmark tested, an unusually clean sweep.
4. A context-aware gating mechanism filters out irrelevant or contradictory retrieved memories. The gate computes a dot product between the current context and the retrieved memory and drops to zero when they conflict, preventing 'rotting fish' contamination of the output.
5. Disabling Engram cut trivia accuracy by 70% but left reading comprehension at 93%, evidence that the module stores facts separately from reasoning. The main limitation is placement: Engram must sit early in the network, since inserting it too deep wastes its benefit after the expensive computation has already happened.
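The lookup described in point 2 can be sketched in a few lines. This is a hedged illustration of the general idea (hash the trailing n-gram into a fixed table with several independent hash "heads" and combine the hits), not DeepSeek's actual implementation; the table size, embedding width, n-gram length, and hash functions are all illustrative assumptions.

```python
# Illustrative sketch of an Engram-style O(1) memory lookup.
# All sizes and hash choices below are assumptions for demonstration.
import numpy as np

TABLE_SIZE = 1 << 16   # number of memory slots per head (assumption)
DIM = 32               # embedding width (assumption)
NUM_HEADS = 4          # independent hash functions, i.e. "multi-head hashing"

rng = np.random.default_rng(0)
# Pretend these tables hold learned fact embeddings.
memory = rng.standard_normal((NUM_HEADS, TABLE_SIZE, DIM))

def hash_ngram(tokens, head):
    """Deterministically map an n-gram to a table slot for one hash head."""
    h = head * 1000003 + 17
    for t in tokens:
        # Simple polynomial rolling hash over the token strings.
        h = (h * 31 + sum(t.encode())) & 0x7FFFFFFF
    return h % TABLE_SIZE

def engram_lookup(context_tokens, n=2):
    """Retrieve a memory vector for the trailing n-gram in constant time,
    averaging across heads to reduce hash collisions -- no dense
    recomputation over the whole context."""
    ngram = tuple(context_tokens[-n:])
    hits = [memory[h, hash_ngram(ngram, h)] for h in range(NUM_HEADS)]
    return np.mean(hits, axis=0)

vec = engram_lookup(["the", "capital", "of", "france"])
print(vec.shape)  # (32,)
```

The point of the multiple heads is the same as in other hashing schemes: a single hash function collides often at this table size, while averaging several independent lookups makes two different n-grams unlikely to share all of their slots.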
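The gating idea in point 4 can also be sketched. This is a minimal interpretation of the summary's description (a dot product between context and retrieved memory that drops to zero on conflict); the scaling and clipping below are illustrative assumptions, not the actual Engram gate.

```python
# Hedged sketch of a context-aware gate: scale the retrieved memory by an
# agreement score, suppressing it entirely when context and memory conflict.
import numpy as np

def gate(context, retrieved):
    """Weight the retrieved memory by its alignment with the context.
    A negative (conflicting) dot product clips the weight to exactly 0."""
    score = float(context @ retrieved) / np.sqrt(context.size)
    weight = np.clip(score, 0.0, 1.0)   # illustrative squashing choice
    return weight * retrieved

ctx = np.ones(8)
aligned = np.ones(8)       # memory that agrees with the context
conflicting = -np.ones(8)  # memory that contradicts it

print(np.linalg.norm(gate(ctx, aligned)))      # ~2.83: passes through
print(np.linalg.norm(gate(ctx, conflicting)))  # 0.0: fully suppressed
```

The clip-at-zero choice mirrors the summary's claim that the gate "drops to zero" on conflict, which is what prevents a bad retrieval from contaminating the residual stream.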