DSNB · Concepts
Five concepts that explain DeepSeek
Most of what DeepSeek invented or amplified can be understood through a handful of recurring ideas: an attention reformulation, a sparse-activation architecture, a stance on open weights, a notion of data and model sovereignty, and a runtime toggle between deep thinking and direct answers. Each entry here defines the term in one line first, then walks through how DeepSeek uses it and which timeline events depend on it. EN-only for now; ZH backport will follow if the citation signal warrants it.
Multi-Head Latent Attention (MLA)
DeepSeek's attention reformulation that compresses keys and values into a shared low-rank latent vector, cutting KV-cache memory by ~93% in V2 relative to DeepSeek 67B.
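To make the memory argument concrete, here is a minimal sketch of the per-token cache arithmetic. It assumes standard multi-head attention caches one key and one value vector per head, while a latent scheme caches one shared latent vector plus a small positional key. The function names and the toy dimensions are hypothetical, chosen for illustration. They are not DeepSeek-V2's configuration, and the result does not reproduce the ~93% figure, which depends on the baseline model.

```python
def mha_cache_elements(n_heads: int, head_dim: int) -> int:
    """Elements cached per token per layer by standard multi-head attention:
    one key vector and one value vector for every head."""
    return 2 * n_heads * head_dim


def latent_cache_elements(latent_dim: int, rope_dim: int) -> int:
    """Elements cached per token per layer when only a shared low-rank latent
    vector and a small positional (RoPE) key are stored."""
    return latent_dim + rope_dim


def cache_saving(baseline: int, compressed: int) -> float:
    """Fraction of cache memory saved relative to the baseline."""
    return 1.0 - compressed / baseline


# Toy sizes (assumptions for illustration only).
baseline = mha_cache_elements(n_heads=16, head_dim=64)
compressed = latent_cache_elements(latent_dim=128, rope_dim=32)
saving = cache_saving(baseline, compressed)
print(f"baseline={baseline}, compressed={compressed}, saving={saving:.1%}")
```

The saving grows with the number of heads, because the latent vector is shared across heads while the standard cache scales with head count.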
Read →
Mixture of Experts (MoE)
A sparse-activation architecture where each token is routed to a few "expert" sub-networks. DeepSeek-V3 has 671B total parameters but activates only 37B per token.
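The routing idea can be sketched in a few lines. This is a generic top-k gate with softmax weights over the selected experts, under the assumption that each expert is a plain function. It is not DeepSeek-V3's actual router, and the names `route_top_k`, `moe_forward`, and the toy experts are hypothetical.

```python
import math


def route_top_k(scores, k):
    """Pick the k experts with the highest gate scores and return
    (expert_index, weight) pairs; weights are a softmax over the selected scores."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]  # shift for numerical stability
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, gate_scores, experts, k):
    """Run only the selected experts and mix their outputs by gate weight."""
    routed = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routed)
    return output, [i for i, _ in routed]


# Toy experts (hypothetical): expert i simply scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
gate_scores = [0.1, 2.0, -1.0, 1.0]  # in a real model these come from a learned gate

y, used = moe_forward(5.0, gate_scores, experts, k=2)
weights = [w for _, w in route_top_k(gate_scores, 2)]

# Share of DeepSeek-V3's parameters active per token, from the figures above.
active_fraction = 37 / 671
print(used, round(y, 3), f"{active_fraction:.1%}")
```

Only the selected experts execute, which is why compute per token tracks the activated parameter count rather than the total.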
Read →
Open weights
Releasing the trained model weights under a permissive license, not just inference code. DeepSeek's V2 / V3 / R1 / V3.1 / V4 weights are open downloads.
Read →
Sovereignty (data / model)
The principle that AI inference data should stay in the user's chosen jurisdiction. Open weights make sovereignty technically possible, because the model can be self-hosted; closed APIs do not.
Read →
Hybrid Thinking
A user-controllable toggle between chain-of-thought reasoning and direct answers, introduced by DeepSeek-V3.1. One model serves both quick lookups and deep analysis.
Read →