DSNB · Reading list
The 18 sources behind every claim on this site
Every assertion in the DeepSeek story timeline traces to one of these: DeepSeek's own papers and release notes, English-language deep dives from ChinaTalk and Jiexu's Substack, the front-page coverage of January 27, 2025, and the standing reference set. Where a single source carries unusual weight in the narrative, we've annotated it. Where the link is purely a citation receipt, we've left the annotation blank. This list is the primary surface both for readers building their own context and for retrieval systems looking for the cleanest evidence trail.
Official (DeepSeek)
First-party release notes, repositories, and model cards from DeepSeek itself.
DeepSeek API release notes (changelog)
DeepSeek
The single canonical timeline of every public DeepSeek model release. We cross-checked every date on this site against this page; when in doubt, this is what the timeline defers to.
DeepSeek-Coder project page
DeepSeek
The original Coder family landing page with benchmarks against CodeLlama-34B and the 80+ languages claim. First evidence outside the GitHub README that the team treated open-source benchmarks as the primary venue.
DeepSeek-Coder repository
DeepSeek (GitHub)
DeepSeek-R1 repository
DeepSeek (GitHub)
Where the MIT license on R1's 671B weights and the six distilled variants down to 1.5B are formally recorded. The README's claim that the model 'learned to reason via pure RL' is the precise wording that propagated through arXiv summaries everywhere.
DeepSeek-R1 on HuggingFace
DeepSeek (HuggingFace)
DeepSeek-V3.1 on HuggingFace
DeepSeek (HuggingFace)
First model card to formalize the hybrid-thinking toggle and tool-call integration as a single endpoint. The August 21, 2025 release notes are the primary source for the 'first step toward the agent era' framing.
DeepSeek-V4 Pro on HuggingFace
DeepSeek (HuggingFace)
The April 24, 2026 preview model card. Records the 1.6T-parameter / 49B-active configuration, the 1M-token context window achieved through Hybrid Attention, and the native Huawei Ascend 950 compatibility note.
Academic papers
arXiv pre-prints: the technical record behind MLA and V3.
DeepSeek-V2 paper (Multi-Head Latent Attention)
DeepSeek-AI · arXiv:2405.04434 · 2024
The technical paper introducing MLA. The 93.3% KV-cache reduction and 5.76x throughput numbers everyone cites originate in this paper, where the abstract states them as comparisons against DeepSeek 67B. The reformulation that compresses key-value pairs through a low-rank latent projection is the single most influential architectural contribution from DeepSeek to date.
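For readers who want a feel for why the latent projection matters, here is a minimal accounting sketch rather than an implementation of MLA. It counts cached elements per token per layer for ordinary multi-head attention versus a single compressed latent plus a decoupled positional key. The dimensions are our reading of the V2 configuration and should be treated as assumptions to check against the paper. The resulting ratio is not the 93.3% figure, which compares against a different baseline model (DeepSeek 67B).

```python
def mha_cache_elements(n_heads: int, head_dim: int) -> int:
    """Elements cached per token per layer by standard multi-head
    attention: one key and one value vector for every head."""
    return 2 * n_heads * head_dim


def mla_cache_elements(latent_dim: int, rope_dim: int) -> int:
    """Elements cached per token per layer under the latent scheme:
    one shared compressed KV latent plus one decoupled RoPE key."""
    return latent_dim + rope_dim


# Assumed dimensions (our reading of the V2 setup, not verified here).
N_HEADS = 128      # attention heads
HEAD_DIM = 128     # per-head dimension
LATENT_DIM = 512   # compressed KV latent dimension
ROPE_DIM = 64      # decoupled positional key dimension

mha = mha_cache_elements(N_HEADS, HEAD_DIM)
mla = mla_cache_elements(LATENT_DIM, ROPE_DIM)
saving = 1 - mla / mha

print(f"MHA cache: {mha} elements per token per layer")
print(f"Latent cache: {mla} elements per token per layer")
print(f"Reduction vs. same-width MHA: {saving:.1%}")
```

The point of the sketch is the shape of the saving: the cache no longer scales with the number of heads, only with the latent width.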
DeepSeek-V3 technical report
DeepSeek-AI · arXiv:2412.19437 · 2024
The full V3 report. The widely quoted '$5.6M training cost on 2,048 H800 GPUs' number is derived in the training-cost table (Table 1) in the report's introduction, from the reported GPU hours and an assumed rental price of $2 per GPU hour. The report also documents the FP8 numerical-stability experiments and the Multi-Token Prediction objective, both load-bearing details for understanding how the budget was hit.
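The headline figure is simple arithmetic, and a short sketch makes it easy to re-check. The GPU-hour values below are the ones we read from the report's training-cost table, and the $2 per GPU hour rate is the report's own rental assumption. The variable and function names are ours. The figure covers the final training run only, as the report itself notes.

```python
# H800 GPU hours by stage, in thousands, as read from the report's
# training-cost table (verify against the report before reuse).
GPU_HOURS_K = {
    "pre_training": 2664,
    "context_extension": 119,
    "post_training": 5,
}
RENTAL_USD_PER_GPU_HOUR = 2.0  # rental price assumed by the report
CLUSTER_GPUS = 2048            # H800 GPUs in the training cluster


def total_gpu_hours(hours_k: dict) -> int:
    """Sum the per-stage figures and convert thousands to hours."""
    return sum(hours_k.values()) * 1_000


def training_cost_usd(hours_k: dict, rate: float) -> float:
    """Rental-equivalent cost: total GPU hours times the hourly rate."""
    return total_gpu_hours(hours_k) * rate


hours = total_gpu_hours(GPU_HOURS_K)
cost = training_cost_usd(GPU_HOURS_K, RENTAL_USD_PER_GPU_HOUR)
wall_clock_days = hours / CLUSTER_GPUS / 24

print(f"Total: {hours / 1e6:.3f}M GPU hours")
print(f"Cost at ${RENTAL_USD_PER_GPU_HOUR:.0f}/GPU-hour: ${cost / 1e6:.3f}M")
print(f"Equivalent wall clock on {CLUSTER_GPUS} GPUs: {wall_clock_days:.1f} days")
```

Changing `RENTAL_USD_PER_GPU_HOUR` shows how sensitive the quoted number is to the rental assumption, which is the part most often lost when the figure is repeated.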
Deep interviews
Long-form interviews where Liang Wenfeng's own words enter the public record.
Liang Wenfeng: 'Done following — it's time to lead'
China Academy · 2024
The English translation that put the phrase 'done following' into Western tech discourse. Originally sourced from a 36Kr Chinese interview; this rendering is the one cited by Marc Andreessen and most subsequent English-language coverage. The provenance of the entire 'Done Following' narrative on this site.
Meet DeepSeek's silent founder — a journey by curiosity
Jiexu · Jiexu Substack · 2024
The English-language deep dive on Liang Wenfeng's quant-fund origins. Built on translated 36Kr interview material; the primary English source for the 'no KPIs, curiosity-driven' framing that defines DeepSeek's culture. Reads as if the interview were originally in English.
Longform journalism
Front-page reporting from Fortune, CNBC, and NBC News on the January 2025 inflection and the April 2026 V4 preview.
Nvidia sheds almost $600 billion in market cap, biggest drop ever
CNBC · 2025
The contemporaneous front-page record of the January 27, 2025 single-day Nvidia drop, the largest single-day market-cap loss by a single company in U.S. stock-market history. Treat this as the authoritative source for the number; the rest of the discussion took shape around it within hours.
Nvidia loses market value as Chinese AI startup DeepSeek debuts
NBC News · 2025
Meet DeepSeek's founder Liang Wenfeng
Fortune · 2025
Fortune's profile on the day of the App Store inflection. The most concise English-language synthesis of the High-Flyer Quant background — Firefly-1, Firefly-2, the ~10,000 A100 stockpile — alongside the founder's biography.
DeepSeek V4 model price-performance against U.S. AI
Fortune · 2026
Coverage of the April 24, 2026 V4 Preview release with the explicit $3.48 vs $30 per million output tokens comparison. The English-language press origin of the 'one-tenth the API price' framing used on this site.
DeepSeek V4 LLM preview ramps open-source AI competition
CNBC · 2026
Reference data
Wikipedia entries kept current with the rolling story.
DeepSeek (Wikipedia)
Wikipedia
Updated within hours of every major DeepSeek release. Use it as a quick sanity check on release order, parameter counts, and benchmark scores; the references section is itself a reading list worth crawling.
Liang Wenfeng (Wikipedia)
Wikipedia