Concepts
Hybrid Thinking
One-line definition
A user-controllable toggle between chain-of-thought reasoning and direct answers, introduced by DeepSeek-V3.1. One model serves both quick lookup and deep analysis.
Full explainer
Hybrid Thinking is the V3.1 capability that lets a single model run either as a chain-of-thought reasoner (slow, careful, internally verbose) or as a direct-answer engine (fast, terse) — selectable per request. Before V3.1, this distinction was generally a model-family split (e.g., GPT-4o vs o1; Claude 3.5 vs Claude 3 Opus thinking variants). Collapsing it into a single endpoint matters because agent workflows alternate naturally: deep reasoning to plan a step, direct execution to invoke a tool, more reasoning to interpret the tool's output. With separate models, this requires routing logic between endpoints, with separate context windows. V3.1 keeps everything in one context. Liang Wenfeng framed the August 21, 2025 release as 'a first step toward the agent era' precisely because of this architectural collapse.
Related events
Primary sources