DeepSeek releases V4 open-weight model family with native 1M-token context

2026-04-24 01:07

DeepSeek published a preview of DeepSeek-V4 on April 24, offering two open-weight MoE variants: V4-Pro (1.6T total / 49B active parameters) and V4-Flash (284B / 13B). Both ship with a 1M-token context window designed in from the start rather than retrofitted, using a Hybrid Attention Architecture that requires only 10% of the KV cache of DeepSeek-V3.2 at 1M context. Weights are available on Hugging Face under an MIT license; API pricing for Pro is $1.74/$3.48 per million input/output tokens, roughly 40× cheaper than GPT-5.5 at equivalent quality. Simon Willison's hands-on notes characterize V4-Pro as competitive with current frontier models while trailing the very top tier by roughly three to six months.
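To put the published rates in concrete terms, here is a minimal cost sketch at the stated V4-Pro prices ($1.74 input / $3.48 output per million tokens). The token counts in the example call are illustrative assumptions, not figures from the release.

```python
# V4-Pro API rates from the announcement (USD per 1M tokens).
INPUT_PRICE_PER_M = 1.74
OUTPUT_PRICE_PER_M = 3.48

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one API call at V4-Pro rates."""
    return (input_tokens / 1e6) * INPUT_PRICE_PER_M + \
           (output_tokens / 1e6) * OUTPUT_PRICE_PER_M

# Hypothetical full-context call: 1M input tokens, 4K output tokens.
cost = call_cost(1_000_000, 4_000)
print(f"${cost:.4f}")  # → $1.7539
```

At these rates a request that fills the entire 1M-token window costs under two dollars, which is the practical substance of the "roughly 40× cheaper" comparison in the announcement.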
