DeepSeek-V4 Technical Report Guide

  • About 190,200 words
  • About 114 pages
  • Published 2026-04-30 in Beijing

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

DeepSeek-AI

research@deepseek.com

Abstract

We present a preview version of DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models, DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting a context length of one million tokens. DeepSeek-V4 series incorporate several key upgrades in architecture and optimization: (1) a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compressed
