DeepSeek-V4 Technical Report White Paper.pptx

  • About 178,700 characters
  • About 10 pages
  • Published 2026-04-30 in Beijing

DeepSeek-V4:

Towards Highly Efficient Million-Token Context Intelligence

DeepSeek-AI

research@deepseek.com

Abstract

We present a preview version of the DeepSeek-V4 series, including two strong Mixture-of-Experts (MoE) language models, DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting a context length of one million tokens. The DeepSeek-V4 series incorporates several key upgrades in architecture and optimization: (1) a hybrid attention architecture that combines Compressed Sparse Attention (CSA) and Heavily Compr
