
2025-01-26

Qwen2.5-1M Technical Report

An Yang, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoyan Huang, Jiandong Jiang, Jianhong Tu, Jianwei Zhang, Jingren Zhou, Junyang Lin, Kai Dang, Kexin Yang, Le Yu, Mei Li, Minmin Sun, Qin Zhu, Rui Men, Tao He, Weijia Xu, Wenbiao Yin, Wenyuan Yu, Xiafei Qiu, Xingzhang Ren, Xinlong Yang, Yong Li, Zhiying Xu, Zipeng Zhang

Qwen Team, Alibaba Group

Abstract

In this report, we introduce Qwen2.5-1M, a series of models that extend the context length to 1 million tokens. Compared to the previous 128K version, the Qwen2.5-1M series has significantly enhanced long-context capabilities through long-context pre-training and post-training. Key techniques such as long data synthesis, progressive pre-training, and multi-stage supervised fine-tuning are employed to effectively enhance long-context performance while reducing training costs.

To promote the use of long-context models among a broader user base, we present and open-source our inference framework. This framework includes a length extrapolation method that can expand the model context lengths by at least four times, or even more, without additional training. To reduce inference costs, we implement a sparse attention method along with chunked prefill optimization for deployment scenarios, and a sparsity refinement method to improve precision. Additionally, we detail our optimizations in the inference engine, including kernel optimization, pipeline parallelism, and scheduling optimization, which significantly enhance overall inference performance. By leveraging our inference framework, the Qwen2.5-1M models achieve a remarkable 3x to 7x prefill speedup in scenarios with 1 million tokens of context. This framework provides an efficient and powerful solution for developing applications that require long-context processing using open-source models.

The Qwen2.5-1M series currently includes the open-source models Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M, as well as the API-accessed model Qwen2.5-Turbo.
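To make the chunked prefill idea mentioned in the abstract concrete, the following is a minimal sketch: rather than materializing the full attention-score matrix for a million-token prompt at once, queries are processed in fixed-size chunks, each attending causally to all keys seen so far, so peak activation memory scales with the chunk size instead of the prompt length. This is a toy single-head NumPy illustration under assumed names (`chunked_prefill`, `chunk_size`); it is not the report's actual implementation, which combines chunked prefill with sparse attention inside an optimized inference engine.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def chunked_prefill(q, k, v, chunk_size=1024):
    """Causal attention computed chunk by chunk over the query axis.

    q, k, v: [seq_len, head_dim] arrays for one attention head.
    Produces the same output as full causal attention, but only
    materializes a [chunk_size, seq_len] score matrix at a time.
    Illustrative sketch only, not the Qwen2.5-1M engine.
    """
    seq_len, head_dim = q.shape
    scale = 1.0 / np.sqrt(head_dim)
    out = np.empty_like(q)
    for start in range(0, seq_len, chunk_size):
        end = min(start + chunk_size, seq_len)
        # Queries in this chunk attend to all keys up to `end` (causal).
        scores = (q[start:end] @ k[:end].T) * scale
        # Mask out future positions within the chunk.
        rows = np.arange(start, end)[:, None]
        cols = np.arange(end)[None, :]
        scores = np.where(cols <= rows, scores, -np.inf)
        out[start:end] = softmax(scores, axis=-1) @ v[:end]
    return out

# Toy usage: a 4096-token "prompt" prefetched in 512-token chunks.
rng = np.random.default_rng(0)
q = rng.standard_normal((4096, 64))
k = rng.standard_normal((4096, 64))
v = rng.standard_normal((4096, 64))
out = chunked_prefill(q, k, v, chunk_size=512)
```

In a real serving stack the key/value tensors for earlier chunks live in the KV cache, so each chunk's forward pass also overlaps naturally with scheduling and pipeline parallelism, which is where the report's 3x to 7x prefill speedups come from.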
