Two Heads are Better than One Distilling Large Language Model Features Into Small Models with Feature Decomposition and Mixture.pdf

Two Heads Are Better than One: Distilling Large Language Model Features into Small Models with Feature Decomposition and Mixture

Tianhao Fu
