Large Language Models

Introduction to Large Language Models
Language models
• Remember the simple n-gram language model?
  • Assigns probabilities to sequences of words
  • Generates text by sampling possible next words
  • Is trained on counts computed from lots of text
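The bullets above can be sketched in a few lines: a bigram model whose probabilities come straight from counts, and a generator that samples likely next words. The tiny corpus is a made-up example, not from the lecture:

```python
import random
from collections import Counter, defaultdict

# Toy corpus (hypothetical); a real n-gram model would use lots of text.
corpus = ("so long and thanks for all the fish "
          "so long and thanks for all the memories").split()

# Count how often each word follows each previous word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(prev):
    """P(next | prev) estimated directly from the counts."""
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

def generate(start, n=5):
    """Generate text by sampling possible next words, left to right."""
    words = [start]
    for _ in range(n):
        probs = next_word_probs(words[-1])
        if not probs:
            break  # no observed continuation for this word
        choices, weights = zip(*probs.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)
```

Here "the" is followed by "fish" and "memories" once each in the corpus, so each gets probability 0.5; "thanks" is always followed by "for", so sampling after "thanks" is deterministic.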
• Large language models are similar and different:
  • Assign probabilities to sequences of words
  • Generate text by sampling possible next words
  • Are trained by learning to guess the next word
Large language models
• Even though they are pretrained only to predict words, LLMs learn a lot of useful language knowledge, since they train on a lot of text
Three architectures for large language models
Decoders: GPT, Claude, Llama, Mixtral
Encoders: BERT family, HuBERT
Encoder-decoders: Flan-T5, Whisper
Encoders
Many varieties!
• Popular: Masked Language Models (MLMs)
  • BERT family
• Trained by predicting words from surrounding words on both sides
• Are usually finetuned (trained on supervised data) for classification tasks
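The key contrast with a left-to-right model is that a masked model scores a blank using context on BOTH sides. A minimal sketch, with made-up counts standing in for what a real MLM like BERT learns from huge corpora:

```python
from collections import Counter

# Hypothetical evidence tables: how often a word appears after a given
# left neighbor and before a given right neighbor. Entirely made up.
LEFT = {"I": Counter({"like": 3, "saw": 1})}        # word | left neighbor
RIGHT = {"movies": Counter({"like": 2, "saw": 2})}  # word | right neighbor

def fill_mask(left, right):
    """Pick the word best supported by BOTH surrounding words."""
    scores = Counter()
    for w, c in LEFT.get(left, Counter()).items():
        scores[w] += c
    for w, c in RIGHT.get(right, Counter()).items():
        scores[w] += c
    return scores.most_common(1)[0][0] if scores else None
```

For "I [MASK] movies", combining evidence from both neighbors favors "like" (score 3 + 2) over "saw" (1 + 2); a causal model could only use the word "I" to its left.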
Encoder-Decoders
• Trained to map from one sequence to another
• Very popular for:
  • machine translation (map from one language to another)
  • speech recognition (map from acoustics to words)
Large Language Models: What tasks can they do?
Big idea
Many tasks can be turned into tasks of predicting words!
This lecture: decoder-only models
Also called:
• Causal LLMs
• Autoregressive LLMs
• Left-to-right LLMs
• Predict words left to right
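Causal decoding can be sketched as a loop that repeatedly asks the model for a distribution over the next word given everything to its left, then appends a word. The probability table below is a made-up stand-in for a trained decoder-only LLM, just to make the sketch runnable:

```python
# Hypothetical next-word distributions; a real model computes these
# from the whole prefix with a transformer.
PROBS = {
    ("So", "long", "and"): {"thanks": 0.8, "farewell": 0.2},
    ("So", "long", "and", "thanks"): {"for": 0.95, "to": 0.05},
}

def model_probs(prefix):
    """Distribution over the next word given the whole left context."""
    return PROBS.get(tuple(prefix), {})

def greedy_decode(prefix, max_new=2):
    out = list(prefix)
    for _ in range(max_new):
        dist = model_probs(out)
        if not dist:
            break
        # Causal LLM: the next word depends only on words to its left.
        out.append(max(dist, key=dist.get))
    return out
```

Greedy decoding (take the argmax at each step) is the simplest strategy; sampling from the distribution instead gives the varied generations the earlier slides describe.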
Conditional Generation: Generating text conditioned on previous text!
[Figure: Conditional generation with a decoder-only transformer. The prefix text "So long and thanks for" is embedded (token embeddings E plus position embeddings i), passed through stacked transformer blocks, and fed to a language modeling head (unembedding layer U producing logits, then a softmax) to predict the completion text "all the".]
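The language modeling head in the figure is just a matrix-vector product followed by a softmax: the unembedding matrix U maps the final hidden state to one logit per vocabulary word, and the softmax turns logits into next-word probabilities. The vocabulary, hidden state, and weights below are made up for illustration:

```python
import math

vocab = ["all", "the", "fish", "long"]   # toy vocabulary
hidden = [0.5, -1.0, 2.0]                # final transformer hidden state
U = [                                    # one row of unembedding weights per word
    [1.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
]

def softmax(xs):
    m = max(xs)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# logits = U @ hidden: one score per vocabulary word
logits = [sum(w * h for w, h in zip(row, hidden)) for row in U]
probs = softmax(logits)                  # probabilities over the vocabulary
best = vocab[max(range(len(vocab)), key=lambda i: probs[i])]
```

The probabilities sum to 1, and the word with the largest logit ("all" here, with logit 2.5) gets the largest probability, matching the completion in the figure.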
Many practical NLP tasks can be cast as word prediction!
Sentiment analysis: "I like Jackie Chan"
1. We give the language model this string: The sentiment of the sentence "I like Jackie Chan" is:
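Casting classification as word prediction means comparing the probability the model assigns to each label word as the continuation of the prompt. A minimal sketch, where the probability table is a made-up stand-in for a real LLM's output:

```python
def next_word_probs(prompt):
    """Hypothetical next-word distribution for this one prompt."""
    if "I like Jackie Chan" in prompt:
        return {"positive": 0.85, "negative": 0.10, "neutral": 0.05}
    return {}

def classify_sentiment(sentence):
    # Turn the classification task into a word-prediction task.
    prompt = f'The sentiment of the sentence "{sentence}" is:'
    probs = next_word_probs(prompt)
    # Compare the probabilities of the two label words.
    return max(("positive", "negative"), key=lambda w: probs.get(w, 0.0))
```

No task-specific classifier head is needed: whichever label word the model rates as the more likely next word is the predicted class.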