Deep Learning and its Application: ATTENTION MECHANISM
Transformer - Encoder
Transformer
- Transformer is the key component of BERT/GPT.
- Parallel computing.
- Replacing RNN/LSTM, becoming the most effective feature extractor.
Transformer
- Positional Encoding
- Residual Connection (Add & Norm)
- Encoder-Decoder Attention
Transformer
The attention mechanism alone cannot distinguish the position order of the input words:
The animals cross the street. || Cross the the street animals.
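To see why, note that scaled dot-product attention is permutation-equivariant: reordering the input tokens merely reorders the outputs, so without positional information the model cannot tell the two sentences apart. A minimal NumPy sketch (not from the lecture; identity Q/K/V projections are assumed for brevity):

    import numpy as np

    def self_attention(X):
        """Single-head scaled dot-product self-attention with identity projections."""
        d_k = X.shape[1]
        scores = X @ X.T / np.sqrt(d_k)                  # (n, n) similarity scores
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)               # row-wise softmax
        return w @ X                                     # weighted sum of values

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))     # 4 "tokens", 8-dim embeddings
    perm = [2, 0, 3, 1]             # a scrambled word order, as in the slide's example

    out = self_attention(X)
    out_perm = self_attention(X[perm])
    assert np.allclose(out[perm], out_perm)   # same outputs, just reordered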
Position Encoding

PE_{(pos,\,2i)} = \sin\left(pos / 10000^{2i/d_{model}}\right)
PE_{(pos,\,2i+1)} = \cos\left(pos / 10000^{2i/d_{model}}\right)

where pos is the position and i is the dimension. For any fixed offset k, PE_{pos+k} can be represented as a linear function of PE_{pos}, since by the angle-addition identities sin(pos + k) and cos(pos + k) are fixed linear combinations of sin(pos) and cos(pos).
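A minimal NumPy sketch of this encoding (max_len and d_model are illustrative values, not from the lecture):

    import numpy as np

    def positional_encoding(max_len, d_model):
        """Sinusoidal positional encoding from Vaswani et al. [1]."""
        pos = np.arange(max_len)[:, None]              # (max_len, 1)
        two_i = np.arange(0, d_model, 2)[None, :]      # even dimension indices 2i
        angles = pos / np.power(10000.0, two_i / d_model)
        pe = np.zeros((max_len, d_model))
        pe[:, 0::2] = np.sin(angles)                   # sine on even dimensions
        pe[:, 1::2] = np.cos(angles)                   # cosine on odd dimensions
        return pe

    pe = positional_encoding(max_len=50, d_model=16)
    print(pe.shape)   # (50, 16); each row is added to the embedding at that position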
Position Encoding
[Figure: Sine PE and Cosine PE visualizations]
Residuals
[Figure: residual connections around the Transformer sublayers]
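A minimal NumPy sketch of the "Add & Norm" step, output = LayerNorm(x + Sublayer(x)), here wrapped around a toy position-wise feed-forward sublayer; the shapes and random weights are illustrative assumptions, not lecture code:

    import numpy as np

    def layer_norm(x, eps=1e-5):
        """Normalize each token vector to zero mean and unit variance."""
        mean = x.mean(axis=-1, keepdims=True)
        var = x.var(axis=-1, keepdims=True)
        return (x - mean) / np.sqrt(var + eps)

    def feed_forward(x, W1, W2):
        """Position-wise feed-forward network: ReLU(x W1) W2 (biases omitted)."""
        return np.maximum(0.0, x @ W1) @ W2

    rng = np.random.default_rng(0)
    n, d_model, d_ff = 4, 8, 32
    x = rng.normal(size=(n, d_model))
    W1 = rng.normal(size=(d_model, d_ff))
    W2 = rng.normal(size=(d_ff, d_model))

    # Residual connection: the sublayer output is added back to its input,
    # then layer-normalized, which eases gradient flow through deep stacks.
    out = layer_norm(x + feed_forward(x, W1, W2))
    print(out.shape)   # (4, 8)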
Transformer - Decoder
Transformer
- The encoder's inputs flow through a self-attention layer.
- The outputs of the self-attention layer are fed to a feed-forward neural network.
- The decoder has both of those layers, but between them sits an attention layer that helps the decoder focus on relevant parts of the input sentence.
- Multi-Head Attention (a sketch follows below).
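A minimal NumPy sketch of multi-head attention as in [1]; the projection matrices are random placeholders and the head-splitting layout is one common convention, not the lecture's exact code:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
        """Self-attention with n_heads heads; X is (n, d_model)."""
        n, d_model = X.shape
        d_k = d_model // n_heads
        # Project, then split the model dimension into heads: (h, n, d_k).
        def split(W):
            return (X @ W).reshape(n, n_heads, d_k).transpose(1, 0, 2)
        q, k, v = split(Wq), split(Wk), split(Wv)
        scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)   # (h, n, n)
        heads = softmax(scores) @ v                        # (h, n, d_k)
        concat = heads.transpose(1, 0, 2).reshape(n, d_model)
        return concat @ Wo                                 # final output projection

    rng = np.random.default_rng(0)
    n, d_model, h = 5, 16, 4
    X = rng.normal(size=(n, d_model))
    Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) for _ in range(4))
    print(multi_head_attention(X, Wq, Wk, Wv, Wo, h).shape)   # (5, 16)

Splitting d_model across h heads keeps the total cost close to single-head attention while letting each head focus on different relationships between tokens.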
Transformer
- Encoder: attention between every two tokens.
- Decoder: attention only from earlier (before) tokens, via masking.
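A minimal NumPy sketch contrasting the two patterns; the causal flag is an illustrative name, with future positions set to -inf before the softmax so their weights become zero:

    import numpy as np

    def attention_weights(X, causal=False):
        """Self-attention weight matrix; causal=True masks future positions."""
        n, d_k = X.shape
        scores = X @ X.T / np.sqrt(d_k)
        if causal:
            mask = np.triu(np.ones((n, n), dtype=bool), k=1)   # strictly upper triangle
            scores = np.where(mask, -np.inf, scores)
        e = np.exp(scores - scores.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))
    print(np.round(attention_weights(X), 2))                # encoder: all pairs attend
    print(np.round(attention_weights(X, causal=True), 2))   # decoder: upper triangle is 0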
References
[1] Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017.
[2] Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
[3] Serrano, Sofia, and Noah A. Smith. "Is attention interpretable?" Proceedings of ACL. 2019.