Deep Learning and its Applications: Lecture Slides 0525 - Attention Encoder-Decoder (pptx)

Deep Learning and its Applications: ATTENTION MECHANISM

Transformer - Encoder

Transformer
- Transformer is a key component of BERT/GPT
- Enables parallel computing
- Replacing RNN/LSTM, it has become the most effective feature extractor

Transformer
- Positional Encoding
- Residual Connection (Add & Norm)
- Encoder-Decoder Attention
(how these pieces fit together is sketched below)
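To show how these components compose, the following is a minimal sketch of one encoder layer in PyTorch. It is an illustration, not the lecture's code; names such as `EncoderLayer` and defaults such as `d_model=512` are assumptions on my part, following the common setup from "Attention Is All You Need". The next slides detail each piece.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """One Transformer encoder layer: self-attention and a feed-forward
    network, each wrapped in a residual Add & Norm step."""
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                      # x: (batch, seq_len, d_model)
        # Sub-layer 1: self-attention, then residual Add & Norm.
        attn_out, _ = self.self_attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Sub-layer 2: position-wise feed-forward, then residual Add & Norm.
        return self.norm2(x + self.ffn(x))
```

Positional encoding is added to the token embeddings once, before the stack of such layers.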

Transformer
The attention mechanism by itself cannot distinguish the positional order of input words:
"The animals cross the street." || "Cross the the street animals."
Without positional information, both orderings produce the same attention outputs.
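A quick numerical check of this claim (my own illustration, not from the slides): permuting the input tokens merely permutes the self-attention outputs in the same way, so without positional encodings the two word orders are indistinguishable.

```python
import torch

torch.manual_seed(0)
tokens = torch.randn(5, 16)     # 5 token embeddings of dim 16, no positional info
perm = torch.randperm(5)        # one possible reordering of the "sentence"

def self_attention(x):
    # Plain scaled dot-product self-attention (identity Q/K/V projections).
    scores = x @ x.T / x.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ x

out = self_attention(tokens)
out_perm = self_attention(tokens[perm])

# Permuting the input only permutes the output rows: word order is not encoded.
print(torch.allclose(out[perm], out_perm, atol=1e-5))  # True
```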

Position Encoding

$PE_{(pos,\,2i)} = \sin\!\left(pos / 10000^{2i/d_{\mathrm{model}}}\right)$
$PE_{(pos,\,2i+1)} = \cos\!\left(pos / 10000^{2i/d_{\mathrm{model}}}\right)$

where $pos$ is the position and $i$ is the dimension. For any fixed offset $k$, $PE_{pos+k}$ can be represented as a linear function of $PE_{pos}$: writing $\omega_i = 1/10000^{2i/d_{\mathrm{model}}}$, the angle-addition identity $\sin(\omega_i(pos+k)) = \sin(\omega_i\,pos)\cos(\omega_i k) + \cos(\omega_i\,pos)\sin(\omega_i k)$ (and its cosine counterpart) shows that each $(\sin, \cos)$ pair at position $pos+k$ is a fixed rotation of the pair at $pos$.
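A minimal implementation sketch of this encoding table (my illustration; assumes `d_model` is even):

```python
import torch

def sinusoidal_pe(max_len: int, d_model: int) -> torch.Tensor:
    """Return a (max_len, d_model) table of sinusoidal positional encodings."""
    pos = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)   # (max_len, 1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)            # even dims
    angle = pos / (10000 ** (i / d_model))                          # (max_len, d_model/2)
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(angle)   # even dimensions get sine
    pe[:, 1::2] = torch.cos(angle)   # odd dimensions get cosine
    return pe

pe = sinusoidal_pe(max_len=100, d_model=512)
print(pe.shape)  # torch.Size([100, 512])
```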

Position Encoding [figure: Sine PE and Cosine PE curves]

Residuals [figure slides: residual connections (Add & Norm)]
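The residual pattern these slides illustrate, sketched in PyTorch; this is my illustration, assuming the standard post-norm "Add & Norm" (layer normalization applied to the sum) from Vaswani et al.:

```python
import torch.nn as nn

class AddNorm(nn.Module):
    """Residual connection followed by layer norm: LayerNorm(x + sublayer(x))."""
    def __init__(self, d_model: int, dropout: float = 0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, sublayer):
        # `sublayer` is a callable such as self-attention or the feed-forward net.
        return self.norm(x + self.dropout(sublayer(x)))
```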

Transformer - Decoder

Transformer
- The encoder's inputs flow through a self-attention layer.
- The outputs of the self-attention layer are fed to a feed-forward neural network.
- The decoder has both of those layers, but between them is an encoder-decoder attention layer that helps the decoder focus on relevant parts of the input sentence.
- Multi-Head Attention (see the sketch below)
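A compact sketch of multi-head attention, following the formulation in "Attention Is All You Need"; it is illustrative, and names such as `n_heads` and the defaults are my assumptions:

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        assert d_model % n_heads == 0
        self.d_head = d_model // n_heads
        self.n_heads = n_heads
        # One projection each for queries, keys, values, plus the output projection.
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, q, k, v, mask=None):
        B, T_q, _ = q.shape
        T_k = k.shape[1]
        # Project, then split into heads: (B, n_heads, T, d_head).
        def split(x, proj, T):
            return proj(x).view(B, T, self.n_heads, self.d_head).transpose(1, 2)
        Q = split(q, self.w_q, T_q)
        K = split(k, self.w_k, T_k)
        V = split(v, self.w_v, T_k)
        # Scaled dot-product attention per head.
        scores = Q @ K.transpose(-2, -1) / self.d_head ** 0.5
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        # Concatenate heads and apply the output projection.
        out = (attn @ V).transpose(1, 2).reshape(B, T_q, -1)
        return self.w_o(out)
```

In self-attention, `q`, `k`, and `v` are the same tensor; in the decoder's encoder-decoder attention, `q` comes from the decoder state while `k` and `v` come from the encoder output.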

Transformer
- Encoder: attention between every two tokens.
- Decoder: attention only to preceding tokens (causal masking; see the sketch below).
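A causal mask sketch (my illustration), usable as the `mask` argument of the `MultiHeadAttention` sketch above: position t may only attend to positions up to and including t.

```python
import torch

T = 5
# Lower-triangular matrix: row t has ones at columns 0..t (itself and earlier tokens).
causal_mask = torch.tril(torch.ones(T, T, dtype=torch.bool))
print(causal_mask.int())
# tensor([[1, 0, 0, 0, 0],
#         [1, 1, 0, 0, 0],
#         [1, 1, 1, 0, 0],
#         [1, 1, 1, 1, 0],
#         [1, 1, 1, 1, 1]])
# Positions where the mask is 0 receive -inf scores before the softmax,
# so each token attends only to itself and the tokens before it.
```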
