Deep Learning and its Applications, Courseware 0525: Neural Network Optimizers (.pptx)



DNN: Advanced Training Techniques
Deep Learning and its Application

Advanced Optimizers
SJTU Deep Learning Lecture

Optimizers
  • Nesterov Momentum (NAG)
  • Adaptive Learning Rate: AdaGrad
  • Adaptive Learning Rate: AdaDelta
  • Adaptive Learning Rate: RMSProp
  • Adaptive Learning Rate: Adam

Nesterov Momentum (NAG)
  • Standard momentum
  • Nesterov momentum: the gradient is evaluated after the current velocity is applied.
  • Adds a "correction factor" to look ahead
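Not part of the slides: a minimal NumPy sketch contrasting the standard momentum update with the Nesterov look-ahead variant. The names grad_fn, lr, and mu are illustrative assumptions, not from the lecture.

```python
import numpy as np

def momentum_step(w, v, grad_fn, lr=0.01, mu=0.9):
    # Standard momentum: gradient is evaluated at the current weights w.
    g = grad_fn(w)
    v = mu * v - lr * g
    return w + v, v

def nesterov_step(w, v, grad_fn, lr=0.01, mu=0.9):
    # NAG: gradient is evaluated after the current velocity is applied,
    # i.e. at the look-ahead point w + mu * v (the "correction factor").
    g = grad_fn(w + mu * v)
    v = mu * v - lr * g
    return w + v, v

# Usage on a toy quadratic f(w) = 0.5 * ||w||^2, whose gradient is w.
w, v = np.array([3.0, -2.0]), np.zeros(2)
for _ in range(100):
    w, v = nesterov_step(w, v, grad_fn=lambda x: x)
```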

SGD Momentum vs. NAG
[Figure: panels labeled "SGD Momentum" and "NAG"]

Nesterov Momentum (NAG): Illustration
  • This anticipatory update prevents us from going too fast and results in increased responsiveness.
  • Works better for batch updates than for SGD

Momentum vs. NAG

Learning Rate
  • Momentum provides "smart" gradients; another important factor of GD is the learning rate.
  • Too small → slow convergence; too large → non-optimal
  • Choosing a proper learning rate can be difficult
  • A global learning rate may not be suitable for all parameters

Adaptive Learning Rate: AdaGrad
  • AdaGrad: greater progress in the more gently sloped directions

Adaptive Learning Rate: AdaGrad
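For reference, a minimal NumPy sketch of the standard AdaGrad update (not taken from the slides): squared gradients are accumulated per parameter, so gently sloped directions keep a larger effective step while steep, frequently updated directions are damped. lr and eps are illustrative values.

```python
import numpy as np

def adagrad_step(w, g, accum, lr=0.01, eps=1e-8):
    # Accumulate the squared gradient for every parameter.
    accum = accum + g * g
    # Per-parameter step: a small accumulated squared gradient
    # (gently sloped direction) keeps a larger effective learning rate.
    w = w - lr * g / (np.sqrt(accum) + eps)
    return w, accum
```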

Adaptive Learning Rate: AdaDelta

Adaptive Learning Rate: AdaDelta
  • AdaDelta is an extension of AdaGrad that seeks to reduce its aggressive, monotonically decreasing learning rate.
  • We don't need to preset the global learning rate, as it does not appear in the update rule
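A minimal NumPy sketch of the AdaDelta rule as usually stated (not from the slides): two exponential moving averages, one of squared gradients and one of squared updates, replace the global learning rate. rho and eps are illustrative values.

```python
import numpy as np

def adadelta_step(w, g, eg2, edx2, rho=0.95, eps=1e-6):
    # Running average of squared gradients
    # (fixes AdaGrad's monotonically shrinking step).
    eg2 = rho * eg2 + (1 - rho) * g * g
    # Step is RMS of past updates over RMS of gradients,
    # so no global learning rate appears in the rule.
    dx = -np.sqrt(edx2 + eps) / np.sqrt(eg2 + eps) * g
    # Running average of squared updates.
    edx2 = rho * edx2 + (1 - rho) * dx * dx
    return w + dx, eg2, edx2
```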

Adaptive Learning Rate: RMSProp

Adaptive Learning Rate: RMSProp
  • Pros: useful for non-convex functions; an effective and practical optimization algorithm for deep neural networks
  • Cons: requires a preset learning rate
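A minimal NumPy sketch of the usual RMSProp update (not from the slides): an exponential moving average of squared gradients rescales each parameter's step, but a preset learning rate is still needed. lr, decay, and eps are illustrative values.

```python
import numpy as np

def rmsprop_step(w, g, eg2, lr=0.001, decay=0.9, eps=1e-8):
    # Exponential moving average of squared gradients
    # (unlike AdaGrad's sum, old gradients are gradually forgotten).
    eg2 = decay * eg2 + (1 - decay) * g * g
    # A preset learning rate lr is still required (the "con" noted above).
    w = w - lr * g / (np.sqrt(eg2) + eps)
    return w, eg2
```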

Combining RMSProp and Nesterov Momentum
  • Apply momentum to the RMSProp-rescaled gradient

Combining RMSProp and Momentum
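A minimal NumPy sketch in the spirit of Adam, which combines an RMSProp-style average of squared gradients with a momentum-style average of the gradients themselves; the bias-correction terms and hyperparameter values shown are the commonly used defaults, given here only for illustration.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    t += 1
    # Momentum-style first moment (moving average of gradients).
    m = beta1 * m + (1 - beta1) * g
    # RMSProp-style second moment (moving average of squared gradients).
    v = beta2 * v + (1 - beta2) * g * g
    # Bias correction for the zero-initialized averages.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v, t
```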
