Approximate policy iteration: a survey and some new methods



J Control Theory Appl 2011 9 (3) 310-335

Dimitri P. BERTSEKAS
Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology

Abstract: We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey a number of issues: convergence and rate of convergence of approximate policy evaluation methods, singularity and susceptibility to simulation noise of policy evaluation, exploration issues, constrained and enhanced policy iteration, policy oscillation and chattering, and optimistic and distributed policy iteration. Our discussion of policy evaluation is couched in general terms and aims to unify the available methods in the light of recent research developments and to compare the two main policy evaluation approaches: projected equations and temporal differences (TD), and aggregation. In the context of these approaches, we survey two different types of simulation-based algorithms: matrix inversion methods, such as least-squares temporal difference (LSTD), and iterative
