- About 8,120 characters
- About 39 pages
- Published 2021-06-15 in Beijing
Deep Reinforcement Learning

Example: Playing Video Game
- Start with an observation.
- Action: "right". Obtain a reward.
- Observation.
- Action: "fire" (kill an alien). Obtain a reward.
- Observation.
- Usually there is some randomness in the environment.

Example: Playing Video Game
- Start with an observation, then repeat: action, reward, next observation.
- After many turns: Game Over (spaceship destroyed).
- This is an episode.
- Learn to maximize the expected cumulative reward per episode.

Approaches
- Model-free Approach
  - Policy-based: Learning an Actor
  - Value-based: Learning a Critic
  - Actor + Critic
- Model-based Approach

On-policy vs. …
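The observation, action, reward loop from the video game example, and the goal of maximizing the expected cumulative reward per episode, can be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyShooter`, its action names, the reward of 5 for killing an alien, the 10% chance of the spaceship being destroyed each turn, and the 50-turn cap are all hypothetical and do not come from the slides.

```python
import random


class ToyShooter:
    """Hypothetical stand-in for the video game (an assumption, not from the slides)."""

    ACTIONS = ("left", "right", "fire")

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.steps = 0
        return self.steps

    def step(self, action):
        """Apply one action; return (next observation, reward, done)."""
        self.steps += 1
        # Randomness in the environment: firing only sometimes kills an alien.
        reward = 5 if action == "fire" and self.rng.random() < 0.5 else 0
        # Game over when the spaceship is destroyed, or after 50 turns.
        done = self.rng.random() < 0.1 or self.steps >= 50
        return self.steps, reward, done


def run_episode(env, policy):
    """Play one episode and return its cumulative reward."""
    observation = env.reset()
    total_reward = 0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


def expected_return(policy, episodes=1000, seed=0):
    """Monte Carlo estimate of the expected cumulative reward per episode."""
    env = ToyShooter(seed)
    return sum(run_episode(env, policy) for _ in range(episodes)) / episodes


def always_fire(observation):
    return "fire"


def never_fire(observation):
    return "right"


if __name__ == "__main__":
    print("always fire:", expected_return(always_fire))
    print("never fire: ", expected_return(never_fire))
```

In this toy setting a policy that always fires earns a higher average return than one that never fires. Learning, whether policy-based or value-based, amounts to searching for the policy with the highest such average.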