A Dual-Layer Reinforcement Learning Algorithm Based on Bias Information Learning

Journal of Computer Research and Development (计算机研究与发展), ISSN 1000-1239 / CN 11-1777/TP, 45(9): 1455-1462, 2008

Dual Reinforcement Learning Based on Bias Learning

Lin Fen, Shi Chuan, Luo Jiewen, and Shi Zhongzhi
(Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
(Graduate University of Chinese Academy of Sciences, Beijing 100049)
(Beijing Key Laboratory of Intelligent Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876)
(linf@ics.ict.ac.cn)

Abstract: Reinforcement learning has received much attention in the past decade. Its incremental nature and adaptive capabilities make it suitable for use in various domains, such as automatic control, mobile robotics, and multi-agent systems. A critical problem in conventional reinforcement learning is the slow convergence of the learning process. To accelerate learning, bias information is incorporated to boost the learning process with a priori knowledge. Current methods use bias information for the action selection strategies in reinforcement learning. They may suffer from non-convergence when the a priori knowledge is incorrect. A dual reinforcement learning model based on bias learning is proposed, which integrates a reinforcement learning process and a bias learning process. Bias information is used for the action selection strategies in reinforcement learning, and reinforcement learning is used to guide the bias learning process. Thus the dual reinforcement learning model can make effective use of a priori knowledge and eliminate the negative effects of incorrect a priori knowledge. Finally, the proposed dual model is validated by experiments on the maze problem, including a simple environment and a complex environment…
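
The abstract describes two coupled processes: bias information steers action selection, and reinforcement learning in turn corrects the bias. This excerpt does not give the paper's update rules, so the following is only a minimal sketch of that coupling under assumptions of my own. It uses tabular Q-learning on a hypothetical open grid maze, an action score of `Q + beta * bias`, and a bias table that is nudged toward the currently greedy Q action at rate `eta`. The names `step`, `train`, `greedy_path_length`, `beta`, `eta`, and the reward values are all illustrative and are not taken from the paper.

```python
import random

# Hypothetical toy maze: an open size x size grid, start (0, 0), goal at the
# opposite corner. Actions: 0 = right, 1 = down, 2 = left, 3 = up.
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]


def step(state, action, size):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    row, col = state
    d_row, d_col = MOVES[action]
    nxt = (min(max(row + d_row, 0), size - 1),
           min(max(col + d_col, 0), size - 1))
    done = nxt == (size - 1, size - 1)
    # Assumed reward scheme: +1 at the goal, small penalty per step otherwise.
    return nxt, (1.0 if done else -0.01), done


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def train(size=4, episodes=2000, alpha=0.5, gamma=0.95, eps=0.2,
          beta=0.5, eta=0.1, prior=None, seed=0):
    """Q-learning whose action selection is shaped by a learnable bias table."""
    rng = random.Random(seed)
    states = [(r, c) for r in range(size) for c in range(size)]
    q = {s: [0.0] * 4 for s in states}
    prior = prior or {}
    # The bias table starts from the (possibly incorrect) prior knowledge.
    bias = {s: list(prior.get(s, [0.0] * 4)) for s in states}
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(200):  # step cap per episode
            if rng.random() < eps:
                a = rng.randrange(4)  # random exploration
            else:
                # Bias information shapes the action selection.
                a = argmax([q[s][i] + beta * bias[s][i] for i in range(4)])
            s2, reward, done = step(s, a, size)
            # Standard Q-learning update.
            target = reward + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            # Assumed bias-learning rule: move the bias toward the action
            # that the learned Q-values currently rank best in this state.
            best = argmax(q[s])
            for i in range(4):
                bias[s][i] += eta * ((1.0 if i == best else 0.0) - bias[s][i])
            s = s2
            if done:
                break
    return q, bias


def greedy_path_length(q, size, limit=100):
    """Steps taken from the start to the goal when following argmax Q."""
    s, n = (0, 0), 0
    while s != (size - 1, size - 1) and n < limit:
        s, _, _ = step(s, argmax(q[s]), size)
        n += 1
    return n
```

As a usage illustration, passing a deliberately wrong prior such as `{(r, c): [0.0, 0.0, 0.0, 1.0] for r in range(4) for c in range(4)}` (always prefer "up") to `train` lets one inspect whether the returned bias table drifts away from that prior while the Q-values still lead to the goal. In this sketch the bias only affects behaviour during training; the final greedy policy is read from `q` alone.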
