A Dual Reinforcement Learning Algorithm Based on Bias Information Learning (PDF)



Journal of Computer Research and Development (计算机研究与发展), ISSN 1000-1239 / CN 11-1777/TP, 45(9): 1455-1462, 2008

Dual Reinforcement Learning Based on Bias Learning

Lin Fen, Shi Chuan, Luo Jiewen, and Shi Zhongzhi

(Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
(Graduate University of Chinese Academy of Sciences, Beijing 100049)
(Beijing Key Laboratory of Intelligent Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876)
(linf@ics.ict.ac.cn)

Abstract: Reinforcement learning has received much attention in the past decade. Its incremental nature and adaptive capabilities make it suitable for use in various domains, such as automatic control, mobile robotics, and multi-agent systems. A critical problem in conventional reinforcement learning is the slow convergence of the learning process. To accelerate learning, bias information is incorporated to boost the learning process with a priori knowledge. Current methods use bias information for the action selection strategies in reinforcement learning. They may suffer from non-convergence when the a priori knowledge is incorrect. A dual reinforcement learning model based on bias learning is proposed, which integrates a reinforcement learning process and a bias learning process. Bias information is used for the action selection strategies in reinforcement learning, and reinforcement learning is used to guide the bias learning process. Thus the dual reinforcement learning model can make effective use of a priori knowledge and eliminate the negative effects of incorrect a priori knowledge. Finally, the proposed dual model is validated by experiments on the maze problem, including a simple environment and a complex environment…
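
The two-way coupling described in the abstract (bias steers action selection, and reinforcement learning in turn corrects the bias) can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names select_action and dual_update, the additive combination of Q-values and bias, the rule that pulls the bias toward the temporal-difference error, and the values of alpha, gamma, and beta.

```python
from typing import List


def select_action(q_row: List[float], bias_row: List[float],
                  weight: float = 1.0) -> int:
    """Greedy choice over Q-values shifted by the weighted bias.

    Assumption: bias enters action selection additively.
    """
    scores = [q + weight * b for q, b in zip(q_row, bias_row)]
    return max(range(len(scores)), key=scores.__getitem__)


def dual_update(q_row: List[float], bias_row: List[float], action: int,
                reward: float, next_q_row: List[float],
                alpha: float = 0.5, gamma: float = 0.9,
                beta: float = 0.1) -> float:
    """One combined step: a Q-learning update, then a bias update.

    q_row and bias_row are modified in place; the TD error is returned.
    """
    # Standard one-step Q-learning temporal-difference error.
    td_error = reward + gamma * max(next_q_row) - q_row[action]
    q_row[action] += alpha * td_error
    # Hypothetical bias rule (not the paper's): move the bias for the
    # chosen action toward the TD error, so a prior that never pays off
    # decays while a prior that is confirmed by reward is reinforced.
    bias_row[action] += beta * (td_error - bias_row[action])
    return td_error
```

With an incorrect prior such as bias [1.0, 0.0] over two actions and all Q-values at zero, select_action first follows the prior and picks action 0. Each unrewarded use of that action multiplies its bias by (1 - beta), so once the other action has earned some value, the learned Q-values outweigh the mistaken prior and the greedy choice switches.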
