Journal of Computer Research and Development (计算机研究与发展), ISSN 1000-1239/CN 11-1777/TP
45(9): 1455-1462, 2008
Dual Reinforcement Learning Based on Bias Learning
(基于偏向信息学习的双层强化学习算法)
Lin Fen, Shi Chuan, Luo Jiewen, and Shi Zhongzhi
(Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
(Graduate University of Chinese Academy of Sciences, Beijing 100049)
(Beijing Key Laboratory of Intelligent Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876)
(linf@ics.ict.ac.cn)
Abstract Reinforcement learning has received much attention in the past decade. Its incremental nature and adaptive capabilities make it suitable for use in various domains, such as automatic control, mobile robotics, and multi-agent systems. A critical problem in conventional reinforcement learning is the slow convergence of the learning process. To accelerate learning, bias information is incorporated to boost the learning process with prior knowledge. Current methods use bias information in the action selection strategies of reinforcement learning, and they may fail to converge when the prior knowledge is incorrect. A dual reinforcement learning model based on bias learning is proposed, which integrates a reinforcement learning process and a bias learning process. Bias information is used for the action selection strategies in reinforcement learning, and reinforcement learning is used to guide the bias learning process. Thus the dual reinforcement learning model can make effective use of prior knowledge and eliminate the negative effects of incorrect prior knowledge. Finally, the proposed dual model is validated by experiments on the maze problem, including a simple environment and a complex environment.
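The two-way coupling described in the abstract (bias information steers action selection, and reinforcement learning in turn corrects the bias) can be illustrated with a small sketch. The abstract does not give the update rules, so everything concrete here is an assumption made for illustration: the five-state corridor standing in for a maze, the epsilon-greedy rule over `Q + beta * bias`, the bias update that pulls the bias toward the action Q currently ranks best, and all parameter names and values (`alpha`, `gamma`, `beta`, `epsilon`, `eta`). It is not the authors' algorithm.

```python
import random

N_STATES, GOAL = 5, 4          # hypothetical corridor 0..4, goal at the right end
LEFT, RIGHT = 0, 1

def step(state, action):
    """Deterministic corridor: reward 1 only on reaching the goal."""
    nxt = max(state - 1, 0) if action == LEFT else state + 1
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def select_action(q_row, bias_row, beta, epsilon, rng):
    """Epsilon-greedy over Q plus a weighted bias term (assumed form)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    scores = [q + beta * b for q, b in zip(q_row, bias_row)]
    return scores.index(max(scores))

def update_bias(bias_row, q_row, eta):
    """Pull the bias toward the action Q currently ranks best (assumed rule)."""
    best = max(q_row)
    winners = [a for a, q in enumerate(q_row) if q == best]
    if len(winners) != 1:      # Q has no clear preference yet: keep the prior
        return
    for a in range(len(bias_row)):
        target = 1.0 if a == winners[0] else 0.0
        bias_row[a] += eta * (target - bias_row[a])

def train(bias, episodes=2000, alpha=0.5, gamma=0.9, beta=0.5,
          epsilon=0.3, eta=0.5, max_steps=100, seed=0):
    """Tabular Q-learning whose behavior policy uses the bias table.

    The bias table is modified in place and also returned.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = select_action(q[s], bias[s], beta, epsilon, rng)
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])   # standard Q-learning update
            update_bias(bias[s], q[s], eta)         # learned values guide the bias
            if done:
                break
            s = s2
    return q, bias

# Deliberately incorrect prior knowledge: "always go left".
wrong_prior = [[1.0, 0.0] for _ in range(N_STATES)]
q, bias = train(wrong_prior)

# Greedy action (no exploration) under the combined score, per non-goal state.
policy = [select_action(q[s], bias[s], 0.5, 0.0, random.Random(0))
          for s in range(GOAL)]
```

In this toy setting the prior points away from the goal. If the bias were held fixed, the `beta * bias` term could keep outweighing the small gaps between learned Q-values, which is one way to picture the non-convergence risk the abstract attributes to incorrect prior knowledge. Letting the learned values rewrite the bias removes that obstacle in this sketch.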