Knowledge Transfer Reinforcement Learning Between Heterogeneous Agents - Sciencepaper Online (中国科技论文在线).PDF


Posted 2019-04-13 from Tianjin


Sciencepaper Online (中国科技论文在线), Vol. 5, No. 2, p. 120, February 2010

Knowledge Transfer Reinforcement Learning Between Heterogeneous Agents

Liu Bo, Lei Ruhai (School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China)

Abstract: Existing knowledge transfer methods apply only to homogeneous reinforcement learning agents. To address this, a Q-learning algorithm is proposed that can transfer knowledge between heterogeneous agents with different state and action spaces. The main idea is as follows. Using a task that both the old and the new agent have already learned, a neural network learns offline the mapping between the two agents' Q-value functions. The resulting Q-value function mapper is then used to map the Q values of a task that the old agent has learned, but the new agent has not, onto the new agent. This reduces the number of learning trials the new agent needs and thereby speeds up learning. Simulation results in a 10×10 grid world verify the effectiveness of the proposed knowledge transfer Q-learning algorithm.

Keywords: reinforcement learning; knowledge transfer; heterogeneous Agent; Q value

CLC number: TP18  Document code: A  Article ID: 1673-7180(2010)02-0120-4
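The two-stage idea in the abstract (learn a Q-value mapping offline on a shared task, then apply it to a task only the old agent has learned) might be sketched roughly as follows. This is a minimal illustration under several assumptions that the abstract does not confirm: the old agent has 4 actions and the new agent 8, states of the two agents are paired row by row, the mapper is a one-hidden-layer tanh network trained by plain gradient descent, and all Q tables are synthetic random stand-ins rather than results from the paper. The names `train_q_mapper`, `q_old_shared`, `q_new_shared`, and `q_old_new_task` are hypothetical.

```python
import numpy as np


def train_q_mapper(q_old, q_new, hidden=16, lr=0.1, epochs=2000, seed=0):
    """Fit a small network mapping old-agent Q rows to new-agent Q rows.

    Assumption: row i of q_old and row i of q_new describe corresponding
    states of the two agents on a task both have already learned.
    """
    rng = np.random.default_rng(seed)
    n_in, n_out = q_old.shape[1], q_new.shape[1]
    w1 = rng.normal(0.0, 0.5, (n_in, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, n_out))
    b2 = np.zeros(n_out)
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output layer.
        h = np.tanh(q_old @ w1 + b1)
        pred = h @ w2 + b2
        err = pred - q_new
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared error.
        g_pred = 2.0 * err / err.size
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = (g_pred @ w2.T) * (1.0 - h ** 2)
        g_w1 = q_old.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Full-batch gradient descent step.
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2

    def mapper(q):
        """Map old-agent Q rows to estimated new-agent Q rows."""
        return np.tanh(q @ w1 + b1) @ w2 + b2

    return mapper, losses


# Synthetic stand-ins (NOT data from the paper): 100 states of a 10x10 grid.
rng = np.random.default_rng(1)
q_old_shared = rng.uniform(0.0, 1.0, (100, 4))   # old agent, shared task
mix = rng.uniform(0.0, 1.0, (4, 8))              # made-up relation between agents
q_new_shared = np.tanh(q_old_shared @ mix)       # new agent, shared task

# Stage 1: learn the Q-value mapping offline on the shared task.
mapper, losses = train_q_mapper(q_old_shared, q_new_shared)

# Stage 2: map the old agent's Q values for a task the new agent has not learned.
q_old_new_task = rng.uniform(0.0, 1.0, (100, 4))
q_new_init = mapper(q_old_new_task)  # initial Q table for the new agent
```

In this sketch `q_new_init` would serve only as a starting Q table; the new agent would still refine it with ordinary Q-learning, which is where the reduction in learning trials claimed in the abstract would come from.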
