Knowledge Transfer Reinforcement Learning Between Heterogeneous Agents - Sciencepaper Online.PDF

Sciencepaper Online (中国科技论文在线), Vol. 5, No. 2, February 2010, p. 120

Knowledge transfer between heterogeneous reinforcement learning agents

Liu Bo, Lei Ruhai
(School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China)

Abstract: Existing knowledge transfer methods are suitable only for homogeneous reinforcement learning Agents. To address this, a Q-learning algorithm is proposed that can transfer knowledge between heterogeneous Agents with different state and action spaces. Using a task that both the old and the new Agent have already learned, a neural network learns off-line the mapping between the Q-value functions of the two Agents. The resulting Q-value function mapper is then used to map the Q values of a task that the old Agent has learned, but the new Agent has not, onto the new Agent. This reduces the number of learning trials the new Agent needs and so increases its learning speed. Simulation results in a 10×10 grid world verify the effectiveness of the proposed knowledge transfer Q-learning algorithm.

Key words: reinforcement learning; knowledge transfer; heterogeneous Agent; Q value

CLC number: TP18  Document code: A  Article ID: 1673-7180(2010)02-0120-4
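The transfer idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the excerpt does not give the network architecture, the input encoding, or how states of the two Agents correspond. The sketch assumes that both Agents share the same 100 grid states, that the old Agent has 4 actions and the new Agent 8, and that one hidden tanh layer trained by plain gradient descent is enough. The Q tables are random synthetic placeholders, and the names `train_q_mapper`, `map_q` and `q_update` are hypothetical.

```python
import numpy as np


def train_q_mapper(q_old, q_new, hidden=16, lr=0.005, epochs=1000, seed=0):
    """Fit a one-hidden-layer network mapping old-Agent Q rows to new-Agent Q rows.

    q_old: (n_states, n_actions_old) Q table of the old Agent on the shared task.
    q_new: (n_states, n_actions_new) Q table of the new Agent on the same task.
    The row-to-row state correspondence is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    n, n_in = q_old.shape
    n_out = q_new.shape[1]
    p = {
        "W1": rng.normal(0.0, 0.5, (n_in, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, (hidden, n_out)),
        "b2": np.zeros(n_out),
    }
    losses = []
    for _ in range(epochs):
        h = np.tanh(q_old @ p["W1"] + p["b1"])
        err = h @ p["W2"] + p["b2"] - q_new
        # Squared error summed over actions, averaged over states.
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagate that loss through both layers.
        g_pred = 2.0 * err / n
        g_z = (g_pred @ p["W2"].T) * (1.0 - h ** 2)
        p["W2"] -= lr * (h.T @ g_pred)
        p["b2"] -= lr * g_pred.sum(axis=0)
        p["W1"] -= lr * (q_old.T @ g_z)
        p["b1"] -= lr * g_z.sum(axis=0)
    return p, losses


def map_q(p, q_old):
    """Map old-Agent Q rows to estimated new-Agent Q rows."""
    return np.tanh(q_old @ p["W1"] + p["b1"]) @ p["W2"] + p["b2"]


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    q[s, a] += alpha * (r + gamma * np.max(q[s_next]) - q[s, a])


# Synthetic placeholder data (not results from the paper).
rng = np.random.default_rng(1)
link = rng.uniform(0.0, 0.5, (4, 8))            # hypothetical old-to-new relation
q_old_shared = rng.uniform(0.0, 1.0, (100, 4))  # old Agent, task both have learned
q_new_shared = q_old_shared @ link              # new Agent, same task

# Off-line step: learn the Q-value mapping from the shared task.
params, losses = train_q_mapper(q_old_shared, q_new_shared)

# Transfer step: the old Agent knows a second task, the new Agent does not.
q_old_task2 = rng.uniform(0.0, 1.0, (100, 4))
q_new_init = map_q(params, q_old_task2)

# The new Agent then continues ordinary Q-learning from the mapped table.
q_update(q_new_init, s=0, a=3, r=1.0, s_next=1)
```

In this sketch the mapped table only serves as an initial estimate; the new Agent still refines it with ordinary Q-learning, which matches the abstract's claim that transfer reduces the number of trials rather than removing them.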
