Vol. 5, No. 2
120 Sciencepaper Online (中国科技论文在线) February 2010
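Knowledge transfer reinforcement learning between heterogeneous Agents
Liu Bo, Lei Ruhai
(School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116)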
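Abstract: Existing knowledge transfer methods apply only to homogeneous reinforcement learning Agents. To address this, a Q-learning algorithm is proposed that can transfer knowledge between heterogeneous Agents with different state and action spaces. The main idea is to take tasks that both the old and the new Agent have learned and use a neural network to learn, off-line, the mapping between the Q-value functions of the two Agents. The resulting Q-value function mapper then maps the Q values of tasks that the old Agent has learned but the new Agent has not onto the new Agent. This reduces the number of learning trials the new Agent needs and increases its learning speed. Simulation results on a 10×10 grid world verify the effectiveness of the proposed knowledge transfer Q-learning algorithm.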
Key words: reinforcement learning; knowledge transfer; heterogeneous Agent; Q value
CLC number: TP18  Document code: A  Article ID: 1673-7180(2010)02-0120-4
Knowledge transfer between heterogeneous reinforcement learning agents
Liu Bo, Lei Ruhai
(School of Information and Electrical Engineering, China University of Mining and Technology,
Xuzhou, Jiangsu 221116, China)
Abstract: Existing knowledge transfer methods are suitable only for homogeneous
reinforcement learning agents. To address this problem, a Q-learning algorithm is proposed that can transfer knowledge
between heterogeneous Agents with different state and action spaces. The main idea of the algorithm is as follows.
Using a task that both an old and a new Agent have already learned, a neural network learns off-line the mapping
between the Q-value functions of the two Agents. The constructed Q-value function mapping is then used to
obtain the Q values of the new Agent in a task that the old Agent has already learned but the new Agent has not.
The proposed algorithm reduces the number of trials the new Agent needs and thereby improves its learning
speed. Simulation results on 10×10 mazes illustrate the validity of the proposed Q-learning algorithm.
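The mapping step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the network architecture, its inputs and outputs, or the training procedure, so everything below is an assumption made for illustration: the hypothetical `train_q_mapper` function, a one-hidden-layer tanh network trained by plain stochastic gradient descent, the choice of per-state Q-value vectors as inputs and targets, a known pairing of corresponding states between the two Agents, and the synthetic Q values themselves.

```python
import math
import random


def train_q_mapper(pairs, hidden=6, lr=0.05, epochs=300, seed=0):
    """Fit a small network mapping old-Agent Q-vectors to new-Agent Q-vectors.

    pairs: list of (q_old, q_new) for states of a task both Agents learned.
    Assumption: each pair belongs to corresponding states of the two Agents.
    """
    rnd = random.Random(seed)
    d_in, d_out = len(pairs[0][0]), len(pairs[0][1])
    w1 = [[rnd.uniform(-0.5, 0.5) for _ in range(d_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rnd.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d_out)]
    b2 = [0.0] * d_out

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        y = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(w2, b2)]
        return h, y

    def mse():
        total = sum((p - t) ** 2
                    for x, tgt in pairs
                    for p, t in zip(forward(x)[1], tgt))
        return total / (len(pairs) * d_out)

    losses = [mse()]  # loss before any training
    for _ in range(epochs):
        for x, tgt in pairs:
            h, y = forward(x)
            err = [p - t for p, t in zip(y, tgt)]
            # Back-propagate through the output layer before updating it.
            g_h = [(1.0 - h[j] ** 2) * sum(err[k] * w2[k][j] for k in range(d_out))
                   for j in range(hidden)]
            for k in range(d_out):
                for j in range(hidden):
                    w2[k][j] -= lr * err[k] * h[j]
                b2[k] -= lr * err[k]
            for j in range(hidden):
                for i in range(d_in):
                    w1[j][i] -= lr * g_h[j] * x[i]
                b1[j] -= lr * g_h[j]
        losses.append(mse())  # loss after each epoch
    return (lambda x: forward(x)[1]), losses


# Synthetic stand-in data (not from the paper): the old Agent has 4 actions,
# the new Agent has 8, and their Q values are related by a made-up function.
data_rnd = random.Random(1)
mix = [[data_rnd.uniform(0.0, 1.0) for _ in range(4)] for _ in range(8)]


def synthetic_new_q(q_old):
    return [math.sqrt(sum(m * q for m, q in zip(row, q_old))) for row in mix]


shared = []
for _ in range(30):
    q_old = [data_rnd.uniform(0.0, 1.0) for _ in range(4)]
    shared.append((q_old, synthetic_new_q(q_old)))

mapper, losses = train_q_mapper(shared)

# Task learned only by the old Agent: map its Q values to initialise the new Agent.
q_old_unseen = [data_rnd.uniform(0.0, 1.0) for _ in range(4)]
q_new_init = mapper(q_old_unseen)
```

In this sketch the mapped values in `q_new_init` would serve only as an initial Q estimate for the new Agent, which would then continue ordinary Q-learning on the task; how the paper itself uses the mapped values beyond reducing the number of trials is not stated in the abstract.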
Ke