Helper Thread Pre-fetching Model Based on Learning Gradients of Control Parameters

Posted 2016-10-29 from Tianjin

PEI Songwen, ZHANG Junge, NING Jing
(1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Shanghai Key Lab of Modern Optical System, Shanghai 200093, China)

Abstract: Applications with irregular memory access patterns incur serious cache misses at run time. Helper threading is an effective technique for improving the performance of multicore systems: a helper thread prefetches data from memory into a cache closer to the CPU. When an application's memory access overhead is much greater than its computation overhead, however, the memory access overhead of a conventional helper thread exceeds the computation overhead of the main thread, so the helper thread lags behind the main thread. We propose an improved helper thread pre-fetching model that adds control parameters. Gradient descent, one of the most popular machine learning algorithms, is used to determine the optimal values of the control parameters. The parameters control how the memory access tasks are divided between the helper thread and the main thread, which keeps the helper thread ahead of the main thread. Experimental results show that the model achieves a system performance speedup of 1.2 to 1.5 times.

Keywords: data pre-fetch; helper thread; multi-core system; memory latency; gradient descent

Document code: A

As microprocessor development enters the multicore era [1 [...] memory access latency caused by cache misses. Conventional data prefetching falls into two kinds: hardware prefetching [5] and software prefetching [6]. Hardware prefetching, under the control of a prefetch engine and based on the memory access history, [...] prefetching approach 7] is essentially a Leader/Follower structure: the helper thread is a "slimmed-down version" of the original program with the computation tasks removed, and it usually runs [...] than the main thread [...] cache pollution.

Depending on the relative scale of memory access overhead and computation overhead, programs can be divided into the following three categories. Let the memory access time of a program be [...] and its computation time be [...]. The computation overhead and the memory access overhead