Vector Computers - Computer Science Department.ppt
3/22-24/11 CSE502-S11, Lec 14+15-Vector

Recent Multimedia Extensions for PCs

Outline
- Vector Processing Overview
- Vector Metrics, Terms
- Greater Efficiency than SuperScalar Processors
- Examples
  - CRAY-1 (1976, 1979) 1st vector-register supercomputer
  - Multimedia extensions to high-performance PC processors
  - Modern multi-vector-processor supercomputer: NEC ESS
- Design Features of Vector Supercomputers
- Conclusions
- Next Reading Assignment: Chapter 4 MultiProcessors

Vector Instruction Execution
ADDV C,A,B
[Figure: execution using one pipelined functional unit, compared with four-lane execution using four pipelined functional units]

Vector Unit Structure
[Figure labels: Lane, Functional Unit, Vector Registers, Memory Subsystem]
- Elements 0, 4, 8, …
- Elements 1, 5, 9, …
- Elements 2, 6, 10, …
- Elements 3, 7, 11, …

Automatic Code Vectorization
for (i=0; i < N; i++) C[i] = A[i] + B[i];
[Figure: Scalar Sequential Code (load, load, add, store for Iter. 1, then again for Iter. 2) compared with Vectorized Code (each vector instruction covers Iter. 1 and Iter. 2), plotted against Time]
Vectorization is a massive compile-time reordering of operation sequencing, so it requires extensive loop dependence analysis.

Vector Stripmining
Problem: Vector registers have finite length (64)
Solution: Break longer (than 64) loops into pieces that fit into vector registers, “Stripmining”
      ANDI  R1, N, 63    # N mod 64
      MTC1  VLR, R1      # Do remainder
loop: LV    V1, RA       # Vector load A
      DSLL  R2, R1, 3    # Multiply N%64 *8
      DADDU RA, RA, R2   # Bump RA pointer
      LV
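
The stripmining idea above can be sketched in plain C. This is an illustration under stated assumptions, not the code from the slides (whose assembly listing is cut off here): the function name `vector_add_stripmined` and the macro `MVL` are hypothetical, the inner loop stands in for what a single vector instruction would do on `vl` elements, and the remainder strip of N mod 64 elements is handled first, as the `ANDI`/`MTC1` pair suggests. Skipping an empty remainder strip is a simplification made here.

```c
#include <assert.h>
#include <stddef.h>

#define MVL 64  /* maximum vector length; 64 as stated on the slide */

/* Hypothetical helper: computes c[i] = a[i] + b[i] for n elements in strips
 * of at most MVL elements. Returns the number of strips executed. */
size_t vector_add_stripmined(const double *a, const double *b,
                             double *c, size_t n)
{
    size_t vl = n % MVL;   /* first strip: the remainder, N mod 64 */
    size_t start = 0;      /* index of the first element of the current strip */
    size_t strips = 0;

    if (vl == 0) {
        vl = MVL;          /* assumption: skip an empty remainder strip */
    }

    while (start < n) {
        /* Stand-in for one vector load/load/add/store sequence of length vl. */
        for (size_t j = 0; j < vl; j++) {
            c[start + j] = a[start + j] + b[start + j];
        }
        start += vl;       /* bump the pointers past this strip */
        vl = MVL;          /* every later strip is a full 64 elements */
        strips++;
    }
    return strips;
}
```

For n = 130, this sketch runs one strip of 2 elements followed by two full strips of 64.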