Advanced Computer Architecture 3-2 Overview.ppt

Pipelining and Exploiting Instruction-Level Parallelism (ILP)

Pipelining increases performance by overlapping the execution of independent instructions, that is, by exploiting ILP. The CPI of a real-life pipeline is given by (assuming ideal memory):

Pipeline CPI = Ideal Pipeline CPI + Structural Stalls + RAW Stalls + WAR Stalls + WAW Stalls + Control Stalls

A basic instruction block is a straight-line code sequence with no branches in except at the entry point, and no branches out except at the exit point of the sequence. The amount of parallelism in a basic block is limited by the instruction dependences present and by the size of the basic block. In typical integer code, dynamic branch frequency is about 15% (an average basic block size of about 7 instructions).

Increasing Instruction-Level Parallelism

A common way to increase parallelism among instructions is to exploit parallelism among the iterations of a loop (i.e., loop-level parallelism, LLP). This is accomplished by unrolling the loop, either statically by the compiler or dynamically by hardware, which increases the size of the basic block present. In the following loop, every iteration can overlap with any other iteration, while overlap within each iteration is minimal:

for (i=1; i<=1000; i=i+1) x[i] = x[i] + y[i];

In vector machines, utilizing vector instructions is an important alternative for exploiting loop-level parallelism. Vector instructions operate on a number of data items. The above loop would requi
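The static unrolling described above can be sketched in C. This is only an illustration under stated assumptions, not the slides' own code: the function names add_rolled, add_unrolled4, and unrolled_matches_rolled are hypothetical, the unroll factor of 4 is an arbitrary choice, and the arrays are indexed from 0 rather than from 1 as in the slide loop.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form: one element and one loop branch per iteration,
 * so each basic block contains very little work. */
void add_rolled(double *x, const double *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        x[i] = x[i] + y[i];
}

/* Unrolled by 4 (factor chosen for illustration): the loop body is one
 * larger basic block holding four independent adds, with one branch
 * per four elements. */
void add_unrolled4(double *x, const double *y, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        x[i]     = x[i]     + y[i];
        x[i + 1] = x[i + 1] + y[i + 1];
        x[i + 2] = x[i + 2] + y[i + 2];
        x[i + 3] = x[i + 3] + y[i + 3];
    }
    /* Cleanup loop for the remaining 0 to 3 elements when n is not
     * a multiple of 4. */
    for (; i < n; i++)
        x[i] = x[i] + y[i];
}

/* Hypothetical self-check: returns 1 if both versions produce the same
 * result for n elements (n at most 1024), 0 otherwise. */
int unrolled_matches_rolled(size_t n) {
    double a[1024], b[1024], y[1024];
    assert(n <= 1024);
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i] = (double)i * 0.5;
        y[i] = 1000.0 - (double)i;
    }
    add_rolled(a, y, n);
    add_unrolled4(b, y, n);
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

Because no iteration reads a value written by another, the four statements in the unrolled body carry no dependences among themselves, which is what gives a pipeline scheduler more independent instructions to overlap. Calling unrolled_matches_rolled(1000) exercises only the unrolled loop, while a length such as 1003 also exercises the cleanup loop.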
