Q1. Consider the following three hypothetical, but not atypical, processors, which we run with the SPEC gcc benchmark:
1. A simple MIPS two-issue static pipe running at a clock rate of 4 GHz and achieving a pipeline CPI of 0.8. This processor has a cache system that yields 0.005 misses per instruction.
2. A deeply pipelined version of a two-issue MIPS processor with slightly smaller caches and a 5 GHz clock rate. The pipeline CPI of the processor is 1.0, and the smaller caches yield 0.0055 misses per instruction on average.
3. A speculative superscalar with a 64-entry window. It achieves one-half of the ideal issue rate measured for this window size. This processor has the smallest caches, which lead to 0.01 misses per instruction, but it hides 25% of the miss penalty on every miss by dynamic scheduling. This processor has a 2.5 GHz clock.
Assume that the main memory time (which sets the miss penalty) is 50 ns. Determine the relative performance of these three processors.
Answer
First, we use the miss penalty and miss rate information to compute the contribution to CPI from cache misses for each configuration. We do this with the following formula:
Cache CPI = Misses per instruction × Miss penalty
We need to compute the miss penalties for each system:
Miss penalty = Memory access time / Clock cycle
The clock cycle times for the processors are 250 ps, 200 ps, and 400 ps, respectively. Hence, the miss penalties are
Miss penalty1 = 50 ns / 250 ps = 200 cycles
Miss penalty2 = 50 ns / 200 ps = 250 cycles
Miss penalty3 = 0.75 × 50 ns / 400 ps ≈ 94 cycles
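The miss-penalty arithmetic above can be checked with a short Python sketch. The function name `miss_penalty_cycles` and its `hidden_fraction` parameter are illustrative choices, not part of the original problem. The sketch uses the fact that a time in ns multiplied by a clock rate in GHz gives a cycle count directly.

```python
# Miss penalty in cycles = memory access time / clock cycle time.
# With time in ns and clock rate in GHz, cycles = time_ns * rate_ghz.
MEMORY_NS = 50.0  # main memory time from the problem statement


def miss_penalty_cycles(clock_ghz, hidden_fraction=0.0):
    """Effective miss penalty in clock cycles (hypothetical helper)."""
    return (1.0 - hidden_fraction) * MEMORY_NS * clock_ghz


penalties = [
    miss_penalty_cycles(4.0),                        # processor 1
    miss_penalty_cycles(5.0),                        # processor 2
    miss_penalty_cycles(2.5, hidden_fraction=0.25),  # processor 3 hides 25%
]
print(penalties)
```

The third value comes out as 93.75 cycles before rounding, which the text reports as 94.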
Applying this for each cache:
Cache CPI1 = 0.005 × 200 = 1.0
Cache CPI2 = 0.0055 × 250 ≈ 1.4
Cache CPI3 = 0.01 × 94 = 0.94
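We know the pipeline CPI contribution for everything but processor 3. Its pipeline CPI follows from its issue rate, which is one-half of the ideal issue rate for a 64-entry window (taken here to be 9 instructions per cycle):
Pipeline CPI3 = 1 / Issue rate = 1 / (9 × 0.5) = 1 / 4.5 = 0.22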
Now we can find the CPI for each processor by adding the pipeline and cache CPI contributions:
CPI1 = 0.8 + 1.0 = 1.8
CPI2 = 1.0 + 1.4 = 2.4
CPI3 = 0.22 + 0.94 = 1.16
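As a cross-check, the total CPI for each processor can be recomputed from the raw parameters without intermediate rounding. This is a minimal sketch: the dictionary keys and the `total_cpi` helper are illustrative names, and the ideal issue rate of 9 is the value used in the worked answer.

```python
MEMORY_NS = 50.0            # main memory time from the problem statement
IDEAL_ISSUE_RATE_64 = 9.0   # ideal issue rate used in the worked answer

# (pipeline CPI, misses per instruction, clock in GHz, fraction of penalty hidden)
processors = {
    "static two-issue": (0.8, 0.005, 4.0, 0.0),
    "deep pipeline": (1.0, 0.0055, 5.0, 0.0),
    "speculative superscalar": (1.0 / (0.5 * IDEAL_ISSUE_RATE_64), 0.01, 2.5, 0.25),
}


def total_cpi(pipeline_cpi, misses_per_instr, clock_ghz, hidden):
    """Pipeline CPI plus the cache-miss contribution (hypothetical helper)."""
    penalty_cycles = (1.0 - hidden) * MEMORY_NS * clock_ghz
    return pipeline_cpi + misses_per_instr * penalty_cycles


cpis = {name: total_cpi(*params) for name, params in processors.items()}
for name, cpi in cpis.items():
    print(f"{name}: CPI = {cpi:.3f}")
```

Without rounding, the second processor's CPI is 2.375 rather than 2.4, because the text rounds its cache CPI of 1.375 up to 1.4. The ordering of the three CPIs is unchanged.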
Since this is the same architecture, we can compare instruction execution rates in millions of instructions per second (MIPS) to determine relative performance:
Instruction execution rate = Clock rate / CPI
Instruction execution rate1 = 4000 MHz / 1.8 = 2222 MIPS
Instruction execution rate2 = 5000 MHz / 2.4 = 2083 MIPS
Instruction execution rate3 = 2500 MHz / 1.16 = 2155 MIPS
In this example, the simple two-issue static superscalar looks best. In practice, performance depends on both the CPI and clock rate assumptions.
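The final comparison can be sketched in Python using the rounded CPI values from the text. The labels and variable names below are illustrative only.

```python
# Clock rates in MHz and CPI values as rounded in the text.
designs = {
    "processor 1": (4000.0, 1.8),
    "processor 2": (5000.0, 2.4),
    "processor 3": (2500.0, 1.16),
}

# Instruction execution rate in MIPS = clock rate in MHz / CPI.
mips = {name: clock_mhz / cpi for name, (clock_mhz, cpi) in designs.items()}
fastest = max(mips, key=mips.get)

for name, rate in mips.items():
    print(f"{name}: {rate:.0f} MIPS, {rate / mips[fastest]:.2f}x the fastest")
```

Under these assumptions processor 1 is fastest, with processors 2 and 3 reaching roughly 94% and 97% of its instruction rate.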