Computer Architecture: Numericals

Q1. Consider the following three hypothetical, but not atypical, processors, which we run with the SPEC gcc benchmark:

1. A simple MIPS two-issue static pipe running at a clock rate of 4 GHz and achieving a pipeline CPI of 0.8. This processor has a cache system that yields 0.005 misses per instruction.

2. A deeply pipelined version of a two-issue MIPS processor with slightly smaller caches and a 5 GHz clock rate. The pipeline CPI of the processor is 1.0, and the smaller caches yield 0.0055 misses per instruction on average.

3. A speculative superscalar with a 64-entry window. It achieves one-half of the ideal issue rate measured for this window size. This processor has the smallest caches, which lead to 0.01 misses per instruction, but it hides 25% of the miss penalty on every miss by dynamic scheduling. This processor has a 2.5 GHz clock.

Assume that the main memory time (which sets the miss penalty) is 50 ns. Determine the relative performance of these three processors.

Answer

First, we use the miss penalty and miss rate information to compute the contribution to CPI from cache misses for each configuration. We do this with the following formula:

Cache CPI = Misses per instruction × Miss penalty

We need to compute the miss penalty, in clock cycles, for each system:

Miss penalty = Memory access time / Clock cycle time

The clock cycle times for the processors are 250 ps, 200 ps, and 400 ps, respectively. Processor 3 hides 25% of each miss penalty, so only 75% of the memory time counts for it. Hence, the miss penalties are

Miss penalty1 = 50 ns / 250 ps = 200 cycles

Miss penalty2 = 50 ns / 200 ps = 250 cycles

Miss penalty3 = 0.75 × 50 ns / 400 ps = 93.75 ≈ 94 cycles
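The three miss penalties can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the function name `miss_penalty_cycles` and its `hidden_fraction` parameter are made up for illustration. Since a clock rate in GHz is the same as cycles per nanosecond, dividing by the cycle time becomes multiplying by the clock rate.

```python
# Sketch: miss penalty in cycles = memory access time / clock cycle time.
MEMORY_TIME_NS = 50.0  # main memory time from the problem statement


def miss_penalty_cycles(clock_ghz, hidden_fraction=0.0):
    """Return the visible miss penalty in clock cycles.

    clock_ghz: clock rate in GHz, i.e. cycles per nanosecond.
    hidden_fraction: share of the penalty hidden by dynamic scheduling
                     (hypothetical parameter name, 0.25 for processor 3).
    """
    return (1.0 - hidden_fraction) * MEMORY_TIME_NS * clock_ghz


penalties = [
    miss_penalty_cycles(4.0),                        # processor 1
    miss_penalty_cycles(5.0),                        # processor 2
    miss_penalty_cycles(2.5, hidden_fraction=0.25),  # processor 3
]
print(penalties)  # [200.0, 250.0, 93.75]
```

The unrounded value for processor 3 is 93.75 cycles, which the worked answer rounds to 94.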

Applying this for each cache:

Cache CPI1 = 0.005 × 200 = 1.0

Cache CPI2 = 0.0055 × 250 = 1.375 ≈ 1.4

Cache CPI3 = 0.01 × 94 = 0.94
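As a quick check, the cache CPI step can be sketched in Python. The dictionary layout and variable names are assumptions made for this example, and the miss penalties are recomputed inside the loop so the snippet runs on its own. It keeps unrounded intermediate values, so processor 3 comes out as 0.9375 rather than the rounded 0.94.

```python
# Sketch: cache CPI = misses per instruction * miss penalty (in cycles).
MEMORY_TIME_NS = 50.0  # main memory time from the problem statement

# name -> (misses per instruction, clock rate in GHz, fraction of penalty hidden)
processors = {
    "P1": (0.005, 4.0, 0.0),
    "P2": (0.0055, 5.0, 0.0),
    "P3": (0.01, 2.5, 0.25),
}

cache_cpi = {}
for name, (misses_per_instr, clock_ghz, hidden) in processors.items():
    # GHz equals cycles per nanosecond, so this is the penalty in cycles.
    penalty_cycles = (1.0 - hidden) * MEMORY_TIME_NS * clock_ghz
    cache_cpi[name] = misses_per_instr * penalty_cycles

for name, value in cache_cpi.items():
    print(f"{name}: cache CPI = {value:.4f}")
```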

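We know the pipeline CPI contribution for everything but processor 3. Its pipeline CPI is the reciprocal of its issue rate, which is one-half of the ideal issue rate for a 64-entry window (taken as 9 instructions per cycle in this calculation):

Pipeline CPI3 = 1 / Issue rate = 1 / (9 × 0.5) = 1 / 4.5 = 0.22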

Now we can find the CPI for each processor by adding the pipeline and cache CPI contributions:

CPI1 = 0.8 + 1.0 = 1.8

CPI2 = 1.0 + 1.4 = 2.4

CPI3 = 0.22 + 0.94 = 1.16
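The CPI totals can be reproduced with a short Python sketch. The variable names are illustrative, and the ideal issue rate of 9 is the value used in the worked answer, not something the code derives. Because it keeps unrounded intermediates, it gives 2.375 for processor 2 and about 1.16 for processor 3, which match the rounded figures above.

```python
# Sketch: total CPI = pipeline CPI + cache CPI.
IDEAL_ISSUE_RATE = 9.0  # value used in the worked answer for a 64-entry window

pipeline_cpi = {
    "P1": 0.8,
    "P2": 1.0,
    "P3": 1.0 / (0.5 * IDEAL_ISSUE_RATE),  # half the ideal issue rate
}

# Misses per instruction times miss penalty in cycles (unrounded).
cache_cpi = {
    "P1": 0.005 * 200.0,
    "P2": 0.0055 * 250.0,
    "P3": 0.01 * 93.75,
}

total_cpi = {name: pipeline_cpi[name] + cache_cpi[name] for name in pipeline_cpi}

for name, value in total_cpi.items():
    print(f"{name}: CPI = {value:.3f}")
```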

Since all three processors implement the same architecture, we can compare instruction execution rates in millions of instructions per second (MIPS) to determine relative performance:

Instruction execution rate = Clock rate / CPI

Instruction execution rate1 = 4000 MHz / 1.8 = 2222 MIPS

Instruction execution rate2 = 5000 MHz / 2.4 = 2083 MIPS

Instruction execution rate3 = 2500 MHz / 1.16 = 2155 MIPS

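In this example, the simple two-issue static superscalar looks best, although the margins are small: it is about 3% faster than processor 3 and about 7% faster than processor 2. In practice, performance depends on both the CPI and clock rate assumptions.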
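The whole calculation can be put together in one small Python sketch. The `Processor` class and its field names are hypothetical, chosen only for this example, and the ideal issue rate of 9 is again taken from the worked answer. The code carries unrounded intermediate values, so processor 2 comes out near 2105 MIPS instead of 2083 MIPS, which uses the rounded CPI of 2.4. The ranking is the same either way.

```python
from dataclasses import dataclass

MEMORY_TIME_NS = 50.0  # main memory time from the problem statement


@dataclass
class Processor:
    name: str
    clock_ghz: float
    pipeline_cpi: float
    misses_per_instr: float
    hidden_fraction: float = 0.0  # share of the miss penalty that is hidden

    def cpi(self) -> float:
        # Visible miss penalty in cycles (GHz equals cycles per nanosecond).
        penalty = (1.0 - self.hidden_fraction) * MEMORY_TIME_NS * self.clock_ghz
        return self.pipeline_cpi + self.misses_per_instr * penalty

    def mips(self) -> float:
        # Clock rate in MHz divided by CPI.
        return self.clock_ghz * 1000.0 / self.cpi()


processors = [
    Processor("P1", 4.0, 0.8, 0.005),
    Processor("P2", 5.0, 1.0, 0.0055),
    Processor("P3", 2.5, 1.0 / (0.5 * 9.0), 0.01, hidden_fraction=0.25),
]

rates = {p.name: p.mips() for p in processors}
best = max(rates, key=rates.get)
ranking = sorted(rates, key=rates.get, reverse=True)

for name in ranking:
    print(f"{name}: {rates[name]:.0f} MIPS ({rates[name] / rates[best]:.2f} relative to {best})")
```

Changing a clock rate or miss rate in the `processors` list is an easy way to see how sensitive the ranking is to the assumptions.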

Friday, April 24, 2015
