Consider an enhancement to a processor which executes 10 times faster than the original but only applies to 40% of the workload.

What is the overall speedup when the improvement is incorporated?
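One way to check an answer to this problem is to evaluate Amdahl's Law directly. The sketch below assumes the 40% is a fraction of the original execution time; the function name `amdahl_speedup` is chosen here for illustration and does not come from the problem source.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the original execution
    time runs `speedup_enhanced` times faster (Amdahl's Law)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    # Unenhanced time stays as is; enhanced time shrinks by the speedup factor.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / new_time


overall = amdahl_speedup(0.40, 10.0)
print(f"overall speedup = {overall:.4f}")  # overall speedup = 1.5625
```

Note how far the result is from 10: the 60% of the workload that is not enhanced dominates the new execution time.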

Instructor notes: This problem is adapted from a University of Buffalo Computer Science lecture.

We would like to speed up a floating-point square root (FPSQR) operation using one of the two methods described below. The original program CPI is 2.0. What is the CPI for each design?

Method 1: Add FPSQR hardware. FPSQR is responsible for 20% of the square root execution time, and the new hardware speeds it up by a factor of 10.

Method 2: Make all the floating point instructions run 2 times faster. Floating point instructions are responsible for 50% of the square root execution time.

Method 1 CPI: _____

Method 2 CPI: _____
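A possible way to work the two blanks above, sketched under two assumptions that the problem does not state outright: the percentages are fractions of total execution time, and neither method changes the instruction count or the clock rate. Under those assumptions execution time is proportional to CPI, so the new CPI is the old CPI scaled by the fraction of time that remains. The helper name `new_cpi` is hypothetical.

```python
ORIGINAL_CPI = 2.0


def new_cpi(original_cpi: float, fraction_enhanced: float, speedup_enhanced: float) -> float:
    # Assumes fixed instruction count and clock rate, so time scales with CPI.
    remaining_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return original_cpi * remaining_time


method1_cpi = new_cpi(ORIGINAL_CPI, 0.20, 10.0)  # FPSQR hardware, 10x on 20%
method2_cpi = new_cpi(ORIGINAL_CPI, 0.50, 2.0)   # all FP instructions, 2x on 50%

print(f"Method 1 CPI = {method1_cpi:.2f}")
print(f"Method 2 CPI = {method2_cpi:.2f}")
```

With these assumptions the sketch gives roughly 1.64 for Method 1 and 1.50 for Method 2, so the smaller speedup applied to the larger fraction comes out ahead.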

We are considering an enhancement to the processor of a web server. The new CPU is 20 times faster on search queries than the old processor. The old processor is busy with search queries 70% of the time. What is the speedup gained by integrating the enhanced CPU?
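Beyond the single number asked for, this problem is a convenient place to see the ceiling that Amdahl's Law imposes. The sketch below assumes the 70% is a fraction of the old processor's execution time; the larger speedup factors (100, 1000) are hypothetical values added only to show the trend.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


SEARCH_FRACTION = 0.70  # share of time spent on search queries (from the problem)

results = {}
for query_speedup in (20, 100, 1000):  # 20 is from the problem; the rest are illustrative
    results[query_speedup] = amdahl_speedup(SEARCH_FRACTION, query_speedup)
    print(f"search {query_speedup:>4}x faster -> overall {results[query_speedup]:.3f}x")

# Upper bound as the search speedup grows without limit.
upper_bound = 1.0 / (1.0 - SEARCH_FRACTION)
print(f"upper bound = {upper_bound:.3f}x")
```

The 20x processor already lands close to the bound of about 3.33, because the 30% of time spent outside search queries is untouched.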

Instructor notes: This problem is adapted from Professor Kleffel's course notes at CSUSB.

Your company currently uses the "High RISC" processor by RISCily Solutions. It is a 2GHz processor with an average CPI of 1.5. RISCily Solutions has just come out with two new processors. You are to determine the speedup for each processor.

"Triple the RISC" is a three-pipelined machine that also runs at 2GHz and maintains the same CPI as the High RISC. It is estimated that one third of your code can use all three pipelines and another third can use two pipelines. For an n-pipelined processor, the ideal speedup is n.

"RISCily Fast" is a 5GHz single processor that, due to its speed, encounters a larger memory penalty. Memory accesses make up 40% of all instructions, and 5% of those accesses miss in the cache; each miss incurs a penalty of 50 cycles more than on the original High RISC processor. In all other ways it is identical to the High RISC.

Speedup for Triple the RISC: ____

Speedup for RISCily Fast: ____
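The following sketch shows one reading of the two processors, not an official solution. It assumes that the final third of the code uses only one pipeline on Triple the RISC, that the cache miss rate is 5% of memory accesses, and that the extra 50 cycles per miss is added on top of the High RISC's CPI of 1.5. All constant names are chosen here for illustration.

```python
# --- Triple the RISC: same clock and CPI, so only pipeline usage matters ---
# Maps number of pipelines in use -> fraction of the code (last third assumed serial).
pipeline_usage = {3: 1 / 3, 2: 1 / 3, 1: 1 / 3}
triple_new_time = sum(fraction / pipelines for pipelines, fraction in pipeline_usage.items())
triple_speedup = 1.0 / triple_new_time

# --- RISCily Fast: faster clock, but extra miss penalty raises the CPI ---
OLD_CLOCK_HZ = 2e9
NEW_CLOCK_HZ = 5e9
OLD_CPI = 1.5
MEM_ACCESS_FRACTION = 0.40   # memory accesses per instruction
MISS_RATE = 0.05             # assumed: misses per memory access
EXTRA_MISS_PENALTY = 50      # additional cycles per miss versus High RISC

fast_cpi = OLD_CPI + MEM_ACCESS_FRACTION * MISS_RATE * EXTRA_MISS_PENALTY
old_time_per_instr = OLD_CPI / OLD_CLOCK_HZ
new_time_per_instr = fast_cpi / NEW_CLOCK_HZ
fast_speedup = old_time_per_instr / new_time_per_instr

print(f"Triple the RISC speedup = {triple_speedup:.3f}")
print(f"RISCily Fast CPI        = {fast_cpi:.2f}")
print(f"RISCily Fast speedup    = {fast_speedup:.3f}")
```

Under these assumptions the two designs come out at about 1.64 and 1.5 respectively; a different reading of the miss statistics would change the second figure.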

Instructor notes: This problem is adapted from a midterm given by Schubert at CSUSB.

We are given a task which is split up into four parts:

```
P1 = 11%
P2 = 18%
P3 = 23%
P4 = 48%
```

Then we say:

```
P1 is not sped up, so S1 = 1 or 100%
P2 is sped up 5x, so S2 = 500%
P3 is sped up 20x, so S3 = 2000%
P4 is sped up 1.6x, so S4 = 160%
```

What is the overall speedup for the task?
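Since every part of the task has its own speedup, the overall figure can be computed by summing each part's new share of the time. The sketch below is one way to do that; `overall_speedup` is a name introduced here, and the input check simply guards against fractions that do not cover the whole task.

```python
def overall_speedup(parts):
    """`parts` is a list of (fraction_of_original_time, speedup) pairs
    that together cover the whole task."""
    total_fraction = sum(fraction for fraction, _ in parts)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    # Each part's new time is its original share divided by its speedup.
    new_time = sum(fraction / speedup for fraction, speedup in parts)
    return 1.0 / new_time


task = [
    (0.11, 1.0),   # P1, not sped up
    (0.18, 5.0),   # P2
    (0.23, 20.0),  # P3
    (0.48, 1.6),   # P4
]
task_speedup = overall_speedup(task)
print(f"overall speedup = {task_speedup:.3f}")
```

The new time works out to 0.11 + 0.036 + 0.0115 + 0.3 = 0.4575 of the original, so the overall speedup is a little under 2.19 even though P3 alone runs 20 times faster.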

Instructor notes: This problem is adapted from the Amdahl's Law Wikipedia page.

Memory operations currently take 30% of execution time.

A new widget called a "cache" speeds up 80% of memory operations by a factor of 4.

A second new widget called an "L2 cache" speeds up half of the remaining 20% by a factor of 2.

What is the total speedup?
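The bookkeeping for this problem is easy to get wrong, so a short sketch may help. It assumes the 80% and 20% are fractions of memory-operation time (not of total time), and that the half of the remaining 20% not covered by the L2 cache runs at its original speed. Variable names are illustrative.

```python
MEMORY_FRACTION = 0.30  # share of total execution time spent in memory operations

# Split memory time into three slices, each expressed as a share of total time.
l1_slice = MEMORY_FRACTION * 0.80          # sped up 4x by the cache
l2_slice = MEMORY_FRACTION * 0.20 * 0.5    # sped up 2x by the L2 cache
slow_slice = MEMORY_FRACTION * 0.20 * 0.5  # assumed unchanged
non_memory = 1.0 - MEMORY_FRACTION         # unchanged

new_time = non_memory + l1_slice / 4.0 + l2_slice / 2.0 + slow_slice
total_speedup = 1.0 / new_time

print(f"new execution time = {new_time:.3f} of original")
print(f"total speedup      = {total_speedup:.3f}")
```

With this interpretation the new execution time is 0.805 of the original, giving a total speedup of roughly 1.24.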

Instructor notes: This problem is adapted from the UCSD CS141 lecture notes.

Use Amdahl's Law to illustrate why it is important to keep a computer system balanced in terms of relative performance between, for example, I/O speed and raw CPU speed.
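A numerical illustration can support the discussion. The sketch below uses a hypothetical workload that spends 90% of its time on the CPU and 10% on I/O (these numbers are not from the problem) and speeds up only the CPU portion.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


CPU_FRACTION = 0.90  # hypothetical split: 90% CPU, 10% I/O

rows = []
for cpu_speedup in (2, 10, 100, 1000):
    overall = amdahl_speedup(CPU_FRACTION, cpu_speedup)
    new_time = 1.0 / overall
    # Share of the NEW execution time that is spent waiting on unchanged I/O.
    io_share = (1.0 - CPU_FRACTION) / new_time
    rows.append((cpu_speedup, overall, io_share))
    print(f"CPU {cpu_speedup:>4}x -> overall {overall:5.2f}x, I/O share {io_share:6.1%}")
```

In this example the overall speedup never reaches 10 no matter how fast the CPU becomes, and I/O grows from 10% of the run time to nearly all of it, which is the imbalance the question is pointing at.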

Instructor notes: Adapted from Hennessy and Patterson, Computer Architecture, 4th ed.

Three enhancements with the following speedups are proposed for a new architecture:

Enhancement A: speedup = 30

Enhancement B: speedup = 20

Enhancement C: speedup = 15

Only one enhancement is useable at a time.

How can Amdahl's Law be formulated to handle multiple enhancements?
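One common way to write the generalized form is given below as a sketch. The symbols are notation introduced here: F_i is the fraction of the original execution time during which enhancement i is in use, and S_i is its speedup. Because only one enhancement is usable at a time, the F_i do not overlap and their sum is at most 1.

```latex
\text{Speedup}_{\text{overall}}
  = \frac{1}{\left(1 - \sum_{i} F_i\right) + \sum_{i} \dfrac{F_i}{S_i}}
```

With a single enhancement the sums collapse to one term and the expression reduces to the usual form of Amdahl's Law.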

Instructor notes: Adapted from Hennessy and Patterson, Computer Architecture, 4th ed.

As in Problem 8, three enhancements with the following speedups are proposed for a new architecture: A = 30, B = 20, and C = 15.

Only one enhancement is useable at a time.

If enhancements A and B are each usable for 25% of the time, what fraction of the time must enhancement C be used to achieve an overall speedup of 10?
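This question can be handled by rearranging the multi-enhancement form of Amdahl's Law for the unknown fraction. The sketch below assumes A and B are each usable 25% of the time and takes the speedups of 30, 20, and 15 from Problem 8; `required_fraction` is a hypothetical helper name.

```python
def required_fraction(target_speedup, known, unknown_speedup):
    """Fraction of original time the unknown enhancement must cover.

    `known` is a list of (fraction, speedup) pairs for the other enhancements.
    Solves 1/target = (1 - F_known - x) + T_known + x/unknown_speedup for x.
    """
    known_fraction = sum(f for f, _ in known)
    known_time = sum(f / s for f, s in known)
    x = (1.0 - known_fraction + known_time - 1.0 / target_speedup) / (
        1.0 - 1.0 / unknown_speedup
    )
    if not 0.0 <= x <= 1.0 - known_fraction:
        raise ValueError("target speedup is not reachable with these inputs")
    return x


fraction_c = required_fraction(10.0, [(0.25, 30.0), (0.25, 20.0)], 15.0)
print(f"enhancement C must cover {fraction_c:.1%} of the original time")
```

Under these assumptions C has to cover roughly 45% of the original execution time, leaving only about 5% unenhanced.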

Instructor notes: Adapted from Hennessy and Patterson, Computer Architecture, 4th ed.

As in Problem 8, three enhancements with the following speedups are proposed for a new architecture: A = 30, B = 20, and C = 15.

Only one enhancement is useable at a time.

Assume the enhancements can be used 25%, 35%, and 10% of the time for enhancements A, B, and C respectively.

Of the new execution time, what percentage is spent without using any of the enhancements?
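The key step here is to express the unenhanced time as a share of the new, shorter execution time rather than the original one. A minimal sketch, taking the speedups of 30, 20, and 15 from Problem 8 and treating the usage percentages as fractions of the original execution time:

```python
# name -> (fraction of original time, speedup)
enhancements = {"A": (0.25, 30.0), "B": (0.35, 20.0), "C": (0.10, 15.0)}

unenhanced = 1.0 - sum(fraction for fraction, _ in enhancements.values())
enhanced_new_time = sum(fraction / speedup for fraction, speedup in enhancements.values())

new_time = unenhanced + enhanced_new_time   # relative to original time = 1.0
unenhanced_share = unenhanced / new_time    # share of the NEW execution time

print(f"new execution time = {new_time:.4f} of original")
print(f"unenhanced share   = {unenhanced_share:.1%}")
```

The unenhanced 30% of the original time ends up as about 90% of the new execution time, since the enhanced portions shrink to a few percent.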

Instructor notes: Adapted from Hennessy and Patterson, Computer Architecture, 4th ed.

As in Problem 8, three enhancements with the following speedups are proposed for a new architecture: A = 30, B = 20, and C = 15.

Only one enhancement is useable at a time.

Assume, for some benchmark, the possible fraction of use is 15% for each of the enhancements A and B and 70% for enhancement C. We want to maximize performance.

If only one enhancement can be implemented, which should it be?

If two enhancements can be implemented, which should be chosen?
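Both questions can be explored by enumerating the possible choices and comparing overall speedups. The sketch below takes the speedups from Problem 8 and the usage fractions from this problem; the function names are hypothetical.

```python
from itertools import combinations

# name -> (fraction of original time, speedup)
ENHANCEMENTS = {"A": (0.15, 30.0), "B": (0.15, 20.0), "C": (0.70, 15.0)}


def overall_speedup(chosen):
    """Overall speedup when only the enhancements named in `chosen` are built."""
    used = sum(ENHANCEMENTS[name][0] for name in chosen)
    enhanced_time = sum(ENHANCEMENTS[name][0] / ENHANCEMENTS[name][1] for name in chosen)
    return 1.0 / ((1.0 - used) + enhanced_time)


def best_choice(count):
    """Best set of `count` enhancements by overall speedup."""
    return max(combinations(sorted(ENHANCEMENTS), count), key=overall_speedup)


for count in (1, 2):
    for chosen in combinations(sorted(ENHANCEMENTS), count):
        print(f"{'+'.join(chosen):<4} -> {overall_speedup(chosen):.3f}x")

best_one = best_choice(1)
best_two = best_choice(2)
print("best single:", best_one, " best pair:", best_two)
```

Enhancement C wins on its own despite having the smallest speedup, because it applies to 70% of the time; the best pair adds A, whose higher speedup edges out B on the same 15% fraction.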

Instructor notes: Adapted from Hennessy and Patterson, Computer Architecture, 4th ed.