# Memory/Problems 1 - 3

Suppose that the CPI for a given architecture (with a perfect memory system, using 32-bit addresses) is 1.5. We are considering the following cache systems:

•  A 16 KB direct-mapped "unified" cache using "write-back". Miss ratio = 2.9%. Does not affect the cycle length.
•  A 16 KB 2-way set-associative "unified" cache using "write-back". Miss ratio = 2.2%. Increases the cycle length by a factor of 1.2.
•  A 32 KB direct-mapped "unified" cache using "write-back". Miss ratio = 2.0%. Increases the cycle length by a factor of 1.25.

Suppose a memory latency of 40 cycles, 4 bytes transferred per cycle, and that 50% of the blocks are "dirty". There are 32 bytes per block and 20% of the instructions are "data transfer" instructions. A "write buffer" is not used.

a) Calculate the effective CPI for the three cache systems.

b) Recalculate the CPI using a system with a TLB with a 0.2% miss rate and a 20-cycle penalty. The caches are physically addressed.

c) Which of the above cache systems is best?

d) How does a TLB affect performance if the cache is virtually or physically addressed?

Solution

a) Block transfer time between cache and memory (penalty): 40+32/4 = 48 cycles. Number of block transfers per instruction between cache and memory:

Cache accesses/instr * (0.5 blocks writeback + 1 block fetch) * Miss ratio = (1+0.2) * (0.5+1) * Miss ratio = 1.8 * Miss ratio

CPI = baseCPI + MemoryStalls/instr = base CPI + BlockTransfers/instr * penalty

CPI 1 = 1.5 + 1.8 * 0.029 * 48 = 4.01

CPI 2 = 1.5 + 1.8 * 0.022 * 48 = 3.40

CPI 3 = 1.5 + 1.8 * 0.020 * 48 = 3.22
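The arithmetic in part (a) can be reproduced with a short Python sketch. All parameter values come from the problem statement; the constant and function names are illustrative choices, not part of the original exercise. The script prints unrounded results to three decimals, so the last digit may differ slightly from the rounded figures above.

```python
# Parameters taken from the problem statement.
BASE_CPI = 1.5
LATENCY_CYCLES = 40
BYTES_PER_CYCLE = 4
BLOCK_BYTES = 32
DIRTY_FRACTION = 0.5
DATA_TRANSFER_FRACTION = 0.2

# Miss ratio for each of the three cache systems.
MISS_RATIOS = {"cache 1": 0.029, "cache 2": 0.022, "cache 3": 0.020}


def block_transfer_penalty():
    """Cycles needed to move one block between cache and memory."""
    return LATENCY_CYCLES + BLOCK_BYTES / BYTES_PER_CYCLE


def effective_cpi(miss_ratio):
    """Base CPI plus memory stall cycles per instruction."""
    # One instruction fetch plus 0.2 data accesses per instruction.
    accesses_per_instr = 1 + DATA_TRANSFER_FRACTION
    # Each miss fetches one block and, half of the time, writes one back.
    transfers_per_miss = 1 + DIRTY_FRACTION
    transfers_per_instr = accesses_per_instr * transfers_per_miss * miss_ratio
    return BASE_CPI + transfers_per_instr * block_transfer_penalty()


for name, miss_ratio in MISS_RATIOS.items():
    print(f"{name}: CPI = {effective_cpi(miss_ratio):.3f}")
```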

b) Because the caches are physically addressed, every cache access requires a TLB access. In all three cases this adds the same CPI offset: 0.002 * 20 * 1.2 = 0.048.
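As a check on part (b), the following Python sketch computes the TLB offset and adds it to the CPI of each cache. The CPIs are recomputed here from the part (a) formula rather than taken from the rounded figures, so the results may differ in the last digit from a sum of rounded values. The variable names are illustrative assumptions.

```python
# Values from the problem statement.
TLB_MISS_RATE = 0.002
TLB_MISS_PENALTY = 20          # cycles
CACHE_ACCESSES_PER_INSTR = 1.2  # one fetch + 0.2 data accesses

# Extra stall cycles per instruction caused by TLB misses.
tlb_offset = TLB_MISS_RATE * TLB_MISS_PENALTY * CACHE_ACCESSES_PER_INSTR

# Part (a) CPI: base CPI + 1.8 * miss ratio * 48-cycle penalty.
miss_ratios = {"cache 1": 0.029, "cache 2": 0.022, "cache 3": 0.020}
cpi_with_tlb = {
    name: 1.5 + 1.8 * miss * 48 + tlb_offset
    for name, miss in miss_ratios.items()
}

print(f"TLB offset = {tlb_offset:.3f}")
for name, cpi in cpi_with_tlb.items():
    print(f"{name}: CPI with TLB = {cpi:.3f}")
```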

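c) Comparing execution times using the CPU performance formula (execution time = CPI * cycle length * instruction count, where CP is the base clock period and IC is the instruction count):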

EXE 1 = 4.01 * 1 * CP * IC = 4.01 * CP * IC

EXE 2 = 3.40 * 1.2 * CP * IC = 4.08 * CP * IC

EXE 3 = 3.22 * 1.25 * CP * IC = 4.025 * CP * IC

Cache 1 is the best, since it gives the shortest execution time.
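The comparison can also be scripted. This Python sketch expresses each execution time in units of CP * IC, using the part (a) CPI formula and the cycle length factors from the problem statement. It works from unrounded CPIs, so individual values may differ slightly from the rounded ones above, but the ranking is the same. The names used are illustrative.

```python
# (miss ratio, cycle length factor) for each cache system.
caches = {
    "cache 1": (0.029, 1.0),
    "cache 2": (0.022, 1.2),
    "cache 3": (0.020, 1.25),
}


def relative_execution_time(miss_ratio, cycle_factor):
    """Execution time in units of CP * IC (base clock period * instruction count)."""
    cpi = 1.5 + 1.8 * miss_ratio * 48  # effective CPI from part (a)
    return cpi * cycle_factor


times = {
    name: relative_execution_time(miss, factor)
    for name, (miss, factor) in caches.items()
}

for name, t in times.items():
    print(f"{name}: {t:.3f} * CP * IC")

# The best cache is the one with the smallest execution time.
best = min(times, key=times.get)
print("best:", best)
```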

d) In a virtually addressed cache, the TLB is accessed only on cache misses. In a physically addressed cache, the TLB is accessed on every cache access.
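To give a feel for the size of this difference, here is a rough Python sketch that reuses the TLB numbers from part (b). It is a simplified model under stated assumptions, not part of the original solution: a virtually addressed cache is assumed to need exactly one TLB lookup per cache miss, and any lookups for write-backs are ignored. The function name `tlb_stall_cpi` is hypothetical.

```python
TLB_MISS_RATE = 0.002
TLB_MISS_PENALTY = 20           # cycles
CACHE_ACCESSES_PER_INSTR = 1.2  # one fetch + 0.2 data accesses


def tlb_stall_cpi(cache_miss_ratio, physically_addressed):
    """TLB stall cycles per instruction under the simplified model."""
    if physically_addressed:
        # Every cache access is translated first.
        lookups_per_instr = CACHE_ACCESSES_PER_INSTR
    else:
        # Assumption: translation is needed only when the cache misses.
        lookups_per_instr = CACHE_ACCESSES_PER_INSTR * cache_miss_ratio
    return lookups_per_instr * TLB_MISS_RATE * TLB_MISS_PENALTY


# Example with the miss ratio of cache 1 (2.9%).
physical = tlb_stall_cpi(0.029, physically_addressed=True)
virtual = tlb_stall_cpi(0.029, physically_addressed=False)
print(f"physically addressed: {physical:.4f} cycles/instr")
print(f"virtually addressed:  {virtual:.4f} cycles/instr")
```

Under these assumptions the TLB cost for the virtually addressed cache is scaled down by the cache miss ratio, which is why it is much smaller.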

This problem was adapted from: "Exercises for Computer Architecture", Anders Ardo, Lund University