Assume a two-level memory hierarchy, a cache and a main memory, connected by a 32-bit wide bus. A cache hit takes one clock cycle. On a cache miss an entire block must be replaced. This is done by sending the address (32 bits) to the memory, which needs 4 clock cycles before it can send the block back over the bus. Every bus transfer requires one clock cycle. The processor must wait until the entire block is in the cache. The following table shows the average miss ratio for different block sizes:
Which block size results in the best average memory-access time?
Since a processor is clocked in discrete steps, and a hit takes one clock cycle regardless of block size, it is enough to count the extra cost incurred on a miss. The miss cost is stated here as a number of clock cycles.
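Miss cost tm(B) = M(B)/100 * (1 + 4 + B/4), where M(B) is the miss ratio in percent for a block of B bytes. The terms in the parentheses are 1 cycle to send the address, 4 cycles of memory latency, and B/4 bus transfers of 4 bytes each. This gives us
tm(32) = 0.016 * (5 + 32/4) = 0.208 ≈ 0.2 as the smallest value --> Bopt = 32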
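
The calculation can be checked with a small Python sketch. The function and constant names are hypothetical, chosen only for illustration, and the block size is assumed to be given in bytes. The miss-ratio table is not reproduced here, so the example uses only the B = 32 entry (1.6 %) that appears in the solution; best_block_size expects the full table to be supplied by the reader.

```python
BUS_BYTES = 4        # a 32-bit bus moves 4 bytes per transfer
ADDRESS_CYCLES = 1   # one bus transfer to send the 32-bit address
MEMORY_CYCLES = 4    # memory latency before the block can be sent


def miss_penalty(block_bytes):
    """Cycles the processor waits on a miss for a block of block_bytes."""
    # One clock cycle per bus transfer, rounded up to whole transfers.
    transfers = (block_bytes + BUS_BYTES - 1) // BUS_BYTES
    return ADDRESS_CYCLES + MEMORY_CYCLES + transfers


def miss_cost(miss_ratio_percent, block_bytes):
    """Average extra cycles per access: miss ratio times miss penalty."""
    return miss_ratio_percent / 100 * miss_penalty(block_bytes)


def best_block_size(miss_ratios):
    """Return the block size (bytes) with the lowest miss cost.

    miss_ratios maps block size in bytes to miss ratio in percent.
    """
    return min(miss_ratios, key=lambda b: miss_cost(miss_ratios[b], b))


print(miss_penalty(32))              # 13
print(round(miss_cost(1.6, 32), 3))  # 0.208
```

Adding the one-cycle hit time to the miss cost gives the average memory-access time, so for B = 32 it is about 1.2 clock cycles.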