This problem assumes that we are using a processor with a 15-stage pipeline and that the actual direction (T or NT) for a branch is determined during the 10th stage. The processor is executing a program in which 10% of executed instructions are conditional branches, and the only stalls occur when the direction of a conditional branch is mispredicted. The accuracies of different predictors for this program are as follows:
Assume that the program transformation replaces a conditional branch instruction with two ADD instructions. Remember to not only compute speedups for each predictor, but also to discuss the relationship between the speedup and the original accuracy of the predictor?
Now we still remove half the misprediction penalty, but we execute more instructions (one additional instruction for each removed branch). Since the number of instructions executed in the program is different for the non-transformed and the transformed program, we cannot simply compare CPIs to compute speedup. Instead, we compare execution times: If before the transformation the program executes N instructions, after the transformation the number of instructions will be 1.05N (10% of the original instructions are branches, half of those are transformed, and each transformed instruction becomes two instructions). If in the original program we executed 0.1N branch instructions, in the transformed program we execute 0.05*N branch instructions (half of the original branches are no longer branches). Since the clock cycle time did not change, we have:
where A is the accuracy of the branch predictor. For each predictor, we have:
Always-taken (A=0.45): Speedup is 1.152
Always-not-taken (A=0.55):Speedup is 1.122
2-bit (A=0.9): Speedup is 0.995
Global (A=0.95): Speedup is 0.974
Tournament (A=0.98): Speedup is 0.961
The speedup from this transformation gets worse when the predictor gets better and, in fact, the transformation is harmful when used together with one of the more accurate predictors. This is because the transformation creates a fixed penalty for each branch it replaces, but helps only if that replaced branch was mispredicted. In effect, the benefit from replacing a mispredicted branch is 8 cycles (9 saved by eliminating a misprediction, 1 lost by adding an extra instruction) but the penalty from replacing a correctly predicted branch is 1 cycle (from the extra instruction).Therefore, the transformation is a wash (no gain, no loss) when A=88.889% ( 8 correctly predicted branches for every incorrectly predicted one), is beneficial when A<88.889% and if harmful when A>88.889%.