Problem Set Solutions/Branches/Problem 3

Consider the following MIPS code. Note that R0 (register 0) in the MIPS ISA always has a value of zero. Also note that MIPS instructions always start at word-aligned addresses, so the two least significant bits of the instruction’s address are always zero. As a result, the two least significant bits are never used to index branch predictor tables. Instead, these bits are ignored and branch predictor table is indexed using the least significant bits of the instruction address that remain after the two least significant bits are removed.

0x110       li   R2,0           ; v=0
0x114       li   R3,16      ; Loop bound for LoopI
0x118       li   R4,0           ; i=0
    LoopI:
0x11C       beq  R4,R3,EndLoopI ; Exit LoopI if i==16
0x120       li   R5,0           ; j=0
    LoopJ:
0x124       beq  R5,R3,EndLoopJ ; Exit LoopJ if j==16
0x128       add  R6,R5,R4       ; j+i
0x12C       andi R6,R6,3        ; (j+i)%4
0x130       bne  R6,R0,EndIf    ; Skip if (j+i)%4!=0
0x134       add R2,R2,R5        ; v+=j
    EndIf:
0x138       addi R5,R5,1        ; j++
0x13C       beq  R0,R0,LoopJ    ; Go back to LoopJ
EndLoopJ:
0x140       addi R4,R4,1        ; i++
0x144       beq     R0,R0,LoopI ; Go back to LoopI
EndLoopI:

For this code, determine the exact number of branch mispredictions that occur during its execution (show your work), if we use A local history predictor with 8 entries. Each entry has a 2-bit local history and four 2-bit counters. All history bits and 2-bit counters start out with zero values. For each 2-bit counter use the 0-1-2-3/3-2-1-0 counting method.

Solution:

Only 3 of the 8 entries are used, with branches 0x11C and 0x13C sharing entry 111, branch 0x130 using entry 100, and branches 0x124 and 0x144 using entry 001.

The branch pattern observed by predictor entry 111 is 16 repetitions of {NT (from 0x11C), 16 times T (from 0x13C)} followed by one {T (from 0x11C)}. The history starts out as 00 and all counters are 0, so the first NT is predicted correctly. Then the first T uses history of 00, finds counter at 0, and predicts NT (incorrect). The counter for history 00 is incremented to 1, and the history becomes 01. The next T outcome is predicted with history 01, whose counter is also 0 and predicts NT (incorrect). The predictor for history 01 is incremented to 1, and the history becomes 11. The next (3rd of 16) T outcome is predicted using history 11, whose counter is also 0 (NT, incorrect). The counter for history 11 is then incremented to 1 and the history remains 11. The next (4th of 16) T outcome finds the history at 11 and its counter at 1, mispredicts, increments the counter to 2, and leaves history at 11. The remaining 12 T outcomes will be predicted correctly, will leave the counter for 11 history at 3, and the history at 11. At the end of this first iteration of the outer loop, we have had 5 mispredictions, the history is 11, and the four counters are: counter for history 00 is 1, counter for history 01 is 1, counter for history 10 is still 0, and counter for history 11 is 3. In the next iteration of the outer loop, history 11 and counter value 3 result in mispredicting T for the first outcome (which is actually NT, from branch 0x11C). The counter for history 11 is decremented to 2, and the history becomes 10. Then history 10 and its counter value (which is still 0) are used to predict NT, which is incorrect, that counter is incremented to 1, and the history becomes 01. Next, history 01 and its counter value of 1 result in predicting NT (incorrect again), that counter becomes 2, and history becomes 11. The remaining 14 T outcomes in this second iteration of the outer loop are predicted correctly. At the end of the second iteration of the outer loop, we have had 8 mispredictions (5 from the first iteration, 3 from the second), the history is at 11, and the counters for histories 00, 01, 10, and 11 are 1, 2, 1, and 3. In the third iteration of the outer loop, the first outcome (which should be NT) is again mispredicted using history 11 and counter value 3. This outcome will be mispredicted in every iteration because the numerous T outcomes train the counter for 11 history to predict T. The history then becomes 10, so its counter value of 1 is used and mispredicts the next outcome (which should be T), and the counter for 10 history becomes 2. From now on, this first T outcome will always be predicted correctly, because the counter for history 10 will start predicting T. The history becomes 01, counter value 2 is used and correctly predicts the (T) outcome. This outcome will also always be correctly predicted from now on because the counter for history 01 will remain at value 3 and correctly predict T. The history then becomes 11 and the remaining outcomes in this iteration of the outer loop are predicted correctly. Overall, in the first three iterations of the outer loop we have had 10 mispredictions (8 from first two iterations, two new mispredictions in iteration 3). Each of the remaining 13 iterations will bring one misprediction (for the first branch in the iteration, which uses history 11 and counter value 3 to incorrectly predict T). The final occurrence of branch 0x11C will use history of 11 and be predicted correctly (as T). Overall, we have a total of 22 (4+3+2+13*1) mispredictions from branches 0x11C and 0x13C.

The pattern observed by predictor entry 001 (for branches 0x124 and 0x144) is 16 repetitions of {16 times NT (from 0x124), then 2 times T (once from 0x124, one from 0x144)}. In the first iteration of the outer loop, the first 16 outcomes are predicted correctly with history 00 using counter value 0. The first T outcome also uses history 00 and counter value 0 and is mispredicted. The counter for history 00 is incremented to 1, and the history becomes 01. The last outcome in this iteration of the outer loop is then mispredicted because history 01 finds counter value 0. This counter is then incremented to 1 and the history becomes 11. At the end of the first iteration of the outer loop, we have had a total of 2 mispredictions, the history is 11, and counter values for histories 00, 01, 10, and 11 are 1, 1, 0, and 0, respectively. In the second iteration of the outer loop, history 11 and counter value of 0 correctly predict the NT outcome, then history 10 and counter value 0 correctly predict another NT, then history 00 and counter value of 1 correctly predict a third NT, the counter is decremented back to 0, and the remaining 13 NT outcomes are all predicted correctly using history 00 and counter value 0. In all subsequent iterations of the outer loop, all NT outcomes will be predicted correctly and counter values for histories 00, 11, and 10 will all indicate NT. The first T outcome in this iteration finds history 00 and counter value 0, so it is again mispredicted. This T outcome from branch 0x124 will always be mispredicted because it uses history 00 which is trained by the preceding NT outcomes to predict NT. Finally, the last branch (0x144) in this iteration of the outer loop uses history of 01 and counter value 1, so it is mispredicted again, but the counter value is then incremented to 2 and will result in correct predictions of this branch from now on. Overall, branches 0x124 and 0x144 result in a total of 18 mispredictions (2+2+14*1).

Branch 0x130, a shown for part b), has a pattern of 4X{ 4X{NT,T,T,T}, 4X{T,T,T,NT}, 4X{T,T,NT,T}, 4X{T,NT,T,T}} In the first iteration of the outer loop, the pattern is 4X{NT,T,T,T}, so the predictions and predictor entry updates are:

table1.jpg

The number of mispredictions in this first iteration of the outer loop is 9.

For the second iteration of the outer loop, the pattern is 4X{T,T,T,NT} and we have:

table2.jpg

The number of mispredictions in this 2nd iteration of the outer loop is 5.

For the third iteration of the outer loop, the pattern is 4X{T,T,NT,T} and we have:

table3.jpg

The number of mispredictions in this 3rd iteration of the outer loop is 7.

For the fourth iteration of the outer loop, the pattern is 4X{T,NT,T,T} and we have:

table4.jpg

The number of mispredictions in this 4th iteration of the outer loop is 3.

For the fifth iteration of the outer loop, the pattern is 4X{NT,T,T,T} (the same as in the first iteration) and we have:

table5.jpg

The number of mispredictions in this 5th iteration of the outer loop is 4.

Note that the state of the entire predictor at the end of the fifth iteration is the same as it was at the end of the first. Since the branch outcomes in 6th and 2nd iteration are also the same, from now on we will have the same pattern of predictions and mispredictions in iterations 6, 7, 8, and 9 that we had in iterations 2, 3, 4, and 5, then the same pattern will repeat in iterations 10, 11, 12, and 13, and finally iterations 14, 15, and 16 will have the same pattern as 2, 3, and 4. Overall, the number of mispredictions from branch 0x130 is 81 (9+3*(5+7+3+4)+5+7+3).

In summary, the number of mispredictions when this code is executed with this 8-entry local predictor with 2-bit history is 121 (22+18+81).