hpca/f2014 projects/project0/q and a

This page contains questions posted by students. The answers have been supplied by students and/or instructors.

Q1: One of the questions for project 0 part 2 says "Why is the branch predictor so much less accurate here than it was on crafty?"

I haven't seen any other reference to "crafty" - does this just mean the first part of project 0?

Answer1: It should be "... than it was for the lu benchmark?". Originally I used SPEC for this project but decided to go with Splash because of potential licensing issues.

Q2: Looking at the report.pl output, I can guess most of the acronyms in the report, but some are not familiar. Is there a description of the output fields somewhere?

Answer2: See the SESC documentation, which can be found at: http://iacoma.cs.uiuc.edu/~paulsack/sescdoc/

Q3: I'm running into a problem trying to follow the tutorial for project 0.

Running the following line:

~/sesc/sesc.opt ­c ~/sesc/confs/cmp16­noc.conf ­olu.out ­elr.err lu.mipseb ­n32 ­p 1

gives me the following error:

Config:: Impossible to open the file [sesc.conf]

Perhaps I'm doing something wrong, but I'm wondering if anyone else has encountered the same issue. I tried using find to locate this file, but it cannot be found. I'm going to look through the sesc documentation tomorrow to attempt to resolve, but in the meantime I would appreciate any insight people have to share.

Answer3: Did you execute the command exactly as you have it posted? From what I can see, you've left the "-" character off each option. Typically, that is how a program identifies what kind of argument it's dealing with.

~/sesc/sesc.opt -c ~/sesc/confs/cmp16-noc.conf -olu.out -elr.err lu.mipseb -n32 -p1

Honestly, I don't know why the dashes got stripped out of my post, but I copied directly from the PDF, which appears to be correct. I'll try again this evening in case that is the source of the error. Thanks for helping me identify what most probably is a stupid mistake on my part.

Answer3: I don't know if the dashes were the only problem, but I figured it doesn't hurt to tell you this just in case:

Sometimes pasting from a PDF does not work well because the pasting into a word processor and/or "printing" into a PDF can change the encoding for some characters. For example, dash characters in a PDF can have a different character code than the one Linux expects - there are a number of dash characters in most fonts, and they vary mostly in exactly how long the dash is. Also, lowercase F followed by a lowercase I is often converted into a single "ligature" character (with a separate character code), etc.
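To check a pasted command for this kind of problem, something like the following Python sketch may help. It is only an illustration: the function names (find_suspicious, normalize_command) and the set of dash look-alikes are assumptions made for this example, and the set is not exhaustive. Note that it cannot restore a dash that was lost entirely during pasting; in that case the option has to be retyped by hand.

```python
import unicodedata

# Dash look-alikes that PDFs and word processors often substitute for the
# ASCII hyphen-minus. Illustrative list only, not exhaustive.
DASH_LIKE = {
    "\u2010",  # hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2212",  # minus sign
}
SOFT_HYPHEN = "\u00ad"  # usually invisible; it is not a real dash


def find_suspicious(text):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [(i, ch, unicodedata.name(ch, "UNKNOWN"))
            for i, ch in enumerate(text) if ord(ch) > 127]


def normalize_command(text):
    """Expand ligatures, map dash look-alikes to '-', and drop soft hyphens."""
    # NFKC compatibility normalization turns ligatures such as U+FB01 into "fi".
    text = unicodedata.normalize("NFKC", text)
    out = []
    for ch in text:
        if ch in DASH_LIKE:
            out.append("-")
        elif ch != SOFT_HYPHEN:
            out.append(ch)
    return "".join(out)


# Hypothetical pasted text containing an en dash and a non-breaking hyphen.
pasted = "sesc.opt \u2013c cmp16\u2011noc.conf"
print(find_suspicious(pasted))
print(normalize_command(pasted))
```

Anything reported by find_suspicious is worth retyping by hand before running the command.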

Q4: Also, can anyone who has finished give insight into how long it took, or an estimate of how long it might take? I'm trying to figure out my time commitments.

Answer4: Assuming you have a decent computer, you may need about 2-3 hours to run the simulation for Part A (after all, it is a 512x512 matrix).

Part B compiles in a matter of a few minutes, if not seconds.

Answer4: My Part 1 ran for about 57 minutes.

Q5: So I downloaded Oracle VirtualBox as well as the project 0 simulator and files. When I try to run the image in virtualbox I do not see UD233 Project but instead HPCA Ubuntu Image (The tutorial states that I need to select UD233 Project). Does that matter or am I missing a step somewhere?

Answer5: That's fine. Mine also shows HPCA Ubuntu Image.

Q6: The second question of Part B is "How many instructions are executed between consecutive branch mispredictions?". Is this a simple number that is read off of the report and I'm just not seeing it (or understanding what I'm seeing), or is there something that needs to be calculated?

Answer6: The question is asking how many instructions (on average) we execute between consecutive branch mispredictions. For example, if the program executes 1000 instructions and has 3 branch mispredictions (note that a misprediction is followed by a correct execution of the same branch), then we have 1000 / 3 = 333.33 instructions between branch mispredictions.
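The arithmetic can be sketched in a few lines of Python. The function name is made up for this example, and the counts are the illustrative ones from the answer, not values from an actual report:

```python
def instructions_per_misprediction(instructions, mispredictions):
    """Average number of instructions executed per branch misprediction."""
    if mispredictions == 0:
        # No mispredictions at all: the distance is unbounded.
        return float("inf")
    return instructions / mispredictions


# Example counts from the answer above: 1000 instructions, 3 mispredictions.
print(round(instructions_per_misprediction(1000, 3), 2))  # prints 333.33
```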

Q7: The SHA256 checksum for HPCA+Ubuntu+Image.ova is 9db449ab125a4905804e37cfbe62370b95b393665a60a798573dda61554cf986.

Has anyone been able to install successfully on Windows 7 64 bit? Do you have the same checksum for the HPCA.ova file? Any other suggestions?

Answer7: I was only able to install on Windows 7 without virus protection. I don't know what's going on there, but I need virus protection on all my Windows computers. What you can do is install Ubuntu alongside Windows and run VirtualBox from Ubuntu, so you are running Ubuntu in Ubuntu. It sounds weird, but it works.

A student added: Try using the older version of VirtualBox (4.3.12 - https://www.virtualbox.org/wiki/Download_Old_Builds_4_3). I took the Computer Networks course over the summer, which also used a VirtualBox VM, and when I upgraded from 4.3.12 to 4.3.14 it stopped working.

Q8: From the tutorial: "You should try to compile and run your code natively, to see if it's working: gcc -o hello hello.c". I don't understand why this command has to be run every time the text file is saved.

Answer8: This is the command to compile hello.c into an executable file. If you make any changes to the hello.c file, as the tutorial suggests by adding the '!', you must recompile for the changes to take effect.

Answer8: Any changes in the text file (the .c file) will result in a change in the object file. So you have to run the compiler (gcc) after any changes you have made to the .c file; otherwise your final output will not reflect the changes you have made.

Q9: Project 0, Part 2, H) hint 1 ("you can use… objdump"): I'm not sure how to use the command from that hint. I placed it in the line to compile the hello file, and got an error.

At the command prompt, I typed: /mipsroot/cross-tools/bin/mips-unknown-linux-gnu-objdump -d -o hello.mipseb hello.c

and got 'no such file or directory', but the 'objdump' file is in the directory. Could someone point me in the right direction (on how to use objdump)?

Answer9: Remove 'hello.c' from the command line. Also, I don't think there is an '-o' option.

Hint: the output is long and will scroll over multiple pages on your screen, so it is better to redirect it to a file and read the file instead.

Piping to 'less' is also a good approach. You can search with '/'.

Answer9: You can also redirect the output (> objdump.txt) to get it into a file, which is easier to work with than output on the screen. This way you can save it and study it in greater detail.

Q10: Project 0, Part 2: When I add the exclamation mark to the print statement, the number of instructions executed on the target machine goes down. Is this usual behaviour, or is there an anomaly in the simulation?

Answer10: It CAN happen that the number of instructions is reduced a bit even though the string is longer. To figure out how and when it can happen, one needs to look at the object code (e.g. using mips...objdump). The compiler converts our printf to puts (because there is no string formatting going on), and puts first calls strlen and then does a lot of things. But you can get some idea about the longer-string-fewer-instructions behavior if you just look at the object code for strlen. And it's a nice way to learn and appreciate some of the neat low-level optimizations for string processing :)

Q11: As I am going through the questions in Project 0, I got a little confused about which result in the output I should be looking at to answer this question. Possibly it needs to be calculated?

Answer11: The large "number of cycles" number is followed by the percentages of those cycles that are spent on various aspects of the execution. For example, Busy is the percentage of cycles that the processor actually spends executing useful instructions, MisBr is the percentage of cycles spent executing instructions that get quashed due to branch mispredictions, etc.

Q12: I feel like I understand most of what is going on in project0, but when I put some of the numbers together, I'm surprised, so I wanted to verify my understanding.

My BPred is 94.37%, and BJ is 9.55%. Wouldn't this mean that 5.63% of all instructions are mispredictions? And 5.63/9.55 = 59.0% of branches mispredict? That is worse than random chance!

Am I missing something?


Answer12: The BPred statistic is just for branches, so 5.63% of branches (not of all instructions) are mispredicted. The simulator models relatively fancy processors, so it uses predecoding to tell it which instructions are branches.
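As a rough check of the numbers in the question, here is a small Python sketch. It assumes that BJ is the fraction of all instructions that are branches or jumps and that BPred is the prediction accuracy over those branches only; the function name is made up for this example.

```python
def mispredictions_per_instruction(bpred_accuracy, branch_fraction):
    """Fraction of ALL instructions that are mispredicted branches.

    bpred_accuracy:  fraction of branches predicted correctly (BPred).
    branch_fraction: fraction of instructions that are branches (BJ, assumed).
    """
    return (1.0 - bpred_accuracy) * branch_fraction


# Values from the question: BPred = 94.37%, BJ = 9.55%.
rate = mispredictions_per_instruction(0.9437, 0.0955)
print(f"{rate:.4%} of all instructions are mispredicted branches")
print(f"about one misprediction every {round(1 / rate)} instructions")
```

Under these assumptions, the 5.63% misprediction rate applies only to the 9.55% of instructions that are branches, so the two percentages are multiplied rather than divided.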

Q13: Question C of Project 0 states: C) How much of the processor's performance potential was wasted because of branch mispredictions (MisBr)? Why is this number so high, especially if you take into account the predictor's accuracy?

I saw a MisBr percentage of 8.0%, and my answer to the second question is that this is high because even with a 94.37% successful branch prediction strategy, mispredictions incur a penalty that depends on the length of the pipeline and the stage at which an instruction is decoded and recognized as a branch. The problem is that I haven't been able to determine either of those values. Looking at ~/sesc/confs/cmp16-noc.conf, I found two possibly relevant values: instQueueSize (set to 32) and decodeDelay (set to 2). The user-guide-level documentation shed some light on decodeDelay, as did digging through the sesc source, so I ran a benchmark where I set decodeDelay to 6 and saw MisBr climb to 11.7%, but I want to be sure my understanding is correct. Is instQueueSize the length of the pipeline and decodeDelay the stage in the pipeline where decode occurs? If so, does this mean that the default configuration is a 32-stage pipeline with decode occurring at stage 2?

Answer13: The instQueueSize is not the depth of the pipeline; it's the size of the instruction queue (where the instructions wait to be issued). The decodeDelay, renameDelay, wakeupDelay, regFileDelay, iBJLat, etc. specify how many stages we have in various parts of the pipeline. The stage in which branches are resolved would be the sum of all the delays up to and including iBJLat. But note that this is a pretty fancy out-of-order processor, which reorders instructions (explained in lessons on ILP, Instruction Scheduling, and Reorder Buffer) and has separate sub-pipelines for different kinds of instructions. So the "stage in which branches are resolved" is not a straightforward thing to define. However, all instructions go through the first few steps of execution (decode, rename, etc.), so if you make decodeDelay longer, all instructions see a longer pipeline, and changes in decodeDelay give you effects similar to what you'd expect from changing the number of stages in a simple pipeline. For example, adding 4 stages to decodeDelay will be similar to making a simple pipeline four stages longer, and thus you'd get a bigger impact from branch mispredictions.
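The "sum of all the delays up to and including iBJLat" idea can be sketched in Python as below. This is only a simplified illustration that ignores the out-of-order effects mentioned above. Only decodeDelay = 2 comes from the question; every other delay value is a hypothetical placeholder, not a value read from cmp16-noc.conf, and the function name is made up for this example.

```python
# Delays in the order listed in the answer. Only decodeDelay = 2 comes from
# the question; all other values are HYPOTHETICAL placeholders.
DELAYS = [
    ("decodeDelay", 2),
    ("renameDelay", 1),   # hypothetical
    ("wakeupDelay", 1),   # hypothetical
    ("regFileDelay", 1),  # hypothetical
    ("iBJLat", 1),        # hypothetical
]


def branch_resolution_depth(delays, resolve_at="iBJLat"):
    """Sum the delays up to and including the one where branches resolve."""
    total = 0
    for name, cycles in delays:
        total += cycles
        if name == resolve_at:
            return total
    raise KeyError(resolve_at)


def with_delay(delays, name, value):
    """Return a copy of the delay list with one entry changed."""
    return [(n, value if n == name else c) for n, c in delays]


base = branch_resolution_depth(DELAYS)
longer = branch_resolution_depth(with_delay(DELAYS, "decodeDelay", 6))
print(base, longer, longer - base)  # the difference is the 4 added stages
```

Whatever the real values of the other delays are, raising decodeDelay from 2 to 6 adds 4 to the sum, which matches the "four stages longer" intuition in the answer.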

Q14: Is it normal that the number of instructions could be lower in Part G than in Part F? I imagined that the CPU's history counter was not being reset between simulations, so that after running Part F and then shortly afterwards running Part G, the history from F was affecting G, causing fewer mispredictions and so fewer instructions. But my Part G was less accurate, 44% (G) versus 45% (F), and had more mispredictions, 96.8 (G) vs 96.6 (F), yet still came out with fewer instructions: 4860 (G) vs 5144 (F).

Answer14: It can happen, but not for the "saved state" reasons. Each simulation starts afresh, without remembering branch history from the previous one. The real reason for changes in the instruction count has to do with the code, and you need to figure that out to answer the project questions. As I already hinted at, you can take a look at the code for strlen, it is one of the things that gets executed when we do that printf and it will help you figure out how instruction count can change as the string changes (and how it can get smaller for a longer string!).
