
CS259 Introduction


1. Overview of Course

My name is Andreas Zeller. I'm a researcher at Saarland University in Germany.

And I am researching large programs and why they fail.

I have done a fair amount of work in automatic debugging and in mining the histories of programs. I've worked with companies such as Microsoft, SAP, and Google, examining their bugs and finding out what was wrong, and it struck me that there's almost no teaching material available on debugging and how to debug programs.

So today, I'm going to start a course with you on how to do debugging systematically, effectively, and in many cases, even automatically. Enjoy.

Welcome to the Udacity course on debugging.

The aim of this course is to teach a systematic approach to debugging and we're even going to explore a number of automatic tools that do the debugging for you.

We're going to explore how debuggers work and, in particular, the scientific method of debugging, by which, through a series of experiments, we gradually refine a hypothesis until we end up with a diagnosis of why the program failed. On top of that, we're going to build our own interactive debugger in Python.

In the next unit, I'm going to introduce you to one of the most powerful debugging tools ever invented, that is, assertions. Assertions are statements in the program that automatically check whether the state of the program is still correct. That is, while your program is executing, the computer constantly checks whether a bug has occurred. This allows you to very quickly and effectively find out where a bug was first introduced. On top of that, we're going to build a tool that lets you infer assertions from executions.
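To make the idea of assertions concrete, here is a minimal sketch in Python. The function `square_root` and its tolerance `eps` are hypothetical names chosen for illustration; they are not taken from the course material. The first assertion checks the input before the computation, the second checks the result after it.

```python
def square_root(x, eps=1e-9):
    """Approximate the square root of x by Newton's method (illustrative only)."""
    # Precondition: fail right here if a caller passes a bad argument,
    # instead of producing a wrong result somewhere further down the line.
    assert x >= 0, "precondition violated: x must be non-negative"

    tolerance = eps * max(x, 1.0)
    guess = max(x, 1.0)
    while abs(guess * guess - x) > tolerance:
        guess = (guess + x / guess) / 2

    # Postcondition: the result, squared, must be close to the input.
    assert abs(guess * guess - x) <= tolerance, "postcondition violated"
    return guess
```

Calling `square_root(-1)` stops immediately with an `AssertionError` at the precondition, which points to the caller as the place to look. Note that Python skips `assert` statements when run with the `-O` option, so they should check for bugs, not handle expected errors.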

In unit 3, I'm going to show you a technique named delta debugging, which automatically simplifies problems. For instance, here's a 900-line HTML file which causes a crash in a program that processes it. With delta debugging, you can reduce this to just the eight characters that produce the bug just as well, and all of this automatically.
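As a rough preview, the simplification can be sketched as follows. This is a simplified variant of delta debugging that only tries removing chunks of the input; the names `ddmin` and `fails` are assumptions for this sketch, and the `fails` predicate below is a stand-in for running the real program and checking whether it crashes.

```python
def ddmin(data, fails):
    """Shrink `data` while `fails(data)` stays True (simplified sketch)."""
    assert fails(data), "the original input must trigger the failure"
    n = 2  # number of chunks the input is currently split into
    while len(data) >= 2:
        chunk = len(data) // n
        reduced = False
        start = 0
        while start < len(data):
            # Try the input with one chunk removed.
            candidate = data[:start] + data[start + chunk:]
            if fails(candidate):
                data = candidate       # keep the smaller failing input
                n = max(n - 1, 2)
                reduced = True
                break
            start += chunk
        if not reduced:
            if n == len(data):         # already tried removing single characters
                break
            n = min(n * 2, len(data))  # try again with smaller chunks
    return data


def fails(html):
    # Hypothetical stand-in for "the program crashes on this input".
    return "<SELECT>" in html


page = "<html><body><SELECT>foo</SELECT></body></html>"
print(ddmin(page, fails))  # prints <SELECT>
```

The result is minimal in the sense that removing any single character makes the failure go away.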

In the next unit, I'm going to show you how to find out where a specific failure came from. You see an execution as a series of states. We are going to explore techniques that help you in tracking the path of an error all through the program execution. And on top of that, we're going to build a tool that isolates such cause-effect chains automatically.

In unit 5, we'll be looking at reproducing failures. We're going to look at all the various input sources for your program and discuss how to capture and replay them such that you can faithfully reproduce a failure that happens in the field. Plus, we're going to explore statistical debugging, which collects data from the field to tell you which parts of your program are most likely to be related to the failure.
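The statistical idea can be illustrated with a small sketch. It assumes that each recorded run is a pair of the set of executed line numbers and a flag saying whether the run failed; the function name `rank_suspicious_lines` and the simple score (the fraction of runs covering a line that failed) are assumptions made here for illustration, not necessarily the measure used later in the course.

```python
from collections import Counter


def rank_suspicious_lines(runs):
    """Rank lines by the fraction of runs covering them that failed.

    `runs` is a list of (covered_lines, failed) pairs.
    """
    fail_hits = Counter()
    pass_hits = Counter()
    for covered, failed in runs:
        for line in set(covered):
            if failed:
                fail_hits[line] += 1
            else:
                pass_hits[line] += 1

    scores = {}
    for line in set(fail_hits) | set(pass_hits):
        scores[line] = fail_hits[line] / (fail_hits[line] + pass_hits[line])

    # Most suspicious first; ties are broken by line number.
    return sorted(scores, key=lambda line: (-scores[line], line))


# Made-up coverage data: line 4 is executed only in failing runs.
runs = [({1, 2, 3}, False), ({1, 2, 4}, True), ({1, 4}, True)]
print(rank_suspicious_lines(runs))  # prints [4, 1, 2, 3]
```

A line that shows up only in failing runs rises to the top of the list, which is where you would start looking.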

In unit 6, we're going to see how to mine information from bug databases and change databases in order to figure out where bugs have been in your program in the past, where they accumulate, and which parts of your program are therefore going to be the most bug prone in the future. And again, this is a fully automatic technique.

So far, you may have had no fun in debugging because it just sucks the life out of you. The aim of this course is to get most of the debugging effort off your shoulders, because you can have the computer take care of most of the debugging work. So your mind is free for doing something more creative than debugging.

2. Late Night Debugging

So, here you are, sitting in front of your interactive debugger, stepping through the program. You're looking at the variables, stepping, and stepping, and stepping, setting up breakpoints, trying to figure out what's going on, with no end in sight. You sit in front of your screen. You dig through the code step by step by step by step. And you watch individual variables appear and disappear. Time passes.

It's late, so you order a pizza. Maybe a cup of coffee. You keep on watching the screen, have some more coffee, and just keep on stepping and stepping through your program as time goes by. It's pretty late already and you keep on concentrating, stepping, stepping, stepping. You are so close now. That's when the phone rings. Ring. Ring. Ring.

Who's on the phone? It's your significant other asking you, "When on earth are you planning to come home?" You are in a lose-lose situation. You either lose your concentration or you lose your significant other. You have 30 seconds left until you lose all the concentration you have so carefully built up. You have 30 seconds to decide what to do.

I know a fair number of programmers who, when faced with such a problem, have taken the only right decision. And I think that debugging is the number one grounds for divorce among programmers.

3. Choices

So, here comes the quiz. What can you do to avoid such a situation? Should you drink more coffee, give up smoking, stop taking phone calls, not write bugs in the first place, get a better debugger, or take the Udacity course on debugging? Hint: the correct answer is marked with a circle.

4. Cost of Bugs

The problem with debugging being time consuming actually translates into money and effort on a large scale. A commonly cited figure in the literature is that in any software project, at least 50% of the effort is spent on testing and debugging. This number can even go up to 75%.

According to a study made in 2002, software bugs are costing the US economy alone $59.5 billion a year, and improvements in testing and debugging could reduce this cost of software bugs by a third, or $22 billion a year.

The main problem with debugging, however, is not that it takes time. The worst thing is that it is a search process whose length is unpredictable. It can take anything from a few minutes to a few hours, and sometimes even days or weeks. But even if you never know how much time a bug will take, it's a bit of a blessing to use a systematic process which gradually gets you towards its cause.

5. The First Bug in History

Here is a bonus. Where does the term "bug" come from? There's a story of a Harvard Mark II machine in which on September 9, 1947, a moth got stuck in the relay. The moth got carbonized and then caused a short circuit in the machine, which caused the machine to break. Technicians retrieved the moth from the relay and this was then recorded as the first bug actually being found in a computer. The bug is now on display at the Smithsonian in Washington.

The term "bug", as you can see, was already known in 1947, and it's actually much older than that. You can even trace it back to Shakespeare, who used the term to describe some form of specter sitting on the chest of people in the night and causing them nightmares.