**These are draft notes extracted from subtitles. Feel free to improve them. Contributions are most welcome. Thank you!**

**Please check the wiki guide for some tips on wiki editing.**

Contents

- 1 11. Programming Bayes Rule (Optional)
- 1.1 01 Printing Number
- 1.2 02 Printing Number Solution
- 1.3 03 Functions
- 1.4 04 Functions Solution
- 1.5 05 Complement
- 1.6 06 Complement Solution
- 1.7 07 Two Flips
- 1.8 08 Two Flips Solution
- 1.9 09 Three Flips
- 1.10 10 Three Flips Solution
- 1.11 11 Flip Two Coins
- 1.12 12 Flip Two Coins Solution
- 1.13 13 Flip One Of Two
- 1.14 14 Flip One Of Two Solution
- 1.15 15 Program Flipping
- 1.16 16 Program Flipping Solution
- 1.17 17 Cancer Example 1
- 1.18 18 Cancer Example 1 Solution
- 1.19 19 Calculate Total
- 1.20 20 Calculate Total Solution
- 1.21 21 Cancer Example 2
- 1.22 22 Cancer Example 2 Solution
- 1.23 23 Program Bayes Rule
- 1.24 24 Program Bayes Rule Solution
- 1.25 25 Program Bayes Rule 2
- 1.26 26 Program Bayes Rule 2 Solution
- 1.27 27 Conclusion

So this unit is optional, but I find it very empowering to program what we've learned on the computer, because once you've programmed it, you can apply it forever. So, let me show you what I mean. Here's our programming environment again, and I've given you a very, very simple program. You can process it; it's called print 0.3, and in the next window, I just want you to hit the run button to see what happens.

And not surprisingly, when you hit run, you should get 0.3 in the output window.

```
print(0.3)
```

We will now give a slightly more complicated problem of the form we're going to be using, and I hope you won't be too confused. So, this is it. There are two parts to it. There's the print command as before, but rather than printing 0.3 directly, we're going to print a function that computes something from 0.3. Right now it's the identity: it's going to print out exactly the same value, but we're doing this to set ourselves up to print something else. And here is how f is defined: we define f, with the argument p that will be set to 0.3, to be just a function that returns exactly the same value. Why do we do this? Well, to practice programming. So, hit the run button and see what happens.

```
# Write a function to return p as described in the video
def f(p):
    # Insert your code here
    return p

print(f(0.3))
```

And once again, we get 0.3, and the reason is, as we go up, 0.3 is being funneled into the function f. f starts over here, and p is now 0.3. We return it, and the value is 0.3, straight from the input, and then the return value is being printed. Sounds complex? Well, from now on, all I want you to do is to modify what's inside this function.

```
# Write a function to return p as described in the video
def f(p):
    # Insert your code here
    return p

print(f(0.3))
```

So for the first exercise, say this is the probability of an event; let's print the probability of the inverse event. Let's make the function over here take p but return 1 - p. So please go ahead and modify this code so that the return value is 1 - p and not p.

```
# Return the probability of the inverse event (i.e. 1-p)
def f(p):
    # Insert your code here

print(f(0.3))
```

This modification just replaces p by 1 - p in the return statement, and when I run it, I get 0.7. The nice thing about our complementary probability is that we can now plug in a different value over here, say 0.1; with 0.1, I get as an output 0.9. So congratulations, you've implemented the very first example of probability, where the event probability is 0.1 and the complementary event, the negation of it, is encapsulated in this function over here.

```
# Return the probability of the inverse event (i.e. 1-p)
def f(p):
    # Insert your code here
    return 1 - p

print(f(0.3))
```

Here is my next quiz for you. Suppose we have a coin with probability of heads p. For example, p might be 0.5. You flip the coin twice, and I want to compute the probability that this coin comes up heads on both of these 2 flips; obviously, that's 0.5 times 0.5. But I want to do it in a way where I can use any arbitrary value for p, using the same style of code as before. So all you're going to do is modify the 1 - p into something that, if I give it a probability p, returns to me the probability of seeing heads twice with this coin, where p is the probability of heads.

```
# Given that the probability of one head is p, return the probability of
# two flips resulting in two heads
def f(p):
    # Insert your code here

print(f(0.1))
```

And here is one way to implement this: just return p * p, and for 0.5, it gives me 0.25. If I make this a loaded coin with a probability of heads of 0.1 and hit the run button, then the outcome is 0.01.

```
# Given that the probability of one head is p, return the probability of
# two flips resulting in two heads
def f(p):
    # Insert your code here
    return p * p

print(f(0.1))
```

So let's up the ante and say we have a coin that has a certain probability of coming up heads; again, it might be 0.5. Just like before, it will be an input to the function f, and now I'm going to flip the coin 3 times, and I want you to calculate the probability that heads comes up exactly once. Three is not a variable, so your code only has to work for 3 flips, not for 2 or 4; the only input variable is going to be the coin probability, 0.5. So please change this code to express that number.

```
# Return the probability of exactly one head in three flips
def f(p):
    # Insert your code here

print(f(0.5))
```

You might remember that for p = 0.5, if you go to the truth table, you'll find the answer is 0.375. If you set p to 0.8, the number actually goes down, to 0.096. So, you can check your implementation to see if you get exactly the same numbers. Here's my result. When you build the truth table, you'll find that exactly 3 possible outcomes have heads exactly once: H T T, T H T, and T T H. So, of the 8 possible outcomes of the coin flips, those 3 are the ones you want to count. Now, each has exactly the same probability: p for heads × (1 - p) × (1 - p). So, to get all 3 of them together, we just multiply this by 3. And this is how it looks in the source code: 3 × p × (1 - p) × (1 - p). If, for example, I give this the input 0.8, then I get 0.096 as an answer. But if you've never programmed before and you got this right, then congratulations! You might actually be a programmer. Obviously, if you've programmed before, this should be relatively straightforward, but it's fun to practice. Let's now go to a case of maybe 2 coins.

```
# Return the probability of exactly one head in three flips
def f(p):
    # Insert your code here
    return 3 * p * (1 - p) * (1 - p)

print(f(0.5))
```
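As a sanity check on the truth-table reasoning above, one can also enumerate all 8 three-flip sequences and sum the probabilities of those containing exactly one head. This sketch (the helper name `exactly_one_head` is ours, not part of the course code) should agree with the closed-form 3 × p × (1 - p) × (1 - p):

```python
from itertools import product

def exactly_one_head(p):
    # Enumerate all 2**3 flip sequences and add up the probability
    # of every sequence containing exactly one head.
    total = 0.0
    for flips in product("HT", repeat=3):
        if flips.count("H") == 1:
            prob = 1.0
            for outcome in flips:
                prob *= p if outcome == "H" else (1 - p)
            total += prob
    return total

print(round(exactly_one_head(0.5), 3))  # 0.375
print(round(exactly_one_head(0.8), 3))  # 0.096
```

The enumeration matches the formula because each of the three qualifying sequences (H T T, T H T, T T H) has the same probability p(1 - p)².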

So coin 1 has a probability of heads equal to P₁, and coin 2 has a probability of heads equal to P₂, and these may now be different probabilities. In our programming environment, we can accommodate this by making 2 arguments separated by a comma, for example, 0.5 and 0.8; the function then takes as input 2 arguments, p1 and p2, and we can use both of these variables in the return statement. Let's now flip both coins, and write the code that computes the probability that coin 1 equals heads and coin 2 equals heads. For example, for 0.5 and 0.8, this would be?

```
# Return the probability of flipping one head each from two coins
# One coin has a probability of heads of p1 and the other of p2
def f(p1, p2):
    # Insert your code here

print(f(0.5, 0.8))
```

Yes, 0.4 is the product of these two values over here. So, the solution is just to return the product p1 * p2, and hitting the run button gives me indeed 0.4. I can now go and change these probabilities to 0.1 and 0.8. You probably already figured out that the answer is now 0.08, and indeed, my code gives me exactly that result, 0.08.

```
# Return the probability of flipping one head each from two coins
# One coin has a probability of heads of p1 and the other of p2
def f(p1, p2):
    # Insert your code here
    return p1 * p2

print(f(0.5, 0.8))
```

And now comes the hard part. I have coin 1 and coin 2, and each of them has a probability, as before, that coin 1 comes up heads and that coin 2 comes up heads; we call these P₁ and P₂. But here's the difficulty: before I flip the coin, I'm going to pick one coin, and I pick coin 1 with probability P₀ and coin 2 with probability 1 - P₀. Once I've picked the coin, I flip it exactly once, and now I want to know: what's the probability of heads? Let's do an exercise first. Say P₀ is 0.3, P₁ is 0.5, and P₂ is 0.9. What's the probability of observing heads for this specific example?

And the answer is 0.78, and here is how I got there. I pick coin 1 with probability 0.3, and once I've picked it, the chance of getting heads is 0.5. But I might alternatively have picked coin 2, with probability 1 - 0.3 = 0.7, and coin 2 gives me heads with a chance of 0.9. When I work all this out, I get 0.78.
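Before generalizing this into a function, the arithmetic of the worked example can be checked directly (this one-off snippet is just the calculation spelled out above):

```python
# Total probability of heads: pick coin 1 with 0.3 (heads with 0.5),
# otherwise coin 2 with 1 - 0.3 = 0.7 (heads with 0.9).
p_heads = 0.3 * 0.5 + (1 - 0.3) * 0.9
print(round(p_heads, 2))  # 0.78
```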

So the task for you now is to implement the function with three input arguments so that it computes this number over here, so that you can vary any of those and still get exactly the correct answer for this function over here. If you've never programmed before, this is tricky. You have to add one more argument, and you have to change the return statement to implement a formula just like this, but using p0, p1, and p2 as arguments, not just the fixed numbers here.

```
# Two coins have probabilities of heads of p1 and p2
# The probability of selecting the first coin is p0
# Return the probability of a flip landing on heads
def f(p0, p1, p2):
    # Insert your code here
```

And here is my answer. You can really just read off the formula that I gave you: with probability p0 we pick coin 1, and it comes up heads with probability p1, and with 1 - p0 we pick coin 2, and it comes up heads with probability p2. So you can now give it three arguments p0, p1, and p2, such as 0.3, 0.5, 0.9, and it gets us 0.78 if I hit the run button. Interestingly, you can change these numbers, for example the first one to 0.1 and the last one to 0.2. I now get a different result, 0.23.

```
# Two coins have probabilities of heads of p1 and p2
# The probability of selecting the first coin is p0
# Return the probability of a flip landing on heads
def f(p0, p1, p2):
    # Insert your code here
    return p0 * p1 + (1 - p0) * p2

print(f(0.3, 0.5, 0.9))
```
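This total-probability formula can also be sanity-checked by simulation: pick a coin at random according to p0, flip it once, and count how often heads comes up. The sketch below (the name `simulate_mixture` and its parameters are ours, not part of the course code) should land near the exact value of 0.78:

```python
import random

def simulate_mixture(p0, p1, p2, trials=200_000, seed=1):
    # Pick coin 1 with probability p0 (heads with probability p1),
    # otherwise coin 2 (heads with probability p2); flip once per trial.
    rng = random.Random(seed)
    heads = 0
    for _ in range(trials):
        p_heads = p1 if rng.random() < p0 else p2
        if rng.random() < p_heads:
            heads += 1
    return heads / trials

# Should be close to the exact value 0.3*0.5 + 0.7*0.9 = 0.78
print(simulate_mixture(0.3, 0.5, 0.9))
```

With 200,000 trials, the Monte Carlo estimate typically agrees with 0.78 to about three decimal places.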

Let's go to the cancer example. The prior probability of cancer we should call P₀. The probability of a positive test given cancer I call P₁, and, careful, this is the probability of a negative test result given you don't have cancer, which I call P₂. Just to check: suppose the probability of cancer is 0.1, the sensitivity is 0.9, and the specificity is 0.8. Give me the probability that a test will come out positive. It's not Bayes' rule yet; it's a simpler calculation, and you should know exactly how to do this.

The answer is 0.27. We first consider the possibility that we actually have cancer, in which case our test will give us a positive result with 0.9 chance, and then we add the possibility of not having cancer, that's 1 - 0.1 or 0.9, in which case we get a positive result with 0.2 chance, or 1 - 0.8. Resolving this gives us 0.09 + 0.18, which adds up to 0.27.

So now I want you to write the computer code that accepts arbitrary P₀, P₁, P₂ and calculates the resulting probability of a positive test result.

```
# Calculate the probability of a positive result given that
# p0=P(C)
# p1=P(Positive|C)
# p2=P(Negative|Not C)
def f(p0, p1, p2):
    # Insert your code here
    return p1 * p0 + (1 - p2) * (1 - p0)

print(f(0.1, 0.9, 0.8))
```

Here's my answer. My code does exactly what I've shown you before. It first considers the possibility of cancer and multiplies it with the test sensitivity p1, and then it adds the opposite possibility; of course, the specificity over here refers to a negative test result, so we take 1 minus it to get the positive one. Adding these two products up gives us the desired result. So let's try this. Here's my function f with the parameters we just assumed, and if I hit run, I get 0.27. Obviously, I can change these parameters. So, suppose I make it much less likely to have cancer in the prior, from 0.1 to 0.01; then my 0.27 changes to 0.207. Now, realize this is not the posterior in Bayes' rule. It's just the probability of getting a positive test result. You can see this if you change the prior probability of cancer to 0, which means we don't have cancer, no matter what the test result says. But there is still a 0.2 chance of getting a positive test result, and the reason is our test has a specificity of 0.8; that is, even in the absence of cancer, there is a 0.2 chance of getting a positive test result.

```
# Calculate the probability of a positive result given that
# p0=P(C)
# p1=P(Positive|C)
# p2=P(Negative|Not C)
def f(p0, p1, p2):
    # Insert your code here
    return p1 * p0 + (1 - p2) * (1 - p0)

print(f(0.1, 0.9, 0.8))
```
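The three parameter settings discussed above (prior 0.1, prior 0.01, and prior 0) can be checked in one go. This snippet restates the same total-probability formula under a descriptive name of our own choosing, `p_positive`:

```python
def p_positive(p0, p1, p2):
    # P(Pos) = P(C)*P(Pos|C) + P(not C)*P(Pos|not C),
    # where P(Pos|not C) = 1 - specificity = 1 - p2.
    return p1 * p0 + (1 - p2) * (1 - p0)

print(round(p_positive(0.1, 0.9, 0.8), 3))   # 0.27
print(round(p_positive(0.01, 0.9, 0.8), 3))  # 0.207
print(round(p_positive(0.0, 0.9, 0.8), 3))   # 0.2
```

The last call illustrates the point from the lecture: even with a zero prior, a specificity of 0.8 leaves a 0.2 chance of a (false) positive result.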

Now, let's go for the holy grail and implement today's work. Let's look at the posterior probability of cancer given that we received a positive test result, and let's first do this manually for the example given up here. So what do you think it is?

And the answer is 0.0333..., or 1/3, and now we're going to apply the entire arsenal of inference we just learned about. The joint probability of cancer and a positive test is 0.1 × 0.9. That's the joint; it's not normalized. So let's normalize it, and we normalize it by the sum of the joint for cancer and the joint for non-cancer. The joint for cancer we just computed, but the joint for non-cancer assumes the opposite prior, 1 - 0.1, and applies the probability of a positive result in the non-cancer case. Now, because the specificity refers to a negative result, we have to do the same trick as before and multiply by 1 - 0.8. When you work this out, you find this to be 0.1 × 0.9 divided by 0.1 × 0.9 + 0.9 × 0.2, that is, 0.09 divided by 0.09 + 0.18. So if you put all of this together, you get exactly a third.

So I want you to program this in the IDE, where there are three input parameters P₀, P₁, and P₂. For those values, you should get 1/3, and for these values over here, 0.01 as a prior, 0.7 as sensitivity, and 0.9 as specificity, you'll get approximately 0.066. So write this code and check whether these examples work for you.

```
# Return the probability of A conditioned on B given that
# P(A)=p0, P(B|A)=p1, and P(Not B|Not A)=p2
def f(p0, p1, p2):
    # Insert your code here

print(f(0.1, 0.9, 0.8))
```

And here's my code; this implements Bayes' rule. You take p0, the prior, times the probability of seeing a positive test result, and divide by the sum of the same expression plus the expression for not having cancer, which is the inverse prior times the inverse of the specificity, as shown over here. When I plug in my reference numbers, the ones from over here, I indeed get 0.33333. So, this is the correct code, and we can plug in other numbers. It's fun to give it a zero prior probability of having cancer, and guess what: no matter what the test says, you still don't have cancer. That's the beauty of Bayes' rule; it takes the prior very seriously.

```
# Return the probability of A conditioned on B given that
# P(A)=p0, P(B|A)=p1, and P(Not B|Not A)=p2
def f(p0, p1, p2):
    # Insert your code here
    return (p0 * p1) / (p0 * p1 + (1 - p0) * (1 - p2))

print(f(0.1, 0.9, 0.8))
```
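The two reference cases from the quiz (prior 0.1, sensitivity 0.9, specificity 0.8, giving 1/3; and prior 0.01, sensitivity 0.7, specificity 0.9, giving about 0.066) can be verified with this restatement of Bayes' rule, under a descriptive name of our own, `posterior_given_positive`:

```python
def posterior_given_positive(p0, p1, p2):
    # Bayes' rule: normalize the joint P(C, Pos) = p0*p1 by the total
    # probability of a positive result, p0*p1 + (1 - p0)*(1 - p2).
    joint_cancer = p0 * p1
    joint_not_cancer = (1 - p0) * (1 - p2)
    return joint_cancer / (joint_cancer + joint_not_cancer)

print(round(posterior_given_positive(0.1, 0.9, 0.8), 4))   # 0.3333
print(round(posterior_given_positive(0.01, 0.7, 0.9), 4))  # 0.066
```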

Now, let's do one last modification, and let's write this procedure assuming you observed a negative test result. This means the posterior probability of having cancer given a negative result is about 0.0137 for those numbers over here, and about 0.00336 for those numbers over here. In both cases, the posterior is significantly smaller than the prior, because we received a negative test result. So, go ahead and modify your procedure accordingly.

```
# Return the probability of A conditioned on Not B given that
# P(A)=p0, P(B|A)=p1, and P(Not B|Not A)=p2
def f(p0, p1, p2):
    # Insert your code here

print(f(0.1, 0.9, 0.8))
```

And here's my implementation. For the cancer case, you now have to plug in the measurement probability of seeing a negative test result, which is one minus the sensitivity, and in the normalizer we copy the first term over; in the second term, for the non-cancer hypothesis, we just put in the specificity. When you put this all together and run the procedure, we indeed get 0.013698 and so on. That's the number I gave you for the first example.

```
# Return the probability of A conditioned on Not B given that
# P(A)=p0, P(B|A)=p1, and P(Not B|Not A)=p2
def f(p0, p1, p2):
    # Insert your code here
    return ((1 - p1) * p0) / ((1 - p1) * p0 + p2 * (1 - p0))

print(f(0.1, 0.9, 0.8))
```
```

So I really hope you enjoyed all this, and the reason is I really want you to be able to program basic statistics by yourself, so that for the rest of your life, you know how to program it. When you know how to program it, you know how to do it, and it will empower you to solve basic problems in probability. I should tell you that in my own life, I've built a lot of robots, and I've applied Bayes' rule like crazy. My job talk at Stanford was all about Bayes' rule applied to robotics. It's a very powerful paradigm, and I hope you enjoyed it. If you want to dive deeper, there's always Udacity CS373, which talks about programming, and about half of that class is variations of Bayes' rule. I'll spare you the details here because this is a basic introduction class, but stay tuned for the next unit; I have another great surprise in store for you.