These are draft notes extracted from subtitles. Feel free to improve them. Contributions are most welcome. Thank you!
You may have asked why I dragged you through all of this binomial distribution stuff and flipped so many coins. The reason is that you're now going to move toward what's perhaps the deepest insight in all of statistics. It's called the central limit theorem. And the way I want you to get there is through a programming exercise. Now, I told you that all the programming is optional and you can totally skip this one, but I beg you to stay with me. What you're about to see is perhaps the most interesting way to understand the central limit theorem and the statistics of large numbers.
In assignment #1, I literally want you to flip a coin 1000 times. And once you've done this, I want you to compute the mean of the outcomes and the standard deviation. Flipping a coin is a random event. It gives us 0s or 1s as outcomes. If we were to do this 1000 times for a fair coin, you'd expect the mean to come out at about 0.5. You probably have no clue what to expect for the standard deviation.
There are a couple of things I want to give you. In your programming environment, you'll find the function mean, the function variance, and the function standard deviation, as you've practiced them before. I've changed them a little bit to make sure that whatever you pass in is converted to type float, which is what you need for computing the mean. Otherwise, it might be of type integer, and then these calculations all go wrong. The same goes for the computation of the variance, but you can ignore the float type conversion. Other than that, it's exactly what I've shown you before.
I now want you to implement the function flip, which takes as an argument the number of coin flips you want to do, 1000 in this case. Then use the functions mean and standard deviation to compute the mean and the standard deviation of the resulting sequence of outcomes. This will be a list filled with 0s and 1s.
The thing to know is that the function random.random (that's "random" twice, with a dot in the middle) gives you a random value that sits between 0 and 1. Every time you call this function, you get a different random value, which is nice because you just have to call it 1000 times to get 1000 samples. But those values lie in the interval between 0 and 1, and you want to turn them into coin flips, so what you have to do is compare the random value against 0.5. That comparison gives you True or False, which for our purposes is the same as 1 and 0. It gives you True if the random value happens to be larger than 0.5 and False if it's smaller. So the assignment is to call this thing 1000 times, make a list of these 1000 outcomes, and put that code into the function flip so that it returns that list. Then you're done.
So here's a typical outcome for this code. If I run it, the mean might be 0.484, and there's the standard deviation. If I run it again, I get a different mean of 0.51 and a different standard deviation.
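The helper functions described above are not reproduced in these notes. As a rough sketch only, they might look something like the following. The name stddev and the choice to divide by the number of items (rather than by one less) are assumptions made for illustration, not necessarily what the course environment provides.

```python
from math import sqrt

def mean(data):
    # float() keeps the division from truncating when the data are integers
    return float(sum(data)) / len(data)

def variance(data):
    # Average squared distance from the mean (dividing by len(data) is an assumption)
    mu = mean(data)
    return sum((x - mu) ** 2 for x in data) / len(data)

def stddev(data):
    # Standard deviation is the square root of the variance
    return sqrt(variance(data))
```

Because Python treats True as 1 and False as 0 in arithmetic, these helpers work unchanged on a list of True/False coin flips.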
And here's my answer. It's a one-liner. It builds an array of 1000 things, and this is the beauty of Python. There are ways to make it more complicated, as it was before in the variance case, but this is a little bit more compact. So I run the test random.random() > 0.5. This gives me True or False, and I want to do this 1000 times. To do it 1000 times I use the construct for x in range(N), where N is 1000 and range(N) yields the numbers 0 to 999. This loops 1000 times with different x's. The x's aren't actually used here, because the random coin flip doesn't care which flip in the sequence it is. It's the same procedure every single time. This just means I ran the procedure 1000 times, collected the results in the bracketed list, and returned it. Specifically, if we print out f and hit the run button, what I get is the stuff down here: a list of 1000 items of False, True, and False. It makes for a beautiful wallpaper, doesn't it?
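Reconstructed from the description above, the one-liner could look like this. It is a sketch based on the spoken explanation, not a copy of the on-screen code.

```python
import random

def flip(N):
    # One comparison per loop iteration; x itself is never used
    return [random.random() > 0.5 for x in range(N)]

f = flip(1000)
print(len(f))  # prints 1000
```

For a fair coin, the mean of such a list should land near 0.5, and so should the standard deviation, since the square root of 0.5 × 0.5 is 0.5.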
Now with this in place, here comes the really interesting question. It's assignment number two. And again it's a programming assignment, so feel free to skip it. Now that we have the function flip, it gives me this list of a thousand outcomes from which I can now derive things like the mean. Run this thing itself a thousand times and each time you get a different mean, so there's mean 0, mean 1, and so on all the way to mean 999. And these means are continuous values, obviously, between zero and one.
I give you the same functions as before: mean, variance, standard deviation, and flip. As I scroll down, I find this function sample. I want you to put in code over here so that when I sample with the same n, I run the flip experiment a thousand times, and every single time I compute the mean, and then I assemble a list of all the means into this thing called outcomes.
The means will be continuous, so I can do a histogram plot. It'll be better with many bins, so this notation over here gives me 30 bins. And to give you a feel for what to expect, this is a typical histogram I get out as a result. It's really beautiful. If I increase n to 2000, I get this histogram over here. I apologize that some numbers are a little illegible over here, but the center of it is 0.5 and it falls off to smaller numbers to the left and to the right. You can think of it as a distribution over the mean outcomes of large numbers of coin flips, and it has an interesting shape. So go ahead and program it and see if you can reproduce these results.
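The plotting call itself is not captured in these notes. To give a feel for what a 30-bin histogram computes, here is a hedged pure-Python sketch that counts how many values fall into each of 30 equal-width bins between the smallest and largest value. The helper name bin_counts is hypothetical and is not part of the course code.

```python
def bin_counts(values, bins=30):
    # Split the range [min, max] into `bins` equal-width bins and count members
    low, high = min(values), max(values)
    width = (high - low) / float(bins)
    counts = [0] * bins
    for v in values:
        if width == 0:
            i = 0  # all values identical: everything goes in the first bin
        else:
            # the maximum value would index one past the end, so clamp it
            i = min(int((v - low) / width), bins - 1)
        counts[i] += 1
    return counts

print(bin_counts([0.0, 0.1, 0.2, 0.3, 1.0], bins=2))  # prints [4, 1]
```

Applied to the list of means from the assignment, the counts should be largest in the middle bins and shrink toward both ends.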
And my answer is quite simple. Again, it runs my experiment 1000 times using for x in range(N). It assembles a new list, and each element of that list is the mean of the list produced by flip. Flip itself, every single time I run it, will give me 1000 0s or 1s, or Trues or Falses, and using the function mean, I compute the mean of that. But now I have an outer loop where I do this calculation of the mean 1000 times. It collects them into a new list, and that new list is continuous valued. The values are printed out by saying print outcomes after generating it. What I see is a list of 1000 numbers. They are all somewhere around 0.5: some are 0.046, some are 0.515. These are the empirical means for these sequences of 1000 coin flips. See how I effectively flipped 1,000,000 coins here, and this is the corresponding histogram plotting the frequency of these means.
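Put together, the answer described above might be sketched as follows. The mean and flip helpers are repeated so the block runs on its own. Their exact form in the course environment is an assumption.

```python
import random

def mean(data):
    # float() keeps the division from truncating when the data are integers
    return float(sum(data)) / len(data)

def flip(N):
    # N coin flips as a list of True/False values
    return [random.random() > 0.5 for x in range(N)]

def sample(N):
    # Outer loop: repeat the N-flip experiment N times, keeping each mean
    return [mean(flip(N)) for x in range(N)]

outcomes = sample(1000)  # 1000 experiments of 1000 flips each
print(len(outcomes))     # prints 1000
```

Each entry in outcomes is the fraction of True results in one run of 1000 flips, so the whole list takes about a million calls to random.random to produce.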
The thing that is striking is that in this binomial distribution, the frequency of outcomes seems to be centered around the expected outcome of 0.5 and falls off according to a funny looking curve. It's often called a bell curve, and the reason is that this could be the world's largest church bell. The significance of this bell curve and its relationship to what's called the central limit theorem will be discussed in the next unit.