These are draft notes extracted from subtitles. Feel free to improve them. Contributions are most welcome. Thank you!
Welcome to my online class on statistics. The basic idea of statistics is that the world is full of data, and we the people have to make decisions. Statistics comes to our rescue. It takes data and turns it into information that we the people can use to make decisions. Whether you are in social sciences, medicine, engineering, public policy, psychology, climatology, robotics, even archaeology, health sciences, finance, business and marketing, or pretty much any other discipline that you can study, all of those are now driven by data, including unlikely fields like biology or physics and so many others. Statistics is an amazing discipline to know. It is universal, useful, and fun, as I hope you're going to see in the class I'm just about to teach.
One of the standard problems that people study in statistics has to do with purchasing decisions. Suppose you wish to buy a house. There are small houses and big houses, but you really like this one special house built by a famous designer. This house has a certain price. Say in US dollars it's $92,000. The question you'd like to ask yourself--is this okay? Is it too much--you should pay less--or too little? Let's go and find out. In statistics, the way we find out is by looking at data. Let's assume there is a database of previous sales of homes in the same neighborhood. Just for simplicity, let's assume we know about two things--the size of the home and the price at which it was sold. There is a house with 1400 square feet that sold at $112,000, a much larger one with 2400 square feet that sold for $192,000, and so on for a whole number of other houses. Now it's a statistics problem. You have past data, and here is your very first quiz. Say the house you wish to purchase is 1300 square feet in size. How much money should you expect to pay?
Important: The grader doesn't accept answers formatted with commas.
Well, in our very first quiz we're just going to look it up. It turns out there was another house sold at the same size, and it brought in $104,000. In the interest of statistics, the answer is $104,000. This is not the game theory class. Obviously, you wouldn't want to bid that much, but in the interest of statistics, that's what you'd expect to pay.
Same question now with 1800 square feet.
And yes, the answer is $144,000.
A more tricky question--what about if the house you're trying to purchase has 2100 square feet?
That one is tricky. I'm going to answer it the following way--2100 is just halfway between 1800 and 2400. If we take halfway between $144,000 and $192,000, we get the mean of 144 and 192 thousand, and that is $168,000.
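The halfway reasoning above can be sketched in a few lines of Python. This is only an illustration: the two sales come from the lecture, while the dictionary `sales` and the function `midpoint_price` are names made up for this sketch.

```python
# Sales quoted in the lecture: size in square feet -> sale price in US dollars.
sales = {1800: 144_000, 2400: 192_000}


def midpoint_price(size_a, size_b):
    """Estimate the price of a house whose size is exactly halfway
    between two sizes with known sale prices (hypothetical helper)."""
    return (sales[size_a] + sales[size_b]) / 2


# 2100 square feet is halfway between 1800 and 2400 square feet.
print(midpoint_price(1800, 2400))  # 168000.0
```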
Now, that isn't always correct. I assume that this data has certain properties, which I'll talk about later, but let's move on and assume we can use the trick of finding prices just in between existing prices to price other sizes of houses like 1500 square feet. Please put your answer right here.
By "this data has certain properties", the instructor means that the relationship between size and price is linear, i.e., if you plot the data, all the data points fall on a straight line.
Yes, the answer is $120,000. By our logic, 1500 lies between 1400 and 1800. In fact, it's a quarter of the way from 1400 to 1800. We'd say the price is $112,000 plus 1/4 of the difference between the price of an 1800 square foot home and a 1400 square foot home. That difference is $32,000, and adding a quarter of it, $8,000, gets us from $112,000 to just $120,000.
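The same quarter-of-the-way reasoning works for any size between two known sales. Below is a minimal sketch of that linear interpolation, assuming (as the lecture does) that price changes evenly with size between the two known houses. The function name `interpolate_price` and its parameters are chosen for this illustration only.

```python
def interpolate_price(size, size_lo, price_lo, size_hi, price_hi):
    """Linearly interpolate a price between two known sales.

    Assumes price changes evenly with size between the two known houses.
    """
    # How far along the way from size_lo to size_hi the target size lies.
    fraction = (size - size_lo) / (size_hi - size_lo)
    # Add that same fraction of the price difference to the lower price.
    return price_lo + fraction * (price_hi - price_lo)


# 1500 sq ft is a quarter of the way from 1400 to 1800 sq ft.
print(interpolate_price(1500, 1400, 112_000, 1800, 144_000))  # 120000.0

# The halfway case: 2100 sq ft between 1800 and 2400 sq ft.
print(interpolate_price(2100, 1800, 144_000, 2400, 192_000))  # 168000.0
```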
I guess by now you've probably figured it out, but let me ask you just to make sure you understand the logic behind this specific data set. What is the cost of the home per square foot? Here's my square foot, and please answer in the box over here.
The answer is $80 per square foot, and we get this by just dividing $112,000 by 1400. It turns out that this data set has this amazing property that the cost per square foot is constant. That allows us to interpolate the way we just did. In statistics that's often not the case, but I want to congratulate you. You did your very first unit of statistics. Congratulations. You completed Unit 1. But as we go forward, we're going to look into data where the cost might not just be a constant factor times the size of a home. See you in the next unit.
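As a quick check of the constant cost per square foot, the sketch below divides price by size for the four sales mentioned in this unit. The list `sales` holds only those four quoted sales, and `predict_price` is a hypothetical helper that is valid only for data like this, where the rate really is constant.

```python
# The four sales mentioned in this unit: (size in square feet, price in US dollars).
sales = [(1300, 104_000), (1400, 112_000), (1800, 144_000), (2400, 192_000)]

# Cost per square foot for each sale.
rates = [price / size for size, price in sales]
print(rates)  # [80.0, 80.0, 80.0, 80.0]


def predict_price(size, rate=80):
    """Price a house as a constant rate times its size (hypothetical helper)."""
    return rate * size


print(predict_price(1500))  # 120000
```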