Data Analysis with R - Data is Ubiquitous

Hi, and welcome to Exploratory Data Analysis, or EDA. In the last few months, I learned about R, a programming language, and EDA with the help of my friends from Facebook. And in this course, I'll teach you how to use R to conduct Exploratory Data Analysis. EDA is part of a larger process of collecting, learning from, and acting on data. In this course, I'll share my advice about working with data and visualizations. By the end, you should feel confident when exploring new data sets to uncover meaningful patterns. In the last lesson, I'll walk you through an analysis of the diamond market with an eye toward building predictive models. You'll see how EDA can be used to answer questions with data.

But before we dig into EDA, let's talk about data. Data is ubiquitous. You can find information about hurricanes, forest fires, and state finances on websites like data.gov. Social networking sites like Facebook, where we work, collect petabytes of data every day. And some people have started tracking their own personal data using calendars, mobile apps, and physical activity trackers. There's so much data in the world, and so little of it has been explored.

For example, I was poking around on Google Trends the other day, and here's what I found. I searched for the word "chicken," because I had read a news article that morning about chicken food poisoning and salmonella. And when I searched for the word, Google Trends gave me this graph. This graph shows the interest in searching for the word "chicken" over time. In reality, it's counting how many times we see "chicken" in any newspaper headlines in a given month. So from 2005 to 2013, we can see that occurrences of "chicken" have been increasing. Now, I didn't want to stop my search there, so I decided to enter two more generic words to see what I could find. So I added the word "music" and I added the word "movies." And then I got this graph.

Now when you look at this graph, tell me some things that you notice. You can talk about anything interesting that you find in this graph, or you might compare the first graph to this one and write some things that you find. Now, keep in mind there is no right or wrong answer here. I just want to get you thinking about data.

Instructor notes: Facebook processes more than 500 terabytes of data a day (2012). One of Facebook's tools, Presto (mainly used for ad hoc analysis), processes over 1 petabyte of data per day. Learn more about salmonella.