About this Course

Exploratory data analysis is an approach for summarizing and visualizing the important characteristics of a data set. Promoted by John Tukey, exploratory data analysis focuses on exploring data to understand the data’s underlying structure and variables, to develop intuition about the data set, to consider how that data set came into existence, and to decide how it can be investigated with more formal statistical methods.

If you're interested in supplemental reading material for the course, check out the Exploratory Data Analysis book (not required).

This course is also a part of our Data Analyst Nanodegree.

Course Cost: Free
Timeline: Approx. 2 months
Skill Level: Intermediate
Included in Course
  • Rich Learning Content
  • Interactive Quizzes
  • Taught by Industry Pros
  • Self-Paced Learning
  • Student Support Community

Join the Path to Greatness

This free course is your first step towards a new career with the Data Analyst Nanodegree Program.

Free Course

Data Analysis with R

by Facebook

Enhance your skill set and boost your hirability through innovative, independent learning.

Course Leads

  • Moira Burke, Instructor
  • Chris Saden, Instructor
  • Solomon Messing, Instructor
  • Dean Eckles, Instructor

What You Will Learn

Lesson 1

What is EDA?

  • Start by learning what exploratory data analysis (EDA) is and why it is important.
Lesson 2

R Basics

  • EDA, which comes before formal hypothesis testing and modeling, makes use of visual methods to analyze and summarize data sets.
  • R will be our tool for generating those visuals and conducting analyses.
  • We will install RStudio and packages, learn the layout and basic commands of R, practice writing basic R scripts, and inspect data sets.
Lesson 3

Explore One Variable

  • Perform EDA to understand the distribution of a variable and to check for anomalies and outliers.
  • Learn how to quantify and visualize individual variables within a data set to make sense of a pseudo-data set of Facebook users.
  • Create histograms and boxplots, transform variables, and examine tradeoffs in visualizations.
Lesson 4

Explore Two Variables

  • EDA allows us to identify the most important variables and relationships within a data set before building predictive models.
  • Learn techniques for exploring the relationship between any two variables in a data set.
  • Create scatter plots, calculate correlations, and investigate conditional means.
Lesson 5

Explore Many Variables

  • Learn powerful methods and visualizations for examining relationships among multiple variables.
  • Learn how to reshape data frames and use aesthetics like color and shape to uncover more information.
  • Continue to build intuition around the Facebook data set and explore some new data sets as well.
Lesson 6

Diamonds and Price Predictions

  • Investigate the diamonds data set alongside Facebook data scientist Solomon Messing.
  • See how predictive modeling can allow us to determine a good price for a diamond.
  • As a final project, you will create your own exploratory data analysis on a data set of your choice.

Prerequisites and Requirements

A background in statistics is helpful but not required. Consider taking Intro to Descriptive Statistics and Intro to Inferential Statistics prior to taking this course. Relevant topics include:

  • Mean, median, mode
  • Normal, uniform, and skewed distributions
  • Histograms and box plots


Familiarity with the following CS and Math topics will help students:

  • Variable assignment
  • Comparison and logical operators ( <, >, <=, >=, ==, &, | )
  • If-else statements
  • Square roots, logarithms, and exponentials

See the Technology Requirements for using Udacity.

Why Take This Course

You will...

  • Understand data analysis via EDA as a journey and a way to explore data
  • Explore data at multiple levels using appropriate visualizations
  • Acquire statistical knowledge for summarizing data
  • Demonstrate curiosity and skepticism when performing data analysis
  • Develop intuition around a data set and understand how the data was generated
What do I get?
  • Instructor videos
  • Learn-by-doing exercises
  • Taught by industry professionals