Free Course

Spark

by Insight

Master how to work with big data and build machine learning models at scale using Spark!

About this Course

In this course, you’ll learn how to use Spark to work with big data and build machine learning models at scale, including how to wrangle and model massive datasets with PySpark, the Python library for interacting with Spark. In lesson one, you will learn about big data and how Spark fits into the big data ecosystem. In lesson two, you will practice processing and cleaning datasets to get comfortable with Spark’s SQL and DataFrame APIs. In lesson three, you will debug and optimize Spark code running on a cluster. In lesson four, you will use Spark’s Machine Learning Library to train machine learning models at scale.

Course Cost: Free
Timeline: Approx. 10 hours
Skill Level: Intermediate
Included in Product

Rich Learning Content

Taught by Industry Pros

Interactive Quizzes

Self-Paced Learning

Join the Path to Greatness

This course is your first step towards a new career with the Data Scientist Nanodegree Program.

Course Leads

David Drummond

VP of Engineering at Insight

Judit Lantos

Senior Data Engineer at Netflix

What You Will Learn

Prerequisites and Requirements

This course is ideal for students with programming and data analysis experience.

See the Technology Requirements for using Udacity.

Why Take This Course

Spark is a leading open-source project that companies around the world, from startups to the largest enterprises, use to analyze messy datasets efficiently.

What do I get?
Instructor videos

Learn by doing exercises

Taught by industry professionals