$150/month

In collaboration with Cloudera
Intermediate
Join 40,315 Students

Approx. 1 month

Assumes 6hr/wk

(work at your own pace)

Course Summary

The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. Learn the fundamental principles behind it, and how you can use its power to make sense of your Big Data.

Why Take This Course?

  • See how Hadoop fits into the world (recognize the problems it solves)
  • Understand the concepts of HDFS and MapReduce (find out how Hadoop solves those problems)
  • Write MapReduce programs (see how we solve the problems)
  • Practice solving problems on your own

Pre-Requisites and Requirements

Lesson 1 has no technical prerequisites and is a good overview of Hadoop and MapReduce for managers.

To get the most out of the class, however, you need basic Python programming skills at the level of an introductory course, such as our Introduction to Computer Science.

To learn more about Hadoop, you can also check out the book Hadoop: The Definitive Guide.

See the Technology Requirements for using Udacity

What Will I Learn?

Projects

Use MapReduce to reveal surprising trends in Udacity forum data.

Syllabus

Lesson 1

What is Big Data? The dimensions of Big Data. Scaling problems. HDFS and the Hadoop ecosystem.

Lesson 2

The basics of HDFS, MapReduce, and the Hadoop cluster.

Lesson 3

Writing a MapReduce program to answer questions about data.
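
To give a feel for what such a program looks like, here is a minimal sketch in Python that runs locally, without a Hadoop cluster. It assumes a hypothetical input format of one tab-separated record per line (store name, then sale amount); the function names mapper, reducer, and run_job are illustrative and are not taken from the course materials. The mapper emits key/value pairs, a dictionary stands in for the shuffle-and-sort step that Hadoop performs between the two phases, and the reducer totals the values for each key.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit (store, amount) for one tab-separated record.

    The two-field record layout is an assumption made for this sketch.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        return  # skip malformed records instead of failing the job
    store, amount = fields
    try:
        yield store, float(amount)
    except ValueError:
        return  # skip records whose amount is not a number


def reducer(store, amounts):
    """Reduce phase: total all amounts seen for a single store."""
    return store, sum(amounts)


def run_job(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)  # local stand-in for the shuffle
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    sample = ["Austin\t10.5", "Boston\t4.0", "Austin\t1.5", "not a record"]
    for store, total in run_job(sample).items():
        print(f"{store}\t{total}")
```

On a real cluster the map and reduce steps would run as separate tasks on many machines, and the framework would handle the grouping by key; the single-process version above only illustrates the division of labor between the two functions.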

Final Project

Answering questions about big sales data and analyzing large website logs.

Instructors & Partners

Sarah Sproehnle

Instructor

Sarah Sproehnle is the Vice President of Educational Services at Cloudera, a company that helps develop, manage and support Apache Hadoop. While she is a geek at heart, her passion is helping people learn complex technology. In addition to teaching people how to use Hadoop, she's taught database administration, various programming languages, and system administration.

Ian Wrigley

Instructor

Ian Wrigley is currently the Senior Curriculum Manager at Cloudera, responsible for the team which creates all the company's Hadoop training materials. He's been a tech journalist, an instructor, and a course author for over 20 years, during which time he's taught everything from C programming to copywriting for the Web. He describes his job as "teaching geeks to be geekier".

Gundega Dekena

Course Developer

Once upon a time Gundega was a Udacity student. In a way she still is, because she learns new things every day from the instructors she works with and from her Udacity coworkers.

If you occasionally want to read fun news about robotics, science, and games, follow her on Google+: https://plus.google.com/+GundegaDekena.

Cloudera

Ways to Take This Course

Full Course

  • Courseware
  • Projects with ongoing feedback and code-review
  • Guidance from Coaches
  • Verified Certificates

Courseware

  • View the course "Textbook" by watching lectures and taking auto-graded quizzes. Learn at your own pace. 100% free.
14-day money-back guarantee. Love it or get a full refund.

View more courses in the Data Science Track