Intro to Hadoop and MapReduce
How to Process Big Data
About this Course
The Apache™ Hadoop® project develops open-source software for reliable, scalable, distributed computing. Learn the fundamental principles behind it, and how you can use its power to make sense of your Big Data.
Approx. 1 month
Included in Product
Rich Learning Content
Taught by Industry Pros
What You Will Learn
- What is Big Data?
- The problems big data creates.
- How Apache Hadoop addresses these problems.
HDFS and MapReduce
- Discover how HDFS distributes data over multiple computers.
- Learn how MapReduce enables analyzing datasets in parallel across multiple machines.
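As a rough illustration of the MapReduce idea named above, here is a minimal sketch of a word count written in plain Python. It runs in a single process and does not use Hadoop itself; the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical and chosen only for this example. On a real cluster, the map and reduce steps would run on many machines at once, and the framework would handle the grouping step between them.

```python
from collections import defaultdict


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all values by key; a framework does this between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce step: combine all counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each call to `mapper` looks at only one line and each call to `reducer` looks at only one word, the work can be split across machines without the pieces needing to coordinate with each other.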
Prerequisites and Requirements
Lesson 1 does not have technical prerequisites and is a good overview of Hadoop and MapReduce for managers.
To get the most out of the class, however, you need basic Python programming skills at the level taught in introductory courses such as our Introduction to Computer Science course.
To learn more about Hadoop, you can also check out the book Hadoop: The Definitive Guide.
See the Technology Requirements for using Udacity.
Why Take This Course
- Understand how Hadoop fits into the world (recognize the problems it solves)
- Understand the concepts of HDFS and MapReduce (find out how Hadoop solves the problems)
- Write MapReduce programs (see how we solve the problems)
- Practice solving problems on your own
What do I get?
- Instructor videos
- Learn-by-doing exercises
- Taught by industry professionals