Spark and Data Lakes

Course

In this course, you will learn about the big data ecosystem and how to use Spark to work with massive datasets. You’ll also learn about how to store big data in a data lake and query it with Spark.

  • Intermediate

  • 2 weeks

  • Last Updated November 24, 2024

Skills you'll learn:

AWS data lakes • ELT • Big data fluency • Data wrangling

Prerequisites:

Amazon web services basics • Database fundamentals • Intermediate Python • Intermediate SQL • Data modeling basics

Course Lessons

Lesson 1

Introduction to Spark and Data Lakes

In this course, you'll learn how Spark evaluates code and uses distributed computing to process and transform data. You'll work in the big data ecosystem to build data lakes and data lakehouses.
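As a rough illustration of what "how Spark evaluates code" refers to: Spark evaluates lazily, so transformations are only recorded, and the work happens when an action asks for results. The sketch below imitates that idea in plain Python. The `LazyDataset` class and its method names are hypothetical stand-ins chosen for this example. They are not part of Spark or of the course materials.

```python
from typing import Callable, Iterable, Iterator, List, Optional


class LazyDataset:
    """Toy, single-machine imitation of lazy evaluation (hypothetical, not a Spark class)."""

    def __init__(self, source: Iterable,
                 steps: Optional[List[Callable[[Iterator], Iterator]]] = None):
        self._source = source
        self._steps = steps or []

    def map(self, fn: Callable) -> "LazyDataset":
        # Transformation: only record the step, do no work yet.
        return LazyDataset(self._source,
                           self._steps + [lambda it: (fn(x) for x in it)])

    def filter(self, pred: Callable) -> "LazyDataset":
        # Transformation: also recorded, not executed.
        return LazyDataset(self._source,
                           self._steps + [lambda it: (x for x in it if pred(x))])

    def collect(self) -> list:
        # Action: run every recorded step over the source and return the results.
        it = iter(self._source)
        for step in self._steps:
            it = step(it)
        return list(it)


calls = []


def double(x):
    calls.append(x)  # track when the function actually runs
    return x * 2


pipeline = LazyDataset(range(5)).map(double).filter(lambda x: x > 2)
calls_before_action = list(calls)   # still empty: nothing has run yet
result = pipeline.collect()         # the action triggers the whole pipeline
print(calls_before_action, result)
```

Deferring the work this way is what lets an engine look at the whole pipeline before running it. The toy above only shows the deferral and does not attempt any distribution or optimization.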

Lesson 2

Big Data Ecosystem, Data Lakes, and Spark

In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.

Lesson 3

Spark Essentials

In this lesson, we'll dive into how to use Spark for wrangling, filtering, and transforming distributed data with PySpark and Spark SQL.

Lesson 4

Using Spark in AWS

In this lesson, you will learn to use Spark and work with data lakes on Amazon Web Services using S3, AWS Glue, and AWS Glue Studio.

Lesson 5

Ingesting and Organizing Data in a Lakehouse

In this lesson, you'll work with Lakehouse zones. You will build and configure these zones in AWS.

Lesson 6 • Project

STEDI Human Balance Analytics

In this project, you'll work with sensor data used to train a machine learning model. You'll load S3 JSON data from a data lake into Athena tables using Spark and AWS Glue.

Taught By The Best

Sean Murdock

Professor at Brigham Young University Idaho

Sean currently teaches cybersecurity and DevOps courses at Brigham Young University Idaho. He has been a software engineer for over 16 years. Some of the most exciting projects he has worked on involved data pipelines for DNA processing and vehicle telematics.

The Udacity Difference

Combine technology training for employees with industry experts, mentors, and projects, for critical thinking that pushes innovation. Our proven upskilling system goes after success—relentlessly.

Demonstrate proficiency with practical projects

Projects are based on real-world scenarios and challenges, allowing you to apply the skills you learn to practical situations, while giving you real hands-on experience.

  • Gain proven experience

  • Retain knowledge longer

  • Apply new skills immediately

Top-tier services to ensure learner success

Reviewers provide timely and constructive feedback on your project submissions, highlighting areas of improvement and offering practical tips to enhance your work.

  • Get help from subject matter experts

  • Learn industry best practices

  • Gain valuable insights and improve your skills

Enroll in Spark and Data Lakes. Choose the plan that works for you.

All Access monthly

  • Unlimited access to our top-rated courses

  • Personalized Career Services

  • Cancel Anytime

  • Real-world projects

  • Personalized project reviews

  • Program certificates

Best Value

All Access bundle¹

  • All the same great benefits as our monthly plan

  • The most cost-effective way to develop the skills you want

  ¹ Discount applies to the first 4 months of membership, after which plans are converted to month-to-month.

© 2011-2024 Udacity, Inc. "Nanodegree" is a registered trademark of Udacity.