Data lakes and Lakehouses with Spark and Azure Databricks

Course

Learn about the big data ecosystem and how to use Spark to work with massive datasets. Learners will also store big data in a data lake and develop Lakehouse architecture on the Azure Databricks platform.

Intermediate

3 weeks

Real-world Projects

Completion Certificate

Last Updated May 21, 2024

Skills you'll learn:

Big data fluency • Databricks • Data lakes • Apache Spark

Prerequisites:

No experience required

Course Lessons

Lesson 1

Course Introduction

In this lesson, you'll learn about the course, including the prerequisites, tools, environment, and course project.

Lesson 2

Big Data Ecosystem, Data Lakes, and Spark

In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.

Lesson 3

Data Wrangling with Spark

In this lesson, we'll dive into how to use Spark for cleaning and aggregating data.

Lesson 4

Spark Debugging and Optimization

In this lesson, you will learn best practices for debugging and optimizing your Spark applications.

Lesson 5

Azure Databricks

In this lesson, you'll create Spark clusters and write Spark code on the Azure Databricks platform.

Lesson 6

Data Lakes and Lakehouse with Azure Databricks

In this lesson, you'll create data lakes and a Lakehouse architecture on the Azure Databricks platform.

Lesson 7 • Project

Building an Azure Data Lake for Bike Share Data Analytics

In this project, you'll implement Lakehouse architecture on the Azure Databricks platform.

Taught By The Best

Matt Swaffer

General Manager, MBS

Matt has been working in software development and data science for over 20 years. Matt's career is centered on the intersection of technology, data, and human psychology. He is passionate about using data science to have a meaningful impact on our people and our planet.

The Udacity Difference

Combine technology training for employees with industry experts, mentors, and projects to build the critical thinking that drives innovation. Our proven upskilling system pursues success relentlessly.

Demonstrate proficiency with practical projects

Projects are based on real-world scenarios and challenges, allowing you to apply the skills you learn to practical situations, while giving you real hands-on experience.

  • Gain proven experience

  • Retain knowledge longer

  • Apply new skills immediately

Top-tier services to ensure learner success

Reviewers provide timely and constructive feedback on your project submissions, highlighting areas of improvement and offering practical tips to enhance your work.

  • Get help from subject matter experts

  • Learn industry best practices

  • Gain valuable insights and improve your skills

Unlock access to Data lakes and Lakehouses with Spark and Azure Databricks and the rest of our best-in-class catalog

  • Unlimited access to our top-rated courses

  • Real-world projects

  • Personalized project reviews

  • Program certificates

  • Proven career outcomes

Full Catalog Access

One subscription opens up this course and our entire catalog of projects and skills.

Month-To-Month

4 Months*

*Average time to complete a Nanodegree program

*Discount applies to the first 4 months of membership, after which plans are converted to month-to-month.


© 2011-2024 Udacity, Inc. "Nanodegree" is a registered trademark of Udacity.