Data Streaming

Name: Data Streaming Nanodegree Program
Rating: 4.4 (127 reviews)

Nanodegree Program

Learn the latest skills to process data in real time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming.

Advanced

2 months

Real-world Projects

Completion Certificate

Last Updated December 29, 2023

Skills you'll learn:

Faust • Confluent Kafka Python client • Kafka REST Proxy • KSQL

Prerequisites:

Intermediate Python • ETL • Basic descriptive statistics

Courses In This Program

Course 1 • 1 hour

Welcome to the Data Streaming Nanodegree Program

Lesson 1

Data Streaming Nanodegree Program Introduction

Welcome to the Data Streaming Nanodegree Program

Lesson 2

Getting Help

You are starting a challenging but rewarding journey! Take 5 minutes to read how to get help with projects and content.

Course 2 • 4 weeks

Data Ingestion with Kafka and Kafka Streaming

Learn to use REST Proxy, Kafka Connect, KSQL, and Faust, a Python stream processing library. You will apply them to public transit data, using Kafka and its ecosystem to build a stream processing application that shows the status of trains in real time.

Lesson 1

Introduction to Stream Processing

In this lesson, students will learn what data streaming is, its pros and cons, and how it compares to traditional data strategies.

Lesson 2

Apache Kafka

In this lesson we’ll review the architecture and configuration of Apache Kafka.

Lesson 3

Data Schemas and Apache Avro

This lesson covers data schemas and data schema management, with a focus on Apache Avro.

Lesson 4

Kafka Connect and REST Proxy

This lesson covers producing data into Kafka and consuming data from it with Kafka Connect and REST Proxy.

Lesson 5

Stream Processing Fundamentals

Learn the concepts behind real-time applications that process events as they arrive: state storage, windowed processing, and stateful versus stateless stream processing.
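
As a rough illustration of two of these ideas, the plain-Python sketch below contrasts a stateless step with a stateful, windowed one. It is not taken from the course materials: the event data, the function names, and the 60-second tumbling window are all assumptions chosen for the example, and a real application would read from Kafka rather than from a list.

```python
from collections import defaultdict

# Hypothetical events: (timestamp_in_seconds, station_name).
EVENTS = [
    (0, "Clark"), (12, "Clark"), (45, "Lake"),
    (61, "Clark"), (75, "Lake"), (119, "Lake"), (120, "Clark"),
]


def is_arrival_at(event, station):
    """Stateless step: the result depends only on the current event."""
    return event[1] == station


def tumbling_window_counts(events, window_seconds):
    """Stateful step: keeps a running count per (window start, station).

    A tumbling window is fixed-size and non-overlapping, so each event
    falls into exactly one window.
    """
    counts = defaultdict(int)
    for timestamp, station in events:
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, station)] += 1
    return dict(counts)


clark_events = [event for event in EVENTS if is_arrival_at(event, "Clark")]
counts = tumbling_window_counts(EVENTS, window_seconds=60)
```

The filter can run on any event in isolation, while the windowed count has to remember earlier events, which is why stateful processing needs some form of state storage.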

Lesson 6

Stream Processing with Faust

Students will learn how to use the Python stream processing library Faust to rapidly create powerful stream processing applications.

Lesson 7

KSQL

Learn how to write simple SQL queries to turn Kafka topics into KSQL streams and tables, and then write those tables back out to Kafka.
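
The relationship between a stream and a table can be sketched without KSQL at all. The plain-Python example below is only an analogy, not KSQL syntax or its API: a stream is treated as an append-only list of events, and a table as the latest event per key. The event fields and function names are hypothetical.

```python
# Hypothetical stream of train status events, oldest first.
STATUS_STREAM = [
    {"train_id": "T1", "status": "in_service"},
    {"train_id": "T2", "status": "in_service"},
    {"train_id": "T1", "status": "out_of_service"},
    {"train_id": "T3", "status": "in_service"},
    {"train_id": "T2", "status": "broken_down"},
]


def stream_to_table(events, key_field):
    """Fold a stream into a table: later events overwrite earlier ones per key."""
    table = {}
    for event in events:
        table[event[key_field]] = event
    return table


def table_to_stream(table):
    """Emit the table's current rows as events, ordered by key."""
    return [table[key] for key in sorted(table)]


table = stream_to_table(STATUS_STREAM, key_field="train_id")
output_events = table_to_stream(table)
```

Five input events collapse into three table rows, one per train, and those rows can be written back out as a new, smaller stream.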

Lesson 8 • Project

Optimizing Public Transportation

For your first project, you’ll stream public transit status using Kafka and the Kafka ecosystem to build a stream processing application that shows the status of trains in real time.

Course 3 • 4 weeks

Streaming API Development and Documentation

In this course, you will grow your expertise in the components of streaming data systems and build a real-time analytics application. Specifically, you will be able to identify components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.

Lesson 1

Welcome

Introduction to Data Streaming with Spark

Lesson 2

Streaming DataFrames, Views, and Spark SQL

In this lesson, you'll learn about working with Spark DataFrames and views.

Lesson 3

Joins and JSON

In this lesson, you'll learn how to work with JSON and perform joins on streaming data.
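
To make the idea concrete, here is a minimal plain-Python sketch of parsing JSON messages and inner-joining two sets of them on a shared key. It does not use Spark, and the message fields (`customer`, `birth_year`, `score`) and helper names are assumptions made up for the example.

```python
import json

# Hypothetical raw messages, as they might arrive on two topics.
CUSTOMER_MESSAGES = [
    '{"customer": "a@example.com", "birth_year": 1950}',
    '{"customer": "b@example.com", "birth_year": 1962}',
]
RISK_MESSAGES = [
    '{"customer": "a@example.com", "score": 4.5}',
    '{"customer": "c@example.com", "score": 1.0}',
]


def parse_all(raw_messages):
    """Turn JSON strings into dictionaries."""
    return [json.loads(raw) for raw in raw_messages]


def inner_join(left_rows, right_rows, key):
    """Hash join: index the right side by key, then probe with the left side."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    joined = []
    for left in left_rows:
        for right in index.get(left[key], []):
            joined.append({**left, **right})
    return joined


joined = inner_join(parse_all(CUSTOMER_MESSAGES), parse_all(RISK_MESSAGES), "customer")
```

Only `a@example.com` appears on both sides, so the inner join yields a single combined row; the unmatched customers are dropped.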

Lesson 4

Redis, Base64 and JSON

This lesson focuses on working with Redis, Base64, and JSON in data streaming.
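
One way these pieces can fit together is a JSON message that carries another JSON document as a Base64-encoded string. The sketch below shows only that encode and decode round trip with the Python standard library; the wrapper layout (`key` and `element` fields) is an assumption for illustration, and no Redis connection is involved.

```python
import base64
import json


def encode_payload(record):
    """Serialize a dict to JSON, then Base64-encode it as ASCII text."""
    raw_bytes = json.dumps(record).encode("utf-8")
    return base64.b64encode(raw_bytes).decode("ascii")


def decode_payload(encoded):
    """Reverse encode_payload: Base64-decode, then parse the JSON."""
    raw_bytes = base64.b64decode(encoded)
    return json.loads(raw_bytes.decode("utf-8"))


# Hypothetical record and wrapper message.
original = {"customer": "a@example.com", "score": 4.5}
wrapper = json.dumps({"key": "customer-risk", "element": encode_payload(original)})

# A consumer parses the outer JSON first, then decodes the inner payload.
decoded = decode_payload(json.loads(wrapper)["element"])
```

The consumer ends up with the same dictionary the producer started from, after two layers of decoding.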

Lesson 5 • Project

Evaluate Human Balance with Spark Streaming

As your final project for this course, you will demonstrate the skills you have learned by evaluating human balance with Spark Streaming.

(Optional) Course 4 • 2 days

Career Services

Lesson 1 • Project

Take 30 Min to Improve your LinkedIn

Find your next job or connect with industry peers on LinkedIn. Ensure your profile attracts relevant leads that will grow your professional network.

Lesson 2 • Project

Optimize Your GitHub Profile

Other professionals are collaborating on GitHub and growing their network. Submit your profile to ensure it is on par with leaders in your field.

Taught By The Best

Sean Murdock

Professor at Brigham Young University Idaho

Sean currently teaches cybersecurity and DevOps courses at Brigham Young University Idaho. He has been a software engineer for over 16 years. Some of the most exciting projects he has worked on involved data pipelines for DNA processing and vehicle telematics.

Judit Lantos

Senior Data Engineer at Netflix

Judit is a Senior Data Engineer at Netflix. Formerly a Data Engineer at Split, where she worked on the statistical engine of their full-stack experimentation platform, she has also been an instructor at Insight Data Science, helping software engineers and academic coders transition to DE roles.

David Drummond

VP of Engineering at Insight

David is VP of Engineering at Insight where he enjoys breaking down difficult concepts and helping others learn data engineering. David has a PhD in Physics from UC Riverside.

Ben Goldberg

Staff Engineer at SpotHero

In his career as an engineer, Ben Goldberg has worked in fields ranging from computer vision to natural language processing. At SpotHero, he founded and built out their data engineering team, using Airflow as one of the key technologies.

Ratings & Reviews

Average Rating: 4.4 Stars

127 Reviews

Aarthi G.

February 28, 2023

The course is great. Got to learn a lot with hands-on project experience. I'm giving 4 stars because mentor support could have been better. 1-1 chat availability and live troubleshooting with the mentors would have been great. FAQs for common problems would have saved time. Overall, I would recommend this program.

Rafael R.

October 17, 2022

The nanodegree in general was very aggregating and I certainly learned a lot of new things. However, the kafka module could improve a lot. First, because it is very theoretical and the practices are always isolated from that module x of kafka and you are not used to actually running simple kafka projects and gradually increasing the complexity, unlike the spark course which was very well done and I congratulate you. The final kafka module project was the best part of the kafka course, where I received a docker-compose to run a project that works on my machine, but that totally deviates from the course progress (where I actually saw how things worked together, but there I was no longer in the position of learning, but in the position of being evaluated). Take the spark base course and fill that gap in the kafka course and your nanodegree will be even better.

Hazem F.

July 18, 2022

very good

Stefan H.

April 8, 2022

I really enjoyed the kafka part. It was quite extensive and I learned a lot. Although it took at least about 2/3 of my time in this nanodegree. The pyspark course I find a bit too shallow, too short and the exercises are quite repetitive.

Arnau V.

March 27, 2022

The program is good and I liked the project. However I had problems trying to set everything to run the projects locally, I'd love if it was more encouraged so that you really understand everything that is happening and running

The Udacity Difference

Combine technology training for employees with industry experts, mentors, and projects, for critical thinking that pushes innovation. Our proven upskilling system goes after success—relentlessly.

Demonstrate proficiency with practical projects

Projects are based on real-world scenarios and challenges, allowing you to apply the skills you learn to practical situations, while giving you real hands-on experience.

Gain proven experience
Retain knowledge longer
Apply new skills immediately

Top-tier services to ensure learner success

Reviewers provide timely and constructive feedback on your project submissions, highlighting areas of improvement and offering practical tips to enhance your work.