Real-world projects from industry experts
With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.
Learn the skills to take you into the next era of data engineering. Build real-time applications to process big data at scale.
Estimated time: 2 months at 5-10 hours/week
Get access to the classroom immediately upon enrollment
Learn how to process data in real time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming. You’ll start by understanding the components of data streaming systems, then build a real-time analytics application. You’ll also compile data, run analytics, and draw insights from reports generated by the streaming console.
To be successful in this program, you should have intermediate Python and SQL skills, as well as experience with ETL.
Learn the fundamentals of stream processing, including how to work with the Apache Kafka ecosystem, data schemas, Apache Avro, Kafka Connect and REST Proxy, KSQL, and Faust stream processing.
The goal of this course is to grow your expertise in the components of streaming data systems and to build a real-time analytics application. Specifically, you will be able to identify the components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.
On-demand help. Receive instant help with your learning directly in the classroom. Stay on track and get unstuck.
You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.
Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.
We provide services customized for your needs at every step of your learning journey to ensure your success.
In his career as an engineer, Ben Goldberg has worked in fields ranging from Computer Vision to Natural Language Processing. At SpotHero, he founded and built out their Data Engineering team, using Airflow as one of the key technologies.
Judit is a Senior Data Engineer at Netflix. Formerly a Data Engineer at Split, where she worked on the statistical engine of their full-stack experimentation platform, she has also been an instructor at Insight Data Science, helping software engineers and academic coders transition to data engineering roles.
David is VP of Engineering at Insight where he enjoys breaking down difficult concepts and helping others learn data engineering. David has a PhD in Physics from UC Riverside.
Sean currently teaches cybersecurity and DevOps courses at Brigham Young University-Idaho. He has been a software engineer for over 16 years. Some of the most exciting projects he has worked on involved data pipelines for DNA processing and vehicle telematics.
Learn how to analyze data in real-time using Apache Kafka and Spark, and build applications to process live insights from data at scale.
On average, successful students take 2 months to complete this program.
As businesses increasingly rely on applications that produce and process data in real time, data streaming is becoming a highly in-demand skill for data engineers. The Data Streaming Nanodegree program will prepare you for the cutting edge of data engineering as more companies look to derive live insights from data at scale.
Students will learn how to process data in real-time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming.
You’ll start by understanding the components of data streaming systems. You’ll then build a real-time analytics application. You will also compile data and run analytics, as well as draw insights from reports generated by the streaming console.
This program is designed to help experienced software engineers and data engineers learn the latest advancements in data processing, in which data records are sent continuously to support live updating.
The projects in the Data Streaming Nanodegree program will prepare you to develop systems and applications capable of interpreting data in real-time, and position you for roles in all industries that require live data processing for functions including big data, cloud computing, web personalization, fraud detection, sensor monitoring, anomaly detection, supply chain maintenance, location-based services, and much more.
This program is intended for software engineers looking to build real-time data processing proficiency, as well as data engineers looking to enhance their existing skill set with the next advancement in data engineering.
There is no application. This Nanodegree program accepts everyone, regardless of experience and specific background.
The Data Streaming Nanodegree program is designed for students with intermediate Python and SQL skills, as well as experience with ETL. Basic familiarity with traditional batch processing and basic conceptual familiarity with traditional service architectures is desired, but not required.
Intermediate Python programming knowledge, of the sort gained through the Programming for Data Science Nanodegree program, other introductory programming courses or programs, or additional real-world software development experience.
Intermediate SQL knowledge and linear algebra mastery, as addressed in the Programming for Data Science Nanodegree program.
Udacity’s Programming for Data Science with Python Nanodegree program is great preparation for the Data Engineer Nanodegree program. You’ll learn to code with Python and SQL.
Similarly, the Data Engineer Nanodegree program is great preparation for the Data Streaming Nanodegree program.
The Data Streaming Nanodegree program comprises content and curriculum supporting two projects. We estimate that students can complete the program in two months, working five to ten hours per week.
Each project will be reviewed by the Udacity reviewer network. Feedback will be provided, and if you do not pass the project, you will be asked to resubmit the project until it passes.
You will have access to this Nanodegree program for as long as your subscription remains active. The estimated time to complete this program can be found on the webpage and in the syllabus, and is based on the average amount of time we project that it takes a student to complete the projects and coursework. See the Terms of Use and FAQs for other policies regarding the terms of access to our Nanodegree programs.
Please see the Udacity Program FAQs for policies on enrollment in our programs.
Students should have access to the Internet and a 64-bit computer. There are no additional software or version requirements for this program; all coursework and projects can be completed via Student Workspaces in the Udacity online classroom.