Data Streaming

Nanodegree Program

Learn the skills to take you into the next era of data engineering. Build real-time applications to process big data at scale.

Enroll Now
  • Estimated time
    2 Months

    At 5-10 hours/week

  • Enroll by
    May 31, 2023

    Get access to the classroom immediately upon enrollment

  • Skills acquired
    Kafka REST Proxy, Faust, Apache Spark, Kafka Connect

What you will learn

  1. Data Streaming

    Estimated 2 Months to complete

    Learn how to process data in real-time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming. You’ll start by understanding the components of data streaming systems. You’ll then build a real-time analytics application. You’ll also compile data and run analytics, as well as draw insights from reports generated by the streaming console.

    Prerequisite knowledge

    To be successful in this program, you should have intermediate Python and SQL skills, as well as experience with ETL.

    1. Foundations of Data Streaming

      Learn the fundamentals of stream processing, including how to work with the Apache Kafka ecosystem, data schemas, Apache Avro, Kafka Connect and REST Proxy, KSQL, and Faust Stream Processing.

    2. Streaming API Development and Documentation

      The goal of this course is to grow your expertise in the components of streaming data systems and to build a real-time analytics application. Specifically, you will be able to identify the components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.
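
As a rough illustration of the last few steps described above (combining source data into a composite record and preparing it for a Kafka sink), here is a minimal plain-Python sketch. It is not course material and does not use the Spark or Kafka APIs. The function names `join_on_key` and `to_kafka_messages`, the field names, and the sample records are all hypothetical, chosen only to show the shape of the idea: Kafka messages are key/value pairs, so a composite record is typically serialized (here as JSON) into the value.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple

Record = Dict[str, Any]


def join_on_key(left: Iterable[Record], right: Iterable[Record], key: str) -> Iterator[Record]:
    """Inner-join two collections of dict records on a shared key field."""
    right_by_key = {row[key]: row for row in right}
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the two source records into one composite record.
            yield {**row, **match}


def to_kafka_messages(rows: Iterable[Record], key: str) -> Iterator[Tuple[bytes, bytes]]:
    """Serialize each composite record as a (key, value) pair of UTF-8 bytes."""
    for row in rows:
        message_key = str(row[key]).encode("utf-8")
        message_value = json.dumps(row, sort_keys=True).encode("utf-8")
        yield message_key, message_value


# Hypothetical sample data standing in for two source streams.
customers = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": "b@example.com"},
]
risk_scores = [
    {"customer_id": 1, "score": 7.5},
    {"customer_id": 3, "score": 2.0},
]

messages = list(
    to_kafka_messages(join_on_key(customers, risk_scores, "customer_id"), "customer_id")
)

# Inspect the would-be sink output by eye, as the course description suggests.
for message_key, message_value in messages:
    print(message_key, message_value)
```

Only records whose key appears in both sources survive the inner join, so inspecting the printed pairs is a quick way to check the sink for accuracy. In a real Spark Structured Streaming job the join and the write would be handled by DataFrame operations rather than hand-written loops.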

All our programs include

  • Real-world projects from industry experts

    With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.

  • Real-time support

    On-demand help. Receive instant help with your learning directly in the classroom. Stay on track and get unstuck.

  • Career services

    You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.

  • Flexible learning program

    Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.

Program offerings

  • Class content

    • Real-world projects
    • Project reviews
    • Project feedback from experienced reviewers
  • Student services

    • Student community
    • Real-time support
  • Career services

    • GitHub review
    • LinkedIn profile optimization

Succeed with personalized services.

We provide services customized for your needs at every step of your learning journey to ensure your success.

Get timely feedback on your projects.

  • Personalized feedback
  • Unlimited submissions and feedback loops
  • Practical tips and industry best practices
  • Additional suggested resources to improve
  • 1,400+

    project reviewers

  • 2.7M

    projects reviewed

  • 88/100

    reviewer rating

  • 1.1 hours

    avg project review turnaround time

Learn with the best.

  • Ben Goldberg

    Staff Engineer at SpotHero

    In his career as an engineer, Ben Goldberg has worked in fields ranging from Computer Vision to Natural Language Processing. At SpotHero, he founded and built out their Data Engineering team, using Airflow as one of the key technologies.

  • Judit Lantos

    Senior Data Engineer at Netflix

    Judit is a Senior Data Engineer at Netflix. Formerly a Data Engineer at Split, where she worked on the statistical engine of their full-stack experimentation platform, she has also been an instructor at Insight Data Science, helping software engineers and academic coders transition to DE roles.

  • David Drummond

    VP of Engineering at Insight

    David is VP of Engineering at Insight where he enjoys breaking down difficult concepts and helping others learn data engineering. David has a PhD in Physics from UC Riverside.

  • Sean Murdock

    Professor at Brigham Young University Idaho

    Sean currently teaches cybersecurity and DevOps courses at Brigham Young University Idaho. He has been a software engineer for over 16 years. Some of the most exciting projects he has worked on involved data pipelines for DNA processing and vehicle telematics.

Data Streaming Nanodegree Program

Get started today

    • Learn

      Learn how to analyze data in real-time using Apache Kafka and Spark, and build applications to process live insights from data at scale.

    • Average Time

      On average, successful students take 2 months to complete this program.

    • Benefits include

      • Real-world projects from industry experts
      • Real-time classroom support
      • Career services

    Program details

    Program overview: Why should I take this program?
    • Why should I enroll?

      As businesses increasingly rely on applications that produce and process data in real-time, data streaming is an increasingly in-demand skill for data engineers. The Data Streaming Nanodegree program will prepare you for the cutting edge of data engineering as more and more companies look to derive live insights from data at scale.

      Students will learn how to process data in real-time by building fluency in modern data engineering tools, such as Apache Spark, Kafka, Spark Streaming, and Kafka Streaming.

      You’ll start by understanding the components of data streaming systems. You’ll then build a real-time analytics application. You will also compile data and run analytics, as well as draw insights from reports generated by the streaming console.

    • What jobs will this program prepare me for?

      This program is designed to help experienced software engineers and data engineers learn the latest advancements in data processing, in which data records are sent continuously to support live updating.

      The projects in the Data Streaming Nanodegree program will prepare you to develop systems and applications capable of interpreting data in real-time, and position you for roles in all industries that require live data processing for functions including big data, cloud computing, web personalization, fraud detection, sensor monitoring, anomaly detection, supply chain maintenance, location-based services, and much more.

    • How do I know if this program is right for me?

      This program is intended for software engineers looking to build real-time data processing proficiency, as well as data engineers looking to enhance their existing skill set with the next advancement in data engineering.

    Enrollment and admission
    • Do I need to apply? What are the admission criteria?

      There is no application. This Nanodegree program accepts everyone, regardless of experience and specific background.

    • What are the prerequisites for enrollment?

      The Data Streaming Nanodegree program is designed for students with intermediate Python and SQL skills, as well as experience with ETL. Basic familiarity with traditional batch processing and basic conceptual familiarity with traditional service architectures are desired, but not required.

      You should have intermediate Python programming knowledge of the sort gained through the Programming for Data Science Nanodegree program, other introductory programming courses or programs, or additional real-world software development experience, including:

      • Strings, numbers, and variables; statements, operators, and expressions;
      • Lists, tuples, and dictionaries; Conditions, loops;
      • Procedures, objects, modules, and libraries;
      • Troubleshooting and debugging; Research & documentation;
      • Problem solving; Algorithms and data structures

      Intermediate SQL knowledge and linear algebra mastery, addressed in the Programming for Data Science Nanodegree program, including:

      • Joins, Aggregations, and Subqueries
      • Table definition and manipulation (Create, Update, Insert, Alter)
    • If I do not meet the requirements to enroll, what should I do?

      Udacity’s Programming for Data Science with Python Nanodegree program is great preparation for the Data Engineer Nanodegree program. You’ll learn to code with Python and SQL.

      Similarly, the Data Engineering Nanodegree program is great preparation for the Data Streaming Nanodegree program.

    Tuition and term of program
    • How is this Nanodegree program structured?

      The Data Streaming Nanodegree program consists of content and curriculum to support two projects. We estimate that students can complete the program in two months, working five to ten hours per week.

      Each project will be reviewed by the Udacity reviewer network. Feedback will be provided, and if you do not pass the project, you will be asked to resubmit the project until it passes.

    • How long is this Nanodegree program?

      You will have access to this Nanodegree program for as long as your subscription remains active. The estimated time to complete this program can be found on the webpage and in the syllabus, and is based on the average amount of time we project that it takes a student to complete the projects and coursework. See the Terms of Use and FAQs for other policies regarding the terms of access to our Nanodegree programs.

    • Can I switch my start date? Can I get a refund?

      Please see the Udacity Program FAQs for policies on enrollment in our programs.

    Software and hardware: What do I need for this program?
    • What software and versions will I need in this program?

      Students should have access to the Internet and a 64-bit computer. There are no additional software or version requirements to complete this program; all coursework and projects can be completed via Student Workspaces in the Udacity online classroom.
