Course
In this course, you will deepen your expertise in the components of streaming data systems and build a real-time analytics application. Specifically, you will learn to identify the components of Spark Streaming (architecture and API), build a continuous application with Structured Streaming, consume and process data from Apache Kafka with Spark Structured Streaming (including setting up and running a Spark cluster), create a DataFrame as an aggregation of source DataFrames, sink a composite DataFrame to Kafka, and visually inspect a data sink for accuracy.
4 weeks
Real-world Projects
Completion Certificate
Last Updated August 29, 2023
No experience required
Lesson 1
Welcome
Introduction to Data Streaming with Spark
Lesson 2
Streaming DataFrames, Views, and Spark SQL
In this lesson, you'll learn how to work with Spark DataFrames and views.
Lesson 3
Joins and JSON
In this lesson, you'll learn how to work with JSON and perform joins on streaming data.
Lesson 4
Redis, Base64 and JSON
This lesson focuses on working with Redis, Base64, and JSON in data streaming.
Lesson 5 • Project
Evaluate Human Balance with Spark Streaming
As your final project for this course, you will demonstrate the skills you have learned by evaluating human balance with Spark Streaming.
Sean Murdock
Professor at Brigham Young University Idaho
Sean currently teaches cybersecurity and DevOps courses at Brigham Young University Idaho. He has been a software engineer for over 16 years. Some of the most exciting projects he has worked on involved data pipelines for DNA processing and vehicle telematics.