How to Become a Data Engineer

Nanodegree Program

Understand the latest AWS features used by data engineers to design and build systems for collecting, storing, and analyzing data at scale.

  • Estimated time
    4 Months

    At 5-10 hrs/week

  • Enroll by
    June 7, 2023

    Get access to classroom immediately on enrollment

  • Skills acquired
    Apache Airflow, AWS Glue, Apache Spark, Redshift, Amazon S3

What you will learn

  1. Data Engineering with AWS

    Estimated 4 months to complete

    You’ll master the AWS data engineering skills necessary to level up your tech career. Learn data engineering concepts like designing data models, building data warehouses and data lakes, automating data pipelines, and managing massive datasets.

    Prerequisite knowledge

    It is recommended that learners have intermediate Python, intermediate SQL, and command line skills.

    1. Data Modeling

      Learners will create relational and NoSQL data models to fit the diverse needs of data consumers. They’ll also use ETL to build databases in Apache Cassandra.

    2. Cloud Data Warehouses

      In this data engineering course, learners will create cloud-based data warehouses. They will sharpen their data warehousing skills, deepen their understanding of data infrastructure, and be introduced to data engineering on the cloud using Amazon Web Services (AWS).

    3. Spark and Data Lakes

      Learners will build a data lake on AWS and a data catalog following the principles of data lakehouse architecture. They will learn about the big data ecosystem and the power of Apache Spark for data wrangling and transformation. They’ll work with AWS data tools and services to extract, load, process, query, and transform semi-structured data in data lakes.

    4. Automate Data Pipelines

      This data engineer training dives into the concept of data pipelines and how learners can use them to accelerate their careers. The course focuses on applying data pipeline concepts through Apache Airflow, an open-source tool originally developed at Airbnb. It starts by covering concepts including data validation, DAGs, and Airflow, then moves into AWS concepts like copying S3 data, connections and hooks, and Redshift Serverless. Next, learners will explore data quality through data lineage, data pipeline schedules, and data partitioning. Finally, they’ll put data pipelines into production by extending Airflow with plugins, implementing task boundaries, and refactoring DAGs.

All our programs include

  • Real-world projects from industry experts

    With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.

  • Real-time support

    On-demand help. Receive instant help with your learning directly in the classroom. Stay on track and get unstuck.

  • Career services

    You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.

  • Flexible learning program

    Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.

Program offerings

  • Class content

    • Content co-created with Insight
    • Real-world projects
    • Project reviews
    • Project feedback from experienced reviewers
  • Student services

    • Student community
    • Real-time support
  • Career services

    • GitHub review
    • LinkedIn profile optimization

Succeed with personalized services.

We provide services customized for your needs at every step of your learning journey to ensure your success.

Get timely feedback on your projects.

  • Personalized feedback
  • Unlimited submissions and feedback loops
  • Practical tips and industry best practices
  • Additional suggested resources to improve
  • 1,400+ project reviewers
  • 2.7M projects reviewed
  • 88/100 reviewer rating
  • 1.1 hours average project review turnaround time

Learn with the best.

  • Amanda Moran

    Developer Advocate at DataStax

    Amanda is a developer advocate for DataStax after spending the last 6 years as a software engineer on 4 different distributed databases. Her passion is bridging the gap between customers and engineering. She has degrees from the University of Washington and Santa Clara University.

  • Ben Goldberg

    Staff Engineer at SpotHero

    In his career as an engineer, Ben Goldberg has worked in fields ranging from computer vision to natural language processing. At SpotHero, he founded and built out their data engineering team, using Airflow as one of the key technologies.

  • Valerie Scarlata

    Curriculum Manager at Udacity

    Valerie is a curriculum manager at Udacity who has developed and taught a broad range of computing curriculum for several colleges and universities. She was a professor and software engineer for over 10 years specializing in web, mobile, voice assistant, and social full-stack application development.

  • Matt Swaffer

    Solutions Architect

    Matt is a software and solutions architect focusing on data science and analytics for managed business solutions. He is also an adjunct lecturer in the computer information systems department at the University of Northern Colorado, where he received his PhD in Educational Psychology.

  • Sean Murdock

    Professor at Brigham Young University Idaho

    Sean currently teaches cybersecurity and DevOps courses at Brigham Young University Idaho. He has been a software engineer for over 16 years. Some of the most exciting projects he has worked on involved data pipelines for DNA processing and vehicle telematics.

Data Engineering with AWS

Get started today

    • Learn

      Learn the high-impact AWS skills that a data engineer uses on a daily basis.

    • Average Time

      On average, successful students take 4 months to complete this program.

    • Benefits include

      • Real-world projects from industry experts
      • Real-time classroom support
      • Career services

    Program details

    Program overview: Why should I take this program?
    • Why should I enroll?

      The data engineering field is expected to continue growing rapidly over the next several years, and there’s huge demand for data engineers across industries. Udacity has collaborated with industry professionals to offer up-to-date learning content that can advance your data engineering career.

      By the end of the Nanodegree program, you will have an impressive portfolio of real-world projects and valuable hands-on experience.

    • What jobs will this program prepare me for?

      This program is designed to teach you how to become a data engineer. These skills will prepare you for jobs such as analytics engineer, big data engineer, data platform engineer, and others. Data engineering skills are also helpful for adjacent roles, such as data analysts, data scientists, machine learning engineers, or software engineers.

    • How do I know if this program is right for me?

      This Nanodegree program offers an ideal path for experienced programmers to advance their data engineering careers. If you enjoy solving important technical challenges and want to learn to work with massive datasets, this is a great way to get hands-on practice.

    Enrollment and admission
    • Do I need to apply? What are the admission criteria?

      There is no application. This Nanodegree program accepts everyone, regardless of experience and specific background.

    • What are the prerequisites for enrollment?

      The Data Engineering with AWS Nanodegree program is designed for learners with intermediate Python, intermediate SQL, and command line skills.

      In order to successfully complete the program, learners should be comfortable with the following concepts:

      • Strings, numbers, and variables
      • Statements, operators, and expressions
      • Lists, tuples, and dictionaries
      • Conditions and loops
      • Procedures, objects, modules, and libraries
      • Troubleshooting and debugging
      • Problem-solving
      • Algorithms and data structures
      • Joins
      • Aggregations
      • Subqueries
      • Table definition and manipulation (Create, Update, Insert, Alter)
      • Run scripts from the command line

      If you need to sharpen your prerequisite skills, try the programs below:

    • If I do not meet the requirements to enroll, what should I do?

      To prepare for this program, learners are encouraged to enroll in one of the following programs:

    Tuition and term of program
    • How is this Nanodegree program structured?

      The Data Engineering with AWS Nanodegree program has 4 courses with 4 projects. We estimate that students can complete the program in 4 months working 5-10 hours per week.

      Each project will be reviewed by the Udacity reviewer network, and you will receive feedback on it. If a project does not pass, you will be asked to resubmit it until it passes.

    • How long is this Nanodegree program?

      Access to this Nanodegree program runs for the length of time specified above. If you do not graduate within that time period, you will continue learning with month-to-month payments. See the Terms of Use and FAQs for other policies regarding the terms of access to our Nanodegree programs.

    • Can I switch my start date? Can I get a refund?

      Please see the Udacity Program FAQs for policies on enrollment in our programs.

    Software and hardware: What do I need for this program?
    • What software and versions will I need in this program?

      There are no software or version requirements to complete this Nanodegree program. All coursework and projects can be done via Student Workspaces in the Udacity online classroom.
