According to a recent report by DOMO, the world produces 2.5 million terabytes of data per day. Data is quickly becoming the lifeblood of digital transformation, and companies are scrambling to re-invent themselves as data-driven organizations. That’s why, according to Indeed and Glassdoor, the ratio of data engineer to data scientist job openings is roughly four-to-one.
Companies can’t find enough data engineers to store, organize, and manage their ever-increasing amount of data.
Data engineers are responsible for making data accessible to all the people who use it across an organization. That could mean creating a data warehouse for the analytics team, building a data pipeline for a frontend application, or summarizing massive datasets to be more user-friendly.
Today, we are excited to announce the Data Engineer Nanodegree Program. Students who take this program will learn the technical skills required to become a data engineer. With the launch of this program, anyone with an Internet connection (and the relevant background and skills) will be able to enroll. Companies all over the world are looking for data engineers, and our goal is to ensure that anyone who wishes to land a job in the field can do so.
Collaborating with Top Data Professionals
To develop this program’s world-class curriculum, we collaborated with professionals from companies including Insight Data Engineering, Slack, Stitch Fix, and Uber. Each of these collaborators contributed guidance and feedback to focus the program on the most in-demand skills. In addition, each of the instructors in the program has extensive data engineering and teaching experience. Below is a list of contributors to the program.
- Andrew Andreasen, Analytics Engineer at Stitch Fix
- David Miller, Physics Professor at the University of Chicago
- Diana Pojar, Data Engineer at Slack
- Nathan Chan, Data Scientist at Zymergen
- Neelesh Salian, Software Engineering – Data Platform at Stitch Fix
- Reza Shiftehfar, Engineering Manager at Uber
- Sanjay Krishnan, Computer Science Professor at the University of Chicago
- Amanda Moran, Developer Advocate at DataStax
- Ben Goldberg, Staff Engineer at SpotHero
- David Drummond and Judit Lantos from Insight Data Engineering
- Sameh El-Ansary, CEO at Novelari
Future of Data Engineering
The demand for data engineers continues to grow. A quick search for Data Engineer on Glassdoor or Indeed yields roughly 100,000 job openings, nearly five times the number of jobs when you search for Data Scientist. Data Engineer also topped the “Hottest Tech Jobs” list in December 2018, with job openings growing nearly 100% year over year. On top of this growth, data engineers enjoy an average annual salary of nearly $100k!
During this program, students will complete four courses and five projects. Throughout the projects, students will play the part of a data engineer at a music streaming company. They’ll work with the same type of data in each project, but with increasing data volume, velocity, and complexity. Here’s a course-by-course breakdown.
Course 1 – Data Modeling
In this course, students will learn to create relational and NoSQL data models to fit the diverse needs of data consumers. In the project, students will build SQL (Postgres) and NoSQL (Apache Cassandra) data models using user activity data for a music streaming app.
Course 2 – Cloud Data Warehouses
In this course, students will learn to create cloud-based data warehouses. In the project, students will build an ELT pipeline that extracts data from Amazon S3, stages it in Amazon Redshift, and transforms it into a set of dimensional tables.
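To make the extract, stage, and transform sequence concrete, here is a minimal sketch of the ELT pattern. It is not the course project: Python's built-in sqlite3 module stands in for Amazon Redshift, an in-memory list of JSON strings stands in for files in Amazon S3, and every table, column, and function name is a hypothetical chosen for illustration.

```python
import json
import sqlite3

# Hypothetical raw event records, standing in for log files stored in Amazon S3.
RAW_EVENTS = [
    '{"user_id": 1, "user_name": "Ana", "song": "Song A", "artist": "Artist X"}',
    '{"user_id": 2, "user_name": "Ben", "song": "Song B", "artist": "Artist Y"}',
    '{"user_id": 1, "user_name": "Ana", "song": "Song B", "artist": "Artist Y"}',
]


def run_elt(conn, raw_events):
    """Load raw records into a staging table, then transform them with SQL."""
    cur = conn.cursor()

    # Load: copy the raw records into a staging table without reshaping them.
    cur.execute(
        "CREATE TABLE staging_events "
        "(user_id INTEGER, user_name TEXT, song TEXT, artist TEXT)"
    )
    rows = [json.loads(line) for line in raw_events]
    cur.executemany(
        "INSERT INTO staging_events "
        "VALUES (:user_id, :user_name, :song, :artist)",
        rows,
    )

    # Transform: derive dimension and fact tables inside the database itself.
    cur.execute(
        "CREATE TABLE dim_users AS "
        "SELECT DISTINCT user_id, user_name FROM staging_events"
    )
    cur.execute(
        "CREATE TABLE dim_songs AS "
        "SELECT DISTINCT song, artist FROM staging_events"
    )
    cur.execute(
        "CREATE TABLE fact_songplays AS "
        "SELECT user_id, song FROM staging_events"
    )
    conn.commit()


def count_rows(conn, table):
    """Return the number of rows in a table (table name is trusted here)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# sqlite3 in-memory database as a stand-in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
run_elt(conn, RAW_EVENTS)
```

The point of the sketch is the ordering: the data lands in the warehouse in raw form first, and the reshaping into dimensional tables happens afterward in SQL, which is what distinguishes ELT from ETL.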
Course 3 – Data Lakes with Apache Spark
In this course, students will learn more about the big data ecosystem, how to work with massive datasets with Apache Spark, and how to store big data in a data lake. In the project, students will build an ETL pipeline for a data lake using Apache Spark and S3.
Course 4 – Data Pipelines with Apache Airflow
In this course, students will learn to schedule, automate, and monitor data pipelines using Apache Airflow. In the project, they’ll continue their work on the music streaming company’s data infrastructure by creating and automating a set of data pipelines.
The capstone project is unique to each student. They’ll define the scope of the project; gather data from several different data sources; transform, combine, and summarize it; and create a clean database for others to analyze.
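Pipeline schedulers such as Apache Airflow describe a pipeline as a directed acyclic graph of tasks and run each task only after the tasks it depends on have finished. The sketch below illustrates that ordering idea only. It does not use Airflow's API; it relies on Python's standard-library graphlib module (Python 3.9+), and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks
# that must finish before it can start.
DEPENDENCIES = {
    "stage_events": set(),
    "stage_songs": set(),
    "load_fact_table": {"stage_events", "stage_songs"},
    "load_dimension_tables": {"load_fact_table"},
    "run_quality_checks": {"load_dimension_tables"},
}


def execution_order(dependencies):
    """Return one valid order in which the tasks could be run."""
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(DEPENDENCIES)
```

The two staging tasks have no dependencies, so either may come first; every other task appears only after its prerequisites. A real scheduler adds timing, retries, and monitoring on top of this ordering.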
The program will take roughly five months, assuming that students spend approximately 5 to 10 hours per week in the classroom. The total cost of the program is $999, which includes access to Udacity’s services. These include:
- Project reviews: Each time a student submits a project, a member of Udacity’s reviewer network provides personalized feedback on how to improve the project. If the project does not meet specifications, the student has the chance to improve and resubmit it. Our services model monitors student progress to make sure no student gets stuck.
- Access to a mentor: Udacity mentors are key to student success. They answer questions, review work, and give webinars to help students through the program.
- Student community: Students can connect with one another during the program to discuss the courses and projects, chat about job search strategies, or just network and support one another’s progress through the program.
- Career services: Through career services, students receive feedback on their LinkedIn profile, GitHub portfolio, and more, and learn useful tips for interviewing and landing a job.
Demand for data engineers has never been higher. The Udacity Data Engineer Nanodegree program’s combination of world-class curriculum and excellent services is the perfect path to join this exciting field. Registration is open today and the classroom opens for the first time on April 2. Enroll today!