Data Engineering

Course

Learn how to wrangle data on a massive scale! By the end of this course, you’ll be able to pull data from a wide range of sources, store it in a database, and create data pipelines (ETL, NLP, machine learning) that power real-world web applications.

  • Estimated time
    28 hours

  • Enroll by
    September 28, 2022

    Get access to classroom immediately on enrollment

  • Prerequisites
    Python, SQL, Statistics, Machine Learning
In collaboration with
  • Appen

What You Will Learn

  1. Data Engineering

    28 hours to complete

    For many companies, data scientists who can also tackle data-engineering problems are worth their weight in gold. In this course, you’ll learn how to unlock data silos, pulling data from multiple sources and pipelining it into usable forms for analysts and top-level decision makers. At the end, you’ll even build an impressive machine-learning-powered web application that has real-world, life-saving significance.

    1. ETL Pipelines

      Understand what ETL pipelines are, and access and combine data from CSV, JSON, logs, APIs, and databases.

    2. Natural Language Processing

      Prepare text data for analysis with tokenization, lemmatization, and stop-word removal. Use scikit-learn to transform and vectorize text data and build features with bag of words and tf-idf.

    3. Machine Learning Pipelines

      Understand the advantages of using machine learning pipelines to streamline the data preparation and modeling process. Use feature unions to perform steps in parallel and create more complex workflows. Complete a case study to build a full machine learning pipeline that prepares data and creates a model for a dataset.

    4. Course Project: Build Disaster Response Pipelines

      In this project, you’ll build a data pipeline to prepare the message data from major natural disasters around the world. You’ll build a machine learning pipeline to categorize emergency text messages based on the need communicated by the sender.
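
To give a rough feel for the kind of work the lessons above describe, here is a minimal sketch that combines records from a CSV source and a JSON source, tokenizes the messages, removes stop words, and builds bag-of-words and tf-idf features. It is not course material: the sample messages, the stop-word list, and every function name are hypothetical, and it uses only the Python standard library rather than scikit-learn. The tf-idf weighting shown is the plain textbook form (term frequency times log(N / document frequency)), which differs from scikit-learn's smoothed default, and lemmatization is left out.

```python
import csv
import io
import json
import math
import re
from collections import Counter

# Hypothetical sample data standing in for two sources: a CSV export and a JSON feed.
CSV_TEXT = "id,message\n1,We need water and food\n2,Roads are flooded near the bridge\n"
JSON_TEXT = '[{"id": 3, "message": "Water is needed at the shelter"}]'

# A tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"we", "and", "are", "is", "the", "at", "near", "a", "of", "to"}


def extract():
    """Extract step: read both sources and combine them into one list of records."""
    rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
    rows += json.loads(JSON_TEXT)
    # Transform step: normalize the id field to int so both sources agree.
    return [{"id": int(r["id"]), "message": r["message"]} for r in rows]


def tokenize(text):
    """Lowercase, split into alphabetic tokens, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def bag_of_words(documents):
    """Count token occurrences per document."""
    return [Counter(tokenize(doc)) for doc in documents]


def tf_idf(bags):
    """Weight each term by term frequency times log(N / document frequency)."""
    n_docs = len(bags)
    doc_freq = Counter(term for bag in bags for term in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append(
            {term: (count / total) * math.log(n_docs / doc_freq[term])
             for term, count in bag.items()}
        )
    return weights


records = extract()
bags = bag_of_words([r["message"] for r in records])
weights = tf_idf(bags)
```

In this sketch, a term that appears in fewer messages (such as "food") gets a higher weight than one shared across messages (such as "water"), which is the intuition behind tf-idf features.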

          Introducing new Udacity Single Courses

          Our students asked and we listened. You can now get the in-demand tech skills you need faster and for less money by enrolling in one of our new, one-month Single Courses. You’ll get the specific job-ready skills you need in as little as four weeks and for a fraction of the cost.

          Of course, if you are looking for a more robust, in-depth education, you can still enroll in one of our 3-6 month Nanodegree programs.

          Both programs are part-time and online, and they both offer 24/7 support, quality Udacity-produced content, courses created with the help of top tech companies, and more. You can always start with a Single Course and upgrade to a full Nanodegree program if you like.

          All our programs include:

          • Real-world projects from industry experts

            With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.

          • Technical mentor support

            Our knowledgeable mentors guide your learning and are focused on answering your questions, motivating you, and keeping you on track.

          • Workspaces

            Validate your understanding of the concepts you learn by checking the output and quality of your code in real time.

          • Flexible learning program

            Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.

          Course offerings

          • Class content

            • Real-world projects
            • Project reviews
            • Project feedback from experienced reviewers
          • Student services

            • Technical mentor support
            • Student community

          Succeed with personalized services.

          We provide services customized for your needs at every step of your learning journey to ensure your success.

          Get timely feedback on your projects.

          • Personalized feedback
          • Unlimited submissions and feedback loops
          • Practical tips and industry best practices
          • Additional suggested resources to improve
          • 1,400+ project reviewers
          • 2.7M projects reviewed
          • 88/100 reviewer rating
          • 1.1 hours average project review turnaround time

          Mentors available to answer your questions.

          • Support for all your technical questions
          • Questions answered quickly by our team of technical mentors
          • 1,400+ technical mentors
          • 0.85 hours median response time

          Learn with the best.

          • Juno Lee

            Curriculum Lead at Udacity

            Juno is the curriculum lead for the School of Data Science. She has been sharing her passion for data and teaching by building several courses at Udacity. As a data scientist, she built recommendation engines, computer vision and NLP models, and tools to analyze user behavior.

          • Andrew Paster

            Instructor

            Andrew has an engineering degree from Yale, and has used his data science skills to build a jewelry business from the ground up. He has additionally created courses for Udacity’s Self-Driving Car Engineer Nanodegree program.

          • Arpan Chakraborty

            Instructor

            Arpan is a computer scientist with a PhD from North Carolina State University. He teaches at Georgia Tech (within the Masters in Computer Science program), and is a coauthor of the book Practical Graph Mining with R.

          Data Engineering

          Get started today

          • Monthly access

            Pay as you go


            • Maximum flexibility to learn at your own pace.
            • Cancel anytime.
          • Learn

            How to pull data, store it, and build ETL, NLP and machine-learning data pipelines with Python.
          • Average Time

            On average, successful students take 28 hours to complete this program.
          • Benefits include

            • Real-world projects from industry experts
            • Technical mentor support

          Program Details

          • Do I need to apply? What are the admission criteria?
          • What are the prerequisites for enrollment?
          • How is this course structured?
          • How long is this course?
          • Can I switch my start date? Can I get a refund?
          • What software and versions will I need in this course?
