Establishing Data Infrastructure

Course two of three

Learn how to make strategic decisions for your organization's data pipeline technology as you hone specialized skills in Data Product Management.

Enrollments for individual courses are no longer available for this program. Visit our Data Product Manager page to learn more about enrolling in the complete Nanodegree program.

Enroll now and take all three courses!

What you will learn

  1. Establishing Data Infrastructure

    1 month to complete

    Data product managers need to ensure that their products have the appropriate data pipelines in place so that data collected from users can be extracted, transformed, and loaded into a data lake or warehouse for use in statistical analysis. Learn about data infrastructure components including data pipelines, data producers, data consumers, data storage, and data processing. Master the nuances of evaluating strategic decisions for data pipeline technology, including security and compliance, and create solutions for real-world data infrastructure problems.

    Prerequisite knowledge

    Prior Data Analysis & Product Management Experience Recommended

    1. Introduction to Data Pipelines

      Begin by understanding why data pipelines matter and what their components are, and learn how to organize those components to automate end-to-end data flow. Then, create conceptual data pipelines and frame classic data problems that data pipelines can address.

      • Data Consumers

        Learn about primary data consumers, their data needs, and how to identify data consumers in an organization and their relevant data use cases. Develop an understanding of the components of a relational data model and apply relational data models to business scenarios.

      • Data Producers

        Learn how to create event data models and implement them to get business insights, and use data collected from event models to calculate product KPIs. Identify primary data producers in an organization, distinguish among backend data producers (SaaS, ERPs, and data stores), and differentiate between types of data (structured vs. semi-structured vs. unstructured).

      • Data Strategy

        Understand the difference between ETL and ELT processes, distinguish between batch processing and stream processing, and learn to select the appropriate data processing components for a product based on data needs. Differentiate between a data warehouse and a data lake, and between SQL and NoSQL databases, and determine the appropriate data storage components for a product's data infrastructure based on data needs. Assess the capabilities of various data warehousing options (build vs. buy, cloud vs. on-prem, open source vs. proprietary, and insource vs. outsource) to make strategic decisions for data infrastructure, and evaluate data security and compliance product use cases (PII, PCI, HIPAA, GDPR, and CCPA).

      • Final Project: Build a Scalable Data Strategy for Flyber

        In this project, you will act as a data product manager for Flyber, a fictional flying-taxi service, and create a data strategy that not only handles the massive amount of incoming data but also processes it to gain business insights. First, you will define the data needs of primary business stakeholders within the organization and create a data model to ensure the data collected supports those needs. Then, you will perform the necessary extraction and transformation of the data so that it can answer business questions. Finally, you will interpret data visualizations to understand the scale of Flyber’s data growth and choose an appropriate data warehouse to enable that growth.
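
To make the ETL vs. ELT distinction from the Data Strategy lesson more concrete, here is a minimal Python sketch. It is not taken from the course materials. The event names, field names, the `completion_rate` KPI, and the use of an in-memory SQLite database as a stand-in for a warehouse are all illustrative assumptions.

```python
import sqlite3

# Hypothetical raw events; the schema and values are assumptions for illustration.
RAW_EVENTS = [
    {"user_id": 1, "event": "ride_requested", "ts": "2024-01-01T08:00:00"},
    {"user_id": 1, "event": "ride_completed", "ts": "2024-01-01T08:30:00"},
    {"user_id": 2, "event": "ride_requested", "ts": "2024-01-01T09:00:00"},
    {"user_id": 3, "event": "ride_requested", "ts": "2024-01-01T09:15:00"},
    {"user_id": 3, "event": "ride_completed", "ts": "2024-01-01T09:50:00"},
    {"user_id": 4, "event": "app_opened", "ts": "2024-01-01T10:00:00"},
]


def completion_rate(requested, completed):
    """Hypothetical KPI: share of requested rides that were completed."""
    return completed / requested if requested else 0.0


def etl_completion_rate(events):
    """ETL: transform in application code, then load only the result."""
    # Transform step happens before anything reaches the store.
    requested = sum(1 for e in events if e["event"] == "ride_requested")
    completed = sum(1 for e in events if e["event"] == "ride_completed")
    kpi = completion_rate(requested, completed)

    # Load step: the store only ever sees the aggregated KPI.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kpi (name TEXT, value REAL)")
    conn.execute("INSERT INTO kpi VALUES (?, ?)", ("completion_rate", kpi))
    value = conn.execute(
        "SELECT value FROM kpi WHERE name = 'completion_rate'"
    ).fetchone()[0]
    conn.close()
    return value


def elt_completion_rate(events):
    """ELT: load raw events first, then transform with SQL inside the store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_events (user_id INTEGER, event TEXT, ts TEXT)")
    # Load step: raw events are stored untouched.
    conn.executemany(
        "INSERT INTO raw_events VALUES (:user_id, :event, :ts)", events
    )
    # Transform step: the aggregation runs in SQL, after loading.
    requested, completed = conn.execute(
        "SELECT SUM(event = 'ride_requested'), SUM(event = 'ride_completed') "
        "FROM raw_events"
    ).fetchone()
    conn.close()
    # SUM over an empty table yields NULL (None), so fall back to 0.
    return completion_rate(requested or 0, completed or 0)


if __name__ == "__main__":
    print("ETL:", etl_completion_rate(RAW_EVENTS))
    print("ELT:", elt_completion_rate(RAW_EVENTS))
```

Both functions return the same KPI for the same input. The difference is where the transformation runs and what the store retains: the ELT variant keeps the raw events, so new questions can be answered later without re-extracting data, while the ETL variant stores only the precomputed result.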

            Learn with the best.

            • Vaishali Agarwal

              Vaishali has spent 12+ years in the tech ecosystem in roles spanning product management, product development, content writing, and coding. She is experienced in building platforms, high-performance startup divisions, and streamlined operations, and in managing customer expectations.

            All our programs include

            • Real-world projects from industry experts

              With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.

            • Real-time support

              On-demand help. Receive instant help with your learning directly in the classroom. Stay on track and get unstuck.

            • Career services

              You’ll have access to GitHub portfolio review and LinkedIn profile optimization to help you advance your career and land a high-paying role.

            • Flexible learning program

              Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.

            Program offerings

            • Class content

              • Real-world projects
              • Project reviews
              • Project feedback from experienced reviewers
            • Student services

              • Student community
              • Real-time support
            • Career services

              • GitHub review
              • LinkedIn profile optimization

            Succeed with personalized services.

            We provide services customized for your needs at every step of your learning journey to ensure your success.

            Get timely feedback on your projects.

            • Personalized feedback
            • Unlimited submissions and feedback loops
            • Practical tips and industry best practices
            • Additional suggested resources to improve
            • 1,400+ project reviewers
            • 2.7M projects reviewed
            • 88/100 reviewer rating
            • 1.1 hours avg project review turnaround time

            Program details

            Program overview: Why should I take this program?
            • Why should I enroll?

              Product Manager is a top-five job on LinkedIn's Most Promising Jobs list for 2019 and one of the most coveted roles in large tech enterprises as well as entrepreneurial startups. All products developed for today's market are data products, running on data-derived insights to provide the right experience, to the right user, at the right time. Companies like Amazon, Netflix, Google, and more are able to provide personalized and engaging experiences to users because they use data science, machine learning, and artificial intelligence to better meet user needs.

              In the Data Product Manager Nanodegree program, you will hone specialized skills in Product Management, a role with a starting base salary of $125,000, and be equipped to build products that leverage data to position customers and businesses to thrive. This program is designed for students who want to assume key leadership roles in data product development and strategy in their company.

              Leverage market data to amplify product development. Learn how to apply data science techniques, data engineering processes, and market experimentation tests to deliver customized product experiences. Begin by leveraging the power of SQL and Tableau to inform product strategy. Then, develop data pipelines and warehousing strategies that prepare data collected from a product for robust analysis. Finally, learn techniques for evaluating the data from live products, including how to design and execute various A/B and multivariate tests to shape the next iteration of a product.

            • How do I know if this program is right for me?

              This Nanodegree program is perfect for existing Product Managers, Data Science professionals, and Engineers who are already in data or product-focused roles and want to further their skillset, as well as those who wish to break into the data product domain and help build products that utilize data to provide better product experiences.

              In most digital products, data is used to enhance product lines, better meet customer needs, make products that customers actually want, or create a more personalized experience. Data collected from a product can be fed into machine learning algorithms and used to improve the overall user journey. If you want to build data-driven products backed by scalable data strategies to deliver the right experience to the right users, at the right time, then this Nanodegree program is right for you.

            • What jobs will this program prepare me for?

              This program will equip you with the skills to assume data product manager roles. You’ll learn directly from experienced Product Managers at Zendesk, Expedia, and DISQO, who have constructed this Nanodegree program to equip you with the most in-demand and relevant industry skills.

            • What is the difference between the Product Manager, the Growth Product Manager, the Data Product Manager, and the AI Product Manager Nanodegree programs?

              The Product Manager Nanodegree program will equip you with the foundational skills to assume entry-level product manager roles. You’ll learn directly from experienced Product Managers at Uber and Google, who have constructed this Nanodegree program to equip you with the most in-demand and relevant industry skills. This Nanodegree program teaches the core skill set required in all Product Manager roles, which is the foundation for more specialized roles like Growth Product Manager, Data Product Manager, AI Product Manager, and more.

              The AI Product Manager Nanodegree program is meant for product managers who are responsible for building and deploying AI products. The AI PM Nanodegree program is focused on the hands-on tasks of scoping a data set, training a model, and evaluating the performance of the model.

              The Growth Product Manager Nanodegree program is meant for experienced Product Managers who are looking to specialize their skills in product management and be equipped to fill growth-focused roles. You’ll learn how to grow the user base of your product, get customers engaged and activated as quickly as possible, and monetize your product to have it generate revenue.

              The Data Product Manager Nanodegree program is meant for experienced Product Managers who are looking to specialize their skills in product management and be equipped to fill data-focused roles in the development and strategy behind data products. You'll learn how to build an MVP launch strategy for a new service product that utilizes market insights extracted from extensive data analyses and visualizations, develop a data model with corresponding data pipelines and transformations to evaluate user activity of a product, and identify key behavioral and descriptive attributes of users to construct hypotheses for new product features and experiments to validate these hypotheses.

            Enrollment and admission
            • Do I need to apply? What are the admission criteria?

              There is no application. This Nanodegree program accepts everyone, regardless of experience and specific background.

            • What are the prerequisites for enrollment?

              No prior experience with data modeling or data engineering is required. However, a basic understanding of data terminology (e.g., big data, databases, and algorithms), some experience with data analysis (basic SQL and Tableau), and a general understanding of product management are helpful.

            • If I do not meet the requirements to enroll, what should I do?

              The following Nanodegree programs are not necessary to complete before starting this program, but could be helpful if you would like to prepare.

              You can check out the Product Manager Nanodegree program, SQL Nanodegree program, or the Programming for Data Science with Python Nanodegree program.

            Tuition and term of program
            • How is this Nanodegree program structured?

              The Data Product Manager Nanodegree program consists of content and curriculum to support three projects. Once you subscribe to a Nanodegree program, you will have access to the content and services for the length of time specified by your subscription. We estimate that students can complete the program in three months, working 10 hours per week.

              Each project will be reviewed by the Udacity reviewer network. Feedback will be provided, and if you do not pass the project, you will be asked to resubmit it until it passes.

            • How long is this Nanodegree program?

              You will have access to this Nanodegree program for as long as your subscription remains active. The estimated time to complete this program can be found on the webpage and in the syllabus, and is based on the average amount of time we project that it takes a student to complete the projects and coursework. See the Terms of Use and FAQs for other policies regarding the terms of access to our Nanodegree programs.

            • Can I switch my start date? Can I get a refund?

              Please see the Udacity Program FAQs for policies on enrollment in our programs.

            Software and hardware: What do I need for this program?
            • What software and versions will I need in this program?

              You will need to use SQL, Tableau, Google Slides or Microsoft PowerPoint, and Google Sheets or Microsoft Excel. You will also need internet access and a 64-bit computer that meets the requirements below.

              Minimum browser requirements are:

              • Chrome 49+
              • Firefox 57+
              • Safari 10.1+ (Apple - macOS)
              • Edge 14+ (Windows)
              Minimum operating system (OS) requirements are:
              • Windows 8.1 or later
              • Apple macOS 10.10 (Yosemite) or later
              • Any Linux OS that supports the browsers mentioned above
              • Any Chrome OS that supports the browsers mentioned above