Building a Reproducible Model Workflow


Learn to be more productive through ML projects that require reproducible workflow best practices.

  • Estimated time
    1 month

  • Enroll by
    May 31, 2023

    Get access to classroom immediately on enrollment

  • Skills acquired
    Model Testing, Model Evaluation, Exploratory Data Analysis

What You Will Learn

  1. Building a Reproducible Model Workflow

    1 month to complete

    Learn the fundamentals of MLOps and how to create a clean, organized, reproducible, end-to-end machine learning pipeline from scratch using MLflow. Clean and validate data using pytest, and track experiments, code, and results using GitHub and Weights & Biases. Plus, learn to select the best-performing model for production and deploy a model using MLflow.

    Prerequisite knowledge

    Intermediate Python, Jupyter Notebooks

    1. Machine Learning Pipeline

      Learn MLOps fundamentals and how to version data and artifacts. Write an ML pipeline component and link ML components together.

      • Data Exploration & Preparation

        Execute and track the Exploratory Data Analysis (EDA). Clean and pre-process the data and segregate (split) datasets.

        • Data Validation

          Use pytest with parameters for reproducible and automatic data tests. Perform deterministic and non-deterministic data tests.

          • Training, Validation & Experiment Tracking

            Tame the chaos with experiment, code, and data tracking. Track experiments with W&B. Validate and choose the best-performing model. Export the model as an inference artifact and test the final inference artifact.

            • Release & Deploy

              Release pipeline code and learn options for deployment and how to deploy a model.

              • Course Project: Build an ML Pipeline for Short-Term Rental Prices in NYC

                Write a machine learning pipeline to solve the following problem: A property management company rents rooms and properties in New York for short periods on various rental platforms. They need to estimate the typical price for a given property based on the price of similar properties. The company receives new data in bulk every week, so the model needs to be retrained at the same cadence, which calls for a reusable pipeline. Write an end-to-end pipeline covering data fetching, validation, segregation, training and validation, testing, and release. Run it on an initial data sample, then re-run it on a new data sample simulating a new data delivery.
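
To give a rough idea of what the Data Validation lesson above covers, here is a minimal sketch of parametrized data tests with pytest. It is not taken from the course materials: the sample rows, column names, thresholds, and the helpers `out_of_range` and `standardized_mean_shift` are all hypothetical, and the mean-shift check is only a simple stand-in for a proper statistical hypothesis test.

```python
import statistics

import pytest

# Hypothetical rental listings. A real pipeline would load these from a
# versioned data artifact rather than hard-coding them.
REFERENCE = [
    {"price": 120.0, "minimum_nights": 2},
    {"price": 85.0, "minimum_nights": 1},
    {"price": 240.0, "minimum_nights": 3},
    {"price": 60.0, "minimum_nights": 1},
]
NEW_BATCH = [
    {"price": 110.0, "minimum_nights": 2},
    {"price": 95.0, "minimum_nights": 1},
    {"price": 230.0, "minimum_nights": 4},
    {"price": 80.0, "minimum_nights": 1},
]


def out_of_range(rows, column, low, high):
    """Return the values of `column` that fall outside [low, high]."""
    return [r[column] for r in rows if not (low <= r[column] <= high)]


def standardized_mean_shift(reference, sample, column):
    """Absolute difference in means, in units of the reference standard deviation."""
    ref = [r[column] for r in reference]
    new = [r[column] for r in sample]
    return abs(statistics.mean(new) - statistics.mean(ref)) / statistics.stdev(ref)


# Deterministic test: checks a property with no uncertainty attached.
# Each tuple below becomes its own test case (assumed, illustrative bounds).
@pytest.mark.parametrize(
    "column, low, high",
    [("price", 10.0, 350.0), ("minimum_nights", 1, 30)],
)
def test_column_ranges(column, low, high):
    assert out_of_range(NEW_BATCH, column, low, high) == []


# Non-deterministic test: compares a statistic of the new batch against a
# reference sample, so the outcome depends on a chosen tolerance.
@pytest.mark.parametrize("column, max_shift", [("price", 0.5)])
def test_mean_is_stable(column, max_shift):
    assert standardized_mean_shift(REFERENCE, NEW_BATCH, column) < max_shift
```

Saved in a file such as `test_data.py` (a hypothetical name), these tests would run with the `pytest` command. Passing bounds and tolerances as parameters keeps the checks reusable when a new weekly data delivery arrives.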

              All Our Courses Include

              • Real-world projects from industry experts

                With real-world projects and immersive content built in partnership with top-tier companies, you’ll master the tech skills companies want.

              • Real-time support

                On-demand help. Receive instant help with your learning directly in the classroom. Stay on track and get unstuck.

              • Workspaces

                Validate your understanding of concepts learned by checking the output and quality of your code in real-time.

              • Flexible learning program

                Tailor a learning plan that fits your busy life. Learn at your own pace and reach your personal goals on the schedule that works best for you.

              Course offerings

              • Class content

                • Real-world projects
                • Project reviews
                • Project feedback from experienced reviewers
              • Student services

                • Student community
                • Real-time support

              Succeed with personalized services.

              We provide services customized for your needs at every step of your learning journey to ensure your success.

              Get timely feedback on your projects.

              • Personalized feedback
              • Unlimited submissions and feedback loops
              • Practical tips and industry best practices
              • Additional suggested resources to improve
              • 1,400+

                project reviewers

              • 2.7M

                projects reviewed

              • 88/100

                reviewer rating

              • 1.1 hours

                avg project review turnaround time

              Learn with the best.

              • Giacomo Vianello

                Principal Data Scientist

                Giacomo Vianello is an end-to-end data scientist with a passion for state-of-the-art but practical technical solutions. He is Principal Data Scientist at Cape Analytics, where he develops AI systems to extract intelligence from geospatial imagery, bringing cutting-edge AI solutions to the insurance and real estate industries.

                Program Details

                • Do I need to apply? What are the admission criteria?

                  No. This course accepts all applicants regardless of experience or specific background.

                • What are the prerequisites for enrollment?

                  To be successful in this program, learners should have intermediate Python skills and an understanding of Jupyter Notebooks.

                • How is this course structured?

                  This course consists of content and curriculum to support one project. We estimate that students can complete the program in one month.

                  The project will be reviewed by the Udacity reviewer network and platform. Feedback will be provided, and if you do not pass, you will be asked to resubmit the project until it passes.

                • How long is this course?

                  Access to this course runs for the length of time specified in the payment card above. If you do not graduate within that time period, you will continue learning with month-to-month payments. See the Terms of Use and FAQs for other policies regarding the terms of access to our programs.

                • Can I switch my start date? Can I get a refund?

                  Please see the Udacity Program Terms of Use and FAQs for policies on enrollment in our programs.

                • What software and versions will I need in this course?

                  Learners should have access to Python, PyTorch, Weights & Biases, Hydra, and MLflow.
