This page is a work in progress (it will be updated in January 2017) and does not contain the most up-to-date information. For a complete overview of the syllabus, please go here.
Thank you for signing up for this Nanodegree! For important program information refer to the Nanodegree Student Handbook, also available for download in your Nanodegree Portal. You're part of a cohort - a community of students who will work at about the same pace and interact in Udacity Discussions, our forum system. We look forward to working with you and hearing your feedback in the forum!
Please note: This page contains the older syllabus for the MLND program, for students who enrolled before August 2016. Students who enrolled after that date have one more project to complete: the Digit Recognition Project. To view the up-to-date syllabus, check your classroom page at classroom.udacity.com.
Need help getting started?
A Nanodegree is a new type of credential, designed to prepare you for a job. It is built with industry for you to master skills that employers truly seek in a Data Analyst. The Nanodegree program is project-based: you'll complete several projects, with guidance and code reviews from our Coaches, to learn and show off your skills. It offers a personalized learning roadmap: take only the courses you need to ace projects! We'll customize your path to be as efficient and effective as possible. See how it works.
You can download Supplemental Materials, Lesson Videos and Transcripts from Downloadables (bottom-right corner of the Classroom) or from the Dashboard (first option on the navigation bar on the left-hand side).
The prerequisites for this Nanodegree program are listed here.
To prepare for this program, you can utilize the courses in our Intro to Programming Nanodegree program, as well as Udacity's introductory statistics courses and the Linear Algebra Refresher Course.
These are some additional external resources:
This project is supported by material found in the Introduction to Machine Learning Foundations Lesson.
In this optional project, you will create decision functions that attempt to predict survival outcomes from the 1912 Titanic disaster based on each passenger’s features, such as sex and age. You will start with a simple algorithm and increase its complexity until you are able to accurately predict the outcomes for at least 80% of the passengers in the provided data. This project will introduce you to some of the concepts of machine learning as you start the Nanodegree program.
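As a rough illustration of what a hand-written decision function might look like, the Python sketch below predicts survival from sex and age and then scores the predictions. The rule thresholds, the function names, and the five toy records are assumptions made for this example; they are not taken from the project files or from the real passenger data.

```python
def predict_survival(passenger):
    """Hand-written decision rule: return 1 (survived) or 0 (did not survive)."""
    if passenger["sex"] == "female":
        return 1
    # Assumed refinement for illustration: young boys are predicted to survive.
    if passenger["age"] is not None and passenger["age"] < 10:
        return 1
    return 0


def accuracy(truth, predictions):
    """Fraction of predictions that match the true outcomes."""
    correct = sum(t == p for t, p in zip(truth, predictions))
    return correct / len(truth)


# Toy records invented for this sketch; NOT rows from the real dataset.
toy_passengers = [
    {"sex": "female", "age": 29, "survived": 1},
    {"sex": "male", "age": 40, "survived": 0},
    {"sex": "male", "age": 6, "survived": 1},
    {"sex": "female", "age": 50, "survived": 0},
    {"sex": "male", "age": None, "survived": 0},
]

predictions = [predict_survival(p) for p in toy_passengers]
truth = [p["survived"] for p in toy_passengers]
score = accuracy(truth, predictions)
print(f"accuracy on toy records: {score:.2f}")
```

Adding or adjusting one condition at a time and re-checking the accuracy mirrors the "start simple, then increase complexity" approach described above.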
This project is supported by material found in our Model Evaluation & Validation course. Depending on your background knowledge of topics such as cross validation, model training and testing, and basic statistics, you may not need to review the entire course prior to completing the project.
The Boston housing market is highly competitive, and you want to be the best real estate agent in the area. To compete with your peers, you decide to leverage a few basic machine learning concepts to assist you and a client with finding the best selling price for their home. Luckily, you’ve come across the Boston Housing dataset, which contains aggregated data on various features for houses in Greater Boston communities, including the median value of homes for each of those areas. Your task is to build an optimal model based on a statistical analysis with the tools available. This model will then be used to estimate the best selling price for your client’s home.
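The idea of training a model on one part of the data and validating it on the rest can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the single feature, the synthetic values, the every-fourth-row split, and the use of the coefficient of determination (R^2) as the metric are choices made for this example, not the project's prescribed method.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 is a perfect fit, 0 matches the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic data (hypothetical feature and target, not the Boston dataset):
# the target follows 3 * x + 10 plus a small deterministic disturbance.
feature = list(range(1, 21))
target = [3 * x + 10 + ((i * 7) % 5 - 2) for i, x in enumerate(feature)]

# Hold out every fourth row for testing; train on the remaining rows.
test_idx = [i for i in range(len(feature)) if i % 4 == 0]
train_idx = [i for i in range(len(feature)) if i % 4 != 0]

slope, intercept = fit_line([feature[i] for i in train_idx],
                            [target[i] for i in train_idx])
test_pred = [slope * feature[i] + intercept for i in test_idx]
test_r2 = r2_score([target[i] for i in test_idx], test_pred)
print(f"slope={slope:.2f} intercept={intercept:.2f} test R^2={test_r2:.3f}")
```

Scoring on rows the model never saw during fitting is what guards against an over-optimistic estimate of how well a predicted selling price will generalize.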
This project is supported by material found in our Supervised Learning course. Depending on your background knowledge of supervised learning techniques such as decision trees, support vector machines (SVM), and regression, you may not need to review the entire course prior to completing the project.
A local school district has a goal to reach a 95% graduation rate by the end of the decade by identifying students who need intervention before they drop out of school. As a software engineer contacted by the school district, your task is to model the factors that predict how likely a student is to pass their high school final exam, by constructing an intervention system that leverages supervised learning techniques. To save on the budget, the board of supervisors has asked that you find the most effective model that incurs the lowest computation cost. You will need to analyze the dataset on students' performance and develop a model that will predict the likelihood that a given student will pass, quantifying whether an intervention is necessary.
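Weighing effectiveness against computation cost could be sketched as follows: score each candidate model and time how long it takes to produce predictions. The F1 score as the effectiveness metric, the two stand-in "models", the `absences` feature, and the toy student records are all assumptions made for illustration; the project itself defines its own data and candidate classifiers.

```python
import time


def f1_score(y_true, y_pred, positive="yes"):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Two hypothetical stand-in models: a trivial baseline and a one-feature rule.
def always_pass(students):
    return ["yes" for _ in students]


def absence_rule(students):
    return ["yes" if s["absences"] < 10 else "no" for s in students]


# Toy records invented for this sketch; NOT the project's dataset.
toy_students = [
    {"absences": 2, "passed": "yes"},
    {"absences": 4, "passed": "yes"},
    {"absences": 15, "passed": "no"},
    {"absences": 0, "passed": "yes"},
    {"absences": 20, "passed": "no"},
    {"absences": 6, "passed": "no"},
]

y_true = [s["passed"] for s in toy_students]
results = {}
for model in (always_pass, absence_rule):
    y_pred, seconds = timed(model, toy_students)
    results[model.__name__] = (f1_score(y_true, y_pred), seconds)

for name, (f1, seconds) in results.items():
    print(f"{name}: F1={f1:.3f}, prediction time={seconds:.6f}s")
```

With real classifiers, the same comparison would also time the training step, since that is usually where most of the computation cost lies.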
This project is supported by material found in our Unsupervised Learning course. Depending on your background knowledge of unsupervised learning techniques such as clustering and principal component analysis (PCA), you may not need to review the entire course prior to completing the project.
A wholesale distributor recently tested a change to their delivery method for some customers, moving from a morning delivery service five days a week to a cheaper evening delivery service three days a week. Initial testing did not reveal any significant unsatisfactory results, so they implemented the cheaper option for all customers. Almost immediately, the distributor began getting complaints about the delivery service change, and customers were canceling deliveries, costing the distributor more money than was being saved. You’ve been hired by the wholesale distributor to find what types of customers they have, to help them make better, more informed business decisions in the future. Your task is to use unsupervised learning techniques to see if any similarities exist between customers, and how to best segment customers into distinct categories.
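To give a feel for what segmenting customers by similarity can mean in practice, here is a minimal k-means clustering sketch in plain Python. K-means is one possible clustering technique, chosen here as an assumption. The two spending features, the toy customer values, and the hand-picked starting centroids are likewise invented for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else old
                     for c, old in zip(clusters, centroids)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Toy customers: (spend on category A, spend on category B), made-up units.
toy_customers = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5),
                 (8.0, 9.0), (9.0, 8.0), (8.5, 8.5)]

# Assumed starting centroids: one customer from each visible group.
centroids, labels = kmeans(toy_customers, [toy_customers[0], toy_customers[3]])
print("centroids:", centroids)
print("labels:", labels)
```

Each resulting label is a candidate customer segment; whether the segments are meaningful still has to be judged against what is known about the business.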
This project is supported by material found in our Reinforcement Learning course. Depending on your background knowledge of reinforcement learning topics such as Markov decision processes (MDP) and game theory, you may not need to review the entire course prior to completing the project.
In the not-so-distant future, taxicab companies across the United States no longer employ human drivers to operate their fleet of vehicles. Instead, the taxicabs are operated by self-driving agents, known as smartcabs, that transport people from one location to another within the cities where those companies operate. In major metropolitan areas, such as Chicago, New York City, and San Francisco, an increasing number of people have come to rely on smartcabs to get where they need to go as safely and efficiently as possible. Although smartcabs have become the transport of choice, concerns have arisen that a self-driving agent might not be as safe or efficient as a human driver, particularly when considering city traffic lights and other vehicles. To alleviate these concerns, your task as an employee of a national taxicab company is to use reinforcement learning techniques to construct a demonstration of a smartcab operating in real time to prove that both safety and efficiency can be achieved.
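One common reinforcement learning technique that could drive such an agent is tabular Q-learning; the text above does not say which algorithm the project expects, so treat the following as a sketch under that assumption. The state encoding (light colour plus intended direction), the action list, the reward values, and the learning parameters are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical action set: None means "stay put".
ACTIONS = [None, "forward", "left", "right"]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Q-learning update: move Q(s, a) toward reward + gamma * max Q(s', .)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


q_table = defaultdict(float)           # unseen (state, action) pairs start at 0
rng = random.Random(0)                 # seeded for repeatable exploration

# Hypothetical states: (traffic light colour, direction toward destination).
green_ahead = ("green", "forward")
red_ahead = ("red", "forward")

# The agent drove forward on green and received an assumed reward of 1.
update(q_table, green_ahead, "forward", reward=1.0, next_state=red_ahead)

# With exploration switched off, the agent now prefers the rewarded action.
best = choose_action(q_table, green_ahead, epsilon=0.0, rng=rng)
print("learned value:", q_table[(green_ahead, "forward")], "best action:", best)
```

Repeating this choose, act, update loop over many simulated trips is what lets the value table gradually encode both safe and efficient behaviour.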
In this capstone project, you will leverage what you’ve learned throughout the Nanodegree program to solve a problem of your choice by applying machine learning algorithms and techniques. You will first define the problem you want to solve and investigate potential solutions and performance metrics. Next, you will analyze the problem through visualizations and data exploration to have a better understanding of what algorithms and features are appropriate for solving it. You will then implement your algorithms and metrics of choice, documenting the preprocessing, refinement, and postprocessing steps along the way. Afterwards, you will collect results about the performance of the models used, visualize significant quantities, and validate/justify these values. Finally, you will construct conclusions about your results, and discuss whether your implementation adequately solves the problem.