Machine Learning: Supervised Learning

Thank you for signing up for the course! We look forward to working with you and hearing your feedback in our forums.

Course Resources

Reading Materials

Suggested Texts

  1. Tom Mitchell, Machine Learning. McGraw-Hill, 1997.
  2. Ethem Alpaydın, Introduction to Machine Learning. Second Edition.

Optional Texts

  1. Larry Wasserman, All of Statistics. Springer, 2010.
  2. Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
  3. Trevor Hastie, Robert Tibshirani and Jerome Friedman, The Elements of Statistical Learning. Springer, 2009.

Reading List

Coding Resources


  1. WEKA: Machine learning software in Java that you can use for your projects.
  2. Data Mining with Weka: An online course (MOOC).
  3. ABAGAIL: Machine learning software in Java. This is hosted on my GitHub, so you can contribute too.
  4. scikit-learn: A popular Python library for supervised and unsupervised learning algorithms.
  5. MATLAB NN Toolbox: Supports supervised learning with feedforward, radial basis, and dynamic networks, and unsupervised learning with self-organizing maps and competitive layers.
  6. Murphy's MDP Toolbox for MATLAB
  7. MATLAB Clustering Package, by Frank Dellaert
  8. ICA Example

Datasets


  1. UCI Machine Learning Repository: An online repository of data sets that can be used for machine learning experiments.
  2. Stanford Large Network Dataset: A collection of large social and information networks.
  3. Vision Benchmark Suite: Autonomous car dataset.
  4. Other datasets

Downloadable Materials

You can download Supplemental Materials, Lesson Videos, and Transcripts from Downloadables (bottom-right corner of the Classroom) or from the Dashboard (first option on the navigation bar on the left-hand side).

Course Syllabus

Lesson 1: Decision Trees

Lesson 1 Slides

  • Classification and Regression overview
  • Classification learning
  • Example: Dating
  • Representation
  • Decision trees learning
  • Decision tree expressiveness
  • ID3 algorithm
  • ID3 bias
  • Decision trees and continuous attributes

Lesson 2: Regression & Classification

Lesson 2 Slides

  • Regression and function approximation
  • Linear regression and best fit
  • Order of polynomial
  • Polynomial regression
  • Cross validation

Lesson 3: Neural Networks

Lesson 3 Slides

  • Artificial neural networks
  • Perceptron units
  • XOR as perceptron network
  • Perceptron training
  • Gradient descent
  • Comparison of learning rules
  • Sigmoid function
  • Optimizing weights
  • Restriction bias
  • Preference bias

Lesson 4: Instance Based Learning

Lesson 4 Slides

Problem Set 1

  • Instance based learning before
  • Instance based learning now
  • K-NN algorithm
  • Won’t you compute my neighbors?
  • Domain K-NNowledge
  • K-NN bias
  • Curse of dimensionality

Lesson 5: Ensemble B&B

Lesson 5 Slides

  • Ensemble learning: Boosting
  • Ensemble learning algorithm
  • Ensemble learning outputs
  • Weak learning
  • Boosting in code
  • When D agrees

Lesson 6: Kernel Methods & SVMs

Lesson 6 Slides

  • Support Vector Machines
  • Optimal separator
  • SVMs: Linearly married
  • Kernel methods

Lesson 7: Comp Learning Theory

Lesson 7 Slides

  • Computational Learning Theory
  • Learning theory
  • Resources in Machine Learning
  • Defining inductive learning
  • Teacher with constrained queries
  • Learner with constrained queries
  • Learner with mistake bounds
  • Version spaces
  • PAC learning
  • Epsilon exhausted
  • Haussler theorem

Lesson 8: VC Dimensions

Lesson 8 Slides

  • Infinite hypothesis spaces
  • Power of a hypothesis space
  • What does VC stand for?
  • Interval training
  • Linear separators
  • The ring
  • Polygons
  • Sampling complexity
  • VC of finite H

Lesson 9: Bayesian Learning

Lesson 9 Slides

  • Bayes Rule
  • Bayesian learning
  • Bayesian learning in action!
  • Noisy data
  • Best hypothesis
  • Minimum description length
  • Bayesian classification

Lesson 10: Bayesian Inference

Lesson 10 Slides

Problem Set 2

  • Joint distribution
  • Adding attributes
  • Conditional independence
  • Belief networks
  • Sampling from the joint distribution
  • Recovering the joint distribution
  • Inferencing rules
  • Naïve Bayes
  • Why Naïve Bayes is cool

Final Project: Predict Boston Housing Prices

Follow this link to access the final project.