Machine Learning: Unsupervised Learning

Free Course

Conversations on Analyzing Data

In collaboration with
  • Georgia Institute of Technology

About this course

This is the second course in the 3-course Machine Learning Series and is offered at Georgia Tech as CS7641. Taking this class here does not earn Georgia Tech credit.

Ever wonder how Netflix can predict what movies you'll like? Or how Amazon knows what you want to buy before you do? The answer can be found in Unsupervised Learning!

Closely related to pattern recognition, Unsupervised Learning is about analyzing data and looking for patterns. It is an extremely powerful tool for identifying structure in data. This course focuses on how you can use Unsupervised Learning approaches -- including randomized optimization, clustering, and feature selection and transformation -- to find structure in unlabeled data.
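To make "finding structure in unlabeled data" concrete, here is a minimal sketch of k-means clustering (one of the clustering methods listed in the syllabus) in plain Python. The toy points, the function names `nearest` and `kmeans`, and the choice of `k = 2` are illustrative assumptions, not course material.

```python
import random


def nearest(p, centers):
    """Index of the center closest to point p (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2,
    )


def kmeans(points, k, iters=50, seed=0):
    """Hypothetical helper: k-means (Lloyd's algorithm) on a list of (x, y) tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(centers[i])
        if new_centers == centers:  # converged: nothing moved
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels


# Toy data (an assumption for illustration): two well-separated groups,
# with no labels supplied to the algorithm.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = kmeans(points, k=2)
print(sorted(centers), labels)
```

The algorithm is never told which points belong together; the grouping emerges from distances alone. On real data the result can depend on the starting centers, which is why the sketch fixes a seed.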

Series Information: Machine Learning is a graduate-level series of 3 courses, covering the area of Artificial Intelligence concerned with computer programs that modify and improve their performance through experience.

The entire series is taught as an engaging dialogue between two eminent Machine Learning professors and friends: Professor Charles Isbell (Georgia Tech) and Professor Michael Littman (Brown University).

What you will learn

  1. Randomized optimization
    • Optimization, randomized
    • Hill climbing
    • Random restart hill climbing
    • Simulated annealing
    • Annealing algorithm
    • Properties of simulated annealing
    • Genetic algorithms
    • GA skeleton
    • Crossover example
    • What have we learned
    • MIMIC
    • MIMIC: A probability model
    • MIMIC: Pseudo code
    • MIMIC: Estimating distributions
    • Finding dependency trees
    • Probability distribution
  2. Clustering
    • Clustering and expectation maximization
    • Basic clustering problem
    • Single linkage clustering (SLC)
    • Running time of SLC
    • Issues with SLC
    • K-means clustering
    • K-means in Euclidean space
    • K-means as optimization
    • Soft clustering
    • Maximum likelihood Gaussian
    • Expectation Maximization (EM)
    • Impossibility theorem
  3. Feature Selection
    • Algorithms
    • Filtering and Wrapping
    • Speed
    • Searching
    • Relevance
    • Relevance vs. Usefulness
  4. Feature Transformation
    • Feature Transformation
    • Words like Tesla
    • Principal Components Analysis
    • Independent Components Analysis
    • Cocktail Party Problem
    • Matrix
    • Alternatives
  5. Information Theory
    • History - Sending a Message
    • Expected size of the message
    • Information between two variables
    • Mutual information
    • Two Independent Coins
    • Two Dependent Coins
    • Kullback-Leibler Divergence
  6. Unsupervised Learning Project
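
As a small taste of the information theory topics in item 5, the sketch below computes the mutual information between two coins from their joint distribution. The function name `mutual_information` and the two example distributions are assumptions chosen for illustration: the "independent" case uses two fair coins flipped separately, and the "dependent" case assumes the second coin always matches the first. The course's own coin examples may be set up differently.

```python
from math import log2


def mutual_information(joint):
    """I(X;Y) in bits, given a dict mapping (x, y) outcomes to probabilities."""
    # Marginal distributions p(x) and p(y), obtained by summing the joint.
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # I(X;Y) = sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ),
    # skipping zero-probability outcomes.
    return sum(
        p * log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Two fair coins flipped independently: every pair has probability 1/4.
independent = {(a, b): 0.25 for a in "HT" for b in "HT"}

# Assumed fully dependent case: the second coin always shows the same face.
dependent = {("H", "H"): 0.5, ("T", "T"): 0.5}

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
print(mi_independent, mi_dependent)
```

For the independent coins the result is 0 bits, since knowing one coin says nothing about the other. For the fully dependent pair it is 1 bit, since one coin completely determines the other.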

Prerequisites and requirements

This class assumes programming experience: you will be expected to work with Python libraries such as NumPy and scikit-learn. A good grasp of probability and statistics is also required. Udacity's Intro to Statistics, especially Lessons 8, 9 and 10, may be a useful refresher.

An introductory course like Udacity's Introduction to Artificial Intelligence also provides helpful background for this course.

See the Technology Requirements for using Udacity.

Why take this course?

You will learn about and practice a variety of Unsupervised Learning approaches, including randomized optimization, clustering, feature selection and transformation, and information theory.

You will learn important Machine Learning methods, techniques and best practices, and gain experience implementing them through a hands-on final project in which you will design a movie recommendation system (just like Netflix!).

Learn with the best.

  • Charles Isbell
  • Michael Littman
  • Pushkar Kolhe