About this Course

This course provides an introduction to computer vision including fundamentals of image formation, camera imaging geometry, feature detection and matching, multiview geometry including stereo, motion estimation and tracking, and classification. We’ll develop basic methods for applications that include finding known models in images, depth recovery from stereo, camera calibration, image stabilization, automated alignment (e.g. panoramas), tracking, and action recognition. We focus less on the machine learning aspect of CV as that is really classification theory best learned in an ML course.

The focus of the course is to develop the intuitions and mathematics of the methods in lecture, and then to learn about the difference between theory and practice in the problem sets. All algorithms work perfectly in the slides. But remember what Yogi Berra said: "In theory there is no difference between theory and practice. In practice there is." (Einstein said something similar, but who knows more about real life?) In this course you do not, for the most part, apply high-level library functions; instead you use low- to mid-level algorithms to analyze images and extract structural information.
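To give a rough sense of what "low- to mid-level" can mean, here is a minimal sketch of smoothing an image with a box filter written directly in NumPy rather than through a single library call. It is an illustration only, not taken from the course problem sets; the function name `correlate2d_valid` and the tiny test image are hypothetical choices made for this example.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Cross-correlate a 2D grayscale image with a 2D kernel.

    Only the 'valid' region is returned, i.e. positions where the kernel
    fits entirely inside the image, so the output is smaller than the input.
    (Hypothetical helper written for this illustration.)
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("kernel must not be larger than the image")

    out = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out


# 3x3 box filter: each output pixel is the mean of its 3x3 neighborhood.
box = np.full((3, 3), 1.0 / 9.0)
image = np.arange(25, dtype=float).reshape(5, 5)
smoothed = correlate2d_valid(image, box)
```

Writing the loop over kernel entries by hand makes the boundary handling explicit, which is the kind of detail that tends to separate the slides from a working implementation.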

Course Cost
Approx. 4 months
Skill Level
Included in Course
  • Rich Learning Content
  • Interactive Quizzes
  • Taught by Industry Pros
  • Self-Paced Learning
  • Student Support Community

Join the Path to Greatness

This free course is your first step towards a new career with the Deep Learning Foundations Nanodegree Program.

Free Course

Introduction to Computer Vision

by Georgia Institute of Technology

Enhance your skill set and boost your hirability through innovative, independent learning.


Course Leads

  • Aaron Bobick
  • Irfan Essa
  • Arpan Chakraborty

What You Will Learn

A brief outline of units is given below, grouped into 10 parts:

1 Introduction

  • 1A Introduction

2 Image Processing for Computer Vision

  • 2A Linear image processing
  • 2B Model fitting
  • 2C Frequency domain analysis

3 Camera Models and Views

  • 3A Camera models
  • 3B Stereo geometry
  • 3C Camera calibration
  • 3D Multiple views

4 Image Features

  • 4A Feature detection
  • 4B Feature descriptors
  • 4C Model fitting

5 Lighting

  • 5A Photometry
  • 5B Lightness
  • 5C Shape from shading

6 Image Motion

  • 6A Overview
  • 6B Optical flow

7 Tracking

  • 7A Introduction to tracking
  • 7B Parametric models
  • 7C Non-parametric models
  • 7D Tracking considerations

8 Classification and Recognition

  • 8A Introduction to recognition
  • 8B Classification: Generative models
  • 8C Classification: Discriminative models
  • 8D Action recognition

9 Useful Methods

  • 9A Color spaces and segmentation
  • 9B Binary morphology
  • 9C 3D perception

10 Human Visual System

  • 10A The retina
  • 10B Vision in the brain

GT OMSCS Students

Note: Please refer to your course website/schedule for further details, assignments, etc.

Spring 2015 resources (old):

  • Schedule: Suggested pace, assignments, deadlines, references.
  • Course website: Course information, problem sets, academic policies, grading scheme.
  • Piazza forum: Discussions, announcements, clarifications.
  • T-Square site: Problem set submissions.

Note: This course was previously offered as CS 4495.

Prerequisites and Requirements

  • Data structures: You'll be writing code that builds representations of images, features, and geometric constructions.
  • A good working knowledge of Matlab and/or Python with NumPy. The lecture videos use Matlab for occasional demonstrations because the instructor is too old to change. Problem sets will be done in Matlab or Python. As mentioned in the resources note below, you can use either Matlab or its open-source alternative, Octave.
  • This course has more math than many CS courses: Linear algebra, vector calculus, and linear algebra (that is not a typo).
  • No prior knowledge of vision is assumed, though any experience with signal processing is helpful.

See the Technology Requirements for using Udacity.

Why Take This Course

Images have become ubiquitous in computing. Sometimes we forget that images often capture the light reflected from a physical scene. This course gives you both insight into the fundamentals of image formation and analysis and the ability to extract information well above the pixel level. These skills are useful for anyone interested in operating on images in a context-aware manner, or wherever images from multiple scenarios need to be combined or organized in an appropriate way.

What do I get?
  • Instructor videos
  • Learn-by-doing exercises
  • Taught by industry professionals

