Advanced Computer Vision and Deep Learning

Course

Learn to apply deep learning architectures to computer vision tasks. Discover how to combine CNNs and RNNs to build an automatic image captioning application.

Advanced

1 month

Real-world Projects

Completion Certificate

Last Updated January 3, 2024

Skills you'll learn:
Recurrent neural networks • Attention mechanisms • Long short-term memory networks • YOLO algorithm
Prerequisites:
Intermediate Python • Neural network basics • Basic probability

Course Lessons

Lesson 1

Advanced CNN Architectures

Learn about advances in CNN architectures and see how region-based CNNs, like Faster R-CNN, have allowed for fast, localized object recognition in images.

Lesson 2

YOLO

Learn about the YOLO (You Only Look Once) multi-object detection model and work with a YOLO implementation.

Lesson 3

RNNs

Explore how memory can be incorporated into a deep learning model using recurrent neural networks (RNNs). Learn how RNNs can learn from and generate ordered sequences of data.

Lesson 4

Long Short-Term Memory Networks (LSTMs)

Luis explains Long Short-Term Memory networks (LSTMs) and similar architectures, which have the benefit of preserving long-term memory.

Lesson 5

Hyperparameters

Learn about a number of different hyperparameters that are used in defining and training deep learning models. We'll discuss starting values and intuitions for tuning each hyperparameter.

Lesson 6

Optional: Attention Mechanisms

Attention is one of the most important recent innovations in deep learning. In this section, you'll learn how attention models work and go over a basic code implementation.
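As a rough illustration of the kind of basic implementation this lesson refers to, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the course's own code; the function names `softmax` and `dot_product_attention` are hypothetical, and scaled dot-product scoring is assumed here as one common choice among several.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return (context, weights) for one query vector.

    query:  list of floats, length d
    keys:   list of key vectors, each of length d
    values: list of value vectors, one per key
    """
    scale = math.sqrt(len(query))
    # Score each key by its scaled dot product with the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of the values.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


if __name__ == "__main__":
    context, weights = dot_product_attention(
        [1.0, 0.0],
        [[1.0, 0.0], [0.0, 1.0]],
        [[10.0, 0.0], [0.0, 10.0]],
    )
    print(weights, context)
```

In this toy call the query matches the first key more closely, so the first value receives the larger weight and dominates the context vector.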

Lesson 7

Image Captioning

Learn how to combine CNNs and RNNs to build a complex, automatic image captioning model.

Lesson 8 • Project

Project: Image Captioning

Train a CNN-RNN model to predict captions for a given image. Your main task will be to implement an effective RNN decoder for a CNN encoder.

Taught By The Best

Cezanne Camacho

Curriculum Lead

Cezanne is an expert in computer vision with a Master's in Electrical Engineering from Stanford University. As a former researcher in genomics and biomedical imaging, she's applied computer vision and deep learning to medical diagnostic applications.

Luis Serrano

Instructor

Luis was formerly a Machine Learning Engineer at Google. He holds a PhD in mathematics from the University of Michigan and held a postdoctoral fellowship at the University of Quebec at Montreal.

Jay Alammar

Instructor

Jay is a software engineer, the founder of Qaym (an Arabic-language review site), and the Investment Principal at STV, a $500 million venture capital fund focused on high-technology startups.

Ortal Arel

Curriculum Lead

Ortal Arel has a PhD in Computer Engineering, and has been a professor and researcher in the field of applied cryptography. She has worked on design and analysis of intelligent algorithms for high-speed custom digital architectures.

Kelvin Lwin

AI | Knowledge Architect

Kelvin taught highly technical CS and AI/DL subjects in US academia and industry for a decade. He then spent 3 years building full-stack AI in China to gain a broader global perspective. He now combines AI, empathy, and ethics, informed by his 18 years of meditation, to build new educational AI for all.

The Udacity Difference

Combine technology training for employees with industry experts, mentors, and projects, for critical thinking that pushes innovation. Our proven upskilling system goes after success—relentlessly.

Demonstrate proficiency with practical projects

Projects are based on real-world scenarios and challenges, allowing you to apply the skills you learn to practical situations, while giving you real hands-on experience.

  • Gain proven experience

  • Retain knowledge longer

  • Apply new skills immediately

Top-tier services to ensure learner success

Reviewers provide timely and constructive feedback on your project submissions, highlighting areas of improvement and offering practical tips to enhance your work.

  • Get help from subject matter experts

  • Learn industry best practices

  • Gain valuable insights and improve your skills

Unlock access to Advanced Computer Vision and Deep Learning and the rest of our best-in-class catalog

  • Unlimited access to our top-rated courses

  • Real-world projects

  • Personalized project reviews

  • Program certificates

  • Proven career outcomes

Full Catalog Access

One subscription opens up this course and our entire catalog of projects and skills.

Month-To-Month

4 Months

Average time to complete a Nanodegree program

*Discount applies to the first 4 months of membership, after which plans are converted to month-to-month.

Get Started Today

Advanced Computer Vision and Deep Learning

Month-To-Month



4 Months

Average time to complete a Nanodegree program

  • All the same great benefits in our month-to-month plan
  • Most cost-effective way to acquire a new set of skills