LLM Inference Optimization

This course surveys techniques for improving the performance of large language models (LLMs) during inference. It begins with the principles of LLM inference optimization, focusing on the transformer architecture and the optimization strategies that apply to it. Participants then explore advanced methods, including quantization and speculative decoding, to reduce model complexity and improve execution speed. The course also covers model parallelism and sharding techniques for deploying models in real-world applications. Finally, learners complete a project on accelerating news headline generation, applying the optimization techniques covered in the course.

  • Course
  • Intermediate
  • 11 hours
  • Updated: Feb 24, 2026

Subscription · Monthly

  • Cancel Anytime
  • Unlimited access to hundreds of top-rated courses
  • Hands-on projects with expert feedback
  • Personalized career coaching and interview prep
  • Program Certificates

Skills you'll learn

2 skills

  • Model deployment
  • LLM Inference Optimization

Prerequisites

4 prerequisites

Prior to enrolling, you should have the following knowledge:

  • Model evaluation
  • Deep learning model optimization
  • PyTorch
  • AI and ML applications

You will also need to be able to communicate fluently and professionally in written and spoken English.

Course Outline

  • 5 lessons
  • 1 project

Program Instructors

1 instructor

Unlike typical professors, our instructors come from Fortune 500 and Global 2000 companies and have demonstrated leadership and expertise in their professions:

Rishabh Misra

Staff Machine Learning Engineer

About this program

Master LLM inference optimization with transformer tweaks, model parallelism, and sharding using DeepSpeed, TensorRT-LLM, Triton Inference Server, and more.

© 2011-2026 Udacity, Inc. "Nanodegree" is a registered trademark of Udacity.