
We talk with Dan Romuald Mbanga, Global Lead of Business Development for Amazon AI, about teaching students to use SageMaker for training and deploying deep learning models.


Deep Learning is one of the most exciting technology fields in the world today. Because Udacity’s learning platform is built for maximum adaptability, our Deep Learning Nanodegree program is one of our most dynamic and future-facing programs, and we continue to respond to advances in the field by augmenting and enhancing our curriculum.

We are very excited to share details about the latest additions to our program curriculum, which include new content and projects focused on PyTorch and SageMaker. In a recent post by Cezanne Camacho, Curriculum Lead for Udacity’s School of Artificial Intelligence, we discussed new PyTorch content, and today, we’re going to explore how we’ll be teaching students to use SageMaker for training and deploying deep learning models.

To integrate the incredible new content, we teamed up with AWS and the SageMaker team, and in the updated program, students will train and deploy a sentiment analysis model on SageMaker, then connect it to a front end through an API using other AWS services. After deploying a model, students will also learn how to update their model to account for changes in the underlying data used to train their model—an especially valuable skill in industries that continuously collect user data.

To provide a closer look into the world of SageMaker, we spoke recently with Dan Romuald Mbanga, Global Lead of Business Development for Amazon AI, and a leader of business and technical initiatives for Amazon AI platforms.

Dan, thank you so much for taking the time to speak with us, and to share your insights with our community. Let’s begin at the beginning—for those not yet familiar with SageMaker, can you give us a high-level summary of the platform?

Amazon SageMaker is a fully managed Machine Learning (ML) service for ML practitioners at all levels of skill and interest, helping them get things done rapidly. SageMaker covers the entire machine learning workflow: label and prepare data, choose an algorithm, train it, tune and optimize it for deployment, make predictions, and take action. ML models get to production securely and faster, with much less effort and at lower cost.

What was your driving motivation to create SageMaker in the first place, and how did you proceed initially?

“To build SageMaker, we worked backwards from the needs of our customers, external and internal.”

Internally, Amazon has thousands of ML scientists and engineers building, deploying, and securing ML models at scale, every day. These teams are spread around the world, and if you think about what to do to make their work efficient from a tooling perspective, you get closer to our mindset when we built SageMaker.

Similarly, our external customers have asked us for managed ML tools that can enable and accelerate their data science practices or adoption. Our intent was to make machine learning at scale vanilla, easy, and almost common sense. SageMaker understands the ML process and delivers on requirements for data labeling, training algorithms at scale, accessible state-of-the-art built-in algorithms, accessible deep learning frameworks, accessible reinforcement learning training and simulation environments, automatic model tuning, collaboration and sharing, experiment management, integration with workflow management services, and so forth.

Where does SageMaker Neo fit into this picture?

SageMaker also provides a deep-learning model compiler, aka “SageMaker Neo.” Neo lets developers and data scientists train models once and deploy them anywhere, in the cloud or at the edge, with up to 2X performance improvement at 1/10th the size of the original framework. This is very useful for the low-power, low-memory devices commonly used in the Internet of Things. Neo will compile and optimize machine learning models, with no loss in accuracy, for target hardware platforms from Intel, Nvidia, Arm, Cadence, Qualcomm, and Xilinx.

So, SageMaker is really meant for a wide user base?

It’s designed to give advanced researchers the level of control and flexibility they want, while making ML approachable and reproducible for entry-level ML practitioners or non-experts.

Security was a key factor for the team as well, is that right?

Yes, SageMaker was built with security as job zero. It’s designed to provide access control with virtual private clouds and native integration with Amazon’s IAM—identity and access management—as well as end-to-end encryption support for data at rest and in transit. SageMaker also delivers metrics and metadata for efficient experiment management, team management, audit, and compliance. Its strong built-in security is what makes SageMaker the cloud ML platform of choice for many regulated companies, like Moody’s in financial services or Celgene in healthcare, as well as many enterprises and startups that care about things beyond training models: ML team building and management, access control, encryption, and more.

What are some of the key challenges with existing options that SageMaker now solves for?

It all starts with the data, and as you know, data labeling is typically the most time-consuming part of building ML models on custom data, not to mention being costly. Amazon SageMaker Ground Truth helps here by offering easy access to public and private human labelers. Additionally, Ground Truth will kick off a private active-learning model, learning from human labels to make high-quality automatic annotations that lower labeling costs by up to 70%.
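To make the active-learning idea concrete, here is a toy sketch in plain Python—not Ground Truth’s actual algorithm—of uncertainty sampling: examples the model is unsure about are routed to human labelers, while confident predictions are auto-labeled. The scores and threshold are made up for illustration.

```python
# Toy illustration of active learning via uncertainty sampling.
# Only examples the model is least confident about go to humans;
# this is a simplified sketch, not Ground Truth's actual algorithm.

def model_confidence(example):
    # Stand-in for a model's predicted probability of the positive class.
    # Here we just read a precomputed score attached to each example.
    return example["score"]

def route_for_labeling(examples, threshold=0.2):
    """Split a batch: uncertain examples go to human labelers,
    confident ones are auto-labeled by the model."""
    to_humans, auto_labeled = [], []
    for ex in examples:
        p = model_confidence(ex)
        if abs(p - 0.5) < threshold:          # near 0.5 = model is unsure
            to_humans.append(ex)
        else:
            auto_labeled.append({**ex, "label": int(p >= 0.5)})
    return to_humans, auto_labeled

batch = [
    {"id": 1, "score": 0.95},  # confident positive -> auto-labeled 1
    {"id": 2, "score": 0.55},  # uncertain -> sent to humans
    {"id": 3, "score": 0.05},  # confident negative -> auto-labeled 0
]
humans, auto = route_for_labeling(batch)
print([ex["id"] for ex in humans])               # [2]
print([(ex["id"], ex["label"]) for ex in auto])  # [(1, 1), (3, 0)]
```

The cost saving comes from the fact that, over time, fewer and fewer examples fall inside the uncertainty band and need a human.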

Next, AI developers need a work environment to build ML models. Jupyter Notebooks are very commonly used for that. With a few clicks, any practitioner can get managed EC2 machines with Jupyter Notebooks up and running, pre-configured with the most popular data science tools like Scikit-learn, Pandas, TensorFlow, MXNet, PyTorch, and more. With the Notebook instances, SageMaker facilitates integration with Git repositories (publicly or privately managed), enabling seamless collaboration and sharing among ML developers.

The next thing SageMaker provides is an API-driven model training service. With a few commands, aided by the publicly available Amazon SageMaker Python SDK, practitioners can summon a number of machines of their choice in order to launch a distributed ML training job. This service is peculiar in the sense that it knows the ML training process, and is built to act accordingly. For example, if you are a PyTorch developer, launching a PyTorch training job will make SageMaker kick off an environment that comes pre-configured with PyTorch, with an understanding of how to distribute PyTorch jobs, how to download and distribute data efficiently from cloud storage, and how to aggregate metadata, metrics, and logs in order to provide ML developers the same level of insight they’d have if they were running the job on a machine they fully control. SageMaker removes the undifferentiated heavy lifting in the context of the ML framework used by ML developers, ensuring they don’t have to deal with the details while maintaining access to state-of-the-art tools. This training environment is ephemeral and only lasts for the duration of the training job, i.e. you only pay for what you use.
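For readers who haven’t seen the SDK, a minimal training-job configuration looks roughly like this. The S3 bucket, IAM role, `train.py` script, and hyperparameter values are placeholders, exact parameter names and framework/Python version pairings vary between SDK releases, and running it requires AWS credentials:

```python
# Sketch of launching a distributed PyTorch training job with the
# SageMaker Python SDK. Paths, role ARN, and train.py are placeholders.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",           # your training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
    instance_count=2,                 # distribute across two machines
    instance_type="ml.p3.2xlarge",    # GPU instances
    framework_version="1.13",
    py_version="py39",
    hyperparameters={"epochs": 10, "lr": 0.001},
)

# SageMaker provisions the machines, pulls the data from S3, runs the
# script on each instance, then tears the cluster down when training ends.
estimator.fit({"training": "s3://my-bucket/train-data"})  # placeholder bucket
```

The ephemeral, pay-per-job billing Dan describes falls out of this design: the cluster exists only between the `fit()` call and the job’s completion.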

We’ve heard you use the term “algorithms as a service” before, can you tell our readers a bit more about what that means?

OK, so, a model training environment is typically two things: an algorithm, and a framework in which that algorithm is developed. For example, you might want to implement the Word2Vec algorithm in order to transform words into vectors that you’d use for higher-level tasks such as document classification or user review classification. In SageMaker, we went beyond providing frameworks and rewrote the most common ML algorithms for specific tasks such as object detection, classification, forecasting, text analytics, etc. We even released an Object2Vec algorithm recently, which is a general-purpose neural embedding algorithm that is highly customizable. There are currently 15 state-of-the-art algorithms to pick and use on SageMaker, with extensive examples in Jupyter notebooks accompanying them.
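To make the embedding idea concrete, here is a toy sketch with made-up 3-dimensional vectors (real Word2Vec or Object2Vec embeddings are learned from data and have hundreds of dimensions) showing the property downstream classifiers exploit: words with similar meaning land close together in vector space.

```python
import math

# Made-up 3-d "embeddings" for illustration only; real Word2Vec
# vectors are learned from a corpus and are much higher-dimensional.
vectors = {
    "great":    [0.90, 0.10, 0.00],
    "awesome":  [0.85, 0.15, 0.05],
    "terrible": [-0.80, 0.20, 0.10],
}

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, negative for opposed."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Synonyms score high; antonyms score low. A review classifier built
# on top of such vectors inherits this notion of semantic closeness.
print(cosine(vectors["great"], vectors["awesome"]))   # close to 1
print(cosine(vectors["great"], vectors["terrible"]))  # negative
```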

“SageMaker closes the getting-things-done gap by providing ‘algorithms as a service’ to speed up machine learning development.”

These algorithms were designed for 10x faster performance than the current implementations, and re-architected to leverage GPUs and streaming data whenever possible.

What does that mean in actual practice for data scientists?

This materially reduces the model training time data scientists are accustomed to, fundamentally pushing the envelope on what to expect from an ML model in terms of training time.

In addition to speed, what about accuracy?

Once practitioners are happy with the successful training of a model—whether the algorithm is a SageMaker one or one that they provide—they typically want to get it to the best possible accuracy (or whatever other metric they are optimizing for). SageMaker comes with a managed automatic model tuning feature that assists with the search for the hyperparameters that drive the best possible outcome. This is a common task for advanced ML practitioners, but not for the non-initiated. SageMaker brings it to the forefront of every user’s mind, and by making tuning easier, we’ve observed that numerous SageMaker users adopted best practices simply because the feature was easy to grasp and use.

Can you tell us a little more about this “tuning” feature?

The tuning service uses an optimization strategy called Bayesian optimization to automatically select and configure the parameters of your algorithm—say the learning rate, the weight decay, the number of neurons in a layer, the embedding dimension in an NLP or Object2Vec task, etc.—and drive towards the best possible outcome by launching parallel or serial training jobs in a controlled manner, eventually delivering the “best job” with the best (hyper)parameter configuration.
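As a much simpler stand-in for Bayesian optimization, the following toy random search illustrates the core idea of tuning: try hyperparameter configurations, score each one, and keep the best. The objective function is a made-up stand-in for a validation metric; SageMaker’s tuner differs in that it launches real training jobs and uses past results to choose the next trial rather than sampling blindly.

```python
import random

# Made-up stand-in for "train a model and return its validation score";
# real tuning would launch a SageMaker training job per trial.
def validation_score(learning_rate, num_neurons):
    # Fictitious objective that peaks near lr=0.01 and 64 neurons.
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(num_neurons - 64) / 256

def random_search(n_trials=20, seed=0):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.1),
            "num_neurons": rng.choice([16, 32, 64, 128, 256]),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best, score = random_search()
print(best, round(score, 3))
```

Bayesian optimization improves on this by fitting a probabilistic model of score-versus-hyperparameters and picking each next trial where that model predicts the most promise, which is why it tends to need far fewer (expensive) training jobs.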

The way machine learning models are being used is changing rapidly. How does SageMaker address that?

Nowadays, many models live in a web app that serves live traffic via a REST API. While that’s common, paradigms are shifting. With IoT and mobile-first approaches to delivering value to consumers via technology, artificial intelligence has to get as close to users as possible. With the large amount of data available, serving a model could also mean running a batch job which scores, say, the Year-To-Date sales transactions and seeks to classify them as fraudulent or not. So, there are real-time, edge, and batch ways of using an ML model. SageMaker supports all three and more.

With SageMaker, trained model artifacts can be deployed automatically behind a scalable, auto-scaling endpoint that serves a REST API for real-time use cases such as checking a transaction, or validating an identity based on a model trained to classify real-time biometrics data. For batch scenarios, SageMaker has a batch deployment feature that makes it quick and easy to spin up a number of pre-configured machines with one’s model and pull in a large amount of data for batch scoring. The infrastructure shuts itself down once the scoring job is done, which makes it agile and cost effective. Cloud inference is peculiar in that resources are sometimes wasted when one wants the endpoint “always on” but the GPUs are not utilized to full capacity. That’s why for SageMaker inference in the AWS cloud (batch or real-time), it is possible to provision GPUs elastically. We call that “Elastic Inference” (EI). With EI you can manage your cost of inference dynamically and only pay for GPUs when you need them.
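In SDK terms, standing up a real-time endpoint from a trained estimator is roughly the following sketch. It assumes an `estimator` whose training job has already completed; the instance type is an example, the input string is illustrative, and AWS credentials are required:

```python
# Sketch of deploying a trained model behind a real-time endpoint with
# the SageMaker Python SDK. Assumes `estimator` finished a training job.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",   # example CPU serving instance
)

# The endpoint now serves predictions over a REST API; the SDK
# wraps the HTTP call for you.
result = predictor.predict("This movie was wonderful!")

# Endpoints bill while they are running, so tear them down when done.
predictor.delete_endpoint()
```

Batch transform jobs follow the pattern Dan describes instead: spin up machines, score a dataset, and shut down automatically, with nothing left running afterward.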

Finally, for mobile and/or edge deployments, SageMaker supports integrations with mobile and IoT tools to export and deploy ML models as close to the end users as possible. Again, with SageMaker Neo, models can be compiled and optimized to use 10% of their original memory footprint, at 2X the performance, and with no loss in accuracy.

Who should learn SageMaker?

Everyone interested in machine learning, deep learning, and reinforcement learning! Or really, anyone interested in advanced optimization techniques that eventually drive towards artificial intelligence.

I hope what we’ve covered so far helps everyone understand that SageMaker is built around the flow of machine learning, and helps guide learners toward their goals. Within the SageMaker Notebook instances, we’ve made more than 30 examples available. They are packaged as end-to-end, repeatable Jupyter notebooks, and cover easy, medium, and hard examples of dealing with ML.

“We joke internally that we want to make machine learning so accessible that it becomes boring!”

That’s fantastic! So, regarding that kind of accessibility, can you give us a SageMaker use case? Who out there is using it right now, and for what?

Certainly. SageMaker is designed to support a wide array of possibilities and skill levels. So, for example, if you use Zocdoc to find your doctor, you’re using SageMaker in the back end. Zocdoc provides medical care search for end users, with integrated information on medical practices and individual doctor schedules. One of Zocdoc’s mobile engineers was able to train and deploy a doctor specialty recommendation model from scratch in less than a day, which Zocdoc ended up rolling out to production.

Major League Baseball (MLB) is the most historic professional sports league in the United States and consists of 30 member clubs in the U.S. and Canada, representing the highest level of professional baseball. SageMaker and the rest of the AWS ecosystem enable MLB to eliminate the manual, time-intensive processes associated with record keeping and statistics such as scorekeeping, capturing game notes, and classifying pitches. By using Amazon SageMaker, MLB is empowering its developers and data scientists to automate these tasks as they learn to quickly and easily build, train, and deploy machine learning models at scale.

Tinder has enabled people to meet in over 190 countries, by processing billions of daily swipes. With SageMaker, Tinder’s ML models can enable new connections at a massive scale.

We’ve covered healthcare, sports analytics, and a very popular app connecting millions. These are just a few examples of where you can find the use of SageMaker. I’ve also observed many applications in Financial Services such as Intuit’s fraud detection models deployed in SageMaker, or Thomson Reuters’ advanced NLP algorithms for documents mining and questions answering, built on SageMaker.

So, the sky really is kind of the limit, in terms of applicability?

“Virtually every company of every size in every industry has a use case for machine learning, deep learning, reinforcement learning, and/or artificial intelligence.”

That said, I should note that I use ML as one term encapsulating all the aspects of AI practice that require building and training algorithms to optimize for a specific set of tasks. Some tasks, such as object detection and semantic segmentation used in self-driving cars, are very visible. Others are far less visible: demand forecasting, supply-chain optimization, log analysis for predictive maintenance, topic modeling for document classification, etc. But looking at ML through these lenses, it becomes readily apparent that whatever your business is about, ML can help you. It also means that the scope of execution of an ML idea is massive, from R&D to enterprise-grade products serving millions of production users.

What does it mean for you, to see SageMaker coming to our Udacity classrooms?

I personally believe that possibilities are endless. Udacity machine learning students have access to AWS credits and supplementary machine learning curriculum through our AWS Educate collaboration, so I can’t wait to see what Udacity students come up with after learning to use Amazon SageMaker.

Udacity is all about empowering our students to advance their lives and careers through the acquisition of valuable and rewarding skills. So, for aspiring machine learning and deep learning engineers, how much of a difference in their career trajectories can having SageMaker skills make?

I think it’ll be a material boost for everyone’s career to inject some form of ML into their operations. Machine learning has never been easier to approach. With tools like SageMaker, and specifically the SageMaker built-in examples, learners set themselves up for success by ensuring they know the state of the art currently used by companies like the ones I’ve mentioned earlier.

“SageMaker won’t only help direct your ML learning process, it will also land you in a best-practices bucket—something every employer is seeking at the moment.”

Anticipating what employers might need in an ML-qualified candidate, SageMaker is definitely an important piece of the puzzle as employers themselves are ramping up on it, and they’d appreciate candidates that come with some initial knowledge. If you’re an aspiring ML/DL engineer, my recommendation is to learn in a context that will enable you to hit the ground running when you land your ML job; SageMaker delivers that. Besides, Udacity Machine Learning Engineer Nanodegree program students already have access to free AWS credits to use to explore AWS tools and services, including SageMaker, through Udacity’s collaboration with AWS Educate.

Where does the Amazon SageMaker ML/DL team see these technologies heading in the next one, two, and five years?

At Amazon we obsessively listen to our customers and work backwards from their needs. Traditionally, 90 to 95% of our roadmap has been driven by direct customer requests. SageMaker is similar. While we’re observing a growing demand in ML, we want to ensure SageMaker is future-proof and meets customers’ demands at scale while enabling them to operate in a flexible manner. In the next few years you should see SageMaker adopting more of the popular paradigms in ML and making them easily accessible, scalable, and cost-effective. That’s our goal. I personally think it’s too early to make a bet on the next five years, given the rapid developments in AI lately.

“I love that the AI field is open and inviting a lot more researchers than the traditional Ph.D. candidates.”

What this means, in my opinion, is an evolution of innovation and a general integration of AI into our practices. With SageMaker we want to enable fast research, data modeling, learning, scaling of workloads, production deployment, and continual re-training of intelligent systems, for every skill level. You should expect us to develop more technology designed to shorten those cycles.

What’s the next big arena or field where machine learning is going to have a major impact?

I believe we’re going to see a proliferation of multiple healthy streams of impactful ML applications. There is important movement in drug discovery at the moment, as well as in cancer research and other parts of healthcare. ML models are assisting physicians and providing incredible lab assistance. I’d love to see more development in that direction, as it helps humanity overall. A parallel field is FinTech. There are many movements around enabling financial services with the benefits of intelligent systems. I think numerous developments are going to move in that direction as well, in order to revamp the current financial technology ecosystem, especially with new paradigms like blockchain.

This is all so great to hear! We know from our students that social good applications are a big motivation for getting into these fields and technologies, and our Blockchain Developer Nanodegree program classrooms are right now full of talented and passionate individuals eager to push the boundaries of what this transformative technology is capable of. After everything you’ve told us about SageMaker, we’re all the more excited to see what our Deep Learning Nanodegree program students are able to achieve as they learn this powerful new platform.


On behalf of all of us at Udacity, we want to thank Dan Romuald Mbanga for his generosity, and for sharing these incredible insights! This amazing new content is now available in our Deep Learning Nanodegree program, so enroll today, and start exploring how you can use SageMaker to achieve your goals!

Start Learning

Mat Leonard
Mat Leonard is Product Lead for Udacity's School of Artificial Intelligence. He is a former physicist, research neuroscientist, and data scientist. He did his PhD and Postdoctoral Fellowship at the University of California, Berkeley.