Jan 9, 2019

Supervised vs. Unsupervised Learning: The Basics

Machine learning, the study of how computers can learn from data rather than from explicitly programmed rules, is an incredibly exciting aspect of computer science. At its heart, it involves teaching computers how to find patterns in data, either with or without the guidance of labeled examples. If you're considering a career as a machine learning engineer, or if you want to take on any role relating to data science, it's important to understand the fundamental principles and practices of teaching machines, including the difference between supervised and unsupervised learning and when each is useful.

Data Mining and Machine Learning Definitions

Data mining is the process of sorting through large data sets to identify patterns, establish relationships, and make connections that allow for data grouping and the prediction of future trends. Data mining is a big part of practical machine learning and usually involves supervised or unsupervised learning algorithms:

  • Supervised learning: With supervised learning, you know both the input and the output variables, and you use an algorithm to learn a function that maps input to output. The supervised learning algorithm analyzes labeled training data and then determines a function for labeling new data. The term "supervised learning" derives from the way the algorithm learns from the training data, much as a student learns from a teacher. The teacher knows the correct answer and guides the student toward that answer. Class finishes once the student gives consistent answers at an acceptable level of accuracy; at that point, the algorithm can predict the output for a new input.
  • Unsupervised learning: With unsupervised learning, you know the input variables, but you don't have any output variables. The algorithm doesn't have any labeled training data to work from, so it must group unsorted data according to the patterns it identifies. The term "unsupervised learning" derives from the way the algorithm must act independently, without any guidance from a teacher, as it finds hidden structures in the input data.
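
To make the idea of "learning a function from input to output" concrete, here is a minimal sketch in Python. It assumes a made-up data set (hours studied as the input, exam score as the output) and uses an ordinary least-squares line as the learned function. The names fit_line and predict are hypothetical, chosen only for this illustration; real projects would typically use a library instead.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept from labeled pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled training data: each input has a known output.
hours = [1.0, 2.0, 3.0, 4.0]       # input variable
scores = [52.0, 54.0, 56.0, 58.0]  # output variable (the "labels")

slope, intercept = fit_line(hours, scores)


def predict(x):
    """Apply the learned function to a new, unseen input."""
    return slope * x + intercept


print(predict(5.0))
```

In the unsupervised setting, only the hours list would be available; there would be no scores to fit against, so the algorithm could only look for structure within the inputs themselves.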

Supervised Learning Example

A simple example of a supervised learning algorithm is a spam filter that sorts incoming mail in an email inbox. Each time the user marks a message as spam, they create a piece of labeled training data. The email service then learns from those labels to classify subsequent mail, filing it accordingly.
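
The spam filter described above can be sketched with a tiny naive Bayes classifier. This is an illustration under stated assumptions, not how any particular email service works: the four training messages are invented, the labels "spam" and "ham" (non-spam) are a common convention, and the functions train and classify are hypothetical names for this example.

```python
import math
from collections import Counter


def train(labeled_emails):
    """Count words per label from (text, label) pairs marked by the user."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in labeled_emails:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the higher naive Bayes log-probability."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("spam", "ham"):
        # Prior: how often this label appeared in the training data.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled training data created by a user marking messages.
training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

word_counts, label_counts = train(training)
print(classify("claim your free prize", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

With this toy data, the first message is filed as spam and the second as ham, because each shares its words with only one of the two labeled groups.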

Unsupervised Learning Example

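A simple example of an unsupervised learning algorithm is a piece of software that analyzes a customer database and groups together all the people who purchased similar items. Nobody tells the software in advance what the groups should be; it discovers them from the purchase patterns alone. This is important information for marketers, who can then tailor offers to each group.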
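
Customer grouping of this kind is often done with a clustering algorithm. The sketch below uses k-means, which is one common choice and not necessarily what any given product uses. The customer names and purchase counts are invented, each customer is represented as a hypothetical pair (books bought, games bought), and the starting centroids are simply the first k points, a simplification that keeps the example deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Return a cluster index for each point. No labels are ever provided."""
    # Simplified initialization: start from the first k points.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels


# Hypothetical customer database: (books bought, games bought).
purchases = {
    "ana": (5, 1),
    "ben": (0, 6),
    "cho": (6, 0),
    "dev": (1, 5),
    "eli": (5, 0),
    "fay": (0, 5),
}

names = list(purchases)
labels = kmeans([purchases[n] for n in names], k=2)

groups = {}
for name, label in zip(names, labels):
    groups.setdefault(label, []).append(name)
print(groups)
```

On this toy data, the book buyers (ana, cho, eli) end up in one group and the game buyers (ben, dev, fay) in the other, even though the code was never told that "book buyers" or "game buyers" exist.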

Learning More

Now that you know the difference between supervised and unsupervised learning, you've taken the first step toward discovering machine learning and associated fields of computer science. If you want to know more about this fascinating subject, a Udacity Nanodegree program is a good place to start. These convenient courses cover core principles and fundamental concepts, providing the bedrock of knowledge on which to build a successful career in data science, artificial intelligence, or many other technical fields. Browse the Udacity Catalog to learn more.