C++ - C++ Sets - Programming Languages

C++ Sets Explained

Think of your favorite sports team. All of its players are unique and we can order them by their jersey numbers. We can refer to the players directly by name or by their number. 

A similar concept exists in programming: We can organize a set number of unique values based on a predetermined order. In C++, this type of data structure is called a set.

In this article, we’ll explore what C++ sets are and how to use them, before presenting scenarios in which we would and would not use sets. 

What Is a Set in C++?

A set is a data structure that stores unique elements of the same type in a sorted order. Each value is a key, which means that we access each value using the value itself. With arrays, on the other hand, we access each value by its position in the container (the index). Accordingly, each value in a set must be unique. 

Sets are also ordered: Every element added to a set is assigned a position when it’s inserted. You can iterate through the elements in order, as we’ll see below. Once inserted in a set, the value cannot be changed, although elements can be deleted.

Let’s now put ourselves in the role of a C++ developer and examine some basic functionality of a set. 

How To Use a C++ Set


First, let’s create an empty set:

#include <iostream>       // std::cout
#include <set>          // std::set
using namespace std;
   
// empty set
int main ()
{
    set<int> my_empty_set;
}

Next we’ll create a set that contains a list of integers:

set<int> my_full_set{1,2,3,4};

Both of our examples involved creating a set of integers, but we can also create sets of strings and other data types. Now that we’ve created a set, let’s look at some simple ways of adding to and removing from the set. Going back to our empty set, the simplest way to add elements looks like this:

my_empty_set.insert(10);
my_empty_set.insert(100);
my_empty_set.insert(1000);
my_empty_set.insert(10000);

Our set isn’t empty anymore, so let’s make a copy of it and call it something else:

set<int> my_set_of_tens (my_empty_set);

Now we have a whole new set copied from the one we just populated.

Deleting elements is just as simple as inserting them:

my_set_of_tens.erase(10);

The insert function returns an iterator pointing to the newly inserted element in the set, while the erase function returns the number of elements erased. This means we can erase more than one element at a time. 

When deleting more than one element, erase takes two parameters, namely pointers that specify a range within the container to be removed.

Let’s look at how to do this using two additional C++ member functions, find and end:

// erase all elements from 1000 to the end of the set
my_set_of_tens.erase(my_set_of_tens.find(1000), my_set_of_tens.end());

We’ve successfully erased all elements from 1000 to the last element. So what’s left? Let’s print the contents of our set. To do that, we need an iterator:

// declare iterator to help us iterate over our set
set<int>::iterator it;
   
// print the contents of our set
for (it=my_set_of_tens.begin(); it!=my_set_of_tens.end(); ++it)
    cout << ' ' << *it;
cout << '\n';
// output: 100

If we wanted to clear our entire set, we could use the beginning and ending pointers, but it’s easier with the built-in clear function: 

// using pointers
my_set_of_tens.erase(my_set_of_tens.begin(), my_set_of_tens.end());
   
// using clear
my_set_of_tens.clear()

Just as we can erase more than one element at a time, the same goes for adding elements. You’ll notice that we used an iterator in the above code to print the contents of our set. Similarly, we can use iterators to add elements to our set.

Below you’ll see our complete example. We encourage you to copy and paste the code into your own code editor and play with it to your heart’s content.

#include <iostream>       // std::cout
#include <set>          // std::set
using namespace std;
   
// empty set
int main ()
{
    // create empty set
    set<int> my_empty_set;
   
    // add single elements
    my_empty_set.insert(10);
    my_empty_set.insert(100);
    my_empty_set.insert(1000);
    my_empty_set.insert(10000);
   
    // create copy of a set
    set<int> my_set_of_tens (my_empty_set);
     
    // delete single elements
    my_set_of_tens.erase(10);
   
    // delete elements using pointers
    my_set_of_tens.erase(my_set_of_tens.find(1000), my_set_of_tens.end());
   
      // print elements
    set<int>::iterator it;
    for (it=my_set_of_tens.begin(); it!=my_set_of_tens.end(); ++it)
      cout << ' ' << *it;
    cout << '\n';
   
    // clear set using pointers
    my_set_of_tens.erase(my_set_of_tens.begin(), my_set_of_tens.end());
   
    // clear set using clear
    my_set_of_tens.clear()
}

We now know how to add and delete elements from a set. We can also get the size of a set, check whether a set is empty, clear an entire set of all elements and search through the set to find a particular key, as we’ll see shortly.

C++ Set Use Cases

What are some use cases for sets? One important use case is deduplication, or eliminating duplicates. Since the values in a set are accessed by keys, there can never be duplicates. So, how might we deduplicate a list that has more than two of the same item? Let’s instantiate a new set called no_dupes and fill it with integers:

  set<int> no_dupes{1,2,3,4,4};

If we iterate through this set, printing every element, we’ll see only four keys are printed. The second integer 4 was not added to the list. No duplicates!

We can also erase duplicates from a preexisting list. Our example array so_many_duplicates is full of integers:

int so_many_duplicates[] = {1,1,1,2,2,3,3,3,4,4};

To deduplicate, all we need to do is to create a set from that array, and the duplicates will be no more:

set<int> set_from_array{begin(so_many_duplicates), end(so_many_duplicates)};
   
for (it=set_from_array.begin(); it!=set_from_array.end(); ++it)
    cout << ' ' << *it;
cout << '\n';
// output: 1 2 3 4

Another great occasion for using a set is when we need to sort. Because a set is inherently ordered, using a set makes for a logical choice. Let’s look at a similar example of creating a set from an array, but instead of containing duplicates, this time our array is unsorted:

// declare array
int out_of_order[] = {2,1,4,3};
   
// create set from array
set<int> set_from_array{begin(out_of_order), end(out_of_order)};
   
// print every element in set
for (it=set_from_array.begin(); it!=set_from_array.end(); ++it)
    cout << ' ' << *it;
cout << '\n';
// output: 1 2 3 4

Last but not least, we can use sets for search indexing — storing elements and then quickly checking if a given element is present in the set, all in log(n) time. We already saw this in action when we used the find function to erase elements from our set. It returns the iterator if the value is there, and if it’s not, find points to the position following the end position.

As we’ve seen, sets can be useful in a variety of circumstances. But let’s look at some cases where they are better avoided.

When To Avoid Using a Set in C++

There are some circumstances under which using a set is not ideal. If a key-value store could work better for you than a key alone, consider using a map. And if you need to store two values together as a single unit, using a set of pairs is another option.

Another situation in which to avoid using a set is if you’d like to access elements by their position in the data structure, rather than by the value itself. In that case, a vector may be a handier option.

Optional Parameters in C++ Sets

Several data structures in C++ can, upon instantiation, be passed an optional second parameter that influences an underlying attribute of that container. For example, a queue can be passed an underlying container that defaults to a vector when not invoked.

Sets, too, can take a second argument — a compare function. This compare function returns true if the first argument precedes the second argument in the strict weak ordering defined, otherwise returning false. When we look at the definition of a C++ set, we can see the default function for the compare object:

template<
    class Key,
    class Compare = std::less<Key>,
    class Allocator = std::allocator<Key>
> class set;

It defaults to less, a function that behaves like the less-than operator by sorting a and b using a<b. So, what if we wanted our set to be ordered from greatest to least, instead of least to greatest? All we need to do is pass the built-in greater function as the second argument to our set:

// create set with numbers in ascending order
set<int, greater<int>> descending_set{1,2,3,4};
   
// print
for (it=descending_set.begin(); it!=descending_set.end(); ++it)
    cout << ' ' << *it;
cout << '\n';
// output: 4 3 2 1

When we print each element, we can see that the numbers now descend. We can even create our own custom compare function, as long as it follows the rule of returning a boolean value.

Continue Your C++ Journey

In this article, we touched on C++ sets and how to use them, before going over a few cases in which their use is either advisable or better avoided. We also explored optional parameters for sets.

If you’re ready to put your C++ prowess to the test, Udacity is offering an expert-taught online program that will take you through five real-world projects.   

Enroll in our C++ Nanodegree program today!

Start Learning