Thank You, next(): Getting Comfortable With Python’s Iterators

Writing code, like writing poetry, literature, or blog posts, is full of stylistic choices. If you work with a developer long enough, you will get to know these choices. Code, like the written word, necessitates the use of common conventions to serve its purpose. In Python, one of the most important conventions – and one that is often invisible to us in the same way that word order is invisible to us – is the use of iterators.

Iterators We Already Know

If you’ve ever written a for loop, you’ve used an iterator, whether you knew it or not! According to PEP 234, the Python Enhancement Proposal (PEP) that introduced iterators (with the method renamed from next() to __next__() in Python 3), an iterator in Python is any object that implements an __iter__() method and a __next__() method. The primary built-in data structures we use in Python (list, dict, set, and tuple) are not iterators themselves but iterable objects: objects we can call iter() on to get an iterator. This is an important distinction: a list, for example, is not itself an iterator, but it is an iterable object.

>>> my_list = [1, 2, 3, 4]
>>> next(my_list)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator

However, if we call iter on our list:

>>> my_list = [1, 2, 3, 4]
>>> my_iterator = iter(my_list)
>>> next(my_iterator)
1
>>> next(my_iterator)
2

Go ahead and run the code above in your Python REPL and see what happens when you call next() a fifth time: the iterator is exhausted, so Python raises a StopIteration exception.
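As a minimal sketch of that fifth call, the snippet below consumes the list's iterator and then handles the exhaustion explicitly. The names consumed, exhausted, and fallback are only illustrative; the optional second argument to the built-in next() is a default that is returned instead of raising.

```python
my_list = [1, 2, 3, 4]
my_iterator = iter(my_list)

# Consume all four elements, one next() call at a time.
consumed = []
for _ in range(4):
    consumed.append(next(my_iterator))
print(consumed)  # [1, 2, 3, 4]

# A fifth call has nothing left to return, so it raises StopIteration.
try:
    next(my_iterator)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True

# Passing a default as the second argument avoids the exception.
fallback = next(my_iterator, None)
print(fallback)  # None
```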

While you may not have recently (or ever) called next() on an iterator, you have probably used the for … in … syntax. When we write a for loop, under the hood, the Python interpreter calls iter() on the object to create an iterator and then calls next() on it for us until the iterator is exhausted.

my_dict = {
    "spam": 1,
    "ham": 1,
    "eggs": 2,
    "cheese": 3,
}

# The for loop creates and advances the iterator for us.
for k, v in my_dict.items():
    print(v, k)

Running this code gives us the following output:

>python example.py
1 spam
1 ham
2 eggs
3 cheese

Now, if we wanted to use an iterator of our dictionary instead, we could use a while loop like this, and get the same output:

my_dict = {
    "spam": 1,
    "ham": 1,
    "eggs": 2,
    "cheese": 3,
}

# Create the iterator ourselves and advance it once per loop pass.
dict_iterator = iter(my_dict.items())
i = 0
while i < len(my_dict):
    k, v = next(dict_iterator)
    print(v, k)
    i += 1
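The while loop above relies on len() to know when to stop. A closer approximation of what a for loop does, offered here as a sketch rather than a description of the interpreter's internals, is to keep calling next() until StopIteration is raised. The pairs list is a name added only so the result can be inspected afterwards.

```python
my_dict = {"spam": 1, "ham": 1, "eggs": 2, "cheese": 3}

dict_iterator = iter(my_dict.items())
pairs = []
while True:
    try:
        k, v = next(dict_iterator)
    except StopIteration:
        # The iterator is exhausted; a for loop ends on this same signal.
        break
    pairs.append((v, k))
    print(v, k)
```

This version also works for iterators that have no len(), which is the case for the custom iterator built in the next section.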

Creating Our Own Iterator

As we know, any object with an __iter__() method and a __next__() method is an iterator. We also know that calling iter() on our iterable objects provides us with an iterator. But what if we want to build a custom iterator? Let’s consider the code below.

class Fibonacci:
    def __init__(self):
        self.value = 1
        self.previous = 0

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        current = self.value
        previous = self.previous
        self.value = current + previous
        self.previous = current
        return current

This iterator will start with 1, and each call to next() will return the next element of the Fibonacci sequence! (We’ll put the arguments about the sequence starting at 0 aside and acknowledge that if we just change return current to return previous, we get the sequence starting at 0.)

So how does that work?

iterator = Fibonacci()
print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))
print(next(iterator))

Given the Fibonacci object, the code above will print the sequence 1, 1, 2, 3, 5, 8, one value per line, because we’re calling next() on our iterator manually six times. What happens if we run the following code?

iterator = Fibonacci()
for element in iterator:
    print(element)

In this case, since the iterator has no upper bound, the code will run forever (or at least until you interrupt the process!). Unlike the my_iterator object from our first example, which we created by calling iter() on a list, this iterator has no final element, so it will never raise a StopIteration exception. If we want the iteration to end, we can modify our object to stop iterating once the next value would exceed some upper bound.

class Fibonacci:
    def __init__(self, max_val):
        self.value = 1
        self.previous = 0
        self.max_val = max_val

    def __iter__(self):
        return self

    def __next__(self):
        # Signal the end of iteration once the next value exceeds the bound.
        if self.value > self.max_val:
            raise StopIteration
        current = self.value
        previous = self.previous
        self.value = current + previous
        self.previous = current
        return current

iterator = Fibonacci(20)
for element in iterator:
    print(element)

Running this code will print 1, 1, 2, 3, 5, 8, 13, one value per line, since the next value in the sequence is 21, which is larger than our specified maximum value of 20.
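Modifying the class is not the only way to bound the iteration. As an alternative sketch (not part of the approach above), the standard library's itertools.islice and itertools.takewhile can cut off the original, unbounded iterator from the outside. The class is repeated here only so the snippet runs on its own; first_six and up_to_20 are illustrative names.

```python
from itertools import islice, takewhile


class Fibonacci:
    """Unbounded Fibonacci iterator, as in the first version above."""

    def __init__(self):
        self.value = 1
        self.previous = 0

    def __iter__(self):
        return self

    def __next__(self):
        current = self.value
        self.value = current + self.previous
        self.previous = current
        return current


# Take a fixed number of elements from the infinite iterator.
first_six = list(islice(Fibonacci(), 6))
print(first_six)  # [1, 1, 2, 3, 5, 8]

# Take elements for as long as they do not exceed an upper bound.
up_to_20 = list(takewhile(lambda x: x <= 20, Fibonacci()))
print(up_to_20)  # [1, 1, 2, 3, 5, 8, 13]
```

Keeping the iterator unbounded and letting the caller decide where to stop means the same class can serve both the "first n values" and the "values up to a limit" use cases.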

Conclusion

Iterators are an incredibly important part of Python code. Although the actual iterators are often abstracted away from us and we instead loop over iterable objects, there are times when we will want or need to implement our own iterators, so it is important to know how to write the __iter__() and __next__() methods. For data science and machine learning applications on large data sets, a special type of iterator known as a generator sees frequent use. If that excites you, then the next() thing you should do is check out the Programming for Data Science with Python Nanodegree program!
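For a first taste of the generators mentioned above, here is a minimal sketch of the bounded Fibonacci sequence written as a generator function. The function name fibonacci and the variable result are illustrative choices; the yield keyword makes Python supply the __iter__() and __next__() methods for us.

```python
def fibonacci(max_val):
    """Yield Fibonacci numbers that do not exceed max_val."""
    previous, value = 0, 1
    while value <= max_val:
        yield value
        previous, value = value, previous + value


result = list(fibonacci(20))
print(result)  # [1, 1, 2, 3, 5, 8, 13]
```

When the while loop ends and the function returns, Python raises StopIteration on our behalf, so a for loop over fibonacci(20) stops in the same way it did for the class-based iterator.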