cs215 ยป

CS215 Unit 5 Solutions

Implementing Dijkstra with Heaps

There are two differing approaches to using a heap data structure.  In the first approach, we use the heap code from Unit 4, but with some modifications. The second approach uses the built-in heapq module.

Using heap code from Unit 4

The main modification we need to make from the heap code in Unit 4 is that we e need to be able to update the value of a node if we've found a shorter path then what we had previously found

One way to do that would be to search through the entire heap trying to find the node and then update it.  Updating the node this way would run in linear time and lose much of the advantage of using a heap over a list.  Instead, we need to use another data structure to keep track of where the node is at in the heap - a python dictionary is a good one to use.

First, we define a swap routine to swap two items in the heap and update their respective entries in a location dictionary.  Swapping items in a heap happens a lot in operations like up_heapify and down_heapify.

def swap(heap, old, new, location):
    location[heap[old]] = new
    location[heap[new]] = old
    (heap[old], heap[new]) = (heap[new], heap[old])

Here is the new implementation for down_heapify which also includes a location dictionary.  This implementation also removed the recursive calls and instead updates the value of current node in a loop.:

# Call this routine if the heap rooted at i satisfies the heap property
# *except* perhaps i to its children immediate children
# location is a dictionary mapping an object to its location
# in the heap
def down_heapify(heap, i, location):
    # If i is a leaf, heap property holds
    while True:
        l = left(i)
        r = right(i)

        # see if we don't have any children
        if l >= len(heap): 

        v = heap[i][0]
        lv = heap[l][0]

        # If i has one child...                 
        if r == len(heap):
            # check heap property
            if v > lv:
                # If it fails, swap, fixing i and its child (a leaf)
                swap(heap, i, l, location)

        rv = heap[r][0]
        # If i has two children...
        # check heap property
        if min(lv, rv) >= v: 
        # If it fails, see which child is the smaller
        # and swap i's value into that child
        # Afterwards, recurse into that child, which might violate
        if lv < rv:
            # Swap into left child
            swap(heap, i, l, location)
            i = l
            # swap into right child
            swap(heap, i, r, location)
            i = r

The up_heapify routine also needs to be modified to use the location dictionary.  The recursive calls have also been removed.

# Call this routine if whole heap satisfies the heap property
# *except* perhaps i to its parent
def up_heapify(heap, i, location):
    # If i is root, all is well
    while i > 0: 
        # check heap property
        p = (i - 1) / 2
        if heap[i][0] < heap[p][0]:
            swap(heap, i, p, location)
            i = p

Additionally, we need a way to find a remove the minimum value:

def heappopmin(heap, location):
    val = heap[0]
    new_top = heap.pop()
    location[val] = None
    if len(heap) == 0:
        return val
    location[new_top] = 0
    heap[0] = new_top
    down_heapify(heap, 0, location)
    return val

Adding a new item on the heap is pretty straightforward:

# put a pair in the heap
def insert_heap(heap, v, location):
    location[v] = len(heap) - 1
    up_heapify(heap, len(heap) - 1, location)

And, finally, we need to be able to update a value on the heap.  Note that if the value was increasing, which can't happen in dijkstra, we would have to call down_heapify instead of up_heapify at the end in order to make sure the heap property is maintained.

def decrease_val(heap, location, old_val, new_val):
    i = location[old_val]
    heap[i] = new_val
    location[old_val] = None
    location[new_val] = i
    up_heapify(heap, i, location)

Now, with these changes to heap we can implement dijkstra with a heap

# version of dijkstra implemented using a heap
# returns a dictionary mapping a node to the distance
# to that node
def dijkstra_heap(G, a):
    # Distance to the input node is zero
    first_entry = (0, a)
    heap = [first_entry]
    # location keeps track of items in the heap
    # so that we can update their value later
    location = {first_entry:0}
    # dist_so_far is kind of our 'frontier'
    # it keeps track of the nodes that we've seen so far
    # but aren't sure if we've found the shortest distance
    dist_so_far = {a:first_entry} 
    # final_dist keeps track of the nodes that we've seen
    # and know that is the shortest distance
    final_dist = {}
    while len(dist_so_far) > 0:
        # grab the minimum value from the heap
        dist, node = heappopmin(heap, location)
        # lock it down!
        final_dist[node] = dist
        del dist_so_far[node]

        # examine the neighbors
        for x in G[node]:
            if x in final_dist:
                # skip the ones we already have shortest
                # distances for
            new_dist = G[node][x] + final_dist[node]
            new_entry = (new_dist, x)
            if x not in dist_so_far:
                # add to the heap
                insert_heap(heap, new_entry, location)
                dist_so_far[x] = new_entry
            elif new_entry < dist_so_far[x]:
                # we found a better way to this node, update heap
                decrease_val(heap, location, dist_so_far[x], new_entry)
                dist_so_far[x] = new_entry
    return final_dist

Using heapq module

Python provides a module that lets you treat a list object like a heap.  You can read the documentation http://docs.python.org/library/heapq.html on the python website.

One feature it doesn't provide is an easy way to update the value of an object in the heap.  One way around this, as suggested in the documentation, is to mark the node as 'REMOVED' and filter 'REMOVED' nodes when trying to find the minimum value in the heap.  This does mean that the heap grows with the number of edges instead of the number of nodes, but, in the case of python, being able to take advantage of the more efficient implementation is usually worth the cost.

import heapq

def val(pair): return pair[0]
def id(pair): return pair[1]

def dijkstra(G,v):
    heap = [ [0, v] ]
    dist_so_far = {v:[0, v]}
    final_dist = {}
    while len(final_dist) < len(G):
        # find the closest un-explored node
        while True:
            w = heapq.heappop(heap)
            # grab the relevant parts of w
            node = id(w)
            dist = val(w)
            if node != 'REMOVED':
                del dist_so_far[node]

        # lock it down!
        final_dist[node] = dist
        # look at its neighbors
        for x in G[node]:
            # but only those that haven't been locked down
            if x not in final_dist:
                new_dist = dist + G[node][x]
                new_entry = [new_dist, x]
                if x not in dist_so_far:
                    # we haven't see this yet
                    # so add to the heap and the dictionary
                    dist_so_far[x] = new_entry
                    heapq.heappush(heap, new_entry)
                elif new_dist < val(dist_so_far[x]):
                    # the new distance is less then the 
                    # best known
                    # Instead of removing it from the heap
                    # which could be expensive, mark it
                    dist_so_far[x][1] = "REMOVED"
                    # and then add a new entry
                    # for this node
                    dist_so_far[x] = new_entry
                    heapq.heappush(heap, new_entry)
    return final_dist