cs215 ยป

CS215 Unit 5 Solutions

# Implementing Dijkstra with Heaps

There are two differing approaches to using a heap data structure.  In the first approach, we use the `heap` code from Unit 4, but with some modifications. The second approach uses the built-in `heapq` module.

## Using `heap` code from Unit 4

The main modification we need to make from the `heap` code in Unit 4 is that we e need to be able to update the value of a node if we've found a shorter path then what we had previously found

One way to do that would be to search through the entire heap trying to find the node and then update it.  Updating the node this way would run in linear time and lose much of the advantage of using a heap over a list.  Instead, we need to use another data structure to keep track of where the node is at in the heap - a python dictionary is a good one to use.

First, we define a `swap` routine to swap two items in the heap and update their respective entries in a `location` dictionary.  Swapping items in a heap happens a lot in operations like `up_heapify` and `down_heapify.`

``````def swap(heap, old, new, location):
location[heap[old]] = new
location[heap[new]] = old
(heap[old], heap[new]) = (heap[new], heap[old])
``````

Here is the new implementation for `down_heapify` which also includes a `location` dictionary.  This implementation also removed the recursive calls and instead updates the value of current node in a loop.:

``````# Call this routine if the heap rooted at i satisfies the heap property
# *except* perhaps i to its children immediate children
#
# location is a dictionary mapping an object to its location
# in the heap
def down_heapify(heap, i, location):
# If i is a leaf, heap property holds
while True:
l = left(i)
r = right(i)

# see if we don't have any children
if l >= len(heap):
break

v = heap[i][0]
lv = heap[l][0]

# If i has one child...
if r == len(heap):
# check heap property
if v > lv:
# If it fails, swap, fixing i and its child (a leaf)
swap(heap, i, l, location)
break

rv = heap[r][0]
# If i has two children...
# check heap property
if min(lv, rv) >= v:
break
# If it fails, see which child is the smaller
# and swap i's value into that child
# Afterwards, recurse into that child, which might violate
if lv < rv:
# Swap into left child
swap(heap, i, l, location)
i = l
else:
# swap into right child
swap(heap, i, r, location)
i = r
``````

The `up_heapify` routine also needs to be modified to use the `location` dictionary.  The recursive calls have also been removed.

``````# Call this routine if whole heap satisfies the heap property
# *except* perhaps i to its parent
def up_heapify(heap, i, location):
# If i is root, all is well
while i > 0:
# check heap property
p = (i - 1) / 2
if heap[i][0] < heap[p][0]:
swap(heap, i, p, location)
i = p
else:
break
``````

Additionally, we need a way to find a remove the minimum value:

``````def heappopmin(heap, location):
val = heap[0]
new_top = heap.pop()
location[val] = None
if len(heap) == 0:
return val
location[new_top] = 0
heap[0] = new_top
down_heapify(heap, 0, location)
return val
``````

Adding a new item on the heap is pretty straightforward:

``````# put a pair in the heap
def insert_heap(heap, v, location):
heap.append(v)
location[v] = len(heap) - 1
up_heapify(heap, len(heap) - 1, location)
``````

And, finally, we need to be able to update a value on the heap.  Note that if the value was increasing, which can't happen in dijkstra, we would have to call `down_heapify` instead of `up_heapify` at the end in order to make sure the heap property is maintained.

``````def decrease_val(heap, location, old_val, new_val):
i = location[old_val]
heap[i] = new_val
location[old_val] = None
location[new_val] = i
up_heapify(heap, i, location)
``````

Now, with these changes to `heap` we can implement `dijkstra` with a heap

``````#
# version of dijkstra implemented using a heap
#
# returns a dictionary mapping a node to the distance
# to that node
def dijkstra_heap(G, a):
# Distance to the input node is zero
first_entry = (0, a)
heap = [first_entry]
# location keeps track of items in the heap
# so that we can update their value later
location = {first_entry:0}
# dist_so_far is kind of our 'frontier'
# it keeps track of the nodes that we've seen so far
# but aren't sure if we've found the shortest distance
dist_so_far = {a:first_entry}
# final_dist keeps track of the nodes that we've seen
# and know that is the shortest distance
final_dist = {}
while len(dist_so_far) > 0:
# grab the minimum value from the heap
dist, node = heappopmin(heap, location)
# lock it down!
final_dist[node] = dist
del dist_so_far[node]

# examine the neighbors
for x in G[node]:
if x in final_dist:
# skip the ones we already have shortest
# distances for
continue
new_dist = G[node][x] + final_dist[node]
new_entry = (new_dist, x)
if x not in dist_so_far:
insert_heap(heap, new_entry, location)
dist_so_far[x] = new_entry
elif new_entry < dist_so_far[x]:
# we found a better way to this node, update heap
decrease_val(heap, location, dist_so_far[x], new_entry)
dist_so_far[x] = new_entry
return final_dist
``````

## Using `heapq` module

Python provides a module that lets you treat a list object like a heap.  You can read the documentation http://docs.python.org/library/heapq.html on the python website.

One feature it doesn't provide is an easy way to update the value of an object in the heap.  One way around this, as suggested in the documentation, is to mark the node as 'REMOVED' and filter 'REMOVED' nodes when trying to find the minimum value in the heap.  This does mean that the heap grows with the number of edges instead of the number of nodes, but, in the case of python, being able to take advantage of the more efficient implementation is usually worth the cost.

``````import heapq

def val(pair): return pair[0]
def id(pair): return pair[1]

def dijkstra(G,v):
heap = [ [0, v] ]
dist_so_far = {v:[0, v]}
final_dist = {}
while len(final_dist) < len(G):
# find the closest un-explored node
while True:
w = heapq.heappop(heap)
# grab the relevant parts of w
node = id(w)
dist = val(w)
if node != 'REMOVED':
del dist_so_far[node]
break

# lock it down!
final_dist[node] = dist
# look at its neighbors
for x in G[node]:
# but only those that haven't been locked down
if x not in final_dist:
new_dist = dist + G[node][x]
new_entry = [new_dist, x]
if x not in dist_so_far:
# we haven't see this yet
# so add to the heap and the dictionary
dist_so_far[x] = new_entry
heapq.heappush(heap, new_entry)
elif new_dist < val(dist_so_far[x]):
# the new distance is less then the
# best known
# Instead of removing it from the heap
# which could be expensive, mark it
dist_so_far[x][1] = "REMOVED"
# and then add a new entry
# for this node
dist_so_far[x] = new_entry
heapq.heappush(heap, new_entry)
return final_dist
``````