So Alice is now super, super happy, but what about the others? So I suggest that we take a look at Dave here, and Dave, as you remember, was working on a logistics problem, so what he was trying to solve was shortest tour. And actually, last time we left off, Dave wasn't too happy, because we found out it is not very easy to find a good search tree or a pre processing strategy for his problem. On the other hand, of course, we mentioned that in practice shortest tour is usually very easy to solve, but from a theoretical perspective, maybe we were a bit lucky for Dave; if we're not looking for the best possible solution, but except an approximation, and then if we should find an approximation, Dave can get a little happier here and of course he doesn't have to be as jealous that Alice has such an NP complete problem and he seems to have a harder one. Now lucky for Dave, there is a constant factor approximation algorithm for shortest tour, but since Dave is not that proficient in theoretical computer science, I think he doesn't really stand a chance of coming up with this algorithm himself, so you will have to pay close attention now to be able to explain it to Dave. The concept that we will use is that of a spanning tree, and if you've had an algorithm course before, you will know what a spanning tree is, but I'm quickly going to explain it to you just to make sure. A spanning tree is a part of a graph or actually it's a selection of edges in a graph so that the result looks like a tree. Well, what that means is that you select edges so that you still keep the whole graph connected, but you don't have any cycles, so a cycle would be something like this where you can leave a vertex and then come back to it another way, so you select edges so that the whole graph is still connected, but there can be no cycles, so to have you figure it out, let's do a quick quiz here. This is the thing that worked 3 times. I'm going to give you 3 choices of what a spanning tree could be, and the red edges are always those that I'm selecting here. I would like you to tell me in which of these cases the red edges from a spanning tree for the graph.
And actually, just this one here is correct. The other 2 are not correct, and as I told you, the conditions for a spanning tree are first, it can contain no cycles. and actually here, we have a cycle, so this can not be a spanning tree. And over here, we said that a spanning tree must still keep the whole graph connected, so all vertices connected to each other, and this is not the case here, so here, we have basically split the graph into 2 parts. Here is 1 part and here is the other part. So this is also not a spanning tree.
So now that you know what a spanning tree is, I would like to introduce to you the concept of a minimum spanning tree. So a minimum spanning tree is still a spanning tree on a graph, and in order to find a minimum spanning tree, you actually need to have a special kind of graph, and that is one that looks just like the input for shortest tour. And what I mean by that is that all of the edges have a certain number assigned to them. So in the case of shortest tour, those were distances, but it can be other values as well; that doesn't really matter. The important thing is that each edge must have a certain value attached to it such as in this one here. Now what is a minimum spanning tree? Now, first of all, let's take any spanning tree for this graph here. So let's say we take this spanning tree. And what you can do if you have a graph like this where you have numbers attached to the edges is, for each spanning tree, you can sum up those numbers. So here we have 2+4+3+2+3+6, and the sum of that is 20. And so what you say is that this spanning tree here has a weight of 20. Now if you choose another spanning tree--for example, this one here-- that spanning tree will have a weight of 3+4+3+2+3+6, which is a total weight of 21. And now you can almost guess what a minimum spanning tree will be. A minimum spanning tree is the spanning tree for the graph for which this total weight--the sum of all of the edge numbers-- is the smallest possible one. A graph can have more than one minimum spanning tree, but every minimum spanning tree has to have the property that the sum of the edges is as low as possible.
And, of course, I am going to quiz you on that. So what I would like you to do now is to think about what would be a minimum spanning tree for this graph here, and I would like you to make a check mark in each of the boxes that correspond to edges you would put into that minimum spanning tree.
Now how do you go about finding a solution if you don't have an algorithm for it? Well, first of all, this edge here costs just 1, so we probably want to take that. Then we have a lot of edges that just have a weight of 2, and, of course, they have been arranged in a way that is very beneficial for us here because we can actually put them all into the tree, and we almost have a spanning tree here. There's just one vertex missing, which we still need to connect to that tree here. And the cheapest way to do that is to use this one edge down here, of course, using 4 instead of this one here which would use 6. So this is a minimum spanning tree for this graph, and this has a total weight of
How can we use a minimum spanning tree for shortest tour? Well there's actually one more thing we need to talk about in terms of terminology so that I can give you an algorithm for shortest tour. Then you will hopefully be able to explain to Dave as well, should he ask you. The edges are still there, but we will just look at the minimum spanning tree for now. The terminology that I would like to introduce to you is that of a walk along a tree. Of course you can define this formally, but it's actually quite easy to see. What I mean by this is if you want to do a shortest tour on a tree, then a shortest tour on a tree goes something like this. You start out at a certain vertex, and then you can travel along that tree by using every edge exactly twice. Just the way that I'm doing right now. Actually it's not possible to do it any other way. I mean of course you can walk along the tree in a different order, but actually if you want to do a walk along a tree, then you have to use every edge twice. Then of course you return back to your origin. This would be a walk along the minimum spanning tree.
Enough with introducing new words to you. What is the algorithm that I propose that Dave should use? It's a very simple one actually. The algorithm is now quite simple to describe. Of course it takes a bit more effort to actually implement it. Given any graph, the first step that we do is we calculate a minimum spanning tree for that graph. That is something if you have had an algorithms class you know to be doable in polynomial time. That can be done in O(nÂ²) time for an in vertext graph. Then you take a walk along the tree, and you can guess that that can be done in polynomial time as well. The running time of this alogirthm here in total---so it's a polynomial time algorithm. Now where I probably need to convince you is that this is actually a good approximation algorithm, because it sounds so simple again doesn't it? My claim now is, as I've written here, that this is an approximation algorithm for shortest tour. Actually it's a constant factor approximation algorithm for that problem. So how come? Well, since you already saw one constant factor approximation algorithm, I am actually going to let you figure this out. Let's start out by taking 2 numbers and comparing them. The first number is the weight of the minimum spanning tree, and by weight I simply mean if you take the sum over all edges that are in the spanning tree, that's what I will call the weight of the spanning tree. The second one is going to be the length of the shortest tour for the network. Now first of all I would like you to tell me what you can say about these 2 numbers. How do they compare? Is the weight of the tree smaller than the length of the shortest tour? Is it smaller than or equal to the length of the shortest tour? Are both the same value? Or is the weight of the tree larger than or equal to the length of the shortest tour? Or is it always larger than the length of the shortest tour? Please select which one is correct here.
And the answer here is that the weight of the tree is always smaller than the length of the shortest tour. And the reason for that is the following: When you have a shortest tour for a given graph, then you can always transform the shortest tour into a spanning tree because the shortest tour visits each vertex at least once, so the connectedness is given. But, of course, a shortest tour always contains at least 1 cycle because you have to get back to the vertex where you started from. So you have to remove at least 1 edge from the shortest tour in order to make it a spanning tree. And then you have a spanning tree with a weight that is definitely less than the length of the shortest tour, because you have removed at least 1 edge. And now here, you even have a minimum spanning tree, so the weight of the minimum spanning tree must be even smaller. And that is why the weight of the minimum spanning tree is smaller than the length of the shortest tour.
Now I would like you to look at a third number, and that is this 1 here, the tour. So what is the length of the tour that this algorithm constructs here? And I'm going to give you 3 options here: Is it equal to the weight of the tree? And by tree I mean, of course, here, the minimum spanning tree. Is it equal to 2 times the weight of the tree? Or is the length of the tour, at most, 2 times the weight of the tree? And by tree I mean the minimum spanning tree here.
And the answer here is that the length of the tour is exactly 2 the weight of the spanning tree. That is because of the way that the walk was constructed, so you'll remember that when I just showed you how to walk along a spanning tree, or any tree for that matter, you use every edge exactly twice. So, if the weight of the tree is the sum of all the edges, then the length of the tour will be twice that number.
So now one final piece of information. Let's compare the length of the shortest tour for the input graph with the length of the tour that the algorithm computes. So, again we're going to have the length of this tour, and by this I mean the one output by the algorithm, and compare that to the length of the shortest tour. Again, I would like you to tell me what I should put here. Is the length of the tour computed by the algorithm smaller than the length of the shortest tour, is it at most as large, is it equal, is it bigger, or is it at least as large?
And of course, the part that is easy to see is that this is the shortest possible tour, so these two here cannot be true. And also this one here cannot be true because, after all, this is an approximation, and it is not guaranteed to give us the optimum solution. The question now is which one of these two here to choose. So is it always strictly smaller or can it be the same? And the correct answer here is that the length of the shortest tour can actually also be equal to the length of the tour output by the algorithm. It's not guaranteed to be, but it can be. And this could be the case, for example, if the input that we are given is already a tree, then the spanning tree will just be the input network itself. Then a shortest tour is indeed a walk along that tree. So in this case, you would have equality here. So length of the tour that we compute is at least as large as the length of the shortest tour but not always strictly bigger.
So now that we have 3 pieces of information, let's put these pieces together. So the first two that we will put together is the statement here on the second line and this one here on the third line. So we know that the length of the tour computed by the algorithm is exactly two times the weight of the minimum spanning tree. Now we will use the piece of information that we will have up here, so let's make a little bit more space down here. We know that the length of the shortest tour is larger than the weight of the minimum spanning tree, so we can put this part down here. Now isn't that beautiful. At least I think so. So the length of the shortest tour must lie somewhere in between the weight of the minimum spanning tree and the solution that you have just computed, which is twice the weight of the spanning tree. And this means that the algorithm I specified here is actually a constant factor approximation algorithm for shortest tour. Oh, I forgot to write the factor here. Well, of course I did that deliberately because this is going to be your final quiz for this algorithm. Please tell me what is the approximation factor, using this information we've just figured out, that this algorithm has for shortest tour?
And the approximation factor here is a 2, because the length of the shortest tour is larger than the weight of the tree. And the solution that we output is twice the weight of the tree, so it must lie somewhere in between here, which tells you that this solution here is at most twice as large as an optimum solution. So Dave can now be very happy because he has a factor 2 approximation algorithm for his problem. Which, of course, should he ask you, you can easily explain to him, because you figured out most of this by yourself.