In the last unit, we have talked about what you can do when a problem is NP-complete, despite it being NP-complete. And we have found out that there are techniques such as optimized search trees, preprocessing, and fixed parameter tractability that you can often use to solve NP-complete problems and practice despite their hardness in general. So what can we do when these techiniques don't work, and we don't want to prove that P=NP? Well, in that case, we can try something else. We can try to give our algorithm a little leeway. So instead of demanding that the algorithm finds the best possible solution, we'll allow the algorithm to find a solution that is within a certain range of the best possible solution. In return, of course, we'll demand from our algorithm that it runs in polynomial time.
In the last unit, we have talked about how if we encounter a problem that is NP-complete, we can often, using very advanced algorithms, still hope to find a solution for those problems, at least in practice. Because you learned about techniques such as intelligent search trees and preprocessing. Now, in this unit we're going to investigate something different. So consider a challenging problem such as measuring the length of an elephant. Now, doing that exactly is actually not that easy because the elephant might be moving and it might not like the ruler down here. So finding an exact solution such as the elephant has a length of 5.107 meters might actually be quite a difficult thing to do. On the other hand, if you allow for a bit of sloppiness, and say, "Well, let's just estimate it at about 5 meters," then you can get a solution much faster. What we're going to look at in this unit is if sloppiness, or at least allowing for a bit of leeway, can actually help us solve NP-complete problems more efficiently. And what I mean by sloppiness is the following: Let's say we wanted to solve shortest tour. What we always asked was, very demandingly, what is the shortest tour? No excuses, no leeway, no sloppiness. What is the shortest tour? And what would happen now if we didn't ask what is the shortest tour, but rather asked the following: What is a pretty short tour? So for example, 20% within the optimum. Then we become less demanding on what the algorithm is supposed to produce. And now, of course, the question is, will this allow us to construct faster algorithms? Now, before we dive into this, I would like you to quickly think about this approach for a bit. In which scenarios might a sloppy, or not-exact solution be acceptable? So what I would like you to think about for a little bit is, in what scenarios approximation to the best possible solutions would be acceptable? So approximations is the more formal term for sloppiness, and, of course, just to be precise, in this unit we're going to be dealing with optimization problems. You can also talk about approximations to decision problems, but that's a bit of a different scenario, and, actually, we'll get into that when we talk about randomization. But this is not part of this unit here. So would we be content with approximations if the stakes are low, so we're solving an approximation problem where nothing is really critical? I mean, you might spend a little bit more money, but it's not going to mess up research or anything like that. Would it be acceptable if the algorithm kind of sounds good? We do not have a provable guarantee, but it still appears to yield good solutions and make sense in general. And would using an approximation be acceptable if we find that exact solutions are simply out of the question? We're using the best search tree; we're using preprocessing, but still our instances are so large, and they behave so badly that we just cannot find an exact solution.
And, of course, this is 1 of those easy intro quizzes just to get you thinking a bit about approximation. So it's clear, when the stakes are low, yes, approximate solutions are acceptable. If we gain running time, why not? If the approximation is very good then usually those solutions are also acceptable. The important thing that I wanted to emphasize is that, in this case here, I said that we have a provable guarantee on how good the algorithm is. So we'll not get into ±1%, unfortunately, in this unit, but what we will be doing is, we will be analysing approximation algorithms to see how good they really are. Because the thing is this: Approximation algorithms, where you have not done the analysis and they just sound good, can actually lead to very, very bad solutions, and I'll show you 1 example for vertex cover in this unit. So using approximation is okay, but you have to do the analysis nevertheless to see how good that approximation will be. Otherwise, you can walk into some rather nasty traps, I would say. And finally, whenever an exact solution cannot be obtained, yeah, of course, then we have to try an approximation if the problem is important to us. The important thing here, that I would like you to keep in mind, is what we found out in the last unit: that often it is possible to find exact solutions for NP-complete problems and practice. So many people tend to start out with approximation algorithms once they hear that a problem is NP-complete, and actually, I would like you to think about this the other way around. So first of all, NP-completeness can definitely mean that there still is a possibility for finding exact solutions, and only when that approach fails, or you're under time pressure, or there's no really good known search tree for your problem, and you have to solve it nevertheless. Only then would I like you to think about approximation. So I guess my message is this: Whenever you encounter an NP-complete problem, be demanding an exact solution unless you really know it's not possible.
So let's talk about approximation quality. Because, obviously, even though you've decided that you're content with a solution that is not optimal, you still want to ensure a certain quality of your solution. And we'll be looking at 2 different types of approximation algorithms in this unit. The first 1 is called a constant factor approximation. And what a constant factor approximation is, well, as the name says, there is some constant C, so when you have an optimization problem where you want to minimize some quantity, this algorithm will guarantee you a solution that is no worse than C x the best possible solution. For a minimization problem, your solution will be ? C x the optimum, or the best possible solution. And that is a guarantee that this constant factor approximation gives you. So this is for minimization, and for maximization--so minimization problem you're trying to find this point here; for maximization problem, you're trying to find this point here, basically. Your solution is at least as good as 1 over C x the optimum. So let's do a quick quiz to practice this. So say you have a factor-2 approximation algorithm, which means that C = 2. So you have a factor-2 approximation algorithm for vertex cover. You run the algorithm on an input, and it returns to you a vertex cover of size 100. And, of course, this is a correct answer, so it is, indeed, a vertex cover, we just don't know if it's the smallest possible vertex cover because we're using an approximation algorithm. What I would like you to tell me now is 2 things: First of all, what is the maximum number of vertices in an optimum solution, and please enter your answer here, and what is the minimum number of vertices in an optimum solution? And please enter this number here in this box.
Now, the first 1 is actually quite trivial; so first of all, vertex cover is a minimization problem, so we're looking for a vertex cover that is as small as possible. Now, if the algorithm already finds one with 100 vertices, then that means, in an optimum solution, there cannot be more than 100 vertices. Because we already know that there's a vertex cover of that size, the optimum solution might be smaller, but it will certainly not be bigger. Now here comes the interesting part. So we just said vertex cover is a minimization problem, which means if you have a factor-2 approximation algorithm, then the solution that we found, and we found a solution with 100 vertices, is ? 2 x the size of an optimum solution. Which means if you solve this equation, that the optimum is at least as big as 100/2, or 50. So the minimum number of vertices in an optimum solution is 50.
Now, let's do the same thing again for clique. So, again, we're going to assume that we have a factor-2 approximation algorithm, and the algorithm gives you a clique for a given input. Again we're going to assume that the clique that you're getting contains 100 vertices. So now what is the maximum number of vertices in an optimum solution to clique? And what would be the minimum number of vertices that are in an optimum solution? So please, again, enter your answers here in these boxes.
Now, the important thing to notice here is that clique is a maximization problem. That is, we're trying to find a clique that is as large as possible. So we have already found a clique with 100 vertices. So an optimum solution has to contain at least 100 vertices, because we already know that a clique of this size is contained in the input network. So this time, this 1 over here is easy. Now, what about the maximum number of vertices in an optimum solution? There could be a clique that is bigger than 100 vertices, but we just haven't found it. Well, since clique is a maximization problem, we have to plug it into this equation here. So our solution was 100 vertices which is ? 1/C. C is the approximation factor, 1/C x the optimum, and if we solve that equation, we find out that the optimum solution contains ? 200 vertices. So this gives you a good intuition: For a minimization problem, a factor-2 approximation algorithm means that we could have found a smaller solution, but it is limited by this factor-2, and for a maximization problem, we could have found a larger solution, but this is also limited by this factor-2, or, more precisely, by 1 over this factor of 2. So the factor here kind of gives us a range that tells us, in a worst case scenario, how far away we are from the optimum solution.
In addition to constant-factor approximation algorithms, which we will look at in more detail soon, we are also going to be talking about advanced algorithms that are known as polynomial-time approximation schemes. Scary term. What the idea of a polynomial-time approximation scheme is, it's a sort of more advanced constant-factor approximation, so you can think of it as a sort of scale. The idea behind a polynomial-time approximation scheme is that the running time actually depends on the quality that you want to get. With a constant-factor approximation you always get the same guarantee, say a factor to approximation, and in a polynomial-time approximation scheme, you can more or less decide what the quality is that you want, but the better a guarantee you're demanding, with respect to quality, the more running time you'll have to put into that algorithm. That's the trade-off here. But first of all let's look at constant-factor approximation. We are now going to meet Alice again. Alice, as you know, is working on vertex cover. So far she has been the most lucky in this course, because she seems to actually be working on a very well-behaved NP-complete problem. We also learned that there's good search trees, there's good preprocessing, and vertex cover was even fixed parameter tractable. Alice can be quite optomistic about approximating vertex cover, and actually she has come up with 2 different ideas of how you could approximate vertex covers. Here are the 2 possible ideas that she has come up with. The first algorithm takes this input graph and then looks if some edges are still not covered in that graph, and if that is the case it takes the uncovered edge and puts both of the endpoints of that edge into the vertex cover. Then the algorithm looks again if there is still some uncovered edges, if there are any of those, then it takes again one and puts both endpoints into the vertex cover and so on until all edges are covered. And she calls that algorithm "Take 2," because it always takes 2 endpoints into the vertex cover each time you go through that loop until you're done. Here is her second idea, and she calls that idea "Greedy," because what that algorithm does-- It's still the same in the first line here. It looks if some edges are uncovered. Then what it does is it goes through all of the vertices to determine which vertex could cover the most edges that are still uncovered. So which vertex is next to the most edges that have not yet an endpoint in the vertex cover, and then it puts that vertex in the vertex cover. It kind of tries to maximize the coverage that it can achieve in each round that this loop here is run. So looking at this strategy, so these 2 algorithms, what I would like you to think about for a moment here is, which of these 2 algorithms should we expect to perform better in terms of approximation quality? The one that always takes 2 endpoints, or the one that tries to maximize coverage?
The answer here, of course it's a bit subjective, but I think from just looking at the strategies that these 2 algorithms use here, it seems as if the "Greedy" alogirthm should perfom much better, because it always tries to cover as much as possible in each round, while this one up here seems almost a bit stupid, because it just takes any edge and always blindly puts both endpoints into vertex cover. Although actually just 1 endpoint would be needed to cover that edge. You may already be guessing that I asked you this question because I am now going to show you the exact opposite. Because it's the case that although this algorithm here seems better, and the strategy appears to make more sense, actually from a point of approximation quality, at least if you consider the worst case, this algorithm up here, "Take 2," is actually the better algorithm.
Let's look at the approximation quality of these algorithms, and we're going to start with the "Take 2" algorithm. Here is the little graph for which we are going to try and example. As a first practice quiz, how large is a minimum vertex cover for this small graph here? Please enter your answer here into this box.
As almost always with vertex cover, there is a number of possible solutions here, but the minimum vertex cover will always be of size 4. I'm going to give you the solution that I found. I could for example put this vertex here into the vertex cover. This one, this one, and this one. Then I have these edges covered by this vertex here. I have these edges covered by that vertex, and this vertex here covers these 2. This vertex here covers the rest.
Let's see what the "Take 2" algorithm would do if it's given this graph here. The algorithm of course doesn't know the minimum vertex cover, but I'd keep those vertices marked here to help you answer the next quiz. What I would like you to think about is, how many vertices from the minimum vertex cover, does the "Take 2" algorithm put into its solution in each loop? By loop I mean this part here. So each time that this part here is run, how many vertices that the "Take 2" algorithm puts into the vertex cover actually belong to the minimum vertex cover? I'm going to give you 5 choices for this. Is it something that we can't make a general statement about? Is it at least 1? Is it at most 1? Is it exactly 1? Or is it exactly 2? Please select the correct answer here.
If you play around a bit with this algorithm what you will see is this. Each time that this loop here runs, it puts at least 1 of the vertices from the minimum vertex cover into its solution. So the correct answer is this. Let me illustrate this. If you take any edge here, so the algorithm starts out fresh, and you take any edge from this graph, it's impossible to take an edge where not at least 1 of the points is in the minimum vertex cover. Say we take this edge here, then "Take 2" would put these 2 vertices here into its solution, and 1 of those vertices is actually part of the minimum vertex cover. Or if the "Take 2" algorithm say uses this edge here, then we even have a better case, because both of the vertices that it now chooses to be part of its solution, are actually also part of a optimum solution. And this works for all edges. No matter what edge you choose here, you will always have at least 1 endpoint that is part of a minimum vertex cover.
Now that we know that each time this loop here runs, it puts at least 1 vertex from the minimum vertex cover into its solution, what can we say about the running time of this algorithm here? As in the previous unit, we can express the running time actually in 2 ways. We can either use n, the size of the input, or we can use k, which is the size of a minimum vertex cover for the input graph. So when you run "Take 2" on a given input of size n, and where the size of a minimum vertex cover is k, how often is that loop here executed? Does the loop run up to n times? So up to the number of vertices times. Does the loop run at most k times? Or does the loop run at least k times? Please select the correct answer here.
And this might have required a little thought, but since we have figured out that each time this loop here runs, it puts at least 1 vertex from the best possible vertex cover into its solution. It means that this loop can run at most K times because once it has run K times, it must have put all of the vertices from the minimum vertex cover into its solution. And in that case all of the edges are covered. So the condition for this while loop here saying that some edges must still be uncovered cannot be true any more after this loop here has run K times.
So we're very close to saying something about the approximation quality of the Take 2 algorithm, because we now know 2 things about this loop here. First of all each time this loop here is run, it puts at least 1 vertex from an optimum solution into its solution set. And the second thing that we know is that this loop here runs at most K times, where K is the size of a minimum vertex cover. Now my question to you is, after this algorithm here is done, how many vertices has it put into its solution set? Has it chosen at most K vertices, at most 2K vertices, at least K vertices, at least 2K vertices, or on average 2K vertices? Please select all that are correct.
There's 2 correct answers here. The first one is very obvious. It's this one here. Because the algorithm, of course, returns the vertex cover and the vertex cover can never get smaller than K. But the really useful answer is this one here. The algorithm has chosen at most 2 K vertices where K would have been the optimum number. Because each time this loop is run, it chooses at least 1 vertex from the optimum solution. What that also means is that it chooses at most 1 useless vertex that didn't have to be part of that solution. So the error that it makes is at most 1 extra vertex, but it can make that error at most K times. And for that reason, the algorithm chooses at most 2K vertices which means that although the Take 2 algorithm doesn't look very smart, it's actually a factor 2 approximation algorithm. It guarantees that the solution that it gives you is at most twice as big as an optimum solution. So let's now have a look at the algorithm that seems so sophisticated.
So let's start out again with an example here. And by example, of course, I mean a quiz. Actually I mean a triple quiz, because I have 3 questions for you. The first one is the size of a minimum vertex cover for this graph here. The second one is the size of the greedy solution. And then finally the size of the solution that would have been produced if we had used the Take 2 algorithm. So please give me your 3 answers here.
So the first one is easy to see. The size of a minimum vertex cover is 2 because you can just take this one here and this one here, and all edges are covered. Of course I constructed it that way. And that is also the size of the greedy solution. Because what the greedy algorithm will do is it will look at the whole graph and then say, oh which vertex could I use to cover the most edges? And actually this vertex here is next to 7 uncovered edges, and this one here as well so it will just choose 1 of them. Put that one into the vertex cover, say, and then in the next round all of these edges here are already covered, so it will just say, oh, here's another vertex I can use to cover 7 edges. So let's take that one because all of the other vertices, they can just cover 2 or 1 edge. So after executing this loop just twice, the algorithm is already done and it has even found an optimum solution. Now the Take 2 algorithm, on the other hand, well that looks a bit stupid compared to the greedy algorithm for this graph here, because what would it do? Well, it takes any uncovered edge--it doesn't really matter which one. So let's say it takes this one here, and then it will put this vertex here into the solution set and this vertex here. So all of those edges here are covered, and that one here as well. And, well it's not done yet, so it will have to choose another uncovered edge, let's say it's that one here, and again put 2 vertices into the solution set and then all edges are covered. So here the Take 2 algorithm, which you know by now is a factor-2 approximation algorithm, actually performs as bad as it can. So it really maxes out this factor-2 leeway that it is given.
But now the important thing is this. Just because I have given you 1 example where greedy works extremely well doesn't mean that this algorithm will perform well on any network. We haven't yet proven that it has a good approximation quality. And a good way to test approximation quality is to always try and construct an input for which the algorithm will perform really, really bad. So basically we are now going to try and trick this algorithm here--the greedy algorithm--into picking suboptimal vertices, basically using this greedy strategy against it. And if we do this basically using this greedy strategy against it, that will give us some indication on where this algorithm could fail and how badly it could fail. So if it's a factor 2 approximation algorithm, if it's a factor 1.5 approximation algorithm maybe, or if it's actually not a good approximation algorithm at all although it sounds good. And of course this is similar to some of the problems we have already had in this course. Finding out how to trick this algorithm, that takes a bit of playing around. So next to the screen here I have lying about 4 or 5 sheets of paper on which I experimented around to find out how I could trick this algorithm, and whenever you want to trick an approximation algorithm, it just takes a bit of time to sink your teeth into that. But here is something that I have come up with, and it's very similar of course to the ideas that other people have also had to try and trick the greedy algorithm. So I'll start out with the amazing number of 24 vertices. Now as if this weren't enough, I will now add 12 additional vertices. You can now see why I used up so much paper playing around with this. And then I'm going to connect them like this. Now my question to you is, how many vertices would greedy choose if we gave it this graph here as an input? And please enter your answer here.
And the answer is, of course, that greedy would choose 12 vertices because it always chooses the ones with which it could cover the most edges. So this one here, this one here, this one here, this one here, and so on. And that is, of course, a total of 12 vertices.
What am I going to do next? Well, guess what? I'm going to add more vertices. And this this time I'm going to add 8 vertices, and I'm now going to connect them this way, always groups of 3. And guess what I'm going to ask you now? How many vertices does greedy choose?
And the answer here is that greedy chooses 20 vertices. Why is that? Well first of all, greedy will choose the 8 vertices that I added down here because they have the largest number of edges attached to them, which is this one, this one, this one, you get the picture. So we'll choose 8 vertices down here. And then after that the whole bottom part here of the network will be covered so it will then proceed as it did before. It will choose these 12 here. So that makes a total of 20 vertices.
So maybe you already see a pattern emerging here, but I am now going to deviate just a little bit from this. I am now going to add 3 vertices here, 3 vertices here, And just so that we keep that in mind here, up here we have 8, here we have 12, and here we have 24 in these layers here. Now how am I going to connect those? And this is where the picture gets actually quite messy. I'm going to connect them to all those here. So each of these 3 vertices here that I've added is connected to 6 other vertices, so there's 6 edges going out here--1, 2, 3, 4, 5, 6--and what you see also is that each of the vertices here in this layer--the first layer that I added-- is connected to exactly 5 vertices. So 1, 2, 3, 4, 5. And of course I deliberately constructed this in this way, and I'm going to draw these out here in a minute. I just want you to understand the principle. I've added these 12 vertices here in a way such that the greedy algorithm will first take these 12 here, then it will take these 8 here, and then it will take these 12 here. So I'm constructing my graph so that the greedy algorithm will choose the vertices in exactly the reverse order that I am adding them to the network. And in that way I can make that algorithm behave very, very, very badly. You would think that theoretical computer science is a clean and not messy science, but I think I can just show you the opposite here. But you get the principle now. So these 24 vertices here are the first ones that I have added, and I have now added these layers here. So I first added these 12 here, then these 8, then these 12 here in such a way that the greedy algorithm will choose these vertices here to be in a vertex cover and these here in a vertex cover. So the question, of course, now is, can I add even more vertices to this so that I am still making the greedy algorithm take my new vertices that I add? And in fact I can, but it will become really messy to draw. So I'm more going to explain to you how I'm going to do this rather than actually draw out every single vertex. So you see for each of the vertices that I've added here, I've connected them to larger and larger groups of these vertices here in the middle of those original 24, and I can continue this for a little bit by now adding 2 vertices which would then be connected to groups of 8 down here. So this would be connected to all over here, and then I will do the same part here. So again I can add 2 vertices here, and I will connect them to this group of 8, then 2 more vertices here, and I will connect them to this group of 8. And if you want, you can of course check. It will still be such that the 6 vertices I've now added will be the first ones that are picked by the greedy algorithm. And then those 12, and then those 8, and then those 12. What I can then do is I can add another 4 vertices, and this time I will look at groups of 12. So I will add 4 vertices connected to these 12 here in the middle, and I will add 4 vertices connected to these 12 here in the middle. And we'll be done soon, so in case you think this gets very tedious, bear with me for just another moment here, and we'll be done. So we've added another 8 vertices, and again these will be the first ones picked by the greedy algorithm. And final addition of vertices actually. I will add another 11 vertices. And these vertices will be connected to all 24. Now my question for you is, how large is a minimum vertex cover for this network that I've constructed here? And remember every vertex is connected to those 24 vertices here in the middle. So please enter the size of the minimum vertex cover here into this box.
And the answer--and of course I've given you that hint--is 24. Because every vertex that I've added is connected to these 24 here in the middle. So by just choosing those, I can cover every edge in the graph.
Now my next question to you is--how many vertices will the greedy algorithm choose? So please give me your answer here in this box.
Well, the way we constructed this graph or the way that I first construct it--well of course as follows that we made the greedy algorithm take all vertices into the vertex cover except for those 24 here. So every vertex except for those that are actually in a minimum vertex cover. So how many are there. Well, up here we have 11, then we have an additional 14 here, then we have 12 here, as opposed to a minimum vertex cover, which is 24.
So given that the greedy algorithm takes 57 instead of 24, what's the approximation factor for this concrete example here where the approximation constant? And I would like to enter your answer here and please give me 2 digits after the decimal.
And the answer here is that the approximation factor is 2.38, so it's performing worse than that take two algorithm, which have an approximation factor of 2. So what that means is that the greedy algorithm, although the strategy might sound good, doesn't perform as well as the take 2 algorithm, which is guaranteed to have an approximation factor of 2 and that's why it's so important to prove properties of algorithms and not just saying--well, this is a sound strategy, this one sounds good, because while it may sound good, it can actually in some cases perform really bad. So does this mean that the greedy algorithm is useless and we should always use and we should always use the take 2 algorithm. Well, not really because actually for most instances that you would encounter in a real world setting, so for example a telecommunications networks as Alice is trying to do-- this algorithm here will perform very well, but it is not guaranteed to perform very well.
So now my question for you is, of course, which would Alice do if she want to approximate vertex cover? Should she run the take 2 algorithm, should she run the greedy algorithm, or should she actually run both algorithms?
And the answer here is that I would recommend that Alice run both algorithms because both algorithms, if you look at them, are very simple so they can be implemented to run very fast even if the input graph is very large. So if she can first, for example, run the take 2 algorithm and then she already has a solution for which she has a guaranteed, so running the take 2 algorithm will defer some information about an optimum solution and that is the optimum solution cannot be smaller than some quantity x. And if she then runs the greedy algorithm, she can just see if the greedy algorithm gives her a solution that is better than the take 2 algorithm. In this case, she should take the greedy solution or if it produces adverse solution, so if by some accident she's running it on a graph that has the same properties than the one that we just constructed to trick the algorithm, then she can take the solution, and in this way, she gets the best of both algorithms. She gets the good performance or the generally good performance of this algorithm, but she ensures that she is within certain guarantees. The important thing here to keep in mind is just because an approximation algorithm sounds like it make sens or sounds like it is a good idea, doesn't really mean that it is a good idea. It could be a good idea, but unless you analyze the algorithms and try to prove that properties--you'll never know.