So now we have factor-2 approximation algorithms for Alice, who of course is very happy about this, and we have a factor-2 approximation algorithm that Bob can use. So Alice and Bob are both working on problems that apparently can be approximated very well. Now what about Bob and Carol? So Alice was working on vertex cover. Bob, as you remember, was working on clique, and Carol was working on independent set. So we now know that there is a factor-2 approximation algorithm for vertex cover, and we have seen in the previous units how closely related vertex cover is to especially independent set but then also clique. So my question for you is now-- now that we have a factor-2 approximation algorithm for vertex cover, do we also have a factor-2 approximation algorithm for clique and independent set? So can this be carried over here? So do we have a factor-2 approximation for clique and independent set, (and I'm just writing independent set like this here) if we know that we have a factor-2 approximation algorithm for vertex cover? And of course there are two possibilities here, so you could easily get this by guessing. But please thing about it for a while before giving me your answer.
So the answer here, and maybe it's a bit surprising, is actually that we can't be sure yet because the reduction between vertex cover and independent set, although it was a very, very easy one, could actually be destroying the approximation factor.
So let's investigate this a little bit further, and we'll focus on vertex cover and independent set for now. So let's say we have as input a graph with N vertices. Now let's say we compute a minimum vertex cover for that graph, and that vertex cover has size K. Then as you remember from the reduction, we know that the largest independent set in the same graph has size N minus K. If we use the approximation algorithm for vertex cover, then that algorithm will find a vertex cover that is at most 2K in size. What that means for independent set, if we use the same approximation algorithm, that we know that the graph has an independent set of size at least N minus 2K. So this is what the approximation finds, and this here is an optimum solution. It means that over here we have a factor to approximation, as you remember. Now over here, to figure out the approximation factor, we have to divide what the algorithm finds by the size of an optimum solution, which is equal to 1 minus K over N minus K. So the factor to approximation algorithm for vertex cover is a factor 1 minus K over N minus K approximation algorithm for independent set.
And now my question to you is, is this a useful approximation factor? I'll give you a number of choices here. Is it not possible to make a statement here, because it looks a bit weird doesn't it? Is it a useful approximation factor only if the independent set is a majority of the vertices? The maximum independent set of course. Is this a useful approximation factor for large networks, or is this not a useful approximation factor at all? Please select the ones here that are correct, and then let's discuss.
And very unfortunately, it turns out that this is no useful approximation factor at all. The approximation factor depends on the size of a minimum vertex cover. It is not a constant factor approximation factor. So that's already a bit weird, but let's look a bit closer at this expression here. So for a good approximation factor, we want this to be as close as possible to 1, because this is a maximization problem, so you'll remember that the approximation factor is 1 over C, and of course, the smaller this constant, the better, and then this whole expression gets closer to 1. So let's say we want a factor 2 approximation for independent set, then we would have to have that 1 minus K over N minus K is at least one-half. If you solve this for N and K, what you will get is that K must be smaller than or equal to one-third of N. So this approximation algorithm is only useful when the input graph has a comparably small vertex cover. Now if it has a small vertex cover, then it will have a very large independent set, so you might be inclined to think that this could be a correct answer. It is a useful approximation algorithm if the independent set consists of a majority of the vertices, but the problem here is that when you are given a network, you just don't have that information here. So if you run the approximation algorithm, it will give you some answer, but it will not really allow you to make any statement about how good the algorithm is. It's a kind of thing that you only know after the fact. So once you would know what an optimum vertex cover or independent set looks like for the graph, then you could have said what the approximation quality is, but in that case you wouldn't want to use an approximation anymore because you already have an exact solution.
So unfortunately the factor-2 approximation that we have found for vertex cover does not carry over to clique or independent set. Well we've only shown that it doesn't carry over to independent set, but since independent set and clique are, again, so closely connected you can already see or at least guess why it doesn't carry over here as well. And I've actually mentioned a similar thing to you in the last unit when we talked about fixed parameter tractability. Where it was also the case that if you took the size of the optimum solution as a parameter, there is a parameterized algorithm for vertex cover, but clique and independent set are both not fixed parameter tractable. So the thing here is this. A reduction that you use to show NP-completeness is, in a way, a very coarse way of reducing problems to each other. So coarse that it can destroy approximation factors, that it can destroy fixed parameter tractability, and so on. So there are special types of reductions that will keep approximation factors or fixed parameter tractability. Unfortunately we can't go deeper into them in this course, but you should just know that just because two problems are closely related, that does not mean that algorithmic techniques, such as approximation algorithms, easily carry over from one problem to the other as well, unfortunately.
Now here, just because the fact 3 approximation algorithm for vertex cover doesn't carry over here doesn't mean doesn't mean there can not be an approximation algorithm for Clique and Independent Set; it's just that we can not use this one. So Bob and Carol still don't have to be unhappy, but unfortunately, I'm now going to make them very unhappy. For Clique and Independent set, there is no constant factor approximation algorithm unless P equals NP. That, of course, is a very unpleasant surprise and it's very surprising, so where did I get that from? Well, the proof for this result here relies on some very, very deep results of theoretical computer science which you will only get to know in very advanced theoretical computer science courses, if ever. But what I would like to do is just give you an intuitive idea for why there is no constant factor approximation algorithm for Clique and Independent Set unless P equals NP.
Suppose we're playing a game--and that game is "guess the word"-- where I want you to guess a word, and all I'm giving you is how many letters this word has. So one, two, three, four, five, six, seven. So assuming you haven't already skipped ahead to the next part of this unit, can you guess my word?
Of course, you can not guess the word simply by knowing the number of letters, because there are lots and lots and lots of 7-letter words.
Now, what if I gave you two letters.? Can you then guess my word? Of course, if you can't guess it, just skip ahead, but I think there are good chances that you'll figure this one out.
Of course, I was thinking of the word "Udacity." Chances are, I think, you've gotten this one.
So what does this have to do with approximation algorithms? Well, it shows that even though a problem can be very hard initially, once you know just small parts of the solution it can become rather easy to obtain a complete solution. So the deep resolve that underlies the statement here basically works the same way. It says that if you know a small part of the solution of Clique and Independent Set, then you can use this to construct an optimal solution. The real proof of this is much more complex, but that is the basic idea. Having a constant factor approximation algorithm for Clique or Independent Set would reveal so much information about a perfect solution that you could construct a polynomial time algorithm from this. And actually there's a similar result for Vertex Cover, surprisingly, but Alice doesn't need to worry that much, so for Vertex Cover there is not constant factor approximation that is better than 1.36, unless P=NP. And actually, many people believe that this number here should be higher, but this is the highest number that has been proven so far.
Now, finally, what about Bob, who is solving Shortest Tour? Well, Shortest Tour is a problem that hasn't been investigated that much. In the general case, for it's closely related problem, Travelling Salesman, we don't know. But there are many special cases of this problem where good approximations are actually possible. But the bounds of approximation for Shortest Tour and Travelling Salesman have actually not been investigated that much. So if you're really into it, that will be a possible area for research. Although, it seems to be quite hard to actually transfer the results from here over to Shortest Tour. So maybe we can make Alice a little bit less happy; although, she has been quite lucky in this course. She has a constant factor approximation algorithm, and the limit of 1.36 probably doesn't affect her that much. And Bob, well, it's not really clear if he should be happy or not, but as we mentioned, in practice it's likely that his problem is actually very well solvable.