cs313 ยป

Now the hardest part is done. We have a problem we know to be NP complete, and that problem SAT. So from now on, showing that other problems are NP complete will get a lot easier, because we don't have to go through this laborous process n coding machines anymore It will just suffice to show that SAT can be reduced the the problem we're looking at. So we can use SAT to show other problems to be NP complete now, and once we have shown these problems to be NP complete, we can use them as well to show again, other problems to be NP complete, so we are basically building this whole tree of NP complete problems using SAT as a c problem and it will just suffice to show these reductions here; we do not have to do the same tedious steps of Boolean formula n coding anymore, because we already know SAT to be NP complete now. So let's take 1 step back now with our next quiz. What happens if you show one NP complete problem to be solvable in polynomial time? And by solvable in polynomial time, of course, I mean on a deterministic RAM. Would that mean that some problems in an NP can be solved in polynomial time? Again, we're talking about a deterministic RAM here. Would it mean that all NP complete problems are solvable in polynomial time? Or would it mean that all problems in NP are solvable in polynomial time? And again, more than 1 of these answers here can be correct, so please check every statement that is correct.

And the answer here is that all 3 statements are actually correct. So the strongest one is probably this one here, so if you just happen to show just one single NP complete problem to be solvable in polynomial time, then all NP complete problems are solvable in polynomial time on a deterministic RAM, so on your home computer, basically. If all NP complete problems are solvable in polynomial time, because they are NP complete that also means, of course, that all problems that are an NP are solvable in polynomial time. So this one here is true, as well. Some problems in NP can be solved in polynomial time, yeah. We already know that, and actually, that's the weakest of the 3 statements, so if all problems are solvable in polynomial time, yes, some problems are solvable in polynomial time.

So now let's look at this statement in a slightly different way. What would happen if you showed 1 NP complete problem to be solvable only in exponential time. So it's not solvable in polynomial time, you show for 1 NP complete problem that requires exponential time to solve, on a deterministic RAM again, of course. So would that mean that some problems in NP are only solvable in exponential time? Would it mean that all NP complete problems are only solvable in exponential time? or would it mean that all problems in NP are only solvable in exponential time? So please, once more, check all of those that are correct.

And this time not all statements are correct, but the strongest one of them is, so if you showed just one single NP complete problem requiring exponential time to be solved, then all NP problems will require exponential time to solve, and the reason for that is that all NP complete problems are connected to each other using polynomial time reductions, so if one of them requires exponential time and another one requires polynomial time, then that just can not be, because say if this problem here requires polynomial time, that would mean that you can SAT, for example, in polynomial time, which would then mean that you can solve this problem over here in polynomial time, and so on and so on. All problems that are NP complete are from a viewpoint of polynomial or exponential time equally hard to solve, which would also mean that the weaker statement here is true. Some problems in NP are only solvable in exponential time; yes, if all NP problems require exponential time, then there are problems in NP which require exponential time. Finally, this statement down here; that is not true. All problems in NP are only solvable in exponential time. That is not true. Not all problems in NP require exponential time. So remember our picture from before when we said that P is basically a subset of NP, so these problems here will still be solvable in polynomial time. It doesn't matter if there are some problems in here that require exponential time. It doesn't effect these problems here in NP.

Now back to our initial question. What does all of this mean for Alice, Bob, and Carol? We have already seen that the problems they are trying to solve are contained in NP and so what we could now try is to show that their problems are actually NP complete. To do that, what we need to do is we need to reduce SAT to one of their problems, and we're going to do this with clique. It's also possible to do this with vertex cover or independent set, but clique is actually one of the easier ones to see. And now what we need to show is for any given instance of SAT, so for any Boolean formula, if you will, we can transform it into an input for the clique problem in polynomial time such that clique as a decision problem will say yes if the Boolean formula has a satisfying assignment and no otherwise, and once we achieve this, we know that clique is NP complete and since we have already seen there are polynomial time reductions between clique, vertex, and independent set, if clique is NP complete, then these 2 problems here are also NP complete which of course will mean that not only will Bob be very unhappy about this, but Alice and Carol will be in this together, as well.

So we're trying to show that any Boolean formula can be transformed into a graph such that this graph has a large clique in that case only if the Boolean formula has a satisfying assignment, and here's one fact that's going to make our life a little bit easier: Every Boolean formula can be written like this, so you the brackets around variables that are only combined by or, so you have variable 1 or variable 2 or variable 3, and you can have more than 3 variables in here, but you have nothing else; you only have variables and or and between the brackets, you always have an and, so you have brackets of or and bracket of or and so on. This is known as the conjunctive normal form of a Boolean formula, and it can be shown that you can write any Boolean formula, no matter how complex in this form here, and it will not significantly change the size of the formula, so if you have n variables and your Boolean formula has size polynomial in n, it will stay that way if you to transform it into the conjunctive normal form, and if you have conjunctive normal form, then this part here in the brackets is known as a clause, so this here would also be a clause, and what you can see is the way that this is structured the Boolean formula, if it has a satisfying assignment, then each of these clauses needs to evaluate to true, because they're all connected by an and, so if one of these would evaluate to false, then the whole formula would not be satisfied. Now how how are we going to transform this into clique? So I'm just going to show you how to transform a Boolean formula that is written in this form into a graph, and then we'll discuss what this has to do with clique.

So let's say we are starting out and I'm going to do is using an example. So let's say we start up with the following Boolean formula (x? or not-x? or x?) and (not-x? or x?) and ( not-x? or x?) and the array we're going to transform this into a network is we're going to do groups of vertices for each clause. So for this clause here, for this one here, and for this one here. So, this is our first group of vertices and we're going to have one vertex for each variable in that clause. So, this vertex here is for the x?. This vertex here is for the not-x?. This one here is for the x?. For this one here, we're going to do another group. This one here is for the not-x?. This one here is for the x?. And for that clause done here. We're going to do the same thing. So we have not-x? and here we have x?. And now, we're going to draw a lot of edges. So from each vertex, we're going to draw edges to every vertex in the other groups with one exception. We're not going to draw an edge if we have the same variable, but 1 is the untouched variable and the other one has a not. So x? and x?, we're going to draw an edge. x? and x?, we're going to do an edge.. And x? and not-x? also but we're not going to do an edge here. And it works similarly for this one here. So not-x?, not-x?, that's something we don't care about. That's something where we draw an edge. no-x? and x?. And then we have a not. Here, we do not have a not. So we don't draw an edge. And down here, that's fine. And here we have a case where we have not-x? and not-x?. That's also fine. So we'll draw an edge here. It's early if we have a not and then here we don't have one. So for x?, we can draw an edge down here. Down here. Again, those are the same. So we do draw an edge. That's fine and we're draw here. Now let's look at those here. So we already have all the edges all to these groups. So we only need to look down here and then the graph edges here and the same thing is going to happen here. So x?, I'm not drawing an edge to this not-x? here and we're drawing an edge down here. And we've already drawn all the edges down here. So this is the network that we constructed. Now what about the size of the graph? So if the Boolean formula here has polynomial size, so we have n variables or in this case three variables and we said that we're going to have the total length of the Boolean formula. The polynomial and the number of variables. So, if this length here is polynomial, then the network also will have polynomial size because we have one vertex for each of the variables that occur here, so the size really doesn't change that much. We introduced edges but you already know that the number of edges is quadratic in the number of vertices that we have. So, the size stays polynomial which is already good for our reduction. But now I still need to show you what satisfiability has to do with clique. We're actually going to do this as a quiz. So let's say that the Boolean formula has m clauses. So in this case, m is 3 but you can consider the more general case here and then let's say that the graph that we have constructed has a clique of size m. So one clique of size m, for example here, because m is 3 would be this here. All vertices are connected to each other. There's others that you can find actually. So, if she can't find a clique of size m in a graph that was constructed this way, does that graph contain at least one vertex from each clause group and by clause group, I mean, the groups of vertices that we introduced for each of the clauses from our satisfiability formula. And what I would also like you to think about is if a clique can contain multiple vertices from one of this clause groups here. So if we could have a clique, say for example, that could contain these two vertices here or this two here or in the more general case, if we can have a clique that contains more than one vertex from each of these groups here. So please check those that are correct and leave the other ones blank.

And here, only one statement is correct. This statement down here is not correct. The clique can not contain multiple vertices from one of these groups, and the reason why that is is that in a clique, all vertices, as you know must be connected to all other vertices, but this is not how we constructed the graph. When we constructed the graph, all we did was draw edges between vertices from different groups, but we did not allow any edges between vertices in that group, so if you find a clique that can only contain exactly one vertex from each of these groups here. So if the graph has a clique of size m, and the clique can not contain multiple vertices from a clause group, this means that you have to have exactly 1 vertex from each clause group in your clique. So this statement here is definitely true. It contains at least one vertex from each clause group. But actually, we can make an even stronger statement and say it contains exactly one vertex from each clause group. And now we're almost done with our reduction, because if you think about it, so we solved clique for this graph we have constructed here, and we asked the decision problem, does the graph have a clique of size m where m is the number of clauses?

There's 2 cases here; so let's say the graph does have a clique the size of m, then what will happen is the following. So it contains exactly 1 vertex that corresponds to a variable in that clause, and what we also know is because we constructed the graph this way, that we can not have 2 vertices connected here in this graph where there's a conflict, so what we can not have in this graph, for example is 1 vertex that represent not x2 and another vertex that represents x2, because we avoided this in the construction but this directly shows us how we can satisfy this Boolean formula here, so if we have a not x2 here, we just set it to 0 so we could just set this one to false, so that not x2 becomes true, then this clause here is satisfied and also this one over here would be satisfied. And x1--that's a separate variable, so we can set it as we want, and we could set this one to false. So we have x2 and we can set that to false so not x2 becomes true. So that becomes true, that one becomes true, that one here is false. Over here, we have not x2 as well, so there's no conflict here. x1--we haven't yet talked about x1, but can just do the same thing; we can set x1 to false, so the whole thing not x1 becomes true. We have set it to false, and we haven't even looked at x3, but all of the clauses are already satisfied, and that's actually the cool thing to notice here. x3 is also not contained in the clique, so the vertices in the clique tell you how to set the variables so that the Boolean formula can be satisfied. So what if this graph here does not have a clique of size m? Well in this case, it means that we have 1 clause group which is not contained in the clique and if we have 1 clause group that is not contained in the clique, this means that there's no vertex in the clause group that has connections to all of the other vertices in the smaller clique that we're looking for, and that means, due to the way we constructed the graph, that every vertex has a conflict with another vertex in the clique so if there's no large clique, this means that every assignment of true and false of the variables will lead to a conflict such that at least 1 of the clauses can not be satisfied. So then the Boolean formula does not have a satisfying assignment. So now that we have this reduction, we know that clique is NP complete, which of course makes Bob a little unhappy. So now what about vertex independent set? Are those problems NP complete or are we not sure yet?

Now that we have shown clique to be NP-complete, we can easily say that independent set is NP-complete because we can transform a graph for which you want to solve clique easily into a graph on which you can solve independent set, and then that will give you a solution for clique. So we already have that reduction down here, which of course will make Carol quite unhappy. And we also had the reduction from independent set to vertex cover because we showed that solving vertex cover on the graph is basically the same thing as solving independent set. So vertex cover is also NP-complete. So Alice is now unhappy as well. And what this means is the following: The problems that Alice, Bob, and Carol are trying to solve are as hard as any problem in NP can become. Any problem that you can solve in polynomial time, if you have had this magic "if-better" function available, would also be solvable in polynomial time if we found a polynomial-time algorithm for any of their problems. Or even, in other words, if you found a polynomial-time algorithm for vertex cover, independent set, clique or SAT, what this would mean is that you have found a way to simulate the if-better function in polynomial time. You can simulate a nondeterministic RAM on a deterministic RAM in polynomial time.

So remember at the beginning of this unit when I introduced you to nondeterminism and the "if-better" function? And we also talked about simulating the if-better function and how this takes exponential time to do? And I imagine your first reaction to this might have been a mild version of "So what?" or even "This is ridiculous." But now, think what this means in the context of NP-completeness. If, for any NP-complete problem, you find a deterministic algorithm that runs in polynomial time, then you have also found a way to simulate a nondeterministic RAM in polynomial time, which would mean that a normal, deterministic RAM without the if-better function is just as powerful as one that has this function. Which, of course, seems a bit strange because our intuition, of course, tells us that having this if-better function is super, super powerful and should somehow make a difference. So the cool thing is, there's only 2 possibilities. Either all NP-complete problems can be solved in polynomial time, which would essentially mean that P=NP. So any problem that can be solved in polynomial time on a nondeterministic RAM can be solved in polynomial time on a deterministic RAM. Or, in other words, a deterministic RAM is just as powerful as a nondeterministic one. Or it could be the other case, and that is no NP-complete problem can be solved in polynomial time. So P?NP. There are certain problems that you can solve in polynomial time if you have the if-better function available, but you can not solve them in polynomial time if you don't have this function. So basically, you'd say P?NP is the same as saying that the if-better function is really powerful. So which one is it? And that question is the famous P vs NP problem. Because the thing is, we don't know. Nobody knows. Asking this question here, if P=NP or P?NP, is a problem that is decades old, and yet nobody has been able to figure it out. Now, given how powerful this if-better function is, remember when I introduced it to you? It seemed like this magic function that could solve all of our problems. Many computer scientists nowaday believe that P?NP, but nobody has been able to prove it. On the other hand, also nobody has been able to find a polynomial time algorithm for any single NP-complete problem, despite, literally, thousands of practically-relevant NP-complete problems being out there. Now let's think a little bit of how you could resolve this question. So how could you show that P=NP or P?NP? And we'll do that as a quiz. So I'm going to give you a number of accomplishments or things that you could do, and I would like you to tell me what that would show. So, for example, if you found a polynomial-time algorithm for an NP-complete problem, would that show that P=NP, would that show that P?NP, or would that show nothing interesting? And what if you found an algorithm that solves an NP-complete problem in polynomial time on average? So not worst-case running time, but average-case running time. And what about if you showed that the most efficient algorithm for clique requires exponential time, on the deterministic RAM, of course? Would that show that P=NP, P?NP, or something else, or we don't know? What if you showed that the vertex cover problem has an exponential number of potential solutions that any algorithm has to go through? What if you were able to reduce the clique problem to the shortest path problem? So shortest path is finding the shortest path between 2 vertices in a graph. What if you were to reduce clique to the problem of finding the shortest path between 2 vertices in a graph? Or what if you could show that any shortest path problem in a graph can be solved by solving clique in a transformed graph? So please make your selection. Which ones are correct?

So let's go through them 1 by 1. Finding a polynomial-time algorithm for an NP-compete problem, yes. That would show that P=NP because the problem is NP-complete. So it means it's among the hardest-to-solve problems in NP. So polynomial-time algorithm for this problem will solve any other NP-complete problem in polynomial time as well. Finding an algorithm that solves an NP-complete problem in polynomial time on average. Now, that's a very interesting one, because you might be inclined to think, "Well, if it solves an NP-complete problem in polynomial time on average, then probably that is sufficient." Well, it shows a lot of interesting things, but it doesn't show anything with respect to P being = to NP, or P being ? to NP. Now, showing that the most efficient algorithm for clique requires exponential time, that would show that P?NP. Because clique is contained in NP, so if it requires exponential time, it means that there's at least 1 problem, namely clique, that is contained in NP but not in P. Of course, as nice as it would be to show this, of course, it would also be quite difficult to prove this because you would have to show that, again, no matter what algorithm you come up with, it always requires exponential time. That would be hard to do because you would have to consider any algorithm, even algorithms that haven't been invented yet, so to say. Now, showing that vertex cover has an exponential number of potential solutions, that is similar to the example I pointed out to you in the beginning of this unit, namely, that just having an exponential number of potential solutions does not mean that the problem is hard. It can mean that the problem is hard, but as you've seen in the case of shortest path, it's not necessarily so. Now, reducing clique to shortest path versus reducing shortest path to clique, then you have to remember again what reduction means. So if you reduce clique to shortest path, it means that solving the shortest path problem is also capable of also solving the clique problem. And since shortest path is solvable in polynomial time, if you manage to reduce clique to shortest path, that would show that P=NP. If, on the other hand, you manage to reduce shortest path to clique, that doesn't really have any implications because all you've shown is that shortest path is an easier problem than clique, which is something we already know. Now if you look at all of these solutions, well, of course, it's a bit of a subjective selection here, but what you'll see, that showing that P=NP in a way would be quite easy. All you have to do is find 1 single polynomial-time algorithm for an NP-complete problem. Then you have immediately shown that P=NP. On the other hand, if you want to show that P?NP, then you have to put in a lot of work, because you have to consider any possible algorithm, no matter how complex, how intelligent, how sophisticated, and still somehow make the case that this algorithm is not able to solve a problem like clique in polynomial time.