cs313 »

Now, this is, of course, again one of the more subjective quizzes, but I think there's two variables in here for which it's obvious how to set them. The first one here is x? because the way this formula is structured, you will see that it's a number of clauses, so you have one here, one here, one here, one here that are all connected by an and, and then you have this lonely variable x? here. In any satisfying assignment, x? must be set to true because as soon as x? is set to false, then no matter what value the other clauses take--the whole formula will be false. So definitely x? is something we could already say, "Yes, this variable is true." Now, for x? and x?, it's not directly obvious in my mind how to set these values, although now that we know x? to be true maybe you could also find additional possibilities here. But there's another interesting one here and that is x? because x? only appears once in this whole formula. If it only appears once, we can always set the variables so that it is most advantageous for us so to say. So if we set x? to false, this here will become true. We do not know what effects it will have if it for example will make this clause here true or false, but the thing is we can't hurt ourselves so to say by setting this value x? here to false because x? does not appear anywhere else--it's just the only one here. So this is also very easy to set because an algorithm that would do pre-processing could basically go through the whole SAT formula and then count how often each variable occurs and if it finds a variable that just occurs once, then we set it in a way that is advantageous to us and know that it will not have any consequences for the satisfiability of the other clauses. I will now modify this formula slightly and then give you the same question to see if you can figure out a little more complicated pre-processing.

Is there now a variable or more variables for which it is easy to see or relatively easy to see without trying True and False values if they should be set to True or False?

And this is, of course, a little bit harder, but still I think it's possible to see that for the variable x? it's clear that this variable has to be set to false. This variable now appears twice, so it wouldn't be caught by the pre-processing rule that counts the number of occurences, but what you can see here both times that x? occurs-- it occurs as not x?, so we have not x? and we have not x? here. So it still holds that we can set x? to false without affecting the satisfiability of the whole instance in a negative way. What makes SAT hard is that variables can occur in some clauses without a ¬ and the others with a ¬ because then you have dependencies between the different clauses, but if it just appears in one form such as here, then there's no such dependency and again, you can use a polynomial time algorithm to figure this out before you even start any search trees.

So let's check if we can generalize the things that we figured out for SAT to some general rules or requirements for pre-processing. What are requirements for pre-processing? Is it that pre-processing has to be done in polynomial time? Is it that it has to speed up the worst-case running time of the overall algorithm? Or is it that a pre-processing algorithm should not affect the solution? And what I mean by not affecting the solution is that finding a solution having run the pre-processing algorithm will lead to the same conclusion as running it without the pre-processing algorithm. Please check any one of those that are correct and, as always, there can be more than one.

I think there's two here that are definitely true. Polynomial time, yes--you definitely want your pre-processing algorithm to run in polynomial time because the whole purpose of pre-processing is to make the exponential part easier. Secondly, of course, the pre-processing must not effect the solution. You still want to have a correct algorithm, get the same solution--you only want to get there faster. Speeding up worst-case running time--actually, that can sometimes be the case for effective pre-processing and we're going to see this in a minute, but it's not a requirement. If you find some pre-processing rules that just tend to work well in practice but you cannot prove that they have a worst case running time improvement, there's no reason not to use that pre-processing algorithm. Now that that you've understood the basics of pre-processing, let's turn our attention back to Alice. Last time we heard from her, of course, she was--well, she was medium happy, but let's see if pre-processing can make her happier.

Alice is medium happy right now and our goal is, of course, to make her more happy here. Let's find out if we can find some good rules for pre-processing a graph for which we want to solve vertex cover, and the way we're going to do this is-- I am going to give you a couple of options of potential ideas for pre-processing. So if there is a vertex with just one neighbor--what I mean by this is there's a vertex like this it's connected to another vertex and that vertex is possibly connected to more vertices, but this one here is only connected to this one. Should we then, as a pre-processing rule, automatically take this vertex here into the vertex cover or should we automatically take this vertex here into the vertex cover, and as a third case, if we have a vertex that has just two neighbors--so something like this-- should we then have the pre-processing put these two vertices here into the vertex cover, which of course would also mean that you do not need to have this one in here. Please check all of the rules here that are correct--meaning that as required, they do not affect the solution that we're going to find in the end. Having applied this rule, we can still find the smallest possible vertex cover for the whole graph.

And this was probably a bit challenging but I think it's also quite fun actually to play around with these ideas here. This pre-processing rule here is actually correct and the reason is the following-- so you have this vertex here and it has just one neighbor, so you have this edge here. We already know either this vertex has to be in the vertex cover or this one because, otherwise, this edge here would not be covered. Now, taking both is clearly is a waste. You wouldn't want to do that because once we have put this one here into the vertex cover, you do not need to put that one into the vertex cover, but if you know that you only need to put one of the two vertices into the vertex cover--well, in that case it's always better to put this one here into the vertex cover because it will not only cover this edge but also some additional edges. And in this case here, you've basically paid one vertex but you only get to cover one edge, so it's always better to use this one in that case, which also means that this rule here is not correct. Now, what about this case over here. You might be inclined to think that this is the same case so that instead of using this vertex here it's always better to put these two into the vertex cover, but unfortunately that is not the case and I will give you a very simple example of this. If you have a graph that looks like this for example--you look at it this way that this part here is the same as this part here, then you can see that if you put these two vertices into a vertex cover, then you need an additional two vertices and it can be either one of those or either one of those but you need two more to cover the whole graph. This would give you a solution of size 4, and actually, this graph has a vertex cover of size 3 if you use this one here and this one and this one and you have a size 3 solution. This is not a correct rule because it can lead us to find a suboptimal solution. So you have to be careful with pre-processing rules as useful as they are, but actually just having this one rule over here is already pretty cool because what does it tell us-- Well, once we have a vertex with one neighbor such as this one here, we'll put that one into the vertex cover and we also know that we do not need to put this one here into the vertex cover, and this will now help us to improve our search tree.

We start off with an input for vertex cover then apply pre-processing using the rule that we just discovered, which means we then already know that this vertex here must be put into the vertex cover as well as this one here and this one here and these all do not have to be in the vertex cover and this of course is dramatically effective in this case here. It's unfortunately not always the case in practice but as you can see it only leaves a very small part of the network actually where we have to find an assignment but the main point is that we now put this into a search tree to find the best possible vertex cover for those vertices where we have not yet found an assignment. And now my point is this, the search tree that we had so far was as follows. We started out by looking at an uncovered edge and then we branch into three possibilities. We either put exactly one of the vertices into the vertex cover or both. And now comes the cool part, if we have applied pre-processing then we can redesign our search tree because what this pre-processing step means basically is that we have eliminated or we have already assigned all vertices now to have just a single neighbor. Now that we have applied pre-processing, we know a little bit more about these vertices here because we know that this vertex here will have at least one more neighbor because otherwise we would have already found it in the pre-processing and this one here will also have another neighbor. There can be more. The same thing here. What we can now do is we can do a more sophisticated search tree and that sophisticated search tree is going to be based on this part here, these three vertices. You can do more sophisticated search trees but I just want to show you the basic principle here.

How do we use pre-processing. We know that we can find structures like these in our graph. And here can be other connections possibly--it don't have to be, but we'll ignore those for now. Now, if we look at this structure from a brute force perspective, there are eight different possibilities for assigning or not assigning these vertices here into a vertex cover. The eight possibilities are as follows: either we can take no vertex at all into the vertex cover and you already know that this going to cause a problem and we could take all of them into the vertex cover, and then then the six more combinations. A brute force search tree would look at eight possible assignments. Out of these eight assignments, if you were to design a search tree that directly branches into these cases here, there's only five cases that would even make sense. Please check which of these cases would make sense for a search tree.

And the answer here is that this one makes sense, at least in principle, this one here, this one here, this one here, and this one here. The reason why the other three don't make sense is that you have uncovered edges. So here you have an uncovered edge which is not good, here you have an uncovered edge which is not good, and here you have even two uncovered edges. So there are only five cases that, in principle, seem to make sense. So this would suggest that, in a search tree, we would have five possible assignments, but, actually, we can even do a little bit better.

We now know that the search tree, if we design it this way, doesn't have to branch into eight possible assignments but only five possible assignments--those here. So now my question is, if we design a search tree using this branching here, how large is that going to become? I'm going to give you two hints; the first one is the height of the search tree--that will be n thirds. Again, we have to take a little bit of care about the details if we can find such a structure here, but I think I've shown you often enough in this unit that this is possible, so hopefully you'll just believe me this time that it's also possible to always find the structure or, at least, if you don't find one, decide for certain vertices if they are going to be in the vertex cover or not. Now how much wider does this search tree get? It gets very wide, actually; each time you go down into a new level, you branch into five additional cases times five, times five, times five, and so on. Now what I would like to know from you--how big is that search tree? And I would like you to enter your answer here. It will be some number to the power of n, and I would like you to provide two digits after the decimal and round up.

And the answer here is 5 to the power of n thirds, which is 1.71 to the power of n. Now if you compare that to the algorithm that we had before, that was about 1.73, so the improvement here is rather marginal. But there's two things to be said here--the first one is that the pre-processing rule that we used was actually a rather trivial one. It just concerned those vertices that have a single neighbor. There are much more complex pre-processing rules that will then allow you to design much better search trees. And the other thing is that we're using a very simple analysis of the search tree here. We're basically just saying there are eight possible assignments, and we're just using five of them; and in a more sophisticated search tree analysis, what you would do--you would not always have the same number of assignments in each of the cases that you go into. So you could have a search tree that, for example, in one possibility, only assigns two vertices a value; in another possibility, it assigns five vertices a value; and so on. And through this refined analysis, you can get much better, as we'll soon see for our vertex cover.

And, of course, there are much more sophisticated pre-processing rules than the one that we have discussed here, which, unfortunately, lie outside the scope of this course. But there's one final question about pre-processing that I would like to discuss with you, and that is when to apply pre-processing. And this is one you'll have to think through a bit. Is it before starting the search tree--of course, you start the algorithm and not the tree, but you know what I mean. Is it at each level of the search tree? Is it regularly during the search tree algorithm? Or is it after the search tree is completed? So, again, you might have to think these through a little bit and then please check each one that is correct.

So the first one is, of course, rather obvious, before you even start a search tree algorithm, you should always apply pre-processing. And the last one, of course, also doesn't make sense, because once the search tree is completed, then you already have your best possible solution so why would you want to start pre-processing again. The interesting part is this here, because it could be the case that the assignments that you create during your search tree algorithm actually allow you to do new pre-processing rules, and you have already seen this. For example, for independent set and vertex cover where we had said that during the search tree you come across cases where for other vertices you can already decide if they are going to be in an optimal solution or not, so it does make sense, and there are the names of it misleading to apply this pre-processing also during the search tree. Now the question is, how often and when you should you should apply it. You should definitely apply it regularly, because you can regularly gain new information during your search tree algorithm. Doing it at each level, on the other hand, I think should not be recommended, and the reason for that is although pre-processing requires only polynomial time, it still requires time, so if you do the pre-processing too often then on the one hand you waste a lot of time, and on the other hand, these rules will not be very effective because you need to have made a few more decisions each time in the search tree until new pre-processing rules might become applicable. How often to apply it is almost more of an art than a science, but you should definitely not think about pre-processing as something that you do only before you start a search tree algorithm but also something that can help you during a search tree.