# cs313 unit 2.1

## 01 q Which Problem is which

So welcome to Unit 2. In the last unit we encountered three computer scientists as I hope you'll remember, Alice, Bob, and Carol. And as you'll recall, all three were working on rather tough problems for which they weren't able to find an efficient algorithm. Alice was working in telecommunications, and her task was for the network of her employer to find the minimum number of monitoring devices that needed to be installed to monitor all cables in the telecommunications network. Bob was working in bioinformatics, and his job was to find clusters of data in genetic analysis. And finally, Carol was working in finance and she has been tasked with selecting a portfolio of assets that consists of as many assets as possible but it has to observe a number of restrictions, which assets she can put together in one portfolio. Now, at the end of the last unit, we stated these problems as problems on graphs and we're going to do a little recall quiz here to see if you remember which problems they were working on when we stated this as a graph problem. The three problems we're looking at were called independent set, clique, and vertex cover and all of those three problems took as input a graph with n vertices. For independent set, you were then looking for the maximum number of vertices where none of those vertices you have selected are connected to each other. For clique, you were looking for a maximum number of vertices where all of them are connected. And for vertex cover, we were looking for a minimum number of vertices to cover all edges meaning that every edge must have at least one endpoint in the set of vertices that we selected. So here's our first quiz and I would like to see if you remember which of these computer scientists, so Alice working in telecommunications, Bob working in bioinformatics, and Carol working in finance, which of these computer scientists were working on which of these problems here? And they way I want you to answer this is for each of the problems just select if that was the problem that Alice was working on, or the problem that Bob was working on, or the problem that Carol was working on. And the same here for clique and vertex cover.

## 01 s Which Problem is which

And the correct here is that Carol was working on independent sets because she had to find a number of investment opportunities that were independent of each other. Bob was working on clique because he was trying to find cluster of genes that are showing similar expression patterns or similar behaviors, and Alice finally was working on vertex cover because she was trying to find a way that her telecommunications company could monitor all of their network as efficiently as possible.

## 02 l P vs NP

The question that we are, for that matter, Alice, Bob, and Carol were stuck with at the end of the previous unit was that they had only been able to find exponential time algorithms for these 3 problems here, and those algorithms did not show any acceptable running time. So the question they were stuck with is, did they just not think hard enough to find good algorithms for their problems, or are their problems intractable, which would mean that it's not even possible or at least highly unlikely to find a polynomial time algorithm for those problems? And this is what we'll investigate further in this unit. And I'm not yet going to reveal the answer to you but what I am going to tell you is that without realizing it by asking this question, if their problems are tractable or intractable, Alice, Bob, and Carol are currently pondering what is by many seen as one of the toughest questions in computer science-- the famous P vs. NP problem. And in this unit, I want to show you 4 things. First, what these letters mean. Second, what the problem is actually all about. So what this here means. Third, what Alice, Bob, and Carol have actually to do with a P vs. NP problem. And then, last but not the least, how you could become an instant millionaire. That's not bad for one unit, is it?

## 04 s How to Show Intractability

The answer here, although it might be a little bit subjective, but I think it's rather objective actually is that in terms of difficulty, it's much easier to just gather evidence than accomplish a mathematical proof in my opinion. The reason for that is that in a mathematical proof you cannot only say well these 10 people or these 20 people haven't found a good algorithm for my problem, but you would really have to show that any algorithm that anybody could ever come up with would still require exponential time which does not allow you to give any examples or just state the number of algorithms, but you have to go about it in another way that is much more complicated as you'll also see in this unit. In terms of convincing this of course, the mathematical proof is more convincing than just gathering evidence because here we know that no matter what happens the problem is intractable whereas here we could just have another smart person coming in who finds a polynomial time algorithm and then it suddenly turns out that the problem is tractable.

## 05 q Many Solutions vs Intractability

Now, before we continue, I would like to make one important point, and I don't know if this already occurred to you, but at least you might be inclined to think that it's rather obvious that the problem that Alice, Bob, and Carol are working on is intractable for one simple reason. Each of their problems has to look at an exponential number of solutions. For vertex cover, we have O(2^n) potential solutions that we have to look at because there are 2^n 0-1 assignments to the vertices for putting the vertices of a graph into a vertex cover. And the same thing is true for independent set and clique as we have seen in the last unit. Now, the question is does the fact that there is an exponential number of solutions already mean that the problem must be intractable. You could basically think this because having to consider an exponential number of solutions might mean that you have to look at each one of the solutions and that obviously is going to take exponential time. What I now want to show you is that this is not the correct intuition to have. Having many solutions does not mean that the problem is intractable. It can mean this, but it does not have to. To show you this, I'm going to give you one example where we have an exponential number of solutions but actually there is a polynomial time algorithm for our problem. What I would like you to tell me--in this graph here, how many different paths or how many different ways there are of getting from A to B if you cannot go backwards? You start out at A and once you've traveled across an edge you cannot go backwards. You always have to go forward, so how many ways are there to get from A to B? Of course this is a very simple warm up example. I'm going to draw a little arrows here just to make clear that we cannot go backwards.

## 05 s Many Solutions vs Intractability

The answer here is, of course, that there's just two different ways to get from A to B-- either you go this way up here or you go this way down here.

## 06 q Number of Paths

Now how about this little bit more complicated graph? Again, I would like you to figure out the number of paths or the number of different ways that you can take to get from A to B and the rules here are again the same-- you can only travel in this forward direction. I'm again going to draw the little arrows. So please enter your answer here into this box.

## 06 s Number of Paths

The answer here is that there is actually four different ways of getting from A to B. This up here is the first one, then you could go this way, then you could go this way, or you can go this way, which is four in total.

## 07 q Number of Paths Continued

Now finally, I would like you to figure two more--this one down here and this one down here, and the rules again are the same. I'm not going to draw in the little errors just so that you can still see the image, but you still have to move in this direction here from A to B and the same down here. How many ways are their from A to B in this graph up here and how many ways are there to get from A to B in this graph down here.

## 07 s Number of Paths Continued

The answer is that up here there's 8 different possibilities and down here, there's 16 possibilities. Now, there's two ways to go about this. One is to just do the counting, which can get very tedious at least in the graph down here, or the other one is to figure out a general rule which could be quite helpful and that rule is that any time that you want to move forward you have two different choices that you can make. Once you start in A, there's two different choices. You can go either that way or that way. If your say in this vertex here, you have two choices. You can either stay up here or you can go down here. Likewise when you're down here, you can go either this way or that way. Each time you move forward, you can make one of two choices and since here, you can make one of two choices except for the last time. In the last move, you'll always have to end up at B, so here we have 1, 2, 3 times that we can make two choices and then we cannot make a choice here anymore. And 2³ is 8 and down here, it's basically the same thing. We just have one more of these layers here where we can make a choice and so that's going to be 2? or 16.

## 08 q Number of Paths Generalizing

Now, let's try to find a general rule for the number of ways that you can get from A to B. The number of vertices in each of the graphs that I have drawn is 4, 6, 8 and 10. When we have four vertices, there are two different ways to get from A to B. When we have six, there are four different ways. When we have eight, there are eight different ways. And when we have 10 vertices, there are 16 different ways to get from A to B. Now my question to you is what if we had 20 vertices and please again give your answer down here in this box?

## 08 s Number of Paths Generalizing

The answer here is that there are 512 possibilities to get from A to B, and the way to figure this out is to notice that there's a general rule here. Each time that we add two vertices, the number of different ways that you can use to get from A to B doubles. We add 2 here, the number of ways to get from A to B multiplies by 2. We add 2 here and the number multiplies by 2 and so on and so on. To get from 10 to 20, we have to add two vertices 5 times, so we already know that there are 16 ways to get from A to B if we have 10 vertices. For 20 vertices, we add 2 five times, which means we multiply by 2 five times and that turns out to be 512.

## 09 l Hardness of Shortest Path

If you play around with this a little bit further, you can figure out a general rule here, and the general rule is that if you have n vertices then the number of different paths from A to B is two to the power of n half minus one. And the way you can see that this formula is correct is you can plug in the four here so it's four over two which is two minus one and 2¹ that is two. And in the quiz, we just saw that each time we add two vertices here, we have to multiply by 2, and this is exactly what happens here because if we add two vertices here then the exponent here will increase by one. And in this way, you can work your way up to larger and larger networks of this structure. So what you can see here is that the number of ways to get from A to B grows exponentially as these graphs here get more and more vertices. And I'm now going to ask in algorithmic question, and the algorithmic question is a very simple one and that is what is the fastest way to get from A to B or the shortest way to get from A to B. And if you have had an algorithm's class, you might know that this is a problem that can be solved in polynomial time. So there are many algorithms actually that can figure out in polynomial time. What the shortest way is to get from A to B? But this means that having many possible solutions cannot mean intractability because here's the case where we do have many possible solutions. There are exponentially many ways to get from A to B, but if we ask an algorithm to figure out the shortest one for us, it does obviously not have to consider an exponential number of solutions to come up with the answer because, otherwise, there would be no polynomial time algorithm to figure out shortest paths between two points A and B. And so for vertex cover, clique, and independent set, this means that if we want to show these problems to be intractable, it's not enough to point out that there's an exponential number of solutions because that does not mean that there can't be no polynomial time algorithm.