# CS271 Lesson 2 Notes

## What Is A Problem

• State space: the set of all possible states; we navigate the state space by applying actions. e.g. all cities in Romania
• Initial state: s0, e.g. Arad
• Actions(s) -> {a1, a2, a3, ...}, the set of possible actions at state s
e.g. the set of possible routes from the current city to a next city
• Result(s, a) -> s', the new state reached by taking action a in state s
e.g. when in state Arad, taking the action of driving down E671 brings you to state Timisoara
• GoalTest(s) -> T|F, true iff s is the goal state
e.g. GoalTest(city) is true iff we are at our destination, Bucharest
• StepCost(s, a) -> n, the cost of that step, e.g. fuel cost, miles, time
• PathCost([(s, a), (s1, a1), ...]) := Sum([StepCost(s, a) for (s, a) in path])
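The components above can be sketched as a small Python interface. This is just a sketch; the class and method names are illustrative, not from the lecture.

```python
# Sketch of the formal problem definition: state space, initial state,
# Actions, Result, GoalTest, StepCost, and PathCost.
class Problem:
    def __init__(self, initial, goal):
        self.initial = initial      # s0, e.g. "Arad"
        self.goal = goal            # e.g. "Bucharest"

    def actions(self, s):
        """Actions(s) -> set of possible actions at state s."""
        raise NotImplementedError

    def result(self, s, a):
        """Result(s, a) -> the new state reached by taking action a in s."""
        raise NotImplementedError

    def goal_test(self, s):
        """GoalTest(s) -> True iff s is the goal."""
        return s == self.goal

    def step_cost(self, s, a):
        """StepCost(s, a) -> cost of that step (1 by default here)."""
        return 1


def path_cost(problem, path):
    """PathCost([(s, a), (s1, a1), ...]) = sum of the step costs."""
    return sum(problem.step_cost(s, a) for (s, a) in path)
```

A concrete problem (like Romania route finding) would subclass this and fill in `actions`, `result`, and `step_cost`.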

## Example Route Finding

Now let’s see how the definition of a problem maps onto the route-finding domain. First, the initial state was given. Let’s say we start off in Arad, and for the goal test, let’s say that the state of being in Bucharest is the only state that counts as a goal, and all the other states are not goals. Now the set of all of the states here is known as the state space, and we navigate the state space by applying actions. The actions are specific to each city, so when we are in Arad, there are three possible actions: to follow this road, this one, or this one. And as we follow them, we build paths, or sequences of actions. So just being in Arad is the path of length zero, and now we could start exploring the space and add in this path of length one, this path of length one, and this path of length one. We could add in another path here of length two and another path here of length two. Here is another path of length two. Here is a path of length three. Another path of length two, and so on.

Now at every point, we want to separate the states out into three parts. First, the ends of the paths--the farthest paths that have been explored--we call the frontier. And so the frontier in this case consists of these states that are the farthest out we have explored. And then to the left of that in this diagram, we have the explored part of the state space. And then off to the right, we have the unexplored. So let’s write down those three components. We have the frontier. We have the unexplored region, and we have the explored region.

One more thing: in this diagram we have labeled the step cost of each action along the route. So the step cost of going from Neamt to Iasi would be 87, corresponding to a distance of 87 kilometers, and the path cost is just the sum of the step costs. So the cost of the path going from Arad to Oradea would be 71 plus 75.

[Narrator] Now let's define a function for solving problems. It's called Tree Search because it superimposes a search tree over the state space. Here's how it works: It starts off by initializing the frontier to be the path consisting of only the initial state, and then it goes into a loop in which it first checks to see: do we still have anything left in the frontier? If not, we fail; there can be no solution. If we do have something, then we make a choice. Tree Search is really a family of functions, not a single algorithm, which depends on how we make that choice, and we'll see some of the options later. We go ahead and make a choice of one of the paths on the frontier and remove that path from the frontier. We find the state which is at the end of the path, and if that state's a goal then we're done--we've found a path to the goal; otherwise, we do what's called expanding that path. We look at all the actions from that state, and for each action we add to the path the action and the result of that action; so we get a new path that has the old path, the action, and the result of that action, and we stick all of those paths back onto the frontier. Now Tree Search represents a whole family of algorithms, and where you get the family resemblance is that they're all looking at the frontier, popping items off, and looking to see if they pass the goal test, but where you get the difference is right here, in the choice of how you're going to expand the next item on the frontier--which path do we look at first--and we'll go through different sets of algorithms that make different choices for which path to look at first. The first algorithm I want to consider is called Breadth-First Search. Now it could be called shortest-first search, because what it does is always choose from the frontier one of the paths that hasn't been considered yet that's the shortest possible. So how does it work?
Well, we start off with the path of length 0, starting in the start state, and that's the only path in the frontier, so it's the shortest one, so we pick it, and then we expand it, and we add in all the paths that result from applying all the possible actions. So now we've removed this path from the frontier, but we've added in 3 new paths: this one, this one, and this one. Now we're in a position where we have 3 paths on the frontier, and we have to pick the shortest one. Now in this case all 3 paths have the same length, length 1, so we break the tie at random or using some other technique, and let's suppose that in this case we choose this path from Arad to Sibiu. Now the question I want you to answer is: once we remove that from the frontier, what paths are we going to add next? So show me, by checking off the cities that end the paths, which paths are going to be added to the frontier.

## Tree Search Continued

[Male narrator] And the answer is that Breadth-First Search always considers the shortest paths first, and in this case, there are 2 paths of length 1--the paths from Arad to Zerind and from Arad to Timisoara--so those would be the 2 paths that would be considered. Now, let's suppose that the tie is broken in some way and we chose this path from Arad to Zerind. Now, we want to expand that node. We remove it from the frontier and put it in the explored list, and now we say, "What paths are we going to add?" So check off the ends of the paths--the cities that we're going to add.

[Male narrator] In this case, there's nothing to add because of the 2 neighbors, 1 is in the explored list and 1 is in the frontier, and if we're using graph search, then we won't add either of those.

[Male narrator] So we move on, we look for another shortest path. There's one path left of length 1, so we look at that path, we expand it, add in this path, put that one on the explored list, and now we've got 3 paths of length 2. We choose 1 of them, and let's say we choose this one. Now, my question is: show me which states we add to the frontier, and tell me whether we're going to terminate the algorithm at this point, because we've reached the goal, or whether we're going to continue.

[Male narrator] The answer is that we add 1 more path, the path to Bucharest. We don't add the path going back because it's in the explored list, but we don't terminate it yet. True, we have added a path that ends in Bucharest, but the goal test isn't applied when we add a path to the frontier. Rather, it's applied when we remove that path from the frontier, and we haven't done that yet.
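The Tree Search loop and the breadth-first choice walked through above can be sketched in Python. This is a sketch, not the lecture's own code; names are illustrative, and like the lecture's Tree Search it keeps no explored set, so on a graph with cycles it can revisit states.

```python
# Tree Search: a family of algorithms, parameterized by how we choose
# which path on the frontier to expand next.
def tree_search(problem, choose):
    frontier = [[problem.initial]]          # one path of length 0
    while True:
        if not frontier:
            return None                     # frontier empty: failure
        path = choose(frontier)             # pick a path...
        frontier.remove(path)               # ...and remove it
        s = path[-1]
        if problem.goal_test(s):            # goal test on removal
            return path
        for a in problem.actions(s):        # expand: one new path per action
            frontier.append(path + [problem.result(s, a)])


# Breadth-first search: always choose a shortest path on the frontier.
def breadth_first(problem):
    return tree_search(problem, choose=lambda frontier: min(frontier, key=len))
```

Swapping in a different `choose` function gives the other members of the family, such as cheapest-first or depth-first.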

An algorithm that has traditionally been called uniform-cost search, but could be called cheapest-first search, is guaranteed to find the path with the cheapest total cost. Let's see how it works. We start out as before in the start state, and we pop that empty path off, move it from the frontier to explored, and then add in the paths out of that state. As before, there will be 3 of those paths. And now, which path are we going to pick next in order to expand, according to the rules of cheapest first?

## Uniform Cost Search 1

Cheapest first says that we pick the path with the lowest total cost. And that would be this path. It has a cost of 75, compared to the cost of 118 and 140 for the other paths. So we get here. We take that path off the frontier, put it on the explored list, and add in its neighbors--not going back to Arad, but adding in this new path, summing up the total cost of that path. And now the question is, which path gets expanded next?

## Uniform Cost Search 2

Of the 3 paths on the frontier, we have ones with a cost of 146, 140, and 118. And that last one is the cheapest, so it gets expanded. We take it off the frontier, move it to explored, add in its successors. In this case there's only 1, and it has a path total of 229. Which path do we expand next? Well, we've got 146, 140, and 229. So 140 is the lowest. Take it off the frontier. Put it on explored. Add in this path for a total cost of 220. And this path for a total cost of 239. And now the question is, which path do we expand next?

## Uniform Cost Search 3

The answer is this one, 146. Put it on explored. But there's nothing to add because both of its neighbors have already been explored. Which path do we look at next?

## Uniform Cost Search 4

The answer is this one. Two-twenty is less than 229 or 239. Take it off the frontier. Put it on explored. Add in 2 more paths and sum them up. So, 220 plus 146 is 366. And 220 plus 97 is 317. Okay, and now, notice that we're closing in on Bucharest. We've got 2 neighbors almost there, but neither of them gets its turn yet. Instead, the cheapest path is this one over here, so move it to the explored list. Add 70 to the path cost so far, and we get 299. Now the cheapest node is 239 here, so we expand, finally, into Bucharest at a cost of 450. And now the question is: are we done? Can we terminate the algorithm?
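The uniform-cost walkthrough above can be sketched with a priority queue ordered by path cost. This is a sketch, not the lecture's own code: it uses the graph-search variant with an explored set, as in the walkthrough, and assumes the map is represented as a dict mapping each city to a dict of `{neighbor: step_cost}`.

```python
import heapq

# Uniform-cost (cheapest-first) search: always expand the frontier path
# with the lowest total path cost; goal test on removal, not insertion.
def uniform_cost_search(graph, start, goal):
    frontier = [(0, [start])]               # heap of (path_cost, path)
    explored = set()
    while frontier:
        cost, path = heapq.heappop(frontier)
        s = path[-1]
        if s == goal:                       # goal test when popping
            return cost, path
        if s in explored:
            continue
        explored.add(s)
        for neighbor, step in graph[s].items():
            if neighbor not in explored:
                heapq.heappush(frontier, (cost + step, path + [neighbor]))
    return None
```

On the Romania map this pops the cheaper 418 path through Rimnicu Vilcea and Pitesti before the 450 path through Fagaras, which is exactly why we keep going here rather than stopping at the first path that touches Bucharest.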

## Search Comparison

So, we've looked at 2 search algorithms. One, breadth-first search, in which we always expand first the shallowest paths, the shortest paths. Second, cheapest-first search, in which we always expand first the path with the lowest total cost. And I'm going to take this opportunity to introduce a third algorithm, depth-first search, which is in a way the opposite of breadth-first search. In depth-first search, we always expand first the longest path, the path with the most links in it. Now, what I want you to do is, for each of these nodes in each of the trees, tell us in what order they're expanded--first, second, third, fourth, fifth, and so on--by putting a number into the box. And if there are ties, resolve the ties in left-to-right order. Then I want you to answer one more question, which is: are these searches optimal? That is, are they guaranteed to find the best solution? For breadth-first search, optimal would mean finding the shortest path. If you think it's guaranteed to find the shortest path, check here. For cheapest first, it would mean finding the path with the lowest total path cost. Check here if you think it's guaranteed to do that. And we'll allow the assumption that all costs have to be positive. And in depth first, optimal would mean, again, as in breadth first, finding the shortest possible path in terms of number of links. Check here if you think depth first will always find that.

## Search Comparison 1

Here are the answers. Breadth-first search, as the name implies, expands nodes in this order. One, 2, 3, 4, 5, 6, 7. So, it's going across a stripe at a time, breadth first. Is it optimal? Well, it's always expanding in the shortest paths first, and so wherever the goal is hiding, it's going to find it by examining no longer paths, so in fact, it is optimal. Cheapest first, first we expand the path of length zero, then the path of length 2. Now there's a path of length 4, path of length 5, path of length 6, a path of length 7, and finally, a path of length 8. And as we've seen, it's guaranteed to find the cheapest path of all, assuming that all the individual step costs are not negative. Depth-first search tries to go as deep as it can first, so it goes 1, 2, 3, then backs up, 4, then backs up, 5, 6, 7. And you can see that it doesn't necessarily find the shortest path of all. Let's say that there were goals in position 5 and in position 3. It would find the longer path to position 3 and find the goal there and would not find the goal in position 5. So, it is not optimal.

## Search Comparison 2

Given the non-optimality of depth-first search, why would anybody choose to use it? Well, the answer has to do with the storage requirements. Here I've illustrated a state space consisting of a very large or even infinite binary tree. As we go to levels 1, 2, 3, down to level n, the tree gets larger and larger. Now, let's consider the frontier for each of these search algorithms. For breadth-first search, we know a frontier looks like that, and so when we get down to level n, we'll require a storage space of 2 to the n paths. For cheapest first, the frontier is going to be more complicated. It's going to sort of work out this contour of cost, but it's going to have a similar total number of nodes. But for depth-first search, as we go down the tree, we start going down this branch, and then we back up, but at any point, our frontier is only going to have n nodes rather than 2 to the n nodes, so that's a substantial savings for depth-first search. Now, of course, if we're also keeping track of the explored set, then we don't get that much savings. But without the explored set, depth-first search has a huge advantage in terms of space saved. One more property of the algorithms to consider is the property of completeness, meaning if there is a goal somewhere, will the algorithm find it? So, let's move from very large trees to infinite trees, and let's say that there's some goal hidden somewhere deep down in that tree. And the question is, is each of these algorithms complete? That is, are they guaranteed to find a path to the goal? Mark off the check boxes for the algorithms that you believe are complete in this sense.

## Search Comparison 3

The answer is that breadth-first search is complete, so even if the tree is infinite, if the goal is placed at any finite level, eventually, we're going to march down and find that goal. Same with cheapest first. No matter where the goal is, if it has a finite cost, eventually, we're going to go down and find it. But not so for depth-first search. If there's an infinite path, depth-first search will keep following that, so it will keep going down and down and down along this path and never get to the path on which the goal sits. So, depth-first search is not complete.

## More On Uniform Cost

Let's try to understand a little better how uniform cost search works. We start at a start state, and then we start expanding out from there looking at different paths, and what we end up doing is expanding in terms of contours, like on a topographic map, where first we span out to a certain distance, then to a farther distance, and then to a farther distance. Now at some point we meet up with a goal. Let's say the goal is here. Now we've found a path from the start to the goal. But notice that the search really wasn't directed in any way towards the goal. It was expanding out everywhere in the space, and depending on where the goal is, we should expect to have to explore half the space, on average, before we find the goal. If the space is small, that can be fine, but when spaces are large, that won't get us to the goal fast enough. Unfortunately, there is really nothing we can do, with what we know, to do better than that, and so if we want to improve, if we want to be able to find the goal faster, we're going to have to add more knowledge. The type of knowledge that has proven most useful in search is an estimate of the distance from a state to the goal. So let's say we're dealing with a route-finding problem, and we can move in any direction--up or down, right or left--and we'll take as our estimate the straight-line distance between a state and the goal, and we'll try to use that estimate to find our way to the goal fastest. Now an algorithm called greedy best-first search does exactly that. It expands first the path that's closest to the goal according to the estimate. So what do the contours look like in this approach? Well, we start here, and then we look at all the neighboring states, and the ones that appear to be closest to the goal we would expand first. So we'd start expanding like this and like this and like this and like this, and that would lead us directly to the goal.
So now, instead of exploring whole circles that go out everywhere in the space, our search is directed towards the goal. In this case it gets us immediately towards the goal, but that won't always be the case if there are obstacles along the way. Consider this search space. We have a start state and a goal, and there's an impassable barrier. Now greedy best-first search will start expanding out as before, trying to get towards the goal, and when it reaches the barrier, what will it do next? Well, it will try to continue along a path that's getting closer and closer to the goal. So it won't consider going back this way, which is farther from the goal. Rather it will continue expanding out along these lines, which always get closer and closer to the goal, and eventually it will find its way towards the goal. So it does find a path, and it does it by expanding a small number of nodes, but it's willing to accept a path which is longer than other paths. Now if we explored in the other direction, we could have found a much simpler path, a much shorter path, by just popping over the barrier and then going directly to the goal, but greedy best-first search wouldn't have done that, because that would have involved getting to this point, which is this distance from the goal, and then considering states which were farther from the goal. What we would really like is an algorithm that combines the best parts of greedy search, which explores a small number of nodes in many cases, and uniform cost search, which is guaranteed to find a shortest path. We'll show how to do that next, using an algorithm called the A-star algorithm.
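The greedy best-first strategy described above can be sketched by ordering the frontier on the heuristic alone. This is a sketch with illustrative names; it assumes `graph` maps each state to its neighbors and `h` gives the estimated distance from a state to the goal.

```python
import heapq

# Greedy best-first search: always expand the path whose final state
# looks closest to the goal according to the estimate h, ignoring the
# cost accumulated so far -- fast, but not guaranteed optimal.
def greedy_best_first(graph, start, goal, h):
    frontier = [(h(start), [start])]        # heap ordered by h alone
    explored = set()
    while frontier:
        _, path = heapq.heappop(frontier)
        s = path[-1]
        if s == goal:
            return path
        if s in explored:
            continue
        explored.add(s)
        for neighbor in graph[s]:
            if neighbor not in explored:
                heapq.heappush(frontier, (h(neighbor), path + [neighbor]))
    return None
```

On the Romania map, with straight-line distances as `h`, this heads straight for Bucharest through Sibiu and Fagaras (cost 450), even though a cheaper 418 path exists -- exactly the trade-off described above.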

[Male narrator] A* search works by always expanding the path that has a minimum value of the function f, which is defined as the sum of the g + h components. Now, the function g of a path is just the path cost, and the function h of a path is equal to the h value of the state which is the final state of the path, which is equal to the estimated distance to the goal. Here's an example of how A* works. Suppose we found this path through the state space to a state x, and we're trying to give a measure to the value of this path. The measure f is the sum of g, the path cost so far, and h, the estimated distance that it will take to complete the path to the goal. Now, minimizing g helps us keep the path short, and minimizing h helps us keep focused on finding the goal, and the result is a search strategy that is the best possible in the sense that it finds the shortest path while expanding the minimum number of paths possible. It could be called "best estimated total path cost first," but the name A* is traditional. Now let's go back to Romania and apply the A* algorithm, and we're going to use a heuristic which is the straight-line distance between a state and the goal. The goal, again, is Bucharest, and so the distance from Bucharest to Bucharest is, of course, 0. And for all the other states, I've written in red the straight-line distance. For example, straight across like that. Now, I should say that all the roads here I've drawn as straight lines, but actually, roads are going to be curved to some degree, so the actual distance along the roads is going to be longer than the straight-line distance. Now, we start out as usual--we'll start in Arad as a start state--and we'll expand out Arad, and so we'll add 3 paths, and the evaluation function, f, will be the sum of the path length, which is given in black, and the estimated distance, which is given in red.
And so the f value for this path will be 140+253, or 393; for this path, 75+374, or 449; and for this path, 118+329, or 447. And now, the question is: out of all the paths that are on the frontier, which path would we expand next under the A* algorithm?

## A* Search Solution

The answer is that we select this path first--the one from Arad to Sibiu-- because it has the smallest value--393--of the sum f=g+h.

## A* Search 1

Let's go ahead and expand this node now. So we're going to add 3 paths. This one has a path cost of 291 and an estimated distance to the goal of 380, for a total of 671. This one has a path cost of 239 and an estimated distance of 176, for a total of 415. And the final one is 220+193=413. And now the question is: which state do we expand next?

## A* Search 1 Solution

The answer is we expand this path next, because its total, 413, is less than all the other ones on the frontier--although only slightly less than the 415 for this path.

## A* Search 2

So we expand this node, giving us 2 more paths-- this one with an f-value of 417, and this one with an f-value of 526. The question again--which path are we going to expand next?

## A* Search 2 Solution

And the answer is that we expand this path, to Fagaras, next, because its f-total, 415, is less than all the other paths on the frontier.

## A* Search 3

Now we expand Fagaras and we get a path that reaches the goal and it has a path length of 450 and an estimated distance of 0 for a total f value of 450, and now the question is: What do we do next? Click here if you think we're at the end of the algorithm and we don't need to expand next or click on the node that you think we will expand next.

## A* Search 3 Solution

The answer is that we're not done yet, because the algorithm works by doing the goal test when we take a path off the frontier, not when we put a path on the frontier. Instead, we just continue in the normal way and choose the node on the frontier which has the lowest value. That would be this one--the path through Pitesti, with a total of 417.

## A* Search 4

So let's expand the node at Pitesti. We have to go down this direction, up, then we reach a path we've seen before, and we go in this direction. Now we reach Bucharest, which is the goal, and the h value is going to be 0 because we're at the goal, and the g value works out to 418. Again, we don't stop here just because we put a path onto the frontier; we don't apply the goal test when we put it there. Now we go back to the frontier, and it turns out that this 418 is the lowest-cost path on the frontier. So now we pull it off, do the goal test, and now we've found our path to the goal, and it is, in fact, the shortest possible path. In this case, A-star was able to find the lowest-cost path. Now the question that you'll have to think about, because we haven't explained it yet, is whether A-star will always do this. Answer yes if you think A-star will always find the shortest cost path, or answer no if you think it depends on the particular problem given, or answer no if you think it depends on the particular heuristic estimate function, h.
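The whole A-star trace above can be sketched in Python. This is a sketch, not the lecture's own code: it assumes the graph-search variant with an explored set, a `graph` dict mapping each city to `{neighbor: step_cost}`, and an `h` function giving the straight-line-distance estimate.

```python
import heapq

# A-star search: always expand the frontier path with minimum f = g + h,
# where g is the path cost so far and h is the estimated distance from
# the path's final state to the goal. Goal test on removal, as above.
def a_star(graph, start, goal, h):
    frontier = [(h(start), 0, [start])]     # heap of (f, g, path)
    explored = set()
    while frontier:
        f, g, path = heapq.heappop(frontier)
        s = path[-1]
        if s == goal:                       # goal test when popping
            return g, path
        if s in explored:
            continue
        explored.add(s)
        for neighbor, step in graph[s].items():
            if neighbor not in explored:
                g2 = g + step
                heapq.heappush(frontier,
                               (g2 + h(neighbor), g2, path + [neighbor]))
    return None
```

Run on the Romania map with the straight-line distances from the lecture, this reproduces the trace: Sibiu (393), Rimnicu Vilcea (413), Fagaras (415), Pitesti (417), and finally Bucharest at a cost of 418.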

## A* Search 5

The answer is that it depends on the h function. A-star will find the lowest-cost path if the h function for a state is less than the true cost of the path to the goal through that state. In other words, we want h to never overestimate the distance to the goal. We also say that h is optimistic. Another way of stating that is that h is admissible, meaning that it is admissible to use it to find the lowest-cost path. Think of all of these as being the same way of stating the conditions under which A-star finds the lowest-cost path.

## Optimistic Heuristic

Here we give you an intuition as to why an optimistic heuristic function, h, finds the lowest-cost path. When A-star ends, it returns a path, p, with estimated cost, c. It turns out that c is also the actual cost, because at the goal the h component is 0, and so the path cost is the total cost as estimated by the function. Now, all the paths on the frontier have an estimated cost that's greater than c, and we know that because the frontier is explored in cheapest-first order. If h is optimistic, then the estimated cost is less than the true cost, so the path p must have a cost that's less than the true cost of any of the paths on the frontier. Any paths that go beyond the frontier must have a cost that's greater than that, because we agreed that the step cost is always 0 or more. So that means that this path, p, must be the minimal cost path. Now, this argument, I should say, only goes through as is for tree search. For graph search the argument is slightly more complicated, but the general intuitions hold the same.
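The chain of comparisons in the argument above can be written out compactly. This is a sketch for tree search, writing c for the cost of the returned path p and q for any path still on the frontier:

```latex
\begin{align*}
f(p) &= g(p) + h(\text{goal}) = g(p) + 0 = c
  && \text{(the estimate equals the true cost at the goal)} \\
f(q) &= g(q) + h(q) \ge c
  && \text{(the frontier is explored in cheapest-$f$-first order)} \\
\text{TrueCost}(q) &\ge g(q) + h(q) \ge c
  && \text{($h$ is optimistic, and step costs are $\ge 0$)}
\end{align*}
```

So no path through the frontier can complete to something cheaper than p.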

## State Spaces

So far we've looked at the state space of cities in Romania--a 2-dimensional, physical space. But the technology for problem solving through search can deal with many types of state spaces, dealing with abstract properties, not just x-y position in a plane. Here I introduce another state space--the vacuum world. It's a very simple world in which there are only 2 positions, as opposed to the many positions in the Romania state space. But there are additional properties to deal with as well. The robot vacuum cleaner can be in either of the 2 positions, but as well as that, each of the positions can either have dirt in it or not have dirt in it. Now the question is: to represent this as a state space, how many states do we need? You can fill in the number of states in this box here.

## State Spaces 1

And the answer is there are 8 states. There are 2 physical states that the robot vacuum cleaner can be in-- either in state A or in state B. But in addition to that, there are states about how the world is as well as where the robot is in the world. So state A can be dirty or not. That's 2 possibilities. And B can be dirty or not. That's 2 more possibilities. We multiply those together.  We get 8 possible states.
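The multiplication above can be checked by enumerating the states directly (a small sketch; the tuple representation is an assumption):

```python
from itertools import product

# Each vacuum-world state is (robot position, dirt in A?, dirt in B?).
states = list(product(["A", "B"],       # where the robot is
                      [True, False],    # is square A dirty?
                      [True, False]))   # is square B dirty?
assert len(states) == 8                 # 2 * 2 * 2 = 8 states
```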

## State Spaces 2

Here is a diagram of the state space for the vacuum world. Note that there are 8 states, and we have the actions connecting the states just as we did in the Romania problem. Now let's look at a path through this state space. Let's say we start out in this position, and then we apply the action of moving right. Then we end up in a position where the state of the world looks the same, except the robot has moved from position 'A' to position 'B'. Now if we turn on the sucking action, then we end up in a state where the robot is in the same position, but that position is no longer dirty. Let's take this very simple vacuum world and make a slightly more complicated one. First, we'll say that the robot has a power switch, which can be in one of three conditions: on, off, or sleep. Next, we'll say that the robot has a dirt-sensing camera, and that camera can either be on or off. Third, this is the deluxe model of robot, in which the brushes that clean up the dust can be set at 1 of 5 different heights to be appropriate for whatever level of carpeting you have. Finally, rather than just having the 2 positions, we'll extend that out and have 10 positions. Now the question is how many states are in this state space?

## State Spaces 3

The answer is that the number of states is the cross product of the numbers of all the variables, since they're each independent, and any combination can occur. For the power we have 3 possible positions. The camera has 2. The brush height has 5. The dirt has 2 possibilities for each of the 10 positions. That's 2^10, or 1024. Then the robot's position can be any of those 10 positions as well. That works out to 307,200 states in the state space. Notice how a fairly trivial problem--we're only modeling a few variables and only 10 positions--works out to a large number of states. That's why we need efficient algorithms for searching through state spaces.
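The cross product above can be checked term by term:

```python
# Deluxe vacuum world state count: power switch (3) x camera (2) x
# brush height (5) x dirt in each of 10 positions (2**10 = 1024) x
# robot position (10).
n_states = 3 * 2 * 5 * 2**10 * 10
assert n_states == 307_200
```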

## Sliding Blocks Puzzle

I want to introduce one more problem that can be solved with search techniques. This is a sliding blocks puzzle, called a 15 puzzle. You may have seen something like this. So there are a bunch of little squares or blocks or tiles, and you can slide them around, and the goal is to get into a certain configuration. So we'll say that this is the goal state, where the numbers 1-15 are in order, left to right, top to bottom. The starting state would be some state where all the positions are messed up. Now the question is: Can we come up with a good heuristic for this? Let's examine that as a way of thinking about where heuristics come from. The first heuristic we're going to consider we'll call h1, and that is equal to the number of misplaced blocks. So here 10 and 11 are misplaced, because they should be there and there, respectively, and 14 and 15 are also misplaced. That's a total of 4 misplaced blocks. The 2nd heuristic, h2, is equal to the sum of the distances that each block would have to move to get to the right position. For this position, 10 would have to move 1 space to get to the right position, and the same goes for 11, 14, and 15, so that would also be a total of 4. Now, the question is: Which, if any, of these heuristics are admissible? Check the boxes next to the heuristics that you think are admissible.

## Sliding Blocks Puzzle 1

H1 is admissible, because every tile that's in the wrong position must be moved at least once to get into the right position. So h1 never overestimates. How about h2? H2 is also admissible, because every tile in the wrong position can be moved closer to its correct position no faster than 1 space per move. Therefore, both are admissible. But notice that h2 is always greater than or equal to h1. That means that, with the exception of breaking ties, an A* search using h2 will always expand fewer paths than one using h1.
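The two heuristics can be sketched in Python, representing a puzzle state as a dict from tile number to its (row, column) position. This is a sketch; the representation and names are assumptions, not from the lecture.

```python
# h1: number of misplaced tiles (each must move at least once).
def h1(state, goal):
    return sum(1 for tile, pos in state.items() if pos != goal[tile])


# h2: sum of Manhattan distances from each tile to its goal square
# (a tile gets at most 1 square closer per move, so h2 never overestimates).
def h2(state, goal):
    total = 0
    for tile, (r, c) in state.items():
        gr, gc = goal[tile]
        total += abs(r - gr) + abs(c - gc)
    return total
```

On the example above, with 10/11 and 14/15 each swapped with an adjacent neighbor, both heuristics give 4, and in general h2 >= h1 since a misplaced tile contributes at least 1 to h2.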