Welcome back. So far we've talked about AI as managing complexity and uncertainty. We've seen how search can discover sequences of actions to solve problems. We've seen how probability theory can represent and reason with uncertainty. And we've seen how machine learning can be used to learn and improve. AI is a big and dynamic field because we are pushing against complexity in at least 3 directions. First, in terms of agent design, we start with a simple reflex-based agent and move into goal-based and utility-based agents. Secondly, in terms of the complexity of the environment, we start with simple environments and then start looking at partial observability, stochastic actions, multiple agents, and so on. And finally, in terms of representation, the agent's model of the world becomes increasingly complex. This unit will concentrate on that third aspect, representation, showing how the tools of logic can be used by an agent to better model the world.
The first logic we will consider is called propositional logic. Let's jump right into an example, recasting the alarm problem in propositional logic. We have propositional symbols B, E, A, M, and J corresponding to the events of a burglary occurring, of the earthquake occurring, of the alarm going off, of Mary calling, and of John calling. And just as in the probabilistic models, these can be either true or false, but unlike in probability theory, our degree of belief in propositional logic is not a number. Rather, our belief is that each of these is either true or false or unknown. Now, we can make logical sentences using these symbols and also using the logical constants true and false by combining them together using logical operators. For example, we can say that the alarm is true whenever the earthquake or burglary is true with this sentence: (E V B) => A, E or B implies A. So that says whenever the earthquake or the burglary is true, then the alarm will be true. We use this V symbol to mean or and a right arrow to mean implies. We could also say that it would be true that both John and Mary call when the alarm is true. We write that as A implies (J ^ M), and we use this symbol ^ to indicate an and. This upward-facing wedge looks kind of like an A with the crossbar missing, so you can remember A is for "and," while the downward-facing V symbol is the opposite of and, so that's the symbol for or. Now, there are 2 more connectives we haven't seen yet. There's a double arrow for equivalence, also known as a biconditional, and a not sign for negation. So we could say, if we wanted to, that John calls if and only if Mary calls. We would write that as J <=> M. John is equivalent to Mary--when one is true, the other is true; when one is false, the other is false. Or we could say that when John calls, Mary doesn't, and vice versa. We would write that as J <=> not M, using the not sign. Now, how do we know what these sentences mean?
A propositional logic sentence is either true or false with respect to a model of the world. Now, a model is just a set of true/false values for all the propositional symbols, so a model might be the set B is true, E is false, and so on. We can define the truth of the sentence in terms of the truth of the symbols with respect to the models using truth tables.
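To make the idea of a model concrete, here is a small sketch (not from the lecture) in Python: a model is just a dictionary of truth values, and a sentence is evaluated against it. The `implies` helper and the particular truth assignment are illustrative choices.

```python
# A model assigns True/False to every propositional symbol.
# implies(p, q) encodes the truth table of p => q: false only when
# p is true and q is false.
def implies(p, q):
    return (not p) or q

# One possible model for the alarm example: burglary, no earthquake,
# alarm rings, both John and Mary call.
model = {"B": True, "E": False, "A": True, "J": True, "M": True}

# Evaluate the sentence (E or B) => A in this model.
print(implies(model["E"] or model["B"], model["A"]))  # True
```

Evaluating a sentence in a model this way is exactly the operation the truth tables define, connective by connective.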
[Male narrator] Here are the truth tables for all the logical connectives. What a truth table does is list all the possibilities for the propositional symbols, so P and Q can be false and false, false and true, true and false, or true and true. Those are the only 4 possibilities, and then for each of those possibilities, the truth table lists the truth value of the compound sentence. So the sentence not P is true when P is false and false when P is true. The sentence P and Q is true only when both P and Q are true and false otherwise. The sentence P or Q is true when either P or Q is true and false when both are false. Now, so far, those mostly correspond to the English meaning of those sentences, with one exception, which is that in English, the word "or" is somewhat ambiguous between the inclusive and exclusive or, and this "or" means either or both. We translate this arrow into English as P implies Q, or as if P, then Q, but the meaning in logic is not quite the same as the meaning in ordinary English. The meaning in logic is defined explicitly by this truth table and by nothing else, but let's look at some examples in ordinary English. If we have the proposition O and have that mean 5 is an odd number, and P meaning Paris is the capital of France, then under the ordinary model of truth in the real world, what could we say about the sentence O implies P? That is, 5 is an odd number implies Paris is the capital of France. Would that be true or false? And let's look at one more example. If E is the proposition that 5 is an even number and M is the proposition that Moscow is the capital of France, what about E implies M? Is that true or false?
[Male narrator] The answers are: first, the sentence "if 5 is an odd number, then Paris is the capital of France" is true in propositional logic. It may sound odd in ordinary English, but in propositional logic, this is the same as true implies true, and if we look at the final line of the truth table--P true, Q true--P implies Q is true. The second sentence, "5 is an even number implies Moscow is the capital of France," is the same as false implies false, and false implies false, according to the definition, is also true.
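The behavior of implication described above can be checked by enumerating the four rows of its truth table in a few lines of Python (an illustrative sketch, not part of the lecture):

```python
def implies(p, q):
    # p => q is false only when p is true and q is false.
    return (not p) or q

# Print all four rows of the truth table for P implies Q.
for p in (False, True):
    for q in (False, True):
        print(p, q, implies(p, q))

# "5 is even implies Moscow is the capital of France" is
# false implies false, which the table says is true.
print(implies(False, False))  # True
```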
[Male narrator] Here's a quiz. Use truth tables or whatever other method you want to fill in the values of these tables. For each of the values of P and Q--false/false, false/true, true/false, or true/true--look at each of these boxes and click on just the boxes in which the formula for that column will be true. So for which of these 4 boxes, if any, will this formula be true--and this formula, and this formula?
[Male narrator] Here are the answers. For P and P implies Q, we know that P is true in these bottom 2 cases, and we saw from the truth table that P implies Q is true in the first, second, and fourth cases. So the only case that's true for both P and P implies Q is the fourth case. Now, this formula--not the quantity not P or not Q--we can work out to be the same as P and Q, and we know that P and Q is true only when both are true, so that would be true only in the fourth case and in none of the other cases. And now, we're asking for an equivalence or biconditional between these 2 formulas. Is this one the same as this one? And we see that it is the same because they match up in all 4 cases. They're false for each of the first 3 and true in the fourth one, so that means that this is going to be true no matter what. They're always equivalent, either both false or both true, and so we should check all 4 boxes.
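The equivalence claimed in this quiz--that not (not P or not Q) is the same formula as P and Q--can be verified mechanically by comparing the two formulas in all four models (a sketch; the helper name `equivalent` is my own):

```python
from itertools import product

def equivalent(f, g):
    # Two formulas are equivalent if they agree in every model.
    return all(f(p, q) == g(p, q)
               for p, q in product((False, True), repeat=2))

f = lambda p, q: not ((not p) or (not q))  # not (not P or not Q)
g = lambda p, q: p and q                   # P and Q

print(equivalent(f, g))  # True: the biconditional holds in all 4 cases
```

Incidentally, this particular equivalence is one of De Morgan's laws.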
[Male narrator] Here's one more example of reasoning in propositional logic. In a particular model of the world, we know the following 3 sentences are true: E or B implies A, A implies J and M, and B. We know those 3 sentences to be true, and that's all we know. Now, I want you to tell me, for each of the 5 propositional symbols, is that symbol true, false, or unknown in this model? Tell me for the symbols E, B, A, J, and M.
The answer is that B is true. And we know that because it was one of the 3 sentences that was given to us. And now the first sentence says that if E or B is true, then A is true. So now we know that A is true. And the second sentence says if A is true, then J and M are true. What about E? That wasn't mentioned. Does that mean E is false? No. It means that E is unknown: a model where E is true and a model where E is false would both satisfy these 3 sentences. So we mark E as unknown.
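This kind of reasoning--deciding whether each symbol is true, false, or unknown given some sentences--can be reproduced by brute-force model enumeration: keep every model that satisfies the 3 sentences, then check whether each symbol has a fixed value. A sketch under the lecture's symbol names:

```python
from itertools import product

def implies(p, q):
    return (not p) or q

symbols = ["B", "E", "A", "J", "M"]

def kb(m):
    # The three given sentences: (E or B) => A, A => (J and M), B.
    return (implies(m["E"] or m["B"], m["A"])
            and implies(m["A"], m["J"] and m["M"])
            and m["B"])

# Keep only the models (out of 2^5 = 32) consistent with what we know.
consistent = [dict(zip(symbols, vals))
              for vals in product((False, True), repeat=5)
              if kb(dict(zip(symbols, vals)))]

for s in symbols:
    values = {m[s] for m in consistent}
    print(s, "unknown" if len(values) == 2 else values.pop())
# prints: B True, E unknown, A True, J True, M True
```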
Now for a little more terminology. We say that a valid sentence is one that is true in every possible model, for every combination of values of the propositional symbols. And a satisfiable sentence is one that is true in some models, but not necessarily in all the models. So what I want you to do is tell me, for each of these sentences, whether it is valid, satisfiable but not valid, or unsatisfiable--in other words, false in all models. And the sentences are: P or not P; P and not P; (P or Q) or (P is equivalent to Q); (P implies Q) or (Q implies P); and finally, ((Food implies Party) or (Drinks implies Party)) implies ((Food and Drinks) implies Party).
The answers are: P or not P is valid. That is, it's true when P is true because of the first disjunct, and it's true when P is false because of the second. P and not P is unsatisfiable. A symbol can't be both true and false at the same time. (P or Q) or (P is equivalent to Q) is valid. We know that it's true when either P or Q is true, so that's 3 out of the 4 cases. In the fourth case, both P and Q are false, and that means P is equivalent to Q. And therefore, in all 4 cases, it's true. (P implies Q) or (Q implies P), that's also valid. Now, in ordinary English that wouldn't be valid. If the 2 clauses, or the 2 symbols P and Q, were irrelevant to each other, we wouldn't say that either one of those implications was true. But in logic, one or the other must be true, according to the definitions of the truth tables. And finally, this one's more complicated: if Food then Party, or if Drinks then Party, implies if Food and Drinks then Party. You can work it all out, and both sides of the main implication work out to be equivalent to Not Food or Not Drinks or Party. So that's the same as saying P implies P, saying one side is equivalent to the other side. And if they're equivalent, then the implication relation holds.
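For propositional sentences, validity and satisfiability can both be decided by enumerating every model, since there are only finitely many. A minimal sketch (the function name `classify` is my own):

```python
from itertools import product

def classify(formula, n_vars=2):
    # Evaluate the formula in every possible model of n_vars symbols.
    results = [formula(*vals)
               for vals in product((False, True), repeat=n_vars)]
    if all(results):
        return "valid"          # true in every model
    if any(results):
        return "satisfiable"    # true in some model, false in another
    return "unsatisfiable"      # false in every model

print(classify(lambda p, q: p or not p))   # valid
print(classify(lambda p, q: p and not p))  # unsatisfiable
# (P implies Q) or (Q implies P), writing p => q as (not p) or q:
print(classify(lambda p, q: ((not p) or q) or ((not q) or p)))  # valid
```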
Propositional logic. It's a powerful language for what it does, and there are very efficient inference mechanisms for determining validity and satisfiability, although we haven't discussed them. But propositional logic has a few limitations. First, it can only handle true and false values--no capability to handle uncertainty like we did in probability theory. And second, we can only talk about events that are true or false in the world; we can't talk about objects that have properties, such as size, weight, color, and so on, nor can we talk about the relations between objects. And third, there are no shortcuts to succinctly talk about a lot of different things happening. Say we had a vacuum world with a thousand locations, and we wanted to say that every location is free of dirt. We would need a conjunction of a thousand propositions; there's no way to have a single sentence saying that all the locations are clean at once. So we will next cover first-order logic, which addresses the last two of these limitations.
[Norvig] I'm going to talk about first order logic and its relation to the other logics we've seen so far-- namely, propositional logic and probability theory. We're going to talk about them in terms of what they say about the world, which we call the ontological commitment of these logics, and what types of beliefs agents can have using these logics, which we call the epistemological commitments. So in first order logic we have relations about things in the world, objects, and functions on those objects. And what we can believe about those relations is that they're true or false or unknown. So this is an extension of propositional logic in which all we had was facts about the world and we could believe that those facts were true or false or unknown. In probability theory we had the same types of facts as in propositional logic-- the symbols or variables--but the beliefs could be a real number in the range 0 to 1. So logics vary both in what you can say about the world and what you can believe about what's been said about the world. Another way to look at representation is to break the world up into representations that are atomic, meaning that a representation of the state is just an individual state with no pieces inside of it. And that's what we used for search and problem solving. We had a state, like state A, and then we transitioned to another state, like state B, and all we could say about those states was are they identical to each other or not and maybe is one of them a goal state or not. But there wasn't any internal structure to those states. 
In propositional logic, as well as in probability theory, we break up the world into a set of facts that are true or false, so we call this a factored representation-- that is, the representation of an individual state of the world is factored into several variables--the B and E and A and M and J, for example-- and those could be Boolean variables or in some types of representations-- not in propositional logic--they can be other types of variables besides Boolean. Then the third type--the most complex type of representation--we call structured. And in a structured representation, an individual state is not just a set of values for variables, but it can include relationships between objects, a branching structure, and complex representations and relations between one object and another. And that's what we see in traditional programming languages, it's what we see in databases--they're called structured databases, and we have structured query languages over those databases-- and that's a more powerful representation, and that's what we get in first order logic.
[Norvig] How does first order logic work? What does it do? Like propositional logic, we start with a model. In propositional logic a model was a value for each propositional symbol. So we might say that the symbol P was true and the symbol Q was false, and that would be a model that corresponds to what's going on in a possible world. In first order logic the models are more complex. We start off with a set of objects. Here I've shown 4 objects, these 4 tiles, but we could have more objects than that. We could say, for example, that the numbers 1, 2, and 3 were also objects in our model. So we have a set of objects. We can also have a set of constants that refer to those objects. So I could use the constant names A, B, C, D, 1, 2, 3, but I don't have to have a one-to-one correspondence between constants and objects. I could have 2 different constant names that refer to the same object. I could also have, say, the name C that refers to this object, or I could have some of the objects that don't have any names at all. But I've got a set of constants, and I also have a set of functions. A function is defined as a mapping from objects to objects. And so, for example, I might have the Number Of function that maps from a tile to the number on that tile, and that function then would be defined by the mapping from A to 1 and B to 3 and C to 3 and D to 2, and I could have other functions as well. In addition to functions, I can have relations. For example, I could have the Above relation, and I could say in this model of the world the Above relation is a set of tuples. Say A is above B and C is above D. So that was a binary relation holding between 2 objects. Say 1 block is above another block. We can have other types of relations. For example, here is a unary relation--vowel-- and if we want to say the relation Vowel is true only of the object that we call A, then that's a set of tuples of length 1 that contains just A. We can even have relations over no objects. 
Say we wanted to have the relation Rainy, which doesn't refer to any objects at all but just refers to the current situation. Then since it's not rainy today, we would represent that as the empty set. There's no tuples corresponding to that relation. Or, if it was rainy, we could say that it's represented by a singleton set, and since the arity of Rainy is 0, there would be 0 elements in each one of those tuples. So that's what a model in first order logic looks like.
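A first order model like this tile example can be written down directly as data: a set of objects, a mapping from constants to objects, functions as dictionaries, and relations as sets of tuples. A sketch in Python; the lowercase object names below are my own labels for the four tiles:

```python
# Objects in the model: four tiles (the lowercase names are arbitrary labels).
objects = {"t1", "t2", "t3", "t4"}

# Constants name objects; the mapping need not be one-to-one, and some
# objects may have no name at all.
constants = {"A": "t1", "B": "t2", "C": "t3", "D": "t4"}

# A function maps objects to objects (here, a tile to its number).
number_of = {"t1": 1, "t2": 3, "t3": 3, "t4": 2}

# Relations are sets of tuples: binary, unary, even 0-ary.
above = {("t1", "t2"), ("t3", "t4")}  # A is above B, C is above D
vowel = {("t1",)}                     # only the tile named A is a vowel
rainy = set()                         # 0-ary: the empty set means "not rainy"

print(number_of[constants["A"]])  # 1
print(("t1", "t2") in above)      # True
print(bool(rainy))                # False: it is not rainy
```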
[Man] Now let's talk about the syntax of first order logic. As in propositional logic, we have sentences, which describe facts that are true or false. But unlike propositional logic, we also have terms, which describe objects. Now, the atomic sentences are predicates corresponding to relations, so we can say vowel (A) is an atomic sentence, or above (A, B). And we also have a distinguished relation--the equality relation. We can say 2 = 2, and the equality relation is always in every model. Sentences can be combined with all the operators from propositional logic: and, or, not, implies, equivalent, and parentheses. Now, terms, which refer to objects, can be constants, like A, B, and 2. They can be variables--we normally use lowercase, like x and y. And they can be functions, like number of (A), which is just another name, or another expression, that refers to the same object as 1, at least in the model that we showed previously. And then there's 1 more type of complex sentence, besides the sentences we get by combining operators, that makes first order logic unique, and these are the quantifiers. There are two quantifiers: for all, which we write with an upside-down A followed by the variable that it introduces, and there exists, which we write with a backwards E followed by the variable that it introduces. So for example, we could say for all x, if x is a vowel, then the number of (x) is equal to 1, and that's a sentence of first order logic that is true in the model we showed. Or we could say there exists an x such that the number of (x) is equal to 2, and this is saying that there's some object in the domain to which the number of function applies and has a value of 2, but we're not saying what that object is. Now, another note is that sometimes, as an abbreviation, we'll omit the quantifier, and when we do that, you can just assume it means for all; that's left out just as a shortcut.
And I should say that these forms, or these sentences, are typical, and you'll see these forms over and over again. Typically, whenever we have a "for all" quantifier introduced, it tends to go with a conditional, like vowel of (x) implies number of (x) = 1, and the reason is that we usually don't want to say something about every object in the domain, since the objects can be so different, but rather we want to say something about a particular type of object--say, in this case, vowels. And also, typically, when we have a "there exists an x," or there exists any variable, that typically goes with just a form like this, and not with a conditional, because we're talking about just 1 object that we want to describe.
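Over a finite domain, the two quantifiers correspond directly to Python's `all` and `any`, which also makes the typical pairings above visible: for-all with an implication, there-exists with a plain condition. A sketch reusing the tile model (the lowercase object names are illustrative):

```python
objects = {"t1", "t2", "t3", "t4"}
number_of = {"t1": 1, "t2": 3, "t3": 3, "t4": 2}
vowel = {"t1"}  # only the tile named A is a vowel

def implies(p, q):
    return (not p) or q

# For all x: Vowel(x) implies NumberOf(x) = 1
print(all(implies(x in vowel, number_of[x] == 1) for x in objects))  # True

# There exists an x: NumberOf(x) = 2  (no conditional needed)
print(any(number_of[x] == 2 for x in objects))  # True
```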
[man] Now let's go back to the 2-location vacuum world and represent it in first order logic. So first of all, we can have locations. We can call the left location A and the right location B, and the vacuum V, and the dirt--say, D1 and D2. Then we can have relations: the relation loc, which is true of any location; vacuum, which is true of the vacuum; dirt, which is true of dirt; and at, which is true of an object and a location. And so if we wanted to say the vacuum is at location A, we just say at (V, A). If we want to say there's no dirt in any location, it's a little bit more complicated. We can say for all D and for all L, if D is a dirt and L is a location, then D is not at L. So that says there's no dirt in any location. Now, note that if there were thousands of locations instead of just 2, this sentence would still hold, and that's really the power of first order logic. Let's keep going and try some more examples. If I want to say the vacuum is in a location with dirt, without specifying what location it's in, I can do that. I can say there exists an L and there exists a D such that D is a dirt and L is a location and the vacuum is at the location and the dirt is at that same location. And that's the power of first order logic. Now, one final thing. You might ask what "first order" means. It means that the relations are on objects but not on relations; that would be called "higher order." In higher order logic, we could, say, define the notion of a transitive relation, talking about relations themselves, and so we could say for all R, transitive of R is equivalent to: for all A, B, and C, R of (A, B) and R of (B, C) implies R of (A, C). So that would be a valid statement in higher order logic defining the notion of a transitive relation, but this would be invalid in first order logic.
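The two vacuum-world sentences can likewise be checked over a finite model with nested quantifiers. In this sketch, the `at` relation is a set of (object, location) pairs, and the particular states shown are assumptions for illustration:

```python
locations = {"A", "B"}
dirts = {"D1", "D2"}

# State 1: the vacuum is at A and no dirt is anywhere.
at = {("V", "A")}

# For all D, for all L: Dirt(D) and Loc(L) implies not At(D, L)
print(all((d, l) not in at for d in dirts for l in locations))  # True

# State 2: now dirt D1 is also at location A.
at = {("V", "A"), ("D1", "A")}

# Exists L, exists D: Dirt(D) and Loc(L) and At(V, L) and At(D, L)
print(any(("V", l) in at and (d, l) in at
          for l in locations for d in dirts))  # True
```

Note that neither sentence mentions how many locations there are; with a thousand locations, the same quantified expressions would work unchanged.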
[Man] Now let's get some practice in first order logic. I'm going to give you some sentences, and for each one, I want you to tell me if it is valid--that is, always true; satisfiable but not valid--that is, there are some models for which it is true; or unsatisfiable, meaning there are no models for which it is true. And the first sentence is: there exists an x and a y such that x = y. Second sentence: (there exists an x such that x = x) implies (for all y, there exists a z such that y = z). Third sentence: for all x, P of x or not P of x. And fourth: there exists an x such that P of x.
[Man] The answers are: the first sentence is valid. It's always true. Why is that? Because every model has to have at least 1 object, and we can have both x and y refer to that same object, and so that object must be equal to itself. Second, let's see. The left-hand side of this implication has to be true--x is always equal to x--and the right-hand side says for every y, does there exist a z such that y equals z? And we can say yes, there is. We can always choose y itself for the value of z, and then y = y. So true implies true; that's always true. Valid. Third sentence: for all x, P of x or not P of x, and that's always true because everything has to be either in the relation for P or out of the relation for P, so that's valid. And the fourth: there exists an x such that P of x, and that's true for the models in which there is some x that is a member of P, but there doesn't necessarily have to be any at all. P might be an empty relation, so this is satisfiable--true in some models, but not true in all models.
[Man] Now I'm going to give you some sentences, or axioms, in first order logic, and I want you to tell me if they correctly or incorrectly represent the English that I'm asking about. So tell me yes or no, are these good representations? First, I want to represent the English sentence "Sam has 2 jobs," and the first order logic sentence is: there exists an x and y such that job of (Sam, x) and job of (Sam, y) and not x = y. And so tell me yes, that correctly represents Sam has 2 jobs, or no, there's a problem. And secondly, I want to represent the idea of set membership. Now, assume I've already defined the notion of adding an element to a set. Can I define set membership with these 2 axioms? For all x and s, x is a member of the result of adding x to any set s; and for all x and s, x is a member of s implies that for all y, x is a member of the set that you get when you add y to s. And third, I'm going to try to define the notion of adjacent squares on, say, a checkerboard, where the squares are numbered with x and y coordinates and we want to talk about adjacency only in the horizontal and vertical directions. Can I define that as follows? For all x and y, the square (x, y) is adjacent to the square (+(x, 1), y), and the square (x, y) is adjacent to the square (x, +(y, 1)). Assume that we've defined the notion of + somewhere and that the character set allows + to occur as the character for a function. Tell me yes or no, is that a good representation of the notion of adjacency?
[Man] The first answer is yes, this is a good representation of the sentence "Sam has 2 jobs." It says there exists an x and y; one of them is a job of Sam, the other one is a job of Sam, and crucially, we have to say that x is not equal to y. Otherwise, this would be satisfied when the same job was represented by both of the variables x and y. Is this a good representation of the member relation? No. It does do a good job of telling you what is a member--x is a member of a set once we add it, and it's still a member when we add other elements--but it doesn't tell you anything about what x is not a member of. So for example, we want to know that 3 is not a member of the empty set, but we can't prove that with what we have here. And we have a similar problem down here. This is not a good representation of the adjacent relation. It will tell you, for example, that square (1, 1) is adjacent to square (2, 1) and also to square (1, 2), so it's doing something right. But one problem is that it doesn't tell you the other direction: it doesn't tell you that (2, 1) is adjacent to (1, 1). And another problem is that it doesn't tell you that (1, 1) is not adjacent to (8, 9), because again, there's no way to prove the negative. And the moral is that when you're trying to do a definition, like adjacent or member, what you usually want to do is have a sentence with the equivalence, or biconditional, sign--to say this is true if and only if--rather than to just have an assertion or an implication in one direction.
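The moral about biconditionals can be illustrated in code: defining adjacency as an "if and only if" condition makes both the symmetric cases and the negatives decidable, instead of only the two asserted directions. A sketch; the arithmetic form of the test is my own choice:

```python
def adjacent(a, b):
    # Adjacent if and only if the squares differ by exactly 1
    # in exactly one of the two coordinates (horizontal or vertical).
    (x1, y1), (x2, y2) = a, b
    return abs(x1 - x2) + abs(y1 - y2) == 1

print(adjacent((1, 1), (2, 1)))  # True
print(adjacent((2, 1), (1, 1)))  # True: symmetry follows from the definition
print(adjacent((1, 1), (8, 9)))  # False: the negative is provable too
```

Because the definition is a biconditional, every pair of squares is classified one way or the other; nothing is left unknown, unlike the one-directional axioms in the quiz.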