cs262 »

CS262 Lesson 5: Interpreting

Simulating programs

Contents

Formal semantics

In this lesson, we're going to be covering what programs mean and especially what they mean in terms of their context, the environment in which they operate.

This is called formal symantics, and while it may initially seem a bit dry, it's actually one of my favorite topics in computer science.

In fact, before my research moved on to the greater glory of fixing bugs in programs, I started out trying to find bugs in programs, and one of the surprising lessons in finding bugs in programs is that both the program and our idea of what it should mean matter.

All of you have probably had a bug in your code at some point or another, and officially a bug is really just an instance where the program's meaning is different from its specification.

What it does do and what it should do don't agree. A, perhaps, surprising result is that in industrial practice a lot of the time the mistake is actually with the specification, and you've probably seen this in real life.

A friend might've asked you to do something. You go and do it faithfully only to discover that that wasn't actually what the friend wanted to have happen. It's just what they told you to do.

The same sort of thing comes up in bug finding and bug fixing research.

Often the formal specification for a problem, this would never happen for the homework problems here, is vague or imprecise. Not enough to tell you exactly what's supposed to happen.

Regardless of whether the problem is with the source code or the specification, understanding what code means in context is critical to figuring out if it's right or wrong, and that's what we're going to do in this lesson.

Interpreters

An interpreter finds the meaning of a program by traversing its parse tree.

Welcome back to programming languages.

It's an exciting time, and the story thus far we have already completed lexing and parsing, but remember that our goal was ultimately to make a web browser?

Let's see how things are going towards that goal. We are going to start with your average web page, a string of HTML with some embedded JavaScript, and we wanted to break it down into tokens or words, and that process is called lexical analysis.

Once we have a bunch of tokens, we were going to parse those into a tree like one of these trees,

Interpreters.png

and that approach is called syntactic analysis or just parsing, and now we want to walk that tree and understand it, and do useful work, and this is called either symantics, a fancy word for meaning, or interpreting.

We want to look at this abstract syntax tree that has 5 and 3 as a child of plus, and say that means 8.

And we want to do this for web pages, and the meaning that we're going to extract from them is what they look like.

Quiz: syntax vs semantics

Lexing and parsing deal with the form of an utterance. We now turn our attention to the meaning of an utterance — we turn our attention to semantics.

A well-formed sentence in a natural language can be "meaningless" or "hard to interpret". Similarly, a syntactically valid program can lead to a run-time error if you try to apply the wrong sort of operation to the wrong sort of thing (e.g., 3 + "hello").

Thus far, we've only looked at the form of an utterance or sentence, what words it has, whether they fit the subject, object, verb kind of ordering, but we haven't really looked at what they mean.

It turns out that these 2 are not the same thing, and one of the most famous counter examples comes, again, from our man Noam Chomsky with the sentence, "Colorless green ideas sleep furiously."

And this is a sentence that's syntactically well formed.

Idea is the noun; sleep is the verb. This is an adverb. These are all modifiers that make it a noun phrase, but it's not clear what the meaning is.

What does it mean for an idea to sleep? I thought ideas were conceptions that people had; can ideas go to sleep?

And people can try to put poetic interpretations on this. You can imagine a sort of a prose or poetic way of reading this, but in general I can sort of alter the sentence and make it still syntactically correct but symantically more and more contradictory.

It's really not clear what this sentences means. Turns out that we have similar notions in computer programming languages.

In python, you can write 1 + 1 or hello + world and get reasonable answers out, but what happens if I try to write 1 + hello? This does not really make sense.

It's not clear what it means to add an integer to a string, and in general if you feed such a program fragment or expression to a python interpreter, it will give you some sort of run-time error which is the equivalent of throwing up its hands and saying, "I don't know."

A well-formed sentence in a natural language can be "meaningless" or "hard to interpret". Similarly, a syntactically valid program can lead to a run-time error if you try to apply the wrong sort of operation to the wrong sort of thing (e.g., 3 + "hello").

Just to review what it means for a program to lead to a run-time error or to describe something that doesn't make much semantic sense, here I've written 4 program fragments, and what I'd like you to do is check each one that would lead to some sort of run-time error, and you have to assume that each one is a complete program.

What you see is what you get; there are no other hidden variables or imports declared earlier. Multiple multiple choice, try it out.

Syntax vs Semantics.png

Solution

Bad programs

One goal of semantic analysis is to notice and rule out bad programs (i.e., programs that will apply the wrong sort of operation to the wrong sort of object). This is often called type checking.

So because we're going to be writing an interpreter for JavaScript and HTML, we're only really going to be able to interpret good programs successfully.

That means we want to recognize and rule out bad programs.

So here you can conceptually imagine some sort of border patrol or gateway crossing where our diligent customs agent is going to keep out bad programs, like this "1"+2, while allowing in super-happy-fun programs like 1+2.

The process of looking at a program's source code, looking at its text, looking at the letters that make it up, and trying to see if it's going to be well-behaved or not is known as type checking or semantic analysis.

And as you may already guess, it's not going to be perfect but it's going to be pretty good.

Type checking and semantic analysis. Semantic means "meaning." We're going to break a program down into its meaning. And I'm going to explain types in just a minute.

Types

A type is a set of similar objects (e.g., number or string or list) with an associated set of valid operations (e.g., addition or length).

A type is just a set of similar objects, like numbers or strings, with associated operations, and we've run into things like numbers in strings before, but there are more exotic types that also support operations, and as we can see even from this little doodle diagram sometimes there are operations like plus that apply to all of these things.

Types.png

I can add 2 numbers using arithmetic.I can concatenate 2 strings or I can append together 2 lists. They mean almost the same thing but not really.Arithmetic addition is not really the same as string concatenation.

Similarly, both strings and lists support length.

Here, it's the number of characters; here, it's the number of elements.

Each one of these things, number, string, list, is a type, and what we basically want to do is make sure that we're not mistakenly combining elements from various types or unsafely calling non-associated operations.

For example, 5 divided by 2 makes a lot of sense, but the string hello divided by the string 6, we have no idea what this means.

Division isn't one of the operators for strings even though it is one of the operators for numbers.

Let me give you another analogy of this that sort of bares reference to the real world.

The word execute can mean many different things depending on what it's being applied to. A computer can execute a program which means to run it or to carry it out.

There are some governments in the world that will execute their citizens, and this means to kill typically in a judicial context.

These do not mean the same thing, but the word looks otherwise similar, and it falls in the same sort of grammatical pattern.

They're both verbs being applied to objects.

We're going to need to tell the difference.

Quiz: Syllepsis

In fact, in English--and in fact, in Greek--there is a particular form of word play called syllepsis, based on exploiting this semantic incongruity.

Let me give you an example of what it looks like in natural language, and then we'll bring it back to HTML.

In this fragment: "she made no reply, up her mind, and a dash for the door"

"made" is being applied to three different things--replies, make up a mind, and make a dash for some location.

And it means something slightly different in each of these.

Making a reply means to speak.

Making up your mind means to decide something.

And making a dash for the door means to run for the exit.

But because we have this one function ("made"), if you'll permit me, being applied to these three different types of arguments ("reply", "up her mind", "dash"), there's an incongruity which some may find humorous.

Here's perhaps my favorite example from the same source.

"She lowered her standards by raising her glass, her courage, her eyes, and his hopes."

This is a very good but very disconcerting poem and a lovely example of syllepsis.

Lowered is being applied to standards; that means to give up your ideals or try something worse.

Raising her glass, as if in a toast.

Raising her courage--to muster up her willpower.

Raising her eyes to look up at someone.

And raising his hopes because, well, nothing good happens in this poem.

Similarly, there's an error in HTML known as mismatched tags.

<b>Mismatched tags</a>

We talked about balanced parentheses in parsing-- the same number of As followed by the same number of Bs, or the same number of open parentheses followed by the same number of closed parentheses.

Recall that that's not something we can do with regular languages or regular expressions, but we can capture it with a context-free grammar.

However, we didn't handle it in our parser, so we must handle it now.

This concept is not particularly tricky, but it does require a context-free grammar.

So just to make sure that we're all on the same page, there's a bit of a quiz.

I have written three HTML fragments, but I've left some things blank.

What I would like you to do is fill in each blank with the word necessary for each fragment alone to be well-balanced HTML, to have its tags match up perfectly.

Syllepsis.png

Solution

Html interpreter

We can interpret a parse tree by walking down it — traversing it — in a particular order.

So here's a pictorial representation of a parse tree for HTML.

HTML Interpreter 1.png

Remember that all parse trees always grow upside down, and we have our actual tokens at the leaves at the bottom because this is a web page that just says, "Hello friends," with no tags, and it has a large number of intermittent nodes corresponding to non-terminals or rewrite rules in our grammar.

We want to walk over this parse tree and decide based on all these intermediate nodes to print out to display on the screen as our web page Hello Friends.

That process is known as interpreting.

HTML Interpreter 2.png

Walking over the tree to figure out what to do, and in some sense it's just like interpreting a sentence in English to figure out what it means or even translating a sentence from one language to another.

Recall that our 2 major elements in our syntax tree are represented in python as tuples.

HTML Interpreter 3.png

Word elements just have 2 parts, the word word-hyphen element so that we know what's going on, and then the actual text from the user, and tag elements which start with tag elements.

Again, to make it easier for us to interpret, we'll see that in just a minute.

The beginning tag, the ending tag, and a list of other elements. We also have a JavaScript element, which I haven't spent as much time talking about, but is our way of representing embedded JavaScript pieces of code in a web page.

We're going to focus on these 2 for now.

Graphics

We introduce the particular graphics library we will use in this class to render webpages. It is intentionally simple so that we can focus on core programming language concepts.

We're going to do that while we introduce graphics.

Remember our goal is to render a webpage to make a picture that corresponds to a webpage.

That means we need some way to make a picture, and it may not be obvious how we'll do that in Python.

It turns out that just as we used a library for regular expressions and parsing and even timing procedures--if you remember that example from before-- we're going to use a library--someone else's code--to do graphics.

This ability, by the way, to build on existing off-the-shelf components is one of the phenomenal advantages of computer science compared to other parts of engineering.

It's very easy to copy someone else's library and build on top of it-- much harder to copy someone else's building without actually reconstructing it yourself.

So there are 4 key functions in our graphics library that are worth knowing about.

Graphics 1.png

In graphics.word, you pass in a single word, and we'll draw it on the screen or on the picture we're making for you.

Graphics.begtintag is a bit more complicated. You have to tell us both the type of the tag and also any arguments it might have.

For example, suppose it's an anchor tag--a link in a webpage-- we need to know what the destination of the link is. I'll have you pass that in as a dictionary, mapping href to "google".

Then after that you could call and display a bunch of other words and those would all be underlined or they would show up in blue or however we draw web links.

Eventually, we can end the most recent tag, and there's also a way to display warnings, which is basically just for your benefit if you're trying this out in later assigments. This gives you a way of debugging. Maybe this will show up in some bold, red color or something like that so that you can't miss it.

So let's imagine that we have the following webpage input.

Nelson Mandela <b>was elected</b> democratically.

I'm going to show you the sequence of calls to the graphics library that we would want.

Well, conceptually, the first thing we're going to do is print out the word Nelson on the screen.

Then we'll want to print out the word Mandela on the screen.

Then we want to tell the graphics library that for while, subsequent words should be bolded. So we'll begin a bold tag. This bold tag doesn't have any arguments, so we'll just pass in the empty dictionary. That didn't actually draw anything on the screen. It's just a note to begin drawing in bold. You can imagine changing out your pen for something that draws in a different color.

Now we instruct our graphics library to write the word "was," but it's going to be bolded.

Now we write the world elected, and now we're done with things in bold, so we back out of our most recent tag.

Finally, we'll add the word "democratically" and a period.

This depends on how our lexer works, but remember that our definition for word was sort of anything that's not whitespace or angle brackets, so this period will be part of the same word.

Then it will be the responsibility of our graphics library to do things like wrapping when we get to the end of the page to decide what bold looks like.

And then to display the image back to you. So these 7 calls produce this image.

Graphics 2.png

That's how we're going to make our web browser.

While we're here, Nelson Mandela was actually the first South African President in a fully representative democratic election, and he went on to win the 1993 Nobel Peace Prize, plus a host of other honors.

Quiz: Writing an interpreter

So let's do it! We're just going to get started writing an interpreter together. I'm going to write the first half of it, and for this quiz, you're going to finish it off.

We're going to write a recursive procedure called interpret that's going to walk over abstract syntax trees and figure out what they mean.

What figure out what they mean is, calls the graphics library to render a webpage. That's the meaning of HTML.

The reason why we're operating on trees instead of just one, this is mostly just how I'm labeling it. But remember, that HTML can be a collection of elements rather than just one.

For example, you could write out a webpage that said, "Hello, Friend", and it would be a list of 2 elements.

So I want to make sure that we're prepared to take a list instead of just a single thing. We'll immediately dispense with that however by considering them one at a time.

def interpret(trees): # Hello, friend
    for tree in trees: # Hello,
        # ("word-element","Hello")
        nodetype=tree[0] # "word-element"
        if nodetype == "word-element":
            graphics.word(tree[1])
        elif nodetype == "tag-element":
            # <b>Strong text</b>
            tagname = tree[1] # b
            tagargs = tree[2] # []
            subtrees = tree[3] # ...Strong Text!...
            closetagname = tree[4] # b
            # QUIZ: (1) check that the tags match
            # if not use graphics.warning()
            #  (2): Interpret the subtree
            # HINT: Call interpret recursively

So now, for example, we'll just do the "Hello". Remember for us though, it would look something like this.

We've decided to represent our abstract syntax trees as tuples.

So I'm going to need to pull out the 0th element of this tree to see what sort of thing it is.

So I'm going to pull out the nodetype.

In our running example, it's word-element, but it could be Javascript element, tag element--bunch of possibilities.

If it is a word element, I know what to do.

I'm going to call our graphics library and say print out a word, and the word that I want you to print out is the second child of the tree.

So in this example, it's "Hello,".

Another possibility is that someone has given us a tag element-- something like this--bold tag begins, some HTML in the center, bold tag ends.

Well, we went to all the work to storing this information in our parser, so let's get it out now.

The name of the tag we're entering is the first part of the abstract syntax tree. In this strong text example, it's just a b.

There may well be some number of tag arguments. For the bold tag, no arguments, but for something like the anchor tag, there might be more.

Then there's a list of HTML elements that we'll have to interpret later-- strong text, in this example, and then there's the closing tag name.

Here's what I want you to do for the quiz-- I want you to check that the tags match and interpret the subtree.

If the trees don't match, use graphics.warning to make that really obvious.

Otherwise, interpret the subtree. Hint: call interpret recursively.

Students wishing to run their browser locally can use the graphics.py/ file used in the course. Please note that generating images successfully requires pdfTeX, ImageMagick, and Ghostscript.

Solution

Arithmetic

We will write a recursive procedure to interpret JavaScript arithmetic expressions. The procedure will walk over the parse tree of the expression. This is sometimes called evaluation.

Wow was that actually it for HTML?

Actually, yes, remember we only have 3 types of HTML elements.

The word element, which we just shot how to handle.We call graphics.word.

The tag element, which we also saw how to handle.We check to make sure the tags match.We call graphics begin tag and graphics end tag.

The JavaScript element.

Now, this is super tough, but it's not really HTML proper.Here's what we're going to do:

  1. interpret the Javascript until we get a string.
  2. call graphics.word() on that string.

Whatever they want us to print out, we will.

Now, we just need to know how to interpret JavaScript, and then we'll be done with our web browser.

However, Javascript is symantically richer than HTML, which means it will be harder to interpret.

For example, JavaScript has arithmetic and HTML rarely has arithmetic.

JavaScript also has variables. HTML rarely has variables, and we're going to have to deal with all of these things, but we have to start somewhere so let's just start with arithmetic.

Suppose our input is:

(1*2) + (3*4)

We'll be given a parse tree that looks mostly like this

Arithmetic 1.png

but with some extra decorations. Remember that our parse tree is actually more--it will look more like this for the leaves--They're all tuples,

(!number", "3")

but I'm abstracting away the tuples stuff just to show you the shape of it.

What we're going to do is just write a procedure, again, called interpret or JavaScript interpret, that walks down the tree and computes values and returns them back up.

How do I figure out what this plus means? Well, I'm going to call myself recursively.

How do I figure out what this times means? Well, it's a binary operator so I'm going to figure out my left child. That's 1, and figure out my right child that's 2, so 1*2 is 2. I'll just bring that up, pass that up to my parent.

Now, I'll call myself recursively over here. How do I figure out this times? Well, this is a 3, and that's a 4, so together they make 12, so this whole thing is 14.

Our order was down this way, down that way, back up, down this way, all the way back up, and that's how we're going to figure out JavaScript arithmetic.

Arithmetic 2.png

Interpreting complicated languages in computer science is often called evaluation, which is abbreviated to eval in many instances.

We're going to write an eval procedure for arithmetic expressions like this. It will therefore be called evaluate expression (eval_exp).

Quiz: eval_exp

So once again, as a quiz, we're going to build a lot of this interpreter, together.

I will get started; you'll complete the rest.

We're going to write an eval_exp procedure to interpret JavaScript arithmetic. We're only going to handle (+), (-), and numbers for now.

Our procedure walks over a (tree) so it takes a single (tree) as argument.

Note that this is a little different from our HTML interpreter, which took multiple (tree)s.

Remember that our (tree)s typically look like these. It's either a tuple "number", followed by something like "5" or maybe something like "binop", for binary operator, with a left_child (tree), a right_child (tree), and something like (+) in the middle.

("number" , "5")
("binop" , ... , "+", ... )

This first part, is the nodetype. We're going to extract it, by getting the zeroth element of the (tree).

If it's a "number", all I have to do is figure out the integer value of this string.

Python allows me to do that, just by calling (int) as if it were a function-- to turn this string into an integer.

Otherwise, for us, the node might be some binary operator.

So we'll just pull out the left_child, the operator, and the right_child from the parts of the tuple that you already put there for us when you did all the hard work of making the parser.

So here's what I want you to do for the quiz:

You'll need to figure out the value of the left and the right_child-- and the big hint here is recursive function call.

And then once you have those values, you should actually do the work. If this is a (+) binary operator, add them together.

# Quiz: Eval Exp

# Write an eval_exp procedure to interpret JavaScript arithmetic expressions.
# Only handle +, - and numbers for now.

def eval_exp(tree):
    # ("number" , "5")
    # ("binop" , ... , "+", ... )
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        left_child = tree[1]
        operator = tree[2]
        right_child = tree[3]
        # QUIZ: (1) evaluate left and right child
        # (2) perform "operator"'s work      

test_tree1 = ("binop",("number","5"),"+",("number","8"))
print eval_exp(test_tree1) == 13

test_tree2 = ("number","1776")
print eval_exp(test_tree2) == 1776

test_tree3 = ("binop",("number","5"),"+",("binop",("number","7"),"-",("number","18")))
print eval_exp(test_tree3) == -6

Solution

Quiz: Context

We need to know the values of variables — the context — to evaluate an expression. The meaning of x+2 depends on the meaning of x.

So we just saw how to evaluate Arithmetic expressions, like (1 + 1) evaluates to 2. But we can also consider more complicated expressions, like (1 + 2) is equal to 3.

Context 1.png

We can imagine evaluating this part, recursively, and getting the answer, 3-- and then deciding that 3 is equal to 3, so this whole thing evaluates to True.

And these may not seem like very complicated or powerful expressions yet.

But these are really the building blocks that we're going to use to make a working Web browser.

However, what if I give you an expression like this: We want to check and see if (x + 2) is equal to 3.

Context 2.png

What should this return? Actually, we're not certain.

If (x) is currently equal to 1, then it should definitely return 3.

But if (x) is something else, like negative 300, then this won't work out quite as well.

We need to know the value of (x) to figure this out.

And, in fact, we need to know the current value of (x) because a lot of variables change.

In fact, we can see this same sort of thing in natural language. Here, I've written out a sentence,

The king of France is bald.

that is syntactically, entirely valid.

We can imagine trying to figure out if this sentence is True or False, but knowing what it means--whether it represents the world correctly or not-- depends on knowing the current state of the world.

For example, as of the time of this recording, there is no king of France; France is not a monarchy. So it's not clear what it means to talk about the hair of the king of France when there is no king of France.

Similarly, if you're writing a Python program and you write

x + 1 == 2

and you haven't yet defined (x)-- if there is no king of France-- it's not clear what this means and the Python interpreter is going to give you a run time error, a run time error if (x) is not yet defined. That's not the only kind of run time error-- there are many ways things can go wrong-- but it's one of the most common.

Now let's just briefly review evaluating expressions in context, in the form of a quiz. Try it out:

I've written a 6-line Python program and I would really like it to print out True three times. So each of these print statements should have its argument evaluate to True.

What I'd like you to do is fill in each blank so that that happens, but you can only fill in blanks with x, y, or z. This is kind of a puzzle-- can you solve constraints or do backwards reasoning to figure out how things would have to go.

Context 3.png

Solution

State

The state of a program execution is a mapping from variable names to values. Evaluating an expression requires us to know the current state.

So as that last quiz just showed us, in order to figure out whether an expression is True or not or what it evaluates to, we're going to need to keep track of the values of variables-- which might change with assignment statements.

So we're going to track the current State of the World. Is (x) 1 or is (x) 2? We don't know yet. Is the King of France bald, or not? We're going to have to find out. It's a question that can be answered empirically, by looking at the state of things.

So to figure out what an expression like (x + 2) means-- when we're doing our interpreting, when we're writing our Web browser, we're going to need to keep around some notion of the values of the variables--this State of the World.

And we're going to do that with a mapping from variable names, like (x), to values.

In Python dictionary-style, that might be something like:

{"x" : 3}

This mapping is called the State, and this is a super important concept in programming languages.

It's kind of like asking: what's the current temperature? Depends on the current state of the world--and it may change, over time. We'll need a lot of bookkeeping to keep track of it.

So as we hinted earlier, a fairly direct way to do this would be just to use a Python dictionary that maps strings--variable names--to values.

If we had something like this, we could easily evaluate (x * y) in this context, in this state-- and determine that it's True for this state.

e = {"x" : 2, "y" : 3}
x * y == 6

Right now, "x" is 2; right now, "y" is 3; (2 * 3) is 6. So the value of this expression is True--if this is the State of the World.

Now--later, we may want our environment to be a little more complicated than just a single Python mapping. So we're going to introduce an abstract function to query it.

So we'll just make a promise to ourselves. We're going to write a function called environment_lookup, where you pass in an environment--probably a dictionary, but we may make it more complicated later-- and a variable_name, and you get the answer out.

def env_lookup(environment, variable_name):

So if this is our environment, e, calling env_lookup of this environment, e, and asking for the variable, "y", gives us back the value, 3.

env_lookup(e, "y") == 3

Quiz: Variable lookup

So let's test out our knowledge of this with a quiz.

We're going to go back to our JavaScript interpreter and add support for looking up variables like (x) and (y).

The first change is that our recursive eval_exp procedure now takes 2 arguments: the Abstract Syntax Tree--and the environment because we've just established that the meaning of an expression depends on the context in the World--depends on the environment.

def eval_exp(tree, environment):
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        left_value = eval_exp(tree[1], environment)
        operator = tree[2]
        right_value = eval_exp(tree[3], environment)
        if operator == "+":
            return left_value + right_value
        elif operator == "-":
            return left_value - right_value
    elif nodetype == "identifier":
        # ("binop", ("identifier","x"), "+", ("number","2"))
        # QUIZ: (1) find the identifier name
        # (2) look it up in the environment and return it

We've already seen these 2 cases before. We peel apart our abstract syntax tree, which is just a bunch of nested tuples.

We look in the zeroth component for the type.

If it's a "number" node, then we just return the string value, converted to an integer.

If it's a binary operator, like (+) or (-), we'll have to do the Addition or Subtraction.

Here, I'm skipping a few steps, compared to last time. Last time, we assigned tree[1] to left_child and then called eval_exp on it. I'm going to do it all on one line now, just to say a little space. But notice that when I make my recursive calls to eval_exp, I have to pass in this environment in order to have my subchildren--the subparts of the AST-- know what the values of variables are.

And since there haven't been any intervening assignment statements, the environment stays the same--whatever I was given is whatever I will pass on to my subtrees.

So now I just check to see: if the operator is a JavaScript +, I evaluate it, using Python +. If the operator is JavaScript -, I evaluate it, using Python -. I translate these basic universal concepts, like Addition and Subtraction, from one language to another. I'm writing an interpreter.

But what if, instead of being a "number" or a "binop" our nodetype is "identifier" is actually a reference to a variable? And just to remind you here, I've written out our Abstract Syntax Tree for (X) + 2.

("binop", ("identifier","x"), "+", ("number","2"))

At the highest level, it's a binary operator where the left_child is the ("identifier", "x"), and the right_child is the ("number", "2").

So we started up here with "binop" but that involves calling ourselves, recursively, to evaluate our left_child. When we do, the nodetype for our left_child is "identifier" so we're going to have to figure out what the value of (x) is.

So your mission is to complete the code for this case, by finding the identifier name, looking it up in the environment and returning it.

# QUIZ : Variable Lookup

# Adding variable lookup to the interpreter!

def eval_exp(tree, environment):
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        left_value = eval_exp(tree[1], environment)
        operator = tree[2]
        right_value = eval_exp(tree[3], environment)
        if operator == "+":
            return left_value + right_value
        elif operator == "-":
            return left_value - right_value
    elif nodetype == "identifier":
        # ("binop", ("identifier","x"), "+", ("number","2"))
        # QUIZ: (1) find the identifier name
        # (2) look it up in the environment and return it

# Here's some code to simulate env_lookup for now. It's not quite what we'll be
# using by the end of the course.

def env_lookup(env,vname): 
        return env.get(vname,None)

environment = {"x" : 2}
tree = ("binop", ("identifier","x"), "+", ("number","2"))
print eval_exp(tree,environment) == 4

Solution

Control flow

Python and JavaScript have conditional statements like if — we say that such statements can change the flow of control through the program.

Program elements that can change the flow of control, such as if or while or return, are often called statements. Typically statements contain expressions but not the other way around.

So now we can handle expressions like

2 + 3
x + 1

but if you think about most of the Python programs you've written, they've got quite a bit more going on-- things like "if" statements or assignment statements that influence the state or change which parts of the program are executed.

Formally, we say that statements or program parts, like "if" or "while" or "return" change the flow of control, and a potential reasonable analogy for control flow is a river.

Here, I've written a relatively simple bit of Python code. But we could sort of trace through it, noting which lines get visited.

x = 1
if x < 5:
   print x
else:
   x = 2
x = 3

We're definitely going to visit: x = 1. We'll take a look at this "if" because (x) is 1, it's less than 5 so we're going to curve in here and evaluate this "print". But then we're going to kind of skip over this whole "else" branch and rejoin things at the bottom--so you've got this flow that sometimes branches, seems to fork, rejoins at the end--Control Flow.

Control Flow 1.png

Which part of the program do we execute? If I'm to keep my finger on the line that we're currently worried about, how does my finger move around?

We're going to need to handle this sort of things if we're building an interpreter for JavaScript and HTML.

So things like "if", "while", and "return" change the flow of control.

These things are called "statements" to distinguish them from "expressions". Expressions are typically a bit simpler. They're more like noun phrases in a natural language.

A statement is more like an entire sentence. It's going to contain a verb as well or some instruction, like change the value of (x) or jump over here.

Statements often contain expressions but not the other way around.

And you can think of this as being a bit of a hierarchy. This whole thing is a statement.

Control Flow 2.png

The usual abbreviation for a statement is "stmt". But just this little part down here is an expression-- just like a whole sentence may contain one or two nouns.

So let's say we have an environment that maps the variable, (x), to the value, 2

env = {"x" : 2 }

if x < 5 {
   y = "don";
} else {
   y = "quixote";
}

and now I've written a JavaScript program below it that contains an if-then-else statement--it should remind you a lot of Python.

If (x < 5), the variable, (y), is going to be assigned to "don"; otherwise, the variable, (y), is going to be assigned "quixote".

In our environment, where (x) is 2, (x < 5) will be evaluated to True.

Control Flow 3.png

So the flow of control will visit here and here, but will skip these two and rejoin us down here at the bottom.

And hopefully, at the end of the day, we'll have a new environment where (x) maps to 2, and the variable (y) maps to "don".

"Don Quijote" is a classic novel, written by Miguel de Cervantes around 1605. We get a number of expressions from it, like "Tilting at windmills" , in English; the protagonist believes a windmill to be a mortal foe, perhaps a dragon, and he charges up but gets knocked down by the whirling arms each time. These days, "tilting at windmills" means trying something over and over again, even though you know you're never going to succeed--an impossible task.

Quiz: Evaluating Statements

So let's apply what we just learned to help complete our interpreter to fill out what it means to evaluate statements.

Previously, we had an evaluate function expression.

Now I'm going to make a new function for evaluating statements. It's going to take the abstract syntax tree and also the environment. We need to know the values of variables.

As always by convention, we can get the type of the thing we're looking for from the 0th part of the tuple. We did all this work in the parser to record it. Let's totally take advantage of that.

Well, 1 possibility is that it's an assignment statement. Those look something like this.

("assign", "x", ("number", "3")) ---> x = 3

Here's our statement type, assignment, and its first argument or the first part of its tuple is going to be some variable name that we're assigning to, and then the second part is going to be an arbitrary expression abstract syntax tree, so that corresponds to x becomes 3.

So I'll just get out the variable name and get out the right_child abstract syntax tree. I'll figure out what the new value is by calling evaluate expression because this right-hand side part could be more complicated.

For example, it could be a binop of a bunch of things, so we're going to need to walk down that tree and interpret it--translate it to figure out what it means.

Now we've got the new value, and we'll just promise ourselves later on we're going to write some new function called environment update that changes the environment so that variable_name now points to new_value.

This should remind you a lot of the chart update procedure that we wrote before.

However, there are other types of statements. In addition to assignments, there are if-then-else.

There are typically 3 parts of if-then-else. The conditional expression--suppose our if-then-else is if x < 5 I haven't drawn in the curly braces to save space, but this is the general idea-- then our conditional expression is x < 5 — that's the guard that tells us which way to go.

The then statements are A. The else statements are B.

But before we go farther, there is 1 complicating factor. Remember that in JavaScript, you can have multiple statements inside this compound statement block. So really, we have then statements and else statements.

Your mission is to complete this code, and you can assume that there's a procedure called eval_stmts that takes statements as an argument and an environment.

# QUIZ: Evaluating Statements

def eval_stmts(tree, environment):
    stmttype = tree[0]
    if stmttype == "assign":
        # ("assign", "x", ("binop", ..., "+",  ...)) <=== x = ... + ...
        variable_name = tree[1]
        right_child = tree[2]
        new_value = eval_exp(right_child, environment)
        env_update(environment, variable_name, new_value)
    elif stmttype == "if-then-else": # if x < 5 then A;B; else C;D;
        conditional_exp = tree[1] # x < 5
        then_stmts = tree[2] # A;B;
        else_stmts = tree[3] # C;D;
        # QUIZ: Complete this code
        # Assume "eval_stmts(stmts, environment)" exists

def eval_exp(exp, env): 
        etype = exp[0] 
        if etype == "number":
                return float(exp[1])
        elif etype == "string":
                return exp[1] 
        elif etype == "true":
                return True
        elif etype == "false":
                return False
        elif etype == "not":
                return not(eval_exp(exp[1], env))

def env_update(env, vname, value):
        env[vname] = value

environment = {"x" : 2}
tree = ("if-then-else", ("true", "true"), ("assign", "x", ("number", "8")), ("assign", "x", "5"))
eval_stmts(tree, environment)
print environment == {"x":8}

Solution

Quiz: Creating an environment

Alright, so we've just seen, especially in that last quiz, how to evaluate an expression or a statement in in an environment, but we've been assuming that the environment, which I said is kind of like the world, comes pre-prepared, but what if we want to build a new world? What if we want to set the values of variables? What if I want to determine what the current temperature is?

Here I've written 2 code snippets, that do the same thing, and both of them extend or create an environment.

Creating an Environment 1.png

They make a bigger, richer world that has new bindings of values to variables.

Here in Python, we assign the value zero to the variable x, and then we're going to print out x + 1, that'll be 1, and over here in JavaScript, we use a var statement to introduce the variable x and assign it the value zero, and then we write out onto the web page x + 1, which in an incredible surprise move will also be 1.

We're going to want to handle statements like these in our JavaScript interpreter in order to build our web browser.

So let's test our knowledge of assignment. statements with a brief quiz.

So here I've written a fragment of Python code that's going to print out results to the screen based on these 2 print statements in myfun.

What I would like you to do is to determine which of the following outputs could actually be produced by this program and try to work through it in your head, and this quiz is multiple, multiple choice.

Check each box that corresponds to a string that could be printed out if you were to run this program in the Python interpreter.

Creating an Environment 2.png

Solution

Quiz: Scope

We use the term scope to refer to the portion of a program where a variable has a particular value.

This notion of variables having different values and different contexts is sometimes called scope.

Scope 1.png

There's some sort of global scope that we start in where we're assigning values to variables, but then inside myfun, there's a new x that has a new value, and what this means is that our environment cannot be a flat mapping.

We had hoped before that we'd be able to use a single Python dictionary, but this myfun example shows that that's really not going to cut it because in a single dictionary, you can only have x bind to 1 thing, and we need to know that x is sometimes outside x and sometimes "os lasiandras".

Let's get some practice with these concepts with another quiz.

This one explicitly on scoping, and we'll do it together in the interpreter. So we have x gets outside x and y gets outside y just like before. So I've written out this program, and it's a little more complicated than last time.

Scope 2.png

It has 2 functions, funone and funtwo, my phenomenal cosmic naming scheme, with a few printouts.

X starts out as outside x.

Y starts out as outside y.

When we enter function 1, we print out current value of x in scope, the current value of y in scope, and then we call function 2.

Function 2 also prints out its value of x and its value of y, and down here we get the ball rolling by calling function 1 on Hong Kong in 1997.

So the quiz for you is to fill in these blanks, the 3 formal parameter declarations,-- 2 for funone and one for funtwo — but you can only use x, y, and z.

So for each one of these blanks, fill it in with either x or y or z, and that's it.

In order to make the output of this program, match exactly what we see down here, funone x Hong Kong, funone y outside y, funtwo x outside x, funtwo y funone y. This is a bit of a puzzle, try it out.

Solution

Quiz: Identifiers and storage

Because the value of a variable can change, we will use explicit storage locations to track the current values of variables.

So we want to understand the relationship between Identifier or variable names (this means the same thing) and storage places, particular values as we're running our program; the state of the world.

If I write something like this in Javascript,

var x = 2;

it's going to return 2. Conceptually, somewhere there's a box where we've made enough room to store the value 2, and we've labeled that with x.

Identifiers and Storage 1.png

Since this is a place for storing a value, we might call it a storage location or storage place.

If I write x=3, we need to go back to this box and update it with the new value, because x can only have 1 value at a time in the same context.

Identifiers and Storage 2.png

So, if I declare another variable, it's going to make another box, another storage location. We'll need to use more memory in the computer to store this value.

Identifiers and Storage 3.png

Let's see a more complete example.

Identifiers and Storage 4.png

Here in this program we start off by initializing y to 2, we declare a myfun function of x, it prints out x and then it prints out y, and then we call a myfun of y plus 5.

A little reasoning suggests that this is going to print out first the value of x, x is y plus 5, y is 2, so first we're going to print out 7, and then we're going to print out y which was 2, 7 and 2.

So, initially maybe our world looks a bit like this.

Identifiers and Storage 5.png

Y has the value of 2. But then when we call this function, we're moving into a slightly new context that has to have room for x, it's actual argument. The particular value we put in there is y plus 5, or 7,

Identifiers and Storage 6.png

but we want to remember our roots. I defined this function here right next to the place where y was equal to 2, so when I'm in this context, printing out x and y, I can get the value of x right here, and I may have to look a little farther to get the value of y, but I can just trace the arrows and do it.

Whenever we make a function call, it's going to get us a new box, new storage locations,

Identifiers and Storage 7.png

and I'm going to firm up this intuition in just a bit. I realize it's very hand-wavy now.

Here's the quiz. I've written a Python program that has 3 print statements. I'd like you to tell me what each one prints out. Fill in the blanks.

Identifiers and Storage 8.png

Solution

Quiz: Environments

There is a special global environment that can hold variable values. Other environments have parent pointers to keep track of nesting or scoping. Environments hold storage locations and map variables to values.

The first environment, the one we start with, is special. We call it the global environment. By default everyone gets a chance to see it.

Whenever we make a function call, we make a new environment that officially has a parent. This environment knows that it was created in, defined in this global environment.

Environments 1.png

So when we're interpreting JavaScript, we're going to use this nested, or chained, notion of environments to keep track of the values of variables. An environment is just a mapping from variable names to values except environments may also have parent pointers. Every environment that's not the global one has a parent.

Beyond having a parent pointer, environments just map variables to values.

B is 555. X is 2.

To give you another way to think about this notion of chained environments, we're going to reason by an analogy.

Let's say that you're going on vacation. You leave your home and you go to stay at a hotel by the seashore. You don't bring everything in your home with you, but you bring a few things. And then eventually you're going to leave your hotel room and go out on the beach. Maybe you'll bring your swimming suit, pack a lunch, that sort of thing. You'll bring a few items with you, but not everything that was in your hotel room and certainly not everything that was in your home.

Now let's say that you've forgotten your keys. Oh my gosh! Where are they? The first place to check might be in your pocket or around with you on the beach. Did I bring them all the way to the beach?

If not, maybe they're back at the hotel.

If they're not back in the hotel, maybe you left them at home.

And if they're not at home, there isn't a whole lot of recourse. Maybe you can call a locksmith or do something else sort of generically. Hopefully you can get new keys. But ultimately, there is just nowhere else to check.

So this is analogous of having a global environment--a big marketplace where you could buy new keys. And then some nested environments on top of it-- your home, your hotel, whatever you brought with you to the beach.

Environments 2.png

So let's test our knowledge of this idyllic tropic scene with a quiz. Suppose you are currently in the hotel environment. We have A is 6 and B is 7. That points back to the home environment, where A is 44 and X is 55. And that points all the way to the global environment, where A is 1, B is 2, and C is 3. You're in the hotel, and you want to evaluate A, B, C, and X. What are the values of these variables? Fill in the blanks.

Solution

Lookups

Congratulations on mastering nested environment frames and variable lookups.

While it might seem a bit foreign now, this notion of looking up a variable in an environment is very common in languages like PHP, Ruby, JavaScript, or Python.

It's how it's done in the real world.

Identifiers like home or friend may not mean the same thing in all contexts.

My home may not be the same as your home.

My friends may not be the same as your friends. But we both have homes, and we both have friends. This is an area where your intuition will really guide you to the right answer.

Quiz: Chained Environments

So let's start building up these chained environments through a series of function calls.

What will that look like? Well, at some point we declare or define the function--some fun, we'll call it-- and then later on we call it. We want to have some fun, and we're going to pass in as the actual arguments three different expressions.

Chained Environments 1.png

Here are the steps we go through.

First, we create a new environment, and its parent pointer points to the current environment. This parent part is really essential.

Next, in that new environment we create storage places for each formal parameter. Formal parameters, in this example, p1, p2, and p3. There are three formal parameters.

Then you want to fill in those places with the values of the actual arguments. The actual arguments are e1, e2, and e3. Oh, no! There's a step 4.

Chained Environments 2.png

Then you actually just do it. You evaluate the function body in the new environment.

That means it's going to be able to see the values of the actual arguments, which is just what we wanted.

Let's try this out with a bit of a quiz.

Here I've got a 5-line Python program.

x = 1
y = 2

def myfun(a, b, x):
    print a + b + x

myfun(1 + 2, 3+ 4, 5+ 6)

X gets 1; Y gets 2. I define a function, myfun, of formal parameters A, B, and X, prints out A + B + X. And then I make a call to myfun with actual arguments 1+2, 3+4, and 5+6.

I'm definitely going to have a global environment, and when we call myfun, we want to make a new environment pointing back to the old one, which is going to have room for some new places. But I haven't quite filled everything out, and I'm hoping you will help me.

Chained Environments 3.png

We need to know the value of X and Y in the global environment and then fill in these three boxes down here in this child environment corresponding to the myfun function call. Fill in the blanks.

Solution

Quiz: Greetings

Let's see a slightly more lively example.

Greetings 1.png

There's a lot going on in this example, and some of it you may not have seen before.

I'm making a procedure called makegreeter that--and this is cute-- inside of it defines another procedure.

If you haven't seen this before in Python, don't worry, but it is possible to make 1 function inside another.

Here I'm making a function called greeter that takes a person to be talking to and prints out a particular greeting to that person.

And then this makegreeter function returns greeter this function that I just defined in here. Wow! This is not obvious.

So down here I'm making a variable sayhello, which is the result of making a greeter who always starts with "hello from uttar pradesh".

And then I can ask that greeter to say hello to Gracie and also say hello to Lucknow.

So sayhello points to or references a particular version of greeter where the greeting has been locked in to be hello from the UP.

If you haven't been there, Uttar Pradesh is a region in India. Its capital city is Lucknow, known as The City of Nawabs. It's known for its manufacturing and retailing and possibly also silversmithing. Cool place. Very populous.

So here I've recopied the same code from before.

Greetings 2.png

Nothing has changed except that I'm abbreviating Uttar Pradesh with u.p.

And now over here on the right I'm drawing out the chained or nested environment frames as we're calling sayhello("gracie").

I'd like you to help me out with just the last part.

Down here in this final environment frame there's a single variable bound to a value.

I'd like you to fill in these 2 blanks so that this picture of the environment is complete.

Solution

Quiz: Environment needs

We've been talking about environments, and we've established that they need to do 2 different things:

  1. Map variables to values.
  2. Also point to the parent environment.

So we're actually just going to represent environments as a tuple of the parent_pointer, or None if it's the global environment, and the dictionary, mapping variable names to values.

(parent_pointer, dictionary)

That means that I can finally provide you with the definition of that environment lookup procedure we promised earlier.

Remember that our procedure--check your pockets, check your hotel room, check your actual home. If that doesn't work, try to buy it on the Interwebs.

Let's go see how this goes.

We'll just look and see if this variable name is in our environment.

But remember that our environment is actually a tuple of a parent in a dictionary, so we're going to have to select out the oneth element in order to check this out.

If we have a binding for our variable name in the current environment, we just return that binding.

We get out our dictionary, and we look up the value associated with vname in it.

If we were the global environment and we didn't know the answer, then we'll just return None.

We don't know what the deal is. We could try to do something else here like flag an error. For now let's just return some default value.

But if we are not the global environment, if I don't have it in my pockets, I can check my hotel room and then my house, and I'll do that by getting my parent pointer out of position 0 and calling myself recursively.

Environment Needs 1.png

So that's our lookup procedure.

Step 1: Do we have it?

Step 2: Are we the global environment?

Step 3: If we're not the global environment, ask our parents.

Similarly, we now know how to update an environment, or we could work our way through it.

Environment Needs 2.png

If we already have a binding for this variable name, then we'll just change that binding.

This is a dictionary, and here I just updated the value that corresponds to vname in that dictionary.

Otherwise, if I'm not the global environment, I can ask my parents to store the value for me.

If I am the global environment and I didn't define your variable, then this is problematic.

We should have allocated space earlier.

So here I'm going to actually define some environments in Python rather than drawing pretty pictures of them, and we're going to run through some practice problems to make sure we understand this.

This will be fill in the blanks. I'm going to call environment lookup of "x" in the new environment.

What's that going to return? So then after we look up "y" in the global environment, we make a change.

We assign "x" to be 55 in the new environment, and then we print out "x" in the new environment and the global one.

Environment Needs 3.png

How do these things turn out?

Solution

Quiz: Declaring and calling functions

Our newfound mastery of environments puts us in a perfect place to write interpreter code for declaring and calling functions.

However, there's a lot more to functions than just environments.

For example, someone could be a little mean and give us a procedure like this to try to interpret.

def mean(x):
   return x
   print "one thousand and one nights"

Mean of x immediately returns x, but then after that it has written print "one thousand and one nights".

We should never actually print out "one thousand and one nights" from this function definition. We should return first.

That's a shame because One Thousand and One Nights is perhaps the best-known example of Arabic literature, a collection of Middle Eastern and South Asian stories. One of my particular favorite parts are the stories involving Sindbad the Sailor. Unfortunately, his luck fluctuates wildly. More on Sindbad perhaps later on.

The meaning of this is we need a way to return without doing any more work.

Let's get a better feel for this with a bit of a puzzle.

Here I've written a bit of Python code, and I have 4 assignment statements involving x.

What I'd like you to do is imagine that these assignment statements are not there.

But you get to add one of them in, and your goal is to try to add in a statement that will prevent us from printing out "sindbad".

So again, assume these statements aren't there, but which ones could you move in to prevent "sindbad" from being printed?

Multiple multiple choice. Check all that apply.

Declaring and Calling Our Functions.png

Solution

Catching errors

Modern programming languages use exceptions to notice and handle run-time errors. "try-catch" or "try-except" blocks are syntax for handling such exceptions.

Sindbad certainly can't catch a break, but perhaps we can catch or notice errors in Python programs.

If we are very wary and think that an error might happen, we can use the Python keywords try and except to guard critical code.

Catching Errors 1.png

This code is guarded. And if an error happens in here instead of aborting Python execution, it will go down to this except block and start executing there.

This except block only runs if the guarded block has an error. So this code will print out "hello" and then we've seen before that when we try to do integer division by 0 that doesn't make much sense, so we'll jump down here and print out "didn't work" and then print out problem, which was our particular exception.

And if you haven't run into it before, exception is just a special computer science keyword meaning an error in a program.

It derives from exceptional situation and can refer to either something like you're out of space on your disk or memory card or a problem in your program like you're trying to divide by 0.

However, it's also entirely possible to raise exceptions on our own. So here in this Python code fragment we're printing out "joseph heller" but in the middle we raise this exception with the value 22.

Catching Errors 2.png

And then down here we catch that exception, calling it the variable problem, and then we print out, oh, it "didn't work: we caught" problem, and this will end up printing out "joseph". We'll raise the exception, so we'll never actually print out "heller". We will print out it "didn't work: we caught" 22.

And Catch-22 is a satirical novel by Joseph Heller. It's a bit absurd. The phrase "catch-22" has actually entered the popular English lexicon, meaning, approximately, "no win situation."

Harnessing exceptions

We want to harness the power of exceptions just like you might harness a horse to a wagon.

We're going to use exceptions under the hood in our interpreter to simulate return statements.

Remember that after a return statement we don't want to execute the statements that come after it; we want to jump back to the caller.

We use exceptions to do that. Let's consider our code for evaluating statements in the context of an environment again.

Harnessing Exceptions.png

We get out the type of the statement, and if it is a return statement, we'll get out the associated expression.

Remember you can write something like 1 + 2, so we're going to have to evaluate this expression's abstract syntax tree or parse tree to figure out what it means.

So we'll evaluate that return expression in the environment and get a return value, which I'll abbreviate as retval, and then we'll just raise an exception with retval as its payload, just like 22 was the payload in the previous example.

And we'll just have to trust ourselves to catch it later. In fact, that's what we'll do right now.

Quiz: Frames

We just saw how when evaluating statements like a return statement we could throw an exception.

Now we're in a great position to code up function calls, which we'll have to set up a new environment and catch the return values.

def eval_stmt(tree,environment):
    stmttype = tree[0]
    if stmttype == "call": # ("call", "sqrt", [("number","2")])
        fname = tree[1] # "sqrt"
        args = tree[2] # [ ("number", "2") ]
        fvalue = env_lookup(fname, environment)
        if fvalue[0] == "function":
            # We'll make a promise to ourselves:
            # ("function", params, body, env)
            fparams = fvalue[1] # ["x"]
            fbody = fvalue[2]
            fenv = fvalue[3]
            if len(fparams) <> len(args):
                print "ERROR: wrong number of args"
            else:
                #QUIZ: Make a new environment frame
                # evaluate the body in the new frame
                try:      
                    # QUIZ : Evaluate the body
                    return None
                except Exception as return_value:
                    return return_value
        else:
            print  "ERROR: call to non-function"
    elif stmttype == "return": 
        retval = eval_exp(tree[1],environment) 
        raise Exception(retval) 
    elif stmttype == "exp": 
        eval_exp(tree[1],environment)

Once again we extract this statement type from our abstract syntax tree, and now we want to handle function calls.

There are 2 parts to a function call abstract syntax tree: the name of the function, like absolute value or myfun or square root, and then the arguments.

And because the function may have 1, 2, 3, or more arguments, we just have a list of expressions.

Function name may mean different things in different contexts, just like any other variable, so we'll go look it up in the current environment.

We're going to have to decide what it means for something to be a function.

For something like an integer or a string, we can just use a Python integer or string.

For a function, we're going to use a tuple where the first part is function so that we know one when we see it, then there's a list of formal parameters, then there's the body, and then there's the environment, and we'll see how that comes into play later.

("function", params, body, env)

For now I'm just going to promise that this is how functions will turn out, and later, in the next step, we'll make that promise true by having function definitions produce these 4 tuple values.

So we'll just pull out the function parameters, the function body, and the function environment.

For example, if we're still calling square root, square root's official formal parameter name might be x when we passed in the particular actual argument 2.

One of the goals in our interpreter is to rule out bad code. One easy mistake to make is to pass in the wrong number of arguments, to have a different number of actual arguments and formal parameters. We'll just check for that now. They're both lists. Compare their lengths.

Otherwise, we have to make a new environment frame, which I'll leave for you to do, and then we want to evaluate the body in that new frame. We'll have to make a new environment frame and follow all of those steps-- make, potentially, spaces in the new frame for the formal parameters and assign to them the values of the actual arguments.

So we have quite a bit to do in here.

Then we want to evaluate the body in the new frame, and we're going to do that with exception handling. We'll try doing something--you tell me-- and ideally, we will get to the end and somewhere in the middle we'll have raised that special exception holding the return value.

If that happens, we'll return the return value. Otherwise, the user wrote code without a return statement; we'll just return None.

All the way up topside they're trying to call something like square root. Square root better be a function and not some string or some number.

All the way down here we check for that. If we try to call something that was not a function, we'll just print out an error.

Here's your quiz. Fill in these 2 pieces of code so that we correctly handle function calls. It's worth noting that this quiz is pretty tricky. This is a little more programming than we normally ask you to do.

Solution

Calling functions

In languages like Python or JavaScript, functions can be values (i.e., evaluating an expression can result in a function). As a result, we must decide how to represent function values. We'll use tuples that include the function body, parameter names, and the relevant environment.

So now we know how to call functions and return from functions, and this makes me doubly happy.

But both of those things assume we have a function somewhere, which we currently don't. We don't know how to make them.

So right now, very soon, we're going to get into defining functions, making new functions out of nothing.

Here I've defined the same function in Python on the left and JavaScript on the right.

Calling Functions.png

And while the syntax changes, we spell the words differently, all the key concepts are there.

We have to list the name of the function, fname, we have to list the formal parameters, fparam in our previous example, and we have to give the body.

And we said we'd just make a tuple, a for tuple that has the word "function" as a string.

So if the user writes in code like this and we're making our JavaScript interpreter for our web browser, what should we turn it into? Well, a function value that has 4 parts: the word "function," telling us that it's something we can call; the list of parameters, x, maybe y and z; the body, a list of statements; the environment we were in when the function was defined.

We don't need the function name because we'll be adding a mapping from the name to this value in the environment-- to the old environment, that is, not this new one. To the previous one.

Quiz: Function definitions

Let's try our hand at interpreting function definitions.

I'll do the first part, you fill in the rest.

Functions Definitions.png

In JavaScript, function definitions are top level elements of a JavaScript program. They're not statements and they're not expressions, so we're going to have a new evaluation function, evaluate elements.

Here's the abstract syntax tree for the element, here's the current environment.

As usual, we get the type of our tree node by looking at part 0. If it's a function definition, then we're going to pull out the function name, the function parameters, the function body, and then we're going to make up a function value out of that.

It's going to be a for tuple, and then we're going to add that to the environment. Fill in each one of these 4 blanks so that the code works correctly, so that we return the right sort of function value and then add it to the environment.

Solution

Double edged sword

We can simulate JavaScript programs with our interpreter written in Python. That means that anything that can be done in JavaScript could be done in Python as well. It turns out that JavaScript could also simulate Python. So they are equally powerful!

So now we have the phenomenal cosmic power to define functions.

Once we have them, we can call them with actual arguments, and then inside the function body we can return with the return value.

Function bodies are statements which can contain expressions.

So we'll be evaluating or interpreting quite a bit.

And this is a lot of power.

In fact, it's going to turn out that this is really all you need, pretty much, to be just as powerful as any other programming language.

Recursive functions pack a wallop. There's a lot going on.

And that means that this is actually a double-edged sword.

With great power comes ... let's say lots of care that we have to take.

We've been writing together in Python a program that simulates, interprets, evaluates JavaScript, and this means that anything that JavaScript could do we could also do in Python, because if for some reason we didn't really know how to write it in Python, we would just leave it in JavaScript and run our written in Python JavaScript interpreter.

It's a bit like that MC Escher drawing where 1 hand is drawing another hand that's actually drawing the first in a bit of a recursive loop.

We can use language A to simulate or interpret language B.

So on the 1 hand, if you'll permit me, I can write a Python program that simulates any JavaScript program.

That means that Python is at least as powerful as JavaScript.

It also turns out--not shown here, but it's actually very similar-- that I could write in JavaScript an interpreter for Python.

So JavaScript is at least as powerful as Python.

Both of these claims ignore speed because our interpreter or our web browser might be a little slower than just running the appropriate code natively, just as translating natural language text from Spanish into Mandarin Chinese takes some amount of time.

But there's a strong sense in which these languages are equally expressive.

Any computation I could carry out in Python I could also carry out in JavaScript.

Any computation I could carry out in JavaScript I could also carry out in Python.

I can be just as creative in either language.

Now, one language may look prettier than another, one language may take a little less time to say a certain thought, to have a creative idea, but ultimately, I can express the same range of work, emotions, desires in Python and in JavaScript.

This is pretty powerful stuff.

Natural language power

While most computer languages are equivalent (in that any computation that can be done in one can also be done in another), it is debated whether the same is true for natural languages.

We know from computer science theory from people like Alan Turing that languages like Python and JavaScript are equally powerful.

You've seen half of that here when we wrote an interpreter for JavaScript in Python.

We might wonder are natural languages equal? Spanish, Portuguese, Japanese, Mandarin, Hindi--are these languages all equal in every way?

This idea is actually not universally accepted.

Does every French utterance have an equivalent Japanese utterance?

The Sapir Whorf hypothesis--or linguistic relativity hypothesis--in linguistics holds that the structure of a language influences the ways in which speakers reason about the way world.

This is still hotly debated. Language is generally held to influence thought.

For a chilling fictionalization, I encourage you back again to George Orwell's 1984.

However, in computing in unnatural languages like Python and JavaScript, we can have definitive answers based on mathematical proofs.

There is a very strong sense in which Python, JavaScript, C, C++, C-Sharp, and Java can all do exactly the same range of computations-- those that can be done on what's known as a "Turing machine," a mathematical model of computation.

However we won't talk about Turing machines in detail in this class.

Quiz: Comparing languages

Simulating a program requires running it.

In the real world, it's entirely possible that language influences thought, and it may be that some things are easier in 1 language than another.

In the world of computing, the languages that we are considering, like Python and JavaScript, are ultimately equally expressive in a strong formal sense.

However, we often still think that some things are easier-- that is, maybe they take you a little less time to write or they're a little easier to read, they feel a little more natural--in one language than another.

You may just prefer writing code in Python to writing code in JavaScript even though ultimately you could say all of the same things in JavaScript.

One may feel a little easier to you even though the end results would be the same.

So I assert that interpreting or translating between two systems of meaning is actually a very deep concept.

And one interesting downside for us is that simulating a program, interpreting it, translating it often requires running it.

Let's see how that plays out. Here I've written a simple Python program, the sort that we might want to translate or interpret into JavaScript and then back again, and what I'd like to know is, what does it do?

Maybe it prints 0. I've listed a few options. I want you to think about this program in your head and tell me, what will happen if I run it? Check all that apply, bearing in mind that I am trying to trip you up.

Comparing Languages.png

Solution

Quiz: Infinite loop

Computer programs can contain infinite loops. A program either terminates (halts) in finite time or loops forever. We would like to tell if a program loops forever or not. It is provably impossible to write a procedure that can definitely tell if every other procedure loops forever or not.

All right. Infinite loops, those are no fun at all. I want to see my web page. I don't want to wait an infinite number of seconds and then see my web page. That's pretty long. I would get hungry before that ran out. Wouldn't it be nice if we could just tell if a program was going to loop forever or not? And then if it's going to loop forever, I won't run it or I'll print a little warning on the web page, but I won't waste a lot of time on it.

So what I'd really like to do is just look at the program source code as we got it from the Web, say, and just be able to tell if it loops forever in one of these infinite loops or if it halts, if it stops, if it settles down and returns an answer after some finite amount of time.

This seems like a totally legitimate request.

Unfortunately, it's not just difficult, it's actually impossible, provably impossible. Not really hard, we're too lazy, we couldn't figure it out, but we know you can't do it.

You can't make this determination correctly every time. To see why this can't happen, let's assume we could solve it and see what bad things happen to the world.

Let's assume that we have thought very hard about this and we have somehow implemented a magic procedure called halts() which takes another procedure as an argument and returns True if that procedure halts and False if it loops forever.

Now we'd know if it's safe to evaluate a web page. We just look at all the JavaScript on the web page, we call halts on it, if that returns True every time, we can totally render that web page.

Let's test our understanding of this mythical halts procedure. Over here on the left I've written 4 bits of Python code: vladimir, nabokov, pale, and fire, and fire is in red because that's warm.

One possibility is that we think vladimir halts; another possibility is that we think nabokov halts.

What I'd like you to do is tell me which of these 4 statements-- there could be multiple--are actually true. Check each box that corresponds to a true fact.

Infinite Loop.png

Solution

Tsif

Here I've written just 1 last Python program.

You've mastered all the others; this one should not be too tricky.

Tsif.png

I have this procedure, tsif, and if our magical procedure says that it halts, then we take this then branch.

Otherwise, we take this else branch.

What I'd like you to do is check this box if halts (tsif) returns True and check this other box if halts (tsif) is False.

Check all that are true.

It should be relatively direct, a 50-50 chance even if you're guessing, right? Try it out. Give it some thought.

Solution

Conclusion

So we've just finished learning about interpreters and formal semantics.

We touched briefly on the halting problem.

Surprisingly, although the halting problem officially is impossible to solve, it's actually one of my favorite areas in computer science.

The halting problem and its cousin, Rice's Theorem, are sometimes informally known as the Law of Interpreter/Writer Employability.

Because it's officially impossible to solve all these problems and get them right every time, there's always a market for people who can make closer approximations or be almost right, be right most of the time, be right with high probability.

We know it's impossible to do absolutely correctly, but nothing stops us from being 99.999% correct.

So we keep trying, with closer and closer approximations.

In fact, a few years ago, with a student, I was trying to do just that. We wanted to look at a programmed source code and guess which lines were likely to be executed frequently without actually running the program.

If we could solve this formally, we could definitely solve the halting problem.

So we know it's impossible, but we can solve it before a subset of programs or be often right.

You might imagine looking at the recursive definition for Fibonacci. If you're calling Fibonacci of ten, the recursive call happens much, much more frequently than the base case that just returns one.

Is there some way we can look at how humans write programs and figure out what's common based on that?

It turns out we were able to extract this sort of structural information or comments that are present in human written programs and figure out for them which lines are more or less likely to be executed, or executed more or less frequently at run time.

So even though the problem is impossible to solve in practice, we actually view this as an opportunity. Because it's not possible to get it exactly right, we want to come as close as we can and reach for the stars.

Solutions

Syntax vs Semantics

2 + 2, we're actually pretty sure this works out.

There's no error for this one; it's going to evaluate to 4.

Down here Mars=Earth + 1.

There is actually going to be a problem here, and the problem relates to Earth, which is a variable that's used before it's being defined, and python's going to emit some sort of run-time error saying, "I don't know what Earth is so I don't know how to assign Earth + 1 to the variable Mars." This one will result in some sort of error.

Over here, we assign Mars to be 4, and that's totally fine, and then we assign Mars to be Mars + 0 or another 4, so no errors here,

Finally, in this last one, we take a string, Mars, and we add an integer, 2, to it, and we don't expect addition to work on both strings and integers. This one also leads to an error.

Mars is sometimes called the red planet, the fourth rock from the sun, and among other things, it features very prominently in Gustav Holst's 7 movement orchestral suite, The Planets, which was written in about 1915, worth giving a listen to if you haven't yet run into it.

Syllepsis

Well, to have the tags match, we're beginning a bold here, so we'd have to end a bold there.

This second example is a bit more complicated. We have to do these things from the inside out.

We might look over and see, oh, we're ending an italics tag, so I should begin an italics tag here. But if we do that, things don't balance well. We look down and we see this bold later.

Actually, this innermost tag--the i here has to balance with the closing i, meaning there's a b here--bold--that has to balance with the closing b.

And then here we have an anchor tag beginning, so we need to have an anchor tag ending.

Syllepsis.png

Shaka was born in about 1787 and throughout his military and political career united quite a bit of the southern part of Africa, although these days his reign is viewed with some controversy.

Writing an interpreter

Alright, let's go through his together.

Checking whether or not the tags match is as simple as checking to see if the entry tag name matches the exit tag name or the beginning tag name matches the closed tag name.

If they're not the same, then we'll indicate as much by calling graphics.warning.

Otherwise, they are the same, so I just tell our graphics library that we're beginning this tag. It's a bold tag. There may be some arguments. In this case, there aren't.

I call interpret on all the subtrees.

It's convenient that we made interpret work on trees instead of single elements because now it's just 1 line to call it here.

When I'm done, and this is critical, we have to call graphics.endtag.

Otherwise, it would be run away bold. The whole rest of the webpage would be bold.

We need to stop doing whatever this tag did. We only want to apply it over the span that it controls.

# Your function should display HTML according to a given parse tree.

# graphics.warning(msg) displays an error message. Upon encountering mismatched
# tags, use graphics.warning to display the error message: "mismatched tag".

# To display a tag, use graphics.begintag(tag,args) at the start and
# graphics.endtag() at the end of the tag.

import graphics

def interpret(trees): # Hello, friend
    for tree in trees: # Hello,
        # ("word-element","Hello")
        nodetype=tree[0] # "word-element"
        if nodetype == "word-element":
            graphics.word(tree[1])
        elif nodetype == "tag-element":
            # <b>Strong text</b>
            tagname = tree[1] # b
            tagargs = tree[2] # []
            subtrees = tree[3] # ...Strong Text!...
            closetagname = tree[4] # b
            # QUIZ: (1) check that the tags match
            # if not use graphics.warning()
            #  (2): Interpret the subtree
            # HINT: Call interpret recursively
            if tagname != closetagname:
                graphics.warning("mismatched tag")
            else: # matching tags
                graphics.begintag(tagname,tagargs)
                interpret(subtrees)
                graphics.endtag()

# Note that graphics.initialize and finalize will only work surrounding a call
# to interpret

graphics.initialize() # Enables display of output.
interpret([("word-element","Hello")])
graphics.finalize() # Enables display of output.

Eval exp

So let's see one way to do it, together.

To get the value of the left_child, I'll just call eval_exp recursively.

To get the value of the right_child, I'll do the same-- but for the right-hand side of the Abstract Syntax Tree.

If the operator is (+), I'll just add them together-- returning the sum of the left_value and the right_value.

Otherwise, if the operator is (-), I'll just subtract them--return the left_value minus the right_value.

# Quiz: Eval Exp

# Write an eval_exp procedure to interpret JavaScript arithmetic expressions.
# Only handle +, - and numbers for now.

def eval_exp(tree):
    # ("number" , "5")
    # ("binop" , ... , "+", ... )
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        left_child = tree[1]
        operator = tree[2]
        right_child = tree[3]
        # QUIZ: (1) evaluate left and right child
        # (2) perform "operator"'s work
        left_val = eval_exp(left_child)
        right_val = eval_exp(right_child)
        if operator == "+":
            return left_val + right_val
        elif operator == "-":
            return left_val - right_val

test_tree1 = ("binop",("number","5"),"+",("number","8"))
print eval_exp(test_tree1) == 13

test_tree2 = ("number","1776")
print eval_exp(test_tree2) == 1776

test_tree3 = ("binop",("number","5"),"+",("binop",("number","7"),"-",("number","18")))
print eval_exp(test_tree3) == -6

To some of you, this may seem a little weird.

I'm defining (+) in terms of (+) in Python. That is, in order to explain how JavaScript works, I'm saying well, JavaScript asked you to do some Addition-- why don't you just do it in Python? But remember that an interpreter is really like a translator--so it should not be surprising.

If you were teaching someone how to speak French and they ask you what does a French word mean-- what's the French word for "big" or how do I interpret the French word "grande"-- you might tell them: oh, it's just like the word "big" in the language that we speak.

And this is that notion of their universal underlying concepts but we just have slightly different syntax for them.

In this particular case, it looks a little weird because the JavaScript and the Python syntax for (+) are exactly the same.

But you'll see slight differences later. Remember that, for example, in function declarations, JavaScript and Python do actually look different.

So it should look very similar now because the translation is very direct. This is like Romance languages that share common Latin roots.

For an "industrial strength" JavaScript interpreter, we'd want to be more careful about checking the types of, and possibly converting, the left and right arguments. For example, in JavaScript, 1+"2" yields "12". In Python, 1+"2" yields TypeError: unsupported operand type(s) for +: 'int' and 'str'. In this case we can get JavaScript's expected result by casting the left and right values to strings first: str(1)+str("2") does yield "12" in Python. These sorts of corner cases can make it complicated to implement duck typing correctly.

Context

How do we even get started? One of the common tricks for a school or exam or test questions is to work from the bottom, up and see if there's extra information you can glean.

x + something) has to be equal to (z)--ah, not super handy yet.

Something is exactly equal to "hello". Well, wait--right up here we assigned:

y = "hello".

So as long as I don't overwrite (y) with 2, this should totally work: y = "hello" because we just did the assignment a few lines up. All right.

Now let's take a look at the rest of these. I need to add something to (x)--to get (z) that suggests that (x) is smaller than (z) because we don't have any negative numbers to play with in this example. So I'm going to have to add something like 1 or 2 or whatnot to (x) to get (z).

Well, if I really think that

x < z

then this line is probably True.

But in order for this to print out True and not give some sort of error, I'm going to have to assign to the value (x) and the value (z) before I get here. So one of these will have to be (x) and one of them will have to be (z). Since I want (x) to be smaller, let's try this out:

x = 1
z = 2

All right.

So then 1 is Less Than 2--that's going to be True.

y = "hello"--that's going to be True.

And now we have: print x + (something) is equal to z.

Well, (x) is 1 and (z) is 2, so we'd really like to just write "1" in here-- but remember the rules of this puzzle-- we can only write x, y, or z so we'll write another (x). Since (x) is 1, (1 + 1) equals 2.

Context Solution.png

Variable lookup

Well, let's finish it off together.

The variable_name or identifier name--they mean mostly the same thing-- is just the oneth component of this tuple.

The zeroth part was "identifier", so the oneth part is "x". So to get out the final value, we're just going to call env_lookup, using our existing environment and the variable_name. And whatever that gives us back, we'll just return. And that's it.

Now we can handle expressions like (X + 2), assuming that we know what the value of (x) is in our environment.

# QUIZ : Variable Lookup

# Adding variable lookup to the interpreter!

def eval_exp(tree, environment):
    nodetype = tree[0]
    if nodetype == "number":
        return int(tree[1])
    elif nodetype == "binop":
        left_value = eval_exp(tree[1], environment)
        operator = tree[2]
        right_value = eval_exp(tree[3], environment)
        if operator == "+":
            return left_value + right_value
        elif operator == "-":
            return left_value - right_value
    elif nodetype == "identifier":
        # ("binop", ("identifier","x"), "+", ("number","2"))
        # QUIZ: (1) find the identifier name
        # (2) look it up in the environment and return it
        return env_lookup(environment,tree[1])

# Here's some code to simulate env_lookup for now. It's not quite what we'll be
# using by the end of the course.

def env_lookup(env,vname): 
        return env.get(vname,None)

environment = {"x" : 2}
tree = ("binop", ("identifier","x"), "+", ("number","2"))
print eval_exp(tree,environment) == 4

Evaluating statements

So let's go over the answer to this or 1 possible answer together.

We're trying to write a program to interpret if-then-else statements, and the big concept here about control flow is that we don't want to execute both the then branch and also the else branch.

Instead, we'll have to pick one.

So the first think I'm going to do is evaluate the conditional expression in the same environment in the same set of evaluations to variables that we currently have.

If it comes out true, I'll want to execute the then statements.

If it comes out false, I'll want to execute the else statements, and if you like, this if statement is the same as one where I checked to see if the result of evaluating the expression is equal to true, but if you have a little experience with propositional logic, that's actually the same as just checking the value of this variable directly.

So if the conditional expression evaluates to true, I will evaluate all of the statements in the then branch.

In our running example, that was A and B.

In our current environment there haven't been any intervening assignment statements so the value of the variables hasn't changed.

Otherwise, if the value of the conditional expression wasn't true, presumably it was false, we'll go over to the else branch, evaluate the else statements in the same environment, and note that no matter what happens, we're either evaluating the then statements or the else statements, but not both, and that decision is made based on the value of the conditional expression, and that's it.

These 4 lines suffice to evaluate an if-then-else statement in JavaScript.

# QUIZ: Evaluating Statements

def eval_stmts(tree, environment):
    stmttype = tree[0]
    if stmttype == "assign":
        # ("assign", "x", ("binop", ..., "+",  ...)) <=== x = ... + ...
        variable_name = tree[1]
        right_child = tree[2]
        new_value = eval_exp(right_child, environment)
        env_update(environment, variable_name, new_value)
    elif stmttype == "if-then-else": # if x < 5 then A;B; else C;D;
        conditional_exp = tree[1] # x < 5
        then_stmts = tree[2] # A;B;
        else_stmts = tree[3] # C;D;
        # QUIZ: Complete this code
        # Assume "eval_stmts(stmts, environment)" exists
        if eval_exp(conditional_exp, environment) == True :
            eval_stmts(then_stmts, environment)
        else:
            eval_stmts(else_stmts, environment)

def eval_exp(exp, env): 
        etype = exp[0] 
        if etype == "number":
                return float(exp[1])
        elif etype == "string":
                return exp[1] 
        elif etype == "true":
                return True
        elif etype == "false":
                return False
        elif etype == "not":
                return not(eval_exp(exp[1], env))

def env_update(env, vname, value):
        env[vname] = value

environment = {"x" : 2}
tree = ("if-then-else", ("true", "true"), ("assign", "x", ("number", "8")), ("assign", "x", "5"))
eval_stmts(tree, environment)
print environment == {"x":8}

Creating an environment

So the trick to understanding this quiz is noticing that myfun has a formal parameter named x, and when we're inside myfun, when we're inside the body, when control has reached these statements, we need x to refer to the actual argument and not the outside value.

So when we get in here, myfun x = outside x. This can't happen because we've passed in "os lusiadas", and that's going to shadow or replace the old value of x.

However, there's only 1 variable here, x in myfun(x). So y never changes. So we will print out myfun y = outside y.

Similarly, since we are passing in "os lusiadas" to myfun as x, we can print out myfun x is "os luciadas", but there is never an assignment of y to "os luciadas" so we'll never end up printing out this variant y="os lusiades".

So again, the key concept, we can have multiple values of x in different contexts. Out here, x is outside x. In here, x is the value of the actual argument.

Os lusiandas, which we often--let's say aggressively pronounce in English as The Lusiads , a Portuguese epic poem in 1572, entirely worth checking out.

Creating an Environment.png

Scope

Well, let's get started.

Taking a look down here at the bottom, one of the first things we print out is funone x is Hong Kong.

The only Hong Kong we ever see in the program is the first argument to funone so that means that this must be x.

Funone y is outside y, which means that we can't have overwritten it here. If we put y in this box, we would mistakenly get out 1997, and we don't see that. So it can't be x because we just used x; it can't be y, so let's try z. If we pass in 1997 as z, then y will still refer to this old value of outside y, we'll get the printout that we expect.

Then we call funtwo, and down here if we take a look at the printout, x remains unchanged. X is still outside x, but y is funone y. The place we see that is right here, it's the parameter being passed in.

So in order from funone y to the flow here, we have to fill in this box with y.

Note that the way scoping works, funone and funtwo are independent, so even though funone has overwritten x, funtwo still sees the old outside value.

Scope.png

Well, let's test this answer and see if it's right. I will just add a little more information, and we get exactly the output we were expecting.

Some actual information that Hong Kong is a special administrative region in China, was controlled by the UK until 1997, and then down here, just the output we had hoped to get for this puzzle.

Identifiers and storage

Now let's work through it from the top.

Initially x is 1, and then I define a new function tricky, but I haven't called it yet, so I'm not actually going to make any new variable spaces over here.

And then I print out the value of x. Currently x is 1, and now I call tricky on x, and we noted before that whenever we make a function call we get a new box corresponding to that function call.

This one knows who its parents are, and this new box is going to have room for all the formal parameters in tricky, also x, and this time we're passing in x. The current value of x is 1, huh, so we've got almost a duplicate sort of state.

Identifiers and Storage 1.png

So now we're inside tricky, and we say x=x+5. The real question is: Which one do we change? The answer is this bottom, more specific one down here. This x has shadowed, has taken the place of in our hearts, the old one. So I change just this one to 6, so here, when I'm printing out the value of x, we look in this world, we see a 6, and we print it out.

Identifiers and Storage 2.png

And now we return from tricky, and when we return from a function call, you can imagine we throw away all of its space, and we're back here in the normal world.

Identifiers and Storage 3.png

Well, what's the value of x here? It's 1. So this assignment statement, x=x+5, changed the function's actual argument but did not change this more global value of x.

We call these boxes that help us keep track of storage places for variables environments.

Kind of like before, when we were talking about state being a bit like the world, you can view a particular environment like Central Asia or Western Europe or Antarctica as being a little region in the world that might have its own values for variables. If I want to know what the temperature is, well, it depends on the current environment. Are you in the Sahara desert or are you in Northern Sweden?

Environments

Well, let's go through it together. We're right here in the hotel; we want to get the value of A. Oh, it's right here; it's 6. Similarly, if we're right here in the hotel and we want to get the value of B, we brought one with us; it's 7.

But if we want to get the value of C, do we have it in the hotel? No. Okay, let's check back home. We also don't have it here.

Okay, eventually we'll have to get it from the global environment. And then finally we want to get the value of X.

Do we have it here in the hotel? No. Do we have it back at home? Yeah; it's 55.

And in this example, we never actually got to Pebble Beach. It's a shame.

Chained environments

Well, in the global environment x is 1 and y is 2.

When we call myfun, the rules from before say that we should make storage spaces for all of the formal parameters: a, b, and x. Here's b, here's x, so this must be a.

We fill them in with the values of the actual arguments. A was 1 plus 2, that's 3. Great, that was already done. B is 3 plus 4. This evaluates to 7, so b is 7, and x was 5 plus 6, so that's 11. So, at the end of the day, this is going to print out 21.

Chained Environment.png

Greetings

One way to think about this is to remember the rules for function calls.

You take the function body and you evaluate it in the new argument.

So the body of sayhello is print greeting + " " person, and we already know from seeing it in the interpreter what it's going to print out.

It's going to print out "hello from u.p." "gracie".

Currently I don't see gracie anywhere in this environment.

That suggests that we're definitely going to need to add it.

Another way to get to that is to remember the other rules for constructing environments.

We definitely want to make space for the formal parameters, person, and put in the value of the actual argument, "gracie".

So now when we go to evaluate greeting, we don't see it here but we'll go one up and get "hello from u.p." and we'll try to find person.

We see it right here. We get "gracie".

We'll compose them together and print out exactly the same behavior we saw on the interpreter.

Greetings.png

This quiz was pretty tricky. This notion of nested procedures does not come up very often in Python. But if we want a complete interpreter, one that understands all the nuances of a language, then we have to handle this. It's kind of like the subjunctive in a lot of romance languages. It doesn't come up very often in English; may come up more often than you'd think in other places.

Environment needs

Let's actually just draw the environments to make it a little easier.

Here's our global environment. x is 11, y is 22.

Here's our new environment. It points up here. x is 33, z is 44. A little abbreviated, but you get the idea.

So if I look up x in the new environment, the answer is right here: 33.

If I look up y in the global environment, the answer is similarly right there: 22.

Now I'm going to update x in the new environment, and our procedure is, do we have a binding for x in the new environment? We do, so we update that one.

Now when I go to look up x in the new environment, there's an answer right here: 55.

When I go to look up x in the global environment, there's an answer right here: 11.

Enviroment Needs.png

Declaring and calling functions

There's nothing particularly innocuous about this statement (1). X is already in scope. In fact, it's 3, so this is going to assign x to be 4. I don't think this will prevent us from printing "sindbad".

This one (2), though, says x is x divided by 0. Division by 0 doesn't really make sense and will typically cause the Python interpreter to stop and say, "You are asking me to do impossible math," and thus not execute subsequent statements.

Assigning x the string value "return" doesn't have the same effect as actually executing a return statement, so this won't help us.

But x = y, well, if this is the entire program, then there is no y in scope, so Python will yell at us and say, "Oh, you're referencing an undefined variable."

Declaring and Calling Functions 1.png

So just to make these examples clear, let's actually see it in the interpreter.

This time I've left in x = x + 1, which should not prevent us from seeing Sindbad. And there he is.

Declaring and Calling Functions 2.png

But if I print out x = x / 0, no, we don't get Sindbad and instead we get this error message from Python, "Blah, blah, blah, blah, blah"--the part here at the bottom is the most important-- "ZeroDivisionError: integer division or modulo by zero."

Declaring and Calling Functions 3.png

Declaring and Calling Functions 4.png

Well, we did ask you to do that, so it's our own fault. Similarly, x = y we don't see Sindbad and instead we see "NameError: global name 'y' is not defined." So both of those 2 errors caused us not to see Sindbad.

Frames

Let's go through it together.

def eval_stmt(tree,environment):
    stmttype = tree[0]
    if stmttype == "call": # ("call", "sqrt", [("number","2")])
        fname = tree[1] # "sqrt"
        args = tree[2] # [ ("number", "2") ]
        fvalue = env_lookup(fname, environment)
        if fvalue[0] == "function":
            # We'll make a promise to ourselves:
            # ("function", params, body, env)
            fparams = fvalue[1] # ["x"]
            fbody = fvalue[2]
            fenv = fvalue[3]
            if len(fparams) <> len(args):
                print "ERROR: wrong number of args"
            else:
                #QUIZ: Make a new environment frame
                newenv = (fenv, {}) # new environment
                for i in range(len(args)):
                    argval = eval_exp(args[i],environment)
                    (newenv[1])[fparams[i]] = argval # add to env
                try:
                    # QUIZ : Evaluate the body
                    eval_stmts(fbody, newenv)
                    return None
                except Exception as return_value:
                    return return_value
        else:
            print  "ERROR: call to non-function"
    elif stmttype == "return": 
        retval = eval_exp(tree[1],environment) 
        raise Exception(retval) 
    elif stmttype == "exp": 
        eval_exp(tree[1],environment)

We need to make a new environment, and the parent pointer of that new environment should point to the environment the function was declared in. This is tricky, and it's easy to get this wrong. Don't use the environment we already have lying around; use the one we got out of the function. Currently it's empty. There is nothing in our mapping.

So our next step is to evaluate the actual arguments and make room for them.

The actual arguments may be as simple as the number 2, or they may be complicated mathematical expressions.

We will go over each one in turn, and for that particular argument we will evaluate it in the current environment. Not in the function declaration environment; in the current one. We haven't really made this new one yet.

And now we'll add to our environment. This line was particularly tricky because it requires you to remember what our environment really is. It's a tuple with 2 parts: a parent pointer and a dictionary. We want to get out the oneth part, the dictionary, and extend it so that when it refers to the ith formal parameter, that binds to the value of the ith actual argument.

So in our example up here with calling the function square root on 2, the parameter name is x, the actual argument value is the number 2. X is now bound to 2. fparams[i]] is the variable name x, argval is the number 2.

That's actually it for our new environment, so now we just want to evaluate the body in that environment. One line suffices.

Remember that much like the if-then-else statement we've seen before with the curly braces in JavaScript, a function body may have many statements, so we'll use our eval-stmts procedure on the body in the new environment. This is critical.

We don't want to use any of the old environments. We need the new ones where the formal parameters are bound to the actual arguments. And that's it.

# QUIZ : Frames
# Return will throw an excception
# Function Calls: new environments, catch return values

def eval_stmt(tree,environment):
    stmttype = tree[0]
    if stmttype == "call": # ("call", "sqrt", [("number","2")])
        fname = tree[1] # "sqrt"
        args = tree[2] # [ ("number", "2") ]
        fvalue = env_lookup(fname, environment)
        if fvalue[0] == "function":
            # We'll make a promise to ourselves:
            # ("function", params, body, env)
            fparams = fvalue[1] # ["x"]
            fbody = fvalue[2]
            fenv = fvalue[3]
            if len(fparams) <> len(args):
                print "ERROR: wrong number of args"
            else:
                #QUIZ: Make a new environment frame
                newenv = (fenv, {}) # new environment
                for i in range(len(args)):
                    argval = eval_exp(args[i],environment)
                    (newenv[1])[fparams[i]] = argval # add to env
                try:
                    # QUIZ : Evaluate the body
                    eval_stmts(fbody, newenv)
                    return None
                except Exception as return_value:
                    return return_value
        else:
            print  "ERROR: call to non-function"
    elif stmttype == "return": 
        retval = eval_exp(tree[1],environment) 
        raise Exception(retval) 
    elif stmttype == "exp": 
        eval_exp(tree[1],environment) 

def env_lookup(vname,env): 
        if vname in env[1]:
                return (env[1])[vname]
        elif env[0] == None:
                return None
        else:
                return env_lookup(vname,env[0])

def env_update(vname,value,env):
        if vname in env[1]:
                (env[1])[vname] = value
        elif not (env[0] == None):
                env_update(vname,value,env[0])              

def eval_exp(exp,env): 
        etype = exp[0]        
        if etype == "number":
            return float(exp[1])
        elif etype == "binop":
            a = eval_exp(exp[1],env)
            op = exp[2]
            b = eval_exp(exp[3],env)
            if op == "*":
                return a*b
        elif etype == "identifier":
            vname = exp[1]
            value = env_lookup(vname,env) 
            if value == None: 
                print "ERROR: unbound variable " + vname
            else:
                return value

def eval_stmts(stmts,env): 
        for stmt in stmts:
            eval_stmt(stmt,env) 

sqrt = ("function",("x"),(("return",("binop",("identifier","x"),"*",("identifier","x"))),),{})

environment = (None,{"sqrt":sqrt})

print eval_stmt(("call","sqrt",[("number","2")]),environment)

Function definitions

The first part is our little note to ourselves.

This is a function value.

Note that there's little repetition between the abstract syntax tree marker for this is a function element and also this is a function value. We could have made them different. It's our choice when we're writing the interpreter.

The next thing to pass in is the parameters, the next thing is the body, and the last thing is the environment in which the function was defined.

And we're done. This is how we do a function definition.

Function Definitions.png

Comparing languages

Let's go through the answer together.

When I run this program, we start out with x being 0, and then this while loop says as long as True is true, keep incrementing x.

So now x is 1, it's 2, it's 3, it's 4, it's 5, and we never actually get to this line about the print.

So does this program print 0? Not so much. Control flow goes here, here, here, then back, and we stay in this while loop forever.

It doesn't print out this huge number 4,294,967,296, which is, I think, pretty close to, let's say, 2 to the 32. This is a reasonable guess if you think, "Oh, the integer is going to max out at some value based on computer hardware." But no. There's a thought that it raises an exception. Some badly made Python interpreters might raise an exception on this, but it should not. If we had lots of computing resources, this would run forever.

And similarly, it also never prints out -1.

The correct answer is none of these things. It loops forever.

All right. So if this program naturally loops forever and we're writing an interpreter that exactly simulates this program by following all of its steps, our interpreter is also going to loop forever.

If someone had written this program in JavaScript and put it in the middle of a web page, our web browser would loop forever and never actually render the resulting web page.

Infinite loop

Going through it together, vladimir immediately returns 1, so it halts in pretty much 1 step.

So yes, this does not loop forever. To figure out the value of nabokov, we call nabokov, which causes us to call nabokov, which causes us to call nabokov. This is one of those infinite loops.

We never actually get a value out of this. We call ourselves over and over again. This procedure does not halt. You might be thinking, "Oh, eventually we will run out of stack space," or more prosaic concerns like that, but remember here I want you to think in the abstract. We're assuming this mythical halts procedure.

How about pale? X is 0, while True, x is x + 1. We actually looked at this before. This loop does not halt, does not terminate.

How about fire? We start out with x is 0 and y is 1000, and as long as x is less than y, we add 2 to x and we add y to 1.

After a little bit, this will be 2 and 1001, then this will be 4 and 1002, this will be 6 and 1003.

And although it doesn't look like it now, eventually x will catch up to y because x is growing twice as fast.

For example, after 1000 steps, x will be 2000 and y will also be 2000, at which point we'll break out of this loop.

So fire does in fact halt.

Pale Fire was a 1962 novel by Nabokov. He's perhaps more famous for Lolita. He was a Russian author who did a lot of writing in English. All right. We didn't see anything bad.

That all looked fine. We could figure out just by staring at things whether they halted or not. Why am I predicting doom and gloom? We didn't have any problems.

Tsif

Let's go through the possibilities.

Let's imagine for now that halts (tsif) is true-- that tsif terminates.

This means that after a finite number of steps, tsif must return a value.

Well, let's go through and see what that value is. We go in here, and if halts (tsif), well actually, we were just assuming that was true, so we'll definitely take this then branch. Uh-oh, we loop forever. So we assumed that tsif halted, but now we seem to be looping forever. That is a contradiction, and it makes me super unhappy.

By the way, cutely, sometimes people in mathematics-- rarely, but sometimes--draw crossed swords to show a contradiction. That's of course near and dear to my heart, so we'll use that.

So when we thought the procedure halted, its source code made it loop forever.

So if it's not one, it must be the other.

Let's try the other approach. Let's say that we think that tsif does not halt, so it should loop forever.

So let's run it and see it looping forever. We get in here, is halts (tsif) true? No, we just assumed it was false. Okay, so we go down to the else branch and we--oh--immediately return. That's kind of annoying because we were assuming that it doesn't halt--that it loops forever.

Neither of those options are possible.

This is also tricky. There is no possible value for halts that works out.

No matter what you do, it ends up being wrong.

And this is why we could never have a procedure like halts that worked in the real world.

Even if you thought it worked on some cases, there's one evil case, at least, where it doesn't. tsif--This Sentence is False.

It's a paradox, and the essence of a paradox like this is self reference.

This sentence talks about itself.

This sentence asserts its own falsehood.

Similarly, the source code for tsif talks about whether or not tsif halts.

This is the computer programming equivalent of a natural language paradox like this sentence is false.

If I ask you, is this sentence true? Well, that can't be right because then it would truthfully be asserting that it's false, so that's a contradiction.

If I ask you, is this sentence false? Well, then we'd invert it and have it be asserting that it was true.

Oh, another contradiction--none of these work.

It's the same problem in natural language or in computer science.

You may have heard a similar version of this called the Barber Paradox, which is typically phrased as something like the barber cuts the hair of each person in this village who does not cuts its own hair. The question is does the barber do a self haircut or not?

It's a lot like this sentence is false, but one level removed. You may also have heard Epimenides Paradox-- "All Cretans are liars. I'm a Cretan." Just like this sentence is false, but one step removed.

If tsif halts, then it loops forever. If tsif loops forever, then it halts. Both cases lead to a contradiction. Therefore, halts() cannot exist.

End