cs101 ยป

CS101 - Unit 3 Homework

Question 1 - Search Engines and the Web

Fill in initial values of the elements of p to produce the behaviour shown below:

``````p = [?,?,?]

p[0] = p[0] + p[1]
p[1] = p[0] + p[2]
p[2] = p[0] + p[1]

print p
``````

When your values of p are acceptable, the output should be:

``````[1, 2, 3]
``````

Question 2 - Mutating Lists

Which of the following procedures leaves the value passed in as the input unchanged?

``````def proc1(p):
p[0] = p[1]

def proc2(p):
p = p + [1]

def proc3(p):
q = p
p.append(3)
q.pop()

def proc4(p):
q = []
while p:
q.append(p.pop())
while q:
p.append(q.pop())
``````

Question 3 - Product Lists

``````# Define a procedure, product_list,
# takes as input a list of numbers,
# and returns a number that is
# the result of multiplying all
# those numbers together.

def product_list():

#print product_list([9])
#>>> 9

#print product_list([1,2,3,4])
#>>> 24

#print product_list([])
#>>> 1
``````

Question 4 - Greatest

``````# Define a procedure, greatest,
# that takes as input a list
# of positive numbers, and
# returns the greatest number
# in that list. If the input
# list is empty, the output
# should be 0.

def greatest():

#greatest([4,23,1])
#>>> 23
#greatest([])
#>>> 0
``````

Question 5 - Lists of Lists

``````# Define a procedure, total_enrollment,
# that takes as an input a list of elements,
# where each element is a list containing
# three elements: a university name,
# the total number of students enrolled,
# and the annual tuition fees.

# The procedure should return two numbers,
# not a string,
# giving the total number of students
# enrolled at all of the universities
# in the list, and the total tuition fees
# (which is the sum of the number
# of students enrolled times the
# tuition fees for each university).

udacious_univs = 'Udacity',90000,0

usa_univs = [ ['California Institute of Technology',2175,37704],
['Harvard',19627,39849],
['Massachusetts Institute of Technology',10566,40732],
['Princeton',7802,37000],
['Rice',5879,35551],
['Stanford',19535,40569],
['Yale',11701,40500]  ]

def total_enrollment():

#print total_enrollment(udacious_univs)
#>>> (90000,0)

# The L is automatically added by Python to indicate a long
# number. If you are trying the question in an outside
# interpreter you might not see it.

#print total_enrollment(usa_univs)
#>>> (77285,3058581079L)
``````

Question 6 - Max Pages

``````# The web crawler we built at the end of Unit 2 has some serious
# flaws if we were going to use it in a real crawler. One
# problem is if we start with a good seed page, it might
# run for an extremely long time (even forever, since the
# number of URLS on the web is not actually finite). The final two
# questions of the homework ask you to explore two different ways
# to limit the pages that it can crawl.

# Modify the crawl_web procedure to take a second parameter,
# max_pages, that limits the number of pages to crawl.
# Your procedure should terminate the crawl after
# max_pages different pages have been crawled, or when
# there are no more pages to crawl.

# The following definition of get_page provides an interface
# to the website found at http://www.udacity.com/cs101x/index.html

# The function output order does not affect grading.

def get_page(url):
try:
if url == "http://www.udacity.com/cs101x/index.html":
return ('<html> <body> This is a test page for learning to crawl! '
'<p> It is a good idea to '
'<a href="http://www.udacity.com/cs101x/crawling.html">learn to '
'crawl</a> before you try to  '
'<a href="http://www.udacity.com/cs101x/walking.html">walk</a> '
'or  <a href="http://www.udacity.com/cs101x/flying.html">fly</a>. '
'</p> </body> </html> ')
elif url == "http://www.udacity.com/cs101x/crawling.html":
return ('<html> <body> I have not learned to crawl yet, but I '
'am quite good at '
'<a href="http://www.udacity.com/cs101x/kicking.html">kicking</a>.'
'</body> </html>')
elif url == "http://www.udacity.com/cs101x/walking.html":
return ('<html> <body> I cant get enough '
'<a href="http://www.udacity.com/cs101x/index.html">crawling</a>! '
'</body> </html>')
elif url == "http://www.udacity.com/cs101x/flying.html":
return ('<html> <body> The magic words are Squeamish Ossifrage! '
'</body> </html>')
except:
return ""
return ""

def get_next_target(page):
return None, 0
end_quote = page.find('"', start_quote + 1)
url = page[start_quote + 1:end_quote]
return url, end_quote

def union(p,q):
for e in q:
if e not in p:
p.append(e)

while True:
url,endpos = get_next_target(page)
if url:
page = page[endpos:]
else:
break

def crawl_web(seed,max_pages):
tocrawl = [seed]
crawled = []
while tocrawl:
page = tocrawl.pop()
if page not in crawled:
crawled.append(page)
return crawled

#print crawl_web("http://www.udacity.com/cs101x/index.html",1)
#>>> ['http://www.udacity.com/cs101x/index.html']

# print crawl_web("http://www.udacity.com/cs101x/index.html",3)
#>>> ['http://www.udacity.com/cs101x/index.html',
#>>> 'http://www.udacity.com/cs101x/flying.html',
#>>> 'http://www.udacity.com/cs101x/walking.html']

#print crawl_web("http://www.udacity.com/cs101x/index.html",500)
#>>> ['http://www.udacity.com/cs101x/index.html',
#>>> 'http://www.udacity.com/cs101x/flying.html',
#>>> 'http://www.udacity.com/cs101x/walking.html',
#>>> 'http://www.udacity.com/cs101x/crawling.html',
#>>> 'http://www.udacity.com/cs101x/kicking.html']
``````