Big O notation is the final tool for analyzing algorithms that you need to know about before we can analyze Alice's algorithm. As we've just seen, even talking about worst-case running time can be a bit tricky if we're trying to be very precise. We have to identify the worst-case input in detail, which can be counterintuitive, and then work through a lot of different cases. And also, as we noticed, it's annoying to have to count every single step an algorithm makes. So for example, if you had two algorithms and one had a running time of 2n²+23 time steps and the other one had 2n²+27 time steps, you wouldn't really care about the 23 or the 27. You would say that essentially they have the same running time. So we are going to introduce a huge simplification for stating running times, and that is called big O notation. If you've had an algorithms course before, you should already be familiar with this, but we'll review it here just to make sure you understand it. So let's say we have two algorithms, algorithm A and algorithm B. Algorithm A has a running time of 3n²-n+10 for an input of size n, and algorithm B has a running time of 2ⁿ-50n+256. What I would like you to think about is which of these two algorithms you would prefer if you don't know anything about the input other than that it will come in different sizes, so no structural assumptions or anything else that you know. I would like you to check this box if you think that, in general, you would take algorithm A, and this box if you think it's algorithm B.
The answer is that in general I would think that algorithm A would be preferable to algorithm B, because the running time of algorithm A is 3n²-n+10 and here we have 2ⁿ, and 2ⁿ grows very fast as n increases. But of course this is only a very general statement. In practice you have to be a bit more careful about choosing algorithms even if the running times differ like this, because as we said before, we are taking worst-case running time, so maybe the average-case running time of algorithm B is much better than that of algorithm A, and also we might be dealing with very small inputs. If you knew that n, for example, does not get much bigger than 10, then those two algorithms are almost equal, and sometimes algorithm B would even be preferable. But in general, if you don't have any other information, I would go with algorithm A, because its running time is much lower than the running time here as long as n is not very small.
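To make this concrete, here is a small Python sketch (Python is just my choice here; the lecture doesn't prescribe a language) that tabulates the two worst-case step counts from the quiz. It shows both sides of the answer: for small n, algorithm B can actually be cheaper, but the 2ⁿ term quickly takes over.

```python
# Worst-case step counts taken directly from the quiz statement.
def steps_a(n):
    return 3 * n**2 - n + 10       # algorithm A

def steps_b(n):
    return 2**n - 50 * n + 256     # algorithm B

# For small n, B can win; once n grows, 2^n dominates everything.
for n in [1, 5, 10, 15, 20]:
    print(n, steps_a(n), steps_b(n))
```

For example, at n = 5 algorithm B takes 38 steps versus 80 for algorithm A, but by n = 20 algorithm B already needs over a million steps.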
So when you just compared algorithm A with algorithm B, I guess there were certain parts of the running times that you did have a look at and based your comparison on, and there were other parts that you ignored. The parts that you probably ignored are the +10 and the +256, because those are just constants added to the running time functions, and when we compared the functions, we didn't really care about the -n or the -50n either, because we just looked at the growth of the 2ⁿ versus the 3n². And even the 3 here is not that important, because if we had, say, 5n² here, or 6n², or even 100n², the comparison would still come out the same way. And this is why, in theoretical computer science, we use a notation called big O notation that leaves out all those things we didn't really care about when comparing the algorithms.
What did we just do when we determined that the running time of algorithm B grows much faster than the running time of algorithm A? Well, first of all, we said that there is some value of n, so some value for the size of the input, beyond which this running time function is always larger than this running time function. Say the running time of algorithm A is some function f(n) and the running time of algorithm B is some function g(n). If g(n) grows faster than f(n), then there must be some value of n for which g(n) is larger than f(n), and for any value larger than that, this must also be true. We are going to call that value n′. For that value n′, we must have that g(n′) is larger than f(n′), because we're saying that g(n) grows faster than f(n), and of course this must hold for all values larger than n′ as well. So we have a second condition here: for any value of n larger than n′, we must also satisfy this condition. Now, we also said that we do not want to care about constants. We do not want to care about whether this here says 3n² or 5n²; we would just say that this function basically grows like n². In order to do that, we need another number in here, and that number is a constant that allows us to scale the function g(n). I'm soon going to give you a few examples to show what that means exactly, but in general it means that if we can multiply this function here by some number, let's call that constant c, so that it outgrows f(n), then we would still be satisfied. Then we would still say that g(n) grows at least as fast as f(n). And this is all you need to know to understand big O notation, because if those two conditions here are satisfied, so there are some numbers n′ and c such that c·g(n) ≥ f(n) for all n ≥ n′, so there's some point where this function gets at least as large as this function and from that point onwards continues to be at least as large, then we would say that f(n) is contained in O(g(n)).
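To make the two conditions tangible, here is a small Python sketch of my own (not from the lecture) that checks c·g(n) ≥ f(n) over a finite range of n ≥ n′. A finite check can of course only gather evidence, not prove the "for all n" part, but it is a handy way to play with candidate values of c and n′.

```python
def witnesses_hold(f, g, c, n_prime, n_max=1000):
    """Check whether c*g(n) >= f(n) for every n with n' <= n <= n_max.
    This is only numerical evidence for f in O(g), not a proof."""
    return all(c * g(n) >= f(n) for n in range(n_prime, n_max + 1))

f = lambda n: 3 * n**2 - n + 10   # running time of algorithm A
g = lambda n: n**2                # candidate bounding function

# c = 4 and n' = 3 work: 3n^2 - n + 10 <= 4n^2 whenever n^2 + n >= 10.
print(witnesses_hold(f, g, c=4, n_prime=3))   # True
# Without scaling (c = 1), n^2 alone never catches up to 3n^2 - n + 10.
print(witnesses_hold(f, g, c=1, n_prime=1))   # False
```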
So the big O means that g(n) is a function that grows at least as fast as f(n) and this is the O that gives big O notation its name.
If the function f(n) is contained in O(g(n)), this means that g(n) grows at least as fast as f(n). Now, this might sound a little more complicated than it actually is. Let's use some examples to see how functions outgrow each other. Here I have a graph with n on the x axis, and I'm going to draw three functions on this graph. The first function is called f(n), the second function is called g(n), and the third function is called h(n). What I would like you to tell me now is which of the following statements are true. Is f(n) contained in O(g(n))? Is f(n) contained in O(h(n))? Is g(n) contained in O(f(n))? Is g(n) contained in O(h(n))? Is h(n) contained in O(f(n))? Or is h(n) contained in O(g(n))? More than one of these is true, and I would like you to check each one that is true. So every time one of these functions outgrows the other function, I would like you to check the corresponding box.
There are three statements here that are true. The first one is true because f(n) clearly grows slower than g(n), so g(n) grows at least as fast as f(n). The second one is also true because h(n) grows even faster, so it outgrows f(n) as well. Now, f(n) here is the slowest-growing function, so it does not outgrow g(n), which means g(n) is not contained in O(f(n)). But h(n), on the other hand, is the fastest-growing function, so it does outgrow g(n), and this one here is true also. And since h(n) is the fastest-growing function, it's neither in O(f(n)) nor in O(g(n)). As you can see, the slowest-growing function is always contained in O of the two faster-growing functions, and the fastest-growing function is not contained in either O(f(n)) or O(g(n)), which are growing slower.
Big O notation is very useful because we can use it to concentrate just on the fastest-growing part of a function, without all the constants involved. I now want to show you a few examples to make this more concrete. So 3n+1, for example, would be contained in O(n), because we do not care about the +1, and 3n grows at the same rate as n once we allow a constant factor, so n does grow at least as fast as 3n in this sense. The function 18n²-50 would be contained in O(n²), because again n² here is the fastest-growing term; we do not care about the -50, and we don't care about the constants. Very similarly here, if you have 2ⁿ+30n²+123, that would be in O(2ⁿ). We do not care about the constant, and those two terms here both depend on n but grow much slower than 2ⁿ. Now, if we don't add those two parts but instead multiply, say 2ⁿ·n², we need to include that in the O notation, because just writing O(2ⁿ) would not be true: 2ⁿ grows slower than 2ⁿ·n². So what about a statement like 3n+1 is contained in O(n²)? Because n² really does grow at least as fast as n, this one must also be true, and it is true. The only thing is, it would be unusual to write this, because usually we're trying to state the bound here as tightly as possible. Similarly, down here, you could also say that 18n²-50 is contained in O(n³), but again this would be unusual to write, because if we have a tighter bound that we can state, we would state this one over here. Enough with the examples; I'm going to let you figure out the rest in our next quiz.
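The examples above can also be spot-checked numerically. The following Python snippet is my own sketch; the constants c and the cutoffs are just one valid choice of witnesses, and a finite loop is evidence rather than proof.

```python
# 3n + 1 <= 4n for all n >= 1, so 3n + 1 is in O(n) with c = 4, n' = 1.
assert all(3 * n + 1 <= 4 * n for n in range(1, 1000))

# 18n^2 - 50 <= 18n^2 for all n >= 1, so 18n^2 - 50 is in O(n^2).
assert all(18 * n**2 - 50 <= 18 * n**2 for n in range(1, 1000))

# But no constant c can make c * 2^n >= 2^n * n^2 everywhere,
# because the ratio (2^n * n^2) / 2^n = n^2 is unbounded.
c = 10**6
assert any(2**n * n**2 > c * 2**n for n in range(1, 2000))

print("all spot-checks passed")
```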
We have a number of statements here, and for each of those statements, I would like you to tell me two things: first, whether the statement is correct, and second, whether the upper bound I'm giving you on the right side here is the best possible bound, or the tightest bound, that you can find. Please check all of the boxes that you think are correct.
The first one, 4n²-300n+12 ∈ O(n²), is correct because n² is the fastest-growing term. And it's also the best possible bound, because we have n² here as the fastest-growing term, which is exactly what we wrote here. The second one is also correct because n³ grows faster than n², but it's not the best possible bound we can give, because we already found out that n² is the best possible bound. Now, down here, 3ⁿ+5n²: 3ⁿ grows much faster than n², so this is not true, and of course it cannot be the best possible bound if it's not a correct bound at all. In the next one, 3ⁿ is the fastest-growing term, and this is exactly what we're stating on the right here, so it's correct, and it's also the best possible bound that we can give. The next one, again, the bound is correct. This is similar to where we compared n³ to n²: 4ⁿ grows much faster than 3ⁿ, but again, it's not the best possible bound, because we already know that 3ⁿ is the best possible bound we can give. This one down here is similar to one of the examples I just showed you: we can ignore the constant here, and 2ⁿ·n² grows faster than 2ⁿ, so this is not correct, and since it's not correct, it cannot be the best possible bound. Now, this one down here is probably a little bit tricky, because 2.1ⁿ indeed grows faster; you can see that 2.1ⁿ will outgrow this part here, 2ⁿ·n², so it is a correct bound, but it is not the best possible one, because the best possible one would be 2ⁿ·n² and not 2.1ⁿ, which grows much faster. So, as you can see, O notation is simply about looking at which part of the function grows the fastest, and then it's a very convenient way of describing the growth of that function while ignoring details that are a bit distracting and not necessary for the analysis we are going to do in this course. Now, there's one final note I should make.
Some people, instead of writing f(n) ∈ O(g(n)), also write f(n) = O(g(n)), and sometimes, though this is not very common, people will also write it as if it were a subset relation, using the ⊆ symbol. The ∈ notation here is probably the most correct way to state it, because O(g(n)) is actually a set of functions if you consider it mathematically. But the = O(g(n)) version is also a very common one that you will find in many scientific papers and even textbooks.
Big O notation is also sometimes called Landau notation, named after the German mathematician Edmund Landau. He didn't invent the notation, which was done several years before him by another German mathematician called Paul Bachmann, but he popularized it in the 20th century. In a way, it is actually quite ironic that he should have popularized a notation that contains as much approximation as big O notation, because Edmund Landau was known as one of the most exact and pedantic mathematicians. His goal was to be uncompromisingly rigorous. He even had an assistant who always attended his lectures and was instructed to interrupt him if he omitted even the slightest detail. It's interesting that he introduced a notation that omits all of the details.
You have now learned the basics of analyzing algorithms, and we introduced three simplifications that make our life going forward much easier. The first one was the RAM model, so that we can analyze algorithms without actually having to implement them. The second one was the concept of worst-case running time, so we only have to consider the worst possible inputs, not the best ones, not the average ones, just the really bad ones. And finally, we introduced big O notation, or Landau notation, to be able to ignore all unnecessary details in the algorithm analysis. Let's go back to the example from before, where we tried to count the number of times that the sequence ab appears in a string of length n using this algorithm down here. As you remember from the quiz, this was actually quite painful, because we had to figure out what the worst-case input is, and we noticed that this is not easy to figure out, and even once you have it figured out, there is a lot of detailed counting that you need to do, and there are details you need to take care of, such as how often this line here is actually executed, and so on. Now that we have big O notation available, our task is much easier, because we can just make two observations and we will be able to state the running time. The first observation is that the algorithm goes through the string character by character, and since it always just looks at a single character and the next character, the algorithm will look at each character in the input string at most twice. The second thing to notice is that each time the algorithm considers a character, so it starts at position zero, then one, two, three and so on, it performs a constant number of operations: if it finds an a, it will do either one or two additional operations, and if it does not find an a, it will do zero additional operations.
So for each character in the input string, a constant number of steps is performed, and this is exactly the advantage here, because with big O notation, we can ignore constants. Overall, this means that if you have an input of length n, the algorithm will perform a number of steps that is some constant times n, plus some constant for all the rest of the operations. But using big O notation, we already know that this is O(n). So we do not need to care anymore about details such as how often this line here is executed. We can just say that the running time of this algorithm is O(n), which would also be referred to as a linear algorithm.
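The lecture's exact code isn't reproduced in this text, but a single-pass counter of this shape (a Python sketch, with names of my choosing) shows why the running time is linear: the loop body does a constant amount of work per character.

```python
def count_ab(s):
    """Count how often the sequence 'ab' appears in s.
    One left-to-right pass; each character is involved in at most
    two comparisons, so the running time is O(n) for len(s) == n."""
    count = 0
    for i in range(len(s) - 1):
        if s[i] == 'a' and s[i + 1] == 'b':
            count += 1
    return count

print(count_ab("abcabab"))  # 3
```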
Now, let's have you figure out the running time of another algorithm using big O notation. This is the algorithm we are going to look at. It's just four lines, and the algorithm does not actually perform any very useful function; the most important part is that it's interesting to analyze its running time. So there's not really any value, or at least none that I can see, in actually producing this result down here. What I would like you to do is figure out the running time as a function of this value n here, and I'm going to give you a number of choices for your answer. The first choice is O(n), so it would be a linear algorithm. The second one is O(n^1.5). The next one is O(n²). And the final one is O(n³). I want you to check the box that states the correct running time of this algorithm expressed as a function of n, and I'm also going to give you a little hint, which is that if you add n + (n-1) + (n-2) and so on, plus 2 plus 1, that is equal to (n²+n)/2. Once you've analyzed the algorithm, you'll see that this hint is actually quite useful.
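The hint is the classic formula for the sum of the first n integers, and you can spot-check it numerically before using it (a quick Python check of my own; integer division is safe here because n² + n is always even):

```python
# Verify n + (n-1) + ... + 2 + 1 == (n^2 + n) / 2 for a range of n.
for n in range(1, 200):
    assert sum(range(1, n + 1)) == (n**2 + n) // 2
print("hint holds for n = 1 .. 199")
```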
The correct answer here is O(n²), and to see that, we just have to look at how many time steps this algorithm requires as a function of n. As you remember, this line up here takes zero time steps, because it's just a memory access, and we said those were free. This one here, it depends again on what you do with the end of the range here, but it's going to be executed about n times, so it's either n or n+1. That doesn't really make much difference here, because now we can use big O notation. How often is this inner loop here executed? The first time, it's executed n times, then the next time it's going to be executed n-1 times, and so on and so on, because this value i here increases, and as it increases, this inner loop here is going to be executed fewer and fewer times. So here, the first time we're going to have n time steps, the next time n-1 time steps, the next time n-2 time steps, and so on, and the same basically holds true for this other inner loop here: again, depending on how you count, it is going to be executed n times or n-1 times. So these two parts here, that is exactly where you need the hint: each of them contributes (n²+n)/2 steps, so together it's two times (n²+n)/2, which is n²+n. So the total number of time steps is 0 + n + n² + n, which is n² + 2n, which is O(n²). Now, as you get more familiar with analyzing algorithms, you will see that having two inner loops like this usually leads to quadratic running time, and you wouldn't even have needed this exact counting here. So with more practice, the analysis of algorithms becomes easier and easier, and you will have to do less and less of the tedious single-line counting. One final thing: of course, O(n³) is technically correct, but as I already mentioned when discussing big O notation, you want the bound to be as tight as possible.
This is really the only correct solution.
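Since the quiz algorithm itself isn't reproduced in this text, here is a Python stand-in of my own with the loop structure the solution describes: one outer loop and two inner loops whose trip counts shrink as i grows. The step counter reproduces the n² + n total for the inner loops; the outer loop's own n iterations would come on top, as in the 0 + n + n² + n count above.

```python
def inner_loop_steps(n):
    """Count the iterations of two inner loops that each run
    n, n-1, ..., 1 times as the outer index i increases.
    A stand-in for the quiz algorithm, not its exact code."""
    steps = 0
    for i in range(n):              # outer loop: n iterations
        for j in range(i, n):       # first inner loop: n - i iterations
            steps += 1
        for k in range(i, n):       # second inner loop: n - i iterations
            steps += 1
    return steps

# Two times (n^2 + n) / 2 = n^2 + n, e.g. 110 for n = 10: O(n^2) overall.
print(inner_loop_steps(10))  # 110
```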