These are draft notes from subtitles. Please help to improve them. Thank you!
Welcome back. By now you have seen all kinds of models based on differential equations and all kinds of tricks to solve them on the computer. If that was all there is to it, thousands of phD students would be jobless. In this final unit, we'll touch upon more advanced topics such as modeling buildings and air and water. We'll explore why it's hard to predict the weather, and we'll look into aspects of software engineering, that is, how to do computer simulations in a proficient way.
Let's start with a bird's eye view on the finite-element method--FEM. These days it's the workhorse of almost all mechanical engineers. FEM can answer, for instance, what happens when you put a huge truck on a bridge. To compute this deformation is application of FEM to the static case, but FEM can also be applied in a dynamic setting, for instance, to simulate the effect of a crash on a car body. Here it's important to look at the process of buckling. Whereas in the static case we're not really interested in how the truck was placed on the bridge. It just has to be there. I want to outline three fundamental ideas of the finite element method. The first is discretization. The continuous structures are approximated with the help of--guess what--finite elements-- elements of finite size, not infinitesimal size. When we do so the first question is which geometry these finite elements should have. Should they be tetrahedra? Should they be cubes? Or should they even be curvilinear? The second fundamental idea is that of interpolation. Given the finite elements, how do I compute a value at an arbitrary location? For the static case, a really fundamental idea is that of minimization of the potential energy. Think about a ball that rolls on a terrain of mountains and valleys. Eventually, it's going to come to rest in a valley. In the static case, potential energy is minimized. Let's have a closer look at that valley. In the Gedankenexperiment, it splices this object a little further to the left or a little further to the right. Then the energy stays almost the same because we're at the bottom of the valley, which means that to displace this object in this way require no work. This is the concept of virtual work. For all infinitesimal displacements that are allowed-- we can't go down and we can't go up, obviously-- the virtual work equals 0. In mathematics, this way of posing the problem with the help of virtual work is called a weak form. The strong form would be to ask for all forces to compensate. The weak form asks for the virtual work to be equal to 0 for all allowed virtual displacements. This weak form results in a finite number of equations that we can solve on the computer. This finite number, however, may range in the hundred thousands or even millions.
Now let's try out some of these ideas with the flexible rope. We'll fix both ends of that rope and want to see what the equilibrium shape is going to be under the influence of gravity. The obvious choice of finite elements is springs. Springs of a given rest length, so that the potential energy of each spring amounts to 1/2 times the spring constant times the square of the extension of the spring, which is the distance between the two endpoints minus the rest length of the string. To model the mass of that rope, we attach mass points to these strings. Of course, these mass points carry potential energy due to gravity. Given the constants that we provide, compute the potential energy of that rope. The one end of that rope will be fixed at x = 0, y = 0. The other end of that rope will be fixed at x = 1.3 m and y = 0.4 m. Our code starts with a random initialization and then applies a pretty simplistic strategy to minimize the energy. For a certain number of times it's going to pick one of the masses and change the position of that mass point by a random vector. If the energy decreases, it keeps that new position. If it doesn't, it returns to the position before. Very simple, but highly inefficient.
In the implementation we are summing up the energy for all points, which is mass times gravitation acceleration times y position for every point and the energy for every spring. The resulting diagram may be regarded as abstract art, but also shows the evolution of that rope. We are starting with a random configuration, but eventually the shape of the rope becomes plausible. Keep in mind that this is not--I repeat--not temporal evolution, even though it looks like that. This shows a path of optimization toward an equilibrium. We are not talking about forces or acceleration here. It's a search process, like finding the shortest path to connect 10 cities.
A topic highly related to the finite element method is computational fluid dynamics--CFD-- the study of gases and liquids with the help of the computer. For instance, to find the optimum shape of an airfoil or the optimum shape of a car body or to design hydraulic machinery. I've titled this section "Why CFD is hard," so now let's look into that. If we were looking at a single particle of mass m that is subject to a force F Then the rate of change of the velocity would be proportional to that force. That's one of Newton's laws--force equals mass times acceleration, which is the rate of change of velocity, the derivative of velocity with respect to time. For the fluid, something similar has to happen, but now we're not dealing with a single particle. We're dealing with a virtually infinite amount of particles. What we are working with is not the velocity of the particle. It's the velocity field. For every position in space, we specify the velocity, so the velocity that we specify is the velocity of that particle at that instant of time. Before and after, most probably, this location is going to be occupied by other particles at other times. When we write down Newton's equation for this particle-- force equals mass times the derivative of velocity with respect to time-- We have to be a little careful. Let's look at what happens after a very short time step. It's mass times 1 over the time step, and now we have to form the difference of the velocity after that time step minus the velocity before that time step. The before part is easy. That's simply our velocity field at the current time and the current location. The tricky thing is the after part. It's the velocity field at the later time--t plus time step. Now we have to take care of the fact that our particle has moved a little. We don't need the velocity field at that later time. At position x it has to be a slightly different position, namely, how far did we advance? We advanced by time step times velocity. Now, this is going to make things ugly. The velocity field of something that includes the velocity field. A function applied to itself. This is what makes things ugly and eventually leads to computational fluid dynamics and eventually leads to computational fluid dynamics being hard. If we do the math right, this becomes the following. First we have to look into the change of the velocity field with time, so we get its partial derivative with respect to time. But then we also have to look into its change with position, which is the partial derivative with respect to x, for instance. The larger the velocity is, the more effect the spatial derivative has. What we get in the end is the x component of the velocity times the partial derivative of the velocity with respect to x. Of course, the same happens with y and z. This is going to be the acceleration, and this is inherently nonlinear. We have a product of a function that we're looking for--the velocity field--with itself. This is going to make solving the differential equation that results from this really hard. Finally, however, even though the resulting equation-- the so-called Navier Stokes equation--is going to look pretty complex, it's nothing else but Newton's law applied to the velocity field.
At every point of our fluid we have a velocity, but also we have pressure. Velocity has direction. Pressure doesn't. Now let's try to find out how to incorporate pressure in our equations. To make things simple, we look at a one-dimensional situation-- a straight pipe filled with fluid. Capital a is the cross-section area of that pipe. Let's replace some amount of that fluid with the piston of width Δx, and let's assume that the pressure to the left of that piston is p₁ and the pressure to the right of that piston is p₂. Then this force is proportional to the difference of those two pressures. If p₁, the pressure on the left, is larger than p₂, the pressure on the right, the force is to the right. So we have to choose sides like that. The proportionality constant is simply the area. If you double the area, you double the force. If we say that this piston has a volume of v, then the area, of course, is the volume of the piston divided by its width. If we want to use this as part of our differential equations, what should we be writing? What's the force per volume? The partial derivative of pressure with respect to x? Minus the partial derivative of the pressure with respect to x? Half the partial derivative or the pressure divided by the square root of the area?
It's the second one--minus the partial derivative of pressure with respect to x. If you look closely into what we've got here, (p₁ - p₂) / Δx-- if this was (p₂ - p₁) / Δx, we would get the derivative, but it's the negative of it.
Because fluid dynamics is so hard, many people try to come up with toy models to at least get a glimpse of what's happening. One of these toy models--a particularly interesting one--is the Lorenz system. By the way, this name is not to be confused with that of the physicist who spelled with "tz." This is what is described by this model. There is a layer of incompressible liquid between a surface of constant high temperature and a surface of constant low temperature. We've got some melting ice-cubes here. The velocity field in that liquid is described by three highly abstract parameters, called x, y, and z. They describe the relative amount of different sorts of motion. Even though this is a very sketchy and abstract description, some people have succeeded in implementing these equations with an actual water wheel. The idea about studying these equations is that they may be telling us something about every system in which convection is present. When we plot the solutions of the Lorenz system in 3D this is what we get-- a very intricate butterfly-like pattern. Here we're seeing two solutions--one in green and one in blue-- starting from almost the same point--something to explore in the next section. The trajectories make some turns on one ring, then jump to the other ring, jump back to the first ring and so on. The number of turns they take per ring seems to be almost unpredictable. There is one more thing. Even though both trajectories have almost the same starting point, you can see that they quickly diverge in the course of time. We're going to study that in the next segment.
Now we are going to look into the strange behavior concerning the two solutions that start at almost the same point. This is about deterministic chaos. Implement the forward Euler method for the Lorenz system and do this computation for two initial conditions that are only slightly--every so slightly--different from each other. Our program will eventually plot how the distance between these two trajectories evolves. If you first want to see the butterfly, uncomment the lines below and comment the lines above.
This is a straightforward implementation of the forward Euler method. This x at the initial step is actually 2D vector. S is the y. The result will again be a 2D vector with those two values for the two trajectories in it. The resulting diagram may need some explanation. We see the distance between the x values of the two trajectories over the course of time. Initially, they are really close--10^-14. And then they grow and eventually become macroscopic--10^100. The important thing to be seeing is this linear slop. But actually it's not linear. We are working with a logarithmic vertical axis, which means that the distance grows in exponential fashion. It takes about 30 units of time to increase the error from 10^-14 to 10^-4. These are 10 orders of magnitude, a factor of 1 to 10 billion. This leads to the following conclusion: whenever there is an uncertainty about the initial conditions-- however tiny--the effect of this uncertainty will quickly grow to a visible size. So, even those the system is deterministic, once we have fixed the initial conditions, everything is determined-- even though that is the case, it's behavior looks random. This sensitive dependence on initial conditions is the hallmark of deterministic chaos. That is to be expected. Every system, which includes convection, also inherits this aspect. If somewhere a butterfly flaps its wings, some days or months afterward this is going to change the weather at some remote location on the earth.
Up to now we have only dealt with the effect of a finite time step, but actually also the representation of the values on the x axis is cross-grained. Now we have a look at what this leads to, and for simplicity it's again the forward Euler method. As a test case, we're using that satellite that's almost geostationary from Unit 2. It takes precisely 24 hours to complete one orbit around the earth. What's critical about the forward Euler method in terms of round-off error is this addition and this one. X(t) is a number of reasonable size. In this case quite an amount of meters. I'm just drawing some ghost figures here to give an impression, but the step size times the velocity is pretty small in comparison, because we're multiplying by that small step size. If the velocity is of reasonable size and we multiply by that small step size, we get a number that's rather small in comparison to x. Then we form the sum of these two numbers and get something that looks like this, but actually this number again is computed and is being stored at the position of x--the loose position in that process. Actually, you see the value of x didn't have enough position to start with. The smaller we choose the step size, the smaller this number becomes, the more grave the effect of round-off error will be. The effect of the round off error is that we actually are losing the details of velocity and force. We could have gotten away with not computing these digits anyway. To see a noticeable effect of round-off errors in a manageable time, I'm cheating a little here. I'm using standard Euler, but them I'm multiplying by a number of 1 plus tiny random number. This epsilon is defined to be 2 * 10⁻⁶ in advance. This looks as though I have a pretty huge round-off error every time. This is the result that we get. First I have to explain why we see several points for each step size. This is because we're dealing with random numbers and because I let the computer try out every step size 10 times. The overall behavior is what we know. With the forward Euler method the error depends linearly on the step size, at least for step sizes that are small enough. But for very small step sizes you see--uh-oh--the error explodes, and we have many small steps. Round-off error accumulates and destroys our result. What you should be learning from this reducing the step size only takes you so far. There is a threshold after which the error goes up again and may become severe. Let's look at the same data from a different viewpoint. Let's note that the error depends on the step size, but let's plot how the error depends on the number of steps. This is what we get. For a small number of steps, the step size is large, and the forward Euler method introduces a huge error, which then decreases in a linear fashion, but as the number of steps grows, we get more and more error due to round off. The simple model for this behavior would be to say that the error introduced by rounding is proportional to the number of steps and hence inversely proportional to the step size plus h.
Now that we have an idea about the effect of round-off error, we can actually compute some estimates. For the forward Euler method, the total error is composed of two parts-- the part due to the Euler method, the global truncation error, if you remember that term. A constant times the step size plus the error due to round off, which seems to be proportionate to the number of steps and, hence, inversely proportional to the step size Epsilon denotes the relative size of the error. It's something like 10⁻⁷ or 10⁻¹⁶. By which fraction does our number tend to change due to round off? Again, we should see proportionality here. If we have twice the amount of error in each step. We should be seeing twice the amount of total error, even though, again, the error induced in the earlier steps is growing over time. D is some proportionality constant that we don't know yet and that depends on the problem as does C. So, there is one component to the error that grows from zero to infinity on principle. There is another component to the error that starts at infinity and decays to zero. We have to have a minimum in between. This is what we saw in the experiment. It's called the step size at that minimum hmin. It does not make sense to use more steps, that is, to use a smaller step size, because the error will be growing again due to round-off. Now, here is a quiz. For the two types of floating point numbers that we're typically dealing with, we can specify the value of epsilon. It's about 10⁻⁷ for single precision. In C and related languages, that's called float. It's about 10⁻¹⁶ for double position. Python works at double position, and the arrays of numpy can be configured as to which position they should be using. Now here comes the quiz. Say we do an experiment at single position and find that the value of hmin amounts to 10⁻⁴. If we try to solve the same problem at double position, what do you expect to be the value of hmin? Pick one of these: 3 * 10⁻⁹, 4 * 10⁻¹⁰, 5 * 10⁻¹¹? Here is a hint: at the minimum, it turns out that the contribution of both parts is the same. Ch will equal Dε/h.
It's 3 x 10^-9. Let me show why. At the minimum, both contributions are equal. I'll explain why that is so in a second. So we find that the constant, C, x the value h minimum equals the constant D x ε over h minimum. From which follows that h minimum is the square root of Dε over C. ε is multiplied by a factor of 10^-9, which means that the value of h min, for double position, is multiplied by a factor of square root of 10^-9. Starting from the value of h min for single position, we have to multiply by the square root of 10^-9. In total, this is 10^-8.5. And that's about 3 x 10^-9. An important remark: This model was made for the forward Euler method. If we have a method of higher order, if we have Ch square, or even Ch 4 in here. So from this result we learn that as we go from single to double position, we may be using a step size that's about 4 to 5 orders of magnitude smaller, and hence the error will be smaller by 4 to 5 orders of magnitude. The price that we would be paying is that we have in the order of 10^4 to 10^5 more computation which looks a little prohibitive, but then again, with a method of higher order, you need half your steps than with the forward Euler method. So the affect of roundoff errors tend to be way smaller with more advanced methods. In case you're interested, let me explain why these 2 contributions are equal at the minimum. Otherwise, skip over the rest of this video. Substitute the step size h, by e^-u, with an appropriate value of u. Then the total error becomes C x e^-u + Dεe^u. Because we are dividing by that power. Now you can see that the 1st part becomes a falling exponential function, and the 2nd one becomes an increasing exponential function. Both fall, and due to symmetry, the minimum has to be where these 2 curves meet. A completely different approach to that problem is, of course, to compute the derivative and set it equal to 0.
Choosing the right algorithm and choosing an appropriate floating-point format are just 2 of the issues we encounter when we implement a solver, a program to solve a differential equation. On a higher level, one may think about reusability and maintainability. This is my replica rendering of the Plug and Play style of implementation that's used in the book "Numerical Recipes." With this type of implementation, you can mix and match components, which comes in very handy for experiments. Let's have a closer look: 1 minor but vital component is the right hand side; the right hand side of the differential equation, that is. The rate of change. It's used by the algorithm, the algorithm being, for instance, the forward Euler method, Hunt's method, and similar. The algorithm is called by the stepper. which controls which steps are being made at which size. The stepper is responsible to adapt the step sizes and maybe to reduce the steps when it finds the error has grown too large. So the algorithm is only able to do 1 single step. The stepper calls the algorithm. The driver is the interface to the user, which, of course, is a program and not a human. The driver can be sent commands to start and stop, and eventually the driver returns the data to the user. To this end, it has a subobject that's called output. The output may collect values at predefined instances of time, or it may interpolate the data it receives from the stepper to regular time intervals. The stepper will pick smaller and larger time intervals. The output may interpolate them to produce data at regular time intervals.
Given the division of labor in the solver as discussed before, which component would be affected by each of these actions? To use an implicit method? To stop at time 42? To determine the value at time 13.000? And to set a different value for the mass? Enter the first letter of that component. The driver, the stepper, the algorithm, the right-hand side, the output, or is it unclear?
Setting a different value for the mass is obviously something that has to happen in the right-hand side. The differential equations, as such, contains our physical model. Determining the value at one precise instant of time is a job for the output. Most probably it has to interpolate between values before and after, because there is no step at exactly this instant of time. So, this is O. Stopping at the given time--that's a job for the driver. It will stop the stepper. Using an implicit method is tricky. That's unclear. Somebody would be responsible to solve the equation. Who would that be? There is no explicit right-hand side of that equation.
1 decision that has to be made early on in the interpretation of a server is whether or not to use parallel computing. There are a range of ways for doing so and all of these can be combined with each other. We have vector units on the CPU, which are units that can operate on vectors; apply the same operation to say, 4 different values, at the same time. Multicore CPUs allow us to do several very complex things at the same time. The same is true for multiprocessor systems. GPUs, graphics chips, are optimized for working with huge arrays of pixels, but they can work with everything else as well. Which means, of course, they are also good at working with matrices. It's rather difficult to speed up servers for ordinary differential equations. The server does 1 step at a time, and every following step depends on the step before. How could you do some of these in parallel? That does not seem to work well. Speeding up a differential equation server for ordinary differential equations with the help of parallel computing may work, for instance in a situation where you have lots of isolated entities which rarely influence each other, such as the planets orbiting the sun. As you have seen in unit 6, it's of vital importance to speed up servers for partial differential equations. And in this case, parallel computing really comes to help. 1 option for this is to use spatial subdivision. Each processor of a multicore system can handle its own spatial domain. Treating the borders between the different domains right requires communication. This can become an expensive overhead. We'll look into that in the next quiz. When we apply finite elements or work with implicit servers for partial differential equations, we often find huge matrices. They are an ideal workload for GPUs.
Now let's look into the communication overhead required by parallel computing. Assume that we have n-squared course or processors, each of which handles its own square of our complete domain. The complete domain should consist of l * l cells. What would be a reasonable model for the time taken by the computation? A constant C by the number of course or processors and squared times the side length of the total domain plus another constant times n times l? Or constant times l-squared divided by n-squared plus another constant times (n - 1)? Or the constant times n times l-squared minus a constant times n times l? A constant times l-squared over n-squared plus a constant times (n - 1) times l-squared? Pick one.
The second one would be a good choice, and I'll explain why. The total amount of computation to be done is proportional to l², the number of cells in the domain. So if we have n² course or processors, the amount of work for each of these is proportional to l² over n². Which means that the first term describes the time taken by the computation alone. But then the processors, or the course, have to exchange data at the boundaries. Add that, and the amount of that data is proportional to n minus 1. We have n minus 1 vertical boundary lines and n minus 1 horizontal boundary lines, and it is proportional to l, because we have l cells along each boundary line. So of these expressions, the second one makes the most sense. Footnote: When we use this expression, we'll find that the optimum number of processors is proportional to the cubed root of the number of cells, l². This is a little surprising on first sight. You may expect that the optimum number is proportional to the number of cells, not proportional to its cubed root. This relationship is easy to verify by looking at the minimum of this expression.
Instead of developing your own, in many cases you can, of course, use somebody else's solver. There are lots of ready-made software for that. First of all, there is general mathematical software-- application packages, which can do differential equations, but which also can do lots of other stuff. First of all, MATLAB and its clones, such as Octave or SyLab, Wolfram Alpha if you manage to formulate your problem in one line, Mathematica, and tons of others. Second, there is dedicated modeling and simulation software, such as Simulink, which comes with MATLAB, COMSOL, Multiphysics, and Wolfram Alpha System Modeler, and PyDS Tool, which is based on, as the name says, Python. Then there are lots of code libraries that you can use to build your own application. One important library to mention here is scipy integrate-- obviously for Python again. To create plausible motion for games, there are a number of physics engines. Finally, there are cook books providing snippets of code to be reused or adapted. The most prominent one of these are the Numerical Recipes.
Most solvers available on the market do not accept the differential equation in this form-- the second-order differential equation, which is not an explicit form. What you instead have to provide is a function that returns the first derivative, the rate of change. The trick is to turn this differential equation into two differential equations of first order. Here is a hint for you--the first equation should be that the derivative of x with respect to time is some new variable called y, and now you have to provide the second equation for the derivative of y with respect to time. Should that be 7-3y+4x or 7-3 times the derivative of x plus 4x or 1/37 minus the second derivative plus 4x or 1/37 minus the first derivative of y plus 4x.
The first option is the correct one. You bring 3 times the derivative of x minus 4x over to the right hand side and write the derivative of x as y, so this becomes y and the second derivative of becomes the first derivative of y, so we end up with the first derivative of y=7-3y+4x. This is the standard trick to convert higher order differential equations into first order systems of differential equations. You maybe asking--why are we not using the second one. The solver needs the rate of change. It provides x and y, the current values so that function should depend on x and y and not on the rate of change, which is to be determined.
Writing your own software or using somebody else's software is one thing--what about getting it right? Generally, one distinguishes between three different aspects of that. First is verification, checking whether your code really implements your model. Second, calibration. Ensuring the parameters of your model so as to reproduce given observations. And the third one is validation. Check whether the real word reflects the predictions of your model. Of course, within a certain application domain and to a certain accuracy.
Verification, calibration, validation--to which of these three aspects do these three problems belong. The first one--our actual car has a braking time of more than 6 seconds. Think about Unit 5. The second one--the orbit of the spacecraft around earth is not closed. Think about Unit 2. And the third on--without harvesting the amount of fish reaches 10⁷ tons. Think about Unit 4.
If the orbit of the spacecraft is not closed even though it has to be in the model, we've got some problem with implementations, so that's a verification issue. The maximum carrying capacity was a model parameter for the fish model, so in this case the calibration was done right. And the other one is a prediction of the model doesn't work out well, so that's a validation issue.
Here comes one idea about how to test the implementation of a differential equation solver. We provide the solver that we've never covered before that looks pretty complex. The question is what's the order of that solver. Enter that order here. The problem that we're using is again the satellite the takes 24 hours to orbit the earth and we are comparing its final position to its start position and plot the arrow. From that graph, determine the order of that solver.
Look closely--both axis are logarithmic and you can see as the step size increases by a factor of 10, the error increases by a factor of 10⁵. Remember, forward Euler was of order 1, Heun was of order 2. This is the solver of order 5. Read this the other way around as you shrink the step size by a factor of 10, the error decreases by a factor of 10⁵--that's a huge gain and you can see that this solver, when it takes steps of 3 minutes around earth gets back to where it started within just millimeters. This is quite an efficient method. It's called the Runge-Kotta feedback method. Now assume that with all these strange numbers we mistyped one-- mainly we didn't enter 56,430 but we entered 56,431. Let's look at the result. This doesn't look that great anymore. We've got an error that's well above 1 km, so getting these numbers right, which among other things belong to verification.
There is one fundamental issue about calibration and validation, namely that prediction does not always go hand in hand with understanding. Look at these two models for the motion of a planet in the solar system. Let's look at the left model--here the earth is fixed in space and close to the earth--thus the center of some large circle. This point moves around that circle and a point on that second circle, which then moves in this direction. This is called an epicycle, and we can continue that process. That point could be the center of another circle and so on and so on until finally we say that the planet is moving around that final circle. That's one of the antique models and let's have a look at the right one. Here the sun is at the center, the earth moves in a circle around the sun, and that planet moves in circle. Here comes the last quiz for you--which of these models allows a more flexible calibration and which of these two models allows a more accurate prediction of the apparent motion of the planet as seen from earth.