ud712 ยป



Hello, and welcome back. In this module you'll implement a recoverable virtual memory library like the one described in the lightweight recoverable virtual memory paper discussed in lecture. Your system will have an interface similar to the one provided by the coder file system developed at Carnegie Mellon. This is a non trivial exercise. And it forces you to grapple with some of the classic systems programming challenges, in representing data and marshaling it so that it can be written to disk. The harder you work to come up with an elegant solution, the more you will learn and the more you will appreciate systems programming. As usual, the videos are not intended to provide you with all the information you need. Read the instructions carefully, be prepared to look things up on your own, and write some of your own test programs. Let's get started.

RVM Design

We'll start off with a brief review of what we mean by Recoverable Virtual Memory and the architecture of the library that we're going to build. I'll start by representing the client's program and the thread of its execution over here. And let's suppose that his program manipulates some data segment, which in our example, we'll just say hello world. Our client wants this data to be durable. That is, if there's a power failure or his program crashes or something like that, the data should persist. That means that his data needs to live on some persistent storage like a disk. So far so good, but our client also wants to be able to manipulate this data as though it lived in memory. And he wants any changes that he makes to the memory to be reflected on the disk. Well, okay. There's a standard solution for this. In Unix Lex systems, it's called mmap. Which maps the storage device to a range of addresses in virtual memory. The problem with mmap for our client however, is that if he starts to make a change, but doesn't quite finish before there's a power failure or his program crashes. Then his persistent storage might end up in an inconsistent state. So maybe he tries to change hello world to hello earth, but as he's writing the third letter, we have some sort of power failure and we end up with hello earld written to persistent storage. And if this is what's written to Persistent Storage then this is what will get read back into memory when he starts up again. This data might be some very critical structure like the iNotes of a file system. And having it in an inconsistent state like this might have disasterous consequences. So instead of using mmap, our library will read the data into memory and then operate through a series of transactions where a sequence of changes gets committed atomically to disk. That is to say either all the changes happen or none of them happen. The client will specify when he wants to start a transaction with a call to the begin transaction procedure. When he wants to change something in memory, he'll make a call to about to modify, and specify the range of memory addresses that he plans on changing. Here he's starting at the sixth byte, and writing earth instead of world. Notice that while a client is in the middle of a transaction, we aren't going to apply the changes that he makes to memory to persistent storage. Instead, we wait until the end of the transaction when the client calls commit. Then we write the changes he made to something called a redo log, which lives on the disk. We can't write the changes directly to the data segment, because that write might be interrupted, creating an inconsistent state. Writing to the redo log is okay, as long as we can tell if the write was interrupted. If we find that it was interrupted, then we just don't apply the change. Assuming that we do finish the write to the log then this is going to have to get applied at some point. This process is called log truncation. It is possible that this will get interrupted, but as long as we don't delete the entry in the redo log, on restart we'll apply it, and put the data segment into a consistent state. Now it is also possible that something will happen in of a transaction that will cause the client to want to forget the transaction entirely, that is to go back to the state the memory was in before the transaction was started. For this purpose we keep an undo log in memory. The client does everything as he had done before calling about to modify to let us know the range of addresses that he plans on changing. And in this procedure we save the current contents of memory to this in memory undo log. That way when the client changes what's in memory, we have the old content preserved. If the client wants to restore the state before the transaction, he calls abort. Then what's in the undo log gets applied back to memory, putting things back in the hello world state. As it was before the transaction started. The result of all of this is quite powerful. A client of our library now has a way to treat permanent storage as if it were in memory. He gets to bash together changes that he is making into transactions so that either all the changes happen or none of them do. And if he finds himself in the middle of a transaction and decides not to go through with it, he can easily get back to the state he was in before the transaction was started.


We'll now take a quick look at the API for our rvm system which will closely resemble the one used as part of the coda file system. rvm_init should create a directory, image will put all of our files for this segment and return an rvm object that will get passed into other methods. We create a segment with rvm map with also maps an existing segment to memory. This procedure should go a lot like Nmap if you've used that before. The only changes that are made in memory aren't immediately written to disk. They only happen when a transaction is committed. Naturally, if we have map, we should also have unmap, just like for a [UNKNOWN] call we have free. Destroy, as its name suggests, destroys a data segment erasing it from the disk. Now we come to the transaction part of the system. Remember that we are after all the asset properties often associated with the database. Just add atomicity and durability. To start a transaction, use the begin transaction procedure. Specify an array of memory addresses. Each one of which should have been returned by call to M map. We use an array here, rather than a single memory address, so that we can modify more than one segment as part of a single transaction. Then, in the middle of a transaction, we make calls to, about_to_modify() where we specify the range of addresses we want to change. This should be done before the data is changed, so that the undo log will hold the proper information. Commit, actually makes the changes we've been making to the segments in the transaction permanent. This method must not return until all the changes made by the transaction have been recorded in the redo log. Now if we're in the middle of a transaction and we decide we don't want to finish it, we simply call abort, which will restore the memory to its previous state. And lastly, we have the truncate log procedure, which applies all the changes in the redo log to the data segments. Note that a client of the rvm library shouldn't have to call this. He may if he wants to in order to give help to the library about when it is a good time to truncate the log, but all he really cares about is that when he calls Mmap he gets the data with all the updates applied. I should say one more thing about the implementation. It may be tempting to ignore this ' about to modify' function and just work with the whole data segment. You could just save the whole data segment to the 'undo' log, on begin transaction, and then write the whole segment on 'commit'. But this would be terribly inefficient. Maybe this is a good first iteration through the code, but it shouldn't be your final solution. Typically, a transaction only modifies a small part of the segment. So, writing the whole thing to disk is not a good idea. In other ways it's okay for your implementation to be inefficient. It's okay to do the log truncation only last minute in M map if you like. And the sequential search to the array of addresses specified as part of the transaction is also acceptable. In fact I provided a dictionary, simple table data structure that uses sequential search for your convenience.

Testing Tips

Part of the project is for you to write your own test code for the RVM library. But I want to give you a few pointers that might make this easier. One of them is to use the fork procedure which creates a new process. A new process means a new address space but when we map from the same segment we should get the same data. That is what this RVM main program, that is provided to you, tests. This fork procedure can be a little confusing if you haven't used it before. It creates a new child process that is the same as the original. Because the original spawned the child process, it is called the parent. Now these processes are almost exactly the same, except that fork returns zero to the child process and it returns the process ID of the child process to the parent process. So in this example, they end up with different values for PID. If fork fails, then it returns negative one, we catch that error here. Then, since PID is zero for the child process, he runs this proc1 procedure and then exits, meanwhile, the parent process has some PID greater than zero, so he executes wait PID. Which just waits for the process with this ID to finish. And then he goes and executes proc2. You could achieve the same thing with two separate programs but this strategy is a little cleaner. One other thing worth nothing in this example is the use of abort in proc1. This is probably a better simulation of a power failure, because it does not do as much cleanup. Flushing of buffers, closing file descriptors, etc as a more traditional exit. Work incrementally and write lots of test code and good luck.