CPSC 457: Operating Systems

Professor Carey Williamson

Winter 2010

Assignment 2 (30 marks)
Due: March 1, 2010 (11:59pm)

The purpose of this assignment is to gain familiarity with User Mode Linux (UML) and the basics of process management in the Linux kernel. You will do so by adding a new system call to the kernel to manipulate how process IDs are allocated, and then demonstrating that your system call works correctly.

PID Spacer (30 marks)

Most Linux systems assign a monotonically increasing ID number (PID) to each process that is created. PID values start at 0 (during the boot process) and increase by one for each new process until they reach 32,767. Allocation then wraps around to reuse low-numbered PIDs (near 1000) that no longer belong to active processes. (Having thousands of processes active at the same time would crash most Linux systems!)

Your task in this assignment is to modify the allocation of PIDs for user processes in a very elementary way. Rather than having new processes get consecutive PID values, you will put a space (gap) between PID values, using a simple numerical increment.

To accomplish this task, you will need to perform several steps:

  1. (2 marks) Set up your UML environment by running /usr/uml/457-init on a machine of your choice, either in the lab, at home, or on one of the department servers.
  2. (2 marks) Learn how to compile and run your own Linux kernel in the UML environment, as well as how to enter and exit UML properly. Your TAs will explain this in the UML tutorial on February 2.
  3. (2 marks) Add a new integer variable called PIDincrement to the kernel (perhaps as a global variable in an existing file, or in a new kernel source file that you create). The default initial value of this variable should be 0. You may want to compile and test your kernel at this stage to make sure that it still works.
  4. (10 marks) Add a new system call PIDspacer to the kernel so that you can adjust the value of PIDincrement at will. Your system call will take a single integer parameter, and set PIDincrement to the new indicated value. Adding a system call is a rather intricate step that involves: extending the size of the system call table in the appropriate place(s); specifying a system call number for your new system call in the kernel system call table; writing the actual code for your system call in an appropriate kernel file; and adding some (optional) debug statements to tell you when your system call is invoked, and what the new value of PIDincrement is after each call. You will definitely want to compile and test your kernel at this stage to see if it works. Repeat as many times as necessary.
  5. (4 marks) In C, write an application-layer program spacer.c that invokes your new system call. Again, some intricacy is involved here to define the proper system call number in the right place(s), so that your program can be compiled successfully inside UML. Running this program in your UML environment with your kernel version should trigger the debugging output messages that you coded in your kernel system call. Convince yourself that you are able to adjust PIDincrement at will from the command line interface. You may need to modify, compile, and test your kernel again at this stage until everything works. Repeat as many times as necessary.
  6. (4 marks) Modify the PID allocation process. Note that this is the most treacherous part of the assignment. In the kernel source code, find the relevant place where PID values are assigned when processes are created. You only need to add a line or two of code, which boosts the PID counter by PIDincrement (default value 0) after each process creation. But figuring out the correct line(s) of code, and putting them in the exact right place, is the hard (fun!) part. Compile and run your kernel to make sure that all is well.
  7. (6 marks) Show that your system works. From a terminal window, run a small set of processes, such as "sleep 10 & sleep 10 & ps" or the "./burner" example from class. The output of the ps command should show the PID values assigned to your processes. Most likely, they are consecutive. Now invoke your system call to set PIDincrement to a non-zero value, such as 1 or 10. From the same terminal window as before, run another small set of processes, and look at the PID values assigned to see whether they are spaced appropriately. Repeat the experiment with a different setting for PIDincrement, and verify that it works. Then reset PIDincrement to 0 to verify that things are back to normal. Record your screen output as part of your documentation that your system is working correctly. Include it in your assignment submission, along with your source code (kernel system call, application-layer program) and appropriate documentation (e.g., README, source code, comments, testing output, acknowledgements).

Bonus (up to 5 marks)

Comments, Tips, and Hints

This assignment is not overly difficult, but learning UML for the first time is always a challenge. Be sure to attend the UML tutorial on February 2, and make sure to start working on this assignment early. Really!!

Kernel development work is best done outside the UML environment (for reasons of convenience, time, and safety), while kernel testing has to be done inside the UML environment. This means that you occasionally need to copy files back and forth between the two environments, such as your application-layer program spacer.c. Your TAs will help explain this magic to you.

While the incremental development steps indicated above are helpful, you will undoubtedly end up crashing your kernel more than once. In most cases, crashing your UML session pops you back into regular Linux, and you can enter UML (step 2) again easily, either with the same kernel or a slightly improved one. In the worst cases, crashing your kernel means going back to step 1, and starting over from scratch by reinitializing a clean new copy of UML. Shouting "Oh CRAP! Not again!" at this point is very therapeutic, but technically ineffectual. In other words, save your work often so that you don't end up having to redo it too many times.

If you are using departmental servers (e.g., csc, cse, csl) to run UML sessions, please clean up your files in /tmp when you are all done. Each UML session there consumes about 600 MB of disk space, which means that only a few UML users can be active at a time.

Submitting Your Assignment

When you are finished, send your solutions directly to your assigned TA via email, using a single email attachment (e.g., gzipped tar file, including a README file, relevant source code, and sample output). Multiple repeated submissions from the same student are frowned upon, as are multiple email attachments. Please put an appropriate subject line on your email. Submissions must be received on or before the stated submission deadline; otherwise, a late penalty of 10% (3 marks) per day will apply.