OpenBSD in the Classroom

John Aycock
Department of Computer Science
University of Calgary
aycock@cpsc.ucalgary.ca

Introduction

Q: How do you get a hundred students in an operating systems class to work on real kernel code, using outdated machines and a lab barely big enough for a quarter of them?
A: Very carefully.

Like most university computer science programs, ours has a mandatory course on operating systems. It is a third-year (junior) course, with high enrollment -- 238 students over the last two semesters -- and assignments traditionally based on OS simulation or toy problems.

It would be nice to have students working with real OS code, though. Students get little experience studying large pieces of software, much less software that is well-written. There is also something to be said for working on the Real Thing rather than abstracted academic contrivances.

The kernel-hacking initiative was started here by a sessional instructor, who taught operating systems over the summer using the Linux kernel. Because it was a summer term, enrollment was relatively low and, equally important, demand for student workstations was reduced -- a bank of about 30 SPARC-5 machines was allocated for the course, and the students could modify the kernel on these machines and reboot them with impunity.

How could kernel work be done during a regular semester? There are three issues: cost, equipment, and an experienced pool of TAs.

It was the lack of an experienced TA pool that led me to OpenBSD, strangely enough. Basically, there was no Linux kernel talent to rely upon, so there was no compelling reason to stick with Linux. OpenBSD had two other advantages, from the point of view of what I wanted to teach students. First, it doesn't enjoy the same popularity as Linux, so students would have to study the code; they couldn't easily find "how-to" kernel code on the Internet or in books. Second, and I'm sure someone will flame me for saying this, the code quality tends to be better in OpenBSD than in Linux. The OpenBSD kernel would not only have to be studied, but was good to study.

The other pieces of the puzzle, cost and equipment, were solved rather serendipitously. Our campus IT department gave us some dusty PCs they considered obsolete: P166 IBMs with 64M RAM, 2G of disk, floppy and CDROM drives. Free is good.

Configuring the Machines

Our support staff set up 28 of the castoff machines in a separate lab, hidden behind a firewall which let nothing in, and allowed only outbound ssh and sftp connections.

I had two priorities for the OpenBSD configuration on these machines. First, students had to be able to rebuild the machine from an unknown state quickly. Second, kernel compiles had to happen quickly.

To rebuild the machines quickly, I created a configuration where as much of the filesystem as possible resided on CD-ROM. OpenBSD's caching of blocks from the CD-ROM gave good enough performance to make extensive use of the CD-ROM feasible. Tracking down all the programs that wanted to write to the filesystem took a while; I must confess that it took ten attempts to iron out all the details! Once I was done, the basic machine rebuilding sequence the students had to follow took a matter of minutes:

  1. Booting from floppy. The machines were first booted single-user off the floppy, mounting the root filesystem from CD-ROM. (Unfortunately, the computers were unable to boot from the CD-ROM, which would have simplified matters.) Even single-user mode seemed to want writable /dev entries, so I set up a 1000-block MFS filesystem and changed /etc/rc to untar device nodes into it (this was far faster than running MAKEDEV).
  2. Installing a writable filesystem. I put a script in /sbin which the students ran as root to do a laundry list of tasks:
    1. Run fdisk and write a new disk label, with a 512M "a" partition and a 128M swap partition. The remainder of the disk was unused, which reduced rebuilding time a bit.
    2. Run newfs on the "a" partition, and mount the resulting filesystem on /w. All things writable on the system, like /tmp, were symlinked to /w.
    3. Populate /w. This was done by untarring an uncompressed tarfile located on the CD-ROM, containing the contents of /root and /var. To permit booting from the disk, /boot, /etc/boot.conf, and a kernel image were also placed in /w.
    4. Run installboot on /w and unmount it.
  3. Booting multi-user. Students had a usable system at this point and could log in as root. No writable kernel source was installed during machine rebuilding; it was left as a separate step.
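
As an illustration only, the "symlink everything writable to /w" idea from step 2 can be sketched in Python. This is not the original script. It assumes the links are created in a staging directory before the CD-ROM image is made, and the list of directories (tmp, var, root) is inferred from the text rather than taken from the real configuration. The function name link_writable_dirs is hypothetical.

```python
import os

# Assumption: directories that must be writable at run time. The article
# names /tmp, and says /w was populated with the contents of /root and /var.
WRITABLE_DIRS = ["tmp", "var", "root"]


def link_writable_dirs(image_root, writable=WRITABLE_DIRS, target_base="/w"):
    """Create symlinks in a staging tree that point into the writable disk.

    image_root is the staging directory that will become the read-only
    root filesystem. Each link is dangling until /w is mounted on the
    booted machine, which is expected.
    """
    made = {}
    for name in writable:
        link_path = os.path.join(image_root, name)
        target = target_base + "/" + name  # path as seen on the booted system
        os.symlink(target, link_path)
        made[name] = target
    return made
```

With a layout like this, the read-only root never needs to change: programs writing to /tmp are redirected to /w/tmp on the freshly built hard-disk partition.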

Setting up a fresh, writable copy of the kernel source as quickly as possible took quite a bit of experimentation. I also wanted to have a prebuilt set of object files for the kernel to reduce kernel build time -- building a kernel from scratch on these machines took almost 15 minutes!

Using the union filesystem for kernel source would have been perfect, but it proved to be far too unstable and was abandoned. I also tried using a recursive cp, restore, and untarring symlink trees. The fastest and easiest method I found, however, was untarring a compressed version of the kernel source plus accompanying object files. The time was reduced further by omitting kernel source for architectures other than the x86. Again, I incorporated this into a script, which students would run after rebuilding the machine. This script would take about two minutes to complete.
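
The packaging side of this approach can be sketched as follows, using Python's tarfile module rather than the tar command the original script would have used. The sketch assumes the kernel tree has an arch/ subdirectory with one directory per architecture, and that the x86 port is named "i386"; both names are assumptions for illustration, as are the function names.

```python
import tarfile


def make_source_tarball(src_root, out_path, keep_arch="i386"):
    """Write a gzip-compressed tarball of src_root under the name 'sys',
    skipping every arch/<name> directory except keep_arch."""

    def keep(info):
        # Member names look like "sys/arch/sparc/..." inside the archive.
        parts = info.name.split("/")
        if len(parts) >= 3 and parts[1] == "arch" and parts[2] != keep_arch:
            return None  # returning None drops this entry and its children
        return info

    with tarfile.open(out_path, "w:gz") as archive:
        archive.add(src_root, arcname="sys", filter=keep)


def unpack_source_tarball(tar_path, dest):
    """Unpack the tarball into dest, giving a fresh writable source tree."""
    with tarfile.open(tar_path, "r:gz") as archive:
        archive.extractall(dest)
```

Object files built ahead of time would simply live inside src_root and be archived along with the source, so that a later build only recompiles what the student changes.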

Students could then modify the kernel source and build their own kernels. To test kernels, students would copy them into /w and simply boot from the hard drive.

I supplied students with a script to find files they had added or modified in the kernel source tree. As their kernel work could conceivably perturb the clock setting, basing changes on file modification times would be unwise. Instead, I precomputed an MD5 hash for each file in the source tree and stored these hashes on the CD-ROM; my script would then compute new MD5 hashes and look for differences. The output was a list of added or modified files that could be used as input to tar.
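
A minimal sketch of this hash-comparison idea, written in Python for illustration (the original script is not shown in this article, and the function names here are hypothetical):

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of one file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file path (relative to root) to its MD5 digest."""
    hashes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            hashes[os.path.relpath(full, root)] = md5_of_file(full)
    return hashes


def changed_files(baseline, root):
    """List files under root that are new or whose hash differs from baseline."""
    current = hash_tree(root)
    return sorted(path for path, digest in current.items()
                  if baseline.get(path) != digest)
```

Here the baseline would be computed once from the pristine tree and stored read-only; printing the result of changed_files one path per line gives a file list that tar can archive. Because only content is compared, a clock reset by a misbehaving kernel does not affect the result.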

Configuring the Assignments

Our operating systems course has four assignments, which students must do individually; no group work is permitted. To accommodate the sheer size of the class, I actually set up eight assignments, divided into two four-assignment "streams": an OpenBSD lab stream whose assignments had to be done in the OpenBSD lab, and a non-lab stream whose (traditional simulation-based) assignments could be done on any of the more plentiful workstations. Each student had to do one OpenBSD-stream assignment and three non-lab-stream assignments.

Each student could pick which OpenBSD assignment they wanted to do. I supplied a summary of the assignments at the beginning of the course to help them make an informed choice. In an ideal world, each student would have chosen an OpenBSD assignment whose topic interested them. I also made it clear to students in lectures that it was their responsibility to distribute themselves over the four assignments. Naive on my part, at best.

By tradition, the first operating systems assignment is an easier, introductory one; not all students know C at the start of the course, for instance. I followed this tradition, assuming that students facing a steeper learning curve would avoid the first OpenBSD assignment. I also made the final OpenBSD assignment a challenging one, to encourage students not to procrastinate.

Results and Lessons

In hindsight, what happened next was predictable. Of 97 submissions for the first assignment, 89 were for the OpenBSD assignment -- word got out that it was an easier assignment. I don't want to dwell on how 89 students crammed themselves into a lab meant for 28, though! What I am pleased to mention is that five students waited and did the final OpenBSD assignment, despite knowing that it was going to be harder than the rest.

What did I learn from this? Based on my experience and the results from a survey I gave the students, several lessons are clear.

According to the survey results, the majority of students liked working with OpenBSD kernel code, at least in principle. The real problems lay in the implementation of the idea, but with some refinements I expect using OpenBSD will be a very educational experience for the students.

Acknowledgments

Thanks to Jim Parker for suggesting the dual streams, Theo de Raadt for feedback on the assignments, and the technical support staff (especially Debbie Mazurek) for setting up the lab. Tim Williams taught the Linux version of the course. Shannon Jaeger proofread a draft of this article.