
How to write a Linux VFS filesystem module - StaticFS - files

March 12, 2004



Files are next, so let's take a look at struct file_operations.
/*
 * NOTE:
 * read, write, poll, fsync, readv, writev can be called
 *   without the big kernel lock held in all filesystems.
 */
struct file_operations {
        struct module *owner;
        loff_t (*llseek) (struct file *, loff_t, int);
        ssize_t (*read) (struct file *, char *, size_t, loff_t *);
        ssize_t (*write) (struct file *, const char *, size_t, loff_t *);
        int (*readdir) (struct file *, void *, filldir_t);
        unsigned int (*poll) (struct file *, struct poll_table_struct *);
        int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
        int (*mmap) (struct file *, struct vm_area_struct *);
        int (*open) (struct inode *, struct file *);
        int (*flush) (struct file *);
        int (*release) (struct inode *, struct file *);
        int (*fsync) (struct file *, struct dentry *, int datasync);
        int (*fasync) (int, struct file *, int);
        int (*lock) (struct file *, int, struct file_lock *);
        ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, loff_t *);
        ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, loff_t *);
        ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
        unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned l\
};
Note that I copied the comment above the structure as well. The big kernel lock, or BKL, has come up many times as I've trawled through the kernel source. From what I understand, it is a single global lock that keeps all sorts of other things, maybe everything, from happening in the kernel at the same time. We haven't had to learn about the BKL yet, so let's keep it that way for now.

If you remember back to our superblock code, we decided to make two structures of file operations - one for files and one for directories. This is similar to ramfs, but not romfs, which only has one for directories (assuming that all the defaults are good enough for romfs).

But, of course, I have to wonder what the defaults are. So many of the fields in the above structure aren't set in ramfs and romfs, so what does that mean? For most that are set, the functions used are the generic_*() ones, as we saw when we were first looking at the file structures. I said then that we were going to look at them later. Perhaps now?

Yuck. No thanks. If you've the stomach for it, take a look in /usr/src/linux/mm/filemap.c at generic_file_read() and generic_file_write(). Lovely. For now I'll go on faith that they work ... somehow.

Okay, enough moaning. Let's look at that structure. Most of these functions are familiar to a C programmer, because they have the same names as the calls programmers use at the file level. Because of this, there's no need to go into what each does. Let's just concentrate on what romfs does for now, because we're very similar to it.
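To make that correspondence concrete, here is a small user-space sketch (not kernel code). The comments note which file_operations field each familiar call ends up in. The helper name read_tail and the scratch-file path are made up for the illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write `text` to a scratch file, then read back
 * everything from byte `skip` onward into `out` (NUL-terminated).
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t read_tail(const char *text, off_t skip, char *out, size_t outlen)
{
        char path[] = "/tmp/staticfs_demo_XXXXXX";
        ssize_t n = -1;
        int fd;

        assert(outlen > 0);

        fd = mkstemp(path);                     /* open(2)  -> f_op->open   */
        if (fd < 0)
                return -1;
        unlink(path);                           /* file lives until close   */

        if (write(fd, text, strlen(text)) < 0)  /* write(2) -> f_op->write  */
                goto done;
        if (lseek(fd, skip, SEEK_SET) < 0)      /* lseek(2) -> f_op->llseek */
                goto done;
        n = read(fd, out, outlen - 1);          /* read(2)  -> f_op->read   */
        if (n >= 0)
                out[n] = '\0';
done:
        close(fd);                              /* close(2) -> flush, release */
        return n;
}
```

Calling read_tail("hello world", 6, buf, sizeof buf) from a main() should leave the tail of the string in buf, having passed through open, write, llseek, read, flush and release along the way.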

romfs fills in only two of the fields, read and readdir. read is assigned generic_read_dir(), but why? What if it were left blank? Let's take a look at that generic function.

ssize_t generic_read_dir(struct file *filp, char *buf, size_t siz, loff_t *ppos)
{
        return -EISDIR;
}
Interesting! This is what happens when you try to read a directory like a file! You get a nice error saying EISDIR ("Is a directory"). Does that mean that if I put my own function here, I can return "content" for a directory and still have it act as a directory? This was the question I had earlier, about directories being able to be "viewed" as well as "entered". Excellent. We'll certainly come back to this later.
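The effect of generic_read_dir() can be seen from user space with a few lines of C. This is only a sketch, assuming a Linux system whose directory read method behaves like the function above. try_plain_read is a made-up helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` read-only and attempt a plain read().
 * Returns 0 if the read succeeded, otherwise the errno value from the
 * call that failed.
 */
static int try_plain_read(const char *path)
{
        char buf[64];
        int fd, err = 0;

        fd = open(path, O_RDONLY);      /* opening a directory is allowed */
        if (fd < 0)
                return errno;
        if (read(fd, buf, sizeof buf) < 0)
                err = errno;            /* save before close() can change it */
        close(fd);
        return err;
}
```

On such a system, try_plain_read("/") should come back with EISDIR, which is the same complaint cat prints when it is pointed at a directory.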

It still makes me wonder: what if we didn't use this function for the read field? What does NULL mean? Would it return just nothing? Would VFS call some other function? It might be interesting to edit the ramfs module and see what happens when I try to cat a directory there. We'll assume for now, then, that the behavior of saying "hey, that's a directory" must be supplied, and isn't the default.
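That assumption can be sketched with a user-space mock of the dispatch step. It assumes, from a reading of sys_read() in the 2.4 tree's fs/read_write.c rather than from anything tested here, that a NULL read makes the call fail with EINVAL instead of falling back to a default. The cut-down structures and the names mock_vfs_read and demo_read_dir are invented for the example, and off_t stands in for the kernel's loff_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Cut-down stand-ins for the kernel types; only what the sketch needs. */
struct file;

struct file_operations {
        ssize_t (*read)(struct file *, char *, size_t, off_t *);
};

struct file {
        const struct file_operations *f_op;
        off_t f_pos;
};

/* Same body as generic_read_dir(): always refuse with EISDIR. */
static ssize_t demo_read_dir(struct file *filp, char *buf, size_t siz, off_t *ppos)
{
        (void)filp; (void)buf; (void)siz; (void)ppos;
        return -EISDIR;
}

/*
 * Mock of the dispatch step (an assumption, see the text above):
 * a missing read method yields -EINVAL, so "that's a directory" has
 * to be supplied by the filesystem itself.
 */
static ssize_t mock_vfs_read(struct file *filp, char *buf, size_t count)
{
        if (!filp->f_op || !filp->f_op->read)
                return -EINVAL;
        return filp->f_op->read(filp, buf, count, &filp->f_pos);
}
```

If that reading is right, leaving read blank would still produce an error when you cat a directory, just a less helpful one than EISDIR.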

March 15, 2004

The other field, readdir, is filled with a home-brewed function, romfs_readdir():

static int
romfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
{
        struct inode *i = filp->f_dentry->d_inode;
        struct romfs_inode ri;
        unsigned long offset, maxoff;
        int j, ino, nextfh;
        int stored = 0;
        char fsname[ROMFS_MAXFN];       /* XXX dynamic? */

        maxoff = i->i_sb->u.romfs_sb.s_maxsize;

        offset = filp->f_pos;
        if (!offset) {
                offset = i->i_ino & ROMFH_MASK;
                if (romfs_copyfrom(i, &ri, offset, ROMFH_SIZE) <= 0)
                        return stored;
                offset = ntohl(ri.spec) & ROMFH_MASK;
        }

        /* Not really failsafe, but we are read-only... */
        for(;;) {
                if (!offset || offset >= maxoff) {
                        offset = maxoff;
                        filp->f_pos = offset;
                        return stored;
                }
                filp->f_pos = offset;

                /* Fetch inode info */
                if (romfs_copyfrom(i, &ri, offset, ROMFH_SIZE) <= 0)
                        return stored;

                j = romfs_strnlen(i, offset+ROMFH_SIZE, sizeof(fsname)-1);
                if (j < 0)
                        return stored;

                romfs_copyfrom(i, fsname, offset+ROMFH_SIZE, j);

                ino = offset;
                nextfh = ntohl(ri.next);
                if ((nextfh & ROMFH_TYPE) == ROMFH_HRD)
                        ino = ntohl(ri.spec);
                if (filldir(dirent, fsname, j, offset, ino,
                            romfs_dtype_table[nextfh & ROMFH_TYPE]) < 0) {
                        return stored;
                }
                offset = nextfh & ROMFH_MASK;
        }
}
This function is probably the most work we've seen out of any part of the VFS. Here we have to fill in a dirent structure, passed in as a void pointer, using a filldir() function, passed in as a filldir_t. It's interesting that neither the structure to be filled nor the function that fills it is fixed; both are handed in by the caller. Quite the forethought, I suppose. It makes me wonder whether the VFS already uses multiple structures and functions here, or if it's just planning for the future. Perhaps I'll look into it later.

Looking through this function, it grabs the needed inode and looks at the offset in the file pointer. The offset looks like it's probably module-dependent, so you can use the number of bytes into your directory file or, in my case, an index, because I don't really have a file to read. At least, that's how I'm hoping it works.

It also looks like I have to be able to handle the filldir() function returning less-than-zero, which is probably for both errors and "I'm too full" messages. This is why we'd have to support a call with an offset -- in case we had to stop sending back contents halfway through. This is likely not going to be a problem for us, since we have at most two entries in a directory.
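The stop-and-resume idea can be tried out in user space before touching the kernel. The following is a mock under stated assumptions, not the real VFS interface: mock_filldir_t is a simplified stand-in for filldir_t, struct mock_dirbuf is a hypothetical caller buffer with limited room, and the three-entry table is only there so that a resume actually happens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's filldir_t callback type. */
typedef int (*mock_filldir_t)(void *buf, const char *name, int namlen,
                              long offset, unsigned long ino, unsigned type);

/* Hypothetical caller-side buffer that only has room for `room` names. */
struct mock_dirbuf {
        char names[8][16];
        int count;
        int room;
};

/* Stores one entry, or returns <0 to say "I'm full, stop here". */
static int mock_filldir(void *buf, const char *name, int namlen,
                        long offset, unsigned long ino, unsigned type)
{
        struct mock_dirbuf *d = buf;

        (void)offset; (void)ino; (void)type;
        if (d->count >= d->room || d->count >= 8 || namlen >= 16)
                return -1;
        memcpy(d->names[d->count], name, namlen);
        d->names[d->count][namlen] = '\0';
        d->count++;
        return 0;
}

/*
 * Index-based readdir over a fixed table. *pos plays the role of
 * filp->f_pos: it advances only after an entry has been accepted, so a
 * later call picks up exactly where the previous one stopped.
 */
static int mock_readdir(long *pos, void *buf, mock_filldir_t filldir)
{
        static const char *const entries[] = { "a", "b", "c" };
        const long n = sizeof entries / sizeof entries[0];
        int stored = 0;

        while (*pos < n) {
                const char *name = entries[*pos];

                if (filldir(buf, name, (int)strlen(name), *pos, 1, 0) < 0)
                        break;
                (*pos)++;
                stored++;
        }
        return stored;
}
```

With room for two names, the first call stores "a" and "b" and leaves the position at 2. After the buffer is emptied, a second call with the same position stores "c", and a third call stores nothing.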

Let's start writing.

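static int staticfs_readdir(struct file *fp, void *dirent, filldir_t filldir) {
  struct inode *i=fp->f_dentry->d_inode;
  unsigned long offset=fp->f_pos;
  int ino=i->i_ino;
  int stored = 0;
  unsigned char ftype=DT_UNKNOWN;
  char fsname[2];

  for (;;) {
    /* start each pass with an empty name so "no such entry" is detectable */
    fsname[0]='\0';

    switch (ino) {
    case 0:
      switch (offset) {
      case 0:fsname[0]='a';fsname[1]='\0';ftype=DT_REG;break;
      case 1:fsname[0]='b';fsname[1]='\0';ftype=DT_DIR;break;
      }
      break;
    case 2:
      switch (offset) {
      case 0:fsname[0]='c';fsname[1]='\0';ftype=DT_REG;break;
      }
      break;
    }

    /* ran off the end of this directory's entries */
    if (fsname[0]=='\0') {
      return stored;
    }

    /* caller is full (or errored): stop here, f_pos marks where to resume */
    if (filldir(dirent,fsname,1,offset,ino,ftype)<0) {
      return stored;
    }

    /* entry accepted: count it and advance the index kept in f_pos */
    stored++;
    offset++;
    fp->f_pos=offset;
  }
}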
This is a bit simpler than the romfs version, especially since our staticfs is exactly that -- static. With our functions written, we should populate our structure.
static struct file_operations staticfs_dir_operations = {
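The initializer is cut off in this copy, so what follows is a guess rather than the original listing. By analogy with romfs, which fills in only read and readdir, a directory table would plausibly set just those two fields. The sketch below uses user-space stand-ins so that it compiles outside the kernel: the reduced struct file_operations, the stub functions and the name sketch_dir_operations are all assumptions, and off_t stands in for loff_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* User-space stand-ins so the sketch compiles outside the kernel. */
struct file;
typedef int (*filldir_t)(void *, const char *, int, off_t, ino_t, unsigned);

struct file_operations {
        off_t   (*llseek)(struct file *, off_t, int);
        ssize_t (*read)(struct file *, char *, size_t, off_t *);
        ssize_t (*write)(struct file *, const char *, size_t, off_t *);
        int     (*readdir)(struct file *, void *, filldir_t);
};

/* Stubs standing in for generic_read_dir() and staticfs_readdir(). */
static ssize_t stub_read_dir(struct file *f, char *b, size_t n, off_t *p)
{
        (void)f; (void)b; (void)n; (void)p;
        return -EISDIR;
}

static int stub_readdir(struct file *f, void *dirent, filldir_t fill)
{
        (void)f; (void)dirent; (void)fill;
        return 0;
}

/* Only two fields are named; every other member is left NULL. */
static const struct file_operations sketch_dir_operations = {
        .read    = stub_read_dir,
        .readdir = stub_readdir,
};
```

Kernel sources of that era write the same thing with the older GNU field: value syntax (for example read: generic_read_dir) rather than C99 designated initializers, but the effect is the same: unnamed fields end up NULL.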
One last thing to do, I think: our address space functions.
©2002-2018 Wayne Pearson