This document explains the basic steps involved in setting up simple video capture via the Video For Linux (VFL) API.

Provided VFL has been set up properly on your system, the file videodev.h should be on your include path at <linux/videodev.h> and the API documentation should be available locally at file:/usr/src/linux/Documentation/video4linux/API.html.

If you are accessing this document via the web, the VFL documentation is available online, along with the example source file, example.cc, and its Makefile.

First, you will need to include various header files that define the structures and constants used by VFL, as well as the functions needed to communicate with the VFL device. Place the following #include lines in every source file that will be using VFL:

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/videodev.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
Somewhere within your program, you will need to initialize video capture. It is also a good idea to test some of the capabilities of the device. The steps involved in initialization and testing are:
  1. Open the VFL device.
  2. Query the capabilities of the device and verify that the device is a video capture device (optional).
  3. Enumerate the video channels (sources of capture) available on the device (optional).
  4. Set the video channel (source of capture) on the device (optional).
  5. Set the width and height of the capture image (optional).
  6. Retrieve the actual width and height of the capture image.
  7. Set the palette and bit depth of the capture image (optional).
  8. Retrieve the actual palette and bit depth of the capture image.
At this point, your capture device should be configured for the type of images it will be capturing and you will have obtained this information from the capture device. From here, there are two ways to perform frame capturing. The first way is the simplest, but it is also the least efficient: the function we use makes our program wait for a frame, and then it copies the frame into a user-provided buffer. In addition, this method may not be supported on all devices.
  1. Obtain frames using the system call, "read".
The second method requires some additional setup, but it avoids an extra buffer copy and lets capture proceed while your program does other work. However, this method may not be supported by all devices either:
  1. Setup Memory Mapped Input/Output (MMIO) interface.
  2. Capturing using MMIO.
  3. Clean-up after using MMIO.
Whichever method you use for capture, when you are finished, the final thing you need to do is disconnect from the VFL device:
  1. Close the VFL device.

Example code.

example.cc
Makefile


1. Open the VFL device.

This step is relatively easy: use the open function to obtain a file descriptor to the device. The call to open is as follows:


int deviceHandle;
const char* deviceName = "/dev/video";
deviceHandle = open (deviceName, O_RDWR);
if (deviceHandle == -1)
{       // could not open device
}
deviceName will generally be /dev/video, but this may be different (e.g. /dev/video0) depending upon how VFL is configured on your system.

If the call to open was a success, then deviceHandle will contain a valid handle to the device. If the call failed, then deviceHandle will be set to -1. For a list of error codes or for additional information regarding open, just type man open into a terminal window.
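Since the device node may be /dev/video, /dev/video0, or a higher index depending on your configuration, a small probing loop can try the common names in turn. A minimal sketch follows; the helper names (buildDeviceName, openFirstVideoDevice) and the range of indices probed are my own assumptions for illustration, not part of the VFL API:

```c
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

// Build the name of a candidate device node.  An index of -1 yields the
// plain "/dev/video"; 0 and up yield "/dev/video0", "/dev/video1", ...
void buildDeviceName (char* buffer, size_t size, int index)
{
        if (index < 0)
                snprintf (buffer, size, "/dev/video");
        else
                snprintf (buffer, size, "/dev/video%d", index);
}

// Probe "/dev/video" and then "/dev/video0" through "/dev/video3",
// returning the first descriptor that opens, or -1 if none do.
int openFirstVideoDevice ()
{
        char name[32];
        int index;
        for (index = -1; index < 4; ++ index)
        {
                int fd;
                buildDeviceName (name, sizeof (name), index);
                fd = open (name, O_RDWR);
                if (fd != -1)
                        return fd;
        }
        return -1;
}
```

If you use something like this, remember that the descriptor it returns takes the place of deviceHandle in the remaining sections.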


2. Query the capabilities of the device (Optional).

This step may be optional, provided you know that your device can capture video to memory. Omitting this step may be suitable for prototype development, but it is a good idea to check the capabilities of the device if your program may be running on different machines or with different devices.

Communication with the VFL device is performed via calls to ioctl. Querying the capabilities of the device tells us, for example, whether this device is a video capture device and what the maximum and minimum capture dimensions are. The following code will query the device for its capabilities:


struct video_capability capability; // this structure is filled out by the ioctl call
if (ioctl (deviceHandle, VIDIOCGCAP, &capability) != -1)
{       // query was successful
}
else
{       // query failed
}
The call to ioctl will return 0 if the function was a success, and it will return -1 if the function failed.

Now our capability structure is filled out. We want to verify that this device can capture video to memory; we simply mask the type field in the capability structure with the VID_TYPE_CAPTURE flag:

if ((capability.type & VID_TYPE_CAPTURE) != 0)
{       // this device can capture video to memory
}
else
{       // this device cannot capture video to memory, exit
}
For a list of the fields of the video_capability structure, and the flags that define the type of device available, see the VFL API documentation at file:/usr/src/linux/Documentation/video4linux/API.html.


3. Enumerate the video channels (sources of capture) available on the device (Optional).

This step is optional if you know the numeric value of the channel you want to open. Again, this may be suitable for prototype development, but if your program will be running on different machines or with different capture devices, the numeric value of the channel you want may not remain constant. It is a good idea to enumerate the channels and allow the user to select which channel to use.

To enumerate the channels available on the capture device, you must first query the capabilities of the capture device. This is because the video_capability structure contains the field channels, which holds the number of channels available. Assuming you have performed step 2, code to print out the name and number of each enumerated channel follows:


struct video_channel queryChannel;
int i = 0;
while (i < capability.channels)
{
        queryChannel.channel = i;
        if (ioctl (deviceHandle, VIDIOCGCHAN, &queryChannel) != -1)
        {       // ioctl success, queryChannel contains information about this channel
                printf ("%d. %s\n", queryChannel.channel, queryChannel.name);
        }
        else
        {       // ioctl failure
        }
        ++ i;
}


4. Set the video channel (source of capture) on the device (Optional).

This step may also be optional, if you simply don't care what channel your video comes from. However, if you enumerated the channels and allowed the user to select one, or if you already know the number of the channel you want, you can set the channel that the VFL device uses. Assuming NTSC input, use the following code:


struct video_channel selectedChannel;
selectedChannel.channel = channelNumber;
selectedChannel.norm = VIDEO_MODE_NTSC;
if (ioctl (deviceHandle, VIDIOCSCHAN, &selectedChannel) == -1)
{       // could not set the selected channel
}
channelNumber is the user-selected enumerated channel, or the channel you specify. Other options for selectedChannel.norm are VIDEO_MODE_PAL, VIDEO_MODE_SECAM, and VIDEO_MODE_AUTO.


5. Set the width and height of the capture image (optional).

This section may be optional if you are willing to work with the default image width and height.

The following code will set the width and height of the capture image. Not every device supports image scaling, so it is important to test for this capability. There are several other fields within the video_window structure that might be used if, for example, you want to capture a clipped portion of the camera display; as clipping is not supported on every device, it would also be necessary to test for that capability. For more information on testing capabilities, see section 2.

The code to test for scaling capabilities (assuming you completed section 2), and then to set the image width and height is:

if ((capability.type & VID_TYPE_SCALES) != 0)
{       // supports the ability to scale captured images
        struct video_window captureWindow;
        captureWindow.x = 0;
        captureWindow.y = 0;
        captureWindow.width = width;
        captureWindow.height = height;
        captureWindow.chromakey = 0;
        captureWindow.flags = 0;
        captureWindow.clips = 0;
        captureWindow.clipcount = 0;
        if (ioctl (deviceHandle, VIDIOCSWIN, &captureWindow) == -1)
        {       // could not set window values for capture
        }
}
width and height are the desired capture width and height.


6. Retrieve the actual width and height of the capture image.

As each device may capture to a different resolution, and many devices do not support image scaling or scaling to some dimensions, a call to set the width and height of the capture image may not succeed. Therefore, it is unsafe to assume that, having set the width and height of the capture image, the captured images will actually be of that dimension. For this reason, it is necessary to query the device for the capture image dimensions. This is done with the following code:

int width;
int height;
struct video_window captureWindow;
if (ioctl (deviceHandle, VIDIOCGWIN, &captureWindow) == -1)
{       // could not obtain specifics of capture window
}
width = captureWindow.width;
height = captureWindow.height;
Again, width and height are the width and height of the captured image.


7. Set the palette and bit depth of the capture image (optional).

This section is optional if you are willing to work with the default image depth and palette of the device.

The following will set the image depth and palette for capture. Because there are many fields in the video_picture structure, we first read the default values into the structure and then set the fields whose values we want to change.

// get image properties
struct video_picture imageProperties;
if (ioctl (deviceHandle, VIDIOCGPICT, &imageProperties) != -1)
{       // successfully retrieved the default image properties

        // the following values are for requesting 8bit grayscale
        imageProperties.depth = 8;
        imageProperties.palette = VIDEO_PALETTE_GREY;
        if (ioctl (deviceHandle, VIDIOCSPICT, &imageProperties) == -1)
        {       // failed to set the image properties
        }
}


The following table shows some common replacement values for imageProperties.depth and imageProperties.palette.

Format      imageProperties.depth   imageProperties.palette
15bit RGB   15                      VIDEO_PALETTE_RGB555
16bit RGB   16                      VIDEO_PALETTE_RGB565
24bit RGB   24                      VIDEO_PALETTE_RGB24
32bit RGB   32                      VIDEO_PALETTE_RGB32

Grayscale may not be supported on all devices, and some devices might require the bit depth to match the bit depth of the desktop. For a list of all the available palette modes, see the VFL API documentation at file:/usr/src/linux/Documentation/video4linux/API.html.
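Given a bit depth from the table above, the number of bytes occupied by one pixel, and hence by a whole frame, can be computed by rounding the depth up to whole bytes. This is the same ((depth+7)>>3) arithmetic used when allocating a read buffer in section 9; frameBytes is a hypothetical helper name of my own, not part of the VFL API:

```c
// Bytes needed to hold one captured frame of the given dimensions and
// bit depth.  The depth is rounded up to a whole number of bytes, so a
// 15bit RGB frame occupies 2 bytes per pixel, a 24bit RGB frame 3.
int frameBytes (int width, int height, int depth)
{
        int bytesPerPixel = (depth + 7) >> 3;
        return width * height * bytesPerPixel;
}
```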


8. Retrieve the actual palette and bit depth of the capture image.

This is quite simple and virtually identical to the first step of section 7:

int depth;
int palette;
struct video_picture imageProperties;
if (ioctl (deviceHandle, VIDIOCGPICT, &imageProperties) == -1)
{       // failed to retrieve default image properties
}
depth = imageProperties.depth;
palette = imageProperties.palette;
If your program is expecting 24bit RGB, for example, then depth and palette should be tested for those values:

if ((depth != 24) || (palette != VIDEO_PALETTE_RGB24))
{       // not a format our program supports
}


9. Obtain frames using the system call, "read".

At this point, your capture device should be configured for the type of image it will capture, and you will have queried this information from the capture device. The simplest way to capture a frame is to perform a read from the device. This is a blocking call that returns an entire frame: it waits for a frame to be received from the capture device and then copies the frame buffer into a user-provided buffer. It is important to note that this method will not work on all devices. Also, I have not been able to test this method, because the devices I used never returned from the call. If you are concerned about this method working, or would like to perform other operations while frame capturing is taking place, do not use this method; skip this section and read the sections on Memory Mapped Input/Output (MMIO).

If you will be using this method, it is the responsibility of the caller to allocate enough space for the buffer and to pass the buffer and its length to the read function. Code to allocate the buffer follows:

// allocate a buffer for the image
int imageSize = width * height * ((depth+7)>>3);
char* image = (char*)malloc (imageSize);
if (image == 0)
{       // malloc failed
}
The above code should probably be performed during setup and not for every frame read. To read a frame use the following code:

// read a frame
if (read (deviceHandle, image, imageSize) != imageSize)
{       // the return value was not equal to the number of requested bytes
        // therefore some error occurred.
}
else
{       // successfully retrieved a frame
}
To free our allocated buffer, simply use the function free.

free (image);


10. Setup Memory Mapped Input/Output (MMIO) interface.

This method uses the Memory Mapped Input/Output (MMIO) interface to map hardware video buffers into our process memory space. From here we can obtain pointers directly to the captured buffer to perform reading.

The first step is to obtain information from the VFL device needed for MMIO:

struct video_mbuf memoryBuffer;
if (ioctl (deviceHandle, VIDIOCGMBUF, &memoryBuffer) == -1)
{       // failed to retrieve information about capture memory space
}
This structure contains the size in bytes of the memory mapped area, the number of frames buffered by the capture device, and an array of offsets into the memory mapped area, one for each frame.

The next step is to get a pointer to the memory mapped area:

// obtain memory mapped area
char* memoryMap;
memoryMap = (char*)mmap (0, memoryBuffer.size, PROT_READ | PROT_WRITE, MAP_SHARED, deviceHandle, 0);
if (memoryMap == (char*)MAP_FAILED)
{       // failed to retrieve pointer to memory mapped area
}
The pointer memoryMap and the offsets within memoryBuffer.offsets combine to give us the address of each buffered frame. For example:
Buffered Frame 0 is located at:  memoryMap + memoryBuffer.offsets[0]
Buffered Frame 1 is located at:  memoryMap + memoryBuffer.offsets[1]
Buffered Frame 2 is located at:  memoryMap + memoryBuffer.offsets[2]
etc...
The number of buffered frames is stored in memoryBuffer.frames.
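The address arithmetic above can be wrapped in a small helper that fills an array with the address of every buffered frame. This is a sketch of my own, not part of the VFL API; it takes the offsets as a plain int array so that it matches the type of memoryBuffer.offsets:

```c
// Fill framePtrs[0 .. frameCount-1] with the address of each buffered
// frame: the base of the memory mapped area plus that frame's offset.
void computeFrameAddresses (char* memoryMap, const int* offsets,
                            int frameCount, char** framePtrs)
{
        int i;
        for (i = 0; i < frameCount; ++ i)
                framePtrs[i] = memoryMap + offsets[i];
}
```

You would call it as computeFrameAddresses (memoryMap, memoryBuffer.offsets, memoryBuffer.frames, framePtrs).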

Capturing requests require the use of a video_mmap structure for each buffer. We will need to allocate these structures and fill out their fields:

// allocate structures
struct video_mmap* mmaps;
mmaps = (struct video_mmap*)(malloc (memoryBuffer.frames * sizeof (struct video_mmap)));
if (mmaps == 0)
{       // malloc failed
}

// fill out the fields
int i = 0;
while (i < memoryBuffer.frames)
{
	mmaps[i].frame = i;
	mmaps[i].width = width;
	mmaps[i].height = height;
	mmaps[i].format = palette;
	++ i;
}
The variables width, height, and palette were all obtained when setting up the capture device.



11. Capturing using MMIO.

The VFL device only begins capturing to a buffered frame when we ask it to. Once the device has captured to a buffer, it will not capture to that buffer again until we ask it to. The strategy for synchronized capture is to tell the device that it can begin capturing to every buffer except the last. We will track an index that cycles through the buffers until it reaches the last buffer and then restarts at the first. This index initially refers to the last buffer. Each time we request a new frame, we ask the device to begin capturing to the buffer we currently index, then move our index to the next buffer, and finally block until the frame at that index completes.

The following code will request capturing to each buffer except the last buffer:

int i = 0;
while (i < (memoryBuffer.frames-1))
{
        if (ioctl (deviceHandle, VIDIOCMCAPTURE, &mmaps[i]) == -1)
        {       // capture request failed
        }
        ++ i;
}
We also need an index to track which buffer we are sending capture requests to:

int bufferIndex;
bufferIndex = memoryBuffer.frames-1;
Now we can write a simple routine that will control capture requests, and will return the address of the currently available frame:

char* NextFrame()
{
        // send a request to begin capturing to the currently indexed buffer
        if (ioctl (deviceHandle, VIDIOCMCAPTURE, &mmaps[bufferIndex]) == -1)
        {       // capture request failed
        }

        // move bufferIndex to the next frame
        ++ bufferIndex;
        if (bufferIndex == memoryBuffer.frames)
        {       // bufferIndex is indexing beyond the last buffer
                // set it to index the first buffer
                bufferIndex = 0;
        }

        // wait for the currently indexed frame to complete capture
        if (ioctl (deviceHandle, VIDIOCSYNC, &mmaps[bufferIndex]) == -1)
        {       // sync request failed
        }

        // return the address of the frame data for the current buffer index
        return (memoryMap + memoryBuffer.offsets[bufferIndex]);
}
Successive calls to NextFrame will return pointers to successive frames.
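The wrap-around step inside NextFrame can be isolated into a helper and checked without any device attached; nextBufferIndex is a hypothetical name of my own mirroring the index arithmetic above:

```c
// Advance a capture buffer index by one, wrapping from the last buffer
// back to the first, exactly as NextFrame does.
int nextBufferIndex (int bufferIndex, int frameCount)
{
        ++ bufferIndex;
        if (bufferIndex == frameCount)
                bufferIndex = 0;
        return bufferIndex;
}
```

With, say, 4 buffers and a starting index of 3, repeated calls visit 0, 1, 2, 3, 0, ..., so each VIDIOCMCAPTURE request stays one buffer ahead of the VIDIOCSYNC wait.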


12. Clean-up after using MMIO.

When we are finished, we will need to free the video_mmap structures we allocated, as well as unmap the capture memory. This is quite simple:

// free the video_mmap structures
free (mmaps);

// unmap the capture memory
munmap (memoryMap, memoryBuffer.size);


13. Close the VFL device.

This is very simple:

close (deviceHandle);
Having followed the above steps, you should have video capturing working under Video For Linux. For additional features of VFL or for additional details on the features documented here, see the original VFL API documentation located at
file:/usr/src/linux/Documentation/video4linux/API.html.



Documentation by Maxwell Sayles, 2002.