We have developed a novel vision system that can recognize people by the way they walk. The system computes optical flow for an image sequence of a person walking, and then characterizes the shape of the motion with a set of sinusoidally-varying scalars. Feature vectors composed of the phases of the sinusoids are able to discriminate among people.
Our goal is to describe the motion of a moving human figure in order to recognize individuals by variation in the characteristics of the motion description. We begin with a short sequence of images of a moving figure, taken by a static camera, and derive dense optical flow data, (u(x,y),v(x,y)), for the sequence. We determine a range of scale-independent scalar features of each flow image that characterize the spatial distribution of the flow, i.e., the shape of the motion. The scalars are based on various moments of the set of moving points. To characterize the shape of the motion, not the shape of moving points, the points are weighted by |u|, |v|, or |(u,v)|.
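The effect of the weighting can be illustrated with a minimal sketch that computes the centroid of the moving points in one flow image, either unweighted or weighted by |u|, |v|, or |(u,v)|. The function name `weighted_centroid`, the `threshold` parameter that decides which pixels count as moving, and the use of nested lists for the flow fields are assumptions made for this illustration; the exact definitions of the system's scalars are not given here.

```python
import math


def weighted_centroid(u, v, weight="mag", threshold=0.0):
    """Centroid (x, y) of the moving points in one flow image.

    u, v:      equal-sized 2-D lists holding the flow components.
    weight:    "none" describes the shape of the moving points;
               "u", "v", or "mag" describe the shape of the motion.
    threshold: flow magnitude above which a pixel counts as moving
               (a hypothetical parameter for this sketch).
    Returns None when the total weight is zero.
    """
    total = sum_x = sum_y = 0.0
    for y, (row_u, row_v) in enumerate(zip(u, v)):
        for x, (fu, fv) in enumerate(zip(row_u, row_v)):
            mag = math.hypot(fu, fv)
            if mag <= threshold:
                continue  # pixel is not moving
            w = {"none": 1.0, "u": abs(fu), "v": abs(fv), "mag": mag}[weight]
            total += w
            sum_x += w * x
            sum_y += w * y
    if total == 0.0:
        return None
    return sum_x / total, sum_y / total
```

With two moving pixels in a row, one moving three times as fast as the other, the unweighted centroid lies midway between them while the magnitude-weighted centroid is pulled towards the faster pixel. Higher-order moments could be accumulated in the same loop.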
We then analyze the periodic structure of these sequences of scalars. All of these sequences share the same fundamental period of the gait, but they differ in phase. Although there are several regularities in the relative phases of the signals, some of the phases show significant statistical variation. Therefore, we are able to use vectors of phase measurements derived for each image sequence to recognize individuals by the shape of their motion.
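One way to obtain such a phase measurement is sketched below: remove a linear trend from each scalar sequence, locate the dominant frequency of a reference signal with a discrete Fourier transform, and report the phase of another signal relative to the reference at that frequency. This is a sketch under stated assumptions, not the system's actual frequency-fitting procedure; the function names are hypothetical, and picking the largest DFT bin is a simplification that assumes the sequence spans roughly a whole number of gait cycles.

```python
import cmath
import math


def remove_linear_trend(signal):
    """Subtract the least-squares straight line from a sequence."""
    n = len(signal)
    t_mean = (n - 1) / 2.0
    s_mean = sum(signal) / n
    num = sum((t - t_mean) * (s - s_mean) for t, s in enumerate(signal))
    den = sum((t - t_mean) ** 2 for t in range(n))
    slope = num / den
    return [s - (s_mean + slope * (t - t_mean)) for t, s in enumerate(signal)]


def dft_coefficient(signal, k):
    """Complex DFT coefficient of the sequence at frequency bin k."""
    n = len(signal)
    return sum(s * cmath.exp(-2j * math.pi * k * t / n)
               for t, s in enumerate(signal))


def dominant_bin(signal):
    """Non-zero frequency bin with the largest magnitude."""
    n = len(signal)
    return max(range(1, n // 2), key=lambda k: abs(dft_coefficient(signal, k)))


def relative_phase(signal, reference):
    """Phase of signal minus phase of reference, in [0, 2*pi),
    measured at the dominant frequency of the reference."""
    sig = remove_linear_trend(signal)
    ref = remove_linear_trend(reference)
    k = dominant_bin(ref)
    diff = cmath.phase(dft_coefficient(sig, k)) - cmath.phase(dft_coefficient(ref, k))
    return diff % (2 * math.pi)
```

Calling `relative_phase(aspct_series, centy_series)` for each scalar, with centy as the reference, would yield one entry of a phase feature vector; the series names here are placeholders.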
The representation is model-free, and therefore could be used to characterize the motion of other non-rigid bodies.
As shown in the data-flow diagram (above), the steps in the system are, from top to bottom:
[Figure panels: raw time series for centx, centy, and aspct; centx, centy, and aspct with linear background removed; frequency spectrum of aspct (20 coefficients); aspct with sinusoid of fitted frequency and phase.]
The entire process is controlled from a shell script that executes programs to compute the optical flow, calculate the scalar shape descriptions, and analyze the signals for frequency and phase.
To verify that the system is capable of recognition, we sampled the gaits of six people using the apparatus depicted below.
A camera fixed on a tripod points towards a fixed non-reflecting static background. Subjects walk in a circular path such that on one side of the path they pass through the field of view of the camera and pass behind the camera on the other side. Only one subject is in the field of view at any one time. The subjects walk this path for about 15 minutes while the images are recorded on video tape.
Later, we digitize sequences for the six subjects. We discard the first two or three passes for each person and digitize seven sequences for each subject (42 sequences total).
Images from the tape are digitized in 24-bit color at a resolution of 640 by 480 pixels. We resample and crop the images to get black and white images with 320 by 160 pixels.
We analyzed the following scalars and their phase features:
centy (y coordinate of the centroid of moving region) is used only as the phase reference signal.
Analysis of variance, including post hoc testing, indicated the following:
The following scatter plots show the features with the greatest variation, centx, wcentx, and aspct:
[Figure: stereo pair (right and left views) of a 3-D scatter plot of aspct versus centx and wcentx.]
We tested recognition using a variety of algorithms. The best results were obtained by finding the nearest neighbor from a set of exemplars, where each exemplar is the vector of mean feature values for one subject. To get an unbiased estimate of the recognition rate, we used a leave-one-out procedure. Using the full feature vector gave a recognition rate of about 90%. Using only the four best features, daspct, dcenty, vwcenty, and waspct, the recognition rate rose as high as 95%. In comparison, recognition by chance among the six subjects would yield a rate of about 17%.
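The exemplar-based test can be sketched as follows. This is an illustration only: the function names are hypothetical, plain Euclidean distance is an assumption (the distance measure is not stated, and phase features may call for a measure that respects wrap-around), and each subject needs at least two sequences so that an exemplar remains when one is held out.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_exemplar(x, exemplars):
    """Label of the exemplar closest to x (Euclidean distance, an assumption)."""
    return min(exemplars, key=lambda label: math.dist(x, exemplars[label]))


def leave_one_out_rate(samples):
    """Fraction of feature vectors recognized correctly.

    samples maps each subject to a list of feature vectors. Each vector
    in turn is held out, the exemplars are recomputed without it, and
    the held-out vector is assigned to the nearest exemplar.
    """
    correct = total = 0
    for subject, vectors in samples.items():
        for i, held_out in enumerate(vectors):
            exemplars = {}
            for other, other_vectors in samples.items():
                kept = [vec for j, vec in enumerate(other_vectors)
                        if other != subject or j != i]
                exemplars[other] = mean_vector(kept)
            correct += nearest_exemplar(held_out, exemplars) == subject
            total += 1
    return correct / total
```

Holding out the test vector before computing its subject's exemplar is what keeps the estimate unbiased: the vector being classified never contributes to the mean it is compared against.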
Recognition was possible for a variety of flow sources varying in spatial resolution. We observed that although the exemplars and the features with the greatest variation changed from one flow source to another, recognition was successful as long as the parameters for flow computation were kept constant.
Analysis of variance and our recognition test show that the features have the following approximate significance for recognition: daspct > dcenty > wcenty > vwcenty > aspct > vwcentx > waspct > centx > uwcentx > wcentx > uwcenty > dcentx