Doctor of Philosophy
Research Proposal

Shih Ching Fu
School of Computer Science & Software Engineering
The University of Western Australia
35 Stirling Highway, Crawley, W.A. 6009, Australia.
scfu(at)csse.uwa.edu.au

February 2004

A  Proposed Study

1.  Title

Structure from Motion using Differential Invariants of Optical Flow.

2.  Background

Computer vision is concerned with inferring information about the three-dimensional (3D) world from two-dimensional (2D) images. The human visual system is adept at discerning quantities such as depth and motion, helping us interact with our environment without needing to come into direct contact with it [15]. It is hence desirable to emulate this proficiency observed in nature. Of particular interest is the role that visual motion plays in extracting information about our surroundings. Even in the absence of stereo, a monocular observer can still determine scene structure through movement; this in fact forms the basis of active vision [6]. The focus of this research is on image motion, or optical flow, analysis, especially the extraction of the first-order differential invariants of image velocity using a correlative filtering method.

Optical flow is an approximation of a scene's 2D motion field, typically derived from an image sequence. A 2D image motion field is the projection of the 3D velocities of objects in 3D space onto a 2D image plane. These velocities may result from viewer movement (or ego-motion), movement of scene objects, or a combination of both. Optical flow is only an approximation of the image motion field since assumptions are made about the lighting and texturing of surfaces. Where assumptions such as static light sources or adequate scene texture do not hold, optical flow can be present in places of zero motion and vice versa [15].
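
As a sketch of the projection described above, assume a pinhole camera with focal length f (an assumption made here for illustration; the proposal does not fix a camera model). A scene point (X, Y, Z) moving with velocity (V_X, V_Y, V_Z) relative to the camera projects to the image point (x, y), and differentiating with respect to time gives the motion field (u, v):

$$x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}$$

$$u = \frac{dx}{dt} = \frac{f V_X - x V_Z}{Z}, \qquad v = \frac{dy}{dt} = \frac{f V_Y - y V_Z}{Z}$$

Under this model the image velocity depends on depth Z as well as on the 3D velocity, which is why a motion field carries information about scene structure.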

Much work has been done in the area of optical flow computation [3]. Most of the recently published literature in this field focuses on improving the robustness and accuracy of existing optical flow algorithms rather than developing new approaches [2,4,8,23]. In general, optical flow determination algorithms can be categorized into three approaches: intensity-based (differential or gradient) methods, energy-based methods, and correlation-based methods [5].

Gradient methods such as that by Sobey and Srinivasan [21,22] compute image velocities by calculating the spatial and temporal derivatives of image intensities, assuming that images are differentiable. Typically they involve finding the solution to an overdetermined system of linear equations where one constraint is the optical flow constraint, defined in Equation 1,
$$I_x u + I_y v + I_t = 0 \tag{1}$$
where I is image intensity, the subscripts denote partial derivatives with respect to x, y, and time t, and u and v are the x and y components of the optical flow. The optical flow constraint equation is derived from the assumption that, for a point on a surface, the image intensity in a small neighbourhood around that point does not vary with time. Energy-based methods can be shown to use constraints similar to those of differential (or gradient) methods, but transformed into the Fourier domain [3].
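
To illustrate how the optical flow constraint leads to an overdetermined linear system, the following minimal sketch stacks one constraint per pixel in a small window and solves for a single (u, v) by least squares. It is a generic local least-squares scheme chosen for illustration, not the generalized gradient scheme of Sobey and Srinivasan [21,22]; the function name solve_flow and the synthetic derivative values are hypothetical.

```python
def solve_flow(ix, iy, it):
    """Least-squares (u, v) for one window (illustrative sketch).

    ix, iy, it are lists of spatial and temporal intensity derivatives,
    one entry per pixel. Each pixel contributes one constraint
    ix*u + iy*v + it = 0, so more than two pixels give an
    overdetermined system, solved here via the 2x2 normal equations.
    """
    sxx = sum(a * a for a in ix)
    sxy = sum(a * b for a, b in zip(ix, iy))
    syy = sum(b * b for b in iy)
    sxt = sum(a * t for a, t in zip(ix, it))
    syt = sum(b * t for b, t in zip(iy, it))
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All gradients are parallel (or zero): the flow is not
        # uniquely determined in this window.
        return None
    u = (sxy * syt - syy * sxt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Synthetic check: derivatives generated from a known flow (assumed values).
true_u, true_v = 1.0, -0.5
ix = [1.0, 0.0, 1.0, 2.0]
iy = [0.0, 1.0, 1.0, -1.0]
it = [-(a * true_u + b * true_v) for a, b in zip(ix, iy)]
u, v = solve_flow(ix, iy, it)
print(u, v)
```

When every gradient in the window points the same way, the determinant vanishes and the sketch returns None, since only the flow component along the gradient is constrained.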

Correlation or matching methods require regions or features to be tracked between images of an image sequence. Such methods are appealing when accurate differentiation of the image is impractical due to noise. Thus the recovery of the motion field is similar to solving the correspondence problem where point trajectories are interpreted as instantaneous velocity vectors. From this information, scene reconstruction can be treated as a classical projective geometry problem. However, unlike the previous methods, matching methods need to assume rigid body motion, and encounter problems when the scene contains several moving objects or occlusions [5].
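
As a rough illustration of a matching method, the sketch below tracks one small block between two frames by exhaustive search, scoring candidates with the sum of squared differences (SSD). SSD is one common matching score, assumed here; the function names, block size, and search radius are hypothetical and not taken from any of the cited algorithms.

```python
def ssd(img_a, img_b, ay, ax, by, bx, size):
    """Sum of squared differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            d = img_a[ay + dy][ax + dx] - img_b[by + dy][bx + dx]
            total += d * d
    return total


def match_block(prev, curr, y, x, size, radius):
    """Return the (dy, dx) displacement of the block at (y, x) in prev
    that minimises SSD against curr within the search radius."""
    h, w = len(curr), len(curr[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny, nx = y + dy, x + dx
            if ny < 0 or nx < 0 or ny + size > h or nx + size > w:
                continue  # candidate block falls outside the image
            score = ssd(prev, curr, y, x, ny, nx, size)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


def make_image(h, w, top, left, patch):
    """Blank image with a small textured patch pasted at (top, left)."""
    img = [[0] * w for _ in range(h)]
    for r, row in enumerate(patch):
        for c, val in enumerate(row):
            img[top + r][left + c] = val
    return img


patch = [[9, 7], [5, 3]]
prev = make_image(6, 6, 1, 1, patch)
curr = make_image(6, 6, 2, 3, patch)  # patch moved 1 row down, 2 columns right
displacement = match_block(prev, curr, 1, 1, 2, 2)
print(displacement)  # (1, 2)
```

The recovered displacement, divided by the time between frames, is interpreted as the velocity vector at that block.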

A further approach to optical flow computation is to use a phase representation, as done by Fleet [14]. Most algorithms, by assuming that image intensity does not vary with time, approximate image motion as pure image translation. Fleet argues that the dynamics of an image's phase contours are a better approximation to the motion field. It is proposed that a phase-based approach need not assume pure image translation and performs well under image contrast variations and geometric deformation due to perspective.

Once optical flow has been determined, its applications are numerous. The most common uses for optical flow fields are ego-motion determination [7,12,24] and 3D scene reconstruction. Further applications include motion detection [27], object segmentation [20], motion compensation, and tracking [13,25]. There are, however, further properties of motion fields themselves, such as their divergence, vorticity, and deformation, that also provide us with extensive 3D scene information.

It has been shown that the change in shape of objects caused by relative motion between an object and observer can be decomposed into divergence, curl, and deformation components [16,17]. These three components can be geometrically interpreted as isotropic expansion, rigid rotation, and shear distortion (see Figure 1). Divergence, curl, and deformation are called the first-order differential invariants of image motion fields, termed thus because their values are independent of the choice of coordinate system and of viewer rotations about the projection centre. Moreover, these properties are directly related to 3D scene structure and ego-motion, and can be determined through their effect on scene geometry. Through these relationships, the three quantities can be used to derive information about surface orientation and time to contact, two quantities that are useful in obstacle avoidance problems [18,26].

Figure 1: Koenderink and van Doorn [16] showed that an image velocity field can be decomposed into components of curl, divergence, and deformation. From left to right: divergence (dilation), curl (rotation), and deformation (shearing about two different axes). The first two and the magnitude of the last are independent of coordinate system. The choice of axis for the deformation components means it is not a differential invariant, but any deformation can be expressed as a combination of these two components.
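
For concreteness, the decomposition above can be written in terms of the first-order partial derivatives of the image velocity (u, v). The following uses one common convention, in the style of Cipolla and Blake [10,11]; sign and axis conventions vary between authors, and the angle symbol \mu for the axis of deformation is an assumption of this sketch:

$$\mathrm{div}\,\mathbf{v} = u_x + v_y, \qquad \mathrm{curl}\,\mathbf{v} = v_x - u_y$$

$$(\mathrm{def}\,\mathbf{v})\cos 2\mu = u_x - v_y, \qquad (\mathrm{def}\,\mathbf{v})\sin 2\mu = u_y + v_x$$

$$\mathrm{def}\,\mathbf{v} = \sqrt{(u_x - v_y)^2 + (u_y + v_x)^2}$$

The two deformation components change with the choice of image axes through \mu, whereas the magnitude def v does not, which matches the remark in the caption of Figure 1.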

Cipolla and Blake [10,11] have shown how to derive surface orientation and time to contact from divergence and deformation information. They measured the change in the apparent area of objects to compute the divergence and deformation. This was done with B-spline snakes, which could prove problematic in the absence of trackable features. Nelson and Aloimonos [19] have also worked on the use of divergence for obstacle avoidance, deriving it mathematically but needing many images over time to produce good results. The time-to-crash detector implemented by Ancona and Poggio [1] utilizes optical flow directly rather than using divergence as in Cipolla and Blake [9].
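
As an illustration of why divergence is informative about time to contact, consider a deliberately simplified case (an assumption of this sketch, not the general result of [10,11]): a pinhole camera with focal length f translating at speed W along its optical axis towards a stationary fronto-parallel surface at depth Z. A scene point (X, Y, Z) projects to x = fX/Z, y = fY/Z, and Z decreases at rate W, so

$$u = \frac{dx}{dt} = \frac{xW}{Z}, \qquad v = \frac{dy}{dt} = \frac{yW}{Z}$$

$$\mathrm{div}\,\mathbf{v} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} = \frac{2W}{Z} = \frac{2}{t_c}, \qquad t_c = \frac{Z}{W}$$

In this special case the time to contact t_c follows from the divergence alone, without knowing Z or W separately. For general motions and slanted surfaces the surface orientation also contributes, which is where the deformation component is used alongside the divergence.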

Although there has been much work in optical flow determination, as well as research into the usefulness of the geometrical properties of optical flow (such as divergence, curl, and deformation), there is little work on how to connect the two. Apart from Cipolla's closed-curve tracking method, there are no other well-documented methods for deriving the differential invariants of image velocity. This research is therefore aimed at finding new ways to determine the differential invariants of optical flow fields from input optical flow data. It is hypothesized that this can be done using a simple filter correlation technique similar to that of signal deconstruction. Some preliminary experimentation using small images and filters has produced promising results. It is hoped that this new method can take advantage of the extensive research in motion field determination, whilst contributing to research in areas such as structure from motion.

B  Research Plan

1.  Time estimates for completion

Date      Task
Apr 2004  Complete literature review
Jul       Complete implementations of popular optical flow algorithms
Nov       Complete comparisons of these algorithms
Dec       Design of appropriate div, curl, and def filters
Jan 2005  Start implementation of div, curl, and def extraction algorithms
Feb       Experimentation over different datasets
Apr       Investigate the effect of scale and noise
Oct       Start experimentation with scene reconstruction
Jan 2006  Start thesis composition
Jun       Start thesis review
Aug       Thesis submission

2.  Project Aims

3.  Research Method

The first stage of this project will involve a comparison of the many optical flow determination algorithms found in contemporary literature. Implementation of these popular algorithms is needed for the evaluation of their performance over different data sets. Issues of importance include how these algorithms handle boundary discontinuities from occlusions, multiple moving objects, and transparency. Both real and synthetic data will be used to assess the accuracy and reliability of these algorithms, whose optical flow outputs are needed for the next stage of the research.

The main part of the research involves the design of suitable filters for correlation with the motion field data output by the algorithms selected in the first stage. Filters need to be designed for each component of divergence, curl, and deformation. Correlation calculations determine the similarity between the filter and the underlying motion field. In this way we can deduce how many 'units' of, say, divergence exist in a particular motion field. This is an alternative way to determine differential invariants without needing to track curves or areas, and it uses existing optical flow techniques.
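
One way the correlation idea could work is sketched below. The filter design is the subject of the proposed research, so everything here is an assumption for illustration: each template is a small vector field on a square window with exactly one 'unit' of a single component, and the response is the normalised inner product between the flow and the template. The names window_coords, make_templates, and correlate are hypothetical.

```python
def window_coords(half):
    """(x, y) offsets of a (2*half+1) x (2*half+1) window centred at 0."""
    return [(x, y) for y in range(-half, half + 1)
            for x in range(-half, half + 1)]


def make_templates(half):
    """Vector-field templates, each scaled to carry one unit of its
    component (e.g. the 'div' template has divergence exactly 1)."""
    pts = window_coords(half)
    return {
        "div": [(0.5 * x, 0.5 * y) for x, y in pts],    # isotropic expansion
        "curl": [(-0.5 * y, 0.5 * x) for x, y in pts],  # rigid rotation
        "def1": [(0.5 * x, -0.5 * y) for x, y in pts],  # shear, first axis
        "def2": [(0.5 * y, 0.5 * x) for x, y in pts],   # shear, second axis
    }


def correlate(flow, template):
    """Number of template 'units' present in the flow (least-squares
    projection of the flow onto the template)."""
    num = sum(fu * tu + fv * tv
              for (fu, fv), (tu, tv) in zip(flow, template))
    den = sum(tu * tu + tv * tv for tu, tv in template)
    return num / den


# Synthetic affine flow with known derivatives (assumed values):
# u_x = 0.3, u_y = -0.1, v_x = 0.2, v_y = 0.1, plus a constant translation.
half = 3
flow = [(0.3 * x - 0.1 * y + 0.5, 0.2 * x + 0.1 * y - 0.2)
        for x, y in window_coords(half)]
estimates = {name: correlate(flow, t)
             for name, t in make_templates(half).items()}
print(estimates)
```

On a symmetric window these four templates are mutually orthogonal and orthogonal to a constant translation, so for this noise-free affine field each response recovers the corresponding component (div = u_x + v_y, curl = v_x - u_y, and the two shear terms) independently of the others.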

There are, however, several challenges that need to be overcome in designing correct filters. The first is the issue of scale. It will be necessary to define what a 'unit' of divergence, curl, or deformation is, and then create a bank of filters of different scales to correctly deduce the differential invariants. Once the filter scales have been decided upon, a method will be needed to combine these filters so that they can be applied to motion fields.

Something that has always troubled motion field analysis, and in fact almost any real-world application, is the presence of noise. The effect of noise will first be encountered in the extraction of motion fields from image sequences, and then again when correlating the filters with noisy motion fields. An analysis is needed to examine the robustness and noise resistance of a correlative method for determining divergence, curl, and deformation, and any improvements it offers over existing curve and area tracking methods.
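
The kind of noise analysis described above could begin with a simulation such as the following sketch. It assumes independent Gaussian noise on each flow component and a pure-expansion test field, both simplifications chosen for illustration, and measures the root-mean-square error of a correlation-based divergence estimate at several window sizes. The function names and parameter values are hypothetical.

```python
import random


def estimate_divergence(flow, pts):
    """Project the flow onto a unit-divergence template (x/2, y/2)."""
    num = sum(0.5 * (u * x + v * y) for (u, v), (x, y) in zip(flow, pts))
    den = sum(0.25 * (x * x + y * y) for x, y in pts)
    return num / den


def rms_error(half, true_div, sigma, trials, rng):
    """RMS error of the divergence estimate over repeated noisy trials
    on a (2*half+1) x (2*half+1) window."""
    pts = [(x, y) for y in range(-half, half + 1)
           for x in range(-half, half + 1)]
    a = 0.5 * true_div  # u = a*x, v = a*y has divergence 2*a
    total = 0.0
    for _ in range(trials):
        flow = [(a * x + rng.gauss(0.0, sigma),
                 a * y + rng.gauss(0.0, sigma)) for x, y in pts]
        err = estimate_divergence(flow, pts) - true_div
        total += err * err
    return (total / trials) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
errors = {half: rms_error(half, 0.6, 0.1, 200, rng) for half in (1, 3, 5)}
for half, err in errors.items():
    print(half, err)
```

Under these assumptions larger windows average over more flow vectors and give smaller errors, at the cost of spatial resolution, which ties the noise question to the choice of filter scale.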

The reconstruction of 3D scene structure requires the calculation of ego-motion and surface orientations. These quantities can be found from the divergence, curl, and deformation of the motion fields by first calculating the slant and tilt of surface normals. A possible issue in dealing with tilt will be defining an 'origin' for the camera coordinate system from which to measure tilt. This is not a problem for slant since it depends only on the viewing direction.
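
To make the slant and tilt parameterisation concrete, the sketch below converts a (slant, tilt) pair into a unit surface normal. The conventions are assumptions of this example: slant is the angle between the normal and the viewing direction (taken as the z axis), tilt is measured in the image plane from the x axis, and the normal points along positive z at zero slant.

```python
import math


def surface_normal(slant, tilt):
    """Unit surface normal for slant and tilt given in radians.

    Assumed convention: slant is measured from the viewing direction
    (z axis); tilt is the image-plane direction of the projected
    normal, measured from the x axis.
    """
    s = math.sin(slant)
    return (s * math.cos(tilt), s * math.sin(tilt), math.cos(slant))


# A fronto-parallel surface: the tilt value has no effect at zero slant.
frontal = surface_normal(0.0, 1.234)

# A surface slanted by 60 degrees, tilted towards the image y axis.
slanted = surface_normal(math.radians(60.0), math.radians(90.0))
print(frontal, slanted)
```

The tilt angle only has meaning relative to the chosen reference axis in the image, which is the 'origin' issue raised above, whereas slant is defined by the viewing direction alone.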

4.  Duplicated Work

My supervisor and I have conducted literature searches and found no existing research that duplicates this project.

C  Scholars

1.  Identify some leading scholars in the field, particularly some whose published work you have had occasion to study. If possible, include at least one from Australia.

Roberto Cipolla
(cipolla@eng.cam.ac.uk)
Department of Engineering
University of Cambridge

Mandyam Srinivasan
(m.srinivasan@anu.edu.au)
Research School of Biological Sciences
Australian National University

David Suter
(d.suter@eng.monash.edu.au)
Department of Electrical and Computer Systems Engineering
Monash University

David Fleet
(fleet@cs.toronto.edu)
Department of Computer Science
University of Toronto

D  Bibliography

1.  Candidates should show familiarity with the literature in the field.

References

[1]
Nicola Ancona and Tomaso Poggio. Optical flow from 1D correlation: Application to a simple time-to-crash detector. In Proceedings of the Fourth International Conference on Computer Vision, May 1993.

[2]
Alireza Bab-Hadiashar and David Suter. Robust optic flow computation. International Journal of Computer Vision, 29(1):59-77, August 1998.

[3]
J. L. Barron, S. S. Beauchemin, and D. J. Fleet. On optical flow. In I. Plander, editor, 6th International Conference on Artificial Intelligence and Information Control Systems of Robots (AIICSR), pages 3-14, Bratislava, Slovakia, September 1994. World Scientific.

[4]
J. L. Barron, D. J. Fleet, and S. S. Beauchemin. Performance of optical flow techniques. International Journal of Computer Vision, 12(1):43-77, 1994.

[5]
S. S. Beauchemin and J. L. Barron. The computation of optical flow. ACM Computing Surveys, 27(3):433-467, 1995.

[6]
Steven S. Beauchemin, Ruzena Bajcsy, and John L. Barron. Recent advances in motion understanding. In Reinhard Klette, Georgy Gimel'farb, and Ramakrishna Kakarala, editors, International Conference on Image and Vision Computing (IVCNZ98), pages 29-37, The University of Auckland, November 1998.

[7]
A. Branca, E. Stella, and A. Distante. Passive navigation using egomotion estimates. Image and Vision Computing, 18:833-841, 2000.

[8]
Ted Camus. Real-time quantized optical flow. Real-Time Imaging, 3:71-86, 1997.

[9]
Roberto Cipolla. Active Visual Inference of Surface Shape, volume 1016 of Lecture Notes in Computer Science. Springer-Verlag, Heidelberg, Germany, 1995.

[10]
Roberto Cipolla and Andrew Blake. Surface orientation and time to contact from image divergence and deformation. Lecture Notes in Computer Science, 588:187-202, 1992.

[11]
Roberto Cipolla and Andrew Blake. Image divergence and deformation from closed curves. International Journal of Robotics Research, 16(1):77-96, 1997.

[12]
Guilherme N. DeSouza and Avinash C. Kak. Vision for mobile robot navigation: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(2):237-267, February 2002.

[13]
T. Drummond and R. Cipolla. Real-time tracking of complex structures with on-line camera calibration. In Proceedings of the British Machine Vision Conference, pages 574-583, Nottingham, September 1999.

[14]
David J. Fleet. Measurement of Image Velocity. Kluwer Academic, 1992.

[15]
Berthold Klaus Paul Horn. Robot Vision. MIT Electrical Engineering and Computer Science Series. MIT Press, 1986.

[16]
J. J. Koenderink and A. J. van Doorn. Invariant properties of the motion parallax field due to the movement of rigid bodies relative to an observer. Optica Acta, 22(9):773-791, January 1975.

[17]
J. J. Koenderink and A. J. van Doorn. How an ambulant observer can construct a model of the environment from the geometrical structure of the visual inflow. Kybernetik, pages 224-247, 1978.

[18]
E. Martínez Marroquín and C. Torras Genís. Contour-based 3D motion recovery while zooming. Robotics and Autonomous Systems, 44:219-227, 2003.

[19]
Randal C. Nelson and John (Yiannis) Aloimonos. Obstacle avoidance using flow field divergence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(10):1102-1106, October 1989.

[20]
Paul Smith, Tom Drummond, and Roberto Cipolla. Layered motion segmentation and depth ordering by tracking edges. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(4):479-494, April 2004.

[21]
P. Sobey and M. V. Srinivasan. Measurement of optical flow by a generalized gradient scheme. Journal of the Optical Society of America A, 8(9):1488-1498, September 1991.

[22]
P. J. Sobey, M. G. Nagle, Y. V. Venkatesh, and M. V. Srinivasan. Measurement of complex optical flow with use of an augmented generalized gradient scheme. Journal of the Optical Society of America A, 11(11):2787-2798, November 1994.

[23]
Changming Sun. Fast optical flow using 3D shortest path techniques. Image and Vision Computing, 20(13-14):981-991, December 2002.

[24]
Tina Y. Tian, Carlo Tomasi, and David J. Heeger. Comparison of approaches to egomotion computation. In 1996 Conference on Computer Vision and Pattern Recognition (CVPR96), San Francisco, California, 18-20 June 1996. IEEE.

[25]
Greg Welch and Eric Foxlin. Motion tracking: No silver bullet, but a respectable arsenal. IEEE Computer Graphics and Applications, 22(6):24-38, November/December 2002.

[26]
Benedict Wong and Minas Spetsakis. Scene reconstruction and robot navigation using dynamic fields. Autonomous Robots, 8:71-86, 2000.

[27]
J. M. Zanker, M. V. Srinivasan, and M. Egelhaaf. Speed tuning in elementary motion detectors of the correlation type. Biological Cybernetics, 80:109-116, September 1999.

E  Facilities

1.  Supervision

Dr. Peter Kovesi is available to supervise this project.

2.  Special Equipment

This research will require access to a consumer-level computer terminal. The standard laboratory machines provided by the School fit this specification. Camera and image capture equipment is already available in the School's vision research laboratory.

3.  Special Techniques

No special techniques are required for this project.

4.  Special Literature

No special literature is required for this project.

5.  Statistical Advice

No statistical advice is required for this project.

F  Estimated Costs

No costs other than those normally borne by the School are anticipated. The School will provide AUD$500/year to cover any incidental costs.

G  Confidentiality & Intellectual Property

There are no anticipated confidentiality issues and I intend to make all products of this research available to the academic community.

H  Approvals

No ethical or medical approvals are required for this project.

