Computer Science
4th Year Projects in 2004


Undergraduate Scholarships

Want some extra money to fund your studies this year? Apply for a scholarship! Visit the Scholarships web page for undergraduates and the Scholarships web page for Honours for details.



Welcome to the 4th year Project Page of Associate Professor Du Huynh!

I have more than 20 years of research experience in computer vision. My research areas include shape from motion, 3D shape reconstruction, visual tracking, and video and image analysis.

A few projects in computer vision are offered this year.  Some of them may be jointly supervised with other staff members of the School.  If you have in mind a computer vision research topic that is not listed below, I would be interested to hear from you.  Please note that I can only supervise up to 3 projects in any one year.

With appropriate adjustment, any of the projects below could be suitable for a BE(SE) final year project (12 points), an Honours Research Project (24 points), or an MSc project (24 points).

Experience has shown that it can be very beneficial for research students to have a group of people with related interests to share ideas with. A student undertaking any of the projects below is expected to join the Computer Vision Research Group and to attend and contribute to group meetings and discussions. Such a student will be housed in the Computer Vision Research Group Laboratory in Room 2.09 of the Computer Science building. You are also strongly advised to take the CITS4240 Computer Vision unit offered in the first semester.


Associate Professor Du Huynh (du@csse.uwa.edu.au)




Adaptive Background/Foreground Segmentation

This project studies the segmentation of foreground objects (e.g. people) from a dynamic, textured background in video sequences.  Examples of such time-varying backgrounds include scenes with changing illumination, swaying trees, waves on water, and moving clouds.  The project will adopt the adaptive background mixture models described in [1].  If time permits, the technique in [1] will also be compared with a more recent technique reported in [2].
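To make the idea of an adaptive background mixture model concrete, here is a minimal sketch in the spirit of [1] for a single grayscale pixel. It is a simplification, not the implementation from the paper: the class name `PixelMixture`, all parameter values, the variance floor, and the use of a constant learning rate for the mean and variance updates are assumptions made for illustration.

```python
import numpy as np


class PixelMixture:
    """Mixture of K Gaussians modelling the intensity history of one pixel."""

    def __init__(self, k=3, alpha=0.05, init_var=36.0, min_var=4.0,
                 init_weight=0.05, bg_ratio=0.7):
        self.alpha = alpha              # learning rate (assumed value)
        self.init_var = init_var        # variance of a newly created component
        self.min_var = min_var          # floor that keeps variances from collapsing
        self.init_weight = init_weight  # weight of a newly created component
        self.bg_ratio = bg_ratio        # share of total weight treated as background
        self.weights = np.full(k, 1.0 / k)
        self.means = np.linspace(0.0, 255.0, k)
        self.variances = np.full(k, init_var)

    def update(self, x):
        """Fold intensity x into the model; return True if x is foreground."""
        sigma = np.sqrt(self.variances)
        matched = np.abs(x - self.means) < 2.5 * sigma
        if matched.any():
            # Among matching components, update the one with the best weight/sigma.
            fitness = np.where(matched, self.weights / sigma, -np.inf)
            m = int(np.argmax(fitness))
            self.weights *= 1.0 - self.alpha
            self.weights[m] += self.alpha
            rho = self.alpha  # simplification: constant rate for mean and variance
            self.means[m] += rho * (x - self.means[m])
            residual = (x - self.means[m]) ** 2
            new_var = self.variances[m] + rho * (residual - self.variances[m])
            self.variances[m] = max(self.min_var, new_var)
        else:
            # No component explains x: replace the least probable one.
            m = int(np.argmin(self.weights / sigma))
            self.means[m] = x
            self.variances[m] = self.init_var
            self.weights[m] = self.init_weight
        self.weights /= self.weights.sum()

        # Components with high weight and low variance are taken as background.
        order = np.argsort(-self.weights / np.sqrt(self.variances))
        cumulative = np.cumsum(self.weights[order])
        n_background = int(np.searchsorted(cumulative, self.bg_ratio)) + 1
        is_background = bool(matched.any()) and m in order[:n_background]
        return not is_background
```

In a full system one such model would be kept for every pixel, and `update` would be called once per frame. A value that persists long enough gains weight and is eventually absorbed into the background, which is what lets the model cope with gradual scene changes.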

Background/foreground segmentation has extensive applications in movie editing, video surveillance, and image synthesis.  In movie editing, the images of the human actors often need to be segmented from the background and then superimposed onto different scenes.  In video surveillance of a scene (e.g. train stations, indoor laboratories) over a long period of time, objects of interest, such as people and vehicles, often need to be segmented from a background under variable lighting conditions.  In image synthesis, moving objects often arise as outliers, and their segmentation from the image sequences needs to be integrated with the estimation of camera geometry.

This project will pose some challenges to students who lack basic knowledge in statistics (e.g. knowledge about the Gaussian distribution and conditional probability at high school level).  If you are interested in the project, I'd be happy to provide a short tutorial on statistics at the beginning of the semester to help you get started.

References:
[1] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking", Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pages 246-252, 1999.
[2] J. Zhong and S. Sclaroff, "Segmenting Foreground Objects from a Dynamic, Textured Background via a Robust Kalman Filter", Proc. IEEE Conf. on Computer Vision, pages 44-50, 2003.



Video Google


Analogous to text-based search with Google, this project studies the retrieval of image frames from short video sequences, using a region of interest specified by the user in an image as the query.  The technique to be investigated is reported in the recent work of Sivic and Zisserman [3], and the implementation required will be a cut-down version of [3].  The procedure first uses the key point descriptor proposed by Lowe [1, 2] to capture the texture information in the specified region of interest, and then constructs a visual vocabulary for image retrieval.  The code for Lowe's key point descriptor is available, so the implementation will focus on the image retrieval component.  Because of the complications involved in fast image retrieval from long video sequences, only short video sequences will be used.  The implementation will include an evaluation of scene matching using the constructed visual words.
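The retrieval component can be pictured with the following sketch, which assumes that key point descriptors have already been extracted for every frame (one array of descriptors per frame). It clusters the descriptors into visual words, weights the word counts of each frame with tf-idf as in text retrieval, and ranks frames by cosine similarity to the query region. The function names, the plain k-means with farthest-point initialisation, and the scoring details are illustrative assumptions, not the exact procedure of [3].

```python
import numpy as np


def quantise(descriptors, centres):
    """Assign each descriptor to its nearest visual word (cluster centre)."""
    d2 = ((descriptors[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def build_vocabulary(descriptors, k, iters=20):
    """Cluster descriptors into k visual words with plain k-means."""
    # Deterministic farthest-point initialisation (a simplification).
    centres = [descriptors[0]]
    for _ in range(k - 1):
        current = np.array(centres)
        d2 = ((descriptors[:, None, :] - current[None, :, :]) ** 2).sum(axis=2)
        centres.append(descriptors[d2.min(axis=1).argmax()])
    centres = np.array(centres, dtype=float)
    for _ in range(iters):
        labels = quantise(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centres[j] = members.mean(axis=0)
    return centres


def tfidf_index(frames, centres):
    """Return one tf-idf vector per frame, plus the idf weights."""
    k = len(centres)
    counts = np.array([np.bincount(quantise(f, centres), minlength=k)
                       for f in frames], dtype=float)
    tf = counts / counts.sum(axis=1, keepdims=True)
    document_frequency = (counts > 0).sum(axis=0)
    idf = np.log(len(frames) / np.maximum(document_frequency, 1))
    return tf * idf, idf


def rank_frames(query_descriptors, centres, index, idf):
    """Rank frames by cosine similarity to the query region, best first."""
    counts = np.bincount(quantise(query_descriptors, centres),
                         minlength=len(centres)).astype(float)
    query = counts / counts.sum() * idf
    norms = np.linalg.norm(index, axis=1) * np.linalg.norm(query)
    scores = index @ query / np.maximum(norms, 1e-12)
    return np.argsort(-scores)
```

Here `frames` is a list of arrays of shape (number of key points, descriptor length), and `build_vocabulary` is run once on the descriptors of all frames stacked together. A word that occurs in every frame receives an idf of zero and so does not influence the ranking, mirroring the treatment of very common words in text search.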

References:
[1] D. Lowe, "Object Recognition from Local Scale-Invariant Features", Proc. IEEE Conf. on Computer Vision, 1999.
[2] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", submitted to International Journal of Computer Vision.
[3] J. Sivic and A. Zisserman, "Video Google: A Text Retrieval Approach to Object Matching in Videos", Proc. IEEE Conf. on Computer Vision, pages 1470-1477, 2003.



Structure and Motion Reconstruction from Line Correspondences


The aim of this project is to investigate the recovery of camera motion and 3D structure when the images of a number of 3D lines are identified in at least 3 images. In the structure-from-motion literature, it is well known that the 3D scene and camera motion can both be recovered from a number of matching feature points (e.g. corners, junctions) in two images. However, if the detected image features are line segments, then 3 images are required for motion and structure estimation. Since straight line segments can easily be found on many man-made objects, such as buildings, desks, and chairs, the focus of this project will be on the use of line segments as image features.
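One way to see why lines need 3 images is the following standard projective-geometry argument, sketched in code. An image line l seen by a camera with 3x4 projection matrix P back-projects to the plane P^T l in space. The planes from two views intersect in some 3D line whatever the camera motion is, so two views place no constraint on the motion; only the plane from a third view, which must pass through the same 3D line, does. The helper names and the example camera poses used to exercise them are assumptions for illustration, and this is not the algorithm of [1].

```python
import numpy as np


def rot_y(theta):
    """Rotation by theta radians about the y-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def camera(R, t, K=np.eye(3)):
    """3x4 projection matrix K [R | t]."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])


def project_line(P, X1, X2):
    """Image line through the projections of two homogeneous 3D points."""
    return np.cross(P @ X1, P @ X2)


def backproject_line(P, line):
    """Plane (4-vector) containing the camera centre and the image line."""
    return P.T @ line


def line_from_planes(planes):
    """Intersect back-projected planes.

    Returns two orthonormal homogeneous 4-vectors spanning the 3D line
    (meaningful when the stacked planes have rank 2) and the singular
    values of the stacked plane matrix.
    """
    W = np.vstack(planes)           # one plane per row, shape (3, 4)
    _, s, Vt = np.linalg.svd(W)
    return Vt[2:], s
```

With three consistent cameras the stacked planes have rank 2, so the third singular value returned by `line_from_planes` is numerically zero and the 3D line is recovered as the null space. If the third camera is given a wrong pose, the third singular value becomes clearly non-zero, and it is this kind of residual that a motion estimation method can drive towards zero.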

The input to the system to be implemented will be a number of manually identified line segments in 3 images of an object (or objects) viewed from 3 different positions and viewing directions. The output will be the reconstructed 3D objects that can be viewed under Matlab or a VRML viewer. The required implementation for the project will be based on the work described in [1] (see also related papers, e.g. [2],[3]).

References:
[1] A. Bartoli and P. Sturm, "Multiple-View Structure and Motion from Line Correspondences", Proc. IEEE Conf. on Computer Vision, vol. 1, pages 207-212, 2003.
[2] Y. Liu and T. Huang, "A Linear Algorithm for Motion Estimation using Straight Line Correspondences", Computer Vision, Graphics and Image Processing, vol. 44, no. 1, pages 35-57, 1988.
[3] T. Vieville, Q. Luong, and O. Faugeras, "Motion of Points and Lines in the Uncalibrated Case", International Journal of Computer Vision, vol. 17, no. 1, 1995.



3D Model Acquisition from Circular Motion Sequences

An earlier approach proposed by Jiang et al. [2] was studied in an Honours project a couple of years ago. This project studies a new method for 3D model acquisition from circular motion sequences proposed by a similar group of authors [1].  Given a sequence of images of an object placed on a turntable and captured by a stationary camera, the 3D model of the object can be recovered from the image features tracked through the image frames.  Here, the rotation of the turntable is driven by a motor, but the rotation angle from one image frame to the next is not known, and neither are the internal parameters (e.g. focal length) of the camera.  The new method proposed in [1] requires a minimum of 2 points to be tracked over 4 or more image frames.
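The geometry of the setup can be illustrated with a small simulation. A point on the turntable moves on a circle in space, so its track in the image of a stationary camera lies on a conic, whatever the (unknown) rotation angles between frames are. The sketch below generates such a track and fits a conic to it by least squares. It only illustrates the setup and is not the method of [1] or [2]; the function names and the choice of the y-axis as the rotation axis are assumptions.

```python
import numpy as np


def turntable_track(point, angles, P):
    """Image positions of a 3D point rotated about the y-axis by each angle.

    P is the 3x4 projection matrix of the stationary camera.
    """
    track = []
    for a in angles:
        c, s = np.cos(a), np.sin(a)
        R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
        x = P @ np.append(R @ point, 1.0)
        track.append(x[:2] / x[2])
    return np.array(track)


def fit_conic(points):
    """Least-squares conic a x^2 + b xy + c y^2 + d x + e y + f = 0.

    Returns the unit-norm coefficient vector (a, b, c, d, e, f) and the
    singular values of the design matrix; at least 6 points are expected.
    """
    x, y = points[:, 0], points[:, 1]
    design = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, s, Vt = np.linalg.svd(design)
    return Vt[-1], s
```

When the tracked positions are noise-free, the smallest singular value returned by `fit_conic` is zero up to rounding, even if the angles passed to `turntable_track` are unevenly spaced. With real tracks, pixel coordinates would normally be centred and scaled before fitting to improve numerical conditioning.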

This project will involve the implementation of the method proposed in [1] using some circular motion video sequences available on the web.

References:
[1] G. Jiang, L. Quan, and H. T. Tsui, "Circular Motion Geometry by Minimal 2 Points in 4 Images", Proc. Int. Conference on Computer Vision, pages 221-227, 2003.
[2] G. Jiang, H. T. Tsui, and L. Quan, "Automatic 3D Model Construction for Turn-table Sequences based on Conics", Proc. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2001), Dec 2001.



Image Smoothing using a Level Set Method

When an image is magnified, a standard image processing technique, such as bilinear or bicubic interpolation, can be applied to approximate the values of the pixels newly created by the enlargement.  Since both bilinear and bicubic interpolation have been reported to create jagged edges, the aim of this project is to study an alternative technique, namely a level set method, for image smoothing after image magnification.  The research involved will be mainly based on that described in [1]; however, a literature survey on level set methods (e.g. see [2,3]) and their applications must be a large component of the thesis. An evaluation of the level set method (in terms of its computational complexity) and a comparison between this method and other techniques, such as pixel replication, bilinear interpolation, and bicubic interpolation, should also be conducted.
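For reference, two of the baseline techniques mentioned above can be sketched in a few lines. The code below magnifies a grayscale image by an integer factor using pixel replication and bilinear interpolation; the function names and the pixel-centre alignment convention are assumptions, and the level set method of [1] itself is not shown.

```python
import numpy as np


def magnify_replicate(img, factor):
    """Enlarge by repeating every pixel factor times in each direction."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def magnify_bilinear(img, factor):
    """Enlarge by bilinear interpolation between the four nearest pixels."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    # Map the centre of each output pixel back into input coordinates.
    ys = np.clip((np.arange(h * factor) + 0.5) / factor - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * factor) + 0.5) / factor - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = ys - y0   # vertical interpolation weights
    wx = xs - x0   # horizontal interpolation weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy)[:, None] + bottom * wy[:, None]
```

Pixel replication keeps every edge as a staircase of enlarged blocks, while bilinear interpolation blends neighbouring values across the edge. Both can serve as the comparison baselines for the evaluation.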

Some basic knowledge of image processing is essential, and familiarity with differential equations and finite differences is desirable.

References:
[1] B. S. Morse and D. Schwartzwald, "Image Magnification Using Level-Set Reconstruction", Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 2001.
[2] S. Osher and R. Fedkiw, "Level Set Methods and Dynamic Implicit Surfaces", Springer-Verlag, 2003.
[3] J. A. Sethian, "Level Set Methods: Evolving Interfaces in Geometry, Fluid Mechanics, Computer Vision, and Materials Science", Cambridge University Press, 1996.
