School of Computer Science & Software Engineering

Projects that I have supervised

2009:

Honours project title: Tracking Boundaries in Video via Segmentation
Student: Evgeni Sergeev
(supported by a Hackett Foundation Alumni Honours Scholarship (pdf))
Supervisor: Dr. Du Huynh
Abstract:
We study the challenging problem of real-time consistent video segmentation for sequences taken with a hand-held camera. Many image segmentation techniques are available, but they often produce markedly different segmentations given two similar images, such as two consecutive frames from a video sequence. However, two such segmentations usually have many boundaries in common. It seems that enough information is available to produce a temporally-stable segmentation.
    Working with a hand-held camera is associated with several complications, the most important of which is significant inter-frame displacement, which is a consequence of substantial peak angular velocity that a hand-held camera develops. The displacement, in image space, between two consecutive frames may be modelled by a translation and a rotation. We propose three techniques to correct for such displacement, and study their strengths and weaknesses. The first technique is based on a modified Hough transform, which allows correlations to be used for detecting shifts orthogonal to edges. The local displacement estimates are then combined. This technique is prone to drift and frequent large errors. The second technique tracks shifts in boundaries in the spatial domain and uses a more robust local estimate combination technique, based on particle swarm optimisation. This technique suffers when presented with areas containing high densities of edges. The third technique uses the results of a segmentation algorithm and uses optimisation directly in the domain of displacements to find the best match. This method meets our requirements for estimating inter-frame displacement.
    Consistent video segmentation is attempted first by an approach involving erosion around the boundaries of an initial segmentation, and the re-adjustment of it along the eroded areas. However, our algorithms tend to close boundaries readily, and have no facility for splitting segments. Another technique uses feedback to influence a segmentation algorithm by the results of its segmentation of the previous frame. This method suffers from false boundaries which occur as a side-effect of using feedback.

BE(SE) project title: Recognising Guitar Chords in Real-Time
Student: Michael Goold
Supervisor: Dr. Du Huynh
Abstract:
Learning to play the guitar is a difficult task, one that requires a great deal of time and patience. In an effort to ease the strain of this process, a Java software package has been designed and developed in order to assist amateur guitarists while they are learning to play their instrument.
    The software package achieves this by monitoring the waveform of the music being played, and providing the user with feedback while they play through a song. The player is scored based on how well they play a song, thus making the process less a chore, and more like a game.
    There has been a large amount of research devoted to extracting information about music contained in an audio signal, and this paper reviews several such research papers. It later builds on this research by describing the utilisation of these theoretical procedures in a practical application. The algorithms and design choices of this application are described in detail, with special attention paid to the chord recognition algorithm itself. Finally, an analysis of the effectiveness of this application is provided, with remarks on its accuracy and shortcomings.


2008:

Honours project title: Investigating the Feasibility of Near Real-Time Music Transcription on Mobile Devices
Student: Barry van Oudtshoorn (WAITTA 2008 Student Project finalist)
Supervisor: Dr. Du Huynh
Abstract:
Converting music from an audio signal into its abstract (notated) form is an extremely complex task. With increased processing power comes the possibility of more complex, and accurate, analysis. The aim of this project is to investigate the feasibility of developing a system designed to run in near-real time on mobile devices, which have limited processing power. Two analysis techniques are described: a windowed Discrete Fourier Transform (DFT) technique, and a sliding window DFT, both of which are well-defined and computationally efficient. Computationally expensive techniques, such as wavelets, are not considered, as their processing requirements are prohibitively high for mobile devices. Ultimately in the transcription of audio signals, there is a trade-off between accuracy and processing requirements; the goal, then, is to maximise accuracy and minimise the computation needed. To attain this goal and determine the feasibility of developing a mobile system for transcription, a modular prototype system was developed for desktop computers, and is fully explained in this project. Through experiments conducted using this prototype system, it is shown that a system capable of automatically transcribing music within the context of a mobile device is certainly feasible.
Related link: WAITTA
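
As an illustration of why a sliding-window DFT is cheap enough to consider on limited hardware, here is a minimal sketch in Python. It is not taken from the thesis: the function names `dft_bin` and `sliding_dft` are hypothetical, and the sketch tracks a single frequency bin only. Once the first window has been transformed directly, each later window position costs one subtraction, one addition and one complex multiplication per bin.

```python
import cmath
import math
from collections import deque


def dft_bin(window, k):
    """Directly compute bin k of the DFT of one window (O(n) work)."""
    n = len(window)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(window))


def sliding_dft(samples, n, k):
    """Yield bin k of the n-point DFT for every window position.

    Assumes len(samples) >= n. The first value is computed directly;
    each later value is updated in O(1) from the previous one.
    """
    twiddle = cmath.exp(2j * math.pi * k / n)
    window = deque(samples[:n])
    value = dft_bin(window, k)
    yield value
    for new in samples[n:]:
        old = window.popleft()
        window.append(new)
        # Remove the oldest sample, add the newest, rotate the phase.
        value = (value - old + new) * twiddle
        yield value
```

For example, `list(sliding_dft(signal, 1024, 20))` would follow the strength of bin 20 across a recording; a transcription system would presumably track one such bin per candidate pitch.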


2007:

Mechatronics Engineering project title: Adaptive Gaussian Mixture Model for Motion Segmentation
Student: Lih Wern Hiew
Supervisor: Dr. Du Huynh
Abstract:
Almost all video sequences are composed of foreground objects and a background. Motion segmentation is the task of segmenting foreground objects from the background and it is one of the most important tasks in computer vision. It is used for various applications such as video surveillance/monitoring and movie editing. In the case of a video surveillance system, a static camera observing a scene is a common setup. The detection of moving objects is of great interest and it is the first step in many automated visual monitoring applications. One of the more complex and advanced algorithms that has been proposed for motion segmentation is the Adaptive Gaussian Mixture Model algorithm, which was developed to deal with problems of dynamic backgrounds that vary over time. The aim of this research is to produce a practical software implementation of the algorithm based on its theory. Extensive experimentation has also been carried out to evaluate the algorithm in terms of its performance, robustness and effectiveness in various scenes.
Related link: More details
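
The idea of an adaptive Gaussian mixture background model can be sketched for a single grayscale pixel as follows. This is a simplified illustration in the style commonly attributed to Stauffer and Grimson, not the implementation produced in the project: the class name `PixelMixture`, all parameter values, the variance floor, and the use of a single learning rate for weights, means and variances are assumptions made for brevity.

```python
import math


class PixelMixture:
    """Adaptive mixture of Gaussians for one grayscale pixel (illustrative)."""

    def __init__(self, max_modes=3, alpha=0.05, init_var=36.0,
                 min_var=4.0, match_sigmas=2.5, background_ratio=0.7):
        self.max_modes = max_modes
        self.alpha = alpha                  # learning rate (assumed value)
        self.init_var = init_var            # variance given to a new mode
        self.min_var = min_var              # floor so variances never collapse
        self.match_sigmas = match_sigmas
        self.background_ratio = background_ratio
        self.modes = []                     # each mode is [weight, mean, variance]

    def update(self, x):
        """Absorb intensity x; return True if x is classified as foreground.

        The very first sample has no model to match, so it is reported
        as foreground.
        """
        # Rank modes: heavy, tight modes are the most background-like.
        self.modes.sort(key=lambda m: m[0] / math.sqrt(m[2]), reverse=True)

        # Background = top-ranked modes whose weights together exceed the ratio.
        background, cumulative = set(), 0.0
        for i, mode in enumerate(self.modes):
            background.add(i)
            cumulative += mode[0]
            if cumulative > self.background_ratio:
                break

        # Find the first mode within match_sigmas standard deviations of x.
        matched = None
        for i, (_, mean, var) in enumerate(self.modes):
            if (x - mean) ** 2 <= self.match_sigmas ** 2 * var:
                matched = i
                break
        is_foreground = matched is None or matched not in background

        # Weights drift towards 1 for the matched mode and 0 for the others.
        for i, mode in enumerate(self.modes):
            hit = 1.0 if i == matched else 0.0
            mode[0] += self.alpha * (hit - mode[0])

        if matched is not None:
            mode = self.modes[matched]
            diff = x - mode[1]
            mode[1] += self.alpha * diff
            mode[2] = max(self.min_var,
                          mode[2] + self.alpha * (diff * diff - mode[2]))
        else:
            fresh = [self.alpha, x, self.init_var]
            if len(self.modes) < self.max_modes:
                self.modes.append(fresh)
            else:
                self.modes[-1] = fresh      # replace the lowest-ranked mode

        total = sum(m[0] for m in self.modes)
        for mode in self.modes:
            mode[0] /= total
        return is_foreground
```

A full segmenter would keep one such model per pixel and call `update` on every frame. A value that persists long enough gains weight and is absorbed into the background, which is how a model of this kind copes with backgrounds that change over time.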

Honours project title: Human head tracking in Cluttered Scenes
Student: Eko Kurniawan Tenggara
Supervisor: Dr. Du Huynh
Abstract:
The problem of tracking human heads is challenging, especially when the motion model cannot be predicted. In the absence of a motion model, tracking can be done with the Particle Filter (PF) algorithm. The invention of the PF has motivated several follow-up studies aimed at improving it. As a result, several newer algorithms have been proposed within the PF framework, such as the Unscented Particle Filter and Iterated Likelihood Weighting. In this project, we introduce a combination of the Unscented Particle Filter and Iterated Likelihood Weighting, named Unscented Iterated Likelihood Weighting. The conventional Particle Filter is compared with five of its successor algorithms in terms of their effectiveness, robustness, and computational complexity. The implementation is done in Matlab, with several experiments designed to distinguish the algorithms from one another. The experimental results show that the successor algorithms are more robust than the PF when tracking against a cluttered background.


2006:

Honours project title: 3D Pose Recovery of the Human Arm
Student: Daniel Deluca-Cardillo (WAITTA 2006 Student Project finalist)
Supervisor: Dr. Du Huynh
Abstract:
Motion capture is the process of digitally recording the movements of a subject, usually a human. This technology has useful applications in the areas of computer animation and bio-mechanics. Current commercial systems use markers to track movement and, because of this, are restricted in their use. Markerless systems can potentially facilitate new applications of motion capture due to their ability to capture from standard video footage, without specialised equipment. Because of the large uncertainty when dealing with a single video sequence, research has focused on statistical approaches, with particle filtering being the most successful. We constructed a motion capture system that recovers the 3D pose of a human arm from a single video sequence. Our system implements the annealed particle filter, which involves an iterative predict-compare-resample process. It maintains a set of predicted poses defined by a skeletal model. Image features are extracted from each video frame and provide evidence to support predicted poses. A method was developed to compare these 2D features with the 3D poses. The poses that fit best have a greater chance of surviving the annealing process. Test footage was created to evaluate the accuracy of the motion recovered by the system. A stereo view of each scene was recorded, which enabled the actual movement of the arm to be determined. This was compared with the motion recovered by the system to calculate an error at each frame. Several different tests were run on three different video sequences. The results varied depending on the type of movement being captured. They highlight the difficulty of depth perception when dealing with a monocular view.
Related links: WAITTA, Seminar presented at WAITTA
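
The predict-compare-resample loop mentioned in the abstract above can be sketched in a few lines of Python. This is a toy, one-dimensional illustration and not the system built in the project: a single number stands in for the skeletal pose, a squared difference from a measurement stands in for the comparison of image features with predicted poses, and the names `annealed_frame` and `track`, the annealing schedule `betas`, and the noise levels are all assumptions.

```python
import math
import random


def annealed_frame(particles, error, rng, betas=(0.2, 0.5, 1.0), spread=0.3):
    """Run the annealing layers for one frame and return the new particle set."""
    for layer, beta in enumerate(betas):
        # Predict: diffuse each particle; the noise shrinks with every layer.
        noise = spread * 0.5 ** layer
        predicted = [p + rng.gauss(0.0, noise) for p in particles]
        # Compare: a low error gives a high weight; beta sharpens the
        # weighting from layer to layer.
        errors = [error(p) for p in predicted]
        best = min(errors)
        weights = [math.exp(-beta * (e - best)) for e in errors]
        # Resample: well-fitting particles are more likely to survive.
        particles = rng.choices(predicted, weights=weights, k=len(predicted))
    return particles


def track(measurements, n_particles=200, sigma=0.05, seed=0):
    """Estimate a 1-D state (for example a joint angle) for each frame."""
    rng = random.Random(seed)
    particles = [rng.uniform(-1.0, 2.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        def error(p, z=z):
            return (p - z) ** 2 / (2.0 * sigma ** 2)
        particles = annealed_frame(particles, error, rng)
        estimates.append(sum(particles) / len(particles))
    return estimates
```

Calling `track` on a list of per-frame measurements returns one estimate per frame. In a real pose tracker the state would be a vector of joint parameters and the error function would project each predicted pose into the image before comparing it with the extracted features.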

BE(SE) project title: Ball and Beacon Recognition in Robot Soccer
Student: Benjamin Philip Shaw
Supervisors: Dr. Du Huynh and Dr. Wei Liu
Abstract:
Robot vision can be described as the processing of digital images in a real-time environment. It is often used to obtain information about a robot's surroundings, and is commonly integrated with well-established higher-level modules. It is therefore imperative that any images processed and used be of the highest degree of accuracy. Accuracy is essential in ensuring that other processes controlling the robot can function in a correct manner. Furthermore, for a real-time environment it is also crucial that the image processing is done efficiently. Thus there is still room for further research, and the challenge is posed: to improve the efficiency and accuracy of vision modules. This thesis reviews existing vision modules used in the robot soccer environment with Sony AIBO robots and attempts to improve upon ball and beacon recognition algorithms by incorporating shape information. The recognition of these objects using the new algorithms is then tested in a real-time environment and the results are compared to those of current algorithms. Finally, the efficiency of the new and old algorithms is analysed to determine whether or not the new methods of ball and beacon recognition are suitable to be used in robot soccer.


2005:

Honours project title: A Motion Capture Implementation for 3D Cartoon Movies
Student: Robert Budiman
Supervisors: Dr. Du Huynh and A.Prof. Mohammed Bennamoun
Abstract:
There are many ways of describing motion capture. In its simplest form, motion capture is defined as the creation of a 3D representation of a live performance. Traditionally, animation was created using techniques such as rotoscoping. This technique uses a device called a rotoscope, which enables animators to trace the live-action movement of an actor, frame by frame, from pre-recorded film images. With the advance and spread of computer technologies, the concept of rotoscoping was later combined with computers to generate animated characters. This technique is better known as computer animation. Unfortunately, the process of generating animated characters using computer animation is difficult and time consuming. Furthermore, the resulting animated characters may not be realistic enough. Motion capture provides the means to introduce realism to the animated characters by providing a source of motion data for computer animation.
    This project is about implementing a simple marker-based optical motion capture system for 3D cartoon movies. This system currently focuses only on the lower body of the human subject. However, it can be extended to include the upper body in future work. As with all optical motion capture systems, this system requires a tracking algorithm for tracking the movement of markers. Several materials can be used as markers in optical motion capture. More advanced optical motion capture systems often use Light Emitting Diodes (LEDs) and retro-reflective tapes as markers. In this project, table tennis balls are used as markers due to their lightness. Markers are placed on the anatomical joints of the lower body of the human subject. A well-known tracking algorithm called the mean-shift algorithm is used to track the movement of the markers. The tasks involved in this project are camera calibration, manual camera synchronization, de-interlacing, detection of markers, automatic labelling of markers, marker tracking, and 3D reconstruction. The 3D reconstruction of the markers takes the form of a stick figure. The output of this motion capture system can be used by or integrated with other applications, such as Poser, to produce a human-like animation.
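
To make the marker-tracking step more concrete, here is a minimal sketch of a mean-shift iteration in Python. It is an illustration only and not the project's code: it assumes the frame has already been converted into a 2-D map of non-negative weights (for instance, how closely each pixel resembles the marker colour), and the function name `mean_shift` and its parameters are hypothetical. The search window is moved repeatedly to the weighted centroid of the pixels it covers until it stops moving.

```python
def mean_shift(weights, start, radius=4, max_iter=20, tol=1e-3):
    """Climb to a local peak of a 2-D weight map.

    weights: list of rows of non-negative numbers.
    start:   (row, col) initial guess, e.g. the marker position in the
             previous frame.
    Returns the converged (row, col) as floats.
    """
    rows, cols = len(weights), len(weights[0])
    r, c = float(start[0]), float(start[1])
    for _ in range(max_iter):
        # Square window centred on the current (rounded) position.
        ci, cj = int(round(r)), int(round(c))
        total = sum_r = sum_c = 0.0
        for i in range(max(0, ci - radius), min(rows, ci + radius + 1)):
            for j in range(max(0, cj - radius), min(cols, cj + radius + 1)):
                w = weights[i][j]
                total += w
                sum_r += w * i
                sum_c += w * j
        if total == 0.0:
            break                      # nothing to follow inside the window
        new_r, new_c = sum_r / total, sum_c / total
        moved = max(abs(new_r - r), abs(new_c - c))
        r, c = new_r, new_c
        if moved < tol:
            break                      # converged
    return r, c
```

In a tracker, the position returned for one frame would serve as the `start` for the next, so the window follows each marker as it moves.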


2003:

Masters project title: Scanner Video Mosaicing
Student: Chi Chiu Cheng
Supervisors: Dr. Du Huynh and A.Prof. Ryszard Kozera
Abstract:
As the field of view of a camera is smaller than that of a human, visualizing different parts of a scene in one large image is often not possible. Video mosaicing refers to the creation of a panoramic mosaic from a sequence of smaller images. Existing algorithms place strong limitations on image conditions and scene structures, and are often unable to construct panoramic mosaics from images captured by handheld cameras. The constructed panoramic image commonly has three undesirable properties: image blurring, ghosting, and distortion.
    Scanner video mosaicing simulates the scanning of a scene by a 1-D camera. The panoramic image is created by aligning a set of image strips into a manifold, which is formed according to the camera motions. This is known as general manifold projection, which allows mosaicing under more general imaging conditions. The mosaicing process is fully automated; the reconstructed panoramic images are also free from artifacts, such as ghosting, blurring, or image distortion. In the case where a tilted camera is translating or panning across a scene, most mosaicing algorithms produce a curled panoramic image. In this thesis, the Rectified strip mosaicing technique was used in addition to general manifold projection for rectifying strips and straightening the panoramic mosaics. Two motion estimation algorithms, the Refinement Motion Estimator (RME) and the Multi-resolution Incremental Motion Estimator (MIME), are adopted in this thesis. Experiments were conducted to test both algorithms and their output mosaics were compared.