©COPYRIGHT NOTICE: All the documents on this server have been submitted by their authors to scholarly journals or conferences as indicated, for the purpose of non-commercial dissemination of scientific work. The manuscripts are put on-line to facilitate this purpose. These manuscripts are copyrighted by the authors or the journals/conferences in which they were published. You may copy a manuscript for scholarly, non-commercial purposes, such as research or instruction, provided that you agree to respect these copyrights.

©COPYRIGHT NOTICE FOR IEEE PUBLICATIONS: Copyright 1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.



Z. Jiang, D. Q. Huynh, W. Moran, and S. Challa. Tracking Pedestrians using Smoothed Colour Histograms in an Interacting Multiple Model Framework. IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, Sep 2011.  PDF

Abstract: In this paper, we present a method for tracking pedestrians in video sequences captured by a fixed camera. Pedestrians are detected in every video frame using the human detector proposed by Dalal and Triggs. An interacting multiple model method is used to predict and update pedestrian trajectories from the current frame to the next. We employ a stationary model and a constant velocity model in our method to handle cases such as when a pedestrian suddenly stops or changes walking direction. We smooth the colour histogram that describes the appearance of each detected pedestrian using kernel density estimation. Our experimental results show that our tracking method outperforms one that uses the Kalman filter and colour histograms.



S. Sedai, D. Q. Huynh, and M. Bennamoun. Evaluating Shape and Appearance Descriptors for 3D Human Pose Estimation. Conference on Industrial Electronics and Applications (ICIEA) , Beijing, China, Jun 2011.   PDF

Abstract: In this paper, we present a comparative evaluation of several appearance and shape descriptors in the context of 3D human pose estimation. Among the shape descriptors, we evaluate the Discrete Cosine Transform (DCT) and the Histogram of Shape Context (HoSC) descriptors. The five appearance descriptors that we evaluate are all variants of the Histogram of Oriented Gradients (HOG) descriptor. We evaluate these descriptors quantitatively using the HumanEva-I dataset. We report the performance of the descriptors using the Relevance Vector Machine (RVM) regression and K-nearest neighbor (KNN) regression methods. We found that the appearance descriptor computed at multiple spatial regions gave the best performance when RVM regression was used for pose estimation. The DCT descriptor performed the best when KNN regression was used for pose estimation.



S. Sedai, D. Q. Huynh, and M. Bennamoun. Supervised Particle Filter for Tracking 2D Human Pose in Monocular Video. IEEE Workshop on Applications of Computer Vision (WACV), Kona, Hawaii, pp. 367-373, Jan 2011.  PDF

Abstract: In this paper, we propose a hybrid method that combines supervised learning and particle filtering to track the 2D pose of a human subject in monocular video sequences. Our approach, which we call a supervised particle filter method, consists of two steps: the training step and the tracking step.
In the training step, we use a supervised learning method to train the regressors that take the silhouette descriptors as input and produce the 2D poses as output. In the tracking step, the output pose estimated from the regressors is combined with the particle filter to track the 2D pose in each video frame. Unlike the particle filter, our method does not require any manual initialization. We have tested our approach using the HumanEva video datasets and compared it with the standard particle filter and 2D pose estimation on individual frames. Our experimental results show that our approach can successfully track the pose over long video sequences and that it gives more accurate 2D human pose tracking than the particle filter and 2D pose estimation.



Z. Jiang, D. Q. Huynh, W. Moran, S. Challa, and N. Spadaccini. Multiple Pedestrian Tracking using Colour and Motion Models. Digital Image Computing: Techniques and Applications, Sydney, Australia, pp. 328-333, Nov/Dec 2010.  PDF

Abstract: This paper presents a method that combines colour and motion information to track pedestrians in video sequences captured by a fixed camera. Pedestrians are firstly detected using the human detector proposed by Dalal and Triggs which involves computing the histogram of oriented gradients descriptors and classification using a linear support vector machine. For the colour-based model, we extract a 4-dimensional colour histogram for each detected pedestrian window and compare these colour histograms between consecutive video frames using the Bhattacharyya coefficient. For the motion model, we use a Kalman filter which recursively predicts and updates the estimates of the positions of pedestrians in the video frames. We evaluate our tracking method using videos from two pedestrian video datasets from the web. Our experimental results show that our tracking method outperforms one that uses only colour information and can handle partial occlusion.
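
The histogram comparison mentioned in the abstract above can be illustrated with a minimal sketch of the Bhattacharyya coefficient. This is not the authors' code: the function names are hypothetical, and the 4-dimensional colour histogram is assumed to have been flattened into a plain list of bin counts.

```python
import math


def normalise(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(hist_p, hist_q):
    """Return the Bhattacharyya coefficient of two histograms.

    The value lies in [0, 1]: 1 for identical normalised histograms,
    0 for histograms with no overlapping bins.
    Assumption: both histograms are flattened to lists of equal length.
    """
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    p = normalise(hist_p)
    q = normalise(hist_q)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Two pedestrian windows with similar colour content score close to 1.
window_a = [10, 20, 30, 40]
window_b = [12, 18, 33, 37]
similarity = bhattacharyya_coefficient(window_a, window_b)
```

A tracker along these lines would associate a detection in the current frame with the track from the previous frame whose histogram gives the highest coefficient, possibly gated by the motion model's prediction.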



S. Sedai, M. Bennamoun, and D. Q. Huynh. Localized Fusion of Shape and Appearance Features for 3D Human Pose Estimation. British Machine Vision Conference, Aberystwyth, UK, Aug/Sep 2010.   BMVC2010

Abstract: This paper presents a learning-based method for combining the shape and appearance feature types for 3D human pose estimation from single-view images. Our method is based on clustering the 3D pose space into several modular regions and learning the regressors for both feature types and their optimal fusion scenario in each region. This way the complementary information of the individual feature types is exploited, leading to improved performance of pose estimation. We train and evaluate our method using a synchronized video and 3D motion dataset. Our experimental results show that the proposed feature combination method gave more accurate pose estimation than that from each individual feature type.



F. Flitti, M. Bennamoun, D. Q. Huynh, and R. A. Owens. Probabilistic Human Pose Recovery from 2D Images. IEEE Int. Conf. on Image Processing, Hong Kong, pp. 1517-1520, Sep 2010.   PDF

Abstract: Image-based human pose recovery has many applications in different industries such as games, entertainment, physiological rehabilitation and biometrics. This paper presents a new pose estimation algorithm from monocular images based on a nonlinear mapping of human silhouettes, coded using a collection of local image moments, to the pose space using a mixture of Neural Networks (NN) regressors. All parameters are estimated automatically. Experiments and comparative results show a superior performance of the proposed method.


D. Q. Huynh. Metrics for 3D Rotations: Comparison and Analysis. Journal of Mathematical Imaging and Vision, vol. 35, no. 2, pp. 155-164, Oct 2009.  PDF

Abstract:
3D rotations arise in many computer vision, computer graphics, and robotics problems and evaluation of the distance between two 3D rotations is often an essential task. This paper presents a detailed analysis of six functions for measuring distance between 3D rotations that have been proposed in the literature. Based on the well-developed theory behind 3D rotations, we demonstrate that five of them are bi-invariant metrics on SO(3) but that only four of them are boundedly equivalent to each other. We conclude that it is both spatially and computationally more efficient to use quaternions for 3D rotations. Lastly, by treating the two rotations as a true and an estimated rotation matrix, we illustrate the geometry associated with iso-error measures.
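
As an illustration of measuring the distance between two rotations with quaternions, the sketch below computes arccos(|q1 . q2|) for unit quaternions. This is a commonly used quaternion-based distance; whether it matches the exact form of any of the six functions analysed in the paper is an assumption, and the helper names are hypothetical.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Build a unit quaternion (w, x, y, z) from a rotation axis and angle."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("rotation axis must be non-zero")
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)


def quat_distance(q1, q2):
    """Return arccos(|q1 . q2|) for two unit quaternions.

    The absolute value makes q and -q (the same rotation) have distance 0.
    The result lies in [0, pi/2] and equals half the angle of the
    relative rotation between the two inputs.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.acos(dot)


identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
d = quat_distance(identity, quarter_turn)  # half of the 90-degree rotation angle
```

Only a 4-element dot product and one arccos are needed here, which is one way to see why a quaternion representation can be cheaper than working with 3x3 rotation matrices.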


B. Moran, D. Q. Huynh, X. Wang, M. Edwards, A. Harris, and B. F. La Scala. An EM Approach to Mineral Analysis Using Natural Gamma Rays. Digital Signal Processing, vol. 19, pp. 793-808, 2009.   PDF

Abstract:
We describe here a method for the analysis of materials on a conveyor belt using the natural gamma spectra collected with a BGO (Bismuth Germanate) gamma ray detector. This detector collects gamma ray emissions from the Potassium (K), Uranium (U), and Thorium (Th) atoms in the materials. Based on these data, and using a Poisson model for the data generation, a statistical model is proposed and an approximate maximum likelihood (ML) technique based on the expectation-maximization (EM) algorithm is then used to estimate the amount of each of the three elements in the material. The statistical model is further refined to incorporate parameters of drift in the detector and an estimation technique for this is developed and tested against real data. The Cramér-Rao lower bounds for the estimators are calculated.


D. Q. Huynh, A. S. Saini, and W. Liu. Evaluation of Three Local Descriptors on Low Resolution Images for Robot Navigation. Image and Vision Computing New Zealand, Wellington, New Zealand, pp. 113-118, Nov 2009.   PDF

Abstract:
This paper presents an evaluation of the SIFT (Scale Invariant Feature Transform), Colour SIFT (CSIFT), and SURF (Speeded Up Robust Feature) descriptors on very low resolution images. The performance of the three descriptors is compared on the precision and recall measures using ground truth correct matching data. Our experimental results show that both SIFT and Colour SIFT are more robust under changes of viewing angle and viewing distance but SURF is superior under changes of illumination and blurring. In terms of computation time, the SURF descriptors offer themselves as a good alternative to SIFT and CSIFT.
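
The precision and recall measures used in the evaluation above can be sketched as follows. The representation of a match as an (index in image 1, index in image 2) pair and the function name are assumptions made for illustration; the paper's exact evaluation protocol may differ.

```python
def precision_recall(matches, ground_truth):
    """Compute precision and recall of descriptor matches.

    matches      -- iterable of (i, j) pairs returned by the matcher
    ground_truth -- iterable of (i, j) pairs known to be correct
    Precision is the fraction of returned matches that are correct;
    recall is the fraction of correct correspondences that were returned.
    """
    matches = set(matches)
    ground_truth = set(ground_truth)
    correct = matches & ground_truth
    precision = len(correct) / len(matches) if matches else 0.0
    recall = len(correct) / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Four matches returned, two of them correct, out of five true correspondences.
returned = [(0, 0), (1, 1), (2, 3), (3, 2)]
truth = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
precision, recall = precision_recall(returned, truth)
```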


D. Q. Huynh. Frequency Estimation of Musical Signals using STFT and Multitapers. Proc. 6th International Symposium on Image and Signal Processing and Analysis, Salzburg, Austria, pp. 34-39, Sep 2009.  PDF 

Abstract:
This paper presents a detailed analysis and comparison of the Short-Time Fourier Transform (STFT) and Thomson’s multitaper method for frequency estimation of musical signals from a classical guitar. We show that more accurate frequency estimates can be obtained by taking into account the frequencies in a small neighbourhood around the identified frequency peaks. We also demonstrate that the multitaper method yields better frequency estimates than those from the STFT while the extra computation time required is almost negligible.



S. Sedai, M. Bennamoun, and D. Q. Huynh. Context-based Appearance Descriptor for 3D Human Pose Estimation from Monocular Images. Digital Image Computing: Techniques and Applications, Melbourne, Australia, Dec 2009.  PDF

Abstract:
In this paper we propose a novel appearance descriptor for 3D human pose estimation from monocular images using a learning-based technique. Our image-descriptor is based on the intermediate local appearance descriptors that we design to encapsulate local appearance context and to be resilient to noise. We encode the image by the histogram of such local appearance context descriptors computed in an image to obtain the final image-descriptor for pose estimation. We name the final image-descriptor the Histogram of Local Appearance Context (HLAC). We then use Relevance Vector Machine (RVM) regression to learn the direct mapping between the proposed HLAC image-descriptor space and the 3D pose space. Given a test image, we first compute the HLAC descriptor and then input it to the trained regressor to obtain the final output pose in real time. We compared our approach with other methods using a synchronized video and 3D motion dataset. We compared our proposed HLAC image-descriptor with the Histogram of Shape Context and Histogram of SIFT-like descriptors. The evaluation results show that the HLAC descriptor outperforms both of them in the context of 3D human pose estimation.



S. Sedai, F. Flitti, M. Bennamoun, and D. Huynh. 3D Human Pose Estimation from Static Images using Local Features and Discriminative Learning. Proc. 19th IEEE International Conference on Image Analysis and Recognition (ICIAR), Halifax, Canada, Jul 2009.   PDF

Abstract: In this paper an approach to recover the 3D human body pose from static images is proposed. We adopt a discriminative learning technique to directly infer the 3D pose from appearance-based local image features. We use a simplified Gradient Location and Orientation Histogram (GLOH) as our image feature representation. We then employ the gradient tree-boost regression to train a discriminative model for mapping from the feature space to the 3D pose space. The training and evaluation of our algorithm were conducted on the walking sequences of a synchronized video and 3D motion dataset. We show that appearance-based local features can be used for pose estimation even in cluttered environments. At the same time, the discriminatively learned model allows the 3D pose to be estimated in real time.



D. Q. Huynh and A. Heyden. Recursive Structure and Motion Estimation from Noisy Uncalibrated Video Sequences. Proc. 19th IEEE International Conference on Pattern Recognition, Tampa, FL, USA, 8-11 December 2008.   PDF

Abstract: This paper builds on a novel framework of hybrid matching constraints for estimation of structure and recovery of camera focal length and motion, combining the advantages of both discrete and continuous methods. Our recursive method can deal with both image noise and outliers. The system is an extension of the epipolar hybrid matching constraints in conjunction with a simple structure estimation scheme using standard triangulation. The extension enables the system to deal with varying focal length of the camera. The structure obtained from some previous image frames is used to improve estimates of the camera focal length and motion for the current image frame. These are, in turn, used to refine the structure. Finally, a RANSAC outlier rejection scheme is employed to reject outlier tracks, inevitably obtained from any tracker. The performance of the proposed system is demonstrated on simulated experiments.



B. Moran, D. Huynh, M. Edwards, A. Harris, X. Wang, and B. La Scala. On-Belt Analysis of Minerals Using Naturally Occurring Gamma Radiation. Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 3669-3672, Las Vegas, Nevada, U.S.A., 30 March - 4 April 2008.   PDF

Abstract: We describe a method to analyze materials on a conveyor belt using natural gamma spectra collected with a BGO (Bismuth Germanate) gamma ray detector, which collects emissions from Potassium (K), Uranium (U), and Thorium (Th) in the materials. A statistical model is proposed based on a Poisson process and an approximate maximum likelihood (ML) technique via the expectation-maximization (EM) algorithm is then used to estimate the amount of each of the three elements in the material. A refinement of the statistical model is used to estimate linear drift in the detector.


D. Deluca-Cardillo, D. Q. Huynh, and M. Bennamoun. 3D Pose Recovery of the Human Arm from a Single View. Proc. of Image and Vision Computing New Zealand, pages 46-51, Hamilton, New Zealand, 5-7 December 2007.  PDF

Abstract:
Markerless motion capture of humans in video sequences is a challenging problem and requires advanced visual tracking techniques. This paper analyses the annealed particle filter for the 3D pose recovery of the human arm in a strict setting and with ground truth information. For evaluation purposes, we focus on the pose recovery of the human arm only so the dimension of the search space is small. In our experiments, video sequences of two calibrated cameras were captured to obtain the ground truth of the 3D pose in each frame; however, only one video sequence was used for motion capture, so the occlusion problems are currently not considered. The accuracy of the annealed particle filter was evaluated against the number of particles and the number of layers used. 


D. Wedge, D. Huynh, and P. Kovesi. Using Space-Time Interest Points for Video Sequence Synchronization. IAPR Conference on Machine Vision Applications, pages 190-194, Tokyo, Japan, 16-18 May 2007.   PDF

Abstract: We introduce an algorithm for synchronizing two video sequences recorded by stationary cameras. It extends common RANSAC-based approaches that recover either a homography or a fundamental matrix from putatively matched spatial features in two images. In our algorithm, we detect space-time interest points in each sequence which represent events such as objects changing direction, and putatively matching points from each sequence are determined. A nested RANSAC framework on these putative matches is then used to firstly recover the frame offset and ratio of frame rates of the two sequences, then either a homography or a fundamental matrix relating the two views, depending on the type of motion contained within the sequences. No camera calibration or object tracking is required. Real sequences containing motion either on a plane or in free space are synchronized and it is demonstrated that this approach is successful in recovering the ratio of frame rates, the frame offset, and the homography or fundamental matrix relating the two sequences.
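
The temporal part of the synchronization described above (recovering the frame offset and the ratio of frame rates) can be illustrated with a simplified RANSAC line fit. This sketch is not the paper's nested RANSAC: it ignores the homography or fundamental matrix stage, assumes each putative match has already been reduced to a pair of frame indices (t1, t2), and uses hypothetical function names and synthetic data.

```python
import random


def fit_temporal_line(m1, m2):
    """Fit t2 = ratio * t1 + offset through two (t1, t2) matches."""
    (t1a, t2a), (t1b, t2b) = m1, m2
    ratio = (t2b - t2a) / (t1b - t1a)
    offset = t2a - ratio * t1a
    return ratio, offset


def ransac_temporal(matches, tol=0.5, iterations=200, seed=0):
    """Estimate (ratio, offset) from putative matches containing outliers.

    Repeatedly fits a line to two random matches and keeps the model
    supported by the largest number of matches within `tol` frames.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        m1, m2 = rng.sample(matches, 2)
        if m1[0] == m2[0]:
            continue  # degenerate sample: the line would be vertical
        ratio, offset = fit_temporal_line(m1, m2)
        inliers = [m for m in matches
                   if abs(m[1] - (ratio * m[0] + offset)) <= tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (ratio, offset), inliers
    return best_model, best_inliers


# Synthetic matches: sequence 2 runs at twice the frame rate, offset by 5 frames.
true_matches = [(t, 2 * t + 5) for t in range(10)]
false_matches = [(3, 40), (7, 2), (12, 90), (15, 0)]
model, inliers = ransac_temporal(true_matches + false_matches)
```

In the approach the abstract describes, a spatial model (homography or fundamental matrix) would then be estimated from the matches that survive this temporal stage.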


D. Wedge, D. Huynh, and P. Kovesi. Motion Guided Video Sequence Synchronization. In LNCS 3852, Proc. Asian Conference on Computer Vision, Springer-Verlag, pages 832-841, Hyderabad, India, 13-16 January 2006.   SPRINGER

Abstract:
We present an algorithm that synchronizes two short video sequences where an object undergoes ballistic motion against stationary scene points. The object's motion and epipolar geometry are exploited to guide the algorithm to the correct synchronization in an iterative manner. Our algorithm accurately synchronizes videos recorded at different frame rates, and takes few iterations to converge to sub-frame accuracy. We use synthetic data to analyze our algorithm's accuracy under the influence of noise. We demonstrate that it accurately synchronizes real video sequences, and evaluate its performance against manual synchronization.


D. Q. Huynh and A. Heyden. Scene Point Constraints in Camera Auto-Calibration: An Implementational Perspective. Image and Vision Computing journal, vol 23, no 8, pp. 747-760, August 2005.   PDF

Abstract:
We present a scheme for incorporating scene constraints into the auto-calibration process for the structure and motion recovery problem. The steps covered by the scheme include projective factorization of the joint image measurement matrix, recovery of the absolute dual quadric, the upgrade from projective structure to its Euclidean counterpart, and incorporation of constraints from orthogonal scene planes into bundle adjustment. The focus of the paper is on the implementation details of all these steps and discussion of the various issues that arose. We have tested the scheme on both synthetic and real image data and found that it is more advantageous to incorporate into camera auto-calibration and bundle adjustment as many scene constraints as are available rather than performing auto-calibration and bundle adjustment alone.


D. Q. Huynh, R. Hartley, and A. Heyden. Outlier Correction in Image Sequences for the Affine Camera. Proc. IEEE International Conference on Computer Vision, pp. 585-590, Nice, France, 11-19 October 2003.   PDF

Abstract:
It is widely known that, for the affine camera model, both shape and motion can be factorized directly from the so-called image measurement matrix constructed from image point coordinates. The ability to extract both shape and motion from this matrix by a single SVD operation makes this shape-from-motion approach attractive; however, it cannot deal with missing feature points and, in the presence of outliers, applying SVD directly to the matrix would yield highly unreliable shape and motion components. In this paper, we present an outlier correction scheme that iteratively updates the elements of the image measurement matrix. The magnitude and sign of the update to each element depend on the residual robustly estimated in each iteration. The result is that outliers are corrected and retained, giving improved reconstruction and smaller reprojection errors. Our iterative outlier correction scheme has been applied to both synthesized and real video sequences. The results obtained are remarkably good.


D. Q. Huynh and A. Heyden. Robust Factorization for the Affine Camera: Analysis and Comparison. Seventh International Conference on Control, Automation, Robotics and Vision, pp. 126-131, Singapore, 2-5 December 2002.    PDF

Abstract:
Based on our previous work on the use of subspace distances for the outlier detection problem in video sequences under affine projection, this paper reports our further analysis of the problem and presents two algorithms for computing the reprojection errors of image features in the outlier detection process.  Extensive experiments on real video sequences have been conducted to verify the performance of the algorithms. The key contributions of the paper are presentation of the relationship between subspace distances and reprojection errors and demonstration that reprojection errors can be estimated without explicitly computing the projective structure.


A. Heyden and D. Q. Huynh. Auto-calibration via the Absolute Quadric and Scene Constraints. International Conference on Pattern Recognition, Vol. 2, pp. 631-634, Quebec, Canada, 11-15 August 2002.   PDF

Abstract:
A scheme is described for incorporation of scene constraints into the structure from motion problem. Specifically, the absolute quadric is recovered with constraints imposed by orthogonal scene planes. The scheme involves a number of steps. A projective reconstruction is first obtained, followed by a linear technique to form an initial estimate of the absolute quadric. A nonlinear iteration then refines this quadric and the camera intrinsic parameters to upgrade the projective reconstruction to Euclidean. Finally, a bundle adjustment algorithm optimizes the Euclidean reconstruction to give a statistically optimal result. This chain of algorithms is essentially the same as used in auto-calibration and the novelty of this paper is the inclusion of orthogonal scene plane constraints in each step. The algorithms involved are demonstrated on both simulated and real data showing the performance and usability of the proposed scheme.


D. Q. Huynh, A. Heyden, and S. Khan. A Scheme for Combining Auto-Calibration and Scene Constraints. Asian Conference on Computer Vision, Vol 2, pp. 436-441, Melbourne, Australia, 23-25 January 2002.   PDF

Abstract:
A scheme for combining auto-calibration and scene constraints in the "structure-from-motion" problem is proposed.  This scheme focuses on the recovery of the absolute quadric using auto-calibration while imposing orthogonal scene plane constraints.  First, an initial estimate of the absolute quadric is obtained using a linear method.  A nonlinear constrained optimization step is then applied to refine this quadric and the camera intrinsic parameters to upgrade the estimated projective reconstruction to Euclidean.  Finally, a bundle adjustment algorithm optimizes the Euclidean reconstruction to give a statistically optimal result.  Constraints from orthogonal scene planes are applied to the initial estimation and refinement steps of the absolute quadric.  The performance of the scheme is demonstrated on both simulated and real video data.



D. Q. Huynh and A. Heyden. Outlier Detection in Video Sequences under Affine Projection. Proc. IEEE Conf. on Computer Vision and Pattern Recognition, Kauai, Hawaii, 9-14 December 2001.    PDF

Abstract:
A novel robust method for outlier detection in structure and motion recovery for affine cameras is presented.  It is an extension of the well-known Tomasi-Kanade factorization technique designed to handle outliers.  It can also be seen as an importation of the LMedS technique or RANSAC into the factorization framework.  Based on the computation of distances between subspaces, it relates closely with the subspace-based factorization methods for the perspective case presented by Sparr and others and the subspace-based factorization for affine cameras with missing data by Jacobs. Key features of the method presented here are its ability to compare different subspaces and the complete automation of the detection and elimination of outliers.  Its performance and effectiveness are demonstrated by experiments involving simulated and real video sequences.


A. H. H. Ngu, Q. Z. Sheng, D. Q. Huynh, and R. Lei. Combining Multi-visual Features for Efficient Indexing in a Large Image Database. The International Journal on Very Large Data Bases, vol 9, no 4, pp. 279-293, May 2001.   VLDBJ

Abstract:
The optimized distance-based access methods currently available for multidimensional indexing in multimedia databases are developed based on two major assumptions: a suitable distance function is known a priori and the dimensionality of the image features is low. It is not trivial to define a distance function that best mimics human visual perception in image similarity measurement. Reducing high-dimensional features in images using the popular Principal Component Analysis (PCA) might not always be possible due to the non-linear correlations that may be present in the feature vectors.
We propose in this paper a fast and robust hybrid method for nonlinear dimension reduction of composite image features for indexing in a large image database. This method incorporates both the PCA and non-linear neural network techniques to reduce the dimensions of feature vectors so that an optimized access method can be applied. To incorporate human visual perception into our system, we also conducted experiments that involved a number of subjects classifying images into different classes for neural network training. We demonstrate that not only can our neural network system reduce the dimensions of the feature vectors but that the reduced dimensional feature vectors can also be mapped to an optimized access method for fast and accurate indexing.
hardcopies available  


D. Q. Huynh. The Cross Ratio: A Revisit to Its Probability Density Function. British Machine Vision Conference, Vol. 1, pp. 262-271, 11-14 September 2000.    PS  |   PDF

Abstract:
The cross ratio has wide applications in computer vision because of its invariance under projective transformation. In active vision where the projections of quadruples of collinear landmark points in the scene are tracked in the image sequence for robot localisation or online camera calibration, one often needs to compute cross ratios from noisy image data for some subsequent operations. Being able to assess the reliability of each computed cross ratio value against a known level of image noise is therefore of importance. This aim motivates our research to derive the probability density function (p.d.f.) of the cross ratio based on the normality assumption of the associated random variables and to investigate empirical cases where this assumption fails to hold. Although an analytical formula for the general p.d.f. of the cross ratio has not been achieved, our research results show that (i) the distance between the closest pair of collinear points is a significant factor that determines the shape of the p.d.f. of the cross ratio and (ii) a good estimate of the cross ratio can be obtained if the points of the quadruple are sufficiently far apart.
online conference proceedings  
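
The projective invariance of the cross ratio that the abstract above relies on can be checked numerically. The sketch below is illustrative only: it uses one common ordering convention for the cross ratio, represents the four collinear points by 1D coordinates along their line, and applies an arbitrarily chosen 1D projective map. Exact rational arithmetic is used so that the equality is not blurred by rounding.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross ratio (a, b; c, d) of four collinear points given by 1D coordinates.

    Convention assumed here: ((c - a) * (d - b)) / ((c - b) * (d - a)).
    """
    return ((c - a) * (d - b)) / ((c - b) * (d - a))


def projective_map(x, alpha, beta, gamma, delta):
    """1D projective transformation x -> (alpha*x + beta) / (gamma*x + delta).

    Requires alpha*delta - beta*gamma != 0 for the map to be invertible.
    """
    return (alpha * x + beta) / (gamma * x + delta)


# Four collinear points and their images under an arbitrary projective map.
points = [Fraction(0), Fraction(1), Fraction(2), Fraction(4)]
mapped = [projective_map(x, 2, 1, 1, 3) for x in points]

before = cross_ratio(*points)
after = cross_ratio(*mapped)
```

Here `before` and `after` are equal even though the individual coordinates change. With noisy image measurements the computed value is perturbed, which is the situation the paper's probability density analysis addresses.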


D. Q. Huynh, Y. S. Chou, and H. T. Tsui. Semi-automatic Metric Reconstruction of Buildings from Self-calibration: Preliminary Results on the Evaluation of a Linear Camera Self-calibration Method. International Conference on Pattern Recognition, Vol. 4, pp. 599-602, 3-8 September 2000.   PS  |   PDF

Abstract:
Algorithms for camera self-calibration vary depending on the number of images used, the camera model assumed, and the number of intrinsic parameters that need to be recovered. In this paper, we investigate the linear self-calibration method proposed by Newsam et al. for our project on 3D reconstruction of architectural buildings. This self-calibration method assumes that the principal point is known and that the camera has square pixels and no skew. It allows 3D shape to be reconstructed from two images while giving the camera the freedom to vary its focal length. Since the paper by Newsam et al. reports only the theoretical work on camera self-calibration, in this paper, we compare the focal lengths obtained from their method with those computed from Tsai's calibration method. Our experimental results show that the focal lengths from the two methods differed by less than 5% and the reconstructed 3D shape was very good in that angles were well preserved. Our future research will focus on further improving 3D reconstruction in the presence of image noise and on developing this method into a package for 3D reconstruction of buildings that can be used by a layperson.


D. Q. Huynh. Affine Reconstruction from Monocular Vision in the Presence of a Symmetry Plane. International Conference on Computer Vision, pp. 476-482, Kerkyra, Greece, 20-27 September 1999.   PS   PDF

Abstract:
This paper reports a closed-form solution for reconstructing a scene up to an affine transformation from a single image in the presence of a symmetry plane. Unlike scene reconstruction in stereo vision, the affine reconstruction process discussed in this paper does not require any knowledge about camera parameters or camera orientation relative to the scene, so camera self-calibration is totally eliminated. By setting in the scene a plane mirror which creates lateral symmetric world points for an uncalibrated, perspective camera to capture, the linear equations involved in the reconstruction process can be derived from two sets of similar triangles. The affine reconstruction is relative to an arbitrary affine coordinate frame implicitly defined on the mirror plane. Also involved in the process are the estimation of the epipole and recovery of the image-to-mirror plane homography. The implementation of the epipole estimation is detailed. A real experiment is presented to demonstrate the reconstruction.


D. Q. Huynh, R. A. Owens, and P. E. Hartmann. Calibrating a Structured Light Stripe System: A Novel Approach. International Journal of Computer Vision, vol 33, no 1, pp. 73-86, September 1999.

Abstract:
The problem associated with calibrating a structured light stripe system is that known world points on the calibration target do not normally fall onto every light stripe plane illuminated from the projector.  We present in this paper a novel calibration method that employs the invariance of the cross ratio to overcome this problem.  Using 4 known non-coplanar sets of 3 collinear world points and with no prior knowledge of the perspective projection matrix of the camera, we show that world points lying on each light stripe plane can be computed.  Furthermore, by incorporating the homography between the light stripe and image planes, the 4 x 3 image-to-world transformation matrix for each stripe plane can also be recovered.  The experiments conducted suggest that this novel calibration method is robust, economical, and is applicable to many dense shape reconstruction tasks.
hardcopies available
earlier, shorter conference version


M. J. Brooks, L. de Agapito, D. Q. Huynh, and L. Baumela. Towards Robust Metric Reconstruction Via a Dynamic Uncalibrated Stereo Head.  Image and Vision Computing, vol 16, no 14, pp. 989-1002, December 1998.

Abstract:
We consider the problem of metrically reconstructing a scene viewed by a moving stereo head. The head comprises two cameras with coplanar optical axes arranged on a lateral rig, each camera being free to vary its angle of vergence.  Under various constraints, we derive novel explicit forms for the epipolar equation, and show that a static stereo head constitutes a degenerate camera configuration for carrying out self-calibration. The situation is retrieved by consideration of a stereo head undergoing ground plane motion, and new closed-form solutions for self-calibration are derived. An error analysis reveals that reconstruction is adversely affected by inward-facing camera vergence angles that are similar in value, and by a principal point location whose horizontal component is in error. It is also shown that the adoption of domain-specific robust techniques for computation of the fundamental matrix can significantly improve the quality of scene reconstruction.  Experiments conducted with dynamic stereo head images confirm that avoidance of near-degenerate configurations and use of robustness techniques are essential if reliable reconstructions are to be attained.
hardcopies available


L. de Agapito, D. Q. Huynh, and M. J. Brooks. Self-calibrating a Stereo Head: An Error Analysis in the Neighbourhood of Degenerate Configurations. International Conference on Computer Vision, pp. 747-753, Bombay, India, 4-7 January 1998.

Abstract:
We show that the self-calibration of a stereo head from corresponding points in an image pair is in certain circumstances prone to considerable error.  A novel error analysis reveals that the automated determination of relative orientation and focal length is adversely affected when the cameras verge inwards by a similar amount, and when the principal point locations have a horizontal error.  This analysis is facilitated by the adoption of closed-form solutions for self-calibration from previous work of the authors.  It is also shown that estimation of the fundamental matrix associated with a stereo head image pair is improved when a domain-specific parameterization and associated computational techniques are adopted.  Experiments conducted with such image pairs suggest that, given cognisance of sensitive configurations and adoption of the revised method of fundamental matrix estimation, robust reconstructions are attainable.  This is demonstrated on the problem of metrically reconstructing a scene from two pairs of images obtained by an uncalibrated stereo head undergoing unknown ground-plane motion.
hardcopies available


D. Q. Huynh. Calibration of a Structured Light System: A Projective Approach. Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 225-230, Puerto Rico, 17-19 June 1997.    PS |  PDF

Abstract: We present in this paper a novel calibration method that uses cross ratio to compute world points falling onto any given light stripe plane of a structured light system.  We show that, by using 4 known non-coplanar sets of 3 collinear world points, the direct 4 x 3 image-to-world transformation matrix for each light stripe plane can also be recovered from plane-to-plane homography.  Preliminary experiments conducted with a calibration target and a mannequin suggest that this novel calibration method is robust and is applicable to many shape measurement tasks.
(now superseded by the IJCV journal article above)


G. N. Newsam, D. Q. Huynh, M. J. Brooks, and H.-P. Pan. Recovering Unknown Focal Lengths in Self-Calibration: An Essentially Linear Algorithm and Degenerate Configurations. International Archives of Photogrammetry and Remote Sensing, vol. XXXI, part B3, commission III, pp. 575-580, Vienna, Austria, 9-19 July 1996.    PS |  PDF

Abstract: If sufficiently many pairs of corresponding points in a stereo image pair are available to construct the associated fundamental matrix, then it has been shown that 5 relative orientation parameters and 2 focal lengths can be recovered from this fundamental matrix.  This paper presents a new and essentially linear algorithm for recovering focal lengths.  Moreover, the derivation of the algorithm also provides a complete characterisation of all degenerate configurations in which focal lengths cannot be uniquely recovered.  There are two classes of degenerate configurations: either one of the optical axes of the cameras lies in the plane spanned by the baseline and the other optical axis; or one optical axis lies in the plane spanned by the baseline and the vector that is orthogonal to both the baseline and the other axis.  The result that the first class of configurations (i.e. one in which the optical axes are coplanar) is degenerate is of some practical importance since it shows that self-calibration of unknown focal lengths is not possible in certain stereo heads, a configuration widely used for binocular vision systems in robotics.


M. J. Brooks, L. de Agapito, D. Q. Huynh, and L. Baumela. Direct Methods for Self-Calibration of a Moving Stereo Head. European Conference on Computer Vision, vol. 2, pp. 415-426, Cambridge, UK, April 1996.    PS |   PDF

Abstract:
We consider the self-calibration problem in the special context of a stereo head, where the two cameras are arranged on a lateral rig with coplanar optical axes, each camera being free to vary its angle of vergence.  Under various constraints, we derive explicit forms for the epipolar equation, and show that a static stereo head constitutes a degenerate camera configuration for carrying out self-calibration in the sense of Hartley [4].  The situation is retrieved by consideration of a special kind of motion of the stereo head in which the baseline remains confined to a plane.  New closed-form solutions for self-calibration are thereby obtained, inspired by an earlier discrete motion analysis of Zhang et al. [11].  Key factors in our approach are the development of explicit, analytical forms of the fundamental matrix, and the use of the vergence angles in the parameterisation of the problem.


H-P. Pan, D. Q. Huynh, and G. Hamlyn. Two-Image Resituation: Practical Algorithm. SPIE Videometrics IV (part of SPIE's International Symposium on Intelligent Systems & Automated Manufacturing), vol. 2598, pp. 174-190, Philadelphia, Pennsylvania, USA, 22-27 October 1995.    PS |  PDF

Abstract:
Two-image resituation refers to the recovery of the geometric configuration of two stereo images.  This involves determining three intrinsic parameters for each image and five relative orientation parameters.  We show here that this can be achieved using only the image coordinates of homologous points, and needs no other control information from object space.  The approach is based on a thorough analysis of epipolar constraints.  The explicit coplanarity equation defined by the intrinsic and relative orientation parameters is recast into a quadratic form whose parameters define a general coplanarity matrix.  This matrix in turn can be written as the product of three matrices, two of which are defined by the intrinsic parameters, and one, called the special coplanarity matrix, is a function of the five relative orientation parameters.  This paper presents a practical procedure for computing all these parameters from only image measurements.  The basic strategy is first to find approximate values via closed-form solutions, and then to iteratively fine-tune them to precise values.  The key steps are: (1) solving for the general coplanarity matrix via a nonlinear least-squares optimisation; (2) solving for two focal lengths from the general coplanarity matrix via a closed-form algebraic solution; (3) determining the special coplanarity matrix from the general coplanarity matrix and the focal lengths; (4) determining the relative orientation parameters including three baseline components and three rotation angles via closed-form solutions; (5) fine-tuning all the explicit parameters via an iterative linearized least-squares solution.  Original or improved solutions are developed for most stages of this procedure.  Finally, the computational theory is tested numerically.
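
The coplanarity (epipolar) constraint described in this abstract can be checked numerically with a small sketch. It is not the procedure of the paper. It assumes that the three intrinsic parameters of each image are a focal length and a principal point, writes the special coplanarity matrix in the common form [t]x R (often called the essential matrix), and uses made-up camera values and helper names throughout.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def matvec(M, v):
    return tuple(dot(row, v) for row in M)


def rot_y(theta):
    """Rotation about the vertical axis (illustrative relative orientation)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def project(X, f, u0, v0):
    """Pinhole projection with assumed intrinsics: focal length f, principal point (u0, v0)."""
    return (f * X[0] / X[2] + u0, f * X[1] / X[2] + v0)


def normalise(p, f, u0, v0):
    """Undo the intrinsic parameters: pixel coordinates -> normalised image ray."""
    return ((p[0] - u0) / f, (p[1] - v0) / f, 1.0)


def coplanarity_residual(p1, p2, intr1, intr2, R, t):
    """x2^T [t]x R x1 for homologous points p1, p2; zero when the constraint holds."""
    x1 = normalise(p1, *intr1)
    x2 = normalise(p2, *intr2)
    return dot(x2, cross(t, matvec(R, x1)))


# Assumed configuration: a point X1 in camera-1 coordinates maps to X2 = R X1 + t.
R, t = rot_y(0.1), (-1.0, 0.0, 0.1)
intr1, intr2 = (800.0, 320.0, 240.0), (750.0, 310.0, 250.0)
X1 = (0.3, -0.2, 5.0)
X2 = tuple(a + b for a, b in zip(matvec(R, X1), t))

p1, p2 = project(X1, *intr1), project(X2, *intr2)
residual = coplanarity_residual(p1, p2, intr1, intr2, R, t)

# Moving the second image point off its epipolar line breaks the constraint.
residual_off = coplanarity_residual(p1, (p2[0], p2[1] + 5.0), intr1, intr2, R, t)
print(residual, residual_off)
```

Applying normalise to both points plays the role of the two intrinsic matrices in the three-matrix product mentioned in the abstract, and the cross product with t followed by the rotation plays the role of the special coplanarity matrix.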


K. C. Ng, B. F. Alexander, S. H. Boey, S. Daly, J. C. Kent, D. Q. Huynh, R. A. Owens and P. E. Hartmann. Biostereometrics -- A noncontact, noninvasive shape measurement technique for bioengineering applications. Journal of the Australasian Physical & Engineering Sciences in Medicine, vol 17, no 3, pp. 124-130, September 1994.    PDF

This paper won a Kenneth Clark prize for the best paper in Volume 17 of the Journal.

Abstract: Recent advances in noncontact, noninvasive shape measurement offer new tools for clinical and research applications in such areas as skin surface measurement, facio-maxillary measurements, rehabilitation and prosthesis, and monitoring of post-operative shape changes in reconstructive/cosmetic surgery.  This paper describes a system based on the triangulation technique which has several advanced features.  The principle of operation and the structure of the system are described.  Examples of successful applications in the biomedical field are detailed.


Last updated in January 2011 by Du Huynh (du@csse.uwa.edu.au)