UWA logo

 
Computer Science
4th Year Projects in 2000



Dr Cara MacNish (cara@cs.uwa.edu.au)

Cara obtained her BE(Hons) in Electronic Engineering at UWA in 1987. This was followed by a short period working as a research and development engineer for ACET Ltd, before taking up the Prince of Wales Studentship at Trinity College, Cambridge, in 1988. In 1992 she obtained her Ph.D. in Logic-based Inference Systems from the University of Cambridge, and in March 1992 took up a Lectureship in Computer Science at the University of York. She returned to Australia in 1995 to join the Computer Science Department at UWA, and in 2000 obtained her Chain Saw Operator's Certification.

Cara's research interests include logic and artificial intelligence, automated planning, machine learning, iterative optimisation, software specification and Computer Science education.

For more information see the Intelligent Computing Research Group web site.


Iterative Techniques for Speech Reconstruction

Systems for speech recognition and natural language understanding may have to deal with corrupted speech sequences for many reasons. For example: The last case is particularly pertinent to the Department's Lipreading Project, which is attempting to reconstruct speech from visual information. Here, even if the information is captured perfectly, there will still be ambiguities, since the same mouth patterns may produce different sounds.

The aim of this project is to investigate the use of distributed iterative search techniques, such as Genetic Algorithms, as a way of completing or correcting incomplete or incorrect speech sequences. The idea is to find new sequences that are as "similar" as possible to the recorded sequence while being phonetically feasible (and, ultimately, grammatically and semantically correct).

Intelligent Techniques for Code Analysis and Feedback

Evolutionary Optimisation and Test Sequence Generation

The datlab system is a Java system for analysing data structure implementations submitted by 2nd year students. The system, and some preliminary work on evolutionary test sequence generation, is described in [ MacNish00].

The main approach used by the datlab system for testing student code is to run sequences of test methods on the code, and compare the resulting output and the errors thrown with those of a "master copy" that is known to work correctly. While these test sequences can be generated pseudorandomly, there are a number of problems with this approach. First, errors may only arise from fairly specific sequences of operations. Long pseudorandom sequences are therefore needed to give a high probability of uncovering errors. Since the response time of the system is also important with many students using the system at once, there are practical limits on the size of test sequences used, and on some occasions erroneous code will pass unnoticed.

Secondly, it is important to be able to provide useful feedback to students. One way in which this has been done is to provide students with the sequence of operations that brought about the failure of the submitted code. They can then be encouraged to work through the sequence against a diagrammatic representation of the data structure to identify where its state is corrupted. (The corruption does not, in general, occur in the last method called - this method simply reveals the corrupted state.) If long test sequences are used this process becomes unwieldy and impractical.

Finally, the long pseudorandom sequences provide little feedback to the lecturer about the types of errors frequently being made by students. This information could be very useful for addressing common misunderstandings in lectures and tutorials.

What we seek, therefore, is a way of generating test sequences that fulfill the following goals.

  1. The sequences should uncover "most" errors - in particular, frequently occuring classes of errors should be found. Note that frequency of occurrence of an error has little correlation with frequency of pseudorandom generation of a sequence that uncovers the error. (Project students may wish to explore this correlation.)

  2. Errors should be uncovered by "short" sequences, which can then be used by students to investigate the behaviour of their code and by lecturers to characterise common misunderstandings amoung students.
The aim of these projects is to explore the development and use of iterative optimisation techniques (or "evolutionary" techniques) for test sequence generation. This provides an interesting application area for iterative optimisation techniques for a number of reasons:

At least two projects can be offered in this area, focussing on the following optimisation techniques:

  1. Genetic Algorithms

    Genetic algorithms [Davis91,Goldberg89] are based on the biological process of evolution and natural selection, and are now a well-established technique for iteratively improving solutions in search spaces that are not "well-behaved". A population of individuals (encoded solutions) is "evolved" by crossover and mutation operations, chosen randomly according to some parameters, with the aim of finding successively improved results (solutions with an improved fitness value).

  2. Particle Swarm Optimisers

    Particle swarm optimisers are a less well-known technique that originally grew out of attempts to model the behaviour of flocks of birds and schools of fish. Each individual "flies" through the search space, influenced by its own best position to date and the best position found by the "flock", ideally converging on a near-optimal solution. A lack of understanding of the mathematics of the optimiser led the original authors to cap velocity in order to obtain convergence, thus giving behaviour that appeared closer to a swarm of particles, hence the (somewhat misleading) name.


Return to the 4th year project list