UWA logo

 
Computer Science
4th Year Projects in 2002



Dr Cara MacNish

Introduction: The Future of Intelligent Computation!?

While digital computing hardware gets faster and faster and computers can be programmed to crunch more and more numbers, produce fancier graphics, and communicate over larger and larger networks, on many basic intelligence tasks the average two-year-old leaves them in the shade. So, where to from here..?

Symbolic Computation versus Neural Networks

Symbolic manipulation seems to be essential to much of the "intelligent" behaviour that makes us human. So much so that in the 1960s Newel and Simon famously stated that "a physical symbol system has the necessary and sufficient means for general intelligent action". Much work in artificial intelligence and cognitive science since then has focussed on symbolic manipulation using a digital computing model. While this has had a great deal of success in many applications, it has also had its problems, particularly with regard to computational explosion and dealing with imprecise information.

The neural or connectionist computing model, on the other hand, is well suited to distributed parallel processing and dealing with imprecision, and is thought to better reflect the substrate of human cognition. The difficulty with this model however is how (and even whether) to represent and manipulate the symbolic information thought to be necessary for "high-level" cognitive tasks. How, for example, would a non-linear function approximator (feedforward neural net) process language?

Recurrent Neural Networks

One of the most important capacities that humans have is the ability to remember and base decisions upon present and previous states of the "world" (usually some small part of the real world on which they are focussed). This ability to store state makes it possible to perform symbolic manipulation tasks such as reasoning, counting and performing mathematical operations, understanding language, playing chess, and so on.

Recent research in recurrent neural nets (RNNs) has shown that these networks have the capacity to perform a range of operations that require memory, or state. This in turn enables them to carry out many tasks that depend on sequences or structured information.

The aim of these projects is to replicate and extend some of these results. In general terms, we would like to see to what extent neural networks are capable of tasks that involve stored state. While this is a difficult task, we have the advantage of a proof of concept - human beings!

Individual Projects

  1. Recurrent Neural Networks and Discrete-State Games

    Neural networks are often used in game playing software, for example for chess or draughts, for evaluating game positions (or states). They are typically used by search or reinforcement-learning programs written in a "high-level" language, that are responsible for the "logic" of the game-playing agent. In this sense the "intelligence" of the neural network is limited.

    One or more projects are available investigating the ability of recurrent neural networks to develop more game "intelligence". Examples include investigating the representation of game states in the hidden nodes of a network, attempting to learn state transitions corresponding to legal moves, and combining this with an evaluation module allowing a network to evaluate moves by "looking ahead".

  2. Finding and Categorising Errors in Java Code

    While some programming errors (eg syntactic errors) can be found and categorised by a compiler, others cannot. These are programs that run OK but do the wrong thing. They are often called "logical errors" since the logic of the program is incorrect.

    One way to find logical errors is to run the program through tests and observe the outputs and exceptions to see if the operation of the program is correct. While this tells you if the program is incorrect, it doesn't directly point you to the error.

    The aim of this project is to use machine learning techniques to attempt to learn to categorise logical errors on the basis of observation of the "external" behaviour of code in response to test sequences. Data for the project will be data structure classes submitted to the datlab system in the second year Data Structures and Algorithms unit. A more detailed introduction to the system can be found here.

  3. Teaching a Neural Network to Behave as a Stack

    In order for humans to perform many tasks (such as language processing) they require a mental stack, or first-in-last-out buffer. More formally, there is a class of languages, called context-free languages (CFLs) that cannot be processed without a stack. Examples of CFLs include palindrome languages (strings that read the same in both directions, such as radar or perhaps dammit I'm mad) and languages that include matched parentheses, as found in English and Java (tho English is a little more complex than a CFL).

    Rodriguez (2001) has experimented with recurrent neural networks that learn palindrome languages (among others). The aim of this project is to validate Rodriguez' experiments on palindromes and extend this to attempt to learn a simple stack data structure.

    Primary Reference

    Rodriguez, P., "Simple Recurrent Networks Learn Context-Free and Context-Sensitive Languages by Counting", Neural Computation 13, 2093-2118, 2001.

  4. Learning Turing Machines

    Turing machines are a model of computation devised by the genius Alan Turing in order to define what can be effectively computed. They consist of a finite-state controller that can read and write to an infinite tape. Theoretical results have shown that Turing machines can be implemented entirely in recurrent neural networks. However such theoretical implementations are not robust in the presence of noise, and cannot be learned in practice.

    de Oliveira et al (2001) have propsed an alternative approach for practical implementation of Turing machines in which the tape is considered to be part of the external environment (in the same way a human might use a piece of paper for temporary results). This allows them to focus on the controller, which they show can be implemented as a finite-state machine. Finite-state machines can, however, be robustly learned. The aim of this project is to validate their work and then attempt to learn a controller for a simple Turing machine.

    Primary Reference

    de Oliveira, W. R., de Souto, M. C. P. and Ludermir, T. B., "Agent-Environment Approach to the Simulation of Turing Machines by Neural Networks", Proceedings of the International Joint Conference on Neural Networks (IJCNN-01), 71-76, 2001.

  5. Learning about Intention in Dialogue

    Supervised with Daniel Midgley.

    This project is wicked!

    Humans use many cues other than sentence structure (syntax) and meaning (semantics) in order to understand language. In dialogue, for example, we often disambiguate the meaning of words using our knowledge of the intention of the speaker. When someone asks us "Do you know the time?" we know they are not seeking the literal answer yes or no, but rather are requesting information. Similarly, the above sentence doesn't mean this project is mean or evil.

    In order for computers to understand language they will also need to access these cues. One way to attempt to do this is for the computer to learn to tag the text with additional information. Perez-Ortiz and Forcada (2001) have recently presented an approach to tagging text for parts-of-speech (eg "verb transitive") using recurrent neural networks. The aim of this project is to verify this work and apply it to tagging for intention. The project will use the Verbmobil corpus, which has been used for other approaches to intention tagging in this Department.

    Primary Reference

    Perez-Ortiz, J. A. and Forcada, M. L., "Part-of-Speech Tagging with Recurrent Neural Networks", Proceedings of the International Joint Conference on Neural Networks (IJCNN-01), 1588-1592, 2001.

  6. Music Composition with Recurrent Neural Networks

    Music is an inherently temporal process: the pleasure in music comes from the way in which notes are combined into temporal sequences. Like language, it can also be represented symbolically. In fact music theorists have worked on the idea of "musical grammars": rules and patterns for constructing music. A system for artificial composition must take these rules into account, but must also have variation to avoid sounding mechanistic.

    Chen and Miikkulainen (2001) have experimented with using simple recurrent networks for generating musical sequences. Good musical generators are found by using an evolutionary approach to evolve networks, determining the fitness of the candidate networks by evaluating how well they conform to both general musical rules and those typical of the style of the composer Bartok. Some of their music (along with their paper) can be found at the University of Texas Neural Network Research Group applications page.

    This project will seek to validate and improve upon their work. It would suit someone who has studied music theory and composition, and depending on the extent of that study may require seeking co-supervision.

    Primary Reference

    Chen, C. J. and Miikkulainen, R., "Creating Melodies with Evolving Recurrent Neural Networks", Proceedings of the International Joint Conference on Neural Networks (IJCNN-01), 2241-2246, 2001.

  7. Training Recurrent Networks with Particle Swarm Optimisers

    Backpropagation and related gradient-descent based methods have proven very successful in training standard (first-order) feedforward neural networks where the unit transfer functions are continuously differentiable. In other types of networks, however, they may be either inapplicable, inefficient, or unsuccessful. For example, they cannot be applied to networks in which the unit transfer functions are not continuously differentiable (for example, piecewise linear functions). They are less successful in higher-order networks, such as product unit networks, where the weight search space is more eratic. And they require modification for application to recurrent networks. These are important limitations in sequential processing problems, since concise solutions appear to require second-order recurrent networks, with many theoretical solutions also making use of piecewise linear networks.

    Alternative approaches to training have used global optimisation strategies, particularly genetic or evolutionary algorithms. Moriarty and Miikulainen (1996), for example, use genetic algorithms to evolve networks for reinforcement learning tasks. A broader approach is taken by Ismail and Engelbrecht (2000), who compare gradient descent with a GA, a particle swarm optimiser (PSO) and a leapfrog optimisation routine for training product unit networks. The PSO algorithm is most successful in three of the four functions studied and is more successful than the GA in all four studies.

    The aim of this work is to find efficient strategies for training first and second-order recurrent networks. More specifically this will involve investigating the application of PSO-based algorithms and comparing these with evolutionary approaches and more traditional approaches such as back-propagation through time (BPTT).

    References

    Ismail, A. and Engelbrecht, A. P., "Global optimisation algorithms for training product neural networks", Proceedings of the International Joint Conference on Neural Networks (IJCNN-00), 132-137, 2000.

    Moriarty, D. E. and Miikkulainen, R., "Efficient reinforcement learning through symbiotic evolution", Machine Learning 22, 11-32, 1996.

General

Group Participation

Experience has shown that it can be very beneficial for research students (among others!) to have a group of people with related interests to bounce ideas off. Students undertaking these projects will join the Adaptive Systems Research Group and will be expected to attend and contribute to group meetings.

Tools

The projects will use neural network simulators such as the Matlab Neural Network Toolbox, PDP++ or tlearn, along with a high level language such as Matlab or Java.

Some of these tools provide examples of simulations described in the literature that can be used as a starting point.


Previous years' projects


Return to the 4th year project list