Introduction: The Future of Intelligent Computation!?
While digital computing hardware gets faster and faster and computers
can be programmed to crunch more and more numbers, produce fancier
graphics, and communicate over larger and larger networks, on many
basic intelligence tasks the average two-year-old leaves them in the
shade. So, where to from here..?
Symbolic Computation versus Neural Networks
Symbolic manipulation seems to be essential to much of the
"intelligent" behaviour that makes us human. So much so that in the
1960s Newel and Simon famously stated that "a physical symbol system
has the necessary and sufficient means for general intelligent
action". Much work in artificial intelligence and cognitive science
since then has focussed on symbolic manipulation using a digital
computing model. While this has had a great deal of success in many
applications, it has also had its problems, particularly with regard
to computational explosion and dealing with imprecise information.
The neural or connectionist computing model, on the other hand, is
well suited to distributed parallel processing and dealing with
imprecision, and is thought to better reflect the substrate of human
cognition. The difficulty with this model however is how (and even
whether) to represent and manipulate the symbolic information thought
to be necessary for "high-level" cognitive tasks. How, for example,
would a non-linear function approximator (feedforward neural net)
process language?
Recurrent Neural Networks
One of the most important capacities that humans have is the ability
to remember and base decisions upon present and previous states
of the "world" (usually some small part of the real world on which
they are focussed). This ability to store state makes it possible to
perform symbolic manipulation tasks such as reasoning, counting and
performing mathematical operations, understanding language, playing
chess, and so on.
Recent research in recurrent neural nets (RNNs) has shown that these
networks have the capacity to perform a range of operations that
require memory, or state. This in turn enables them to carry out many
tasks that depend on sequences or structured information.
The aim of these projects is to replicate and extend some of these
results. In general terms, we would like to see to what extent neural
networks are capable of tasks that involve stored state. While this is
a difficult task, we have the advantage of a proof of concept - human
beings!
Individual Projects
- Recurrent Neural Networks and
Discrete-State Games
Neural networks are often used in game playing software, for example
for chess or draughts, for evaluating game positions (or states). They
are typically used by search or reinforcement-learning programs
written in a "high-level" language, that are responsible for the
"logic" of the game-playing agent. In this sense the "intelligence" of
the neural network is limited.
One or more projects are available investigating the ability of
recurrent neural networks to develop more game
"intelligence". Examples include investigating the representation of
game states in the hidden nodes of a network, attempting to learn
state transitions corresponding to legal moves, and combining this
with an evaluation module allowing a network to evaluate moves by
"looking ahead".
- Finding and Categorising Errors in Java Code
While some programming errors (eg syntactic errors) can be found and
categorised by a compiler, others cannot. These are programs that run
OK but do the wrong thing. They are often called "logical errors"
since the logic of the program is incorrect.
One way to find logical errors is to run the program through tests and
observe the outputs and exceptions to see if the operation of the
program is correct. While this tells you if the program is incorrect,
it doesn't directly point you to the error.
The aim of this project is to use machine learning techniques to
attempt to learn to categorise logical errors on the basis of
observation of the "external" behaviour of code in response to test
sequences. Data for the project will be data structure classes
submitted to the datlab system in the second year Data
Structures and Algorithms unit. A more detailed introduction to the
system can be found here.
- Teaching a Neural Network to
Behave as a Stack
In order for humans to perform many tasks (such as language
processing) they require a mental stack, or first-in-last-out
buffer. More formally, there is a class of languages, called
context-free languages (CFLs) that cannot be processed without
a stack. Examples of CFLs include palindrome languages (strings that
read the same in both directions, such as radar or perhaps
dammit I'm mad) and languages that include matched parentheses,
as found in English and Java (tho English is a little more complex
than a CFL).
Rodriguez (2001) has experimented with recurrent neural networks that
learn palindrome languages (among others). The aim of this project is
to validate Rodriguez' experiments on palindromes and extend this to
attempt to learn a simple stack data structure.
Primary Reference
Rodriguez, P., "Simple Recurrent Networks Learn Context-Free and
Context-Sensitive Languages by Counting", Neural Computation
13, 2093-2118, 2001.
- Learning Turing Machines
Turing machines are a model of computation devised by the genius Alan
Turing in order to define what can be effectively computed. They
consist of a finite-state controller that can read and write to an
infinite tape. Theoretical results have shown that Turing machines can
be implemented entirely in recurrent neural networks. However such
theoretical implementations are not robust in the presence of noise,
and cannot be learned in practice.
de Oliveira et al (2001) have propsed an alternative approach for
practical implementation of Turing machines in which the tape is
considered to be part of the external environment (in the same way a
human might use a piece of paper for temporary results). This allows
them to focus on the controller, which they show can be implemented as
a finite-state machine. Finite-state machines can, however, be
robustly learned. The aim of this project is to validate their work
and then attempt to learn a controller for a simple Turing machine.
Primary Reference
de Oliveira, W. R., de Souto, M. C. P. and Ludermir, T. B.,
"Agent-Environment Approach to the Simulation of Turing
Machines by Neural Networks", Proceedings of the
International Joint Conference on Neural Networks
(IJCNN-01), 71-76, 2001.
- Learning about Intention in Dialogue
Supervised with Daniel Midgley.
This project is wicked!
Humans use many cues other than sentence structure (syntax) and
meaning (semantics) in order to understand language. In dialogue, for
example, we often disambiguate the meaning of words using our
knowledge of the intention of the speaker. When someone asks us
"Do you know the time?" we know they are not seeking the
literal answer yes or no, but rather are requesting
information. Similarly, the above sentence doesn't mean this project
is mean or evil.
In order for computers to understand language they will also need to
access these cues. One way to attempt to do this is for the computer
to learn to tag the text with additional
information. Perez-Ortiz and Forcada (2001) have recently presented an
approach to tagging text for parts-of-speech (eg "verb transitive")
using recurrent neural networks. The aim of this project is to verify
this work and apply it to tagging for intention. The project will use
the Verbmobil corpus, which has been used for other
approaches to intention tagging in this Department.
Primary Reference
Perez-Ortiz, J. A. and Forcada, M. L., "Part-of-Speech Tagging with
Recurrent Neural Networks", Proceedings of the
International Joint Conference on Neural Networks
(IJCNN-01), 1588-1592, 2001.
- Music Composition with Recurrent Neural Networks
Music is an inherently temporal process: the pleasure in music comes
from the way in which notes are combined into temporal
sequences. Like language, it can also be represented symbolically. In
fact music theorists have worked on the idea of "musical grammars":
rules and patterns for constructing music. A system for artificial
composition must take these rules into account, but must also have
variation to avoid sounding mechanistic.
Chen and Miikkulainen (2001) have experimented with using simple
recurrent networks for generating musical sequences. Good musical
generators are found by using an evolutionary approach to evolve
networks, determining the fitness of the candidate networks by
evaluating how well they conform to both general musical rules and
those typical of the style of the composer Bartok. Some of their music
(along with their paper) can be found at the University
of Texas Neural Network Research Group applications page.
This project will seek to validate and improve upon their work. It
would suit someone who has studied music theory and composition, and
depending on the extent of that study may require seeking
co-supervision.
Primary Reference
Chen, C. J. and Miikkulainen, R., "Creating Melodies with Evolving
Recurrent Neural Networks", Proceedings of the
International Joint Conference on Neural Networks
(IJCNN-01), 2241-2246, 2001.
- Training Recurrent Networks with
Particle Swarm Optimisers
Backpropagation and related gradient-descent based methods have proven
very successful in training standard (first-order) feedforward neural
networks where the unit transfer functions are continuously
differentiable. In other types of networks, however, they may be either
inapplicable, inefficient, or unsuccessful. For example, they cannot
be applied to networks in which the unit transfer functions are not
continuously differentiable (for example, piecewise linear
functions). They are less successful in higher-order networks, such as
product unit networks, where the weight search space is more
eratic. And they require modification for application to recurrent
networks. These are important limitations in sequential processing
problems, since concise solutions appear to require second-order
recurrent networks, with many theoretical solutions also making use of
piecewise linear networks.
Alternative approaches to training have used global optimisation
strategies, particularly genetic or evolutionary algorithms. Moriarty
and Miikulainen (1996), for example, use genetic algorithms to evolve
networks for reinforcement learning tasks. A broader approach is taken
by Ismail and Engelbrecht (2000), who compare gradient descent with a GA, a
particle swarm optimiser (PSO) and a leapfrog optimisation routine for
training product unit networks. The PSO algorithm is most successful
in three of the four functions studied and is more successful than the
GA in all four studies.
The aim of this work is to find efficient strategies for training
first and second-order recurrent networks. More specifically this will
involve investigating the application of PSO-based algorithms and
comparing these with evolutionary approaches and more traditional
approaches such as back-propagation through time (BPTT).
References
Ismail, A. and Engelbrecht, A. P., "Global optimisation algorithms for
training product neural networks", Proceedings of the
International Joint Conference on Neural Networks
(IJCNN-00), 132-137, 2000.
Moriarty, D. E. and Miikkulainen, R., "Efficient reinforcement
learning through symbiotic evolution", Machine
Learning 22, 11-32, 1996.
General
Group Participation
Experience has shown that it can be very beneficial for research
students (among others!) to have a group of people with related
interests to bounce ideas off. Students undertaking these projects
will join the Adaptive
Systems Research Group and will be expected to attend and
contribute to group meetings.
Tools
The projects will use neural network simulators such as the Matlab Neural Network
Toolbox, PDP++ or tlearn, along with a high
level language such as Matlab or Java.
Some of these tools provide examples of simulations described in the
literature that can be used as a starting point.
|