The ISYS Project

Generation and Visualization Mechanisms Based On The Immune System

An Introduction

(IP-REP-002)

John Hunt and Denise Cooke

Centre for Intelligent Systems,
Computer Science Department,
University of Wales, Aberystwyth,
Penglais, Aberystwyth,
Dyfed, SY23 3DB.
Tel: 01970 622537.
Email: jjh@aber.ac.uk

Status: Draft
Date: 1st March 1996
Version: 1.0

Circulation: Open


Abstract

This document is an introduction to the ISYS project. This project is researching machine learning mechanisms based on the Immune System for tasks such as data mining. The report attempts to provide some background to the technique being developed as well as outline the work being carried out on the research project.


Table of Contents

  1. Introduction
  2. The AIS as a machine learning approach
  3. The operation of the AIS
  4. Applying the AIS
  5. The Proposed research
    1. Aims of the Project
    2. Objectives of the Project
  6. Expertise at the University of Wales, Aberystwyth
  7. References

1. Introduction

The Artificial Immune System (AIS) implements a learning technique inspired by the human immune system, a remarkable natural defence mechanism that learns about foreign substances. A number of aspects unique to the immune system make it of particular interest as the basis of a machine learning system. These are:

* It possesses a content addressable memory.

* It learns by experience.

* It is inherently distributed.

* It can forget little-used information (which is why humans need regular vaccinations).

* It is inherently generalist (i.e. it does not become over-taught).

* Theories are available about how it achieves the above, although some are controversial.

However, the immune system has not attracted the same kind of interest from the computing field as the neural operation of the brain or the evolutionary forces used in learning classifier systems. Exceptions to this include modelling the immune system to obtain more biological knowledge (e.g. [3]), the use of an immune network model for optimisation of functions and controllers [1] and the attempt to build an immune-based system specifically for image recognition [4].

The primary aims of the ISYS project can be summarised as:

* To develop a toolkit for building AIS-based systems

* To apply the AIS to a range of application problems

* To refine the theory of the AIS

2. The AIS as a machine learning approach

The development of the AIS [2, 5, 6, 7] had a number of distinct motivations. These were:

* to remove the need for negative examples. This removes the time-consuming and tedious process of identifying or creating such negative examples and of ensuring that no positive examples are accidentally included among them.

* to provide self organisation. Some techniques such as neural networks and genetic algorithms are relatively simple but can take a great deal of time to optimise for a particular application. A self organising system would alleviate this problem. This is particularly important if such systems are to be used by those who wish to benefit from applying them to their application problems rather than to get into the "nitty gritty" of implementation and optimisation.

* to use an explicit symbolic representation. One problem for people using neural networks is that it can be difficult to identify what the network "knows". This is an important issue in many application areas where either the knowledge represented by the learning system is the significant aspect (rather than the results produced by, for example, the network) or where the knowledge must be verified for legal or commercial reasons.

* to provide a content addressable memory. Case based reasoning has been one of the growth areas in AI over the last few years. One of the reasons for this is its relative simplicity; another is its ability to perform a fuzzy match on information held within cases. This ability can be referred to as a content addressable memory and has many desirable features.

* to allow for distributed processing. If very large amounts of data are to be processed then the ability to exploit distributed processing platforms is likely to be an advantage. The immune system is inherently distributed and could therefore be a useful metaphor for a distributed learning system.

These motivations led to the development of the AIS approach, which has been implemented in a number of prototypes. The approach borrows much from the theory of the immune system but does not attempt to be an exact model of it.

To place the AIS within the context of other machine learning approaches, it is illustrative to compare attributes (see table below). The AIS offers noise tolerant, unsupervised learning within a system which is self-organising, does not require negative examples and explicitly represents what it has learnt. Such a system combines the advantages of learning classifier systems with some of the advantages of neural networks, machine induction and case-based retrieval.

[Table: comparison of the attributes of the AIS with those of other machine learning approaches]

3. The operation of the AIS

The AIS comprises a root object, a network of cells, a teaching data set and a test data set. Each cell in the network possesses a pattern matching element which is generated by mimicking the genetic mechanisms by which antibodies are formed in the natural immune system. This enables complex vocabularies and promotes diversity of the pattern matching elements.
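As an illustration of how pattern matching elements might be assembled, the sketch below picks one segment from each of several gene libraries, loosely following the way antibody genes are combined from alternative segments. It is a minimal sketch under stated assumptions: the library contents, the string representation and the names GENE_LIBRARIES, make_pattern and repertoire_size are hypothetical and are not taken from the AIS prototypes.

```python
import random

# Hypothetical gene libraries: each library holds alternative segments.
# The contents are invented for illustration only.
GENE_LIBRARIES = [
    ["AC", "GT", "TT"],
    ["GA", "CC"],
    ["TA", "AG", "CG", "GG"],
]


def make_pattern(libraries, rng):
    """Build one pattern matching element by picking one segment per library."""
    return "".join(rng.choice(library) for library in libraries)


def repertoire_size(libraries):
    """Count the distinct patterns that the libraries can express."""
    size = 1
    for library in libraries:
        size *= len(library)
    return size


if __name__ == "__main__":
    rng = random.Random(1)
    print(make_pattern(GENE_LIBRARIES, rng))
    print(repertoire_size(GENE_LIBRARIES))
```

Even with these three small libraries the number of expressible patterns is the product of the library sizes, which suggests how a modest set of segments can support a diverse population of cells.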

The system exhibits two types of response: primary and secondary. The primary response is the learning phase when the AIS learns about patterns in the input teaching data. The secondary response represents a pattern recognition process during which the AIS attempts to classify new data relative to the data it has seen before.

During the learning phase, input data is inserted into the cell network. The cells in the vicinity of the insertion point are presented with the data. An immune-based matching algorithm is used to establish the match between the data and the cell. If the match value exceeds a threshold, the cell becomes stimulated and produces other cells whose pattern matching element can mutate, which may produce better matches for the input data. These cells join the network of cells. This network acts to reinforce the stimulation level of the better cells and repress poorer cells via a feedback mechanism. The size of the network and the links within the network are dynamically generated by the interaction of the cells.
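The stimulation and mutation step described above can be sketched as follows. This is a simplified illustration, not the AIS implementation: the match measure (fraction of agreeing positions) is a stand-in for the immune-based matching algorithm, every cell is presented with the data rather than only those near an insertion point, and the network feedback that reinforces or represses cells is not modelled. The names match_score, mutate and present, and the four-letter alphabet, are assumptions made for the example.

```python
import random

ALPHABET = "ACGT"  # assumed symbol set, chosen for illustration


def match_score(pattern, data):
    """Fraction of positions at which pattern and data agree (stand-in measure)."""
    longest = max(len(pattern), len(data))
    if longest == 0:
        return 0.0
    agree = sum(1 for p, d in zip(pattern, data) if p == d)
    return agree / longest


def mutate(pattern, rate, rng):
    """Copy a pattern, replacing each symbol by a random one with probability rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else symbol
        for symbol in pattern
    )


def present(cells, data, threshold, rate, rng):
    """Present one data item; each stimulated cell adds one mutated offspring."""
    offspring = [
        mutate(cell, rate, rng)
        for cell in cells
        if match_score(cell, data) > threshold
    ]
    return cells + offspring


if __name__ == "__main__":
    rng = random.Random(0)
    cells = ["ACGT", "TTTT"]
    print(present(cells, "ACGA", threshold=0.5, rate=0.1, rng=rng))
```

Only cells whose match value exceeds the threshold reproduce, so repeated presentation of similar data tends to populate the network with variants of the patterns that match it well.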

Following the learning phase, the cells can recognise test data which has features in common with the teaching set. This can indicate that the new data is similar to a particular class of teaching data, or that the new input contains a pattern similar to that present in some of the teaching data.
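A minimal sketch of the secondary response is given below: a test item is compared with every learnt cell and the best match is reported if it exceeds a threshold. The match measure is the same stand-in used for illustration (fraction of agreeing positions), and the names match_score and recognise are hypothetical rather than part of the AIS.

```python
def match_score(pattern, data):
    """Fraction of positions at which pattern and data agree (stand-in measure)."""
    longest = max(len(pattern), len(data))
    if longest == 0:
        return 0.0
    agree = sum(1 for p, d in zip(pattern, data) if p == d)
    return agree / longest


def recognise(cells, data, threshold):
    """Return (best_cell, score) if some cell matches above threshold, else None."""
    best = max(cells, key=lambda cell: match_score(cell, data), default=None)
    if best is None:
        return None
    score = match_score(best, data)
    return (best, score) if score > threshold else None


if __name__ == "__main__":
    learnt_cells = ["ACGT", "TTTT"]
    print(recognise(learnt_cells, "ACGA", threshold=0.5))
    print(recognise(learnt_cells, "GGGG", threshold=0.5))
```

Because the comparison is by degree of match rather than by exact key, the learnt cells behave as a simple content addressable memory: a partially matching item still retrieves the closest learnt pattern.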

4. Applying the AIS

To apply the AIS to a particular problem, it is first taught with a sample teaching set in a one shot or an incremental manner (depending on the problem). The information learnt can then be exploited in a number of ways. For example, the cells can be examined to see what common features have been learnt. This process can explicate information which is implicit in the data. Alternatively, if cells can recognise previously unseen data, then appropriate inferences can be made about the new data (e.g., the new data can be classified as being of a certain type).

5. The Proposed research

Various researchers, including the authors' research group, have shown how learning systems can be used to solve complex problems which could not have been easily solved using expert systems or intelligent knowledge based systems (i.e., due to the lack of available knowledge or heuristics). This project is an investigation and evaluation of the design and implementation of a machine learning system based on concepts taken from the human immune system. It is aimed at complementing existing techniques, and one of our aims is to integrate our AIS with existing approaches such that users can select the most appropriate tool for their particular application. We therefore intend to identify the class or classes of problem most appropriate for solution by the artificial immune system as part of our research.

Our current research has centred on the development of a prototype system based on theories used to describe the human immune system. In a previous project funded by the University of Wales, Aberystwyth, we showed how such a system could be used in the tasks of pattern recognition and promoter sequence recognition. We have shown how the immune system can be used as the basis of a learning system which is self organising, explicitly represents the information it has learnt, is noise tolerant and learns incrementally in an unsupervised manner. This is a novel set of features which has the potential to allow the extraction of implicit information from raw data. We expect to identify a range of uses, e.g., for image analysis, market prediction, fault classification, data mining, etc. Tools for such tasks are now commercially viable and are proving invaluable to industry (e.g., Integral Solutions Ltd.'s Clementine which has gained a DTI SMART award). We believe that a technique which complements the existing set of approaches could be of significant commercial benefit. As Integral Solutions Ltd. are the industrial collaborators on this project, we will benefit from their industrial experience in this area and will consider integrating the AIS with Clementine.

5.1 Aims of the Project

The primary aim of this project is to enable the development of a toolkit for building learning systems based on the immune system. These systems should be capable of solving real-world problems in industry and commerce.

This involves further research into the algorithms, representations and operation of our Artificial Immune System. It requires the investigation of the capabilities (e.g., classification, prediction, control, etc.) of such a system on standard test suite problems. It should also consider the performance of the system on relevant real-world applications (which should be identified and investigated based on the results obtained from analysing the test suite problems).

We shall identify suitable test suites (e.g., from the Irvine repository of databases for machine learning research, University of California) by considering the amount of data available, how realistic the data is, and how representative it is of industrial and commercial problems.

It is important that we choose test problems which highlight the strengths of existing techniques as well as those which highlight the strengths of the AIS. This is necessary to obtain a fair comparison.

We shall do this using the extensive network of contacts available through the Centre for Intelligent Systems as well as through our collaborator's experience. We have already identified a number of potential industrial problem areas which we shall target once we have started the project.

5.2 Objectives of the Project

The main objectives of the research are listed below.

Development of the AIS theory. Our previous research has concentrated on developing the underlying concepts and algorithms of the AIS using an experimental approach. To develop the AIS further we need not only to consider the wider aspects of the immune system but also to formalise these theories and to understand the information processing operations being performed.

The implementation of a development toolkit. To promote the accessibility of the system to potential users we will develop a toolkit for building immune system based learning systems. This has proved an excellent way of enabling other approaches (e.g. neural networks and machine induction) to obtain both acceptance and significant financial benefits (e.g. ISL have found that the Clementine toolkit has been more acceptable to potential users due to its visual programming style interface).

The construction of a series of demonstrators which illustrate the performance of AIS based systems on test suite problems. These demonstrators will illustrate the capabilities of the AIS and allow comparison with existing techniques. We aim to show that it can out-perform existing approaches on certain classes of problem.

The construction of one or more systems to address real world problems. Although test suite problems are suitable for testing and comparison, it is still necessary to consider real world data. Therefore, we will illustrate the performance of the AIS on one or more real world problems. During the lifetime of the project we will be seeking to involve suitable companies from a range of industrial and commercial sectors.

The above objectives translate into a set of criteria against which the project should be assessed. That is, for the project to be a success we must show that we have a concrete theory (concepts and algorithms) on which the AIS is based and that we have produced a development toolkit which embodies this formalised theory. We must show that this toolkit can be used to construct a set of systems which address the test suite problems and that the AIS is useful for real world problems.

6. Expertise at the University of Wales, Aberystwyth

In the latest government Research Assessment Exercise, the expertise in research of the Computer Science Department was recognised through the award of a grade 4. We are based within the Centre for Intelligent Systems in the Computer Science Department. This Centre is funded by the Higher Education Funding Council for Wales, and supported by the Welsh Development Agency, as a centre of expertise. The Centre has extensive experience of collaborating with industry on a wide range of research topics, including qualitative modelling, learning classifier systems and adaptive control. It also has experience of technology transfer, e.g., a Teaching Company Scheme with Kayes (Presteigne) Ltd., and has received a grant from the Welsh Office to promote Expert Systems technology in SMEs. This experience, as well as the contacts the Centre has with industry, will be an invaluable asset to the project. The Centre also possesses a considerable range of sophisticated artificial intelligence software (e.g., neural network simulators, case-based reasoning tools, machine induction software, expert system shells) as well as data visualisation software which will be available to the project.

7. References

[1] Bersini, H. (1991) Immune network and adaptive control. Proceedings of the First European Conference on Artificial Life. (Ed. F. J. Varela and P. Bourgine). MIT Press.

[2] Cooke, D.E., and J.E. Hunt (1995) Recognising Promoter Sequences Using An Artificial Immune System. In Proceedings of the Third International Conference on Intelligent Systems for Molecular Biology. pp 89-97, Pub. AAAI Press, California.

[3] Farmer, J.D., N.H. Packard, and A.S. Perelson (1986) The immune system, adaptation, and machine learning. Physica D, Vol. 22, 187-204.

[4] Gilbert, C.J. and T.W. Routen (1994) Associative memory in an immune-based system. Proceedings of AAAI'94, AAAI Press, Menlo Park, California. Vol. 2, 852-857.

[5] Hunt, J.E., and D.E. Cooke (1996) Learning using an Artificial Immune System. To appear in the Journal of Microcomputer Applications.

[6] Hunt, J.E., D.E. Cooke and H. Holstein (1995) Case memory and retrieval based on the Immune System. In Proceedings of the First International Conference on Case Based Reasoning, October 1995. Published as Case-Based Reasoning Research and Development (Ed. M. Veloso and A. Aamodt), Lecture Notes in Artificial Intelligence 1010, pp 205-216.

[7] Hunt, J.E., and D.E. Cooke (1995) An Adaptive, Distributed Learning System, based on the Immune System. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, pp 2494-2499.