Portions of a generic genetic-algorithms library

From CSSEMediaWiki
Revision as of 06:02, 29 September 2008 by Tureiti Keith (Talk | contribs)
Jump to: navigation, search

Contents

Introduction

Over the last few months I have become interested in problems associated with modelling non-linear systems. This interest has lead me to look into artificial neural networks, and how they can be trained using genetic algorithms. For those with an interest in this area, development environments like Matlab provide an ideal platform for quick implementation and testing. However, at deployment time, a Matlab script is of little use to clients with a limited budget, or to developers targeting some embedded platforms. [What about on non-x86 architectures e.g. the supercomputer situated in the engineering department - that don't run Matlab?].

Genetic Algorithms for Dummies

Disclaimer about how much I actually know about genetic algorithms. [Damn it Jim I'm a software engineer, not an AI specialist!]

Requirements

The intention of this project is to provide a framework that is capable of mimicking the back-end behaviour of the Matlab genetic algorithms library. The framework should allow a client to adapt and extend the framework to suit their specific needs. The framework should:

  • Allow optimizing of client defined types
  • Allow client defined selection, reproduction, and mutation strategies
  • Provide the easy removal of unused strategies for the purposes of reducing the size of the final distributable

Design

Candidate Creation

A major goal of the framework is to provide the ability to optimise client defined types. The client should be able to extend the framework to define a candidate, its DNA, and the process by which these are created. To facilitate this, the framework should define a contract by which both the framework can work and the client can inherit or realise. This is design by contract. This contract should define abstract entities that include:

  • A candidate
  • A candidate's DNA
  • A class responsible for candidate creation
  • A class responsible for DNA creation

Here we have a cascaded pair of abstract classes responsible for the creation of a cascaded pair of abstract products. This is the contractual portion of a factory method. The client is left to implement the concrete creator, and products as they please.

Future Extensions

One comment I wish to make here is that it was probably a mistake to leave the creation of Candidates and DNA entirely up to the client. The framework should provide Chromosome objects that represent the basics: bytes, floats, integers, etcetera.

Calculating Fitness

Having provided the client with the ability to define candidates, the client should also be able to assess the fitness of these candidates. As the fitness of a candidate is completely dependent on the context of the application, the framework can have no knowledge of how this is calculated. As with candidate creation, the fitness calculation will be defined using a contract in the form of an interface. The framework will define its work in terms of this interface, the client can realise this interface.

The contract for fitness calculations is defined by the IFitnessInspector interface.

Selecting the Fittest

Once the fitness of all candidates has been assessed, the candidates that will live on to the next generation are selected. There are many methods by which candidates can be selected. The selection methods presented in the Matlab genetic algorithms library should be reproduced here. The framework should also allow for client defined selection methods.

Creating Candidates via Reproduction

Once the fittest candidates have been selected, these candidates must go on to reproduce, and so create the next generation. As with the selection process, there are many possible ways to create new candidates via reproduction. The framework should also allow client defined reproduction processes.

Mutating Candidates

Following reproduction, the genetic algorithm must mutate some proportion of the population. As with the reproduction and selection processes, the Matlab library describes several ways in which candidates may be mutated. The framework should provide these, and also allow for client defined mutation strategies.

Candidate Migration

A First Attempt

This is the starting point - something I put together around 6 months ago.

My first working attempt at a generic genetic-algorithms library is shown in Fig. 1. This section will dissect and critique portions of the initial library design.

Figure 1: A UML 2.1 class diagram representing my first working attempt at a generic genetic-algorithms library

Creating New Candidates

A set of candidates are required to initiate the algorithm. The creation of this initial set or generation of candidates is the responsibility of the library user.

The First Attempt

In the first attempt, a modified Factory pattern was used here, this was taken from a Microsoft Patterns and Practices article by Doug Purdy. An abstract factory class is provided by the library, the user is expected to provide a concrete candidate factory class.

Figure 2: A UML 2.1 class diagram representing my first working attempt at generation initialisation structure
Figure 3: A UML 2.1 communication diagram representing my first working attempt at generation initialisation communications

A New Design

To facilitate building candidates, the user should provide the genetic algorithm with a concrete candidate factory.

Figure 4: A UML 2.1 class diagram representing an attempt to redesign the generation initialisation
Figure 5: A UML 2.1 communication diagram representing an attempt to redesign the generation initialisation communications

Selection of the Fittest

Upon creating a generation of candidates, the process of creating a new generation begins. To start, a portion of the candidates are selected from the current generation based on fitness. The fitness of each candidate is calculated, then one of several candidate selection algorithms is applied to the current generation. The selection strategy may vary per iteration. The library user is responsible for choosing and configuring the selection strategy.

The First Attempt

Fitness calculations are known about by

  • the GeneticAlgorithms class
  • the Generation class
  • the Candidate class
Figure 6: A UML 2.1 class diagram representing my first working attempt at a selection structure
Figure 7: A UML 2.1 communication diagram representing my first working attempt at selection communications

A New Design

Figure 8: A UML 2.1 class diagram representing an attempt to redesign the selection structure
Figure 9: A UML 2.1 communication diagram representing an attempt to redesign the selection communications

Creating New Candidates via Reproduction

A new generation begins by selecting a set of candidates based on fitness. These candidates become the parents in the new generation; they reproduce to create a new set of genetically similar candidates. Reproduction, also known as crossover, is performed using one of several possible crossover algorithms. The same reproduction strategy is usually applied to the entire generation in any one iteration. It is often kept constant throughout the algorithm life-cycle. The library user is responsible for choosing the crossover strategy.

The First Attempt

Figure 10: A UML 2.1 class diagram representing my first working attempt at a candidate reproduction structure
Figure 11: A UML 2.1 communication diagram representing my first working attempt at candidate reproduction communications

A New Design

File:GaNewDesignReproductionStructure.jpg
Figure 12: A UML 2.1 class diagram representing an attempt to redesign the reproduction structure
File:GaNewDesignReproductionCommunications.jpg
Figure 13: A UML 2.1 communication diagram representing an attempt to redesign the the reproduction communications

Mutating Candidates

Following the reproduction process, a proportion of the children produced may be mutated. In mutation, there are two responsibilities. One is performing the mutation - defining this algorithm is the responsibility of the library user. The other is selecting candidates for mutation - here the library user is responsible for choosing from one of several selection strategies defined within the library.

A New Design

File:GaNewDesignMutationStructure.jpg
Figure 14: A UML 2.1 class diagram representing an attempt to redesign the mutation structure
File:GaNewDesignMutationCommunications.jpg
Figure 15: A UML 2.1 communication diagram representing an attempt to redesign the mutation communications

A New Design

Core Class Structure

Figure 2: A UML 2.1 class diagram representing an attempt to redesign the generic genetic-algorithms library

Communications

Figure 3: A UML 2.1 communication diagram representing an attempt to redesign the generic genetic-algorithms library

References

  1. ^ http://www.mathworks.com/access/helpdesk/help/toolbox/gads/

See Also