The controllers (i.e. the neural networks) have a fixed topology and a fixed tanh activation function, so a simple fixed-length array of real numbers is sufficient to represent the genome of each controller. A modular topology (see figure 2), inspired by the layout of a multiloop PID controller, was chosen since it had been found to greatly improve evolvability.
[Figure 2 diagram: four network modules (longitudinal, lateral, collective and yaw) producing the outputs a0 to a3, which include the longitudinal cyclic, lateral cyclic and main rotor collective commands.]
Fig. 2. Modular network used in the waypoint task. The same network topology was also used for the velocity task, but the inputs ∆x, ∆y and ∆z were replaced by ∆u, ∆v and ∆ż respectively.
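The fixed-length genome encoding described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each module is a single tanh neuron with one weight per input plus a bias; the module input counts in `MODULES` and the names `genome_length` and `evaluate` are hypothetical and are not taken from figure 2.

```python
import math

# Hypothetical layout: (module name, number of inputs). The real input
# counts are defined by figure 2 and are NOT reproduced here.
MODULES = [("longitudinal", 3), ("lateral", 3), ("collective", 2), ("yaw", 1)]


def genome_length(modules=MODULES):
    """Number of real values needed: one weight per input plus one bias per module."""
    return sum(n_inputs + 1 for _, n_inputs in modules)


def evaluate(genome, inputs, modules=MODULES):
    """Decode a flat genome and return one tanh output per module.

    `inputs` maps each module name to a list of input values.
    Assumption: every module is a single tanh neuron, which is a
    simplification of the modular network described in the text.
    """
    if len(genome) != genome_length(modules):
        raise ValueError("genome length does not match the fixed topology")
    outputs = {}
    offset = 0
    for name, n_inputs in modules:
        weights = genome[offset:offset + n_inputs]
        bias = genome[offset + n_inputs]
        offset += n_inputs + 1
        activation = bias + sum(w * x for w, x in zip(weights, inputs[name]))
        outputs[name] = math.tanh(activation)
    return outputs
```

Because the topology never changes, the same slicing scheme decodes every individual, which is what allows mutation to operate directly on the flat array.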
A variation of ES (Evolutionary Strategies), with a total population of 33 individuals and an elite of 10, is used to evolve the weights. The first population consists of neural networks with small random synaptic weights; each controller is then evaluated on the task at hand. This involves using the controller under test to fly the simulated helicopter model and attempt the desired task (e.g. flying a set of waypoints, or reaching a specific velocity vector). Its fitness represents the ability demonstrated by the controller on the specific task used. The population is then sorted by fitness and the worst 23 individuals are replaced with mutated versions of the 10 best individuals (the elite). The algorithm does not apply recombination; the weights of the network are simply mutated by adding a random value (drawn from a Gaussian distribution with mean 0 and standard deviation 0.01). In this way, a new population is created, and the evolutionary process can then be repeated for the next generation.
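The generational loop described above can be sketched as follows. The population size, elite size and mutation standard deviation come from the text; everything else is an assumption made for illustration: higher fitness is taken to be better, parents are drawn from the elite in round-robin order, the initial weights are uniform in a small range (`init_scale`), and the `fitness` callable stands in for a full flight evaluation in the simulator.

```python
import random

POP_SIZE = 33     # total population (from the text)
ELITE_SIZE = 10   # individuals preserved each generation (from the text)
SIGMA = 0.01      # standard deviation of the Gaussian mutation (from the text)


def mutate(genome, rng, sigma=SIGMA):
    """Return a copy of the genome with zero-mean Gaussian noise added to every weight."""
    return [w + rng.gauss(0.0, sigma) for w in genome]


def next_generation(population, fitness, rng):
    """Keep the elite and replace the rest with mutated elite copies (no recombination)."""
    ranked = sorted(population, key=fitness, reverse=True)  # assumption: higher is better
    elite = ranked[:ELITE_SIZE]
    n_offspring = len(population) - ELITE_SIZE
    # Assumption: parents are picked from the elite in round-robin order.
    offspring = [mutate(elite[i % ELITE_SIZE], rng) for i in range(n_offspring)]
    return elite + offspring


def evolve(fitness, genome_len, generations, seed=0, init_scale=0.1):
    """Run the loop from a population of small random weights and return the best genome."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-init_scale, init_scale) for _ in range(genome_len)]
        for _ in range(POP_SIZE)
    ]
    for _ in range(generations):
        population = next_generation(population, fitness, rng)
    return max(population, key=fitness)
```

With a deterministic fitness function, preserving the elite unchanged means the best fitness in the population can never decrease from one generation to the next.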
The possibility of defining different tasks, each with its own fitness function, allows us to customise the helicopter controller to our needs. This flexibility is a clear advantage over the traditional manual design of the controller.
As noted in section 3.2, our algorithms for state estimation and model identification have not yet been validated; this is due to a manufacturing problem with the IMU sensor that is currently being rectified. In the meantime, a freely available dynamic helicopter simulator with dynamics qualitatively similar to our helicopter [30] was used to test our approach to the evolution of the controller. This simulator accurately reproduces the dynamics of the XCell 60 model helicopter. Blade element theory is used as the basis for the computation of rotor thrust and drag forces, and the main rotor dynamics and stabilising bar are modelled as proposed in Mettler et al. [21]. Dynamic coupling and aerodynamics effects are also modelled. The simulator outputs the same state variables as will be available from the Hirobo helicopter, and accepts the same flight control inputs. Although qualitatively similar to the Hirobo in all essential respects, the simulated helicopter is much less stable¹, and so it definitely constitutes a challenging test bench for our design approach.
Two different tasks were devised to test the design method. In the first, the helicopter is commanded to perform a specific flight trajectory; this controller will be useful for testing autonomous flight. In the second, the controller is requested to fly the helicopter with a specific (vectorial) velocity; this controller gives a basis on top of which the classic Reynolds flocking rules could be applied. In both tasks, there is an initial stage in which the controller is evolved for a few tens of generations using the heading error as the only fitness function. This evolution is very quick, and produces a minimal yaw stabilisation that enables