Go to the first, previous, next, last section, table of contents.

Simulated Annealing

Stochastic search techniques are used when the structure of a space is not well understood or is not smooth, so that techniques like Newton's method (which requires calculating Jacobian derivative matrices) cannot be used.

In particular, these techniques are frequently used to solve combinatorial optimization problems, such as the traveling salesman problem.

The basic problem layout is that we are looking for a point in the space at which a real valued energy function (or cost function) is minimized.

Simulated annealing is a technique which has given good results in avoiding local minima; it is based on the idea of taking a random walk through the space at successively lower temperatures, where the probability of taking a step is given by a Boltzmann distribution.

The functions described in this chapter are declared in the header file `gsl_siman.h'.

Simulated Annealing algorithm

We take random walks through the problem space, looking for points with low energies; in these random walks, the probability of taking a step is determined by the Boltzmann distribution

p = e^{-(E_{i+1} - E_i)/(kT)}

if E_{i+1} > E_i, and p = 1 when E_{i+1} <= E_i.

In other words, a step will occur if the new energy is lower. If the new energy is higher, the transition can still occur, and its likelihood is proportional to the temperature T and inversely proportional to the energy difference E_{i+1} - E_i.

The temperature T is initially set to a high value, and a random walk is carried out at that temperature. Then the temperature is lowered very slightly (according to a cooling schedule) and another random walk is taken.

This slight probability of taking a step that gives higher energy is what allows simulated annealing to frequently get out of local minima.

An initial guess is supplied. At each step, a point is chosen at a random distance from the current one, where the random distance r is distributed according to a Boltzmann distribution r = e^(-E/kT). After a few search steps using this distribution, the temperature T is lowered according to some scheme, for example T -> T/mu_T where \mu_T is slightly greater than 1.

Simulated Annealing functions

Simulated annealing: gsl_siman_Efunc_t
double (*gsl_siman_Efunc_t) (void *xp);

Simulated annealing: gsl_siman_step_t
void (*gsl_siman_step_t) (void *xp, double step_size);

Simulated annealing: gsl_siman_metric_t
double (*gsl_siman_metric_t) (void *xp, void *yp);

Simulated annealing: gsl_siman_print_t
void (*gsl_siman_print_t) (void *xp);

Simulated annealing: gsl_siman_params_t
These are the parameters that control a run of gsl_siman_solve.

/* this structure contains all the information 
   needed to structure the search, beyond the 
   energy function, the step function and the 
   initial guess. */

struct s_siman_params {
  /* how many points to try for each step */
  int n_tries;          

  /* how many iterations at each temperature? */
  int iters_fixed_T;    

  /* max step size in the random walk */
  double step_size;     

  /* the following parameters are for the 
     Boltzmann distribution */
  double k, t_initial, mu_t, t_min;

typedef struct s_siman_params gsl_siman_params_t;

Simulated annealing: void gsl_siman_solve (void *x0_p, gsl_siman_Efunc_t Ef, gsl_siman_metric_t distance, gsl_siman_print_t print_position, size_t element_size, gsl_siman_params_t params)
Does a simulated annealing search through a given space. The space is specified by providing the functions Ef, distance, print_position, element_size.

The params structure (described above) controls the run by providing the temperature schedule and other tunable parameters to the algorithm (see section Simulated Annealing algorithm). p The result (optimal point in the space) is placed in *x0_p.

If print_position is not null, a log will be printed to the screen with the following columns:

number_of_iterations temperature x x-(*x0_p) Ef(x)

If print_position is null, no information is printed to the screen.

Examples with Simulated Annealing

GSL's Simulated Annealing package is clumsy, and it has to be because it is written in C, for C callers, and tries to be polymorphic at the same time. But here we provide some examples which can be pasted into your application with little change and should make things easier.

Trivial example

The first example, in one dimensional cartesian space, sets up an energy function which is a damped sine wave; this has many local minima, but only one global minimum, somewhere between 1.0 and 1.5. The initial guess given is 15.5, which is several local minima away from the global minimum.

/* set up parameters for this simulated annealing run */

/* how many points do we try before stepping */
#define N_TRIES 200             

/* how many iterations for each T? */
#define ITERS_FIXED_T 10        

/* max step size in random walk */
#define STEP_SIZE 10            

/* Boltzmann constant */
#define K 1.0                   

/* initial temperature */
#define T_INITIAL 0.002         

/* damping factor for temperature */
#define MU_T 1.005              
#define T_MIN 2.0e-6

gsl_siman_params_t params 
     K, T_INITIAL, MU_T, T_MIN};

/* now some functions to test in one dimension */
double E1(void *xp)
  double x = * ((double *) xp);

  return exp(-square(x-1))*sin(8*x);

double M1(void *xp, void *yp)
  double x = *((double *) xp);
  double y = *((double *) yp);

  return fabs(x - y);

void S1(void *xp, double step_size)
  double r;
  double old_x = *((double *) xp);
  double new_x;

  r = gsl_ran_uniform();
  new_x = r*2*step_size - step_size + old_x;

  memcpy(xp, &new_x, sizeof(new_x));

void P1(void *xp)
  printf("%12g", *((double *) xp));

main(int argc, char *argv[])
  Element x0; /* initial guess for search */

  double x_initial = 15.5;

  gsl_siman_solve(&x_initial, E1, S1, M1, P1,
                  sizeof(double), params);
  return 0;

Here are a couple of plots that are generated by running siman_test in the following way:

./siman_test | grep -v "^#" 
  | xyplot -xyil -y -0.88 -0.83 -d "x...y" 
  | xyps -d > siman-test.eps
./siman_test | grep -v "^#" 
  | xyplot -xyil -xl "generation" -yl "energy" -d "x..y"
  | xyps -d > siman-energy.eps

Traveling Salesman Problem

The TSP (Traveling Salesman Problem) is the classic combinatorial optimization problem. I have provided a very simple version of it, based on the coordinates of twelve cities in the southwestern United States. This should maybe be called the Flying Salesman Problem, since I am using the great-circle distance between cities, rather than the driving distance. Also: I assume the earth is a sphere, so I don't use geoid distances.

The gsl_siman_solve() routine finds a route which is 3490.62 Kilometers long; this is confirmed by an exhaustive search of all possible routes with the same initial city.

The full code can be found in `siman/siman_tsp.c', but I include here some plots generated with in the following way:

./siman_tsp > tsp.output
grep -v "^#" tsp.output  
  | xyplot -xyil -d "x................y" 
           -lx "generation" -ly "distance" 
           -lt "TSP -- 12 southwest cities" 
  | xyps -d > 12-cities.eps
grep initial_city_coord tsp.output 
  | awk '{print $2, $3, $4, $5}' 
  | xyplot -xyil -lb0 -cs 0.8 
           -lx "longitude (- means west)" 
           -ly "latitude" 
           -lt "TSP -- initial-order" 
  | xyps -d > initial-route.eps
grep final_city_coord tsp.output 
  | awk '{print $2, $3, $4, $5}' 
  | xyplot -xyil -lb0 -cs 0.8
           -lx "longitude (- means west)" 
           -ly "latitude" 
           -lt "TSP -- final-order" 
  | xyps -d > final-route.eps

This is the output showing the initial order of the cities; longitude is negative, since it is west and I want the plot to look like a map.

# initial coordinates of cities (longitude and latitude)
###initial_city_coord: -105.95 35.68 Santa Fe
###initial_city_coord: -112.07 33.54 Phoenix
###initial_city_coord: -106.62 35.12 Albuquerque
###initial_city_coord: -103.2 34.41 Clovis
###initial_city_coord: -107.87 37.29 Durango
###initial_city_coord: -96.77 32.79 Dallas
###initial_city_coord: -105.92 35.77 Tesuque
###initial_city_coord: -107.84 35.15 Grants
###initial_city_coord: -106.28 35.89 Los Alamos
###initial_city_coord: -106.76 32.34 Las Cruces
###initial_city_coord: -108.58 37.35 Cortez
###initial_city_coord: -108.74 35.52 Gallup
###initial_city_coord: -105.95 35.68 Santa Fe

The optimal route turns out to be:

# final coordinates of cities (longitude and latitude)
###final_city_coord: -105.95 35.68 Santa Fe
###final_city_coord: -106.28 35.89 Los Alamos
###final_city_coord: -106.62 35.12 Albuquerque
###final_city_coord: -107.84 35.15 Grants
###final_city_coord: -107.87 37.29 Durango
###final_city_coord: -108.58 37.35 Cortez
###final_city_coord: -108.74 35.52 Gallup
###final_city_coord: -112.07 33.54 Phoenix
###final_city_coord: -106.76 32.34 Las Cruces
###final_city_coord: -96.77 32.79 Dallas
###final_city_coord: -103.2 34.41 Clovis
###final_city_coord: -105.92 35.77 Tesuque
###final_city_coord: -105.95 35.68 Santa Fe

Here's a plot of the cost function (energy) versus generation (point in the calculation at which a new temperature is set) for this problem:

Go to the first, previous, next, last section, table of contents.