Brady Ouren

Particle swarm optimization

Motivation

I recently finished reading the old books Chaos and The Meme Machine, and they’ve led me to think more about generative and evolutionary-type algorithms and how they look when implemented as computation. This post is the first part in a short series exploring that. My initial goal is to 1) implement a particle swarm optimization, 2) visualize it, and 3) optimize it (run it concurrently, speed up the recursion, whatever).

As a short intro to PSO I’ll explain what little I currently know. Particle swarm optimization is good at finding global optima of non-differentiable functions and fits into the category of evolutionary algorithms, which is the main reason I picked this particular one to tackle here. It’s both interesting to compose and cool to visualize.

Implementation

To get started we define the initial data structure based on how we’d like to scope the optimization.


newtype Point = Point { unPoint :: (Float, Float) }
  deriving (Show, Eq)

type FitnessFunction = Point -> Float

data SwarmOptimization
  = SwarmOptimization
  { numOfParticles :: Int
  , fitnessFunction :: FitnessFunction
  , rangeX :: (Float, Float)
  , rangeY :: (Float, Float)
  -- ^ rangeX and rangeY are the lower and upper bounds of the particle space
  }

Essentially we’re working towards a function that steps through an optimization and returns the state of the particles (something like pso :: SwarmOptimization -> Steps -> [Particle]). The SwarmOptimization record should eventually carry fields for the hyperparameters, but we’re going to auto-populate them for now.

So, we’ll need Particles. In PSO each particle stores its current location, its velocity, and possibly the best point (best) it has seen thus far.

import qualified Data.Map as Map

data Particle
  = Particle
  { loc :: Point
  , velocity :: Velocity
  , best :: Maybe Point
  }
  deriving (Show, Eq)

newtype Velocity = Velocity { unVelocity :: (Float, Float) }
  deriving (Show, Eq)

type Particles = Map.Map String Particle

Note: there’s no great reason I made Particles a Map other than that it makes debugging easier by giving each particle a specific name. This will be changed to something faster later on.

Our top level view of the algorithm is pretty simple:

  • initialize n particles at random locations within the range (a sketch of this follows the list)
    • initialize a random velocity for each particle
  • recurse over all particles until their velocities hit zero or the maximum number of iterations is reached
    • find the global best point by evaluating the fitness function at each particle’s location
    • update each particle’s velocity
    • move the particles
    • if a particle’s new location is better than its best, update the best
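
The initialization in the first bullet isn’t spelled out in the post. Here’s a minimal sketch of what a mkParticles helper (used by the step function below) could look like, assuming randomRIO from the random package and the Data.Map import from earlier:

import System.Random (randomRIO)

-- Hypothetical initializer: place each particle at a uniformly random point
-- inside the x/y range, give it a small random starting velocity, and key it
-- by a generated name so individual particles are easy to inspect.
mkParticles :: Int -> (Float, Float) -> (Float, Float) -> IO Particles
mkParticles n (xLo, xHi) (yLo, yHi) = Map.fromList <$> mapM mkOne [1 .. n]
  where
    mkOne :: Int -> IO (String, Particle)
    mkOne i = do
      x  <- randomRIO (xLo, xHi)
      y  <- randomRIO (yLo, yHi)
      vx <- randomRIO (-1, 1)
      vy <- randomRIO (-1, 1)
      pure ( "particle-" <> show i
           , Particle { loc = Point (x, y)
                      , velocity = Velocity (vx, vy)
                      , best = Nothing
                      }
           )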

The first thing we’ll write is the step function for initializing the particles and starting the recursion.

step :: SwarmOptimization -> Int -> IO Particles
step (SwarmOptimization {..}) steps = do
  particles <- mkParticles numOfParticles rangeX rangeY
  pure $ go steps particles
  where
    go 0 p = p
    go n p =
      let g = globalBest fitnessFunction p
          coef = coefficients (steps - n) steps
          newParticles = moveParticles g fitnessFunction coef p
       in go (n-1) newParticles
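
globalBest also isn’t defined in this snippet. A minimal sketch, assuming we’re minimizing the fitness function and have at least one particle:

import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical globalBest: return the particle whose current location scores
-- lowest under the fitness function (partial on an empty particle map).
globalBest :: FitnessFunction -> Particles -> Particle
globalBest f = minimumBy (comparing (f . loc)) . Map.elems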

Main update logic

Everything so far has been putting the wheels in motion but now comes the actual particle movement.

import Data.Maybe (fromMaybe)

moveParticles
  :: Particle
  -- ^ the current global best particle
  -> FitnessFunction
  -> Hyperparameters
  -- ^ we'll talk about these soon
  -> Particles
  -> Particles
moveParticles g f coef = Map.map (update g)
  where
    update (Particle { loc = globalLoc }) personal@(Particle { loc, best }) =
      let newVel = getNewVelocity personal globalLoc coef
          newLoc = addP loc newVel

          prevBest = fromMaybe loc best
          newBest  = if f newLoc < f prevBest then newLoc else prevBest
       in personal { loc = newLoc
                   , velocity = newVel
                   , best = Just newBest
                   }

Here we 1) get the new velocity, 2) get the new point location, and 3) update the personal best by comparing the previous best and the new location with the supplied fitness function. This is applied to every particle in our domain on each iteration of the process. The function getNewVelocity has yet to be defined, but before we define it, we need to talk about the hyperparameters in PSO.

Hyperparameters

Before we can compute a particle’s new velocity we have to apply some top-level scaling parameters to its acceleration.

These hyperparameters control exploitation and exploration. Simply put, exploitation describes a particle’s reliance on the global best, while exploration leans more on its personal best. Balancing the two works to our advantage: particles don’t get stuck in local optima and keep exploring the space to find better and better solutions.

  • w: inertia weight. The lower it is, the stronger the convergence of the particles
  • c1 and c2: “acceleration coefficients”. They scale the personal and global velocity components, respectively

data Hyperparameters
  = Hyperparameters
  { w :: Float
  , c1 :: Float
  , c2 :: Float
  }

In our code we use a function called coefficients, which determines our hyperparameters based on how far along we are in the step count:

coefficients :: Int -> Int -> Hyperparameters
coefficients current totalSteps = ...

We might want to change this once we start tweaking the velocity for specific fitness functions.
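
The body of coefficients is elided above. As a rough sketch of how it could be filled in (not necessarily what the actual code does), one common scheme from the PSO literature is to linearly decay the inertia weight from about 0.9 to 0.4 over the run while keeping c1 and c2 fixed at 2:

-- A hypothetical coefficients implementation: linearly decay the inertia
-- weight as the run progresses and keep the acceleration coefficients fixed.
coefficients :: Int -> Int -> Hyperparameters
coefficients current totalSteps =
  Hyperparameters
    { w  = wMax - (wMax - wMin) * progress
    , c1 = 2
    , c2 = 2
    }
  where
    wMax = 0.9
    wMin = 0.4
    progress = fromIntegral current / fromIntegral (max 1 totalSteps)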

New velocity per particle

With the hyperparameters out of the way we can do the velocity evaluation.

The pseudocode for this, as defined on the wiki, is as follows

newVelocity ← w currentVelocity + c1 r1 (personal - current) + c2 r2 (global - current)

Our implementation looks pretty similar if you ignore the weird point/velocity scaling functions. If they get annoying I’ll make some typeclasses to clean them up, but for now they’re fine. I’ll also note the lack of the r1 and r2 scaling; we might need to add these later, since they essentially inject more randomness into the particle movement.

getNewVelocity :: Particle -> Point -> Hyperparameters -> Velocity
getNewVelocity (Particle { loc, velocity, best }) global (Hyperparameters w c1 c2) =
  -- w*v + pv + gv
  (w `scaleV` velocity) `addV` pv `addV` gv
  where
    prevBest = fromMaybe loc best

    -- personal: c1 * r1 * (pbest - current)
    pv = scaleV c1 $ diffToVel prevBest loc

    -- global: c2 * r2 * (global - current)
    gv = scaleV c2 $ diffToVel global loc
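
The point/velocity helpers used above (addP, addV, scaleV, diffToVel) aren’t shown in the post; here’s a minimal sketch of what they might look like, given that Point and Velocity both wrap a (Float, Float):

-- Hypothetical arithmetic helpers: addP moves a point by a velocity, addV
-- adds two velocities, scaleV scales a velocity by a constant, and
-- diffToVel turns the difference of two points into a velocity.
addP :: Point -> Velocity -> Point
addP (Point (x, y)) (Velocity (vx, vy)) = Point (x + vx, y + vy)

addV :: Velocity -> Velocity -> Velocity
addV (Velocity (a, b)) (Velocity (c, d)) = Velocity (a + c, b + d)

scaleV :: Float -> Velocity -> Velocity
scaleV k (Velocity (vx, vy)) = Velocity (k * vx, k * vy)

diffToVel :: Point -> Point -> Velocity
diffToVel (Point (x1, y1)) (Point (x2, y2)) = Velocity (x1 - x2, y1 - y2)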

It’s very hard for me to tell whether this will lead to correct particle movement, so we’ll set up a test. I took the Schaffer function from a list of non-differentiable functions, and we’ll use it for the rest of the posts as a goalpost for defining success. For now I’m only concerned with the speed of the implementation and the relative sanity of the output (we’ll dig deeper in the next post).

swarm :: Int -> IO [Particle]
swarm s = do
  p <- step pso s
  return $ Map.elems p
  where
    (rx, ry) = mkSquareRange 20 20
    pso = SwarmOptimization
      { numOfParticles = 20
      , rangeX = rx
      , rangeY = ry
      , fitnessFunction = schafferFunc
      }

schafferFunc :: Point -> Float
schafferFunc (Point (x, y)) = 0.5 + n / d
  where
    n = sin (x ** 2 - y ** 2) ** 2 - 0.5
    d = (1 + 0.001 * (x ** 2 + y ** 2)) ** 2
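
As a quick sanity check (not part of the original code), you could run the swarm from a throwaway main and eyeball the output; since the Schaffer function’s global minimum sits at the origin, the particles should drift towards (0, 0):

-- Hypothetical entry point: run 100 steps on the Schaffer function and
-- print each particle's final state.
main :: IO ()
main = do
  particles <- swarm 100
  mapM_ print particles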

The full code for this commit can be found here