Object Oriented to Functional: Sieve of Eratosthenes

One problem with moving from object oriented programming to functional programming is that many programmers try to apply the techniques and patterns that they’re familiar with from the object oriented programming language to the functional programming language. This doesn’t work. Some features of a functional programming language, like immutable data structures and the lack of while and for loops feel like hinderances to writing the program using the familiar OOP patterns.

They are hinderances to implementing the OOP style. The problem here isn’t the functional programming language, but that the OOP style doesn’t work there. A new programming style is required. This post describes and discusses a program written in the OOP style in Ruby, a direct translation of that style in Elixir, and a functional re-write in Elixir.

The Sieve of Eratosthenes is an algorithm for searching for primes that requires state from one step to the next. It seems like a good problem to solve using object oriented programming.

The Sieve of Eratosthenes (the sieve) works this way:

  1. To find the prime numbers up to a certain limit n, start with an array of numbers from 2 to n, all of them starting off marked as prime
  2. the next number still marked as prime is a prime
  3. mark all of the multiples of that prime number as non-prime
  4. continue at step 2 until the next prime number is greater than $\sqrt n$
  5. any remaining numbers still marked as prime are also prime

Here is a Ruby OOP implementation of the algorithm:

class Sieve
  attr_accessor :is_prime, :limit

  def initialize(limit)
    self.is_prime = Array.new(limit + 1, true)
    self.limit = limit
  end

  def apply_prime(prime)
    (prime * 2 .. self.limit).step(prime) do |idx|
      self.is_prime[idx] = false
    end
  end

  def next_prime(start)
    (start + 1 .. self.limit).each do |idx|
      return idx if self.is_prime[idx]
    end
    nil
  end

  def find_primes()
    max = Math.sqrt(limit)
    prime = 2
    while prime <= max && prime != nil
      apply_prime(prime)
      prime = next_prime(prime)
    end
  end

  def print_primes()
    (2 .. self.limit).each do |idx|
      puts(idx) if self.is_prime[idx]
    end
  end
end

def display_primes(limit)
  sieve = Sieve.new(limit)
  sieve.find_primes()
  sieve.print_primes()
end

The Sieve class contains two member variables:

  • is_prime - an array of booleans that indicates whether or not the index is a prime number
  • limit - the number to which we’re finding primes

The methods of the class:

  • initialize(limit) - sets is_prime to an array defaulting all the values to true, and limit to the limit passed in
  • apply_prime(prime) - for all multiples of prime, apply_prime sets is_prime to false
  • next_prime(start) - starts from the index after start and returns the next index for which is_prime is true
  • find_primes() - implements the Sieve of Eratosthenes algorithm by calling apply_prime starting with 2 and as long as there are more primes, looping until we reach the square root of the limit
  • print_primes() - prints all the primes found once find_primes has been executed

Finally, display_primes(limit) can be used to execute the sieve up to a certain point and print out all the primes discovered.

The following is a literal conversion of the Ruby program above into Elixir:

defmodule Sieve.OOP do
  defstruct sieve_list: nil, limit: 0

  def find_primes(sieve) do
    max = :math.sqrt(sieve.limit) |> trunc()
    Enum.reduce_while(2..max, {2, sieve}, fn _, {prime,sieve} ->
      sieve = apply_prime(sieve, prime)
      prime = next_prime(sieve, prime)
      if prime > max do
        {:halt, sieve}
      else
        {:cont, {prime,sieve}}
      end
    end)
  end

  def apply_prime(sieve, prime) do
    Enum.reduce(2 .. div(sieve.limit, prime), sieve, fn count, sieve ->
      %{sieve | sieve_list: :array.set(count * prime, 1, sieve.sieve_list)}
    end)
  end

  def next_prime(sieve, prime) do
    Enum.reduce_while(prime + 1 .. sieve.limit, nil, fn idx, _ ->
      if :array.get(idx, sieve.sieve_list) == :undefined do
        {:halt, idx}
      else
        {:cont, nil}
      end
    end)
  end

  def generate_prime_list(sieve) do
    Enum.reduce(2..sieve.limit, [], fn idx, acc ->
      if :array.get(idx, sieve.sieve_list) == :undefined do
        [idx | acc]
      else
        acc
      end
    end)
    |> Enum.reverse()
  end

  def print_primes(sieve) do
    Enum.each(2..sieve.limit, fn idx ->
      if :array.get(idx, sieve.sieve_list) == :undefined, do: IO.puts(idx)
    end)
  end
end

def display_primes(limit) do
  sieve = struct(Sieve.OOP, [sieve_list: :array.new(limit + 1), limit: limit])
  sieve = find_primes(sieve)
  print_primes(sieve)
end

You can see how each of these functions corresponds to the functions in the Ruby code. Unfortunately, it’s awkward. There seem to be many things wrong with this:

  • Without a GenServer, there’s no way to store the state, so we have to keep passing around the sieve data structure. (And a GenServer is not the right solution either)
  • reduce_while seems like an awful way to do a loop
  • the code is repetitive
  • since the array operations are immutable, we’re copying the array over and over, every time we change it to mark an element as non-prime

What are we to do? We should reframe the description of the Sieve of Eratosthenes in a functional way, as a series of transformations on a data set:

  1. Track the primes and the unchecked numbers
  2. The first of the remaining unchecked numbers should be added to the list of primes
  3. filter the unchecked numbers and remove all number that are multiples of the prime moved to the prime list
  4. As long as the prime number <= $\sqrt n$, go to step 2

As you can see here, the problem has now been reframed as a sequence of filter operations.

defmodule Sieve.List do
  def find_primes(limit) do
    find_primes(Enum.to_list(2..limit), [], :math.sqrt(limit))
  end

  def find_primes([first|_] = rest, primes, max) when first > max do
    Enum.reverse(primes) ++ rest
  end

  def find_primes([prime|rest], primes, max) do
    Enum.reject(rest, fn n -> rem(n, prime) == 0 end)
    |> find_primes([prime|primes], max)
  end

  def display_primes(limit) do
    find_primes(limit)
    |> IO.inspect(label: "primes", limit: :infinity)
  end
end

Surprisingly, the 42 line Ruby class is a 55 line Elixir program when we attempt to replicate the object oriented style, but only 19 lines when written in a more functional way. This is due to the expressiveness of functional programming languages. It is more to the point.

The primary algorithm is written in three lines:

def find_primes([prime|rest], primes, max) do
  Enum.reject(rest, &(rem(&1, prime) == 0))
  |> find_primes([prime|primes], max)
end

The first argument represents the list of numbers that remain after each iteration through the sieve. Through each iteration, the first element in this list is the next prime, which we extract using pattern matching.

The Enum.reject call removes all elements from the remainder of the list, rest, that are multiples of that prime.

Then, the magic line:

|> find_primes([prime|primes], max)

which is Elixir shorthand for:

prime_candidates = <previous function call>
find_primes(prime_candidates, [prime|primes], max)

This is a recursive call to take the results of removing all the multiples of the newly found prime, and pass it back in to find the primes in the remainder. At the same time, we prepend the newly found prime to the list of primes we’ve found so far and pass that in as the second argument.

We make use of pattern matching to avoid if statements and variable assignments. The awkwardness of not having a looping structure is replaced with the more direct Enum.reject and recursion. State is replaced with an update to a structure as it’s being passed into the next iteration. An iteration through the list of potential primes is replaced with a single function call to remove multiples of our current prime.

When you write a program in an object oriented programming language, you focus on the state that you need to keep track of, and the methods to update that state. In functional programming, you instead focus on writing functions that transform the data step by step.

As you become familiar with these patterns, it’ll become easier over time. Just like any new concept, it can take some time to learn. When you do, you will be rewarded with the ability to write concise, expressive, functional code.