Why be pure?

Newcomers to Haskell are often told that Haskell is “pure”, or that it is “mathematical”. It is often not clear what this means, or why these properties would be of interest to begin with. In this blog post I aim to combat this by showing why purity can be helpful, both in writing programs and improving performance.

What is purity?

Before we get into the why, we have to get into the what. Haskell is a purely functional language, but what does that mean? Purity tells us that functions in Haskell behave like mathematical functions. That is, functions in Haskell must always give eqaul output for equal input. In mathematics we’d hope that if we have a function ff, and two values xx and yy such that x=yx = y then f(x)=f(y)f(x) = f(y) and the same is true for Haskell. This is the reason that you are not allowed to modify the value of variables in Haskell as you are in other programming languages such as Python. Consider the following Python program:

n = 0

def f(x):
  global n
  n = n + x
  return n

The function f will give a different result each time we call it with the same input.

>>> f(1)
1
>>> f(1)
2
>>> f(2)
4
>>> f(2)
6

But then f(2)f(2) is not always equal to f(2)f(2), so Python is not pure.

Now that we know what it means for Haskell to be pure, we can move on to the more interesting question. Why place these restrictions on our language? Requiring Haskell to be pure might seem arbitrary, or even counter-productive, but playing by the rules of mathematics will let us use the tools of mathematics!

Equational Reasoning

One of the most powerful tools handed to us by mathematics is equational reasoning. It uses the fact that equality in Haskell behaves like mathematical equality, and in doing so it allows us to prove properties about Haskell programs in ways that are not possible in non-pure languages.

A Gentle Example

To show what I mean, I will prove that map (+1) [1,2,3] = [2,3,4]. First, recall the definition of map:

map :: (a -> b) -> [a] -> [b]
map f []     = []
map f (x:xs) = f x : map f xs

Second, recall how lists are represented in Haskell. When we write [1,2,3], this is shorthand for 1 : (2 : (3 : [])). Now we can start using our equalities! The definition of map tells us that map f (x : xs) = f x : map f xs, thus map (+1) (1 : 2 : 3 : []) = (+1) 1 : map (+1) (2 : 3 : []) Continuing this reasoning yields us the following:

    [map (+1) (1 : 2 : 3 : [])]
  = (+1) 1 : [map (+1) (2 : 3 : [])]
  = (+1) 1 : (+1) 2 : [map (+1) (3 : [])]
  = (+1) 1 : (+1) 2 : (+1) 3 : [map (+1) ([])]

In each step the code within [] is replaced using the same equality rule. For the final step we look at the first case of map telling us that map f [] = []

  = (+1) 1 : (+1) 2 : (+1) 3 : []
  = 2 : 3 : 4 : []
  = [2,3,4]

While this is mildly interesting, it’s not very useful as we could have simply asked Haskell to evaluate the code for us in order to yield the same result.

Proving an interesting property

A more exciting example is proving that map f (map g xs) = map (f . g) xs for all functions f, g, and lists xs. Recall the definition of .

(.) :: (b -> c) -> (a -> b) -> (a -> c)
(.) f g x = f (g x)

That is, f . g is a function that first applies g to an input and then applies f to the result of g. Thus we are proving that first mapping one function, and then another is the same as mapping the composition of said functions. All lists are in one of the forms [] or x : xs, thus it suffices to prove that the property holds for these two cases. If you are familiar with doing proofs by induction this is precisely what we will be doing.

First, consider the [] case:

    map f (map g [])
  = map f []
  = []
  = map (f . g) []

Secondly, consider the x : xs case, assuming that the property holds for xs:

    map f (map g (x : xs))
  = map f (g x : map g xs)
  = f (g x) : map f (map g xs)
  = f (g x) : map (f . g) xs    [by our assumption]
  = (f . g) x : map (f . g) xs 
  = map (f . g) (x : xs)

Now this is an interesting result! It tells us that any time we wish to map over two functions we may simply map over their composite instead. This is not only interesting but has performance implications as well, since it tells us we only need to use map once!

Avoiding Duplication

In programming there are many cases when data has to be duplicated. If we assign a value to a variable x and then assign y = x, then we often don’t want y to change when x does. In order to avoid this the data stored in x is duplicated and then stored in y. This duplication takes time and memory, so most programming languages don’t do it by default, instead, they have the user specify when a duplication is needed. In practice when creating large programs or libraries which need to work without any assumptions, one often has to employ considerable amounts of data duplication.

How does this relate to Haskell and purity? Since Haskell is pure data can never be modified, and there is thus never a need for data duplication. It’s fine to pass around references as their contents will never be modified. When structures are rebuilt the old contents are reused and only the parts that are being changed are recreated. This allows for more efficient use of both memory and processing time.

The Takeaway

The purity of Haskell allows for mathematical reasoning about programs. This not only makes it possible to be more confident in the correctness of our programs but can be used in order to optimize code as well. In fact, the primary Haskell compiler, GHC, uses these guarantees in order to optimize programs. The restrictions imposed by purity turn into properties that programmers and compilers alike may use to their advantage.