The Fibonacci numbers have become a bit of a poster child for recursive definitions, perhaps due to their simplicity. You’ll surely find them in the early chapters of most books teaching functional programming (a programming paradigm where recursive definitions are common).
Indeed, if we open Chapter 5: Recursion of LYAH we are greeted with the following.
Definitions in mathematics are often given recursively. For instance, the Fibonacci sequence is defined recursively.
Likewise, in Chapter 1.2.2 Tree Recursion of SICP we are yet again greeted by a familiar face.
Another common pattern of computation is called tree recursion. As an example, consider computing the sequence of Fibonacci numbers
With this in mind, it might come as a surprise that there is a closed, non-recursive formula for the $n$-th Fibonacci number. Perhaps more surprising is that we will discover this formula using the ideas presented in the above chapter of SICP.
A naive way of calculating the $n$-th Fibonacci number is to use the definition above: check if $n = 0$, check if $n = 1$, and otherwise calculate $\mathrm{fib}(n-2) + \mathrm{fib}(n-1)$. This corresponds to the following Haskell code:
fib :: Integer -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-2) + fib (n-1)
However, there is an issue with this method: many Fibonacci numbers will be calculated numerous times, since every evaluation splits into two recursive calls, one for the previous and one for the twice-previous Fibonacci number. The reader who prefers visuals might appreciate Figure 1.5 from the SICP chapter.
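To get a feel for the blow-up, here is a small helper of my own (the name calls is not part of the original program) that counts how many calls the naive fib performs; the count itself satisfies a Fibonacci-like recurrence and thus grows exponentially.

-- Hypothetical helper (not from the original): number of calls the naive
-- fib performs for input n. It obeys the same shape of recurrence as fib.
calls :: Integer -> Integer
calls 0 = 1
calls 1 = 1
calls n = 1 + calls (n-2) + calls (n-1)

-- ghci> map calls [0..10]
-- [1,1,3,5,9,15,25,41,67,109,177]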
How might we fix this then? A human calculating the $n$-th Fibonacci number might construct a list of Fibonacci numbers, calculating each Fibonacci number only once. While it is possible to do this on the computer, it is superfluous to carry around all previous numbers, as we only need the previous two to calculate the next one. We might think of this as a 2-slot window moving along the Fibonacci numbers, taking $n$ steps to arrive at the $n$-th Fibonacci number $F_n$. In code we could represent this as follows:
-- steps -> f n-2 -> f n-1 -> f n
window :: Integer -> Integer -> Integer -> Integer
window 0 a _ = a
window steps a b = window (steps-1) b (a+b)
fib :: Integer -> Integer
fib n = window n 0 1
In each step we move the window by replacing the first slot with what was previously in the second slot, and filling the new second slot with the sum of the previous two slots. This is repeated $n$ times, after which the first slot of the window is returned.
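As a quick check, here is how fib 5 unfolds, one window slide per line:

fib 5
= window 5 0 1
= window 4 1 1
= window 3 1 2
= window 2 2 3
= window 1 3 5
= window 0 5 8
= 5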
What does this have to do with mathematics, and this beautiful proof that I have promised? We shall begin by translating this moving window into the language of mathematics: our window is a pair of numbers, so why not represent it as a vector? Furthermore, we may view sliding our window one step as a function from vectors to vectors. This poses an interesting question: is this function a linear transformation?
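Written out in vector form (this is exactly what the window code does), one sliding step is the map

$$\begin{pmatrix} a \\ b \end{pmatrix} \mapsto \begin{pmatrix} b \\ a + b \end{pmatrix}.$$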
It is! This is great news, as it means we can represent our step function by a matrix. With some basic linear algebra one can deduce that

$$\begin{pmatrix} b \\ a + b \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix},$$

so sliding the window once is multiplication by the matrix $A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$.
Then to calculate the $n$-th Fibonacci number we take the starting window $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and multiply it by $A^n$; the first slot of the result is $F_n$. Now that the sliding window has been expressed purely in the language of linear algebra, we may apply the tools of linear algebra.
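Indeed, one can check by induction that the powers of $A$ are built from Fibonacci numbers: for $n \ge 1$,

$$A^2 = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \quad A^3 = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}, \quad \ldots, \quad A^n = \begin{pmatrix} F_{n-1} & F_n \\ F_n & F_{n+1} \end{pmatrix}, \quad \text{so} \quad A^n \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} F_n \\ F_{n+1} \end{pmatrix}.$$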
If you’re familiar with linear algebra there might be a part of your brain yelling “diagonalization” right now. We’ve translated our problem into linear algebra, but even for a small matrix, calculating $A^n$ can become costly for large $n$. Diagonalization is a technique in which we express a matrix in a basis where all basis vectors are eigenvectors of the original matrix. The benefit of doing this is that it turns exponentiation of matrices, which is hard to calculate, into exponentiation of scalars, which is much easier to calculate.
An eigenvector of our matrix $A$ is a non-zero vector $v$ for which $Av = \lambda v$ for some scalar $\lambda$, which we call an eigenvalue. If there are any such vectors we can find them using their definition.
Subtracting $\lambda v$ from both sides yields:

$$(A - \lambda I)v = 0.$$
An equation of the form $Mv = 0$ will only have non-trivial solutions if the column vectors of $M$ are linearly dependent, that is, if $\det(M) = 0$. Thus we can find all scalars $\lambda$ for which there are non-trivial vector solutions by solving $\det(A - \lambda I) = 0$. Because of this property, the polynomial $\det(A - \lambda I)$ is called the characteristic polynomial of $A$.
In our case we have the following:

$$\det(A - \lambda I) = \det \begin{pmatrix} -\lambda & 1 \\ 1 & 1 - \lambda \end{pmatrix} = -\lambda(1 - \lambda) - 1 = \lambda^2 - \lambda - 1 = 0.$$
Solving for $\lambda$ yields two eigenvalues:

$$\lambda_1 = \frac{1 + \sqrt{5}}{2}, \qquad \lambda_2 = \frac{1 - \sqrt{5}}{2}.$$
Would you look at that, $\lambda_1 = \varphi$, the golden ratio! Some of you might already know that the golden ratio is connected to the Fibonacci numbers; in fact, as you get further and further into the sequence of the Fibonacci numbers, the ratio $F_{n+1} / F_n$ approaches $\varphi$.
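For instance, taking consecutive Fibonacci numbers:

$$\frac{13}{8} = 1.625, \quad \frac{21}{13} \approx 1.6154, \quad \frac{34}{21} \approx 1.6190, \quad \ldots \quad \longrightarrow \quad \varphi \approx 1.6180.$$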
Now we can solve for the eigenvectors $v_1$ and $v_2$, and if the two resulting vectors are linearly independent we may use them as the basis for our diagonalization. Gauss elimination yields:

$$v_1 = \begin{pmatrix} 1 \\ \lambda_1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 1 \\ \lambda_2 \end{pmatrix}.$$
These vectors are indeed linearly independent, and we can use them as the columns of our change-of-basis matrix $P$. We can now write $A = PDP^{-1}$, where

$$P = \begin{pmatrix} 1 & 1 \\ \lambda_1 & \lambda_2 \end{pmatrix}, \qquad D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.$$
We then have that

$$A^n = (PDP^{-1})^n = PDP^{-1} \, PDP^{-1} \cdots PDP^{-1} = PD^nP^{-1},$$

since all the inner $P^{-1}P$ factors cancel.
This is very nice, since

$$D^n = \begin{pmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{pmatrix}.$$
After calculating

$$P^{-1} = \frac{1}{\lambda_2 - \lambda_1} \begin{pmatrix} \lambda_2 & -1 \\ -\lambda_1 & 1 \end{pmatrix} = \frac{1}{\sqrt{5}} \begin{pmatrix} -\lambda_2 & 1 \\ \lambda_1 & -1 \end{pmatrix},$$

we can solve for the first slot of $A^n \begin{pmatrix} 0 \\ 1 \end{pmatrix} = PD^nP^{-1} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ to get our closed expression:

$$F_n = \frac{\lambda_1^n - \lambda_2^n}{\sqrt{5}} = \frac{1}{\sqrt{5}} \left( \left( \frac{1 + \sqrt{5}}{2} \right)^n - \left( \frac{1 - \sqrt{5}}{2} \right)^n \right).$$
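As a sanity check, here is a sketch of the closed formula in Haskell (fibClosed is a name of my own, not from the original post; it uses Double, so rounding only gives exact answers for modest n):

-- Hypothetical sketch: Binet's formula in floating point. The round is
-- needed because sqrt 5 is irrational and Double arithmetic is inexact.
fibClosed :: Integer -> Integer
fibClosed n = round ((phi ^ n - psi ^ n) / sqrt 5)
  where
    phi = (1 + sqrt 5) / 2 :: Double
    psi = (1 - sqrt 5) / 2

-- ghci> map fibClosed [0..10]
-- [0,1,1,2,3,5,8,13,21,34,55]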
When I first happened upon the closed formula for the $n$-th Fibonacci number it seemed so shockingly random: a formula with a bunch of square roots always giving me a recursively specified integer. After learning this proof it doesn’t feel as random anymore; instead, I feel it would be more surprising if we carried out the whole diagonalization process and ended up with no roots at all. Perhaps more importantly, it opened my eyes to linear algebra as a powerful mathematical tool in its own right, and not just something to be applied in geometry, flow balancing or computer graphics.
It was pointed out to me on Mastodon that this technique is of interest even when it is not possible to diagonalize the stepping matrix. This is because, using fast binary exponentiation, one can perform the matrix exponentiation in a logarithmic number of matrix multiplications. Thus any sequence defined by a linear recurrence can be computed in logarithmically many steps!
Fast binary exponentiation uses the identity $A^{2k} = (A^k)^2$: when the exponent is even we can halve it and square, rather than performing $n$ successive multiplications. Recursively doing this each time the exponent is even yields exponentiation in a logarithmic number of multiplications.
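Here is a sketch of that idea in Haskell (the names Mat, mul, power and fibFast are my own, not from the original post):

-- Sketch: fib via fast binary matrix exponentiation.
-- ((a, b), (c, d)) stands for the 2x2 matrix [[a, b], [c, d]].
type Mat = ((Integer, Integer), (Integer, Integer))

-- Product of two 2x2 matrices.
mul :: Mat -> Mat -> Mat
mul ((a, b), (c, d)) ((e, f), (g, h)) =
  ((a*e + b*g, a*f + b*h), (c*e + d*g, c*f + d*h))

-- power m k computes m^k using m^(2j) = (m^j)^2,
-- so only O(log k) matrix multiplications are performed.
power :: Mat -> Integer -> Mat
power _ 0 = ((1, 0), (0, 1))                       -- identity matrix
power m k
  | even k    = let half = power m (k `div` 2) in mul half half
  | otherwise = mul m (power m (k - 1))

-- The top-right entry of (step matrix)^n is the n-th Fibonacci number.
fibFast :: Integer -> Integer
fibFast n = b
  where ((_, b), _) = power ((0, 1), (1, 1)) n

-- ghci> map fibFast [0..10]
-- [0,1,1,2,3,5,8,13,21,34,55]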