The benefit in using De Brujin notation is that, rather than a map, a dynamic array may be used for the environment, with variables being idecies into said array. An additional benifit is that the machine does not need to concern itself with alpha equivalence. As a side-note, counting conventionally starts at , not , however in a virtual machine counting from is, of course, much more natural.
The second improvement to be made is to “dump” the dump stack. Rather than using the dump stack the current state will be pushed onto the stack as a closure which the return instruction can then restore when exiting a lambda.
The third improvement is to introduce a new, static “stack”. The rationale for this is that many programs will include top level definitions which are in scope throughout the program. Including these definitions in the environment whenever a closure is created is a waste of both memory and processing power.
The fourth improvement is allowing multi-variable lambdas. In the case of the SECD machine using a multi-variable lambda rather than multiple single variable lambdas corresponds to not nesting closures, which would lead to redundant copies of the same environment being made. Note that this does not sacrifice currying, as will become clear in later sections.
In this section a SECD machine implemented in Haskell will be introduced bit by bit. This machine is by no means an efficient nor powerful implementation, it serves only to demonstrate the core concepts of the SECD machine.
The machine will only be able to recognise two types of values. Integers and Closures.
data Val
= I Int
| Clo Int [Val] [Inst] -- number of arguments, environment, instructions
deriving (Eq, Show)
The machine comes with the following instruction set.
data Inst
= Const Int
| Global Int
| Local Int
| Closure Int Int -- number of args, number of instructions
| App
| Ret
| Dup
| Add
deriving (Eq, Show)
Const pushes a constant onto the stack.
Global pushes a global variable onto the stack.
Local pushes a local variable onto the stack.
Closure creates a closure taking a set number of arguments and consuming a set amount of instructions from the control stack. The resulting closure is pushed onto the stack.
App pops a closure and an argument from the stack. If the closure takes more than one argument the argument is pushed onto the closures environment and the closure is pushed back onto the stack. If the closure takes only one argument the closure will, rather than being pushed onto the stack, replace the current environment, such that the closures instructions are placed in the control stack, and its environment placed in the environment stack. A closure is then formed from the old control and environment stacks, which is pushed onto the stack.
Ret pops a closure in the second index of the stack and installs it as the current control and environment stacks. The element at the top of the stack remains untouched, yielding the result of an application.
Dup duplicates the element at the top of the stack. It’s only included as a matter of convenience.
Add pops the top two elements off the stack, adds them, and pushes the result back onto the stack.
All stacks grow from right to left, that is, the left most element is at the top of the stack.
Before | After | ||||
---|---|---|---|---|---|
Control | Env | Stack | Control | Env | Stack |
Const i : c | e | s | c | e | i : s |
Global i : c | e | s | c | e | Globals[i] : s |
Local i : c | e | s | c | e | e[i] : s |
Closure n a : ... : c | e | s | c | e | Closure a {e} [ ... ] : s |
App : c | e | Closure {e'} [c'] : v : s | c' | v : e' | Closure {e} [c] : s |
App : c | e | Closure {e'} [c'] : v : s | c | e | Closure {v : e'} [c'] : s |
Ret : c | e | v : Closure \{e'\} [c'] : s | c' | e' | v : s |
Dup : c | e | v : s | c | e | v : v : s |
Add : c | e | v : s | c | e | v + v : s |
Consider the following lambda expression:
It composes a lambda which increments its input by one with an identical lambda and then applies 0 to the result. In order to run this on our SECD machine it must first be rewritten with De Brujin index notation.
Now in order to create an instruction tape for the SECD machine it must be rewritten, using Closure as lambdas, Local for variables and keeping in mind to explicitly apply to and return from lambdas. Additionally it will have to be written in postfix form due to the nature of stack based machines. After performing these actions the expression has been transformed into the following tape:
Reading this (or one of the previous lambda expressions) backwards you should be able to convince yourself of the validity of this translation.
This can now directly be translated into a list of instructions for the Haskell machine implementation.
[ Const 0, Closure 1 4, Const 1, Local 0, Add, Ret, Dup
, Closure 3 6, Local 0, Local 1, App, Local 2, App, Ret
, App, App, App]
Which can then be evaluated in order to yield the normalized expression.
λ eval [] [ Const 0, Closure 1 4, Const 1, Local 0, Add, Ret, Dup
, Closure 3 6, Local 0, Local 1, App, Local 2, App, Ret
, App, App, App]
[I 2]
As expected, the expression reduces to 2.
While this is a functional implementation, it does have a few fatal flaws. First of, it’s usage of linked lists. While more natural in haskell a real SECD machine should of course use arrays to achieve constant variable lookup time. Second is the perhaps more obvious lack of features, to implement an interesting language one would need a more complete instruction set. Third is the representation of the tape. In a real-world SECD machine one would have a succint binary representation, allowing for simple compact storage of programs and speedy execution.
The implementation itself is trivial and is little more than a Haskell translation of the instruction table above.
module SECD where
data Inst
= Const Int
| Global Int
| Local Int
| Closure Int Int -- args, scope
| App
| Ret
| Dup
| Add
deriving (Eq, Show)
data Val
= I Int
| Clo Int [Val] [Inst] -- args env code
deriving (Eq, Show)
vadd :: Val -> Val -> Val
vadd (I x1) (I x2) = I (x1 + x2)
vadd _ _ = error "attempted addition with closure"
data SEC = SEC [Val] [Val] [Inst] [Val] -- stack, environment, control, global
deriving Show
-- | Takes a global env, a list of instructions and returns and int
eval :: [Val] -> [Inst] -> [Val]
eval globals insts = go (SEC [] [] insts globals)
where
go :: SEC -> [Val]
go sec@(SEC stack env control global) = case control of
[] -> stack
(x:xs) -> case x of
Const i -> go (SEC (I i:stack) env xs global)
Global i -> go (SEC (global !! i:stack) env xs global)
Local i -> go (SEC (env !! i:stack) env xs global)
Closure a s -> go (SEC (Clo a env (take s xs):stack) env (drop s xs) global)
Ret -> let (v:(Clo 0 e c):st) = stack in go (SEC (v:st) e c global)
App -> case stack of
(Clo 1 e c:v:st) -> go (SEC (Clo 0 env xs:st) (v:e) c global)
(Clo n e c:v:st) -> go (SEC (Clo (n-1) (v:e) c:st) env xs global)
Dup -> let (v:st) = stack in go (SEC (v:v:st) env xs global)
Add -> let (x1:x2:st) = stack in go (SEC (vadd x1 x2:st) env xs global)
Feel free to use any of the code or concepts here in your own projects! The code can be considered free of copyright and all notions of liability and/or warranty.
Modern SECD Machine, University of Calgary - Gave me the idea to scrap the D stack.