What I learned writing a LISP Interpreter
Why LISP?
After being reminded of Advent of Code at the beginning of this month, I decided to start participating in it. I knew of this site before, but in the past I learned of it after the event was already over. One thing I love about these types of programming challenges is seeing the solutions that other people come up with and the approach they take to solving the problems.
Python is my go-to language for solving these types of problems. No other language that I know of does a better job at getting out of my way and letting me focus 100% on the problem at hand. When thinking about the programming languages that I prefer, I naturally ask myself a few questions: What exactly makes a programming language more fun to use than another? What makes you appreciate one programming language more than another?
I love reading about programming languages. One of my favorite (historic) blogs to read is that of Andy Gavin, who was one of the founders and lead programmer at Naughty Dog Games. He is someone who was also fascinated by the power of programming languages, so much so that he even designed a compiler for his own LISP-like language called GOAL and then proceeded to write multiple best-selling PS2 games using it. Seeing as Advent of Code puzzles provide a good opportunity to learn a new language, I decided that I wanted to try and solve a few with LISP. This was also inspired by projects like Open GOAL which are reverse engineering Gavin's custom LISP dialect. It is all just fascinating to me.
Writing a LISP interpreter took me down a fun rabbit hole of computer science and programming language design and implementation that I would like to share a little bit about with you.
Writing an Interpreter
Firstly, I would like to recommend Robert Nystrom's "Crafting Interpreters", which is a free (and amazingly high quality) book about implementing an interpreter for a programming language. I referenced this book quite a bit.
My LISP journey was broken down into 3 main parts:
Tokenizing
The first part of writing a language is tokenizing. This is breaking down the source text of a file into individual bite-sized pieces that correspond to something meaningful in the language. It's up to you to decide the grouping of words and text corresponds to meaning in your language.
For example, the source text:
(define square (lambda (n)
(* n n)))
Would become a list of tokens, with each having its own type assigned to it:
Found 15 tokens in total.
'(' 'define' 'square' '(' 'lambda' '(' 'n' ')' '(' '*' 'n' 'n' ')' ')' ')'
Things like whitespace and line breaks are effectively ignored, as they don't have any meaning in the language. Reading in the tokens was as simple as looping through a list of regular expressions and detecting the first one that matches the beginning part of the source file:
struct tokenRegex tokenRegexes[] = {
{ TokenType::WHITESPACE, "[\\s,]+" },
{ TokenType::COMMENT, ";.+" },
{ TokenType::PAREN_LEFT, "\\(" },
{ TokenType::PAREN_RIGHT, "\\)" },
{ TokenType::QUOTE, "\'" },
{ TokenType::BOOLEAN, "false" },
{ TokenType::BOOLEAN, "true" },
{ TokenType::SYMBOL_NUMBER, "[^\\s\\[\\]{}('\"`,;)]+" },
{ TokenType::STRING, "\\\".*\\\"" }
};
When a pattern is matched, a Token
is created, given a type, and the matched part of the source
string is set to the content of that token.
When this is done, you have an entire list of all tokens in the source program,
which leads to:
Parsing
Parsing is the step where the data in individual tokens are used to build
what I called Expr
s, or Expressions.
In LISP and functional languages, everything is an expression.
An expression object could take one of a few forms.
In my implementation, an expression can be a Number
, Symbol
, Boolean
, or List
.
typedef struct Expr
{
ExprType type;
union as
{
struct ExprSymbol symbol;
struct ExprNumber number;
struct ExprList list;
struct ExprBool boolean;
} as;
} Expr;
The job of the parser is to start at the beginning of the token list,
and use those tokens to construct individual expressions.
For example, if we see a number token, we simply set the type of expression
to Number
, and then convert the string value of the number into its integer
equivalent.
The same thing is done for symbols and booleans, but lists are a bit different.
A list expression can be implemented as a dynamically sized array of expressions.
Creating and populating a list expression is done by recursively parsing expressions
and adding it to the list:
consume(TokenType::PAREN_LEFT, "Expected (, found " + token.content);
expr->type = ExprType::List;
while (check(TokenType::PAREN_RIGHT) == false)
{
auto temp = this->parse();
expr->as.list.exprs->push_back(temp);
}
consume(TokenType::PAREN_RIGHT, "Expected ), found " + token.content);
When parsing is done, you're left with a list of Expressions that form your program! All that's left to do is evaluate them.
Evaluation
Evaluation is taking an expression and converting it to its most primitive evaluated form.
Consider the expression (+ 1 2)
, after calling eval
on this, we would get a new expression 3
.
Numbers and booleans evaluate to themselves.
Symbols are simply variables, so their value is determined by doing a lookup of the symbol name
and returning whatever expression it has assigned to it.
Lists are the most complicated type to evaluate.
in LISP, the first item in a list is always considered a name of a function to call,
and the rest of the items in the list are considered its arguments.
This is where evaluating gets interesting.
Consider (square 4)
. How would this be evaluated?
(define square (lambda (n) (* n n)))
; the symbol 'square' will now return a list: (lambda (n) (* n n)) when evaluated.
(square 4)
; can be also understood as:
((lambda (n) (* n n)) 4)
Now before each function call, you must create a new variable environment which maps the function arguments to the symbols given as the function parameters.
((lambda (n) (* n n)) 4)
; becomes
((lambda (n) (* 4 4))) ; since n = 4
; or most simply
(* 4 4)
; so
(square 4)
; has become
(* 4 4)
; or just
16
All functions are defined by the user, but there are some special functions which the language implementation must provide in order to make the language useful to begin with. For my implementation, I wrote custom code for the following functions:
+ - * / > lambda define quote if car cdr cons empty?
This gives the user of the language access to many helpful standard functions
that they can use to build their programs.
Now they can perform arithmetic on numbers, compare numbers, define functions,
get the first
and rest
of a list, use an if
statement, and tell if a list is empty.
You might wonder why I only implemented >
and not <
, <=
, >=
, =
, etc.?
The answer to this is quite fun: You can actually implement these functions in the LISP language itself,
being bootstrapped by the single >
boolean operator:
(define < (lambda (a b) (> b a)))
(define = (lambda (a b) (if (> a b) false
(if (< a b) false true))))
(define >= (lambda (a b) (if (> a b) true
(if (= a b) true false))))
(define <= (lambda (a b) (>= b a)))
Using basic primitives and fleshing out a little standard library in the language you just created is one of the most magical feelings while creating a language! It's like watching Frankenstein's monster come to life.
Using My Language
Although this little LISP language is just a toy, I was curious as to how effective it was. Could I actually use this to solve a real problem? I looked at the Advent of Code problems that I had yet to solve, and chose Day 17. This looked like the perfect problem to try and solve in my language. The puzzle input is just 4 numbers, so there's no need for dealing with files, parsing strings, etc. The question itself appeared to be centered around numbers and math, not around string manipulation as some other puzzles are.
After spending some time on the problem and slowly fleshing out my standard library with generic functions as I needed them,
I was finally able to solve Part 1 in my own language!
What was even more shocking was that it honestly wasn't that bad to use.
Implementing functions like map
and filter
is a breeze in LISP:
(define filter (lambda (fn xs acc)
(if (empty? xs) acc
(if (fn (car xs))
(filter fn (cdr xs) (cons (car xs) acc))
(filter fn (cdr xs) acc)))))
So after writing a few basic functions like map
, filter
, range
, and max
,
solving the problem was actually quite enjoyable!
It felt a lot like using python's functional list features to solve problems.
I didn't feel like I was fighting the language very much.
Considering how basic my implementation was at that point,
not feeling too constrained by my primitive language and being able to easily write helper functions
is a testament to the power of LISP.
Here is the code I wrote that earned me my first star for Day 17:
; target area: x=288..330, y=-96..-50
(define xMin 288)
(define xMax 330)
(define yMin -96)
(define yMax -50)
(define inBounds (lambda (x y)
(and (inX x) (inY y))))
(define inX (lambda (x)
(and (>= x xMin) (and (<= x xMax) true))))
(define inY (lambda (y)
(and (>= y yMin) (and (<= y yMax) true))))
(define pastX (lambda (x) (> x xMax)))
(define pastY (lambda (y) (< y yMin)))
(define pastBounds (lambda (x y) (or (pastX x) (pastY y))))
(define step (lambda (x y xVel yVel peak steps)
; if we have hit our magic area, return highest peak
(if (inBounds x y) peak
; if we are past the bounds, return FAIL
(if (pastBounds x y) -1
; otherwise, do the next step
(step
(+ x xVel)
(+ y yVel)
(if (> xVel 0) (- xVel 1)
(if (< xVel 0) (+ xVel 1) 0))
(- yVel 1)
(if (> y peak) y peak)
(+ 1 steps))))))
(define shoot (lambda (xVel yVel) (step 0 0 xVel yVel 0 0)))
(define xVelDist (lambda (x) (if (<= x 1) x
(+ x (xVelDist (- x 1))))))
(define range (lambda (from to acc) (if (= from (+ 1 to)) acc
(range (+ from 1) to (cons (- to from) acc)))))
(define testRange (range 0 100 '()))
(define searchY (range 0 100 '()))
(define peaks (lambda (x) (> x 0)))
(define searchX (filter (lambda (x) (inX (xVelDist x))) testRange))
(define maxAtX (lambda (x)
(max (filter peaks (map (lambda (y) (shoot x y)) searchY)) 0)))
(max (map maxAtX searchX) 0)
What more can you ask for from a project like this? I implemented a programming language and then used it to solve a real problem. It was a really good feeling getting the right answer using a tool that you made yourself!
Takeaways
LISP is a language that is timeless. Getting my feet wet in writing an interpreter for LISP has taught me a lot about the fundamentals of creating a programming language. For its simplicity, writing LISP feels incredibly powerful. Although it is dated now, I could see myself loving LISP if I were a programmer back in the 80's and 90's. Most of the languages we take for granted today weren't around back then, or were in their infancy and not very practical to use. Imagine having a language like LISP where you could warp it into whatever you wanted it to be!
Python reached version 1.0 in January 1994. The major new features included in this release were the functional programming tools lambda, map, filter and reduce. Van Rossum stated that "Python acquired lambda, reduce(), filter() and map(), courtesy of a Lisp hacker who missed them and submitted working patches".