(select (g v svg) (.&edges ?x ?y (?x :path ?y) (.&draw-path :sw 1.75 :so 0.85)) (.&segments ?x ?y (?x :stipple ?y) (prob 0.7 (.&draw-stipple 2.0 1.3))) (.&vertices ?x (?x _ _) (prob 0.1 (.&draw-circ 8f0 :fill :black))))
The main purpose of the DSL is to easily select either vertices, edges, or
connected segments. Then draw each match. All selectors and drawing methods are
.&. For lack of a better word I have called these
prefixed symbols "triggers".
Implementing the full DSL requires several other libraries, which makes it inconvenient as a tutorial. However, you probably recognize that iterating over a dataset and doing something to each match—depending on some condition—is a pretty common programming pattern. So much so that it might seem unnecessary to re-invent it. But I promise there is a pretty good reason, if you just bear with me for a bit.
What are we Doing?
In this post we will implement a simplified working version of the Common Lisp macro that makes this DSL work. Instead of selecting things from a graph we will select keys in a list of key value pairs based on the prefix of the key.
We already have a template above, but here is the smallest example that outlines the syntax we want to achieve:
(select l (.&itr "DE" (.&print)))
That is: for each item in list,
l; where the key starts with
"DE", print the matching item.
As we will see, the design of Lisp (in general) and CL (in particular) enables us to do this is in 35 lines of code. You can see the final code in this gist.
To avoid going on multiple tangents I will gloss over some technical details. And instead provide a number of smaller code examples as we go, to illustrate what each part of the code does, and why we need it.
For a much more detailed intro to Common Lisp I recommend Practical Common lisp by Peter Seibel. Large parts of the book are available online. The early early chapters cover syntax, types, semantics, and most importantly macros, in detail.
If you have more than a passing interest in meta programming I can recommend Let over Lambda by Doug Hoyte. Much of which is available online as well.
The code snippets below can be pasted into a
if you want to experiment along the way. In which case I suggest installing SBCL. Which
is a pretty solid CL implementation. SBCL can be installed
from package mangers on Linux (
apt-get install sbcl)
and Mac (
brew install sbcl).
You can also download SBCL here.
Let's start with what is an inadequate, but hopefully not entirely incomprehensible intro to the parts of CL we will be using.
Comments in CL are prefixed with
;. In addition I will use
;; for comments that show what some code evaluates to. Eg. if you
type it into the REPL.
Assuming you have some programming experience you are already used to using symbols to represent numbers, syntax, functions and variables. So much so that you might not even think about these as symbols at all.
If you type a symbol like
xyz into a CL REPL it will be
interpreted as a variable. Just as you might expect. And if the variable is not
defined, you will get an error:
; error: the variable XYZ is unbound
If an S-expression starts with a symbol, that symbol is expected to be a function. For example:
(this is an s-expression)
; error: undefined function: THIS
Unlike in many laguages, however, symbols are first class citizens in CL. That is, symbols are distinct from the meaning, or value of the symbol. And as we will see shortly you can manipulate symbols directly. Consider this example:
; it is convention to use uppercase for
; something that has been evaluated(let ((xyz 73)) xyz)
;; 73(let ((xyz 73)) 'xyz)
'xyz is actually a
shorter way to write
(quote xyz). quote is called a special
operator in CL. All you need to know before we move on is that
will return whatever was passed in without evaluating it.
This might not immediately seem useful, but we will see that it has important consequences.
Now we need to look at some lists. This is Lisp after all. You can make lists like this:
(list 1 2 3)
;; (1 2 3)(list 1 2 (list 3 4))
;; (1 2 (3 4))
quote we can write the following instead. It is a little more
compact, and evaluates to the same lists:
'(1 2 3)
;; (1 2 3)'(1 2 (3 4))
;; (1 2 (3 4))
We also need to access elements or parts of lists. The functions we will be
subseq. The first
two have unusual names. I refer you here for an explanation as
to why. But here are examples of what they do:
(car '(1 2 3 4))
;; 1(cdr '(1 2 3 4))
;; (2 3 4)
cdr compliment each other by
returning the first element, and the rest of the list, respectively. We will
see this pattern in use later on.
In general you can use
subseq to get parts of a list:
(subseq '(1 2 3 4) 1)
;; (2 3 4)(subseq '(1 2 3 4) 1 3)
;; (2 3)
To get an element at an arbitrary index you can use
(nth ind a),
but we won't be needing it. We will however need to get substrings
from a string. Fortunately
subseq has us covered for that as well:
(subseq "abcd" 1 3)
As mentioned in the introduction we want to iterate lists of key value pairs. For simplicity we will just use lists of lists. And one item is the list with a key and a value:
;; ((KEY1 VAL1)(key2 7))
;; (KEY2 7))
For convenience CL also has the
The backtick is a more sophisticated version of
'. You can
` to behave the same as
'. Except you can
, to turn evaluation back on temporarily:
`(a 1 2 ,(+ 3 4) b)
;; (A 1 2 7 B)
Another very convenient feature of is that
will splice in a list:
`(a 1 2 ,@(list 3 4) b)
;; (A 1 2 3 4 B)
Sometimes it is necessary to nest
,@ in more complicated ways to achieve the
result you want. This gets confusing rather quickly. Experimenting in the REPL
is an invaluable approach to figure out what is going on in more complex
cases. Our use will be pretty straightforward, but we will see that the syntax
is rather convenient for writing macros.
So far we have seen
quote used in a few different examples, but
what about this:
'(list 1 2 3)
;; (LIST 1 2 3)
What we get back here looks suspiciously like the code we used initially to create our first list. So what if we try to pass this quoted code to CL's eval function?
(eval '(list 1 2 3))
;; (1 2 3)
The quoted code evaluates to the same list as before!
We won't be using
eval directly anywhere. But this ability to make
S-expressions that can be interpreted as code is really the core of CL macros.
We will see this soon.
Any S-expression can be interpreted as code. But whether the code works in the context where it is evaluated is another matter. Take this for example:
(eval '(fx 1 2 3))
; error: undefined function FX
As noted before, the first symbol in an S-expression is expected to be a
function. We have not defined
fx. So we get an error.
CL comes out of the box with everything we need to write a DSL/macro in CL. But first we will define a few extra utility functions to help us out.
Since we will be looking for "triggers" in our code, we need a way to find
them. We have already seen that code is represented as lists with symbols in
them. If we convert symbols to strings, we can check the prefix of each string
; turn anything into a string(defun mkstr (&rest args) (with-output-to-string (s) (dolist (a args) (princ a s)))) (mkstr (list 1 2 3))
;; "(1 2 3)"(mkstr '(list 1 2 3))
;; "(LIST 1 2 3)"(mkstr 'xyz)
;; "XYZ"(mkstr 'xyz 'abc)
And here is a function to check if a string is a prefix of another string:
; returns t if "s" has this "prefix"; otherwise nil(defun startswith? (s prefix) (let ((s (mkstr s))) (and (<= (length s) (length prefix)) (string= prefix s :end2 (length prefix))))) (startswith? "abc" "ab")
;; T(startswith? "abc" "de")
We will also need to convert strings back into symbols:
; turn anything into a symbol(defun symb (&rest args) (values (intern (apply #'mkstr args)))) (symb "XYZ")
;; XYZ(symb 'xyz )
Now we can make the following function that tells us whether a given s-expression is a trigger:
; return t if body is a list, where the
; first symbol starts with "trig"(defun has-trigger? (body trig) (and (listp body) (startswith? (car body) trig))) (has-trigger? '(.&hi 3 2) ".&")
;; T(has-trigger? '(.&hello) ".&")
;; T(has-trigger? '(".&") ".&")
;; NIL(has-trigger? 'ohhai ".&")
If you remember back to the beginning; what we set out to do was really just to iterate a list, then do something with items that match our condition. It seems like we have deviated a little from that? Surely there must already be a way to do this in CL?
; l is a local variable in loop(loop with l = '(.&abc def ghi) for s in l if (startswith? (mkstr s) ".&") do (print s))
The fact that a better—at least more comprehensive—version of our
DSL already exists in CL will not deter us! In fact, this exact
loop is the perfect template for the last bit of code we need. In
the next section we combine everything.
Compiling our DSL
Now that we have all the parts, we need to put it together. The following functions do just that: The functions call each other recursively, so here are both of them:
(defun do-trigger (l body)
; body will only ever be on the format:
; (.&itr sel ...) or (.&fx ...)(let ((fx (subseq (mkstr (car body)) 2))) (if (equal "ITR" fx)
; if ITR: return the loop we saw earlier`(loop for $ in ,l if (startswith? (car $) ,(second body)) do ,@(rec l (subseq body 2)))
; else: return (fx $ ...)`(,(symb fx) $ ,@(rec l (cdr body)))))) (defun rec (l body) (cond
; if body is an atom: return body
; [symbols, numbers, strings, t, nil are atoms]((atom body) body)
; else if body is (.&trigger ...): call do-trigger((has-trigger? body ".&") (do-trigger l body))
; else body is a list: recursively call rec
; and splice it back together(t `( ,(rec l (car body)) ,@(rec l (cdr body))))))
I hope the comments make it relatively clear how this works, but it is a good idea to play around with this to really get a feel for what is happening.
(defmacro select (l &body body)
; (gensym) creates a new, unique symbol(let ((l* (gensym "L"))) `(let ((,l* ,l)) ,@(rec l* body))))
And using it:
(select `((urq 77) (abc 3) (abcd 2) (dec 8)) (.&itr "DE" (print (list "HIT" $))) (print '2nd-loop) (.&itr "AB" (print (.&list 'hi)) (print (.&list 'ohhai))))
; which prints:
; ("HIT" (DEC 8))
; ((ABC 3) HI)
; ((ABC 3) OHHAI)
; ((ABCD 2) HI)
; ((ABCD 2) OHHAI)
; as we would hope
If you want to see what a macro expands into, you can check for yourself:
; expands macros once'(select l ((urq 77) (abc 3) (abcd 2) (dec 8)) (.&itr "DE" (.&print))))
;; (LET ((#:L94 L))
;; ((URQ 77) (ABC 3) (ABCD 2) (DEC 8))
;; (LOOP FOR $ IN #:L94
;; IF (STARTSWITH? (CAR $) "DE")
;; DO (PRINT $)))
If you are now feeling confused about how this works, you might also be wondering when our macro generates this code? In the next section I will try to give you some examples that can help thinking about this further.
Runtime or Compile time?
defmacro syntax is very similar to
we have used multiple times. This makes it easy to fall for the deception that
macros are just weird functions. But they are not functions in the sense many
programmers are used to.
Macros behave very different from functions in three important ways. First, the
input to a macro (e.g. our
select macro) is not evaluated. You
might say the arguments are implicitly quoted.
Second, macros are not functions that return results; they are code that generates new code. The generated code is what will be executed in the program's runtime.
And third. We know from testing our
select macro in the previous
section that the
rec function is definitely executed as a part of
the macro expansion. That is how our
loop code is generated after
But the macro expansion does not happen in the same context as a function in the same location in source code is executed. It is sometimes said that the macro is expanded at compile time. Although the distinction between compile time and runtime can get a little blurry.
Consider this macro:
(defmacro dummy (a b)
: format is a fancy print(let ((res `(format t "~&executed: ~a ~a~&" ,a ,b)))
; prints something when dummy is expanded:(format t "~&expanded: ~a ~a~% ~a~&" a b res)
; dummy returns this code:res))
What do you think will be printed if we do this:
(loop for i from 0 below 3
; counts i=0, 1, 2do (format t "~&log: ~a~&" i) (dummy 99 i))
This is the printed output (in SBCL):
; expanded: 99 I
; (FORMAT T ~&executed: ~a ~a~& 99 I)
; log: 0
; executed: 99 0
; log: 1
; executed: 99 1
; log: 2
; executed: 99 2
We can see that even though it appears as if
dummy is inside a
loop, it is only expanded once. But the code generated by
dummy is executed three times.
Moreover, we see that while the line that prints
dummy in the code, the
expanded message from the
macro expansion appears first.
In addition to all that we also see in the
expanded message that
dummy is expanded the counter is just the symbol
Debugging macros is notoriously difficult. If you don't understand these examples right away there is no reason to despair. It took me a long time to get a good intuition for this. But I hope these small examples can help a little.
Extending a Language
I framed this tutorial around building a DSL. That is the title of this post
after all. But a more important motivation for writing this is to show that the
way we recursively processed s-expressions as data in
rec is no
different from how Lisp reads and evaluates any other source code you hand to
Saying this absolutely an oversimplification. And I am not skirting the details because I don't think understanding them is important. It is important. But I wanted to give a small taste of what it can be like to work with code that generates code. That way it might seem a little less abstract if you want to learn more in the future.
I hope I managed to make it feel kind of intuitive. Almost as if you are discovering a tool as you are using it. That has been my experience of using CL over several years. And it would be nice to try to replicate some of that feeling in others in a slightly shorter amount of time.
Even if this tutorial seems like an overly complicated way to write what is ostensibly a for loop. Another way to think about it is that we have extended the Lisp language by writing a plugin for Lisp. And the plugin is written in Lisp as well. Once you start thinking about programming and—more importantly—problem solving in this way, it is very hard to go back.
TL;DR, Macros are an extremely powerful way to fundamentally shape the language you are using to solve your problems. Not just the words and tools you already have in it.
EDIT: I expand on this approach in the next post.
- Learning CL is absolutely worth your time if you are interested in learning a new programming language. However, if you want to spend your time learning something you are more likely to use directly in a professional setting you may want to consider Clojure instead. If I had not already built all my utilities I probably would start using it instead.
- I don't know what is the easiest alternative for Windows, sorry.
symbare two of the first functions introduced as a series of utilities in On Lisp by Paul Graham. the book and the other utilities can be seen in the link.
- In fact loop is a macro (or DSL, if you will) for looping that is already part of CL. Practical Common Lisp has an entire chapter about it.
- The reason for using
(gensym)is an entire topic in its own right. We have skirted the issue here, but macro hygiene is an important topic for writing good macros. Different Lisps have different approaches to this issue, and it is covered in detail in both Practical Common Lisp, On Lisp and Let over Lambda. I can also mention that the way we unceremoniously use
$as the iterator symbol in
do-triggeris vulnerable what's called variable capture; one of the primary issues related to macro hygiene.