In the previous post I described how to implement a small macro called select.
The macro implements a tiny DSL for selecting items from a list of key/value
pairs. I also wrote that while select in and of itself is not very useful, the
overall approach for making a DSL is quite powerful. So let's use the same
technique to do something more interesting.
My vector library (cl-veq) has multiple utilities for doing (among other
things) vector mathematics. A core component of cl-veq is a macro called vv.
I outlined the idea and motivation behind vv in an earlier post. You probably
want to look at both these earlier posts before reading on. In any case, all
these posts are written so it should be possible to follow even if you have no
experience with Common Lisp (CL).
What are we Doing this Time?
We will extend the approach used in the select macro to implement a working
version of a part of vv from cl-veq. Specifically, we will make this syntax
work for doing vector operations with arbitrary functions on packs of values:
(2!@+ (2!@* 1.0 2.0 3.0 4.0) (2!@/ 5.0 6.0 7.0 8.0))
Which can be written like this in vanilla CL:
(values (+ (* 1.0 3.0) (/ 5.0 7.0)) ;; (values 3.7142859 8.75)
        (+ (* 2.0 4.0) (/ 6.0 8.0)))
It should work for vectors of dimension 1-9. And as you see we put the
dimension before the trigger in the symbol, and the function name after the
trigger. So e.g. this should work too:
(3!@- 1 2 3 4 5 6) ;; (values -3 -3 -3)
We will see that we don't require much more code than we needed for select.
And the result is pretty easy to extend with triggers other than !@. The
result can be seen in this gist.
Values, values, values
First we need some utilities to handle values. cl-veq has a lot of sugar
coating to make it more convenient to handle value packs. I will only
introduce a few here, but enough to give you an idea of how it can be done.
The first one is for making sure you get all values from one or more values
forms:
(defmacro ~ (&rest rest)
  `(multiple-value-call #'values ,@rest))
(~ (values 1 2));; (~ 1 2)
(~ (~ 1 2));; (~ 1 2) ; surprise!
(~ (~ 1 2 (~ 3)) (~ 4 5));; (~ 1 2 3 4 5) ; and so on
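To see why a helper like this is needed at all, here is a small sketch in plain CL (the function names first-values-only and all-values are made up for this illustration). An ordinary function call only receives the first value of each argument form, while multiple-value-call passes all of them along:

```lisp
;; An ordinary call keeps only the primary value of each argument form.
(defun first-values-only ()
  (list (values 1 2) (values 3 4)))

;; multiple-value-call spreads every value of every form into the call.
(defun all-values ()
  (multiple-value-call #'list (values 1 2) (values 3 4)))

(first-values-only) ; (1 3)
(all-values)        ; (1 2 3 4)
```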
Note that I am using ~ to mean a value pack, as well as a syntax that will
coerce one or more values into a single values.
Sometimes you want all values in a list instead, in which case we have the following macro:
(defmacro lst (&body body)
  `(multiple-value-call #'list (~ ,@body)))
(lst (~ 1 2) (~ 3 4 (~ 5 6))) ; (1 2 3 4 5 6)
cl-veq contains a good deal more sugar coating. But to keep this relatively
simple we will only introduce one more macro which is handy for debugging. It
allows you to wrap it around any combination of values to print them, while
still returning the same values.
(defmacro vpr (&body body)
  (let ((res (gensym)))
    `(let ((,res (lst ,@body)))
       (format t "~&;> ~{~a~^ | ~}~&;; ~{~a~^ | ~}~&"
               ',body ,res)
       (apply #'values ,res))))
(vpr (~ 1 2) (~ 3 4)) ;; (~ 1 2 3 4)
We see that functionally it behaves just like ~, but it also prints the
following two lines:
;> (~ 1 2) | (~ 3 4)
;; 1 | 2 | 3 | 4
Seeing as the notation we are implementing has explicit dimension, it would
make sense to also have a specific way to ensure that something is exactly n
values, e.g. 3~ or similar. There are several ways to achieve this. We won't
implement it here, but it can be done by extending the approach we are about
to implement.
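As a rough idea of one of the other ways, here is a minimal sketch that checks the count at runtime instead of using a trigger in the symbol name. The macro name exactly is hypothetical and is not part of cl-veq:

```lisp
;; Hypothetical helper (not from cl-veq): collect all values from BODY,
;; signal an error unless there are exactly N of them, then return them
;; as multiple values again.
(defmacro exactly (n &body body)
  (let ((res (gensym "RES")))
    `(let ((,res (multiple-value-call #'list ,@body)))
       (unless (= (length ,res) ,n)
         (error "expected ~a values, got ~a: ~a" ,n (length ,res) ,res))
       (values-list ,res))))

(exactly 3 (values 1 2) 3)   ; returns the three values 1, 2 and 3
;; (exactly 3 (values 1 2))  ; would signal an error
```

A version based on a 3~ style trigger could do the same check at macro expansion time, which is closer to what the rest of this post builds.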
Parsing the new Syntax
We already have startswith? as we defined in the previous post. However, this
time we will be using something slightly more general:
(defun match-substr (sub str)
  (loop with lc = (length sub)
        for i from 0 repeat (1+ (- (length str) lc))
        if (string= sub str :start2 i :end2 (+ i lc))
        do (return-from match-substr i)))
match-substr will return the first index where sub matches str, otherwise it
returns nil. That means we can find our triggers like this:
(match-substr "!@" "2!@fx");; 1
(match-substr "!@" "abc!@+");; 3
(match-substr "!@" "abc!");; nil
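As a side note, the standard function search gives the same answers for these inputs, so it can be used as a quick cross-check. The wrapper name find-trigger below is made up for this sketch:

```lisp
;; search returns the index of the first match of the first sequence
;; inside the second one, or NIL if there is no match.
(defun find-trigger (trig str)
  (search trig str))

(find-trigger "!@" "2!@fx")  ; 1
(find-trigger "!@" "abc!@+") ; 3
(find-trigger "!@" "abc!")   ; NIL
```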
This time we need to extract the dimension (prefix) from the trigger symbols in
addition to the function name (postfix). Assuming that sym complies with our
syntax, the following is sufficient:
(defun split-vv-trigger* (sym trig)
  (values (digit-char-p (char sym 0))
          (symb (subseq sym (+ (length trig)
                               (match-substr trig sym))))))
(split-vv-trigger* "3!@+" "!@") ;; (~ 3 +)
digit-char-p returns the digit (as a number) if the input is a digit, and nil
otherwise. So if the prefix is not a digit, the dimension will be nil, and we
will only get an error later, when we need the dimension to be a number.
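A few calls at the REPL show the behaviour described above. The function name dim-of is only used for this sketch:

```lisp
;; Return the leading digit of STR as a number, or NIL if it is not a digit.
(defun dim-of (str)
  (digit-char-p (char str 0)))

(dim-of "3!@+")        ; 3
(dim-of "a!@+")        ; NIL
(digit-char-p #\a 16)  ; 10, digit-char-p also takes an optional radix
;; (* 2 (dim-of "a!@+")) would signal a type error, since NIL is not a number
```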
Debugging macros can be really tricky, so we will check whether the dimension is actually a number instead:
(defun split-vv-trigger (sym trig)
  (let ((d (digit-char-p (char sym 0))))
    (unless d (warn "~a wants digit prefix. got: ~a" trig sym))
    (values d (symb (subseq sym (+ (length trig)
                                   (match-substr trig sym)))))))
You can probably spot several other things we could improve in this code, but this should give you an idea of how you can get better error messages in your own code. Let's try this out:
(split-vv-trigger "3!@+" "!@");; (~ 3 +) ; as before
(split-vv-trigger "a!@+" "!@");; (~ nil +)
; WARN: !@ wants digit prefix. got: a!@+
Note that warn won't stop execution (by default). So you might still get
another error later, but now you have an indication of what went wrong. I
won't go into error handling further here. But know that you can also use
error, if you actually want to interrupt execution.
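For comparison, here is a small sketch of the error variant. The function parse-dim is a simplified stand-in made up for this example, not the real split-vv-trigger:

```lisp
;; Hypothetical stand-in: return the leading digit of STR,
;; or signal an error instead of just warning.
(defun parse-dim (str)
  (let ((d (digit-char-p (char str 0))))
    (unless d (error "wants digit prefix. got: ~a" str))
    d))

(parse-dim "3!@+") ; 3

;; handler-case lets the caller catch the error and decide what to do.
(handler-case (parse-dim "a!@+")
  (error (e) (format nil "caught: ~a" e)))
;; "caught: wants digit prefix. got: a!@+"
```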
Just like in select we need to know whether a given s-expression contains our
trigger. The only difference this time around is that we explicitly check
whether the first object in the s-expression is an actual symbol. To see why,
try removing the call to symbolp. Here is the function:
(defun has-vv-trigger? (body trig)
  (and (listp body)
       (symbolp (car body))
       (match-substr trig (mkstr (car body)))))
(has-vv-trigger? '((2!@+ 1 2)) "!@");; nil
(has-vv-trigger? '(2!@+ 1 2) "!@");; 1 (the match index, which counts as true)
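Here is a sketch of what goes wrong without the symbolp check. It assumes that mkstr prints its argument roughly the way princ-to-string does; the function name nested-form-as-string is made up for this example:

```lisp
;; Assumption: mkstr behaves roughly like princ-to-string on one argument.
;; The car of the nested form is itself a list, and it prints as a string
;; that contains the trigger.
(defun nested-form-as-string ()
  (princ-to-string (car '((2!@+ 1 2)))))

(nested-form-as-string)               ; "(2!@+ 1 2)"
(search "!@" (nested-form-as-string)) ; 2, a false positive
(symbolp (car '((2!@+ 1 2))))         ; NIL, so the symbolp check rejects it
```

So without symbolp the outer list would be treated as a trigger form, even though only the inner list is one.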
Putting it Together
Now that we have all the pieces, we can define the functions to traverse code
and compile it. It is very similar to do-trigger and rec that we used for the
select macro. But obviously the code we generate is quite different:
; create a list with n new symbols
(defun nsym (n name) (loop repeat n collect (gensym name)))
; compile !@ triggers
(defun vv-do-trigger (body)
  (multiple-value-bind (dim fx) (split-vv-trigger (mkstr (car body)) "!@")
    (let ((args (nsym (* 2 dim) (mkstr "VAR" fx))))
      `(multiple-value-bind ,args (~ ,@(vv-rec (cdr body)))
         (values ; make the actual function calls:
           ,@(loop for a in args ; first dim symbs
                   for b in (subseq args dim) ; last dim symbs
                   collect `(,fx ,a ,b)))))))
; recursively process code with !@ triggers:
(defun vv-rec (body)
  (cond ((atom body) body)
        ((has-vv-trigger? body "!@") (vv-do-trigger body))
        (t `(,(vv-rec (car body)) ,@(vv-rec (cdr body))))))
Again the actual macro is almost disappointingly trivial. But this time we use
progn, which is another special operator (like quote). It is often used
exactly the way we use it here: to accept and evaluate any number of forms,
then return the last result. We have used implicit progn in multiple places
already. One example is in the body of let, which behaves exactly the same
way. Here is the final macro:
(defmacro vv (&body body) `(progn ,@(vv-rec body)))
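Wrapping the body in progn does not lose any of our value packs, because progn returns all the values of its last form, not just the first one. A minimal check in plain CL (the function name last-pack is made up for this sketch):

```lisp
;; progn evaluates every form and returns all values of the last one.
(defun last-pack ()
  (progn (values 1 2)
         (values 3 4)))

(multiple-value-list (last-pack)) ; (3 4)
```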
So let's test it with the example syntax we started with:
(vv (2!@+ (2!@* 1.0 2.0 3.0 4.0) ;; (~ 3.7142859 8.75)
(2!@/ 5.0 6.0 7.0 8.0)))
Which is the result we wanted. The expanded code looks like this:
(macroexpand-1 '(vv (2!@+ (2!@* 1.0 2.0 3.0 4.0) (2!@/ 5.0 6.0 7.0 8.0))))
;; (PROGN
;;  (MULTIPLE-VALUE-BIND (#:VAR+93 #:VAR+94 #:VAR+95 #:VAR+96)
;;      (~ (MULTIPLE-VALUE-BIND (#:VAR*97 #:VAR*98 #:VAR*99 #:VAR*100)
;;             (~ 1.0 2.0 3.0 4.0)
;;           (VALUES (* #:VAR*97 #:VAR*99)
;;                   (* #:VAR*98 #:VAR*100)))
;;         (MULTIPLE-VALUE-BIND (#:VAR/101 #:VAR/102 #:VAR/103 #:VAR/104)
;;             (~ 5.0 6.0 7.0 8.0)
;;           (VALUES (/ #:VAR/101 #:VAR/103)
;;                   (/ #:VAR/102 #:VAR/104))))
;;    (VALUES (+ #:VAR+93 #:VAR+95)
;;            (+ #:VAR+94 #:VAR+96))))
We see that the expansion is a great deal more complicated than the expansions
for select. This is basically why we want a DSL like this in the first place;
nesting expressions with value packs quickly gets verbose, but the DSL hides
this away quite efficiently.
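The same pattern holds for the 3!@- example from earlier. Written out by hand, with readable variable names instead of gensyms and with ~ already expanded, the generated code should look roughly like this (sub3 is just a wrapper name for this sketch):

```lisp
;; Hand-written equivalent of (vv (3!@- 1 2 3 4 5 6)):
;; bind the first three values to a0..a2 and the last three to b0..b2,
;; then apply - pairwise.
(defun sub3 ()
  (multiple-value-bind (a0 a1 a2 b0 b1 b2)
      (multiple-value-call #'values 1 2 3 4 5 6)
    (values (- a0 b0) (- a1 b1) (- a2 b2))))

(multiple-value-list (sub3)) ; (-3 -3 -3)
```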
If you want to extend this implementation of vv, you can pretty easily insert
checks for more triggers into the cond in vv-rec.
This isn't necessarily the most efficient approach, but that frequently does not matter much for macros, as they are expanded at compile time (once), rather than at runtime. If you do run into issues with slow compilation you can always go back and optimize macros later.
Conclusion
I think it took me about two months to implement the full version of vv. It
has several more modes/triggers and also deals with arrays of point vectors.
You can read about how it works in the documentation for cl-veq.
Part of the reason for the development time is that this tends to be how I work on larger macros. I have an idea of what I want, but I also need to use the DSL in practice over time to figure out whether my initial ideas were reasonable or not, both in terms of syntax and readability as well as functionality.
Your experience may vary, but this is the approach I would recommend for developing these kinds of tools. It is also where CL really shines, as it allows you to mold the language you are using while you are using it, giving you an additional dimension to approach your problem from.
- We have not said anything about the #' ("sharp quote") notation. It's a bit confusing, but there are actually two namespaces in CL: one for variables and one for functions, roughly speaking. Sometimes you need to refer specifically to a symbol in the function namespace (e.g. list), in which case the sharp quote reader macro comes in handy. This is also why you will sometimes see #' in front of lambda. Opinions vary on what is the "best" convention. I tend to only use it when I have to. And I make macros to avoid it even then. In fact, the full implementation of vv has m@ and f@ for this specific issue. They translate to multiple-value-call with and without #' respectively.
- The condition handling in CL is actually pretty sophisticated. And I don't know it very well, so I won't be commenting on it further here.
- format is an interesting DSL, you can read more about it here.
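To make the note about the two namespaces a bit more concrete, here is a small sketch in plain CL. The function name both-namespaces is made up for this example:

```lisp
;; The symbol LIST names a local variable and a function at the same time.
;; A bare LIST in argument position refers to the variable,
;; while #'list refers to the function.
(defun both-namespaces ()
  (let ((list '(1 2 3)))
    (funcall #'list list 4)))

(both-namespaces) ; ((1 2 3) 4)
```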