A Vector DSL

In the previous post I introduced my graph data structure (cl-grph), and gave some examples of Datalog queries. Datalog is a Domain Specific Language (DSL) for queries on graphs.

Building a Datalog compiler as an example DSL in a blog post is a little ambitious. Instead we will look at two different approaches for making a DSL for vector mathematics.

It is necessary to explain some technical details specific to Common Lisp (CL) in order to introduce the motivation for making this DSL. I make an effort to give a cursory explanation of these details, so that anyone with some programming experience will be able to follow the core points.

To start us off I want to show you this quote:

[...] a good programmer does not just write programs. A good programmer builds a working vocabulary. A good programmer does language design, though not from scratch, but building on the frame of a base language.
—Guy Steele, Growing a Language, 1998

This is an excellent presentation. And while Steele is talking about language design in more general terms, I still hope you will agree that it is relevant to the topic at hand.

Generative geometry, rendered in Blender/Cycles

Values and Values

We will make frequent use of the CL function values. values is a way to group several values together. The most relevant use in this context is that functions can return a block of (values ...):

; define function named fx
(defun fx (a) (values 1 2 3 a))

(fx 7) ;; returns: (values 1 2 3 7)

This makes it convenient to use values to represent point vectors. Which is part of the motivation behind what follows.

However, a function will only receive one value from each block of values. That is, if we do this:

; define an nary function that returns
; any number of arguments as a list:
(defun my-function (&rest rest) rest)

(my-function (values 1 2) (values 3 4)) ;; '(1 3)

my-function only sees the first value from each value block. To call my-function on all values we have to use multiple-value-call:

(multiple-value-call
  #'my-function (values 1 2)
                (values 3 4)) ;; '(1 2 3 4)

This is awkward. And brings us to the next part of the motivation. Surely we can avoid this.

Experimental tiling

Macros

The term macro in programming can mean a few different things. Here we are talking about CL macros. A macro in CL looks a lot like defining a function:

(defmacro awesome-defun (name args &body body)
  ; recognize this bit just below ⁂ :
  `(defun ,(symb "awesome-" name) (,@args)
     (progn ,@body)))

; if we call the macro:
(awesome-defun print (p) (print p))
; ⁂  it produces this code:
 (DEFUN AWESOME-PRINT (P)
  (PROGN (PRINT P)))
; by convention, generated code is in UPPERCASE

We are going to ignore the details of the macro syntax. But notice that the final function definition (⁂ ) is a perfectly valid function definition. That has just been evaluated. In other words we can now call the function:

(awesome-print :hello-world)
; prints: :hello-world

You undoubtedly would like all your code to be full of awesome functions. Yet it may not be obvious why the ability to extend the language is so meaningful. But if you can write code that can define functions, then you can do just about anything.

The next sections introduce two examples.

Experimental tiling

Using macros

We start by creating addition (2+) and multiplication (2*) for 2d point vectors. One possible approach is something like this:

; 2d addition +
(defun internal-2+ (ax ay bx by)
  (values (+ ax bx) (+ ay by)))
; a macro that uses multiple-value-call as described above
(defmacro 2+ (&rest rest)
  `(multiple-value-call #'internal-2+ ,@rest))

; 2d multiplication *
(defun internal-2* (ax ay bx by)
  (values (* ax bx) (* ay by)))
; multiple-value-call again
(defmacro 2* (&rest rest)
  `(multiple-value-call #'internal-2* ,@rest))

With these definitions we can do vector mathematics:

; simple
(2+ 1 2 3 4)         ;; (values (+ 1 3) (+ 2 4))
; nested
(2+ (2* 10 20 30 40) ;; (values (+ (* 10 30) 3)
     3 4)            ;;         (+ (* 20 40) 4))

Considering how repetitive the above function and macro definitions are, you can probably see what is coming next. Now we create a macro that will define vector operations for a given dimension and function expression.

Here is a naive version of that macro:

(defmacro define-dim-op ((dim name) &body body)
  (let ((mname (symb dim name))             ;; ex: '2+
        (fname (symb :internal- dim name))  ;; ex: 'internal-2+
        (args (loop for i from 1 repeat dim ;; ex: '(g11 g12 g21 g22):
                    append (loop for j from 1 repeat dim
                                  collect (symb :g i j)))))
    `(progn (defun ,fname (,@args) ,@body)  ; define function
            (defmacro ,mname (&rest rest)   ; define wrapper
              `(multiple-value-call ,#',fname ,@rest)))))

It can be used like this:

(define-dim-op (2 +)
  (values (+ g11 g21) (+ g12 g22)))

This creates the same 2+ operation as before:

(2+ 1 2 3 4) ;; (values (+ 1 3) (+ 2 4))

With some more work this approach can be extended to other types of operations. For example, if you want to sum two arrays of point vectors, or if you want to reduce the dimension of point vectors in an array and so on.

But what if there is a more general approach?

Distorted 3d scan, rendered in Blender/Cycles

An Alternative Approach

So far we have created code that builds a library of all the relevant functions (and corresponding wrapper macros) that we need. What if we instead identify the pattern of the operations we want. Then design the code that matches that pattern.

With 2+ in mind. We have two vectors of dimension two. We want to call + on consecutive pairs of elements in those vectors. Furthermore we want to express it as an S-expression, where the first two values belong to the first point vector, and the next two values belong to the second.

Using 2+ looks like this: (2+ 1 2 3 4). If we add a unique identifier to the function name, it will be possible to recognize it in code. We will use !@:

(2!@+ 1 2 3 4)
; which gives us the pattern:
; ([dim]!@[fx] (values ...) (values ...))

Describing how to do this substitution is outside the scope of this post , but it is not too difficult to make a macro that traverses arbitrary code, finds function calls that match [dim]!@[fx] and generates something like the following code:

; binds arg/a = 1, arg/b = 2, ...
(MULTIPLE-VALUE-BIND (ARG/A ARG/B ARG/C ARG/D)
                     (MULTIPLE-VALUE-CALL
                       #'VALUES 1 2 3 4)
  (VALUES (+ ARG/A ARG/C) (+ ARG/B ARG/D))))

Which should look quite familiar by now, even if it is a little more involved than before.

Finally, here is an illustration of how you can extend this directly by using your own function:

; labels defines a local function, inverse-sub
(labels ((inverse-sub (a b) (- b a)))
  (2!@inverse-sub 1 2 3 4)) ;; (values (- 3 1)
                            ;;         (- 4 2))

Experiments with symmetry

Macros as Compilers

The trigger symbol approach outlines what is really a kind of compiler for a small DSL. With this framework in place the next step is to extend the DSL (and thus CL itself) with additional syntax and/or new triggers. That way it can handle more types of vector operations.

Hopefully this has brought you a little closer to appreciating that macros are a powerful tool. And that They enable you to alter the language you are using to solve your problem. Not just the vocabulary.

If you want to read more about this in particular you can look at the vv macro in the cl-veq documentation.

EDIT: In this post I describe in detail how to write a small Common Lisp DSL. And here is an example implementation of the vector DSL.

If you want to learn Common Lisp I suggest starting with Peter Seibel's excellent book Practical Common Lisp.
Efficiency is another important consideration. Passing values between functions can be more efficient than using heavier data structures. This is the case in SBCL where this approach is significantly faster than all other alternatives I have tried. Hit me up if you have other suggestions.
There are a number of technical improvements that strictly speaking should be made to this macro, but I have omitted then in the interest of simplicity. If you want an exhaustive discussion of CL macros I suggest On Lisp by Paul Graham and Let over Lambda by Dough Hoyte.
Here is a working example implementation.
(multiple-value-call #'fx (values a b)) is roughly equivalent to (multiple-value-bind (a* b*) (values a b) (fx a* b*))