Macros, the Lisp advantage

Learning about macros in Lisps was one of my biggest whoa-moments in my programming career and since then I’ve given presentations about them to audiences ranging from 1 to 100 people. I have a little script that I follow in which I implement a custom form of the if-conditional. Unfortunately, I don’t think I’ve managed to generate many whoa-moments. I’m probably not doing macros justice. It’s not an easy task as they can be complex beasts to play with.

As we are experimenting with Clojure, I eventually needed a tool that I knew was going to be a macro, and I built a simple version of it. The tool is assert_difference. I don’t know where it first appeared, but Rails ships with one and, not satisfied with it, I built my own a few years ago. In the simplest case it allows you to do this:

assert_difference("User.count()", 1) do

assert_difference("User.count()", 0) do

assert_difference("User.count()", -1) do

The problem

Do you see what’s wrong there? Well, wrong is a strong word. What’s not as good as it could be? It’s the fact that User.count is expressed as a string and not code, when it is code. The reason for doing that is that we don’t want that code to run, we want to have it in a way that we can run it, run the body of the function (adding users, removing users, etc) and run it again comparing it with the output of the previous one.

There’s no (nice) way in Ruby or many languages to express code and not run it. Rephrasing that, there’s no (nice) way to have code as data in Ruby, as in most languages. I’m not picking on Ruby, I love that language; I’m just using it because it’s the language I’m most familiar with. What I’m saying is probably true of almost any programming language you choose.

One of the things you’ll hear repeated over and over in the land of the Lisp, be it Common Lisp, Scheme or Clojure is that code is data. And that’s what we want here, we want to have a piece of code as data. We want this:

assert_difference(User.count(), 1) do

assert_difference(User.count(), 0) do

assert_difference(User.count(), -1) do

which in Lisp syntax would look like this:

(assert-difference (user-count) 1

(assert-difference (user-count) 0

(assert-difference (user-count) -1

and this, ladies and gentlemen, is not only easy, it’s good practice.

Enter the macro

I achieved it with a simple 4-line macro and here it is:

(defmacro assert-difference [form delta & body]
  `(let [count# ~form]
     ~@body
     (assert-equal (+ count# ~delta) ~form)))

If you are not familiar with Clojure that will be very hard to read, so, let me help you:

The first line defines a macro named assert-difference that takes two required parameters, form and delta, plus any further arguments collected, as a list, in body. So, in this example:

(assert-difference (user-count) 1
  (add-user-to-database))

we end up with:

  • form => (user-count)
  • delta => 1
  • body => [(add-user-to-database)]

Note that the macro’s form parameter didn’t get the value of calling (user-count); it got the code itself, unexecuted, represented as data that we can inspect and play with, not as an unparsed string.
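To see those unevaluated arguments for yourself, here is a minimal sketch. The macro name inspect-args is made up for this illustration, and neither user-count nor add-user-to-database needs to exist, because nothing here ever runs them:

```clojure
;; Hypothetical helper: expands into a quoted map of whatever the macro received.
(defmacro inspect-args [form delta & body]
  (list 'quote {:form form :delta delta :body body}))

(inspect-args (user-count) 1
  (add-user-to-database))
;=> {:form (user-count), :delta 1, :body ((add-user-to-database))}
```

The values in that map are plain lists and symbols, which is exactly what the macro has to work with.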

The body of the macro is a bit cryptic because it’s a template. The backtick at the beginning just identifies it as a template and ~  means “replace this variable with the parameter”. ~@  is a special version of ~  that we have to use because body contains a list of statements instead of a single one. That means that:

`(let [count# ~form]
    ~@body
    (assert-equal (+ count# ~delta) ~form))

turns into:

(let [count# (user-count)]
   (add-user-to-database)
   (assert-equal (+ count# 1) (user-count)))

Is it starting to make sense? count# is a variable that is set to (user-count), then we execute the body, that is (add-user-to-database), and then we execute (user-count) again and compare it with count# plus delta. This is the code that’s emitted by the macro; this is the code that actually gets compiled and executed.
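If you want to run the whole thing, here is a self-contained sketch. The atom-backed "database" and this particular assert-equal are assumptions made up for the example (assert-equal is not a real library function), and the macro splices the body in with ~@body as described above:

```clojure
;; Hypothetical stand-ins for a real database, only for illustration.
(def users (atom []))
(defn user-count [] (count @users))
(defn add-user-to-database [] (swap! users conj {:name "someone"}))

;; Made-up assertion: returns true when the values match, throws otherwise.
(defn assert-equal [expected actual]
  (when-not (= expected actual)
    (throw (ex-info "assert-equal failed" {:expected expected :actual actual})))
  true)

(defmacro assert-difference [form delta & body]
  `(let [count# ~form]
     ~@body
     (assert-equal (+ count# ~delta) ~form)))

(assert-difference (user-count) 1
  (add-user-to-database))
;=> true
```

Asking for a delta that doesn’t match what the body does makes this assert-equal throw instead of returning true.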

If you are wondering about why the variable name has a hash at the end, imagine that variable was just named count  instead and the macro was used like this:

(let [count 10]
  (assert-difference (user-count) 1
    (add-users-to-database (generate-users count))))

That snippet defines count, but then the macro defines count again, so by the time you reach (generate-users count) the outer count has been shadowed by the macro-generated one. That kind of bug can be very hard to track down. The hash at the end turns it into a uniquely named variable, something like count__28766__auto__, that is consistent with the other mentions of count# within that macro.
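The difference is easy to reproduce in isolation. In this sketch both macro names are made up, and the first one is deliberately broken: ~'count forces the literal name count into the generated code, which is what would happen without the hash:

```clojure
;; Deliberately unhygienic: ~'count injects the plain symbol count.
(defmacro capturing [form & body]
  `(let [~'count ~form]
     ~@body))

;; Hygienic: count# expands to a fresh, uniquely named symbol.
(defmacro non-capturing [form & body]
  `(let [count# ~form]
     ~@body))

(let [count 10]
  (capturing 99 count))
;=> 99

(let [count 10]
  (non-capturing 99 count))
;=> 10
```

In the first call the body silently sees the macro’s count instead of its own; in the second, the body’s count is untouched.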

Isn’t that beautiful?

The real solution

The actual macro that I’m using for now is:

(defmacro is-different [[form delta] & body]
  `(let [count# ~form]
     ~@body
     (is (= (+ count# ~delta) ~form))))

which I’m not going to package and release yet like I did with assert_difference because it’s nowhere near finished and I’m not going to keep on improving it until I see the actual patterns that I like for my tests.
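For the curious, this is roughly how such a macro could be used inside a clojure.test test. It is only a sketch: the items atom and the test name are invented for the example, and the call shape (form and delta wrapped in a vector) is my reading of the parameter list above:

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Hypothetical stand-in for real state, just for this sketch.
(def items (atom []))

(defmacro is-different [[form delta] & body]
  `(let [count# ~form]
     ~@body
     (is (= (+ count# ~delta) ~form))))

(deftest adding-two-items
  (is-different [(count @items) 2]
    (swap! items conj :a)
    (swap! items conj :b)))

;; Runs every test in the current namespace and prints a summary.
(run-tests)
```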

You might notice that it doesn’t use assert-equal. That’s a function I made up because I believed it would be more familiar to non-Clojurians reading this post. When using clojure.test, you actually do this:

(is (= a b))

There’s one and only one thing to remember: is. Think of is as a generic assert and that’s actually all that you need. No long list of asserts like MiniTest has: assert, assert_block, assert_empty, assert_equal, assert_in_delta, assert_in_epsilon, assert_includes, assert_instance_of, assert_kind_of, assert_match, assert_nil, assert_operator, assert_output, assert_predicate, assert_raises, assert_respond_to, assert_same, assert_send, assert_silent and assert_throws.

In a programming language like Ruby we need all those assertions because we want to be able to say things such as “1 was expected to be equal to 2, but it isn’t” which can only be done if you do:

assert_equal(1, 2)

but not if you do

assert(1 == 2)

because in the second case, assert doesn’t have any visibility into the fact that it was an equality comparison, it would just say something like “true was expected, but got false” which is not very useful.

Do you see where this is going? is  is a macro, so it has visibility into the code, it can both run the code and use the code as data and thus generate errors such as:

FAIL in (test-users) (user.clj:12)
expected: 1
  actual: 2
    diff: - 1
          + 2

which if you ask me, is a lot of very beautiful detail to come out of just

(is (= 1 2))
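To get a feel for how a macro can do that, here is a toy version. It is not how clojure.test implements is; the name check and the shape of the result are assumptions for this sketch, and it only handles forms whose operator is a function (not a macro or special form):

```clojure
;; Toy assertion: keeps the original form as data and also evaluates its arguments.
(defmacro check [[op & args :as form]]
  `(let [values# (list ~@args)]
     (if (apply ~op values#)
       :pass
       (array-map :failed '~form
                  :evaluated (cons '~op values#)))))

(check (= (+ 1 1) 2))
;=> :pass

(check (= (+ 1 1) 3))
;=> {:failed (= (+ 1 1) 3), :evaluated (= 2 3)}
```

A plain function could never produce the :failed part, because by the time it runs, the original expression is gone and only its value is left.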

But language X!

When talking to people about this, I often get rebuttals that this or that language can do it too and yes, other languages can do some things like this.

For example, we could argue that this is possible in Ruby if you encase the code in an anonymous function when passing it around, such as:

assert_difference(->{User.count}, 1) do

but it’s not as nice and we don’t see it very often. What we see is 20 assert methods like in MiniTest. To have an impact in the language, these techniques need to be easy, nice, first class citizens, the accepted style. Otherwise they might as well not exist.
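The same closure-based approach can be written in Clojure without any macro, which makes the trade-off visible. This is only a sketch; difference-ok? and the items atom are names invented for the example:

```clojure
;; Function-only alternative: the caller wraps the code in anonymous functions.
(defn difference-ok? [count-fn delta body-fn]
  (let [before (count-fn)]
    (body-fn)
    (= (+ before delta) (count-fn))))

;; Hypothetical state to count.
(def items (atom []))

(difference-ok? #(count @items) 1 #(swap! items conj :x))
;=> true
```

It works, but every call site has to remember to wrap its code in functions, and the function can run that code yet never look at it.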

An even better syntax for blocks in Ruby might help with the issue and indeed what you are left with is a different approach to incredible flexibility and it already exists. It’s called Smalltalk and there’s some discussion about closures, what you have in Lisp,  and objects, what you have in Smalltalk, being equivalent.

I’m also aware of a few languages having templating systems to achieve things such as this, like Template Haskell, but they are always quite hard to use and left for the true experts. You rarely see them covered in a book for beginners of the language, like macros tend to be covered for Lisp.

There are also languages that have a string-based macro system, like C, and I’ve been told that Tcl does as well. The problem with this is that building anything of medium complexity ranges from hard to impossible, to the point that you are advised to stay away from it.

All of the alternative solutions I’ve mentioned so far have a problem: code is not (nice) data. When a macro in Lisp gets a piece of code such as:

(+ 1 2)

that code is received as a list of three elements, containing + , 1  and 2 . If the macro instead received:

(1 2 +)

the code would be a list containing 1, 2 and +. Note that it’s not valid Lisp code; it doesn’t have to be, because it’s not being compiled and executed. The output of a macro has to be valid Lisp code, but the input can be whatever you like. That’s why writing a macro that changes the language from prefix notation to postfix notation, like in the last snippet of code, is one of the first exercises you do when learning to make macros.
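A tiny sketch makes the point that the input is just a list. The macro name form-info is hypothetical; it looks at its argument without ever trying to run it:

```clojure
;; Hypothetical macro: describes the form it received instead of evaluating it.
(defmacro form-info [form]
  {:size (count form)
   :elements (mapv str form)})

(form-info (1 2 +))
;=> {:size 3, :elements ["1" "2" "+"]}
```

No error is raised even though (1 2 +) could never be evaluated on its own.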

What makes it so easy to get code as data and work with it and then execute the data as code is the fact that Lisp’s syntax is very close to the abstract syntax tree of the language. The abstract syntax tree of Lisp programs is something Lisp programmers are familiar with intuitively while most programmers of other languages have no idea what the AST looks like for their programs. Indeed, I don’t know what it looks like for any of the Ruby code I wrote.
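One way to convince yourself, sketched here with Clojure’s standard read-string: reading a piece of source text gives you back the nested list directly, with no separate parse tree to learn.

```clojure
;; Read source text into data without evaluating it.
(def tree (read-string "(if (> n 50) 100 (* n 2))"))

(list? tree)
;=> true

(count tree)
;=> 4

(first tree)
;=> if

(second tree)
;=> (> n 50)
```

The structure you see in the source and the structure the program is handed are the same thing.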

Most programmers don’t know what an AST actually is, but even the Lisp programmers that don’t know what an AST is have an intuition for what the ASTs of their programs are.

This is why many claim Lisp to be the most powerful programming language out there. You could try to think of another programming language that has macros that receive code as data and whose syntax is close to the AST. If you find one, congratulations: you found a member of the Lisp family, because those properties are pretty much what makes a language a Lisp.

Why I love Lisp

This post was extracted from a small talk I gave at Simplificator, where I work, titled “Why I love Smalltalk and Lisp”. There’s another post titled “Why I love Smalltalk” published before this one.

Desert by Guilherme Jófili

Lisp is an old language. Very old. Today there are many Lisps and no single language is called Lisp. Actually, there are as many Lisps as Lisp programmers. That’s because you become a Lisp programmer when you go alone into the desert and write an interpreter for your flavor of Lisp with a stick in the sand.

There are two main Lisps these days: Common Lisp and Scheme, both standards with many implementations. The various Common Lisps are more or less the same; the various Schemes are the same at the basic level but then they differ, sometimes quite significantly. They are both interesting, but I personally failed to make practical use of either of them. Both bother me in different ways, and of all the other Lisps, my favorite is Clojure. I’m not going to dig into that; it’s a personal matter and it’d take me a long time.

Clojure, like any other Lisp, has a REPL (Read Eval Print Loop) where we can write code and get it to run immediately. For example:

5
;=> 5

"Hello world"
;=> "Hello world"

Normally you get a prompt, like user>, but here I’m using the joyful Clojure example code convention. You can give this REPL thing a try and run any code from this post in Try Clojure.

We can call a function like this:

(println "Hello World")
; Hello World
;=> nil

It printed “Hello World” and returned nil. I know the parentheses look misplaced, but there’s a reason for that, and you’ll notice it’s not that different from this Java-ish snippet:

println("Hello World")

except that Clojure uses the parentheses in that way for all operations:

(+ 1 2)
;=> 3

In Clojure we also have vectors:

[1 2 3 4]
;=> [1 2 3 4]

and symbols:

'symbol
;=> symbol

The reason for the quote is that symbols are treated as variables. Without the quote, Clojure would try to find its value. Same for lists:

'(li st)
;=> (li st)

and nested lists

'(l (i s) t)
;=> (l (i s) t)

Here’s what defining a variable and using it looks like:

(def hello-world "Hello world")
;=> #'user/hello-world

hello-world
;=> "Hello world"

I’m going very fast, skipping lots of details and maybe some things are not totally correct. Bear with me, I want to get to the good stuff.

In Clojure you create functions like this:

(fn [n] (* n 2))
;=> #<user$eval1$fn__2 user$eval1$fn__2@175bc6c8>

That ugly long thing is how a compiled function is printed out. Don’t worry, it’s not something you see often. That’s a function, as created by the operator fn, of one argument, called n, that multiplies the argument by two and returns the result. In Clojure as in all Lisps, the value of the last expression of a function is returned.

If you look at how a function is called:

(println "Hello World")

you’ll notice the pattern is: open parens, function, arguments, close parens. Or, put another way, a list where the first item is the operator and the rest are the arguments.

Let’s call that function:

((fn [n] (* n 2)) 10)
;=> 20

What I’m doing there is defining an anonymous function and applying it immediately. Let’s give that function a name:

(def twice (fn [n] (* n 2)))
;=> #'user/twice

and then we can apply it by name:

(twice 32)
;=> 64

As you can see, functions are stored in variables like any other piece of data. Since that’s something that’s done very often, there’s a shortcut:

(defn twice [n] (* 2 n))
;=> #'user/twice

(twice 32)
;=> 64

Let’s make the function have a maximum of 100 by using an if:

(defn twice [n] (if (> n 50) 100 (* n 2)))

The if operator takes three arguments: the predicate, the expression to evaluate when the predicate is true, and the one to evaluate when it’s false. Maybe it’s easier to read like this:

(defn twice [n]
  (if (> n 50)
      100
      (* n 2)))
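A quick check at the REPL exercises both branches. The definition is repeated so the snippet stands on its own:

```clojure
(defn twice [n]
  (if (> n 50)
    100
    (* n 2)))

(twice 10)
;=> 20

(twice 50)
;=> 100

(twice 51)
;=> 100
```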

Enough basic stuff, let’s move to the fun stuff.

Let’s say you want to write Lisp backwards. The operator at the last position, like this:

(4 5 +)

Let’s call this language Psil (that’s Lisp backwards… I’m so smart). Obviously if you just try to run that it won’t work:

(4 5 +)
;=> java.lang.ClassCastException: java.lang.Integer cannot be cast to clojure.lang.IFn (NO_SOURCE_FILE:0)

That’s Clojure telling you that 4 is not a function (an object implementing the interface clojure.lang.IFn).

It’s easy enough to write a function that converts from Psil to Lisp:

(defn psil [exp]
  (reverse exp))

The problem is that when I try to use it, like this:

(psil (4 5 +))
;=> java.lang.ClassCastException: java.lang.Integer cannot be cast to clojure.lang.IFn (NO_SOURCE_FILE:0)

I obviously get an error because, before psil is called, Clojure tries to evaluate the argument, that is, (4 5 +), and that fails. We can call it by explicitly quoting the argument so it’s passed as a list, like this:

(psil '(4 5 +))
;=> (+ 5 4)

but that didn’t evaluate it, it just reversed it. Evaluating it is not that hard though:

(eval (psil '(4 5 +)))
;=> 9

You can start to see the power of Lisp. The fact that the code is just a bunch of nested lists allows you to easily generate running programs out of pieces of data.

If you don’t see it, just try doing it in your favorite language. Start with an array containing two numbers and a plus and end up with the result of adding them. You probably end up concatenating strings or doing other nasty stuff.
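For comparison, this is what that exercise looks like in Clojure: start with a vector of numbers, put a plus in front, and evaluate. The function name sum-form is made up for this sketch:

```clojure
;; Build code out of plain data: a + symbol followed by the numbers.
(defn sum-form [numbers]
  (cons '+ numbers))

(sum-form [4 5 6])
;=> (+ 4 5 6)

(eval (sum-form [4 5 6]))
;=> 15
```

No strings were concatenated; the program was assembled with the same functions you would use on any other list.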

This way of programming is so common in Lisp that it was abstracted away into a reusable thing called macros. Macros are functions that receive their arguments unevaluated and whose result is then evaluated as Lisp.

Let’s turn psil into a macro:

(defmacro psil [exp]
  (reverse exp))

The only difference is that I’m now calling defmacro instead of defn. This is quite remarkable:

(psil (4 5 +))
;=> 9

Note how the argument is not valid Clojure yet I didn’t get any error. That’s because it’s not evaluated until psil processes it. The psil macro is getting the argument as data. When you hear people say that in Lisp code is data, this is what they are talking about. It’s data you can manipulate to generate other programs. This is what allows you to invent your own programming language on top of Lisp and have any semantics you need.
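The psil macro above only reverses the outermost list. As a sketch of how far the same idea stretches, here is a variation (the name psil-deep is invented, and it leans on postwalk from the standard clojure.walk namespace) that also reverses nested expressions:

```clojure
(require '[clojure.walk :as walk])

;; Reverse every list in the expression, innermost first.
(defmacro psil-deep [exp]
  (walk/postwalk #(if (seq? %) (reverse %) %) exp))

(psil-deep ((2 3 *) 5 +))
;=> 11
```

The macro turned ((2 3 *) 5 +) into (+ 5 (* 3 2)) before Clojure ever tried to evaluate it.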

There’s a function in Clojure called macroexpand which makes a macro skip the evaluation part so you can see the code that’s going to be evaluated:

(macroexpand '(psil (4 5 +)))
;=> (+ 5 4)

You can think of a macro as a function that runs at compile time. The truth is, in Lisp, compile time and run time are all mixed and you are constantly switching between the two. We can make our psil macro very verbose to see what’s going on, but first I have to show you do.

do is a very simple operator, it takes a list of expressions and runs them one after the other but they are all grouped into one single expression that you can pass around, for example:

(do (println "Hello") (println "world"))
; Hello
; world
;=> nil

With do, we can make the macro return more than one expression and make it verbose:

(defmacro psil [exp]
  (println "compile time")
  `(do (println "run time")
       ~(reverse exp)))

That new macro prints “compile time” and returns a do that prints “run time” and runs exp backwards. The back-tick, ` is like the quote ' except that it allows you to unquote inside it by using the tilde, ~. Don’t worry if you don’t understand that yet, let’s just run it:

(psil (4 5 +))
; compile time
; run time
;=> 9

As expected, compile time happens before run time. If we use macroexpand things will get clearer:

(macroexpand '(psil (4 5 +)))
; compile time
;=> (do (clojure.core/println "run time") (+ 5 4))

You can see that the compile phase already happened and we got an expression that will print “run time” and then evaluate (+ 5 4). It also expanded println into its full form, clojure.core/println, but you can ignore that. That code is then evaluated at run time.

The result of the macro is essentially:

(do (println "run time")
    (+ 5 4))

and in the macro it was written like this:

`(do (println "run time")
     ~(reverse exp))

The back-tick essentially created a kind of template where the tilde marked the parts to evaluate, here (reverse exp), while the rest was left as is.
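The back-tick is not tied to macros, so you can play with templates directly at the REPL. In this sketch the names exp, template and spliced are just for illustration; the last one also shows ~@, which splices the elements of a list in place instead of inserting the list itself:

```clojure
(def exp '(4 5 +))

;; ~ inserts the result of evaluating (reverse exp) into the template.
(def template `(do (println "run time") ~(reverse exp)))
template
;=> (do (clojure.core/println "run time") (+ 5 4))

;; ~@ splices the elements of exp into the surrounding list.
(def spliced `(list ~@exp))
spliced
;=> (clojure.core/list 4 5 +)
```

Nothing was executed here; both results are just lists that happen to look like code.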

There are even more surprises behind macros, but for now, it’s enough hocus pocus.

The power of this technique may not be totally apparent yet. Following my Why I love Smalltalk post, let’s imagine that Clojure didn’t come with an if, only cond. It’s not the best example in this case, but it’s simple enough.

cond is like a switch or case in other languages:

(cond (= x 0) "It's zero"
      (= x 1) "It's one"
      :else "It's something else")

Building a my-if function around cond is straightforward enough:

(defn my-if [predicate if-true if-false]
  (cond predicate if-true
        :else if-false))

and at first it seems to work:

(my-if (= 0 0) "equals" "not-equals")
;=> "equals"
(my-if (= 0 1) "equals" "not-equals")
;=> "not-equals"

but there’s a problem. Can you spot it? my-if is evaluating all its arguments, so if we do something like this, the result is not as expected:

(my-if (= 0 0) (println "equals") (println "not-equals"))
; equals
; not-equals
;=> nil

Converting my-if into a macro:

(defmacro my-if [predicate if-true if-false]
  `(cond ~predicate ~if-true
         :else ~if-false))

solves the problem:

(my-if (= 0 0) (println "equals") (println "not-equals"))
; equals
;=> nil
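Expanding the macro by hand shows why only one branch runs. This sketch repeats the definition so it can be pasted on its own, and uses macroexpand-1 to expand just one level:

```clojure
(defmacro my-if [predicate if-true if-false]
  `(cond ~predicate ~if-true
         :else ~if-false))

(macroexpand-1 '(my-if (= 0 0) (println "equals") (println "not-equals")))
;=> (clojure.core/cond (= 0 0) (println "equals") :else (println "not-equals"))
```

Both println calls are handed to cond as unevaluated code, and cond only evaluates the branch it picks.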

This is just a glimpse into the power of macros. One very interesting case was when object oriented programming was invented (Lisp is older than that) and Lisp programmers wanted to use it.

C programmers had to invent new languages, C++ and Objective-C, with their compilers. Lisp programmers created a bunch of macros, like defclass, defmethod, etc. Revolutions, in Lisp, tend to just be evolutions.

Thanks to Gonzalo Fernández, Alessandro Di Maria, Vladimir Filipović for reading drafts of this.


Lisp macros feel like cheating

Common Lisp macros feel like cheating. I’ve reached chapter 9 of Practical Common Lisp, where the goal is to build a unit test framework, and you can see right away how the patterns are easily abstracted out with macros. It’s so easy it feels like cheating.

Getting a text representation of the test code, to be able to point out what went wrong and show the piece of code that failed, is supposed to be a hard task. Well, it is a hard task, sometimes impossible, in most programming languages. In Common Lisp it’s so trivial, it feels like cheating.
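To give an idea of how little is needed, here is a sketch of the trick. It is written in Clojure rather than Common Lisp to match the other examples in this collection, and it is only loosely modelled on the kind of reporting macro that chapter builds; the name report-result and the output format are assumptions of this sketch:

```clojure
;; The text of the form is captured at expansion time; the form itself runs later.
(defmacro report-result [form]
  `(str (if ~form "pass" "FAIL") " ... " ~(pr-str form)))

(report-result (= (+ 1 2) 3))
;=> "pass ... (= (+ 1 2) 3)"

(report-result (= (+ 1 2) 4))
;=> "FAIL ... (= (+ 1 2) 4)"
```

The test code is written once and the macro gets both its result and its text for free.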

If you want to know what Lisp is about, read up to chapter 9 of PCL (Practical Common Lisp), at least. I used to tell people to at least read chapter 3, but it seems not to be enough, sadly, to impress the average programmer (either because they just don’t see it or chapter 3 is still too basic).

Comments at the original blog

Nubis Says:

Hey, good point. You know what also feels like cheating to me? The loop language.
I mean, from the 99 lisp problems:
Flatten a tree of lists:

(defun flatten (list)
  (loop for i in list if (listp i) append (flatten i) else collect i))

or, remove duplicated values from a list

(defun makeset (list)
  (loop for i in list unless (find i set) collect i into set finally (return set)))

At first it looks like cheating, but then i realized that the loop language is to lists what regular expressions are to strings, and it’s better than having a bunch of different people writing their loops and boilerplate in different ways.

best regards

September 3rd, 2007 at 4:30 e