Let’s Write a Macro

NB: This isn’t really a tutorial on writing Clojure macros; it’s a description of a macro I wrote and how I went about it. If you’re looking for an introduction to writing Clojure macros, there’s an excellent one at Clojure for the Brave and True.

I’ve been working on a library for managing users and I’ve found I’ve been writing a lot of code to validate parameters. It looks like this:

(if (and (string? username) (not (str/blank? username)))
  (if (and (string? password) (not (str/blank? password)))
    ;; Do Stuff
    (throw (Exception. "Invalid password")))
  (throw (Exception. "Invalid username")))

This is ugly and repetitive, surely we can do better.

My first attempt was to remove the duplication by extracting some of the duplicated code into a predicate:

(defn valid-str? [s]
  (and (string? s) (not (str/blank? s))))
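As a quick check of what the predicate accepts, here is a small REPL-style sketch. It assumes str is an alias for clojure.string, which the snippets above imply but never show.

```clojure
(require '[clojure.string :as str])

;; Same logic as the predicate above.
(defn valid-str? [s]
  (and (string? s) (not (str/blank? s))))

(valid-str? "alice") ; => true
(valid-str? "   ")   ; => false, blank strings are rejected
(valid-str? nil)     ; => false
(valid-str? 42)      ; => false, string? fails so str/blank? is never called
```

The string? check matters because str/blank? expects a CharSequence and would throw if it were handed a number.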

That’s a good start since it simplifies the code a bit, but you’re still writing code like:

(if (valid-str? username)
  (if (valid-str? password)
    ;;Do Stuff
    (throw (Exception.)))
  (throw (Exception.)))

What I really want is a way to wrap the code that depends on the username and password so that it only executes if the values are valid or else throws the relevant exception. This sounds like a job for a macro.

A First Stab

My first attempt was to write the following:

(defmacro validate
  "Checks if the parameter is valid and either executes body or throws an exception."
  ([s body]
    `(when (valid-str? ~s)
       ~body))
  ([s body ex]
    `(if (valid-str? ~s)
       ~body
       (throw ~ex))))

So now I could write code like:

(validate username
  (validate password
    ;; Do Stuff
    (Exception.))
  (Exception.))

Hmmm…not that much of an improvement and my macro is limited to only validating strings, surely I can do better than that. What I really want is something that I can use like the following:

(validate predicate username
  ;; Do Stuff
  (throw (Exception. "Invalid username")))

(validate predicate [username password]
  ;; Do Stuff
  [(throw (Exception. "Invalid username"))
   (throw (Exception. "Invalid password"))])

Now that’s a big improvement: there’s less code, it’s easier to read, the intent is clearer and we can use anything we want as a validation function. OK, so now I know what I want, how do I get it? Well, my first thought was that this is a bit like cond.

The cond Macro

To quote the cond docstring:

Takes a set of test/expr pairs. It evaluates each test one at a time. If a test returns logical true, cond evaluates and returns the value of the corresponding expr and doesn’t evaluate any of the other tests or exprs. (cond) returns nil.

Here’s an example from Clojuredocs:

(cond
  (< n 0) "negative"
  (> n 0) "positive"
  :else "zero")

So in my use case the code would look like:

(cond
  (not (valid-str? username)) (throw (Exception. "Invalid username"))
  (not (valid-str? password)) (throw (Exception. "Invalid password"))
  :else (do
          ;; Do Stuff
          ))

That’s pretty good, it’s certainly an improvement on what I’d been doing previously, but I’m not sure it’s so easy to understand the intent of the code. However given cond is a macro if we look at its source code it might give us an idea of where to start. The source of the cond macro looks like this:

(defmacro cond
  "Takes a set of test/expr pairs. It evaluates each test one at a time. If a test returns logical true, cond evaluates and returns the value of the corresponding expr and doesn't evaluate any of the other tests or exprs. (cond) returns nil."
  {:added "1.0"}
  [& clauses]
    (when clauses
      (list 'if (first clauses)
        (if (next clauses)
          (second clauses)
          (throw (IllegalArgumentException. "cond requires an even number of forms")))
        (cons 'clojure.core/cond (next (next clauses))))))

Here we can see that it first checks if it has any clauses before creating an if clause and then, if it has more clauses, recursively applies itself to the remaining clauses. The code it generates is not unlike what I’d been writing originally, so it definitely looks like the right approach. Whilst the cond code doesn’t give me a solution to how to write my macro, it does make me think maybe I should try rewriting mine using cond.
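The recursive structure is easy to see by expanding a cond form one level. This is just a REPL sketch; the name expansion and the symbol n are only there for illustration, and n is never evaluated because the form is quoted.

```clojure
;; Expand one level of cond to see the if form it generates.
(def expansion
  (macroexpand-1 '(cond
                    (neg? n) "negative"
                    (pos? n) "positive"
                    :else    "zero")))

expansion
;; => (if (neg? n) "negative" (clojure.core/cond (pos? n) "positive" :else "zero"))
```

The first test/expr pair becomes the if, and the remaining clauses are handed back to cond for the next round of expansion.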

A Second Stab

So taking that into account here’s my second attempt at the macro:

(defmacro validate
  [pred s body ex]
    (let [terms (interleave (map #(list not `(~pred ~%1)) s) ex)]
     `(cond
        ~@terms
        :else (do ~@body))))

There are a few things going on here.

First, (map #(list not `(~pred ~%1)) s) creates a sequence of cond terms by wrapping each string with the predicate term and negating the result. Then the sequence is interleaved with the sequence of handlers that need to be called if a particular string fails validation. Finally, it inserts the terms into a cond expression and adds the body of code to execute if all the strings are valid as the :else clause of the cond expression.
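To make that concrete, the same steps can be run as plain data manipulation outside the macro. This is a sketch: pred, s and ex below are quoted stand-ins for the macro arguments, and it uses the symbol 'not where the macro embeds the not function itself, so that the result prints readably.

```clojure
;; Stand-ins for the macro arguments, quoted so nothing is evaluated.
(def pred 'valid-str?)
(def s    '[username password])
(def ex   '[(throw (Exception. "Invalid username"))
            (throw (Exception. "Invalid password"))])

;; Wrap each value in (not (pred value)), then pair it with its handler.
(def terms
  (interleave (map (fn [v] (list 'not (list pred v))) s) ex))

terms
;; => ((not (valid-str? username)) (throw (Exception. "Invalid username"))
;;     (not (valid-str? password)) (throw (Exception. "Invalid password")))
```

Splicing that sequence into a cond form with ~@ gives the test/expr pairs cond expects.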

I can use the macro as follows:

(validate valid-str? ["abc" 123]
  ((println "These strings ")
   (println "are all good."))
  [(println "Not a valid string.")
   (println "Not a valid string either.")])

Which will print Not a valid string either. to the console.

Some issues

It’s looking pretty good, but there are some issues. Firstly, the macro assumes the strings and handlers will be sequences…pass it a single value and it blows up. The easiest way to fix that is to ensure that we’re always dealing with sequences, for example:

(defmacro validate
  [pred s body ex]
    (let [vals (if (sequential? s) s (vector s))
          handlers (if (sequential? ex) ex (vector ex))
          terms (interleave (map #(list not `(~pred ~%1)) vals) handlers)]
      `(cond
         ~@terms
         :else (do ~@body))))

Secondly, the macro assumes the sequence of strings will be the same length as the sequence of handlers. In most circumstances they will be, so this seems to be a reasonable assumption, but what about the case where you have 2 or more strings to validate but only want to provide an error handler for the first string? Because we use interleave to combine the strings with their handlers, the macro will only interleave to the length of the shortest sequence, and so in this case only the first string will be validated. One way to get around this is to manually ensure that the sequences are the same length by padding the handlers sequence with nil, but that’s messy and prone to errors. A better way is to automatically pad the sequences so they’re the same length. To do that we first need a padding function, such as:

(defn pad [vals len pad-fn]
  (let [s (if (sequential? vals) vals (vector vals))]
    (if (> len (count s))
      (concat s (repeatedly (- len (count s)) pad-fn))
      s)))

This function takes a sequence, the length we want to pad the sequence to and a function to generate the additional elements and returns a sequence containing the original values padded to the required length. So adding that to the macro we get:

(defmacro validate
  [pred s body ex]
    (let [vals (if (sequential? s) s (vector s))
          handlers (pad (if (sequential? ex) ex (vector ex)) (count vals) #(identity false))
          terms (interleave (map #(list not `(~pred ~%1)) vals) handlers)]
      `(cond
         ~@terms
         :else (do ~@body))))
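The truncation problem, and what pad does about it, can be seen at the REPL without involving the macro at all. In this sketch the check-N and handler-N symbols are made-up placeholders, and pad is repeated so the snippet runs on its own.

```clojure
;; Same behaviour as pad above, repeated so this snippet is self-contained.
(defn pad [vals len pad-fn]
  (let [s (if (sequential? vals) vals (vector vals))]
    (if (> len (count s))
      (concat s (take (- len (count s)) (repeatedly pad-fn)))
      s)))

;; interleave stops at the shorter sequence, so the second check is lost.
(interleave '[check-1 check-2] '[handler-1])
;; => (check-1 handler-1)

;; Padding the handlers with false keeps every check.
(interleave '[check-1 check-2] (pad '[handler-1] 2 (constantly false)))
;; => (check-1 handler-1 check-2 false)

;; A single non-sequential value is wrapped before padding.
(pad :only 3 (constantly nil))
;; => (:only nil nil)
```

Since the padded handler is false, a value that fails validation without a handler of its own makes the whole cond return false rather than running the body.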

The last issue with the macro is that in its current form it requires us to provide a vector of error handlers, even if the vector is empty. If we don’t provide the vector the macro blows up. So we need to be able to handle the case where no error handlers are provided.

The easiest way I can think to do that is to create a multi-arity macro so we can handle the situation where no error handlers are passed as a special case. It turns out that if we don’t have to worry about the error handlers the macro becomes much simpler, since all we need to do is ensure that the predicate is valid for all elements of the input sequence, and as it happens Clojure provides us a function to do just that: every?. So adding in our special handling we get:

(defmacro validate
  ([pred s body]
    `(let [vals# (if (sequential? ~s) ~s (vector ~s))]
       (when (every? ~pred vals#)
         (do ~@body))))
  ([pred s body ex]
    (let [vals (if (sequential? s) s (vector s))
          handlers (pad (if (sequential? ex) ex (vector ex)) (count vals) #(identity false))
          terms (interleave (map #(list not `(~pred ~%1)) vals) handlers)]
      `(cond
         ~@terms
         :else (do ~@body)))))
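Here is a short usage sketch of the finished macro. The definitions are repeated from above so the snippet stands alone, the keyword results such as :ok and :bad-username are made up for illustration, and str is assumed to be an alias for clojure.string.

```clojure
(require '[clojure.string :as str])

;; Definitions repeated from above so the snippet is self-contained.
(defn valid-str? [s]
  (and (string? s) (not (str/blank? s))))

(defn pad [vals len pad-fn]
  (let [s (if (sequential? vals) vals (vector vals))]
    (if (> len (count s))
      (concat s (take (- len (count s)) (repeatedly pad-fn)))
      s)))

(defmacro validate
  ([pred s body]
    `(let [vals# (if (sequential? ~s) ~s (vector ~s))]
       (when (every? ~pred vals#)
         (do ~@body))))
  ([pred s body ex]
    (let [vals (if (sequential? s) s (vector s))
          handlers (pad (if (sequential? ex) ex (vector ex)) (count vals) #(identity false))
          terms (interleave (map #(list not `(~pred ~%1)) vals) handlers)]
      `(cond
         ~@terms
         :else (do ~@body)))))

;; No handlers: the body runs only when every value passes.
(validate valid-str? ["alice" "s3cret"] (:ok))            ; => :ok
(validate valid-str? ["alice" ""] (:ok))                  ; => nil

;; One handler for two values: the second handler is padded with false.
(validate valid-str? ["alice" 123] (:ok) [:bad-username]) ; => false

;; A single value and a single handler, no vectors needed.
(validate valid-str? "" (:ok) :bad-username)              ; => :bad-username
```

One thing to keep in mind is that the four-argument arity inspects s when the macro is expanded, not when the code runs. A literal vector is split into individual values, but a symbol that happens to be bound to a vector is treated as a single value and passed to the predicate whole.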

Conclusion

So there we have it: a macro we can use to ensure code is only executed if a sequence of values are all valid, and as an aside we’ve also got a function to pad sequences so they end up the same length. There are probably some things we could do to improve the macro, but for the moment it works as required and that’s good enough for me.

Posted in General | Leave a comment

Clojam – A Clojure library for Google Code Jam

I put together this small library for doing Google Code Jam with Clojure and just uploaded it to Github in case anyone else might find it useful.

Code Jam

Code Jam is Google’s annual coding competition. It consists of a series of rounds, and in each round a series of problems that must be solved in a limited time. The problems usually consist of reading data from an input file and processing it in some way to get the required results, with points being awarded for completing the problem for small and large datasets as well as for how quickly the problem is solved.

I wanted to try solving the problems using Clojure, and while working through the prior years’ problems I put together this library to handle the mundane stuff like reading the input and writing the output so I could focus on solving the actual problem.

Usage

So how do you use it?

First you need to include the library by adding the following to your project.clj:

[amanoras/clojam "0.1.0"]

Next you need to include the library in your code, for example:

(:use [clojam core cases utils])

Finally you should (but you don’t have to) define a main function that can be called from the command line, e.g. using lein run, so that you can pass in the names of your input and output files. The core of the library is the jam function which takes as arguments:

  • the path to the input file
  • a vector that describes how the data in the input file should be combined into cases
  • a function that solves a case
  • a function that prints the output of a case in the correct format
  • the path to the output file

For example:

(defn -main [infile outfile & args]
  (jam infile [:param1 :param2 :param3]
     solve-case
     output-format outfile))

In this example the data in the input file should be grouped into cases 3 lines at a time, but sometimes the input file can contain several fixed lines before the cases start. In this case you can pass a nested vector as the last element of the vector describing the file structure. This is best illustrated with an example:

(defn -main [infile outfile & args]
  (jam infile [:param1 [:param2 :param3]]
     solve-case
     output-format outfile))

In this example we’re specifying that the data file contains 1 fixed line and then each case consists of 2 lines of data.

It should be noted that this format vector ignores the first line of the data file as this always contains the number of cases in the file.
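To illustrate the idea behind the format vector, here is a rough sketch of how lines might be grouped into cases using only clojure.core. This is not clojam’s implementation: group-cases and its arguments are hypothetical, and the sample lines are invented.

```clojure
;; Hypothetical helper, not part of clojam: split raw lines into cases.
(defn group-cases
  "Drops the leading case-count line, then fixed-lines header lines,
  and partitions what is left into cases of lines-per-case lines."
  [lines fixed-lines lines-per-case]
  (let [body (drop fixed-lines (rest lines))]
    (map vec (partition lines-per-case body))))

;; [:param1 :param2 :param3] style: no fixed lines, 3 lines per case.
(group-cases ["2" "a" "b" "c" "d" "e" "f"] 0 3)
;; => (["a" "b" "c"] ["d" "e" "f"])

;; [:param1 [:param2 :param3]] style: 1 fixed line, 2 lines per case.
(group-cases ["2" "header" "a" "b" "c" "d"] 1 2)
;; => (["a" "b"] ["c" "d"])
```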

Samples

To get an idea of how the library works I’ve included some example code in the samples folder. These are the solutions to the problems for round 1A in 2008. You can see the problem descriptions here.


Mocking with Midje

So I was playing around with an application recently and wanted to integrate Chas Emerick’s excellent Friend authentication library. I set up a simple User service to get users, roles etc. in a format that could be consumed by Friend and wanted to test the code using Midje. The question was how to represent the user data repository?

One option was to set up a test database using H2 or something similar and test the code against it, but that would mean having to reload the database each time the tests were run to ensure the data was consistent between test runs. Another option was to store the data as a map in memory, but that seemed to have the same limitations as a test database. In the Java world we’d get around these sorts of limitations using a mocking framework like jMock or EasyMock, but how could I do this in Clojure?

Fortunately Midje provides some great support for doing this kind of thing through prerequisites and meta-constants. Prerequisites are great: they allow us to specify the return value of a function without having to specify the implementation. For example, let’s say I have a function get-user-roles that returns the roles for a given user as a set, and which in turn calls a function retrieve-user-roles that gets the user data from a database. We want to test get-user-roles without having to worry about the database, so we can use a prerequisite to mock retrieve-user-roles. Here’s what our test could look like:

(fact "`get-user-roles` returns a set of roles for a user"
  (get-user-roles ..user-id..) => #{..user..}
  (provided (retrieve-user-roles ..user-id..) => [{:name ..user..}]))

Here we specify our test case as:

(get-user-roles ..user-id..) => #{..user..}

which basically says that if we call get-user-roles passing in ..user-id.. the result should be a set containing ..user.. . The important part is:

(provided (retrieve-user-roles ..user-id..) => [{:name ..user..}])

This is our prerequisite, specified by the (provided) form, and it basically says that when retrieve-user-roles is called with a parameter of ..user-id.. it will return a vector containing a map of {:name ..user..}. Another cool feature of prerequisites is that when we specify a (provided) form not only are we specifying the return value, but also that the code under test must call the function in the prerequisite with the stated parameters. If it doesn’t then the test will fail.
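For context, here is a minimal sketch of code that would satisfy the fact above. The implementations are assumptions made for illustration: the post never shows get-user-roles, and retrieve-user-roles is just a placeholder that throws, standing in for the real database call.

```clojure
;; Placeholder for the database call; a real one would query a data store.
(defn retrieve-user-roles [user-id]
  (throw (UnsupportedOperationException. "no database in this sketch")))

;; Collects the :name of each role map into a set.
(defn get-user-roles [user-id]
  (set (map :name (retrieve-user-roles user-id))))

;; Outside Midje, with-redefs gives a similar but weaker stub:
;; it swaps the function out, but does not check that it was called.
(with-redefs [retrieve-user-roles (fn [_] [{:name :admin} {:name :user}])]
  (get-user-roles 42))
;; => #{:admin :user}
```

The difference from a Midje prerequisite is the checking: with-redefs only replaces the function, whereas provided also fails the test if the stubbed call never happens.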

Now some of you may have noticed the odd parameters that were used in the test case, e.g. ..user-id.. . This is an example of a meta-constant, the second thing that Midje provides to assist in mocking. Meta-constants allow us to defer decisions about what data we want to use in our tests. Essentially they allow us to substitute a symbol for the data and then refer to the symbol rather than having to worry about the actual value. For example, in the sample code above we pass a meta-constant, ..user-id.., to get-user-roles rather than passing in an actual user ID, since we don’t really care about what value is passed to the function, only that when ..user-id.. is passed a specific result should be returned. True, in this instance we could hard-code a value in the test, but using a meta-constant gives us a couple of advantages. Firstly, it makes explicit that we aren’t concerned with the actual value that is passed to the function whilst making it clear the value under test is a user ID. Secondly, it makes it easier to catch typos and errors where the meta-constant is used, as the test will fail if the meta-constant name is wrong or used inconsistently within the test.

As a further example suppose we want now to test admin accounts. In this case we can easily write a second test passing in a new meta-constant ..admin-id.. to the get-user-roles function and add a new prerequisite to return a different set of data for admins. Our test can now expect a different set of roles to be returned without having to change any core code or worry about what data is in the database. Magic.

Posted in Clojure

Ganelon

When it comes to web development with Clojure everything pretty much revolves around Ring and Compojure for HTTP abstraction and routing, with a generous helping of Hiccup or Enlive for HTML templating plus Friend or SQL Korma or whatever else you need to round out your stack. This works pretty well since it gives you a huge amount of control over how you put together your application, but as a newcomer to Clojure it can be really daunting, since not only are you trying to learn the nuances of a new language but also the intricacies of a whole bunch of libraries. When you’re new to something like this what you really want is a one stop shop where you can get all the parts without having to worry about how the pieces fit together.

One early attempt to do this for Clojure was Noir. Noir is essentially an abstraction over Ring/Compojure that includes Hiccup by default and adds some extra features like cookie handling and stateful sessions and some syntactic sugar for creating pages. However, the recent deprecation of Noir pretty much puts us back where we started. But as they say, when God closes a door he opens a window, and in this case the best bits of Noir were repackaged as lib-noir (a library that can be accessed by any Clojure code without the rest of the framework) and this was taken up by a couple of new frameworks such as Luminus and Ganelon. Both of these projects build on top of Compojure by adding lib-noir features and smoothing out some of Noir’s rough points, but whereas Luminus is closer to a straight replacement for Noir (although it uses a different templating library and adds database support), building a full stack framework a la Rails for Clojure, Ganelon has taken the best bits of Compojure and Noir and added some AJAX sizzle.

Essentially Ganelon is Ring, Compojure and Noir (with better handling for custom middleware) and some javascript and CSS (Bootstrap anyone?), but it takes an unusual approach to AJAX. Really what it comes down to is that you can create widgets (which are snippets of HTML you want to perform some AJAX operation on) and actions (which are server-side functions that return JSON or javascript operations). They work by allowing part of your web page to call some code on the server; that code can then generate some HTML fragments or other output that then gets sent back to the browser and inserted into the DOM. Pretty simple, right? But all of this is written in Clojure: no javascript, no Clojurescript, just plain Clojure. What I really like about this is that the code that you use to generate the initial rendering can be reused to generate the updates. This is great because it means you can have a single code base and don’t need one set of code to generate the initial page and another one to do the same thing in javascript on the client side.

There’s obviously more to it than just that and I’ll try to work through a tutorial in the near future, but for the moment I recommend that anyone interested in Clojure web development check it out.

Posted in Clojure

Clojurescript Links

I’ve been following Clojurescript since it was released, but have only recently started using it. Just in case you don’t know what it is, Clojurescript is a compiler for Clojure that compiles to Javascript so it can be executed in a browser. It can also generate highly optimized Javascript by running it through the Google Closure (not to be confused with Clojure) compiler.

Since there are already some really good introductions to Clojurescript online I thought I’d post some links to those rather than posting a tutorial on getting started with Clojurescript, so here they are:

Posted in Clojure

TDD in Clojure with Midje

One of the things I love about Clojure is that it has great support for writing unit tests. For starters you have clojure.test built in to the Clojure API. The great thing about that is you don’t have any extra dependencies: once you have Clojure you can start writing tests. But if you don’t like clojure.test there are several other unit test frameworks you can use, with differing approaches to unit testing. One of these is Midje by Brian Marick. Midje is a unit test framework that encourages readable tests that can be written bottom-up or top-down, with tests written as facts that are asserted against the code under test. In the rest of this post I’m going to show how to get up and running with Midje in a Clojure project and define some tests for a sample project.

Setting up Midje

The easiest way to start using Midje is to install the lein-midje plugin. There are 2 ways to do that. If you’re planning on using Midje in all your projects you can add lein-midje to the :plugins list in the :user profile in ~/.lein/profiles.clj e.g.

{:user
  {:plugins [[lein-midje "2.0.0-SNAPSHOT"]]}}

However if you want to be able to configure this on a project by project basis you can add the lein-midje plugin to the :plugins list in the :dev profile of your project.clj file e.g.

:profiles {
  :dev {
    :dependencies [[midje "1.4.0"]]
    :plugins [[lein-midje "2.0.0-SNAPSHOT"]]}})

The final step is to add midje 1.4.0 (or whatever the latest version is) to the :dev :dependencies as shown above. Once you’ve done that you can run lein midje from the command line within your project to validate everything is set up correctly (if you have any existing tests you should see a message like “All claimed facts (X) have been confirmed.”).

Sample Project

I find the best way to get to grips with a new language or library is to build a small sample project using it. For that reason I’ve put together a simple project to demonstrate some of the features of Midje. As I said the project is really simple, it involves creating a single function that given a number will return that number multiplied by 2. The full source code can be found here, but I’ll be describing the steps needed to recreate the code in the following sections.

So the first thing we need to do is create a new project to work with. Again, the best way to do this is using Lein. The default project template will do nicely, so the first step is to go to the directory where you want to create your project and run:

lein new clj-midje-example

Once you’ve done that cd into the project directory and we’re ready to go. We’re going to create a function called times-2 that when given a number returns that number multiplied by 2. Simple, but we’ll expand on it a bit later to demonstrate some of Midje’s features.

First Test

So let’s write our first test. When we created our new project Lein created a sample test case in clj-midje-example/test/clj_midje_example/core_test.clj. We’re going to replace this sample test case with one of our own, but since this test case uses clojure.test we’re going to have to make some other changes as well to get Midje working. So open clj-midje-example/test/clj_midje_example/core_test.clj in an editor and replace the line (:use [clojure.test]) with (:use [midje.sweet]). midje.sweet is the core namespace for Midje and needs to be included in any files containing test cases. Next we need to remove the deftest declaration since we aren’t using clojure.test anymore. Now we’re ready to start writing tests.

Tests in Midje are written as facts. Facts take the following form:

(fact "Doc String"
  (function-to-test params) => expected-result)

Whilst the expected result can be a literal Midje also allows functions such as even?, pos?, nil? etc. Midje provides a number of helper functions you can use to test results, you can read more about them on the Midje Wiki.

For our first test all we really want to do is check that the function does what it’s supposed to do i.e. multiply a single argument by 2, so our test looks like:

(fact "2 * 2 equals 4"
  (times-2 2) => 4)

Add this to core_test.clj and save the file. From the command line run:

lein midje

You should see an error like Exception in thread "main" java.lang.RuntimeException: Unable to resolve symbol: times-2 in this context, compiling:(clj_midje_example/core_test.clj:8). That’s because we haven’t written our times-2 function yet, so let’s do that.

Write the Code

Open clj-midje-example/src/clj_midje_example/core.clj in an editor and add:

(defn times-2 [a]
  (* 2 a))

and rerun lein midje. This time you should see All claimed facts (1) have been confirmed. Great, the test passes, but the function doesn’t really do much, so let’s spice things up a bit by making it do a bit more.

Our function as it is right now is great: it takes in a number and multiplies it by 2, simple and effective. But what if we also wanted it to work with vectors and strings? Let’s say that we wanted a function that, if we passed in a string, would return the original string repeated, e.g. passing in "Hello World!" would return "Hello World!Hello World!". Or if we passed in a vector such as [1 2 3] we would get back [[1 2 3] [1 2 3]]? Let’s try it.

More Tests

Once again we start by writing some test cases, 1 to test the behaviour for strings and 1 to test the behaviour for vectors. So let’s open clj-midje-example/test/clj_midje_example/core_test.clj in an editor and add:

(fact
  (times-2 [1 2 3]) => [[1 2 3] [1 2 3]])

(fact
  (times-2 "123") => "123123")

Run lein midje again and you should see an error like:

FAIL at (core_test.clj:11)
    Expected: [[1 2 3] [1 2 3]]
      Actual: nil

FAIL at (core_test.clj:14)
    Expected: "123123"
      Actual: nil

FAILURE: 2 facts were not confirmed. (But 1 was.)

Now let’s add the code to get the test cases to pass.

Update the Code

We need to modify our function to allow it to also accept strings and vectors in addition to numbers. The easiest way to do that is to check the type of the function’s argument and handle it appropriately. Open clj-midje-example/src/clj_midje_example/core.clj in an editor and replace the times-2 function with the following:

(defn times-2 [a]
  (cond
    (= (type a) java.lang.Long) (* 2 a)
    (= (type a) clojure.lang.PersistentVector) (vec (repeat 2 a))
    (= (type a) java.lang.String) (apply str (repeat 2 a))))

Once again run lein midje and you should see All claimed facts (3) have been confirmed.

Excellent, our function now works exactly as we wanted. Except there’s one small problem, this is completely the wrong way to handle polymorphism in Clojure. The right way would have been to use Protocols or Multimethods. So let’s refactor the code to do it right. To keep things simple we’ll use Multimethods.

Multimethods are a system Clojure provides to support polymorphism. They allow dispatching on types, values, attributes or metadata of function arguments as well as the relationships between the arguments. You define a multimethod using defmulti and provide a dispatching function as part of the definition that is applied to the method arguments to determine a dispatching value. That value is then used to determine which method, defined using defmethod, to call. You can also define a default method, using the :default dispatch value, that will be called if no other matching methods can be found. Here’s what the refactored code using Multimethods looks like:

(defmulti times-2 class)

(defmethod times-2 String [a]
  (apply str (repeat 2 a)))

(defmethod times-2 clojure.lang.PersistentVector [a]
  (vec (repeat 2 a)))

(defmethod times-2 :default [a]
  (* 2 a))

So what we’ve done here is define a times-2 multimethod using defmulti, and we’ve specified the dispatch function to be class so we can define different behaviour depending on the argument type. We then define 3 times-2 methods using defmethod. The first method will be called if the argument is a String and will return the String repeated twice. The second method will be called if the argument is a vector and will return a vector containing the original vector repeated twice. The third method uses the :default value so will be called whenever the argument is anything but a String or a vector.
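One side effect of the :default method is worth seeing in action: inputs the cond version silently ignored now get handled. The sketch below keeps both versions side by side; times-2-cond is just the earlier cond implementation renamed for the comparison.

```clojure
;; The earlier cond version, renamed here so both can coexist.
(defn times-2-cond [a]
  (cond
    (= (type a) java.lang.Long) (* 2 a)
    (= (type a) clojure.lang.PersistentVector) (vec (repeat 2 a))
    (= (type a) java.lang.String) (apply str (repeat 2 a))))

(defmulti times-2 class)
(defmethod times-2 String [a] (apply str (repeat 2 a)))
(defmethod times-2 clojure.lang.PersistentVector [a] (vec (repeat 2 a)))
(defmethod times-2 :default [a] (* 2 a))

(times-2-cond 2.5) ; => nil, a Double matches none of the cond clauses
(times-2 2.5)      ; => 5.0, handled by the :default method
(times-2 "ab")     ; => "abab"
(times-2 [1 2])    ; => [[1 2] [1 2]]
```

Multimethod dispatch uses isa?, which follows the Java class hierarchy, so registering the vector method against an interface such as clojure.lang.IPersistentVector instead of the concrete class would be one way to cover other vector types too.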

So now let’s run our tests again to make sure our refactoring hasn’t changed anything. After running lein midje you should still see All claimed facts (3) have been confirmed.
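For comparison, here is roughly what the protocol route mentioned earlier might look like. This is only a sketch of the alternative, not code from the sample project: the names Doubler and double-up are made up for the example.

```clojure
;; Hypothetical protocol; the names are illustrative only.
(defprotocol Doubler
  (double-up [x] "Returns x doubled in a type-appropriate way."))

(extend-protocol Doubler
  Number
  (double-up [x] (* 2 x))
  String
  (double-up [x] (str x x))
  clojure.lang.IPersistentVector
  (double-up [x] [x x]))

(double-up 2)       ; => 4
(double-up "123")   ; => "123123"
(double-up [1 2 3]) ; => [[1 2 3] [1 2 3]]
```

Protocols dispatch only on the type of the first argument, which is all this example needs. Multimethods are the more flexible choice when dispatch depends on values or on several arguments.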

Conclusion

We’ve only scratched the surface of what you can do with Midje. We haven’t talked at all about how you can group related facts using facts or about how you can define prerequisites using the provided function nor anything about the powerful prepackaged checkers that Midje provides. This is a framework with a lot to it and I’m hoping you can see that whilst Midje is simple to get started with it has a great deal of depth. As always the best place to find out more is the Midje Wiki.

Posted in Clojure

Hello

Hi there.

Welcome to my blog. As you can see there isn’t much here yet, but I’m planning to post here fairly regularly so that should change quickly.

My plan for this blog is to use it as a sort of “brain dump,” somewhere to store stuff I’ve been thinking about. I expect it will end up as part tutorial, part rant, but hopefully it will be useful to someone. In my day job I work in Java web development and with the Google APIs so I expect some of that will surface here, but in my spare time I love using Clojure and I plan to document what I learn about various libraries here as a reference.

So that’s it for now, the next post should be much more technical.
