John Fremlin's blog: Uncompromising meta-programming

Posted 2011-03-09 23:00:00 GMT

Meta-programming is using programs to generate or transform programs, like a compiler, it's difficult to explain how it can be both easy and useful. One simple example is efficient serialization: automatically generating converters for between in-memory and network or disk data-representations as efficiently as writing the conversion programs by hand. There is no compromise between efficiency and syntax.

As another example, in tpd2, which is designed for high performance, meta-programming is used to allow a convenient syntax for outputting HTML templated strings with performance as good as hand generated code, compile time safety in terms of spell-checking of attributes and that TD (table cell) elements appear only inside TR (table row) elements, etc.

Syntax becomes independent of implementation: meta-programming explicitly allows convenient syntax, without repetitive drudgery while achieving optimal behaviour: having efficient code is possible without compromising syntactical elegance and forcing the human programmer to repeatedly apply optimization tricks to complicate the program, making it harder to understand and modify.

The basic syntax for the HTML generation in TPD2 is in appearance like a function call: (<h1 "Hello") returns "<h1>Hello</h1>", and nested tags with attributes are handled: so that (<p "This is a " (<a :href "link.html" "link")) goes to "<p>This is a <a href="link.html">link</a></p>". Anywhere a literal string appears a function can be called instead, or a variable substituted, so that dynamic generation of content is easy. This syntax can be naively implemented by defining functions <p, <a, <h1 etc.

Efficiency can be achieved with this simple syntax: an example might be

(defun greet (name) 
       	   (<h1 "Hello " name) 
	   (<p "Pleased to meet you, " name ", thanks for reading.")))

Performing string concatenations at runtime is O(n2), but not running string concatenations means that the scatter gather IO must be used with small fragments which is inefficient in the kernel as each fragment needs to be access checked, and having many fragments puts pressure on the GC. In fact in tpd2, string concatenation was wasting the majority of time in some benchmarks.

With a naive implementation, the <h1 would boil down to (strcat "<h1>" "Hello " name "</h1>"), or even worse and there would be no way to merge the "<div>" with the "<h1>" tags. With Lisp's convenient meta-programming, where the <div operator is no longer a function, however, it can inspect its arguments and perform an optimization across boundaries: in this case transforming the greet function to the optimal (given an optimal strcat operator which is complicated in itself):

(defun greet (name)
       (strcat "<div><h1>Hello " 
       	       "</h1><p>Pleased to meet you, " 

Guaranteed safety achieved through static analysis of the properties of the program, even domain specific ones is better than turning up bugs in time consuming testing; properties are proved rather than gathering statistical evidence about them. The domain specific transformations with meta-programming techniques naturally tend to allow compile time checks.

For example, with the tpd2 HTML generation, attributes are spell-checked so that, for example, an href attribute is not allowed in an <p> element. This would be impossible to check with HTML validation tools unless every codepath that could generate HTML were exercised. Additionally, the HTML DTD specifies that only some tags can appear in others; for example, a <div> cannot appear in an <h1>. Where possible, violations of these constraints are also reported at compile time; however, as tag contents can be generated by arbitrary programs, it is not possible to find all violations. In fact, this brings up a major weakness of Lisp meta-programming: a sophisticated compiler may have a strong understanding of the potential control flow with powerful constant propagation and type inference, but that is not exposed to the macro-expansion facilities.

I hope I have been able to present a practical example of the utility of convenient meta-programming; of course many more sophisticated things are possible (like compiling regular expressions to machine code or allowing convenient definition of object relational mappings). It allows one to keep a pretty syntax while achieving optimal machine code generation and simultaneously increasing safety. Additionally, it a natural interface barrier between the application programmer and the systems programmer that allows either to comfortably modify their design and implementation independently. No compromise is necessary and none is made.

Post a comment