Martin Fowler on Language Workbenches

Martin has just put up an article on Language Workbenches - IDEs for creating and using DSLs.

As you'd expect from Martin, this is an insightful piece, with enough follow-on links to keep you interested and busy for days.

One of the links is to a second article on code generation. Here Martin explains how to write a code generator for a DSL. The first point that comes out is the important distinction between concrete and abstract syntax. This distinction allows a language to have several concrete views that all map to the same abstract structure, and it is this abstract structure that code generators take as input. This saves having to rewrite the code generator every time you add a new concrete view to the language. In our own DSL Tools, we are emphasizing graphical and XML concrete syntaxes for languages. We also generate an API from a language definition, which allows direct access to the abstract data structures in memory for purposes such as code generation (all the file handling is done for you).
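To make the concrete/abstract distinction tangible, here is a minimal sketch in Python. It is not taken from Martin's article or from DSL Tools: the `Entity` and `Field` classes, both toy syntaxes and the generated output are assumptions invented for illustration. Two concrete views (one XML, one plain text) are parsed into the same abstract structure, and a single generator works only from that structure.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    type: str


@dataclass
class Entity:
    """Abstract syntax: the in-memory structure every concrete view maps to."""
    name: str
    fields: list


def parse_xml(text):
    """Concrete view 1 (hypothetical): an <entity> element with <field> children."""
    root = ET.fromstring(text)
    fields = [Field(f.get("name"), f.get("type")) for f in root.findall("field")]
    return Entity(root.get("name"), fields)


def parse_text(text):
    """Concrete view 2 (hypothetical): 'entity Name' followed by 'field: type' lines."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    name = lines[0].split()[1]
    fields = []
    for line in lines[1:]:
        field_name, field_type = [part.strip() for part in line.split(":")]
        fields.append(Field(field_name, field_type))
    return Entity(name, fields)


def generate_class(entity):
    """The single generator: it only ever sees the abstract structure."""
    params = ", ".join(f"{f.name}: {f.type}" for f in entity.fields)
    out = [f"class {entity.name}:", f"    def __init__(self, {params}):"]
    for f in entity.fields:
        out.append(f"        self.{f.name} = {f.name}")
    return "\n".join(out)


XML_VIEW = (
    '<entity name="Customer">'
    '<field name="name" type="str"/><field name="age" type="int"/>'
    "</entity>"
)
TEXT_VIEW = """
entity Customer
name: str
age: int
"""

from_xml = parse_xml(XML_VIEW)
from_text = parse_text(TEXT_VIEW)
```

Both parsers produce equal `Entity` values, so `generate_class` emits identical code for either view. Adding a third concrete syntax would mean writing one more parser and leaving the generator untouched.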

Martin continues, in the article, to talk about code generation itself. The first approach he demonstrates is not really a generator at all, but rather an interpreter. This is written in plain code and makes use of reflection. The second approach uses text templates. In generating code for designers from definitions of DSLs, we have found text templates to be our preferred method for writing code generators. We wrote our own text templating engine, which is included as part of DSL Tools. We have taken great care to architect the engine so that it can be integrated into different contexts, which means that it can be hosted in different environments (e.g. inside Visual Studio or not) and can accept inputs from multiple sources. For DSLs, we've built a Visual Studio host and the extensions that allow direct access within templates to models in memory through the generated APIs mentioned above. My colleague Gareth Jones has blogged about the engine, and its use in a DSL Tools context is illustrated in the walkthroughs that are part of the DSL Tools download. We're actively working on more complete documentation for the engine itself, including the APIs.
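The difference between the two approaches can be sketched as follows. This is an illustrative Python example, not Martin's code and not the DSL Tools templating engine: the `ServiceCall` class, the field layout in `MODEL` and the template text are all made up, and Python's standard `string.Template` stands in for a real template engine. The interpreter reads the model at run time and sets attributes reflectively; the generator expands text templates into source code that does the same job.

```python
from string import Template


class ServiceCall:
    """Hypothetical target class that fixed-width records are mapped onto."""


# Abstract model (assumed for this sketch): attribute name, start and end column.
MODEL = {
    "target": ServiceCall,
    "fields": [("customer_name", 0, 8), ("customer_id", 8, 12)],
}


def interpret(model, line):
    """Approach 1: interpret the model at run time, using reflection (setattr)."""
    obj = model["target"]()
    for attr, start, end in model["fields"]:
        setattr(obj, attr, line[start:end].strip())
    return obj


FUNCTION_TEMPLATE = Template(
    "def parse_$func(line):\n"
    "    obj = $cls()\n"
    "$assignments"
    "    return obj\n"
)
ASSIGNMENT_TEMPLATE = Template("    obj.$attr = line[$start:$end].strip()\n")


def generate(model):
    """Approach 2: expand text templates into source code for the same mapping."""
    assignments = "".join(
        ASSIGNMENT_TEMPLATE.substitute(attr=attr, start=start, end=end)
        for attr, start, end in model["fields"]
    )
    cls = model["target"].__name__
    return FUNCTION_TEMPLATE.substitute(
        func=cls.lower(), cls=cls, assignments=assignments
    )


LINE = "Ann Lee 0042"
interpreted = interpret(MODEL, LINE)  # object built directly from the model
source = generate(MODEL)              # source text for a parse_servicecall function
```

The interpreter needs the model at run time, whereas the generated `parse_servicecall` function is ordinary code that can be read, debugged and compiled like anything written by hand.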

Aspects that Martin did not touch on in his article include orchestrating the generation of multiple files from multiple sources, integration with source control (though it is a moot point whether generated files should be checked into source control or not), and how to handle cases where 100% code generation is not feasible. Particularly tricky are the cases where hand-written code has to be added to the same file as generated code: skeleton code is generated, but the programmer has to fill in method bodies, for example. We don't have answers to these yet, but they're on the roadmap.
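To illustrate the same-file problem, here is a sketch of one commonly used technique, often called protected regions. It is not something DSL Tools provides, and the marker comments, the `OrderService` class and the method names are all assumptions made up for this example. Hand-written bodies sit between marker comments, and the generator copies them forward whenever the skeleton is regenerated.

```python
import re

# Hypothetical marker comments delimiting hand-written code in a generated file.
REGION = re.compile(
    r"# BEGIN USER CODE (\w+)\n(.*?)[ \t]*# END USER CODE \1\n", re.DOTALL
)

DEFAULT_BODY = "        raise NotImplementedError\n"


def extract_regions(existing_source):
    """Collect hand-written bodies from a previously generated file, keyed by name."""
    return {m.group(1): m.group(2) for m in REGION.finditer(existing_source)}


def generate_skeleton(methods, preserved):
    """Emit the class skeleton, reusing any preserved body for each method."""
    out = ["class OrderService:\n"]
    for name in methods:
        out.append(f"    def {name}(self):\n")
        out.append(f"        # BEGIN USER CODE {name}\n")
        out.append(preserved.get(name, DEFAULT_BODY))
        out.append(f"        # END USER CODE {name}\n")
    return "".join(out)


# First generation: nothing to preserve yet.
first = generate_skeleton(["submit"], {})

# The programmer fills in a method body by hand.
edited = first.replace(DEFAULT_BODY, "        return 'submitted'\n")

# The model gains a method; regeneration keeps the hand-written body.
second = generate_skeleton(["submit", "cancel"], extract_regions(edited))
```

After regeneration, `submit` still contains the hand-written `return 'submitted'` while the new `cancel` method gets the default placeholder. The approach has well-known weaknesses: it breaks if someone edits or deletes a marker, and renaming a method in the model orphans its body.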