Domain Specific Languages

At OT2004, I ran a workshop on domain specific languages. OT is where the UK experts in software development gather each year, and so the people attending a workshop generally know more than whoever is running it. It's a great way to get some good ideas about a topic you're working on.

So here's a brief intro to the topic, and some of the ideas that came out of the workshop part.

What's a Domain Specific Language, and Why is it Important?

A domain specific language is, well, a language specific to a domain.

If you happen to be writing software for phone switches, you probably use a Call Processing Language, which describes the process each call goes through -- the state changes from dial tone to dialling, to ringing or engaged, etc. There is some kind of compiler or generator that creates the code for the phone switch from the CPL statements; generally, the generator will produce a mixture of code, tables, database schemata, etc. The benefits of using CPL rather than writing the code directly are:

  • In Call Processing Language, you talk in terms of telephone calls, rather than some implementation of them in C or whatever. So if it's your job to think about how some feature like Call Waiting should work, then you can think clearly about that and leave the C to someone else -- the writer of the generator.
  • Conversely, the writer of the generator is free to worry about implementation details, rather than how the big features fit together.
  • The CPL statements describe the progress of just one call, even though many calls are in progress concurrently in the running system and the implementation probably doesn't have a separate process or thread for each call. The generator takes care of that mapping.
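
To make the split between language and generator concrete, here is a minimal sketch in Python. The statement syntax, the state and event names, and the `compile_cpl` and `Switch` helpers are all invented for illustration; they are not taken from any real call processing language. The statements describe one call, the "generator" turns them into a table, and the runtime drives many calls from that table without a thread per call.

```python
# Hypothetical call-processing statements: "<state> on <event> -> <next state>".
# They describe the life of ONE call and say nothing about concurrency.
CPL_SOURCE = """
idle      on off_hook    -> dial_tone
dial_tone on digit       -> dialling
dialling  on digit       -> dialling
dialling  on callee_free -> ringing
dialling  on callee_busy -> engaged
ringing   on answer      -> connected
ringing   on on_hook     -> idle
engaged   on on_hook     -> idle
connected on on_hook     -> idle
"""


def compile_cpl(source):
    """The 'generator': turn the statements into a transition table."""
    table = {}
    for line in source.strip().splitlines():
        lhs, next_state = line.split("->")
        state, keyword, event = lhs.split()
        if keyword != "on":
            raise ValueError(f"expected 'on' in statement: {line!r}")
        table[(state, event)] = next_state.strip()
    return table


class Switch:
    """The implementation side: many calls, one table, no thread per call."""

    def __init__(self, table):
        self.table = table
        self.calls = {}  # call id -> current state

    def handle(self, call_id, event):
        state = self.calls.get(call_id, "idle")
        # Events that make no sense in the current state are ignored.
        new_state = self.table.get((state, event), state)
        self.calls[call_id] = new_state
        return new_state
```

Someone designing a feature edits only `CPL_SOURCE`; someone tuning the switch edits only `Switch`. Calling `Switch(compile_cpl(CPL_SOURCE)).handle("call-1", "off_hook")` moves that one call to `dial_tone` and leaves every other call untouched.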

Domain Specific Languages can be thought of as an extension of the idea of Software Product Lines. The generic framework provides the architecture that encompasses a wide range of solutions; the language provides a medium for specialising the framework to a particular set of requirements.

DSLs aren't new, though they have tended to be more visible in horizontal domains: SQL, regular expressions, HTML, workflow, scripting languages. We believe that, just as in the case of call processing languages, they can be equally useful in vertical (business specific) domains such as financial processing.

In fact, designing a DSL can be a useful strategy on any 'big' project. Designing a phone call processing language pays its way even if you only have one phone switch to develop: partly because of the separation of implementation detail from requirements; and partly because the process of designing the language forces you to think about the key concepts in the requirements domain.

Learning points from the workshop

The workshop took the form of a series of exercises performed in groups. Participants were asked to develop DSLs for two different domains (baggage handling and fast food). (benjaminm said some nice things about it.) These are points that came up in the ensuing discussions:

  • DSLs and Software Product Line techniques are good for the architecture of a large system even if it isn't a product family:
    • Inventing a language helps componentize an application. Dividing an application into pluggable components so that it becomes generic is a different exercise from simply splitting a large design into modules. Thinking of the language first helps you do the former.
    • SPLs reduce dependencies between applications and components, thereby turning large projects into multiple smaller and independent ones (unlike conventional modularization, which leaves strong dependencies of overall design on component designs).
  • Because the solutions within a specific domain are generally well mapped out, it is sufficient for a DSL to talk about the requirements rather than the solutions. The generator/interpreter can select solutions matching the requirements. (E.g. you don't have to understand database implementation (much) to write SQL.)
  • A DSL statement (in the call processing language, for example) might cover multiple substantial parts of a business domain employing multiple machines, and might include some manual processes. Generating software from it is just one thing you could do with it. Ordering the hardware and producing staff training manuals might be others, as well as creating animations for prototyping purposes.
  • MDA (tm) doesn't cover the use of models for non-software purposes.
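
As a rough illustration of using one model for more than software generation, here is a small Python sketch. The baggage-handling model, its field names, and the three functions are all hypothetical; the point is only that the same description feeds a software artefact, a hardware order, and a training outline.

```python
from collections import Counter

# Hypothetical model of a small baggage-handling site, as a DSL might capture it.
MODEL = [
    {"kind": "check_in_desk", "name": "desk1", "feeds": "belt_a"},
    {"kind": "check_in_desk", "name": "desk2", "feeds": "belt_a"},
    {"kind": "conveyor", "name": "belt_a", "feeds": "scanner1"},
    {"kind": "scanner", "name": "scanner1", "feeds": None},
]


def generate_routing(model):
    """Software output: a routing table for the control system."""
    return {item["name"]: item["feeds"] for item in model if item["feeds"]}


def hardware_order(model):
    """Non-software output: how many of each kind of equipment to buy."""
    return dict(Counter(item["kind"] for item in model))


def training_outline(model):
    """Non-software output: one training chapter per kind of equipment."""
    kinds = sorted({item["kind"] for item in model})
    return ["Operating a " + kind.replace("_", " ") for kind in kinds]
```

None of the three functions knows about the others; each is just another reader of the same model.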

Designing a DSL:

  • It's important to think of your language as such from the beginning -- too many projects have started with a table of configurable parameters which has been extended and extended, and become a mess. Sendmail was an example.
  • People generally design languages by first inventing sample statements, then abstracting the grammar.
  • Most people find it easier to invent a draft concrete syntax first, then abstract; but of course it's also possible to write an abstract model and then clothe it in concrete syntax. The two techniques can be parallel and complementary.
  • First write down the points of variability between members of the family of products (or business domains).
  • Common pitfall: writing sample statements that describe what's common to all specializations of the product family, rather than statements that distinguish one member from another.
  • Common pitfall: letting the language degenerate into tables.
  • There may be a relationship between the DSL and the language used in the GUI to report the state of the system. For example in railway control, the plan of the lines, switches and signals would be an input to software generation; it would also be the basis of the plan seen by the controllers at runtime.
  • UML (with suitable icons) covers the syntax of many of the most commonly used diagrams. Not sure whether it does the same for grammars.
  • Languages come with tools: editors, checkers, animators, generators, debuggers.
  • Most models will be partial. Each expresses one point of view: the requirements from a particular source (legal & regulatory constraints, main users, monitoring and admin, ...), often expressed in that source's own terms, or one kind of requirement (functional, deployment, performance, ...). These models can be composed at the implementation level, but it's much more useful if they can be composed, and tested for consistency, at the more abstract level.
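
The "sample statements first, grammar second" approach can be sketched in a few lines of Python. The fast food statements, the grammar, and the `parse` and `meal_price` helpers below are invented for illustration and are not what the workshop groups produced. Prices are in pence to keep the arithmetic exact.

```python
import re

# Step 1: hypothetical sample statements, written before any grammar existed.
# They capture what varies between outlets: the items, prices and meal deals.
SAMPLES = """
item burger price 250
item fries price 120
meal lunch = burger + fries discount 10%
"""

# Step 2: the grammar abstracted from the samples (informal EBNF):
#   statement ::= item | meal
#   item      ::= "item" NAME "price" NUMBER
#   meal      ::= "meal" NAME "=" NAME ("+" NAME)* "discount" NUMBER "%"
ITEM = re.compile(r"item (\w+) price (\d+)$")
MEAL = re.compile(r"meal (\w+) = (\w+(?: \+ \w+)*) discount (\d+)%$")


def parse(source):
    """Parse statements into an abstract model: (items, meals)."""
    items, meals = {}, {}
    for line in source.strip().splitlines():
        line = line.strip()
        item = ITEM.match(line)
        meal = MEAL.match(line)
        if item:
            items[item.group(1)] = int(item.group(2))
        elif meal:
            meals[meal.group(1)] = (meal.group(2).split(" + "), int(meal.group(3)))
        else:
            raise ValueError(f"not a valid statement: {line!r}")
    return items, meals


def meal_price(items, meals, name):
    """Price of a meal in pence, after its percentage discount (rounded down)."""
    parts, discount = meals[name]
    return sum(items[part] for part in parts) * (100 - discount) // 100
```

The dictionaries returned by `parse` play the role of the abstract model; the text in `SAMPLES` is just one concrete syntax for it, and a diagram or a form could be another.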