Alan Cameron Wills - Domain Specific Languages

Models, domain-specific languages, code generation, ....

SPA Conference

As always, the SPA conference was stimulating. Gareth Jones and I ran a workshop on “software factories and DSLs”. Some of the points that came out clearly for me:

  • Composing languages. It’s a big investment to create a DSL from scratch, or even from vendor-provided basic bits. People need to be able to make languages from fragments that can be passed around. There are roughly two categories of fragment, and they should come in kits:
    • Little fragments. If you see a diamond in a flowchart-like context, you expect it to be a decision point; if you see a big arrowhead on an entity-relational sort of diagram, you expect it to mean some form of inheritance, with implications about the features of the things at the two ends of the arrow. It would be good to package parameterisable bits of notation along with their rules of interpretation. Language authors need to be able to create these fragments as well as using them.
    • Big fragments. Sequence diagrams and project planning charts are variants of the same thing: bars one way with arrows between them the other. There’s a big chunk of definition there that can either be parameterised and used as a complete language, or used as part of a larger language. Again, the editing behaviour and semantics are part of the definition, as well as the syntax.
    • Fragment kits. Fragments should come in coherent kits, designed to fit together, and from which authors can choose and parameterise subsets. Generally we wouldn’t expect to find either sort of fragment passed around as an isolated item.
  • Refactoring languages. Language ideas evolve. We need to make it easy to generalise language definitions, extracting and merging the common parts of existing definitions.
  • Evolution and migration. DSLs change, and existing instances or statements of the language need to be migrated; or the editors and processors of the language need to be backward compatible. We need to provide support for:
    • Restricting changes to (backward-compatible) augmentations if the author so chooses
    • Creating migration tools
    • Version identification implicit in every language — so that statements of the language effectively have strong names
  • Views and abstractions. Most DSLs are abstractions – the baggage-track diagram doesn’t define the whole airport software. We need to make it easy to project and combine views, at two levels:
    • The model/presentation layer distinction – multiple partial views of one model
    • Weaving at generation time – we take two or more DSL statements and generate artefacts from them. With code generation technology, we have a couple of ways of doing this: one code template can query more than one model; and a template can generate another template, which queries another model.
  • Tool == language. When you use any website or computer tool, you’re using a DSL. At Amazon, you use a language of book purchase; in SQL, you use a language of data query. The design principles of each of them are the same: find the things that vary from one instance to another, and focus the language on that; make the most common things easy to say; let the underlying framework capture the best implementation patterns in the domain.
  • Discoverability. Languages, like tools, need to be usable by novices. If you use C# or Java every day, you don’t need much prompting as to its syntax – though Intellisense does help, of course. If you don’t purchase a book every day, it’s good that the website explains the steps. DSLs are often used by novices, so the language and its tools should be easy for them to pick up.
  • Late composition. SQL can be embedded in C#: obviously, you can write SQL in a string and write some stuff to process it; but C# wasn’t designed with SQL as an explicit part of it. Late composition is a useful feature of more sophisticated languages, allowing plug-in statements to be made in the most appropriate language at the time of writing.
  • Successive approximation. Good modelling languages support the process of arriving at a solution, making it easy to state what you’ve decided so far, without saying what you don’t know yet. For example, in UML, leaving an association undecorated means “I haven’t decided the cardinality yet” rather than assuming a default. This is maybe what distinguishes a modelling language from a programming language. Our language definition tools should support successive approximation.
  • Shared multi-user transactional models. For some of our language authors, it will be important to synchronise the work of many users: a language statement seen by one user will be a view on a model shared with others. In other words, a database.
  • Distributed models. A single central model isn’t always appropriate. Dynamic coupling of models is useful – for example to perform business transactions, or to synchronise a local project with a larger one. Along with this goes dynamic integration of the relevant language fragments.
  • Semantic zones. One aspect of language evolution is that languages evolve separately in different groups of people, and then need to merge or interchange when those groups start to work with each other. For this purpose, we need infrastructure that can apply translations automatically, and tools that help create the translations.
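
To make the evolution and migration point a little more concrete, here is a minimal sketch in Python. It is not a description of any particular tool: the dict-based statement format, the `language_version` field, the `MIGRATIONS` registry and the baggage-handling field names (`tracks`, `belts`, `speed`) are all hypothetical, invented for illustration. The idea it shows is that every statement carries the version of the language it was written in, and that migration is a chain of small steps, one per version.

```python
from typing import Callable, Dict

# A "statement" of the DSL is modelled here as a plain dict that always
# carries the version of the language it was written in (an assumption
# of this sketch, standing in for a strong name).
Statement = Dict[str, object]
Migration = Callable[[Statement], Statement]

# Hypothetical registry: maps a language version to the step that
# upgrades a statement from that version to the next one.
MIGRATIONS: Dict[int, Migration] = {}
CURRENT_VERSION = 3


def migration(from_version: int) -> Callable[[Migration], Migration]:
    """Register a function as the upgrade step out of `from_version`."""
    def register(step: Migration) -> Migration:
        MIGRATIONS[from_version] = step
        return step
    return register


@migration(1)
def rename_tracks_to_belts(stmt: Statement) -> Statement:
    # Version 2 of the (made-up) language renamed "tracks" to "belts".
    out = dict(stmt)
    out["belts"] = out.pop("tracks", [])
    return out


@migration(2)
def add_default_speed(stmt: Statement) -> Statement:
    # Version 3 added a "speed" property; older statements get a default.
    out = dict(stmt)
    out.setdefault("speed", 1.0)
    return out


def migrate(stmt: Statement) -> Statement:
    """Upgrade a statement step by step to CURRENT_VERSION."""
    version = stmt.get("language_version")
    if not isinstance(version, int):
        raise ValueError("statement does not identify its language version")
    if version > CURRENT_VERSION:
        raise ValueError("statement is newer than this tool understands")
    current = dict(stmt)  # work on a copy; the original is left untouched
    while version < CURRENT_VERSION:
        current = MIGRATIONS[version](current)
        version += 1
        current["language_version"] = version
    return current
```

Calling `migrate({"language_version": 1, "tracks": ["A", "B"]})` walks the statement through both steps and returns it at version 3. An author who restricts changes to backward-compatible augmentations would only ever need steps like the second one, which adds a default and never renames or removes anything.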