Graphical languages – semantics vs syntax

Recent postings have talked about “semantics” of graphical languages. When I was involved in UML development this word caused a lot of argument and confusion. In my view, it goes like this.

A graphical language has three primary aspects: concrete syntax, abstract syntax, and semantics. Concrete syntax (notation) defines what the language actually looks like on the screen or the paper: the shapes, connections, adornments and textual annotations that the user sees and manipulates, and the gestures used to manipulate them. Abstract syntax (grammar) gives names to the concepts that the notation depicts, and defines which configurations of these are valid. Often, the abstract syntax for a graphical language is defined using a model, itself expressed using a graphical language. Some people call this a “metamodel”; my colleagues and I usually call it a domain model. Usually such a model is insufficient on its own, and must be supplemented by an additional set of constraints, sometimes called “well-formedness rules”. A typical such rule would be the constraint that an inheritance graph must not form cycles.
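
As a rough illustration of the split between a domain model and a well-formedness rule, here is a minimal Python sketch. The names `ClassNode` and `inheritance_is_acyclic` are hypothetical, invented for this example and not taken from UML or any tool. The dataclass plays the role of a tiny domain model, and the function checks the "no inheritance cycles" constraint with a depth-first search.

```python
from dataclasses import dataclass, field


@dataclass
class ClassNode:
    """Hypothetical abstract-syntax concept: a class drawn as a box."""
    name: str
    parents: list[str] = field(default_factory=list)  # names of superclasses


def inheritance_is_acyclic(model: dict[str, ClassNode]) -> bool:
    """Well-formedness rule: the inheritance graph must not form cycles."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {name: WHITE for name in model}

    def visit(name: str) -> bool:
        colour[name] = GREY
        for parent in model[name].parents:
            if parent not in model:
                continue  # dangling reference: left to some other rule
            if colour[parent] == GREY:
                return False  # reached a node already on the path: a cycle
            if colour[parent] == WHITE and not visit(parent):
                return False
        colour[name] = BLACK
        return True

    return all(visit(n) for n in model if colour[n] == WHITE)


# Two small models: the first is well-formed, the second is not.
valid_model = {
    "Animal": ClassNode("Animal"),
    "Dog": ClassNode("Dog", ["Animal"]),
}
cyclic_model = {
    "A": ClassNode("A", ["B"]),
    "B": ClassNode("B", ["A"]),
}
```

The dataclass alone would happily accept `cyclic_model`; only the extra rule rejects it, which is why the model needs supplementing with constraints.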

Semantics is none of the above, although terms such as “static semantics” or “short-range semantics” have been used to refer to what I called abstract syntax. Semantics defines what valid expressions in the language mean. (Aside: saying “expressions in the language” is very easy when talking about textual languages, whereas thinking of a diagram as an expression may feel a little uncomfortable. You have to get used to it.) The way to give meaning to an expression is to map it to one or more other interesting structures whose meaning is known. To define semantics objectively, the mapping and the meaning of the result must both be precisely defined.

A mapping of the graphical language to statements in a natural language does not give objective semantics, although it may be useful. A precise mapping of the graphical language to a programming language gives objective semantics only to the extent that the programming language itself has them, which is a large extent for popular standardized languages. If such a mapping exists, it probably means that the diagram is a visualization of some aspect of the structure of the program. Another useful target for semantic mappings is expressions in some logical formalism, such as set theory, predicate logic or abstract algebra. Examples of diagrams with this kind of semantics are Venn diagrams and Petri nets. A final useful target for semantic mappings is another graphical language, which itself has well-defined semantics.
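
To make the Petri net case concrete, here is a minimal Python sketch of the standard firing rule for a net whose arcs all have weight one. The names `Transition`, `enabled` and `fire` are assumptions made for this example. A marking is modelled as a multiset of tokens per place, so the meaning of a diagram becomes a precisely defined function from one marking to the next.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """Hypothetical encoding of a transition box and its arcs."""
    name: str
    inputs: tuple[str, ...]   # places consumed from, one token per entry
    outputs: tuple[str, ...]  # places produced to, one token per entry


def enabled(marking: Counter, t: Transition) -> bool:
    """A transition is enabled if every input place holds enough tokens."""
    need = Counter(t.inputs)
    return all(marking[place] >= n for place, n in need.items())


def fire(marking: Counter, t: Transition) -> Counter:
    """Return the marking reached by firing t; the input is not modified."""
    if not enabled(marking, t):
        raise ValueError(f"transition {t.name!r} is not enabled")
    result = Counter(marking)
    result.subtract(Counter(t.inputs))  # consume tokens from input places
    result.update(t.outputs)            # produce tokens on output places
    return +result                      # unary plus drops places with zero tokens


# A two-place net: firing 'start' moves the single token from idle to busy.
start = Transition("start", inputs=("idle",), outputs=("busy",))
initial = Counter({"idle": 1})
after = fire(initial, start)
```

Nothing here depends on how the net is drawn. Two diagrams with different layouts but the same places, transitions and arcs map to the same functions, which is the sense in which the mapping gives the notation an objective meaning.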

You might observe that abstract syntax involves a mapping from expressions in the language to the structure {valid, invalid}, and therefore is a kind of degenerate semantics. I prefer to distinguish validity from meaning. You might also observe that it could be desirable to give some kind of meaning to invalid expressions. To do that, you’d have to introduce different kinds of validity.

I’m off to spend the holidays with my family. Merry Christmas!