Using the cognitive dimensions framework to design usable APIs


A few weeks ago, I posted some
details about the cognitive dimensions framework that we use at Microsoft when considering
API usability. Up to now, we’ve been using it primarily to describe the results of
studies done on APIs in our labs. What I’d really like to see though is different
API teams being able to use the framework as a means to design an API, rather than
evaluate it. Over the last couple of weeks I’ve been writing up a document that guides
API teams in evaluating whether or not their API design is progressing in the direction
they would like it to, with respect to the framework. I thought I would share
this with you. I’m hoping that you’ll be able to provide good feedback on this approach
so that I know whether or not this really is useful for teams designing an API.

The basic idea is to start the design of an API by describing the type of developer
that the API is designed for in terms of the cognitive dimensions framework. For example,
your customer might be a developer who prefers APIs that expose a set of aggregate
components, who prefers an API that supports a bottom-up learning style, and so on.
Then, for each scenario that the API is designed to support, write the code that
the developer will have to write using the API to implement that scenario. Note that
you can do this before you’re anywhere near ready to start implementing your API.
Just write out the code in a text editor – what you are doing is saying ‘When the
API is ready, here is the code that we would like developers to write when tackling
this scenario’. It’s sort of like designing a UI for a product before implementing
the product. Designing an API and a GUI in this manner (from the perspective of the
user) means that it is less likely that the implementation details of the product
will surface in either the GUI or the API.

Once you have a set of code samples, you can then start to evaluate them in terms
of the framework.

Let’s begin by looking at how we might evaluate the abstraction level dimension. We
start by doing a task analysis of each of the scenarios.

For each user goal that the API supports, describe the tasks that the user has to
implement to accomplish that goal. For example, in the System.IO namespace one goal
might be to append a line of text to a file. For such a goal, the list of tasks might
be:

  • Open the file with a given name
    • Create the file if it does not already exist
  • Write a line of text to the file
  • Close the file 

Note that a task analysis for an API describes the different actions that a developer
would expect to have to accomplish. It does not describe how the API supports those
actions. For example, one API might create a file automatically if you attempt
to open a file that does not exist, while another might force the developer to
create the file explicitly. Both support the same action from the developer’s
perspective; they just support it differently.

 

For each task that you describe in each goal, list the different API components that
are involved in accomplishing that task. For example, to open the file in append mode,
you might create an instance of the StreamWriter class and set the append parameter
to true in the StreamWriter constructor. The StreamWriter constructor will create
the file if it does not exist, so no extra action is needed to create the
file. To write to the file, you would then call StreamWriter.Write or StreamWriter.WriteLine
using the instance of the System.IO.StreamWriter class returned by the constructor.
You would then finish off by calling StreamWriter.Close.
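
One way to keep this kind of analysis in a form the team can review is to write it down as data. The sketch below is only an illustration in Python, not part of the framework: the `Goal` and `Task` types and the `classes_used` helper are names I have made up, and the component labels are informal strings taken from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (class name, member used) -- an informal label, not a real API reference
Component = Tuple[str, str]


@dataclass
class Task:
    description: str
    components: List[Component]  # API components involved in this task


@dataclass
class Goal:
    description: str
    tasks: List[Task] = field(default_factory=list)

    def classes_used(self) -> List[str]:
        """Distinct class names across all tasks, in first-use order."""
        seen: List[str] = []
        for task in self.tasks:
            for class_name, _member in task.components:
                if class_name not in seen:
                    seen.append(class_name)
        return seen


# The "append a line of text to a file" goal, as described in the text.
append_line = Goal(
    "Append a line of text to a file",
    [
        Task("Open the file with a given name",
             [("StreamWriter", "constructor, append = true")]),
        Task("Create the file if it does not already exist",
             [("StreamWriter", "constructor, append = true")]),
        Task("Write a line of text to the file",
             [("StreamWriter", "Write or WriteLine")]),
        Task("Close the file",
             [("StreamWriter", "Close")]),
    ],
)

for task in append_line.tasks:
    members = ", ".join(f"{cls}: {member}" for cls, member in task.components)
    print(f"{task.description} -> {members}")

print(append_line.classes_used())  # ['StreamWriter']
```

Keeping the tasks (what the developer expects to do) separate from the components (how the API supports it) is the point of the structure: the same task list could be filled in against a different API design and compared.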

 

Do a similar task analysis for each goal that the API supports. For example, an additional
goal for the System.IO classes might be to read a line of text from a file. In this
case, the tasks would require the use of the StreamReader class in order to open,
read from and close the text file.

 

Having described the different tasks that the user has to accomplish to achieve each
goal and the different components that the user has to use to implement each task,
you can now describe the nature of the components in the following terms.

 

  • If individual tasks require two or more components to be used in conjunction, the
    components are considered primitives. In other words, individual components exposed
    by the API do not map on to unique user tasks.
  • If each individual goal requires the use of only one component, but the set of goals
    requires different components, the components are described as factored components.
  • If the set of goals could all be accomplished with the same set of components, the
    components are described as aggregate components.
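
The three rules above can be sketched as a small function. This is a rough Python illustration under my own assumptions rather than a definitive procedure: I treat each class as one component, I check the rules in the order listed, and I return "unclassified" for designs that none of the three rules covers. The `classify` name and the shape of the input are hypothetical.

```python
from typing import Dict, List

# goal -> task -> components (here, class names) needed for that task
Analysis = Dict[str, Dict[str, List[str]]]


def classify(analysis: Analysis) -> str:
    """Apply the three rules to a task analysis and name the component type."""
    if not analysis:
        raise ValueError("the analysis must contain at least one goal")

    # Rule 1: some single task needs two or more components in conjunction.
    for tasks in analysis.values():
        for components in tasks.values():
            if len(set(components)) >= 2:
                return "primitive"

    # Collect the set of components each goal needs across all of its tasks.
    per_goal = [
        frozenset(c for components in tasks.values() for c in components)
        for tasks in analysis.values()
    ]

    # Rule 3: every goal is accomplished with the same set of components.
    if len(set(per_goal)) == 1:
        return "aggregate"

    # Rule 2: each goal needs only one component, but goals differ.
    if all(len(components) == 1 for components in per_goal):
        return "factored"

    # Assumption: the rules as stated do not cover this case.
    return "unclassified"


# The two System.IO goals discussed in the text.
system_io = {
    "append a line of text to a file": {
        "open the file, creating it if needed": ["StreamWriter"],
        "write a line of text": ["StreamWriter"],
        "close the file": ["StreamWriter"],
    },
    "read a line of text from a file": {
        "open the file": ["StreamReader"],
        "read a line of text": ["StreamReader"],
        "close the file": ["StreamReader"],
    },
}

print(classify(system_io))  # factored
```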

Given the above definition, the System.IO classes are best described as factored components
since each goal requires the use of different components (StreamWriter to write to
a file and StreamReader to read from a file). Each class is factored to a particular
goal, but different components are required to achieve different goals.

Do the above analysis for each scenario that your API supports. If each scenario that
your API supports is likely to be accomplished by the same type of user, you should
attempt to ensure that each scenario exposes the same type of abstractions. If each
scenario that your API supports is likely to be accomplished by different users, you
should attempt to ensure that each scenario exposes the type of abstractions that
those users are most likely to be successful and comfortable with.

Given the above analysis, if the target customer for System.IO prefers to work with factored
components, this indicates that the design will satisfy them with respect to this
dimension.

We’d complete the analysis by looking at the other dimensions in turn. I’ll describe how
to analyze the remaining dimensions in later postings.

Comments (5)

  1. Frank Hileman says:

    I have one quibble: the distinction between "goal" and "task" seems completely arbitrary. They are one and the same; the "goal" is a task, and a "task" is a sub-task. These can nest to any level. Once this distinction is removed, the difference between a "primitive" and any other component is hard to see. The term "primitive" might be better used to describe the lowest level operation in the API.

    These are great ideas! Years ago I worked on a development team for a public, widely used API in C. We decided that all new public additions to the API had to have preliminary documentation written before the API itself was developed. The documentation had to include samples of the most common uses.

    We found that the process of writing the documentation and samples often caused radical changes to the design of the API. Since these changes occurred before a line of code was written, they happened at the cheapest point in the development process. The rising popularity of "Test before development" is achieving some similar benefits, by writing tests using the API before the API is developed.

    The "doc before development" order arose because the more traditional "develop then document" order often caused severe redevelopment costs and unpredictable deadlines. Developers think they know precisely the requirements for an enhancement, when in fact, the requirements are not really known in detail until the end user documentation is created. Giving developers a role in documentation also forced them to see how difficult it might be to explain the "monster" they were creating. They could see the "monster" for what it really was, from the user’s perspective.

  2. Steven Clarke says:

    Your point about the definition of a task is well taken Frank. As you may know, it’s a well debated topic amongst usability folks also. However, my aim in using the term in this context was to focus on the user’s perspective when thinking about the design of an API. A user has a particular goal or state that they want to achieve and a set of tasks or operations that they think they need to perform to achieve that state. The difference is in describing a goal as the final state that a user wants to achieve and tasks as the way that they expect to achieve that state (which as you say, will involve reaching some intermediate state).

    When thought of in this way, it’s harder to describe a set of tasks as a set of goals in and of themselves since it’s less likely that a user would describe a task as a final state that they would want to achieve. For example, it’s more likely that a user will open a file in order to do something with the contents of that file. And many users would expect that in order to read from a file, it needs to be opened first. It then follows that the abstraction level of an API can be measured with respect to the user, rather than the API itself. One user’s aggregate component could be another user’s primitive – it all depends on how the user would describe the tasks that they think they need to perform to achieve a particular goal.

    Writing the documentation first is a great way to see the API from the user’s perspective as you describe. What we are trying to do with the framework is to measure what we are seeing and to use these measurements to direct API teams to design something that will work well for our users.

  3. Parallel programming is difficult. No surprises there really. I came across a great slide deck discussing