The Semantic Gap

There are 2 worlds:

  1. The world as we think about it.
  2. The world as we can manipulate it.

The difference between these two is what is called the semantic gap.

Our industry has been struggling with the semantic gap for decades. An excellent example of the semantic gap is provided by the TechProsaic blog entry "VMWare Perl Toolkit verses Powershell V1 ToolKit", which shows the 20 lines of Perl required to do the same thing as the Get-VM cmdlet.

Someone could read this blog and walk away thinking, "PowerShell is great and Perl is crap." They would be both right and wrong. PowerShell is great but Perl is not crap. (Hats off to superstar Larry Wall and Perl; very few people and technologies have had the level of (positive :-) ) impact these 2 have had on the industry. The world is a better place because that guy was born!) The difference between the 2 examples is the semantic gap. The PowerShell example has a very small gap between what you think and what you type. The Perl example has a very large gap.

At the end of the day, the semantic gap is "owned" by the people that provide the instrumentation. VMWare could just as easily have provided a PowerShell script that took just as many lines as the Perl example, or they could have provided a Perl library or script that provides the semantics of the Get-VM cmdlet.
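
To make that concrete, here is a minimal sketch in Python (chosen only for illustration; it is neither the Perl nor the PowerShell from the blog entry). Everything in it is an assumption: the RAW_INVENTORY records, the VM class and the get_vm function are hypothetical stand-ins for what an instrumentation provider might expose. It shows the same data reached two ways: by walking raw records yourself, or through a single task-oriented command that the provider wrote once.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    power_state: str


# Hypothetical raw instrumentation: generic records the caller must
# traverse and filter by hand (large semantic gap).
RAW_INVENTORY = [
    {"type": "VirtualMachine",
     "props": {"name": "web01", "runtime.powerState": "poweredOn"}},
    {"type": "HostSystem",
     "props": {"name": "esx01"}},
    {"type": "VirtualMachine",
     "props": {"name": "db01", "runtime.powerState": "poweredOff"}},
]


def get_vm(name=None, inventory=RAW_INVENTORY):
    """Hypothetical high-level command: the provider hides the traversal
    so the caller only says what they want (small semantic gap)."""
    vms = []
    for record in inventory:
        if record["type"] != "VirtualMachine":
            continue  # skip hosts and anything else that is not a VM
        props = record["props"]
        vm = VM(props["name"], props["runtime.powerState"])
        if name is None or vm.name == name:
            vms.append(vm)
    return vms


if __name__ == "__main__":
    for vm in get_vm():
        print(vm.name, vm.power_state)
```

The loop inside get_vm is roughly the work every caller would otherwise repeat. Whoever owns the instrumentation decides whether that work is done once, by them, or over and over, by each user.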

So why do instrumentation providers close or not close the semantic gap?  Ahh – there’s the question!

I might be wrong on this point but I think that this is just an example of Maslow’s hierarchy of needs. Technology used to be REALLY REALLY hard; getting ANYTHING to work was a major accomplishment. If you got your code to actually do something useful, your first reaction was not to go back and put in good error messages, it was to go to the bar and brag to your geek posse. The quintessential example of this is the "ed" text editor, which has a single error message: "?". (Make fun of it if you will but it was ahead of its time in terms of internationalization/localization :-) ). Once you get things working, then you can worry about things like good error messages. Later you can worry about higher order functions and user experience.

PowerShell has the advantage of standing on the shoulders of giants. Some of those giants are conceptual, like Bash, C#, Perl, Tcl, VMS/DCL, AS400 CL, etc. But what really made a difference is that PowerShell was able to stand on the shoulders of CODE: .NET, XML, WMI, ADSI, ADO, etc. Because we could assume that all that wonderful stuff was there, we could move up Maslow’s hierarchy and focus our energies on closing the Semantic Gap. Obviously our goal in this is to allow our customers to stand on our shoulders and close the semantic gap between what we provide and the business problems that they face.

I believe that it is metaphysically impossible for the CDs that we (any vendor) ship to directly solve business problems.  Every business has its own environment, philosophy, politics, history, personality, etc.  Every business needs to take what vendors deliver and adapt it (through actions, operations or scripts) to meet their needs (to close THEIR semantic gap).  My vision of success for PowerShell is that it will provide the easiest, fastest, cheapest, most reliable mechanism for customers to take whatever vendors ship them and adapt it to meet their unique needs.  Those needs change over time so PowerShell must allow a solution that is easy, cheap and safe to understand and maintain.

But back to the point: it is the instrumentation providers that close the semantic gap; PowerShell itself provides very limited instrumentation. That said, it plays a huge role in the process. We do that through advocacy and design. PowerShell has a clear vision of the right user experience, where users can easily compose high-level, task-oriented abstractions. We stress this vision to the instrumentation suppliers and provide lots of guidance for how to do things and what words to use for verbs, parameters, noun-schema, etc. The design of PowerShell strongly encourages instrumentation suppliers to:

    • Use common Verb naming
    • Leverage the single common command line parser
    • Support common semantics like -WHATIF, -CONFIRM, -VERBOSE, -DEBUG
    • Leverage common utilities for sorting, filtering, manipulation, formatting, import/export, etc.
    • Allow both ISA and HASA composition models
    • Leverage common parameter validation logic to produce consistent error messages
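
The "common semantics" item is the easiest one to sketch. The Python below is only an analogy, not how PowerShell implements -WHATIF: the supports_what_if decorator, the stop_service command and the "What if:" message format are all hypothetical names made up for this illustration. The point it tries to show is that when the shared layer owns the flag, every command gets the same behavior without each provider re-implementing it.

```python
import functools


def supports_what_if(describe):
    """Give a command a uniform what_if flag.

    `describe` is a hypothetical callback that returns a human-readable
    description of what the command would do with the given arguments.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, what_if=False, **kwargs):
            if what_if:
                # Report the intended action and do NOT run the command.
                return "What if: " + describe(*args, **kwargs)
            return func(*args, **kwargs)
        return wrapper
    return decorator


stopped = []  # stands in for real side effects


@supports_what_if(lambda name: f"stopping service {name}")
def stop_service(name):
    stopped.append(name)
    return f"stopped {name}"


if __name__ == "__main__":
    print(stop_service("spooler", what_if=True))  # nothing is stopped
    print(stop_service("spooler"))                # actually runs
```

The command author writes only stop_service and a one-line description; the dry-run behavior comes from the shared wrapper, so it looks the same across every command that uses it.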

I firmly believe that economics determine what people do and don’t do, so PowerShell is designed from the ground up to make composable, high-level, task-oriented abstractions the cheapest things to produce and support. In other words, PowerShell and .NET deal with the low-level headaches and minutiae of Maslow’s hierarchy, which frees up instrumentation providers to focus on the higher levels. This is certainly borne out by the experiences of those teams that write PowerShell cmdlets. The consistent feedback we hear is:

  1. WOW! … writing cmdlets is really easy.
  2. Most of the time is spent on thinking through the design of the User Experience.

There is a lot more to this semantic gap discussion, and when you get this in focus, a bunch of the stuff we do in PowerShell begins to make sense. It is why we have an adaptive type system (because it allows you to think about the data you want and not the prison that data is locked in [think XML hell]). It is also why we have both an ISA and a HASA composition model and why we are fanatical about data coercion, so that things "just work" and you don’t have to spend your energy dealing with data impedances.
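
As a loose analogy for the adaptive type system idea (and only an analogy; this is not PowerShell's mechanism), the Python sketch below wraps an XML element so that child elements read like ordinary properties. The Adapted class and the sample document are hypothetical, invented for this illustration; the only real library used is the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


class Adapted:
    """Hypothetical adapter: expose child XML elements as attributes."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so _element
        # itself is found the usual way.
        child = self._element.find(name)
        if child is None:
            raise AttributeError(name)
        # Elements with children are wrapped again; leaves yield their text.
        return Adapted(child) if len(child) else child.text


xml_text = "<vm><name>web01</name><host><name>esx01</name></host></vm>"
vm = Adapted(ET.fromstring(xml_text))

if __name__ == "__main__":
    print(vm.name)       # the data you want...
    print(vm.host.name)  # ...without walking the XML tree yourself
```

The caller writes vm.host.name and thinks about the data; the adapter deals with the container the data happens to be locked in.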

There is a lot more to say on this topic but I think that is enough for a Sunday afternoon. I’m going to go get some coffee.

Cheers!

Jeffrey Snover [MSFT]
Windows Management Partner Architect
Visit the Windows PowerShell Team blog at:    http://blogs.msdn.com/PowerShell
Visit the Windows PowerShell ScriptCenter at:  http://www.microsoft.com/technet/scriptcenter/hubs/msh.mspx