Formatting intro

Anson and I were discussing formatting last night (at around 1 am).  He’d received feedback from some customers about the new formatting engine Kevin has written for Whidbey.  The issue is that the feature (like many others added in Whidbey) tends to be very aggressive.  Unlike VS2003, which only really affected indentation and curly-brace placement, the new formatter goes after all whitespace (except inside comments and strings) and tries to figure out the right amount of space each run should actually take up.  So one can think of the formatter as simply taking a list of tokens and a function from whitespace to whitespace and producing the updated token list.  In other words you could write it as:

# type token = Whitespace of string | Identifier (* plus other tokens *);;
type token = Whitespace of string | Identifier
# let rec format token_list f =
    match token_list with
        [] -> []
      | h::t ->
          (match h with
              Whitespace(_) -> (f h)
            | _ -> h)::(format t f);;
val format : token list -> (token -> token) -> token list = <fun>
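
As a rough sketch of how this behaves, the same function can be written with List.map and run on a small token list. The helper name collapse and the sample tokens are made up for illustration; the token type is the one from the session above.

```ocaml
type token = Whitespace of string | Identifier (* plus other tokens *)

(* Same behaviour as the recursive version: apply f to whitespace
   tokens only and pass every other token through untouched. *)
let format token_list f =
  List.map (function Whitespace _ as w -> f w | other -> other) token_list

(* Hypothetical whitespace rule: replace any run with a single space. *)
let collapse = function Whitespace _ -> Whitespace " " | other -> other

let tokens = [Identifier; Whitespace "   "; Identifier; Whitespace "\t"]

let formatted = format tokens collapse
(* formatted = [Identifier; Whitespace " "; Identifier; Whitespace " "] *)
```

Note that f only ever sees the whitespace token itself, never its neighbours.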

That’s basically it.  We can extend this slightly further to deal with the grammatical (AST) structure of code, but that’s pretty trivial to do with functors.  However, the real difficulty comes in defining the function f.  This opens up a big can of worms.  In Whidbey we’ve taken the route of supplying some basic functions for you.  For example:

# let clear w = match w with Whitespace (_) -> Whitespace ("") | _ -> w;;

This removes the whitespace entirely, which you might see when formatting

“if (” into “if(”


# let trim w = match w with Whitespace (_) -> Whitespace (" ") | _ -> w;;

This reduces a run of whitespace to a single space.
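
To see the two functions side by side, here is a minimal sketch that runs both over the tokens for “if (”. The Keyword and Punct constructors and the render helper are assumptions standing in for the “other tokens” left out above; they are not part of the real engine.

```ocaml
(* Hypothetical token type: Keyword and Punct stand in for the
   other tokens the post leaves out. *)
type token = Whitespace of string | Keyword of string | Punct of string

(* Apply f to whitespace tokens only. *)
let format token_list f =
  List.map (function Whitespace _ as w -> f w | other -> other) token_list

let clear w = match w with Whitespace _ -> Whitespace "" | _ -> w
let trim w = match w with Whitespace _ -> Whitespace " " | _ -> w

(* Hypothetical helper: turn a token list back into source text. *)
let render token_list =
  String.concat ""
    (List.map (function Whitespace s | Keyword s | Punct s -> s) token_list)

let source = [Keyword "if"; Whitespace "   "; Punct "("]

let cleared = render (format source clear)  (* "if(" *)
let trimmed = render (format source trim)   (* "if (" *)
```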

There are also functions for dealing with newlines and indentation, but for the most part that’s all we’ve provided.  The issue is that this isn’t a very rich system.  Because we’ve defined all the modifications ourselves, people cannot define their own way of formatting whitespace.  For example, you cannot say “I want 2 spaces between ‘class’ and the name of the class”.  The next post will deal with our thoughts on how to make this better.
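
To make the “2 spaces after class” example concrete, here is one way such a rule could be expressed if the whitespace function were also handed the preceding token. This is purely an illustrative sketch, not how Whidbey works and not necessarily what we’ll end up proposing; format_with_context, two_after_class, and the Keyword constructor are all hypothetical.

```ocaml
(* Hypothetical token type with payloads so rules can inspect context. *)
type token = Whitespace of string | Keyword of string | Identifier of string

(* Hypothetical variant of format: the rule receives the token that
   precedes the whitespace (None at the start of the list). *)
let format_with_context token_list rule =
  let rec go prev = function
    | [] -> []
    | (Whitespace _ as w) :: rest -> rule prev w :: go (Some w) rest
    | tok :: rest -> tok :: go (Some tok) rest
  in
  go None token_list

(* User-defined rule: two spaces after "class", one space elsewhere. *)
let two_after_class prev w =
  match prev, w with
  | Some (Keyword "class"), Whitespace _ -> Whitespace "  "
  | _, Whitespace _ -> Whitespace " "
  | _, other -> other

let tokens =
  [Keyword "public"; Whitespace "   "; Keyword "class"; Whitespace " ";
   Identifier "Foo"]

let formatted = format_with_context tokens two_after_class
(* formatted = [Keyword "public"; Whitespace " "; Keyword "class";
                Whitespace "  "; Identifier "Foo"] *)
```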

Comments (8)

  1. Dave Thomas says:

    Some developers make entire languages out of whitespace…

    I think developers can be very picky about how code is formatted. Some of my peers write excellent code, but it looks ugly to me because of the formatting, i.e. in this case the developer does not put white space between operations

    (int x=24;) rather than (int x = 24;)

  2. Michael Carr says:

    If there’s ONE feature I’d like to have, it’s the ability to COMPLETELY TURN OFF the automatic formatting in Visual Studio 2003. Every time I turn around Visual Studio has mangled my HTML beyond recognition. My favorite is when VS2003 removes the whitespace between the end of a server control and the following literal text, therefore making the two run together on the rendered page. I put the space back, but VS2003 pulls it back out later at some random time when I’m not looking. I finally gave up and just let my website look goofy with missing spaces…

    How about a Service Pack for VS2003 that allows me to disable automatic formatting altogether??

  3. Toshiyuki Ichikawa says:

    Is the actual engine written in OCaml?

    Just curious. 🙂

  4. Brian Schkerke says:

    There’s a post in the newsgroups (Google Groups) where Microsoft support says that the .NET 1.1 team tried to enable a way to remove the automatic formatting but that the formatting was too deeply integrated into the engine.

    The Visual Studio reformatting of HTML is why I no longer use the graphical design mode while working with the IDE. It limits me in painful ways but reformatting a few thousand lines of HTML so I can make sense of my code isn’t worth it.

    Please don’t release any future IDEs without the ability to completely specify how a rule behaves, or whether or not that behavior is active. I’m a huge fan of whitespace — I believe it is underused, often by the same folks who think that combining five or six operations in one line of code makes their program super efficient. To be forced to format spacing in one way or another would be horrible.

    Disclaimer: I know this post specifically addresses the problem of not being able to define your own rules, and that the next post will discuss solutions to that. I still like to voice my opinion. 🙂

  5. Michael: In Whidbey you can disable all automatic formatting of code.

  6. Brian: I’ll send your thoughts to our PM here who will know how to get that to the right ears. We may even be able to find a blog where you can talk directly to someone responsible for this area.

  7. Toshiyuki: Unfortunately no. However, there’s always hope for the future. F# is a good way to bridge the gap between OCaml and the managed world.

  8. Brian: Scott Guthrie blogs about ASP.NET and I’m sure he’d want to hear this feedback. I also know from Tech-Ed that the non-formatting of html code was something they made sure to do for VS2k5.

    His blog is at: