Where extensible formatting breaks down


Imagine the following bit of code:


public void ImFeelingStandardToday(int i, int j, int k)
{
}

        public void ImFeelingWhimsicalToday
            (
                int i,
                int j, int l,
                int k
            )
                {
                }


The first is the method as formatted by the standard rules we have in the formatting engine.  The second is a style of formatting that is nearly impossible to express with any limited set of formatting functions.  With an extensible model it might be possible to express rules that would make the second happen, but it would be very difficult to express in a rule-based system the concepts of Standard versus Whimsical.  You might as well have rules based on:



  1. Am I coding during the day?

  2. Am I coding with the sun in front of me or behind me?

  3. Is it my birthday?

  4. etc.

So once you have some sort of model that allows rich rules, you also need the ability to take a section of code and say “under no circumstances shall you format this code.  This code carries a certain undefinable beauty in its current form and you shall not touch it”.


Unfortunately, it’s not clear how one would go about doing this.  You could use #regions, or specialized comments //*, but those end up changing your code, thus potentially affecting its ‘being’ (for lack of a better word).
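To make the specialized-comment option concrete, here is a minimal sketch in Python (picked only for brevity; none of this is the real formatting engine). The markers `// format:off` and `// format:on` and the `toy_format` function are hypothetical stand-ins I made up for illustration. Everything outside a marked region gets "formatted"; everything inside is passed through untouched.

```python
# Hypothetical marker comments; these are assumptions, not real VS directives.
OFF_MARKER = "// format:off"
ON_MARKER = "// format:on"


def toy_format(line: str) -> str:
    """Stand-in for a real formatter: drops indentation and trailing whitespace."""
    return line.strip()


def format_source(source: str) -> str:
    """Format every line except those between the off/on markers."""
    out = []
    protected = False
    for line in source.split("\n"):
        stripped = line.strip()
        if stripped == OFF_MARKER:
            protected = True
            out.append(line)   # the marker line itself is left alone
        elif stripped == ON_MARKER:
            protected = False
            out.append(line)
        elif protected:
            out.append(line)   # "you shall not touch it"
        else:
            out.append(toy_format(line))
    return "\n".join(out)
```

Note that the two marker lines now live in the source file itself, which is exactly the drawback described above.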


You could make it an editor feature where you select a region of code and that’s remembered in some special file (like breakpoints, etc).  However, is that really good enough?  What if you transfer that file to someone else?  You don’t want them mucking around with that formatting.  So we need it to not actually touch the code, but still travel around with the file.  How to do that, how to do that…


How about WinFS metadata?  Or NTFS streams?  We could store this additional structured information with the code, indicating our understanding of it at a higher level than just tokens, without affecting your code at all.  We could even try to be pretty smart about edits that happened outside of VS.  Say you modify the file in Notepad.  A quick hash verification of the source would reveal that things were not the same anymore, and we could prompt you to ask whether you wanted to preserve this structured data.  Any thoughts on this?
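Here is a rough sketch of the hash check, again in Python purely for brevity. It assumes the metadata is a small JSON blob holding a SHA-256 hash of the source plus a list of protected line ranges; the function names and the JSON layout are my own invention for illustration, and where the blob actually lives (an NTFS stream, WinFS, somewhere else) is deliberately left out.

```python
import hashlib
import json


def content_hash(source: str) -> str:
    """SHA-256 of the source text, used to detect edits made outside the editor."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def make_metadata(source: str, protected_ranges) -> str:
    """Serialize the protected (first_line, last_line) ranges with the source hash."""
    return json.dumps({
        "sha256": content_hash(source),
        "protected": [list(r) for r in protected_ranges],
    })


def load_protected_ranges(source: str, metadata_json: str):
    """Return the protected ranges if the hash still matches, else None.

    None means the file changed behind our back, so the caller should
    prompt the user instead of trusting the stored line ranges.
    """
    meta = json.loads(metadata_json)
    if meta["sha256"] != content_hash(source):
        return None
    return [tuple(r) for r in meta["protected"]]
```

If the file is saved from the editor, the metadata is regenerated with `make_metadata`. If it was touched in Notepad, `load_protected_ranges` returns `None` and that is the point where the prompt would appear.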


Comments (9)

  1. What I really want is "stylesheets" for source code so different users can see different formatting based on their preferences, but the actual text written to the source file is formatted by a different set of settings (so say, a team could have their source code formatting standard, but then individual team members get to see source how they want).

  2. Brian Schkerke says:

    Stylesheets would work in some cases but what about those where you simply don’t have a standard? I like to use tabs to line up variable declarations at the beginning of a class — how could I specify this using a stylesheet?

    Frivolous? Maybe, but it’s my style and I find it extremely easy to read and debug. I think any hard and fast rules should be avoided, period. Making the engine extensible and definable is great, but make the engine turn offable or, even better, replaceable. (Hm. Ship the IDE with a few various engines. One representing VS .NET 2003, one representing the new parser, and one as a sample to use for defining your own. Now you can ship using whatever rules you want as a default but you allow for others to replace it at whim. This also creates yet another market for third party components.)

    My programmers don’t know how to use CSS with HTML (we work with C# and WinForms); I shudder to think of trying to teach them CSS.

  3. I don’t think NTFS streams would work, since source control systems probably wouldn’t handle it.

    A special comment could work. I suppose the question would be if this comment would be human-readable or contain serialized information for the IDE. If it contained serialized info, it’d most likely be more compact, and thus the comment could be at the top (or bottom) out of everyone’s way.

  4. Erik: I agree, however there are issues with that model (which can be addressed). Say we stick with your model and each developer sees the code as they want it, but it actually gets saved in some company standard way. Then what happens when you try to debug? The line/column numbers don’t match up between the source file you have and the original source file that the dll was built against. Similarly, say you want to see how your file ‘diffs’ against a previous version. If you change your formatting then it will probably look like every line is different.

    In order to accommodate your (transformed) model, all toolsets need to be upgraded to understand this and to properly take the transformation into account.

    Note: I think this is a very good idea and it’s something I’ve wanted as well. Unfortunately it doesn’t look like you can do it without updating a whole lot of tools. And I don’t think anyone has made the justification yet that it’s worth the cost just so that formatting works out well.

    On another note: C# in ASP.NET code is handled in a similar (but non-extensible) manner. I.e. a transformation happens behind the scenes so that a hidden C# file can have portions of it viewable in web snippets. Unfortunately, the architecture is such that this is not transparent to the tools, and everything needed to be rewritten to check "are we in a web snippet or not". By not having a transparent model we’ve introduced enormous cost to our process. If we rearchitect that in the future so that tools don’t need to care about the transformation (i.e. it gets handled behind the scenes for them in one centralized location) then it’s quite possible that what you’re asking for will become incredibly easy.

  5. Brian: Agreed, stylesheet would only be for token level formatting. We’d need a parse tree based extensible formatting system for what you’re describing and I’ll discuss that later 🙂

    As to CSS: My idea is to hide that from the user for the most part. That’s all deep in the guts of the system. We would allow fine grained and coarse grained control of this. I.e. there would be a UI (similar to today’s) which allowed you to make broad formatting changes. Those broad changes would eventually filter down to small changes to the styles (which very few people would ever need to be concerned about). However, those who found the broad options too broad could always learn a bit more and tweak the underlying representation.

    Different engines would also work well. What would you recommend the interface to them be?

  6. Michael: You need a better source control system 🙂

    Note: Longhorn is going to push metadata heavily through WinFS. So the concept of data beyond just the main stream contents is something that other tools are going to need to recognize and support.

    A specialized comment violates one of the things I mentioned before, which is basically that it adds tokens to (and therefore changes the meaning of) the code you’ve written. I consider comments part of the code I write and I don’t want to clutter things up by adding machine meta-instructions in my code.

  7. Well, if MSFT would make a decent source control product, maybe it wouldn’t be an issue :).

    Sure, a comment changes your code. However, in view of the overall file, does it really matter? A line at the top/bottom barely affects things, and an editor could hide that line (if it were the last) without any ill effects.

  8. Approaching source code formatting from a slightly different angle, do you know if there is any appetite within the framework team to work on the CodeDOM namespace to allow a developer to get more fine-grained control over emitted code? The code generated from the code object tree passed to the GenerateCodeFromCompileUnit method is created in one fell swoop, with no interaction with the calling app. I’d be interested in a mechanism that raised events as code was being generated to do, among other things, custom formatting of source.