Enforcing patterns at the compiler level

(in case you’re wondering, right now I’m cherry-picking a few comments people have asked me to write on. I’ll get to them in order soon…)

Jonathan Crossland asked

.. your thoughts on enforcing patterns at the compiler level.

As an example:
– excluding the public scope from field declarations, making them private by default and only private.
– declaring a field as public appears as a compiler warning (level x)

or an abstract example
– putting accessors (constructor like) on the object level, so that we can code against someone setting an instance to another as in
myobj = yourobj (yourobj fires a get, myobj fires a set)

We haven’t talked about this at length in the design meetings I’ve been at, but there is a big gap in my attendance.

Given that we already have an extensible tool like FxCop available, I don’t think it makes sense for each compiler to do work that would duplicate FxCop’s features. I’m fairly sure that both of your first examples can be done in FxCop now.
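To give a feel for what a "no public fields" check looks like when done after compilation, here is a minimal sketch. It does not use FxCop's rule API; it swaps in plain reflection over a compiled type, and the names Good, Bad and PublicFieldCheck are made up for the illustration.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical sample types, used only to exercise the check.
public class Good
{
    private int count;
    public Good(int count) { this.count = count; }
    public int Count { get { return count; } }
}

public class Bad
{
    public int Count;            // should be reported
    public const int Max = 10;   // constants are skipped by the check
}

public static class PublicFieldCheck
{
    // Returns "Type.Field" for every public, non-constant field declared on the type.
    public static string[] FindPublicFields(Type type)
    {
        const BindingFlags flags = BindingFlags.Public | BindingFlags.Instance |
                                   BindingFlags.Static | BindingFlags.DeclaredOnly;
        return type.GetFields(flags)
                   .Where(f => !f.IsLiteral)   // IsLiteral is true for const fields
                   .Select(f => type.Name + "." + f.Name)
                   .ToArray();
    }
}
```

Calling PublicFieldCheck.FindPublicFields(typeof(Bad)) reports Bad.Count, while the same call on typeof(Good) reports nothing. A real rule would walk every type in an assembly and would probably want to skip enums and compiler-generated types.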

The abstract one could either be done by post-compile IL analysis, or it could be done by a compiler. Doing so in the compiler would be difficult to do in a general way, and our compiler architecture doesn’t really lend itself to that sort of thing.

Philosophically, it would be nice to have some way to leverage the knowledge that the compiler has about the code, but we don’t currently have any plans in that area.

Jonathan, can you comment on why you want to be able to detect reference assignments? I’d like to understand the scenario better.

Comments (10)

  1. Jonathan Crossland says:

    Hi Eric

    Firstly, you are correct that FxCop can take care of the patterns for us.

    Over time though, the level gets a little higher or lower, whichever way you look at it :), and certain things make it down to the compiler level. ///<summary> as another example?

    With regards to the assignments – if, as you say, the current compiler architecture doesn’t warrant this kind of new functionality, then so be it. It is more of a purity or nicety, I guess.

    At the moment, an assignment is a pointer copy, and it would be nice to override that behaviour in the case of wanting separate instances.

    For example, I have an existing complex object and would like to get a deep copy. I would have to serialize it to a memory stream and deserialize it back.

    but would it not be cooler to override the assignment?

    With knowledge of assignment, one could also intervene and alter something from within the object itself. Imagine an object contains a property holding a unique key of some kind (perhaps mapped to database key)

    Now if serialization is done, I have to alter the property on the client side after assignment takes place which is not good. However with accessors, I would be able to change it inside the object, thus keeping business logic inside the object.

    The only way to stay inside the object is to provide a method like GetCopy or implement known interfaces.
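As a rough illustration of the GetCopy route, here is a minimal sketch in which the object makes its own deep copy and gives the copy a fresh key, so that rule stays inside the class. The type Order and its members Key and Lines are assumptions invented for the example; ICloneable stands in for the "known interfaces".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type; Order, Key, Lines and GetCopy are illustrative names.
public class Order : ICloneable
{
    public Guid Key { get; private set; }
    public List<string> Lines { get; private set; }

    public Order()
    {
        Key = Guid.NewGuid();          // every instance gets its own key
        Lines = new List<string>();
    }

    // Deep copy: the copy receives a new Key because it goes through the constructor.
    public Order GetCopy()
    {
        Order copy = new Order();
        copy.Lines.AddRange(Lines);    // strings are immutable, so copying the list is enough here
        return copy;
    }

    // Explicit interface implementation so callers using ICloneable get the same behaviour.
    object ICloneable.Clone()
    {
        return GetCopy();
    }
}
```

With this in place, `Order b = a;` is still a plain reference copy, while `Order c = a.GetCopy();` yields a separate instance with its own key and its own list. The caller has to ask for the copy explicitly, which is exactly the gap an assignment accessor would close.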

    Also having accessors to assignments means that the ‘design’ is consistent with everything else in our code. A Property Gets and Sets, why not an object?

    There was another reason, which I can’t think of at the moment. :)


  2. For the record, I don’t believe that all patterns could or should be enforced at the compiler level. It’s just nice to blog about :)

    I remembered the other reason,

    Currently you have to derive from ContextBoundObject to get into the object; if we could replace a real object with a TransparentProxy, there could be some great flexibility. I was playing around with the concept of dynamically compiling a method behind a method call using Context, Reflection and Remoting (http://www.agileopensource.net), which brought me to this particular example, where I wanted to be able to link in an object (proxy) determined at run-time within an assignment accessor, in aid of things like dynamic compilation, remoting and so on.


  3. Jonathan,

    Surprisingly, your proxy example may be one of the few actually acceptable reasons to use assignment overloading I’ve seen. I’ve been arguing against it for some time now, mainly because I don’t care for some of the side effects (people can just cause too much trouble). I am curious about your specific intents: how should this react with nulls? Or how should it behave as far as calling new or return values, as opposed to ref1 = ref2?

  4. Yes, all this is theoretical, and the conversation only came up as I thought Eric could conduct the thinking about it for a blog entry, not me. :)

    Ok so now I am forced to do some thinking for myself.

    Provided that you take my words here as not very technically thought through, we will be just fine. ;)

    There are many ways (as an example based on explicit operator code or something like that), but I will talk around a constructor like approach.

    The bottom line to this approach is value semantics vs. reference semantics. Value and reference semantics ask, and then deal with, different questions and implementations. At the root, I would imagine reference semantics behave as they do for simplicity’s sake, and could be altered to behave as value semantics (to a certain degree). The one approach is to open this can of worms.
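For readers who want the distinction spelled out, here is a minimal sketch of how assignment behaves today for a struct (value semantics) and a class (reference semantics). The type and method names are invented for the illustration.

```csharp
// Hypothetical types: same shape, one a struct and one a class.
public struct PointValue
{
    public int X { get; set; }
}

public class PointRef
{
    public int X { get; set; }
}

public static class AssignmentDemo
{
    // Struct assignment copies the data, so changing b leaves a untouched.
    public static int StructAssignment()
    {
        PointValue a = new PointValue();
        a.X = 1;
        PointValue b = a;   // copies the whole value
        b.X = 2;
        return a.X;         // still 1
    }

    // Class assignment copies only the reference, so a and b are the same object.
    public static int ClassAssignment()
    {
        PointRef a = new PointRef();
        a.X = 1;
        PointRef b = a;     // copies the reference only
        b.X = 2;
        return a.X;         // now 2
    }
}
```

An assignment accessor on a class would, in effect, let the type choose the first behaviour (or something in between) at the point where the second one happens now.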

    Now to your specific question about *nulls*, new and *reference copy* with regards to this approach.

    Nulls are simple – if unallocated (no static reference or reference in the activation record), then null is null. No accessor, no event, nothing fired, hurt or harmed.

    The new keyword fires the .cctor and .ctor and allocates the class object. This remains as is. Nothing affects this.

    The reference copy is where the work is required. When an assignment occurs, (I would imagine) a combination of instructions such as ldind.ref, isinst, castclass and others manipulate the pointers and create our current way of doing things. An injection of code is required here to fire a .get and .set, which is handled pretty much like the instance constructor (.ctor), as in having ‘specialname’ etc. The runtime would call the .get or .set depending on the direction of the assignment. The result would not be void as in the constructor’s case, but rather any of the valid types within its structure. On return, isinst, castclass, ldnull, mkrefany or whatever was needed would be called to make sure that the reference on the stack was valid in terms of reference, Type and so on. Castclass would always return T and not the base classes. The .get and .set would always return the same Type as the enclosing type. I would imagine that the same instance could not override .Get and .Set, else it could be recursive. All of these can be ruled by syntax.

    When it comes to proxies, or whatever is returned, the runtime would need to be sure that it is valid and nothing more. This way of doing it would have a class that looked like

    public class MyObj
    {
        public MyObj()
        {
            // TODO: Add constructor logic here
        }

        public .Get() // or .Set
        {
            // TODO: Return this; or a valid instance, returning null is fine and does not assign to this instance
            // I could use Activator here
            // I could use Reflection to get the caller and return what I like based on who called.
        }
    }

    You can accuse me of over simplifying it, but I do know that there would be many difficulties, no matter the approach – but a design could be achieved.

    I know that with much thought a good design could be put in place to provide us with access to the assignment.

    Do we need it, do we want it, wouldn’t it be cool? :)

  5. Daniel O'Connell says:

    As a whole, I fear this feature would be much more of a downside than anything else. Being able to change the way assignment works is troubling. Though to achieve it you’d probably want to override the = operator instead of adding extra syntax. However, adding it to the class and letting it be polymorphic has its ups (I’d actually like to see operators in this place, although it is probably impractical).

    One of the things that worries me about this approach is converting reference assignments to value type assignments… I don’t think I want that kind of complexity in the language personally, and I don’t know if anyone else would either.

  6. :) too true

    If it were done correctly though, the compiler/run-time would deal with reference and value semantics in the same way as it does now. There would be no visible difference anywhere else, except for an accessor that could be placed inside your class. If you do not have the .get in your class, everything remains as you know it.

    A solution to this (as you mentioned) could very well be put up through the overriding of =. In this case, everything remains the same if you do not override it – and "Yes" it would have less syntax, which may be better.

    Eric, does the compiler work with an AST?

    How extensible is the design, and would it allow different kinds of things to plug into the walk?

    How this kind of functionality would affect other languages on the framework is another big question.

    With this kind of functionality, one soon runs into many, many different situations, very quickly.

    On the whole, it would be a wonderful challenge.

  7. Daniel O'Connell says:

    That’s one of the upsides of open/shared source: you can modify the Rotor or Mono compilers to do whatever you want. I’ve been playing with adding some sort of extended verification framework to Mono (being in C#, you can use reflection; Rotor’s compiler is in C++ and unmanaged). It’s not nearly as nice as a modification to the whole framework, but it does give you a place to experiment.

  8. Rick Byers says:

    Hi Eric,

    You said: "Philosophically, it would be nice to have some way to leverage the knowledge that the compiler has about the code, but we don’t currently have any plans in that area."

    I’ve often thought that it’s pretty difficult for tools to do meaningful things with most source code due to the complexities of parsing. Any chance we’ll see the front-end and back-end (and maybe even each phase) of the C# compiler become accessible independently? This would allow tools to analyse the code without having to parse it manually, and even do simple transformations on the parse tree before it is compiled. If the representation of the parse tree was an XML document, powerful tools could be incredibly easy to build. Do tools like Intellisense use their own parser, or hook into the C# compiler in some undocumented way?

    Microsoft Research has a compiler (lcsc) that can do this (and much more), see http://www.research.microsoft.com/research/pubs/view.aspx?msr_tr_id=MSR-TR-2003-32. Unfortunately, lcsc isn’t available to the public.

    I guess this would have extremely limited utility in the real world, and would probably cause more trouble than it’s worth. But for certain types of tools (refactoring, analysis, language research, etc.) it would be incredibly useful.