Preventing third-party derivation, part one


In this episode of FAIC, a conversation I had with a C# user recently. Next time, some further thoughts on how to use the CLR security system in this sort of scenario.

Him: I have this abstract class defined in one assembly:

// Assembly FooBar.DLL
public abstract class Foo
{
    internal abstract void DoSomething();
// …
}

I want to create a concrete class derived directly from Foo in another assembly, Blah.DLL.

Me: You are going to have to learn to live with disappointment.

Him: But I should be able to because I have public access to the class Foo!

Me: That statement is false. Having public access to a class absolutely does not mean that it is always legal to create a subclass in another assembly. There are any number of reasons why it might not be legal to create a subclass in another assembly; you happen to have run across just one of them. (There are plenty more — for example, you could make a public class and mark all the constructors as internal. It’s not possible to subclass that from another assembly either, since the derived class constructor does not have an accessible base class constructor that it can call.)

Him: You’re right. In this case, I can’t provide an overriding implementation for the internal abstract method DoSomething() because it is not accessible. Therefore the compiler will give an error when I try to subclass Foo. How do I get around this?

Me: You don’t get around it. The type safety system is working as designed; don’t try to defeat it.

If you own the FooBar.DLL assembly then of course there are several ways you can do what you want. Two that immediately come to mind are (1) mark Blah.DLL as a friend assembly of FooBar.DLL using the InternalsVisibleTo attribute (2) change the accessibility of DoSomething to public, protected or protected internal.

Him: Why is it even allowed to have a nonextensible class like this in the first place?

Me: It is sometimes a good thing to make a class that cannot be extended arbitrarily. I have myself occasionally used a similar technique to ensure that no third party can subclass one of my public base classes, though, as we’ll see later, there are other, perhaps better ways.

Look at it this way: if the author of the class wanted you to be able to subclass it, they probably would have made it possible. Clearly they do not want you to subclass this class.

Him: My previous question has been thoroughly begged. Why would someone want to prevent a third party from subclassing?

Me: I can think of a few reasons. Here are two:

1) Designing a class to be extended effectively by third parties is work, and work requires effort. If the class is not intended to be extended by third parties, but must be unsealed (for internal extension, for example) then the implementer is faced with a choice: spend the time/money/effort unnecessarily, provide an extensible but brittle class, or prevent extension by some other method.

This trick is a cheap, easy and straightforward way to prevent extension by arbitrary third parties without preventing extension by internal classes. As we’ll see next time, there are other ways you can do this too.

2) A class may need to be extensible internally and used externally, but must not be extended by third parties for security reasons. Imagine you have a method Transfer which takes two instances of your class BankAccount. Suppose BankAccount is an abstract class with two derived classes, CaymanIslandsAccount and SwissAccount. Do you really want arbitrary third parties able to make new objects which fulfill the BankAccount contract but were not implemented by you?

Again, you end up with a tradeoff – either implement Transfer so that it does type checks on the BankAccount passed to it (possibly expensive both in initial implementation and maintenance), implement Transfer so that it accepts any old thing (dangerous!), or prevent anyone from making an unknown kind of BankAccount (cheap, safe).

In general, good security design is “make them do something impossible in order to break the system”. (Or even better “make them do seven impossible things”.) By making it impossible for third parties to fulfill the contract, you add additional security to the system. (By no means enough security, but a little bit more.)

Him: Thanks, I see what’s going on here. Clearly I got into this situation because this trick of using access modifiers to prevent extension is insufficient to convey the author of FooBar.DLL’s intentions.

Me: Next time we’ll talk about more obvious ways to state the intention of “please don’t subclass this thing”.

Comments (27)

  1. Joren says:

    I don’t think you can deduce that Foo’s author doesn’t want you to subclass Foo based only on the existence of an internal abstract method.

    It’s perfectly reasonable to have a public class with a few public methods and a few internal ones. It might then occasionally happen that it is convenient to make a class abstract (for the usual reasons) and so you might end up with a public abstract class with some public methods and some internal methods, some of which just happen to be abstract.

    Of course in that case the class can no longer _be_ subclassed by a third party, which should be something that is explicitly designed for, but this isn’t an unthinkable oversight.

    If I wanted a class not to be subclassed, I’d make it sealed. If I wanted it to be subclassed, but not externally, I’d make all constructors private or internal (and not protected). If I for some reason would need a protected constructor, I’d look for any attributes that enforce the inability to externally subclass my class. If I couldn’t find any, then as a last resort I might use an abstract internal method to enforce it. But then I’d at least document that I don’t want external subclasses and that this is how I enforce it.

  2. mike says:

    >My previous question has been thoroughly begged

    ??

    http://en.wikipedia.org/wiki/Begging_the_question

  3. EricLippert says:

    Right — begging the question is answering a question in a manner which provides no new information, but rather, depends logically upon the answer to an equally difficult or more difficult question.

    Answering "Why is this allowed?" with "because it is desirable" doesn’t tell you anything. Rather, it logically relies upon the answer to the harder question of "why is it desirable?"

    A similar example is attributed to Newton, in his criticism of the atomic theory. Why are some substances hard and some soft?  Because all substances are made of these things called "atoms", and some atoms are hard and some are soft.

    Newton quite rightly pointed out that this was begging the question; it’s answering the question in a manner that provides no new information and raises the same question at a different level.

  4. EricLippert says:

    Joren: yes, you have re-stated my whole point here — that if that is what the author of the class INTENDED, then they did a lousy job of communicating it.

  5. Mark Rendle says:

    Today I have already learned of the InternalsVisibleTo attribute, and it’s only 8am. Thanks Eric!

  6. Franck Jeannin says:

    FYI, InternalsVisibleTo is especially useful when you want to write unit tests for internal methods/classes as you usually do it from a separate dll.

  7. Ian Marteens says:

    Maybe the whole thing points to a missing feature in most popular type systems. What did the author try to prevent: reusing a base implementation or adding new players to the game by implementing the type interface?

    If I remember well, C# requires same visibility to a type and its implemented interfaces. However, I think there are public types in mscorlib implementing internal interfaces (maybe I’m wrong in this). Should C# allow this situation? On the other hand, there’s no easy way in C# to reuse an implementation without inheriting. I’m thinking in some alternative as interface delegation in good old Delphi.

  8. EricLippert says:

    You do not remember well. C# does not require that an implemented interface be as accessible as the type. C# requires that a base class be at least as accessible as a derived class, but does not make the same restriction on interfaces. This is perfectly legal:

    public class C : C.I { private interface I {} }

    That is, C is a public class which implements private interface I.

    Similarly, this is perfectly legal:

    public interface I<X>{}

    public class C : I<C.D> { private class D {} }

  9. Andy Wilkinson says:

    This does seem a very interesting way for the author to support his intention to stop external subclassing. I can think of one class in particular where I came across this issue. The biggest problem from my point of view is that from the publically visible members there is no way of knowing that you could not subclass – after all, the thing that is stopping you is internal – there is little chance that this is going to be in the documentation (imagine the MSDN docs for the Foo class – it is public so in the docs, but nothing will indicate that you can’t sub-class).

    Surely there are much more obvious ways to do such things. What I would consider the standard way of indicating that someone can’t externally create an instance of something without some internal help is through having only private or internal constructors as you suggested as an alternative. One look at the docs in MSDN and you can see there are no public constructors – so you can see that it is not subclassable.

    As far as I can see though, all implementations from the same DLL (e.g. class Bar : Foo) will either have to have a similar trick to stop subclassing, or be external (as in the case I have in mind!). Otherwise you can subclass Bar instead, override all other overridable methods, and leave the Bar implementation of the internal abstact method. Can anybody think of any issues here I haven’t thought about?

  10. Ian Marteens says:

    <i>public class C : C.I { private interface I {} }</i>

    Heh, heh… that produces a "Circular base class dependency involving C and C.I" (Visual Studio 2008 SP1). I’m working in a compiler, and I have found the problem when importing type declarations from mscorlib, via reflection. Of course, I just ignore those "internal interfaces", since they don’t have any effect on client code.

    In any case, it would be nice if C# allowed it, with internal nested types. A private interface would be of little use in these cases, imho.

  11. EricLippert says:

    Whoops. As you point out, that is NOT legal. The reason I said it was legal is that I am actually fixing that bug as we speak, and on MY machine, it is legal. 🙂

  12. Eduardo Fonseca says:

    what about using friend assemblies?

  13. EricLippert says:

    What about them? Could you ask a more specific question?

  14. Anon says:

    I get the gist of what you’re saying, but I hope your real conversation wasn’t so harsh. Ouch.

    I also didn’t get much from this, and had to reread ‘the conversation’ a few times to glean 1 fact – that there’s some compiler hack that if I have an abstract class and put an abstract internal method it can’t be subclassed outside of non-friend assemblies.

    +1 for trivia – I’m sure it took the newbies weeks to figure out how they broke the build when they altered a base-class.

  15. EricLippert says:

    It’s not a "hack". This is the logical consequence of two perfectly sensible rules:

    1) A concrete derived class must implement all abstract methods of an abstract base class.

    2) It is illegal to override a method that you cannot access.

    Therefore, if you have an abstract class with internal abstract methods, it is impossible to make a concrete derived class in another assembly because there will always be a method that cannot be overridden.

  16. Pop Catalin says:

    "Designing a class to be extended effectively by third parties is work, and work requires effort".

    If you don’t do this, your overall design suffers, your library is crippled/limited from the start. And a little too much laziness, and the developers will move away from your library. You’re limiting the usefulness of your classes to basically some predicted scenarios, while in the real world the shear number of possible "valid" use cases is orders of magnitude higher than what one can predict.

    "A class may need to be extensible internally and used externally, but must not be extended by third parties for security reasons. Imagine you have a method Transfer which takes two instances of your class BankAccount. Suppose BankAccount is an abstract class with two derived classes …"

    One way around this is to make Transfer non virtual and have a method TransferInternal() marked as "internal virtual", which provides the necessary security while still allowing consumers of the class to use OO techniques to "add" functionality through derivation, and not be forced to use awkward workarounds.  What if someone needs to add some extra data to the SwissAccount class, without modifying it’s core functionality, that must be used at higher levels? This scenario is made impossible by preventing derivation.

    Slightly offtopic:

    I’m strongly against sealing of classes to prevent derivation. It’s so anti OOP. Instead of sealed, .Net should have had a closed keyword that prevents you from reimplementing interfaces the class already implements, and for virtual methods the final modifier is sufficient.  There were a few times when I really wished some classes are not sealed so I wouldn’t have to use some crazy workarounds instead of simple derivation.

    I’s true that some classes like System.String need to be sealed because of their special layout, but on other cases I would really preferred a closed approach instead of sealed.

  17. Michal Dvorak says:

    Reduced accessibility isn’t a security option, you could always get around it.. Don’t even think about it this way 🙂

    I agree with Pop Catalins first paragraph, that attitude "i won’t make this extensible beacuse i’m lazy", will definetly return later.. Such "lazy programming" has hidden costs in long-term development.

    Anyway, in my opinion (even little OT), one should use the interface almost everywhere, where is some base abstract class. Let’s have an example:

    public abstract class FooBase {

       public abstract void DoSomething();

       public virtual void DoSomething2();

       public void DoSomething3();

       protected void HelperMethod();

    }

    Abstract method definitelly should be in iface, virtual method too, because the possible consumer can’t rely on the default implementation (extension don’t have to call the base). Helper method can’t be consumed, and argument that ‘it will be easier to extend that class’ is wrong, you still can have base abstract class, with standard implementation etd., that implements iface, but consumer doesn’t need to know about it. And at last, the problematic non-virtual method. I think, that consumer should never rely on the actual accurate implementation, because it’s just hard to maintance and not flexible. Such method should be also declared by iface, that is, it could be virtual.. if it does sometning, that need’s to be "closed", implement it as sealed in that easy-to-use abstract class.

    Sadly, C# alone does not encourage this pattern.. IoC does, and I love it 😛

  18. EricLippert says:

    First off, who said anything about designing base class libraries? Designing an entire library of functionality that can be extended is even more work than designing a class to be extended. Why would someone do this expensive and difficult work if that wasn’t the goal of their business?

    Second, your philosophical belief (or, perhaps, psychological condition! 🙂 ) is common in this industry. I call the belief that object-oriented design principles are always good principles and should be adhered to when solving every problem, whether the business problem needs OOP or not, "object happiness".

    Writing code that works well in stated, by-design scenarios and fails fast in all other scenarios is frequently my goal. That is a sensible business goal in a world where unnecessary extensibility has large design costs, implementation costs, testing costs, maintenance costs, and more exotic costs like "preventing us from making changes to anything for fear of breaking someone who is using our stuff in a crazy way".

    In short, I reject object happiness. Unnecessary extensibility is pure concentrated evil and should be avoided at all times. It is a huge expense that buys you more huge expenses later.

  19. EricLippert says:

    Furthermore, I object to the characterization of a good engineering choice as "lazy". Almost none of us design and implement software for ourselves; we work for someone else, and we are therefore spending their money. Deciding to spending someone else’s money responsibly by cutting out all unnecessary features isn’t _lazy_, it is _responsible_.

    The time and effort that you save by not designing, implementing, testing and maintaining unnecessary extensibility is time and effort that you can devote to improving your employer’s bottom line in other ways.

  20. Pop Catalin says:

    The time an effort saved by building on top of an extensible OO architecture, instead building on top of a closed architecture, using all sorts of workarounds and hacks, is the real time saver. What happened to the open-closed principle?

  21. EricLippert says:

    Absolutely! If you are designing an architectural framework, by all means, do the work to make it open and extensible by careful design choices and high-quality implementation. That’s obviously goodness. That would be an example of necessary extensibility.

    But most programmers are not designing and implementing architectural frameworks. Most programmers are implementing business logic. When business logic classes are unnecessarily designed and built as though they were base class libraries, they end up over-complicated, over budget, late, hard to test, full of security problems, and have extension points that do not solve any problem ever encountered by their users. That’s waste.

    Don’t make investments in technology that you don’t need and might never need, on the off chance that someone, some day, might want to extend the class. Let that hypothetical person pay the price of unsealing the class and building extensibility into it if that’s worth it to them.

  22. David Betz says:

    Fortunately, however, you can use reflection to access just about anything you want (and oh do we ever use that).  You can even use IL to inherit from internal classes (which, yes, I end up doign all the time).  It’s a shame people mark all kinds of stuff as internal when its obvious that it could be genericly used by others.  One of the worst examples of this lameness is in the System.ServiceModel.Web assembly where there’s all kinds of goodness to create a JavaScript proxy.  For some reason it’s all internal forcing people to do reflection and IL tricks to get around it.  If they didn’t set their buffer manager as internal, we jQuery guys would feel the love a bit more.

    On the other hand, there are situations where you don’t want people to touch it at all.  However, sealing something or setting somethign as internal doesn’t do anything for a skilled, determined individual.  Reflector and basic knowledge of how the C# compiler works makes just about anything accessible.  In my work, if I need to protect a redistibutable section of code from extension (kind of like in the bank account example, but for non-secure stuff), I’ll be sure to use a static class.  Since it’s essentially a sealed, abstract class, there’s no way to touch it.  However, if I were doing something that required security and must be protected from IL viewing, there’s no way I’m going to package that in an assembly anyhow.  Obfuscation or assembly encryption are well and good, but I seriously prefere designing a system using a software-as-a-service model.  While I’m not ready to go as far as Juval Lowy with his every-class-is-a-service campaign, I normally do follow a every-assembly-is-a-service model.  This protects everyone (and gives all applications some serious flexibility.)  There’s no need for the other guys to know what’s going on in the black box.  Here’s your interface, your endpoint, and your protocol information.  No code protection required.

  23. Alex Fedin says:

    Just wanna highlight an issue with the mentioned approach, because you may make your customers to hate you just because you did not provide them with any point of extensibility.

    For instance, in Silverlight, there is a number of specialized collections derived from the abstract generic collection System.Window.PresentationFrameworkCollection<T>. Some of them are public, some of them internal, but all of them are sealed. At the same time, the PresentationFrameworkCollection<T> has few abstract internal methods, those make you unable to derive your own collection from this one.

    Just because of this, the customer should spend much more time/efforts/money on reinventing the wheel by creating the same functionality in his/her own code.

    So, you have to always remember about your customers, even if there would be just one single customer – you. Because the requirements are going to change, so the software changes follow the requirements. You better be prepared for that with your code structure, then in a future trying to find any workarounds for the limitations you put in yourself.

  24. Alex Fedin says:

    He-he… Seems that everybody is complaing about that the mentioned approach is violating an open/close principle. Wouldn’t it be more valuable to tell about how to follow open/close, then how to violate it?

  25. Alex Fedin says:

    He-he… Seems that everybody is complaing about that the mentioned approach is violating an open/close principle. Wouldn’t it be more valuable to tell about how to follow open/close, then how to violate it?

  26. Jeff Brown says:

    You can actually create a concrete subclass of an abstract type with an internal constructor.  Just not in C#.

    Basically in IL you would define your new constructor with a body that initializes whatever you need but does not call any superclass constructor directly.  Instead you can use reflection if you really need to invoke that superclass constructor.

    This works because the CLR does not check that a subclass actually calls the constructor of its superclass.

  27. Stefan Noack says:

    well.. accessibility levels are there so that you can structure your code in a way that prevents you and others from trying things you did not plan to be done by anyone. i use them to make sure that one thing (i.e. setting a property value) can only be done in one way (by making the backing field private) so that later when i add functionality (i.e raising an event) to the setter, i can be sure that neither i or someone else did go a way i did not expect. all in all you reduce the number of ways one can go and thus the number of ways something you did not think of could happen.

    oh well.. and i make ugly stuff private /internal so no one can see it 😉