Debug and Retail Builds


Over the weekend some of the big-brains on the CLR team (oh, and me too) are having a debate about introducing the concept of debug and retail builds of the .NET Framework.  The high-level concept is to move some “code correctness” checks out of the retail build (the one we install with the OS, redist or Windows Update) and into a debug build that would only be intended for development-time usage.  I must admit that the concept of different debug and retail builds goes against my instincts for simplicity, but some smart folks feel they are absolutely required, so it is causing me to take a second look and to ask you for your input.


Here are my (hopefully unbiased) thoughts on the pros\cons of the general solution.


 


Pros (for separate debug and retail builds)



  1. Faster performance when using the retail build
    Mitigation: the CLR\JIT could always get smarter

  2. Richer checks in the debug build
    Mitigation: FxCop, Customer Debug Probes, etc. could get better to cover these types of scenarios

 


Cons (against multiple builds)



  1. Disconnects between the development-time and deployment-time environments create opportunities for hard-to-track-down bugs.
    Mitigation: Some people’s experience on unmanaged platforms is that this is rare and can be dealt with by running debug bits in the production environment when required.

  2. Costs to deploy, manage and update different builds of the .NET Framework.  
    Mitigation: the builds would be very close and could even be the same binary with different code paths enabled via a config file (see the sketch after this list)

  3. It is a slippery slope.  We will have to define carefully and explain repeatedly which checks remain in the retail build and which go into the debug build.  It seems likely that we will miss a few and allow a security check to be only in the debug build.
    Mitigation: careful review and education.

  4. At-a-glance complexity of the platform increases.  
    Mitigation: That is life.
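
To make the “same binary, different code paths” mitigation concrete, here is a minimal sketch (not an actual CLR feature) of how a single retail binary could keep its expensive checks behind a switch read at startup; the COMPLUS_ExtraChecks variable, the ExtraChecks class and the WorkQueue type are all hypothetical.

```csharp
using System;
using System.Collections;

public static class ExtraChecks
{
    // Hypothetical switch; imagine it coming from a config file or, as here,
    // an environment variable read once at startup.
    public static readonly bool Enabled =
        Environment.GetEnvironmentVariable("COMPLUS_ExtraChecks") == "1";
}

public class WorkQueue
{
    private readonly Queue _items = new Queue();

    public void Enqueue(object item)
    {
        // Cheap check: stays on even in the retail code path.
        if (item == null)
            throw new ArgumentNullException("item");

        // Expensive consistency check: runs only when the switch is set,
        // which is the effect a separate debug build would have.
        if (ExtraChecks.Enabled)
            ValidateInternalInvariants();

        _items.Enqueue(item);
    }

    private void ValidateInternalInvariants()
    {
        // Imagine a costly walk over internal state here.
    }
}
```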



This last con deserves some reflection.  Let me start by telling an unrelated story.  As many of you know, I teach a 2-day class to WinFX developers on how to build great managed code APIs. I had just finished covering the threading story, where I explain that managed code does not have a bunch of threading models like COM does: there is no rental, free, etc.  After the talk one of the developers from the original COM team came up to me and explained that COM didn’t start out with a bunch of different threading models either, but over time it was forced to “mature” into them.  He predicted the same would happen for managed code… “just wait,” he said…


Here is a similar story from the standardization of the C# programming language at ECMA.  We were talking about adding the C# global namespace qualifier (“::”).  One of the senior engineers from Intel made a very insightful comment. She said that on its merits it is hard to argue against including this feature, as there is no viable workaround; however, it does seem to be an affront to the simplicity of the language. In her opinion it was an orthogonal “hack” on the language design.


 


I hope you can see the parallels with the debug\retail debate.  Part of me wants to sigh deeply and just accept that my baby is going to get more complicated every release and that it is not worth fighting.  And part of me wants to rage against this kind of encroachment all the more.


 


Questions for you:


So what do you think – should we sigh or rage? 


Do you have any additional pros\cons to share?  


Any experiences using debug and retail builds that could help this debate?


 


Standard Disclaimer:


I feel it is worth stressing that these are merely “hallway conversations” at this point. We have no concrete plans in this direction; I am just asking for very early input.



Comments (21)

  1. Pavel Lebedinsky says:

    I have some experience using checked builds of Windows, and I think it’s a great tool. I have found many bugs in the projects I’ve been working on simply by running them on a checked build, or by copying checked versions of system dlls to a retail build.

    However, I still think that you shouldn’t release debug builds of the framework, and here’s why.

    Checked builds of Windows were nice, but mostly because tools like AppVerifier were not available at the time. The AppVerifier approach is superior because AppVerifier can easily be enabled on any system in a matter of minutes, whereas with a debug build you have to reinstall everything.

    If you do decide to make debug builds available, I still think that you should invest at least 90% of efforts in instrumenting retail builds to make them easier to debug, with things like Customer Debug Probes, tracing that can be turned on in retail, etc. Don’t add any checks into debug builds unless they are impossible to implement in retail.

  2. Andrew says:

    <Pavel>

    If you do decide to make debug builds available, I still think that you should invest at least 90% of efforts in instrumenting retail builds to make them easier to debug, with things like Customer Debug Probes, tracing that can be turned on in retail, etc. Don’t add any checks into debug builds unless they are impossible to implement in retail.

    </Pavel>

    totally agree with you.

    +1 against separation.

  3. David Levine says:

    I agree completely with Pavel. The checked build was great when I needed it but typically you need it when you don’t have it; if the sun and moon and the stars are all aligned when the bug strikes, great, but otherwise you wind up reinstalling the entire system and trying to recreate the original circumstances to duplicate the problem – not an easy task.

    The other problem is that if the bug is timing related then when you change builds you can also change the problem. It’s like the Heisenberg Uncertainty Principle – the mere act of observing a bug causes it to change.

    Look at this from the user’s perspective – they do not want to spend a day or more reconfiguring a system just to help someone’s tech support center track down a problem. And to be honest, most developers and testers don’t either. It’s a burden to try to keep track of symbol files, checked builds, and other dependent files, and it takes training and skill to use it correctly. Most devs don’t bother running checked builds.

    The other thing is this: performance is a great feature but so is validation. There’s a tradeoff between the two, and where to make the cut depends on the type of software being written. If I’m writing an OS then performance is key and checked builds make sense; if I’m writing a business app I want everything validated – if I need more performance I’ll add memory, increase the CPU speed, etc. I’m more concerned about the correctness of transactions than about the few milliseconds faster the code will execute.

    What we (um… I) want is a system that already has it all there, ready to go when needed, but disabled until it is turned on. Leave the instrumentation in place to do the extra validation but don’t actually do it until it is explicitly enabled. When a problem occurs I want to be able to capture it right now!

    Put the diagnostics into the retail build and provide knobs to turn it up and down a few notches as needed. And provide more tools to do static analysis.

    A random idea – use an AOP-like mechanism to dynamically inject validation checks into the code when a method is JITed (or re-JITed), and to remove them when done. To extend this, make this API externally accessible so if the app detects an abnormal condition it can programmatically turn on/off the checks and log the output.
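
    A rough sketch of the “knobs in the retail build” idea, using the real System.Diagnostics.TraceSwitch type, whose level is read from the application’s config file so diagnostics can be turned up on a production machine without swapping binaries; the switch name, the OrderProcessor type and the messages are made up for illustration.

    ```csharp
    using System;
    using System.Diagnostics;

    public class OrderProcessor
    {
        // The level comes from the <system.diagnostics>/<switches> section of
        // the app's .config file (e.g. value="4" for Verbose); no rebuild needed.
        private static readonly TraceSwitch Diag =
            new TraceSwitch("LibraryDiagnostics", "Extra runtime diagnostics");

        public void Process(object order)
        {
            if (order == null)
                throw new ArgumentNullException("order");

            // Cheap message, emitted only when the knob is at Info or above.
            Trace.WriteLineIf(Diag.TraceInfo, "Processing an order");

            // Costly consistency walk, only when the knob is turned all the way up.
            if (Diag.TraceVerbose)
                ValidateEverything(order);
        }

        private void ValidateEverything(object order)
        {
            // Imagine an expensive validation pass over internal state here.
        }
    }
    ```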

  4. Although the "Richer checks" sounds awfully compelling to start with, I wonder if a virtual execution environment like the CLR can’t offer a better solution. Given the flexibility that the CLR has over something statically compiled and linked, do you really still need Checked and Retail builds like you do with unmanaged code?

    The Customer Debug Probes already seem to be offering many of the kinds of things that you might have expected to go into a Checked build, for example. I understand that CDPs tend to work at a fairly low level – down at the execution engine level rather than the class libraries. But would it be possible to do something similar, allowing developers to opt into various extra levels of checking at the class library level without having to bifurcate the CLR?

    Also, part of me thinks that if there are sanity checks that can be done, I’d like to see them in the normal framework anyway! I remember the possibility of a Checked build being floated ages ago on some mailing list, and suggestions for what might go in it were requested. At the time, I suggested that Windows Forms might like to check the caller’s thread ID to detect a common multithreaded programming error. But I’ve changed my mind now – I’d much prefer it if the retail build of Windows Forms detected when an attempt to use a control on the wrong thread was made, and always threw an exception. It would have avoided a lot of grief over the years…

    So my feeling is that if a check for valid inputs can be done it should always be done. (But having said that, I appreciate that there are some kinds of checks which can have a very high cost. For these, some kind of opt-in like you get with CDPs is what I’d prefer. But again, I admit that I haven’t thought very hard about what the implications would be for working set, size of the .NET redistributable, etc… So I reserve the right to change my opinion once I’ve given it more thought, or once someone else has pointed out to me why I’m wrong. 🙂 )
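
    As a sketch of the wrong-thread check described above: Control.InvokeRequired is the real Windows Forms way to detect that the calling thread is not the one that created a control; the SafeLabel type and its method are hypothetical, shown only to illustrate the shape such a retail-build check could take.

    ```csharp
    using System;
    using System.Windows.Forms;

    public class SafeLabel : Label
    {
        public void SetTextChecked(string text)
        {
            // InvokeRequired is true when this call is on a different thread
            // than the one that created the control's window handle.
            if (InvokeRequired)
                throw new InvalidOperationException(
                    "Controls may only be touched from the thread that created them.");

            Text = text;
        }
    }
    ```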

  5. Ken Brubaker says:

    Brad Abrams is thinking out loud about whether we should be subject to a split personality for the CLR by re-introducing the ghastly debug build. However, reflecting on my earlier experience, I would say to let Moore’s Law have its way.

  6. Raymond Chen says:

    Very simple: If it’s not on by default, 99% of developers won’t use it.

    Because most developers are Mort. If you ask them, "Do you ever run your program on a checked build?" or "with the debugging flag enabled?" they will say, "What’s a checked build? What’s the debugging flag?"

    How many C# developers use Customer Debug Probes? Probably less than 1%.

    How many C# programming books teach you about Customer Debug Probes? Probably none.

  7. Dmitriy Zaslavskiy says:

    If additional, expensive checks can be solved by adding CDP and maybe [Conditional/DebugOnly] attribute (not the C# trick, but CLR feature) I am all for it.

    Although this would not be significantly different from having a debug build!

    Raymond, you are correct that 99% of programmers don’t use/know about those features, but I am not sure inflicting a 50% (for different values of 50) speed degradation on all users is worth it.

    So my prefs would be CDP/Attribute for libraries or a separate debug build.
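
    For contrast, the existing compile-time “C# trick” the comment distinguishes itself from is ConditionalAttribute: calls to the attributed method are stripped by the compiler unless the caller defines the named symbol, so the decision is made when the calling assembly is built, not at run time. The LibraryChecks and Consumer types below are illustrative only.

    ```csharp
    using System;
    using System.Diagnostics;

    public static class LibraryChecks
    {
        // Call sites survive only when the *caller* is compiled with DEBUG defined.
        [Conditional("DEBUG")]
        public static void Validate(object arg, string name)
        {
            if (arg == null)
                throw new ArgumentNullException(name);
        }
    }

    public class Consumer
    {
        public void DoWork(object input)
        {
            // This call disappears entirely from a release build of this assembly.
            LibraryChecks.Validate(input, "input");
            // ... real work ...
        }
    }
    ```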

  8. Mike Dunn says:

    Sorry for the OT remark, but this line:

    It’s like the Heisenberg Uncertainty Principle – the mere act of observing a bug causes it to change.

    just made my day. Thanks 😉

  9. Adam says:

    The core OS team loves checked builds; alas, we frequently find people who didn’t use them to test their stuff.

  10. Ken Cowan says:

    As Nancy Reagan said, just say no!

    The first release of NT needed more computing resources than the average PC could deliver. We all programmed to the metal back then, and it made sense to eke out as much perf as you possibly could. Even the API set was relatively primitive.

    We now program to a virtual execution environment and the API set is much richer. In fact, we don’t even call it an API set, it’s a "framework". Compare then: we indexed arrays and you could pretty much do the codegen in your head. Now: We load a dataset from an XML stream and bind the whole thing to a data grid. We, your customers, have already voted by our actions. We prefer programmer productivity — optimizing our investment in people — rather than optimizing hardware resources. If the checks slow down the framework so much that my users can’t run my app to get the real work done, then I’ll have to work extra hard on my app. At that point, you’re burning people resources and I’d say you’ve gone too far.

    One of the hardest debugging problems we see is where it works on my machine but not on yours. Even slight changes in your code can alter a run. Slight changes in timing can uncover race conditions. Slight changes in memory usage could change the timing of garbage collections.

    So, focus on improving the platform for both dev and production scenarios, and keep it a single platform.

    KC

  11. W Poust says:

    Please don’t "solve" performance problems before you identify that there is a performance problem. For the apps I’ve written, network access and database queries of large databases have always been the slow areas. Those issues were always resolved at the application level by getting "smarter", i.e. improving network data requests and adding indexes.

    My job as a developer is difficult enough without having multiple builds for libraries/frameworks. You should hear some of the choice words I have to say when I have to use a debug version of a 3rd-party library. I definitely try to stay in Raymond Chen’s 99% Mort group… by choice.

    If you are developing software and depending on debug checks for verifying code correctness, I’d say that’s a big RED flag.

  12. Keith Hill says:

    I wonder if there could be a solution like the one used by products like BoundsChecker? Doesn’t it hook the Windows API or something like that? The nice thing about providing some sort of API or API-hooking would be to open it up to third parties like Compuware & IBM/Rational. There’s a big security consideration there but perhaps that can somehow be mitigated? For the record, I am not against a debug build. We use the CRT debug runtime checks during our beta phases. Usually on our last beta we switch back to non-debug CRT libs and turn off assertions and such. We have found a lot of otherwise hard to find bugs with the CRT debug libs using the heap check functionality.

  13. Let me refine the suggestion a little bit to include a few more details I was toying with. Consider the post on Rico’s blog:

    http://weblogs.asp.net/ricom/archive/2004/03/01/82195.aspx

    Now let’s consider the practical impact of a change like that. Consider the case of passing a null reference to a BCL method. Currently we throw an ArgumentNullException, which I thought would be an incredibly useful technique for helping to assign blame to the appropriate piece of code on the stack very quickly. You no longer get a stack trace with an AV 6 levels deep in printf – instead you get an exception with a stack trace that tells you "You screwed up, not the library". When I moved from C++ projects to developing our base class libraries in 1998, I thought this was a great idea. I wasted a minute on several occasions thinking I had found a bug in someone else’s C++ code, when it was really just me passing in a bad parameter.

    My thinking on this has started to change a bit. In practice, an ArgumentNullException is useless to anyone but a developer. An end user doesn’t care if they get an ArgumentNullException or a NullReferenceException – all they know is they found a bug and they need to either submit a Watson crash dump or go find a developer to fix it. However, we make those users pay an (admittedly somewhat minor) perf cost for these types of checks at runtime today. There are some similar checks that are potentially more expensive, such as verifying that you don’t alter a collection while enumerating it.

    I came up with an interesting workaround that I won’t go into in great detail, but suffice to say it would have the properties Rico summarizes in the link above. Perhaps we even get rid of ArgumentNullException altogether at development time and instantly dump you into a debugger – I don’t know. (Of course this would generalize to other ArgumentExceptions, but would stop at some point. Clearly we would continue to have to throw exceptions like FormatException if you call Int32’s Parse with a String containing junk.) Assume we make the syntax look nice and possibly even hook up static analysis tools to understand this (I have a proposal, but it’s so long most of our architects haven’t read it.)

    Would you (as a very small, self-selected segment of the .NET Framework users) be interested in potentially getting both better error checking at development time for precondition failures to .NET Framework methods, as well as ensuring your users do not pay a price for a marginally better error message for developers? How much of a change to your debugging experience would you be comfortable dealing with? How many people use VS’s Debug & Retail configurations today? Would you be comfortable developing & debugging primarily only using VS’s Debug configuration? What if this was some other set of flags, like an "Enable precondition validation" flag you can set per-assembly via a config file or a simple GUI tool?
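
    As an aside, one common way to implement the “don’t alter a collection while enumerating it” check mentioned above is a version counter, roughly as sketched below; the TinyList type is purely illustrative and not how any particular BCL collection is written.

    ```csharp
    using System;
    using System.Collections;

    public class TinyList : IEnumerable
    {
        private object[] _items = new object[4];
        private int _count;
        private int _version;   // incremented on every mutation

        public void Add(object item)
        {
            if (_count == _items.Length)
            {
                object[] bigger = new object[_items.Length * 2];
                Array.Copy(_items, bigger, _count);
                _items = bigger;
            }
            _items[_count++] = item;
            _version++;
        }

        public IEnumerator GetEnumerator()
        {
            int version = _version;
            for (int i = 0; i < _count; i++)
            {
                // The check itself is an integer compare per element.
                if (version != _version)
                    throw new InvalidOperationException(
                        "Collection was modified during enumeration.");
                yield return _items[i];
            }
        }
    }
    ```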

  14. B.Y. says:

    #ifdef DEBUG in IL? This is a bad idea. Just another complexity that’s not needed 99% of the time, so don’t add it.

    In addition to the CONs listed by Brad, what if:

    *the debug flag is turned on accidentally on user’s machine

    *a developer finds his app only works in debug mode, so he codes his app to turn on debug flag at start.

    If you really want to help developers (well, the few developers that care anyway), just make it like the checked build of Windows: it’s a different set of binary files intended to help developer debugging only, not to be used in any production environment, and it will be ignored by 99% of developers anyway. But no run-time IL DEBUG flag.

  15. Brian Grunkemeyer says:

    In response to these two:

    *the debug flag is turned on accidentally on user’s machine

    The application runs slower, but nothing horrible happens.

    *a developer finds his app only works in debug mode, so he codes his app to turn on debug flag at start.

    The developer was covering up a race condition (or other similar bug) in his code, and is making an extremely unenlightened choice. As we all know, race conditions may easily repro on a different set of hardware that is either faster or slower than our current machines, or has more processors (like a 2-way or 4-way SMP machine).

    In this case, the developer didn’t solve his problem – he instead ignored it and made it appear to go away on his current hardware. The application runs slower, but nothing horrible happens.

  16. Pavel Lebedinsky says:

    > just make it like the checked build of Windows: it’s a different set of binary files intended to help developer debugging only, not to be use in any production environment…

    What if the problem happens only in production environment?

    Where I work, we have test environments that take several days or even weeks to build (they are intended to be the mirrors of what we have in production). There is no way the owner of this environment would allow me to do something as intrusive as install a debug build on it (and sometimes you can’t even reboot to create an image, because the system is under constant load trying to reproduce a memory leak or something).

    The ability to do debugging on a retail build is very important to me.

    > and it will be ignored by 99% of developers anyway.

    That’s one of the main arguments for building all debugging functionality into the retail build. The only way these 99% of developers will ever use it is if they can turn it on with a couple of mouse clicks from Visual Studio.