Test what you Ship, Debug what you Test


Test what you ship. This should be obvious.

Debug what you test. When you test, you’ll find failures, and you’ll need to debug those failures. So you need to be able to debug what you test.

Therefore you’re going to need to be able to Debug what you Ship. The same need shows up when you get crash reports from the field.

So don’t delude yourself into thinking that you can have some retail-only feature that doesn’t need to be at all debuggable. I hear this a little too often.  So:

– build PDBs for public builds, not just internal builds.
– be wary of obfuscation techniques that are completely undebuggable.
– be wary of applying some post-build optimization magic that can’t be debugged.
– consider having a few global pointers to key data structures so that you can find them in retail bits (see the sketch after this list). The extra 4 or 8 bytes of data will save you many hours.
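
To make that last point concrete, here’s a minimal sketch in C# (the namespace and names are made up). The type’s only job is to be trivial for a debugger to find: a static field is easy to locate in a retail dump with WinDbg+SOS (e.g. !name2ee to find the type, then dump its statics), whereas an object buried several levels deep in some data structure is not:

    namespace MyApp.Diagnostics
    {
        public static class DebugRoots
        {
            // Written by product code; read only by a human in a debugger.
            public static object LastRequest;
        }
    }

(On the PDB point: with the command-line C# compiler, /debug:pdbonly combined with /optimize+ gives you fully optimized code plus symbols.)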

Comments (19)

  1. Mihailik says:

    It’s unclear, Mike.

    So you are advocating shipping ‘Debug’ builds? It’s rather weird — don’t you think the ‘Release’ switch in Visual Studio was created on purpose?

    And having ‘anchor’ pointers may completely distort the GC idea. The memory will not be freed; effectively it’s a memory leak you are creating with those 4 or 8 bytes. It may actually cost gigabytes.

    I am sure you have something reasonable in mind. But think about the unfortunate reader who will stop using Release builds, who will hook all his data collections to some static field ‘for a rainy day’. You’ve got to be clearer.

  2. niki says:

    I think his point is that at some time you might (be lucky to) get a memory dump of your application when it crashes on a production machine somewhere in Alaska. When you do things like obfuscation, throwing away PDBs, using ILMerge or simply removing logging information in release builds, you should think twice about what that might mean in the long run.

    Some companies can get away with the motto "if we can’t reproduce the problem in-house, then it isn’t our problem", but for some of us this just isn’t good enough, and at least for those, debugging released code is sometimes inevitable. All he’s saying is: be prepared for that time.

  3. Peter Ritchie says:

    No, Mike is advocating generating (and storing, managing, etc.) the debug info for release builds (i.e. generating PDB files on release builds).  Yes, it will be harder to debug; but it will be "debuggable" with its source code.  The corollary is that obfuscation and post-build optimizations (like ILMerge) would invalidate the PDB.

    I’m not sure why that’s off by default in Visual Studio.  Mike, maybe you can talk to someone to change that default?

  4. Peter’s right.

    I’m not saying ship Debug builds. Ship optimized retail builds. Just don’t forget to build the PDBs for them and keep them on hand internally.

    Likewise, obfuscation + post-build steps are OK: just make sure that you have a debugging plan for them. For example, some obfuscators generate "mapping" files that can help you debug obfuscated code. It’s more painful, but at least it’s possible in the rare instance that you need it.

    Peter – What exactly is off by default in VS?

  5. Oleg – you’re right that global roots may potentially cause memory leaks for the GC. One approach is to clear the root when it’s likely no longer needed. Since the root is only used for debugging, you have more flexibility about when you clear it. (Clearing it too early won’t crash your program; it will just make it harder to debug in that particular window.)
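
    To make the set/clear pattern concrete, here’s a rough sketch (C#; Request, HandleRequest, and the DebugRoots field are made-up names):

    void ProcessRequest(Request request)
    {
        DebugRoots.LastRequest = request;   // root it while it’s interesting
        try
        {
            HandleRequest(request);         // the actual work
        }
        finally
        {
            DebugRoots.LastRequest = null;  // clear it so the GC can collect
        }
    }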

  6. Mihailik says:

    >> One approach is to clear the root when it’s likely no longer needed.

    Hm, I don’t see what scenario you are talking about.

    If the data belongs to a Form instance, there is no need to root it in a global. During debugging one can iterate through the global Forms collection, find the needed form, then get the data.

    If it’s not a form — then what?

    The whole idea of having ‘hidden’ global roots throws us back to the days of manual memory management. I don’t think it is a good idea in general. There might be cases where it can be used as a hack/trick. But not in general.

    The cost of a manual memory management bug is huge. It’s the worst nightmare for debugging.

  7. Peter Ritchie says:

    @Mike: RE: PDB off by default.

    Sorry, I’m wrong about that with VS2005; it’s VS2003 (and previous) that doesn’t generate PDBs by default in Release configurations.

  8. niki says:

    >If the data belongs to a Form instance, there is no need to root it in a global. During debugging one can iterate through the global Forms collection, find the needed form, then get the data.

    Did you ever actually do that, in a memory dump of a release build, without PDBs? It’s certainly possible to "iterate through" complex structures in WinDbg+SOS, but if the thing you’re looking for is somewhere deeply nested, like in an element of a collection that’s a member of a structure somewhere on the call stack (…), it’s anything but easy. Add some C++/mixed-mode code and it can get nearly impossible. And it often is impossible in the case of a stack overflow. Having a global pointer to some key object (e.g. the request that’s currently being handled, which is of course set to null as soon as it’s completed) can be a life-saver in such a situation.

    Of course that doesn’t mean "make global pointers to every object you create somewhere". Nobody said that. But you could look at your code and just think "if I got a memory dump tomorrow, what information would be most helpful to see what’s been going on and what led to the crash?".

  9. Peter Ritchie says:

    I think a walkthrough and/or a video of tracing/debugging a memory dump would be very helpful, covering both PDBs and utilizing global roots…

  10. Mihailik says:

    >> but if the thing you’re looking for is somewhere deeply nested, like in an element of a collection that’s a member of a structure somewhere on the call stack (…), it’s anything but easy.

    I think functionality and reliability are the first-class features of an app. And memory dump analysis is the lowest of priorities, honestly.

    >> And it often is impossible in the case of a stack overflow.

    That is the most edge case of edge cases. If an app hits a stack overflow, there is no excuse for keeping it in production as-is instead of investigating the problem armed with the complete range of debuggers.

    >> Having a global pointer to some key object (e.g. the request that’s currently being handled, which is of course set to null as soon as it’s completed)

    I wonder how you would enforce that nice ‘set to null as soon as it’s completed’.

    And also some silly inner voice suggests to me that you just completely missed the multithreading scenario.

    Imagine having 1 global reference to the ‘current’ request in a case where there are 14 worker threads simultaneously processing 14 requests. Even if something crashes, the only info you get is some random request, with a 1/14 chance that it is the request that induced the crash.

    I believe the idea of having redundant global references for the sake of debugging is either naive or crude evil.

  11. niki says:

    > I think functionality and reliability are the first-class features of an app. And memory dump analysis is the lowest of priorities, honestly.

    Reliability and the ability to find and fix bugs are really two sides of the same coin. So, if I ship a product now, I can either pack my bags right away and prepare for a flight to my customer as soon as they discover a bug, or I can plan a troubleshooting strategy before shipping.

    > That is the most edge case of edge cases. If an app hits a stack overflow, there is no excuse for keeping it in production as-is instead of investigating the problem armed with the complete range of debuggers.

    So essentially you’re saying "production software mustn’t contain bugs". I fully agree with this; it just isn’t the case in the real world. In the real world (at least where I work) production software tends to have a few bugs that haven’t been found during reviews or testing. Sometimes they can be reproduced in a debugging environment, but sometimes they can’t.

    > I believe the idea of having redundant global references for the sake of debugging is either naive or crude evil.

    It seems you’re trying to think only of naive or crude evil examples. Believe me, I know my application better than you do: I know when to store which global pointers and when to delete them. And I also happen to know how many threads are running in my app…
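
    For instance, with worker threads you wouldn’t use a single global slot at all. A rough sketch (C#; names invented): a [ThreadStatic] field gives every worker its own root, so a crashed thread’s dump shows the request *that* thread was handling, and no synchronization is needed because each thread only ever touches its own slot.

    static class PerThreadDebugRoots
    {
        [ThreadStatic]
        public static object CurrentRequest; // one slot per worker thread
    }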

  12. Mihailik says:

    >> Believe me, I know my application better

    And here we return to my initial statement.

    Having redundant global roots is an absolute hack. It can never be a general recommendation.

    >> And I also happen to know how many threads are running in my app…

    Then you know that setting/clearing a global field likely needs synchronization. And you know how synchronization tends to induce deadlocks and race conditions.

  13. Dakk says:

    Unfortunately I can’t ship what I debug and debug what I ship, because Visual Studio 2005 is incapable of debugging a managed C++ assembly compiled with Visual Studio 2002 — and no, I cannot upgrade that assembly at this time because it’s used by an add-in framework that must be compiled in VS2002.  It would have been nice to be able to debug it, because that would have helped me more quickly recognize that managed C++ sometimes breaks polymorphism due to changes in the compiler between versions.

  14. Dakk –

    Can you give a simple example of how MC++ breaks polymorphism?

    You can still debug:

    Your VS2002 MC++ app has to bind against a version of the runtime. So…

    1) If it binds against .Net 1.0/1.1, then you can use VS2002/VS2003 to mixed-mode debug it. Or you can use VS2005 to managed-only or native-only debug it.

    2) If it binds against .Net 2.0, then you can use VS2005 to debug it.

    The restriction is that you can’t use VS2005 to mixed-mode debug .Net 1.x apps.

  15. Steve Steiner says:

    For implementing ‘global pointers’ to key data structures it may be worthwhile to enforce some invariants: 1. They are weak references, and 2. They are write-only (accessible via a setter-only property on a singleton).

    I think those invariants address Oleg’s quite reasonable concerns with the idea.
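
    A rough sketch of what I mean (C#; the names are invented):

    public sealed class DebugAnchor
    {
        public static readonly DebugAnchor Instance = new DebugAnchor();
        private readonly WeakReference _lastRequest = new WeakReference(null);

        private DebugAnchor() { }

        // Write-only: product code records the object, and only a debugger
        // ever reads it. The WeakReference means the anchor never keeps the
        // object alive, so there is no GC leak to worry about.
        public object LastRequest
        {
            set { _lastRequest.Target = value; }
        }
    }

    Usage is just DebugAnchor.Instance.LastRequest = request; and you never have to clear it.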

  16. Dakk says:

    Mike,

    MC++ breaks polymorphism in some cases.  I ran into this on our project and I reproduced it in a small set of projects.

    Basically, create an MC++ assembly project in VS2002.  The assembly should have this code:

    public __gc class BaseClass {
    public:
        virtual void Foo(long var) {
            System::Console::WriteLine("    BaseClass :: Foo");
        }
    };

    Now create an MC++ assembly project in VS2005 and add this code:

    public ref class Subclass : BaseLibrary::BaseClass {
    public:
        virtual void Foo(long var) override {
            System::Console::WriteLine("BaseLibraryCPPExtension :: Subclass :: Foo");
            BaseClass::Foo(var);
        }
    };

    Now write a client exe in VS2005 that only references the base assembly written and compiled in 1.0.  The client exe should load the extension assembly written and compiled in 2.0, create an instance of the subclass, and call the virtual method:

    Assembly^ cppAsm = Assembly::Load("BaseLibraryCPPExtension");
    BaseLibrary::BaseClass^ cppObject = (BaseLibrary::BaseClass^)cppAsm->CreateInstance("BaseLibraryCPPExtension.Subclass");
    cppObject->Foo(1);

    This code will not call the overridden method in the MC++ 2.0 class, but only the base method.

    As far as I can tell it’s because the MC++ 1.0 compiler puts a different attribute on the parameters of the virtual function than the MC++ 2.0 compiler does.  Use ildasm to look at the function signatures in both assemblies to see this.  Notice that if you don’t use a "long" parameter this goes away and polymorphism works.

    Luckily for us we control all the assemblies I mentioned.  If instead the MC++ 1.0 assembly were supplied by a 3rd-party vendor, I don’t know how we would fix this problem.

    One final note: if you create a C# assembly in VS2005 that subclasses the base class and overrides the method, it will work.

    I really hope you are able to reproduce this.  I could send you my set of projects that reproduce it if it would help.

  17. Dakk says:

    Mike –

    "The restriction is that you can’t use VS2005 to mixed-mode debug .Net 1.x apps."

    What I’m seeing is that regardless of what version of the framework it binds against at runtime, I cannot debug MC++ code built in VS2002.  Is that what you mean by .Net 1.x apps?  Or do you mean I can’t use VS2005 to debug apps that bind against the 1.0/1.1 framework?  In that case it seems like I should be able to debug my MC++ assembly built in VS2002.

    I have an MC++ assembly written and compiled in VS2002.  It binds against the 2.0 framework at runtime.  I cannot step into or through any of the functions in the assembly compiled in VS2002.  I can, however, step into MC++ assemblies built in VS2005 and used in the same executable.  Yes, this is a mixed-mode application, but it binds against the 2.0 framework.

  18. I went through the repro.

    1. Re 9:09 comment: I *can repro* the broken polymorphism case regarding ‘long’ parameters. I agree this is wrong. At the very least, it should give a C4490 warning. I’m following up with the MC++ people about this.

    2. Re 9:20 comment: I *can successfully debug* here. VS2005 is successfully loading the 2002 code (using the binaries built by 2002, no recompiling), and I’m stepping through it, while interop-debugging.

    Note that I do get a DLL LoaderLock MDA from the 2002 component.

    Note:

    – I changed from Assembly.Load/CreateInstance to a direct binding and a gcnew call.

    – All code is compiled debuggable.

  19. Dakk says:

    Mike –

    Thank you very much for the followups.  I’m quite happy that someone else was able to reproduce the polymorphism bug.

    I’ll have to go back and try again to figure out how to debug the MC++ 1.0 assembly from VS2005.  This problem might go away for us though because we are converting the MC++ 1.0 assembly to build with VS2005 and writing a COM wrapper on top of it so we can use it (via the COM wrapper) from unmanaged C++ code that still must be compiled in VS2002.  We were trying to use the MC++ 1.0 assembly directly from that code but without success.