Hiding Complexity


In a comment on yesterday's post, Manip asked:

 Larry: You said Macros work to hide the complexity and say so like it is a bad thing.. ? Excuse me but I thought that was the POINT of using a Macro..

Actually, in the world in which I live (writing systems programs that exist in the working set of hundreds of millions of users), hiding complexity is a very, very bad thing.

You need to be VERY careful whenever you do something that hides complexity, because it's likely to come back and bite you on the behind.  I wrote up this story back in April, but it bears repeating:

Way back in the days of MS-DOS 4.0, I was working on the DOS 4 BIOS, and one of the developers who was working on the BIOS before me had defined a couple of REALLY useful macros to manage critical sections.  You could say ENTER_CRITICAL_SECTION(criticalsectionvariable) and LEAVE_CRITICAL_SECTION(criticalsectionvariable) and it would do just what you wanted.

At one point, Gordon Letwin became concerned about the size of the BIOS - it was something like 20K, and he didn't understand why it would be so large.  So he started looking.  And he noticed these two macros.  What wasn't obvious from the macro usage was that each of those macros generated about 20 or 30 bytes of code.  He changed the macros from expanding their code inline to calling out-of-line functions and saved something like 4K of code.  When you're running on DOS, this was a HUGE savings.
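
To make the difference concrete, here's a hedged sketch of the two shapes such a macro can take. The names and bodies below are invented, not the real DOS 4 code; only the effect on code size is the point.

    // Hypothetical sketch - not the real DOS 4 BIOS macros.
    struct CriticalSection { int owner; int count; };

    int  current_task();          // assumed helpers, defined elsewhere
    void disable_interrupts();
    void enable_interrupts();

    // Version 1: the macro expands its whole body at EVERY call site.
    // A few hundred uses means a few hundred copies of these instructions.
    #define ENTER_CRITICAL_SECTION_INLINE(cs) \
        do {                                  \
            disable_interrupts();             \
            (cs).owner = current_task();      \
            (cs).count += 1;                  \
            enable_interrupts();              \
        } while (0)

    // Version 2: the body lives in one out-of-line function, so each use of
    // the macro costs only a call instruction and the code exists exactly once.
    void EnterCriticalSectionImpl(CriticalSection* cs)
    {
        disable_interrupts();
        cs->owner = current_task();
        cs->count += 1;
        enable_interrupts();
    }
    #define ENTER_CRITICAL_SECTION(cs) EnterCriticalSectionImpl(&(cs))

Both versions read identically at the call site, which is exactly why nobody noticed the difference until someone looked at the generated code.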


Because the macro hid complexity, we didn't realize that we'd hidden a huge size problem in one line of source code.  When you're dealing with an OS that runs on machines with 256K of RAM, that hidden complexity is a big issue.

The usual reaction people have to this story is "What's the big deal?  That was 20 years ago, you idiot; machines now come with gigabytes of RAM, nobody cares about that stuff."

But they're wrong.  The same issue shows up even on today's machines; it just shows up differently.  On today's machines, the bottleneck typically isn't the amount of physical RAM on the machine; instead, it's the time it takes to read the code image from disk.

The thing is that the speed of RAM is limited by the speed of light, and the laws of quantum physics, but the speed of a hard disk is limited by the physics of large objects.  As a result, reading data from RAM is blindingly fast compared to reading data from disk (no duh). 

Every page that's faulted into memory while loading your application adds between 10 and 50 milliseconds (or more, if the machine's got a slow or fragmented hard disk) to the boot time of the system. The larger your code, the more time it'll take to load it into RAM. 
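
To put rough numbers on it: if your code drags in an extra hundred pages, at (say) 25 milliseconds per fault, that's about two and a half seconds added before the user can get anything done.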

If the user can't use the system until your code's been loaded into RAM, then you just kept them from doing their work.  And that's the world in which I live - the system can't log the user on until the audio subsystem's loaded (at a minimum to play the logon chime), so it's really important that my code come in as quickly as possible.

We have teams of engineers in Microsoft whose sole job is to profile the disk boot process of Windows - they investigate literally every read from the disk that occurs during the boot process to ensure that it's necessary.  If a component's code is taking "too many" disk reads to load, then these guys will come down like a ton of bricks on the poor developer responsible for that code.

So it's CRITICAL that I know where all the code in my application is going, and what it's doing.  If I've hidden complexity in a macro, or a templated function, then I may have many pages of code hidden behind a single line of source.

Comments (23)

  1. Anonymous says:

    This is also why choosing "Optimize for Speed" may be less helpful than choosing "Optimize for Size." Saving 50-100 cycles here and there is helpful, but saving 50-100ms at startup is even better.
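
    (In Visual C++ terms, for example, that's the difference between /O2, maximize speed, and /O1, minimize size.)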

  2. Anonymous says:

    Then sometimes you get disgruntled employees that do this with macros:

    #define printf if((rand() % 2000) == 0) exit(0);

  3. Anonymous says:

    As I remember reading somewhere, not only does "Optimize for Size" speed up the initial from-disk load, but it also reduces the number of memory pages your code is spread across, thus reducing the page-in/out delays.

    It’s what I usually use.

  4. Anonymous says:

    Another reason to hate complexity, of course, is that complex code requires complex coders and complex documentation. Even then, it often hides problems. I remember a bug in an NDIS driver I wrote once that resulted from the fact that the "function" I was calling was really a macro with side effects for which I was unprepared.

    Complexity is a killer for a dev team >= 2 people.

  5. Anonymous says:

    Larry, thanks for the answer. I see what you mean, I am not going to stop using Macros in my code but I might give them each individually some more thought (as opposed to using a procedure).

  6. Anonymous says:

    Larry,

    Sorry, but I can't (completely) agree with you.

    I am an old school developer who started writing 8080 assembly code in the early '80s and moved on to C about the time of DOS 3.

    There is a clear limit on the complexity of a program you can write while completely in-lining your code. At some point you have to structure it, by means of macros, subroutines or (gulp!) objects. By doing this, you are hiding complexity.

    In your DOS 4 BIOS example, the hiding was moved from the macro to a subroutine, and the possibility of something going wrong due to not understanding what is inside it is still there.

    I’ve also been burned a lot of times by code that was "hidden" in a macro, a subroutine, a library or the OS.

    The solution? Better documentation (the free software folks will also say access to the source, but I will take good docs over source code anytime).

    And yes, Steve, critical code should be written by the best coders and get the best documentation.

  7. Anonymous says:

    Daniel,

    The complexity I’m talking about is what happens when you see:

    ENTER_CRITICAL_SECTION(CritSec);

    When you look at the source code, you see one line.

    When you look at the assembly language, you see a kilobyte of code.

    As code moves from one developer to another, this problem gets worse – the next developer doesn’t realize the costs of calling the function, so they blindly use it.

    This happens with more than just macros. How much code is added to your application when you add:

    vector<myType> myVector

    to your code?

    When you add:

    vector<myOtherType> myOtherVector

    you bring in roughly the same amount of code all over again, just by adding a single line in the source.
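
    A minimal sketch of what I mean (myType and myOtherType are just placeholder names): each distinct instantiation carries its own compiled copy of push_back, the growth logic, the destructor loop and so on.

        #include <vector>

        struct myType      { int    a; };
        struct myOtherType { double b; };

        int main()
        {
            std::vector<myType>      myVector;         // instantiates vector<myType>
            std::vector<myOtherType> myOtherVector;    // a second, separate instantiation

            myVector.push_back(myType{1});             // code generated once for myType...
            myOtherVector.push_back(myOtherType{2.0}); // ...and again for myOtherType
        }

    One way to see the cost is to build with a linker map file (for example /MAP with the Microsoft linker) and look at how many symbols each instantiation drags in.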

  8. Anonymous says:

    You also see this problem with a lot of the new complex compiler constructs and managed data in C++/STL. Just because it takes one line of code to write doesn’t mean it is fast to run.

    I once saw a routine that tried to count the number of words in a string (let’s ignore the complexity of doing this for anything other than U.S. English). They used what they were used to using, a managed string class such as CString (in this case it was Borland’s AnsiString). The algorithm worked by chopping the whitespace and words off the front of the string until the string became empty. In other words, for every word in the string, they hit the memory manager four times (two allocations and two frees).

    Even though the routine looked simple, it was a memory and CPU hog.
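
    A hedged reconstruction of the shape of that routine, with std::string standing in for AnsiString - the names are invented; only the allocation pattern matters:

        #include <cctype>
        #include <cstddef>
        #include <string>

        // The "simple looking" version: every chop builds a brand-new string,
        // so each word costs roughly two allocations and two frees.
        int CountWordsByChopping(std::string s)
        {
            int words = 0;
            while (!s.empty()) {
                std::size_t start = 0;              // strip leading whitespace...
                while (start < s.size() && std::isspace(static_cast<unsigned char>(s[start])))
                    ++start;
                s = s.substr(start);                // ...one allocation, one free
                if (s.empty())
                    break;
                std::size_t end = 0;                // chop the word off the front...
                while (end < s.size() && !std::isspace(static_cast<unsigned char>(s[end])))
                    ++end;
                ++words;
                s = s.substr(end);                  // ...another allocation and free
            }
            return words;
        }

        // The same job done in place: one pass over the characters, no allocations.
        int CountWordsInPlace(const std::string& s)
        {
            int words = 0;
            bool inWord = false;
            for (char c : s) {
                bool space = std::isspace(static_cast<unsigned char>(c)) != 0;
                if (!space && !inWord)
                    ++words;
                inWord = !space;
            }
            return words;
        }

    Same answer from both, but the first one pounds on the allocator for every word while the second never touches it.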

  9. Anonymous says:

    On the bloat of std::vector:

    I switched from std::vector to a home-grown array where I didn’t need all the "power". For every unique template instance of the vector that was replaced, the application dropped by 3-4k in size. For a WTL application that was only 200k in size, the savings were huge.

  10. Anonymous says:

    My point is that hiding complexity is a must for big projects. Hiding complexity is a Good Thing, and is the base for structured and object oriented programming.

    The alternative to hiding complexity is to expose it at every step, by not using macros, templates, function calls or objects. The resulting program is a nightmare, because you're trying to take everything into account at the same time.

    Of course you will have to do it in Assembly, because the compiler is hiding the complexity of the instruction set and memory allocation 😉

    I find it very useful in programming that I studied engineering and have some notion of what happens inside a CPU, but most of the time I am thinking well above this level.

    Daniel

  11. Anonymous says:

    Isn’t it really a trade-off (as with many things)? The macros may have been calls to functions instead of inline code. In fact, I wouldn’t be surprised if the fix for the code size problem was to change the macro definitions to function calls instead of changing each instance of the macro use.

    Virtually every choice in programming is a tradeoff between complexity, size, and speed.

    C hides the complexity of assembly. The CLR hides the complexity of managing memory.

    Templates hide the complexity of… actually, a lot of the template usage I’ve seen seems to increase complexity (at least in the implementation of the templated classes).

    Anyway, one of the key jobs of a developer is to manage these kinds of tradeoffs properly.

  12. Anonymous says:

    The issue is how well you can hide the complexity.

    For example, CreateFile() works well enough that generally you aren’t concerned with the reality behind the scenes. It’s a fairly solid abstraction to work from.

    My team uses a set of macros for error handling and we did (IMO) a good job on building a solid, non-leaky abstraction. We pay careful attention to the size and quality of the code generated and the simplicity and directness of the code is easily worth it.

    Macros vs. functions vs. templates are really no different here – they all hide concepts and complexity. You always have to be careful about your dependencies and who you use, and make sure that you use things in the way that they were intended to be used and that the things you use want to support you. (They’re not out there from the goodness of their hearts; they’re trying to solve problems and maybe make money, so as usual, all engineering decisions are in an economic context, not an academic one.)

    Re: size/speed:

    It’s funny because the compiler folks think that compiling for size is stupid, but we still build almost all of Windows for size because (a) the on-disk footprint costs and (b) except for some rare code paths like rendering or the I/O path in servers, we’re usually more memory-constrained than CPU-constrained. It really doesn’t matter if the icon will draw in 3% less time if we had to take a page fault to get to the code to draw it.

  13. Anonymous says:

    If you use a template that never touches the template type, but always uses pointers or references to it, the amount of code produced can be reduced considerably; the compiler can use the same code for a pointer, regardless of type, in many cases.

    Take a linked list: each link needs its data and a pointer to the next link. If you use a pointer for the data instead of storing it directly, the code implementation for manipulating links can be the same. It’s still a pointer even if it’s a pointer to an int or to a MyClass. The compiler will realize this and merge equivalent implementations into a single code output. The type safety of a template without code bloat.
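
    A rough sketch of the idea (not code from any real library): put the untyped work in one non-template class and make the template a thin layer of casts, so every instantiation shares the same generated code.

        #include <cstddef>
        #include <vector>

        // Compiled exactly once, no matter how many element types are used.
        class PtrListImpl {
        public:
            void push(void* p)            { items_.push_back(p); }
            void* at(std::size_t i) const { return items_[i]; }
            std::size_t size() const      { return items_.size(); }
        private:
            std::vector<void*> items_;
        };

        // The template is only a typed veneer; the casts generate no code, so
        // PtrList<int> and PtrList<MyClass> boil down to the same PtrListImpl.
        template <typename T>
        class PtrList {
        public:
            void push(T* p)            { impl_.push(p); }
            T* at(std::size_t i) const { return static_cast<T*>(impl_.at(i)); }
            std::size_t size() const   { return impl_.size(); }
        private:
            PtrListImpl impl_;
        };

    You keep the type safety of the template without duplicating the container code for every element type.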

  14. Anonymous says:

    The problem with this, Foolhardy, is that you lose RAII for those templates – and someone has to allocate and free that memory, so you trade heap fragmentation for code size. I’m not sure which is better.

  15. Anonymous says:

    I think one of the points here has been that between two ways of hiding complexity, macros vs. function calls, macros are more dangerous. If a macro is nontrivial then it uses enormous amounts of memory (per occurrence) instead of just enormous amounts of time.

    If a macro calls a function and only adds a trivial amount of additional wrapping, then the main cost is in the function, the same as it would be if a macro were not involved. Then every call still takes the amount of CPU time that the function needs, but the function only occupies its allotment of memory once.

    As for whether abstractions and complexity hiding are good or bad, well they’ve been added to every language that was ever invented. Macros were invented for assembly language before they were invented for C. Fortran had ordinary subroutines and functions and also had statement functions. Lisp had function definitions and recursion. It is impossible to write a nontrivial program without them.

    Badness comes about when a program calls overly expensive versions when cheaper versions would have sufficed. The abstraction might be partly to blame, but it’s still virtually impossible to live without the abstraction.

    Badness also comes about even when a program uses cheap operations if it uses orders of magnitude more of them than necessary. For example someone thinks sorting is expensive so they write their own sort routine without calling any subroutines, and they write an order n-squared sort. The abstraction is not to blame in these cases.

  16. Anonymous says:

    In Principio » hiding complexity is a very, very bad thing

  17. Anonymous says:

    By the way, 12/2/2004 11:16 AM Michael Grier [MSFT]

    > It’s funny because the compiler folks think

    > that compiling for size is stupid

    30 years ago it actually was, and the problem is that compiler books haven’t been updated adequately since then.

    30 years ago, although there were some machines with virtual memory, they were still in the minority and they weren’t the most powerful machines. It was understandable that speed was more important than size in those days.

  18. Anonymous says:

    Norman,

    25 years ago, compiling for size WAS compiling for speed. The memory on systems was so slow that the fetch time for reading from RAM dominated the time to execute the instructions (for all instructions except DIV and MUL).

  19. Anonymous says:

    12/6/2004 6:50 AM Larry Osterman

    > 25 years ago, compiling for size WAS

    > compiling for speed.

    Was not.

    In systems without virtual memory 25 or 30 years ago, compiling for speed meant things like inlining instead of calling functions (in the generated object code), and unrolling loops, and all kinds of things to reduce the number of instructions executed.

    Compiling for size became important only when hard limits on memory size were being hit. This did happen sometimes, even in embedded systems. Usually in embedded systems you want to optimize for speed, but if you hit the limit of your address space or the affordable limit on installed RAM or ROM, and if there’s no such thing as disk drives or other places to swap out some of your data, then you have to start optimizing for size. Then you want to choose functions that are comparatively large and called comparatively few times and decide not to inline those, and choose which loops not to unroll, etc. Or do humongous amounts of work to operate on hash tables and figure out some hashing functions that will avoid using too much memory, instead of simple direct array accesses.

    There were always tradeoffs between space and time. Compiling for size was never the same as compiling for speed until virtual memory and page faults took over.

  20. Anonymous says:

    Norman,

    That might be the case on big iron machines, but on PCs, the memory bandwidth far outweighed the processor speed.

    Interestingly enough, we’ve come full circle – these days, memory bandwidth of the system is usually the bottleneck in big systems.

  21. Anonymous says:

    Programming embedded systems using Intel’s compilers and assemblers, including one of the same processors that some PCs used, compiling for size and compiling for speed were still opposites. Programming embedded systems using other microprocessors’ tools was the same. The problem was the same as on big iron. It didn’t change until virtual memory and page swapping took over, and then the change affected both PCs and big iron. Mostly it hasn’t affected embedded systems (though the microprocessor’s on-chip cache can cause part of the same problem in an even more complicated manner).

    > these days, memory bandwidth of the system

    > is usually the bottleneck in big systems.

    And in PCs, since memory is big enough for it now. The next release of Windows will take care of that though, won’t it? It’ll need paging even when we have 8GB of real RAM, right?
