The Performance War — Win it 5% at a time

If it feels like getting good performance out of your application/library/service/whatever is more like “trench warfare” than it is like “shock and awe” then you’re probably doing something right.

The trouble with performance work is that the easy work gets all the press.  Well, sort of.  Let me explain. 

Suppose you’re an elite performance engineer. Sometimes a big ugly nasty problem gets dropped in your lap. When that happens you roll up your sleeves, apply some good techniques, and maybe you net a nice fat win for your clients.  Everyone cheers and they all rush to the store to buy a superhero cape just like the one you wear.  Children ask for your autograph.  It’s good times for everyone.

OK well, maybe I’m getting a little carried away.

The thing is, when something like the above happens you probably shouldn’t be celebrating at all.  Giant performance wins that could not be achieved without the help of a superhero are usually a sign that something has gone terribly wrong in the design process.  Perhaps the team in question left their performance work until too late in the cycle.  Perhaps they had some basic flaws that had to be remedied, flaws that never should have crept into the codebase in the first place.  But the fact that some “superhero” was able to come along and significantly fix them points more towards mistakes in the original code than it does to any greatness on the part of the hero.

Still, sometimes that’s the gig and so you do it.

Now the “real” performance work, the stuff you should be proud of because it’s hard, is much less glamorous.  Basically it happens by carefully understanding your whole process, avoiding any big problems (so that they never require a superhero to come along and fix them), and working steadily on the most important areas.

In a mature product with a healthy process you’re much more likely to see a 50% gain come in the form of many 5% gains compounding to get to your goal via sustained effort and quality control.  Those wins are largely unsung but they are the hard ones.  They are the wins that give you headroom for your new features and convert your older features from sluggish to snappy.  Every one of them is hard work.
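To make the compounding concrete, here is a rough sketch of the arithmetic. It assumes each fix trims 5% off whatever run time remains at that point; the function names are made up for illustration. Under that assumption it takes fourteen such fixes, not ten, to cut total time in half.

```python
import math


def steps_needed(per_step_gain: float, target_gain: float) -> int:
    """Smallest number of compounding fixes, each removing `per_step_gain`
    of the remaining run time, needed to remove `target_gain` overall."""
    remaining_per_step = 1.0 - per_step_gain   # e.g. 0.95 of the time is left
    remaining_target = 1.0 - target_gain       # e.g. 0.50 of the time is left
    return math.ceil(math.log(remaining_target) / math.log(remaining_per_step))


def total_reduction(per_step_gain: float, steps: int) -> float:
    """Overall fraction of run time removed after `steps` compounding fixes."""
    return 1.0 - (1.0 - per_step_gain) ** steps


print(steps_needed(0.05, 0.50))              # 14
print(round(total_reduction(0.05, 10), 3))   # 0.401
```

Ten 5% wins in a row get you to roughly 40%, which is part of why the sustained effort matters: the last stretch to 50% costs four more wins that are each just as hard as the first.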

The tragedy is that more care and effort often goes into any one of those fixes than into one superhero action, but the capes get all the good press.

Huge instant performance wins are more often a sign of problems than they are of greatness.

Comments (12)

  1. BlackTigerX says:

    great post, I couldn’t agree more

  2. Cheong says:

    Sure. But management simply can’t understand.

    They want good improvement figures. If you’ve persuaded your boss to give you time for profiling, they’ll surely be much happier to see a 50% improvement than a 5% one.

  3. Andrew D. says:

    That is a great post! Very interesting! I hope I will some day be able to perform both the "Superhero" type and the "real" performance work, as appropriate/necessary. 😉


  4. oastorga says:

    So to make managers happy, you create a bad design and then later you fix it. Maybe you’ll get even 90% performance improvement. 🙂 The truth is that educating managers is also part of the process. I think they can understand (it’s difficult, I know), maybe we are using the wrong approach. There are some obstinate individuals out there I must admit. How about this, let’s try to make them think it’s their idea…. keep your spirits up…

  5. I don’t know. I would love to have a project where I have to fight for each 5% of performance.

    But mostly I just have to explain to people what a StringBuilder is for, and find O(N^2) parts and replace them with O(N*log(N)) algorithms.

    The most complex part is to get the people to recognize that their design is broken and to make them change it without insulting them.
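As one hypothetical instance of the kind of fix this comment describes (the duplicate-check task and both function names are invented for illustration, not taken from the commenter), a nested-loop O(N^2) check can be swapped for a sort-based O(N*log(N)) one:

```python
def has_duplicates_quadratic(items):
    """O(N^2): compare every pair of elements."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_sorted(items):
    """O(N*log(N)): sort a copy, then equal elements must be neighbours."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


sample = [7, 3, 9, 3, 1]
print(has_duplicates_quadratic(sample), has_duplicates_sorted(sample))  # True True
```

Both versions give the same answers; only the growth rate changes, which is exactly the sort of win that looks heroic but really just undoes an avoidable mistake.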

  6. Norman Diamond says:

    Here’s something where 5% won’t cut it: Windows Vista beta 1 checked build.

    I’m installing it on a Pentium III 600 MHz with 320 MB of RAM and 17 GB of available hard disk space, because the machine is available for use as a crash box. mscorsvw.exe is taking more than 90% of the CPU time, so we know that RAM isn’t a problem. The total commit charge is around 580 MB but paging isn’t a problem. If the machine were doing a lot of paging then the program would spend most of its time waiting and CPU usage would be down around 3%. Also if the machine accessed the hard disk a lot then I’d see the activity icon in the LCD a lot more often than I’m seeing it.

    So mscorsvw.exe is really doing stuff during this install, but what? Let’s assume there are lots of cache misses. The memory bus is probably 100 MHz and the RAM probably responds in 2 clock cycles. So if the CPU is spending most of its time waiting for RAM to respond, then there have probably been 150 trillion RAM accesses since the install started.

    Want to compute that or should I give a spoiler? Yes I do wonder how many months it’s going to take for Vista beta 1 checked build to finish installing itself. "Do not restart your PC during this time" for how many months?

    And after it finishes, will it be usable? After some number of weeks without being activated, it won’t allow logins, right? Without logging in, it won’t be possible to configure the network card to use a fixed IP address[*], so I won’t be able to activate it over the internet. And guess which company is refusing to activate Vista betas by phone.

    [* Of course in the retail build, which installed successfully, it’s also hard enough to configure the network card to use a fixed IP address. The settings dialogs go through the motions but the OS keeps trying to get an address from DHCP.]

  7. Travis Owens says:

    Of course if that’s 5% bursts across multiple changes and there’s a 3 year pause between releases, 50% is possible.

    I mean look at some of the XML bragging in .Net 2.0, where somebody blogging here on MSDN showed improvements as large as 400%.

    And just recently I came across a chart showing many boosts in the .Net Compact Framework v 2.0; one improvement was a whopping 800%.

  8. Travis Owens says:

    Here’s one site showing the massive XML speed improvements in .Net 2.0

    Of course looking back at old code, and having more optimized code, it’s easy to see myself calling that old code "a problem".

  9. A while ago I wrote about how you often win the performance war 5% at a time.  The theme of that…

  10. I remember Raymond Chen gave a talk at the last PDC and had this fun quote:

    "One of the questions…

  11. Last week I described an optimization that helps Paint.NET’s startup performance by avoiding our…

  12. The thing about performance work is that it’s very easy to be fooled into looking into the wrong areas.