Application Performance Tuning Philosophy

I’m about to start some investigation and discussion with folks about performance tuning EF apps (in particular I’m going to dive into a critical area or two on my D3 project).  The more I thought about it, the more convinced I became that a fair amount of background needs to be set out before diving into the heart of the matter.  At least for me, a number of things about perf tuning just weren’t obvious when I first began considering it.  To be successful at this endeavor, just as with any other, we need to really know what we’re about and understand something of the underlying ideas that lead to a workable approach.  So here’s my off-the-cuff description of the way I think about performance.

There Are Two Ways to Improve Performance

In my opinion, there are basically two different ways to actually improve the performance of an application (improving the performance of a framework or a server overlaps a lot but each has some unique challenges that I’m not going to consider today).  The first technique is all about your application design, and the second is a matter of tuning individual operations. 

Without a doubt the biggest wins come from making good design decisions up front.  What I’m talking about here are coarse-grained things that affect the entire way your app works.  Getting these right generally comes down to one of three things: understanding your problem space well through a lot of experience, being lucky, or writing your application, discovering what you got wrong and then completely rewriting large portions of it.  An example of this kind of issue is the difference between two client-server designs.  In the first, the client is essentially just a dumb terminal, and every single keystroke has to travel to the server to be handled.  In the second, a smarter client does a fair amount of display, validation and other logic locally, which gives great responsiveness, and then just sends larger batch operations to the server.  If you happen to choose the first design, you might discover huge perf problems, and if that happens it will be a major undertaking to redesign your application around the second.
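To make the difference between those two designs concrete, here’s a minimal sketch in Python.  Everything in it is made up for illustration: FakeServer, dumb_terminal and smart_client are hypothetical names, and the “server” just counts round trips instead of touching a network.

```python
class FakeServer:
    """Hypothetical stand-in for a remote server: it only counts round trips."""

    def __init__(self):
        self.round_trips = 0
        self.received = []

    def send(self, payload):
        # Each call models one full network round trip.
        self.round_trips += 1
        self.received.append(payload)


def dumb_terminal(server, text):
    # Chatty design: every keystroke travels to the server on its own.
    for keystroke in text:
        server.send(keystroke)


def smart_client(server, lines):
    # Batched design: validate locally (here, just drop blank lines),
    # then send everything to the server in a single operation.
    valid_lines = [line for line in lines if line.strip()]
    server.send(valid_lines)
```

Typing “hello world” through dumb_terminal costs one round trip per character, while smart_client sends a whole list of lines in one.  Moving from the first shape to the second changes what the client is responsible for, which is why it is a redesign and not a tweak.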

While the first approach is largely about forecasting where you are going and making sure you create an architecture that leads you down the right path for your largest decisions, the second approach looks at how to improve much finer-grained individual scenarios.  The techniques you use for this approach are quite different from those used for the first one.  In fact, if you try to think carefully and optimize your fine-grained operations up front in the way you do for the overall architecture, you will often introduce a ton of complexity, which makes your application much harder to develop, test and maintain.  Even after all that, experience shows that most often you will not have improved the performance in a way that has a meaningful positive effect on the end user experience.  You may actually have made it worse.

So while design is very important, tuning is also critical to get a great experience from your app.

Tuning is Experimental Science

While lots of what we do as software developers is very different from experimental science (resembling something closer to craftsmanship involving a mix of art & science, engineering & design), performance tuning done right is ALL SCIENCE.  We need to have an understanding of the underlying concepts, to have some intuition about what will and will not work, but just like an experimental scientist we have to train ourselves to thoroughly harness that intuition and slave it to a process that will lead us to results we can prove and depend on.

I like to tell developers on my team, “If you think you know what will improve the perf, you are almost certainly wrong.”  As developers we often have that feeling that we just know what is wrong with the code, and we want desperately to dive in and implement our fix.  In practice, though, we’re often wrong.  Oh sure, we will improve the performance of some part of the code, but unless we’ve done a very careful investigation first, we may have made an improvement in an area of the code that is already so fast it doesn’t matter, while ignoring or maybe even making worse the hot path (the slowest part right now, which has the most effect on the user experience).  Further, we have a tendency to take a shotgun approach and make a lot of changes before measuring again.  The result is that whether the perf gets better, gets worse or stays approximately the same, we have no way to know which of our changes is responsible.
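As a rough illustration of looking for the hot path before touching anything, here’s a small Python sketch using the standard cProfile and pstats modules.  The workload (parse_rows, score_rows, handle_request) is invented for the example, and profile_calls is a hypothetical helper, not part of any library.  It reads the profiler’s raw stats table, which maps (file, line, function name) to call counts and times.

```python
import cProfile
import pstats


def parse_rows(n):
    # Hypothetical cheap step.
    return [str(i) for i in range(n)]


def score_rows(rows):
    # Hypothetical expensive step: quadratic on purpose.
    total = 0
    for a in rows:
        for b in rows:
            total += len(a) + len(b)
    return total


def handle_request(n=200):
    return score_rows(parse_rows(n))


def profile_calls(func):
    """Run func under cProfile; return {function name: (calls, own seconds)}."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    report = {}
    # Each entry is keyed by (file, line, name) and holds
    # (primitive calls, total calls, own time, cumulative time, callers).
    for (_file, _line, name), (_cc, calls, own_time, _cum, _callers) in stats.stats.items():
        report[name] = (calls, own_time)
    return report


def top_by_own_time(report, count=3):
    # Sort function names by time spent in their own bodies, slowest first.
    return sorted(report, key=lambda name: report[name][1], reverse=True)[:count]
```

Calling top_by_own_time(profile_calls(handle_request)) gives a short list of where the time actually went.  The point is that the list comes from data, not from the feeling that we just know.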

The only real answer is to make a baseline measurement, look at the data, create a hypothesis and then make the smallest possible change to test that hypothesis.  Further, it’s super important that we carefully prioritize our experiments.  We want to go after the lowest-hanging fruit, that is, to tackle those problems which will have the biggest impact on the end user experience for the lowest possible price in development cost and final code complexity.  Obviously this means we tend to look for the hot path through the application as the first step, but we also have to apply a complexity filter: the hot path may be something that would be very hard to change, while there may be two or three other things that are almost as significant but much easier to change.  Just always remember: measure, measure, measure.  When we’re done we want to KNOW the app will be faster and that we’ve gotten there as quickly and easily as possible.
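Here’s one way that loop might look in Python, as a sketch under some loud assumptions.  The two build_report functions are an invented baseline and an invented single change, best_time is a hypothetical helper built on the standard timeit module, and the numbers fed to rank_by_payoff are placeholders you would replace with your own measurements and estimates.

```python
import timeit


def build_report_baseline(items):
    # Baseline: repeated string concatenation.
    out = ""
    for item in items:
        out += str(item) + ","
    return out


def build_report_candidate(items):
    # Hypothesis under test: one join beats repeated concatenation.
    return "".join(str(item) + "," for item in items)


def best_time(func, arg, repeats=5, number=10):
    # Take the best of several runs to reduce the effect of noise.
    return min(timeit.repeat(lambda: func(arg), repeat=repeats, number=number))


def run_experiment(size=2000):
    items = list(range(size))
    # The change must not alter behavior, so check that before timing.
    assert build_report_baseline(items) == build_report_candidate(items)
    baseline = best_time(build_report_baseline, items)
    candidate = best_time(build_report_candidate, items)
    return baseline, candidate


def rank_by_payoff(candidates):
    """Order (name, seconds_saved, cost_in_days) by seconds saved per day of work."""
    return sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
```

I deliberately don’t claim which version of build_report wins; that is exactly what run_experiment is for, and the answer can differ between runtimes.  For the complexity filter, rank_by_payoff([("rewrite hot path", 4.0, 20), ("cache lookups", 3.0, 2), ("batch writes", 2.5, 5)]) puts the two cheaper fixes ahead of the hot path rewrite, even though the rewrite would save the most time on its own.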

Tuning Improvements Tend to be the Enemy of Clean Code

To add further strength to the point that we want to make the cheapest possible changes, I’ll point out that top-level, architecture-style perf improvements can often make your application cleaner and easier to understand and work with at the same time that they improve performance, while the kinds of changes you make when tuning tend to have the opposite effect.  So you want to start with the simplest, cleanest possible implementation for every part of your app.  Remember: you never know in advance where the tuning will really be needed, so wait until you can measure and find out where you have to introduce the hacks rather than putting them in up front.

One example of this principle comes to mind from when I worked on Excel a number of years ago.  Excel is a product that has been around for a LONG time in software years, and it’s been through a lot of changes.  By the time I worked on it there was already a lot of legacy.  One aspect of this legacy was a perf optimization that had been put into the product back when it ran on MUCH slower systems than we have today.  There was code that interpreted the core data structures representing cells in the spreadsheet.  The structure itself was highly optimized to make cells take as little memory as possible, with the result that the bytes in memory could have one of several meanings, so the code would look at a few bits up front to figure out how to interpret the rest.  Not only was this data structure very complicated, but to make matters worse the code for interpreting it was repeated all over the code base.  The team went to great lengths to make sure it stayed the same everywhere, but nevertheless it was repeated.

They had done this for very good reasons.  At the time it was first done that way, the cost of calling a single central routine to interpret the structure from all the many places that needed to interact with it was just too high.  If the code had to make that jump and return every time, recalcs would take forever.  The problem was that by the time I was working on the project, there really wasn’t anyone left on the team who knew every place that might have a copy of that code.  When we wanted to introduce a change to the data structure, the cost of hunting down every copy, fixing it and then testing the product like maniacs to make sure we hadn’t missed anything was astronomical.

At about the same time, a related small project rewrote the core spreadsheet engine from scratch as a small component designed for plugging into web pages.  Since it was written from scratch in an era when computers had sped up considerably, this little component was able to centralize its code for the same task.  That made it much easier to change, with the result that for several years it had more features in some areas than its big brother Excel, until Excel could finally be updated.
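To give a feel for the kind of structure in this story, here’s a toy sketch in Python.  It is NOT Excel’s actual cell format, which I’m not reproducing here; the 2-bit tag, the 30-bit payload and all of the names are assumptions invented for illustration.  What it shows is the clean version of the design: the bit-level knowledge lives in one encode routine and one decode routine instead of being copied to every caller.

```python
# Hypothetical layout: top 2 bits say how to read the low 30 bits.
PAYLOAD_BITS = 30
PAYLOAD_MASK = (1 << PAYLOAD_BITS) - 1

TAG_EMPTY, TAG_INT, TAG_BOOL, TAG_STRING = 0, 1, 2, 3


def encode_cell(value, strings):
    """Pack a cell value into one 32-bit word; strings go in a side table."""
    if value is None:
        return TAG_EMPTY << PAYLOAD_BITS
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return (TAG_BOOL << PAYLOAD_BITS) | int(value)
    if isinstance(value, int):
        if not 0 <= value <= PAYLOAD_MASK:
            raise ValueError("integer does not fit in the payload")
        return (TAG_INT << PAYLOAD_BITS) | value
    if isinstance(value, str):
        strings.append(value)
        return (TAG_STRING << PAYLOAD_BITS) | (len(strings) - 1)
    raise TypeError("unsupported cell value")


def decode_cell(word, strings):
    """The single central place that knows how to interpret a packed cell."""
    tag = word >> PAYLOAD_BITS
    payload = word & PAYLOAD_MASK
    if tag == TAG_EMPTY:
        return None
    if tag == TAG_INT:
        return payload
    if tag == TAG_BOOL:
        return bool(payload)
    return strings[payload]  # TAG_STRING: payload is an index into the table
```

If the layout ever changes, only these two functions change.  The price is a function call on every read, which is the cost the original team couldn’t afford on the hardware of the day and the later component could.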

The moral of the story? Start clean, measure carefully, tune in biggest-bang-for-the-buck order.  Stop tuning as soon as you can.  (Which implies that you need to know just how fast is fast enough.)

- Danny