More on Perf and Productivity

Anthony Moore, Dev Lead for the BCL team, posted a great comment on Rico’s latest quiz… It was so good I could not let it be buried there, so I thought I’d re-post it here… I am sure it will generate some comments ;-)

I am a colleague of Rico, and I feel I should comment on these results. My team owns the RAD IO, String and number parsing routines compared here.

This significant performance difference might make you wonder if you should be using these APIs and whether there will always be such differences between roll-your-own solutions and general-purpose ones. Rico's conclusion points at some of the reasons to be careful here. I should also comment on some future plans and to what extent it will be possible to have the best of both worlds.

Some considerations in interpreting these numbers:

- A notable result is that a 50X increase in allocations results in only a 7X increase in execution time. The CLR’s GC is incredibly efficient at allocating and cleaning up small, short-lived objects.

- It is kind of apples and oranges to compare RAD Object Oriented API performance with C/C++ style coding using buffers. Environments like .NET, Java, Jscript, VB6 and Smalltalk all allocate a lot more objects than buffer-based C and C++. Most developers find that the productivity benefits outweigh the specific performance and memory costs most of the time.

- As the example shows, you sacrifice productivity, readability, correctness, features and flexibility by going down this path (see the sketch after this list). As an example, none of these routines are globalization-aware. As another, it would be expensive to switch to parsing types other than Integer if you stick with this approach.

- Why would we ship a product with such apparent performance problems? This example is contrived in that the parse is doing no actual work other than extracting values. In most real scenarios the “actual work” tends to outweigh overhead like this, and these APIs do not show up high in performance profiles. Another factor is that String and IO operations tend to be significantly faster than operations like database access and network access, and so they don’t often create performance bottlenecks.
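To make the trade-off concrete, here is a rough sketch of the kind of contrast being discussed (illustrative only, not the actual code from Rico’s quiz; the method names are made up). Both sum a comma-separated list of integers; the RAD version is short, clear and culture-aware, while the hand-rolled version allocates nothing but gives up all of that:

```csharp
// RAD version: Split allocates a string[] plus one sub-string per field,
// and Int32.Parse goes through the general, culture-aware parsing path.
static int SumRad(string line)
{
    int sum = 0;
    foreach (string field in line.Split(','))
    {
        sum += int.Parse(field);
    }
    return sum;
}

// Hand-rolled version: walks the characters directly and allocates nothing,
// but it only understands ASCII digits, ignores culture, does no error or
// overflow checking, and is locked to Int32.
static int SumHandRolled(string line)
{
    int sum = 0;
    int value = 0;
    for (int i = 0; i < line.Length; i++)
    {
        char c = line[i];
        if (c == ',')
        {
            sum += value;
            value = 0;
        }
        else
        {
            value = value * 10 + (c - '0');
        }
    }
    return sum + value;
}
```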

That being said, there may be cases where you do need to re-implement .NET Framework APIs to tailor them to a specific purpose that is a bottleneck for you. A key point is that we really want YOU to let us know when you need to do this, so we can allocate time to making the APIs faster for everyone in the future.

Future Plans

We would like to get to the point where we can provide RAD alternatives such that you could not easily re-implement a better one for performance reasons. Roughly speaking, we would like to provide APIs that perform within 2X of these hand-rolled, special-cased versions. Plans we have for future versions:

1) Improving the raw performance of integer parsing by special-casing the code, which currently goes through a code path shared with Double and Decimal parsing that is a lot more complicated.

2) Providing a way to split a String without needing to create any temporary objects. This would require introducing a new splitting API and some new abstractions.

3) Providing a way to parse the base data types from a sub-string, or from a part of a Char array (a rough sketch of the idea follows this list).
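To make item 3 a little more concrete, here is a hypothetical sketch of what parsing directly from part of a string could look like. The name ParseInt32 and the (string, startIndex, count) shape are purely illustrative and not a planned signature; the point is simply that no Substring call, and therefore no temporary string, is needed:

```csharp
// Hypothetical illustration only: parse a decimal integer from a range of an
// existing string, avoiding the Substring allocation that Int32.Parse
// currently requires. (No whitespace, culture or Int32.MinValue handling.)
static int ParseInt32(string s, int startIndex, int count)
{
    int end = startIndex + count;
    int i = startIndex;
    bool negative = false;

    if (i < end && (s[i] == '-' || s[i] == '+'))
    {
        negative = (s[i] == '-');
        i++;
    }

    int value = 0;
    for (; i < end; i++)
    {
        char c = s[i];
        if (c < '0' || c > '9')
        {
            throw new FormatException("Unexpected character.");
        }
        value = checked(value * 10 + (c - '0'));
    }

    return negative ? -value : value;
}
```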

However, we really need as much input from YOU as possible. Feedback in blog comments is fine, but the recommended way is to go through the official feedback mechanism:

https://lab.msdn.microsoft.com/productfeedback/

General Guidance

- Use most RAD APIs freely as a starting point, and optimize as the situation requires.

- Use some RAD APIs with care. It is easy to abuse an API like String.Split where you don’t have a need for all the intermediate strings, for example (see the sketch after this list).

- Just because you can make an API faster does not necessarily mean you should. It’s a cost/benefit situation.

- If you need to rewrite a standard API because it is a real performance bottleneck for you, please tell us so we can make the APIs better.
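As an illustration of the String.Split caution above (method names made up): if all you need is the first field of a delimited line, Split allocates every field as its own string even though most of them are thrown away, while IndexOf plus a single Substring does the same job with one allocation:

```csharp
// Allocates a string[] and one string per field, just to keep the first one.
static string FirstFieldWasteful(string line)
{
    return line.Split(',')[0];
}

// Allocates only the single sub-string that is actually needed.
static string FirstField(string line)
{
    int comma = line.IndexOf(',');
    return comma < 0 ? line : line.Substring(0, comma);
}
```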

Furthermore, I should probably go into more detail about how we approach API performance.

During initial development, we try to make a reasonable trade-off between correctness, performance, security and maintainability. Security is particularly relevant here, as there is zero tolerance for errors that could expose security holes in the platform. As an illustration of the trade-offs, we almost never use unsafe code during initial development, because the increased risk of introducing a security error is not usually justified by the small performance benefits you can get.

There are also interesting correctness, maintainability and code-size trade-offs. As an example, most number parsing and formatting code is shared for all numeric data types. We could make optimizations for simple cases of integer parsing or where the invariant culture is used, but this would likely require a significant amount of additional code.

Thus, it is not the case that we make every API absolutely as fast as we can make it. We still try hard to make them very fast. However, we focus our time and resources on making the most important APIs as fast as possible.

For APIs at the base of the platform, we rely on partner scenarios and customer scenarios to drive which APIs to optimize further. Thus, there are some APIs such as String.CompareOrdinal, the Hashtable indexer and StringBuilder.Append that were very highly optimized in V1.0 because of how frequently they appeared in profiling logs for key scenarios. Unlike some of the APIs mentioned in Rico’s example, it would be extremely difficult to create a roll-your-own version of these that would be much faster than the ones in the BCL. For some primitive operations like for-each on an Array and String.Length, it would be nearly impossible because they are special-cased by the JIT.
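For example, the JIT already recognizes the standard loop-over-an-array pattern below (and the equivalent code the compiler generates for foreach), so it can hoist the Length read and remove the per-element bounds checks; a hand-written replacement has essentially nothing left to beat. This is just the common pattern, shown as a sketch:

```csharp
// Because the loop bound is data.Length, the JIT can prove that i is always
// in range and eliminate the bounds check on data[i]; foreach over an array
// compiles to essentially the same code.
static int Sum(int[] data)
{
    int sum = 0;
    for (int i = 0; i < data.Length; i++)
    {
        sum += data[i];
    }
    return sum;
}
```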

In some cases the performance is a consequence of the API design. In this example, String.Split and Int32.Parse force you into creating sub-strings. That does not mean that there is anything wrong with these, as that is the simplest abstraction to deal with. However, if there is significant customer or partner demand, we may need to create optional versions of these APIs that are a little more complicated to use or have fewer features, but can go faster or eliminate the need for intermediate string objects. Having additional ways of doing the same thing makes the object model more complex and makes the libraries bigger, so you need to be really sure such new APIs are justified.

This is why your feedback on which APIs to duplicate or tune is very important. Please let us know:

https://lab.msdn.microsoft.com/productfeedback/