WS-Test - a controversial benchmark

So, it looks like Sun finally published the source code behind the WS-Test benchmark. (Disclaimer: what follows is my own personal position, which might not coincide with Microsoft's official position.)

For those not familiar with the controversy around WS-Test, here is a little history: on June 23 Sun published a benchmark (https://java.sun.com/performance/reference/whitepapers/WS_Test-1_0.pdf) comparing the web service performance of Java vs. .NET. The benchmark compared two similar implementations of a Web service, one running on Tomcat and the other on .NET 1.1/IIS. Sun did not release any source code, just a white paper showing some unverifiable performance numbers.

Since a benchmark is not very useful if it is not verifiable, Microsoft responded shortly afterwards with a white paper (https://www.middlewareresearch.com/whitepapers/Benchmark_response.pdf) and associated downloadable source code (at https://www.theserverside.net/articles/content/SunBenchmarkResponse/WSTest.zip). The funny thing is that the benchmark results Microsoft published show a completely different picture: while the original (non-verifiable) results in Sun's white paper showed the Java solution being faster, the Microsoft results showed exactly the opposite: .NET is significantly faster, especially as the payload grows in size. You can download the code and verify the results yourself.

So where is the truth? Fortunately, it looks like we are getting close to it, since Sun recently published a response (https://www.theserverside.com/news/thread.tss?thread_id=27801) and some associated source code (here: https://java.sun.com/developer/codesamples/webservices.html#Performance). Again they claim that Java is faster than .NET (more on that below), but there were a couple of subtle differences in their response this time:
1) Sun acknowledged that they re-ran the test with different settings than the original run (“We discovered that we did not increase the number of network connections on the .NET client side. We added the following to machine.config, and re-ran the tests using J2SE 5.0 Beta2 and found that the measured throughput and response times obtained are still superior to the .NET results”). But they did not mention any numbers this time... why? (A sketch of the kind of machine.config fragment they are probably referring to appears after this list.)
2) They did not deny the published numbers from Microsoft. Is this an implicit acknowledgement of our results?
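
Sun's response does not reproduce the actual machine.config fragment, but the standard way to raise the outbound connection limit for a .NET 1.1 client is the connectionManagement element. Something along these lines is probably what they mean (the maxconnection value here is just a placeholder of mine, not Sun's setting):

    <configuration>
      <system.net>
        <connectionManagement>
          <!-- raise the default limit of 2 concurrent HTTP connections per host -->
          <add address="*" maxconnection="48" />
        </connectionManagement>
      </system.net>
    </configuration>

Without a change like this, the .NET test driver is throttled to two concurrent connections per server, which alone can distort a throughput comparison.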

Anyway, a number of people quickly spotted some problems in Sun’s benchmark code. Here are some issues:
1) The benchmark code published by Sun is unfair, since the test driver for .NET is different from the test driver for Java. For the results to be relevant, you should compare the two configurations on the same hardware, exercised by the same test driver. This is a basic requirement for any benchmark that compares two solutions.
2) Another basic flaw: the .NET implementation provided by Sun doesn't use the same feature set on the two application servers. In the .NET version, the web server is using Windows Integrated authentication, while Tomcat (the Java version) is not doing any authentication at all! This is a pretty severe oversight: it clearly gives a performance hit to the .NET solution, and it makes the comparison useless because you are basically comparing apples with oranges. The correct configuration would rely on the same security settings for both application servers: either disable Windows Authentication in the .NET case, or enable an equivalent authentication setting in the Tomcat configuration (see the sketch after this list for what the .NET side might look like).
3) Some unneeded .NET HTTP modules are still present (OutputCache, Session, etc.). I am not sure whether this makes a measurable difference, but again, for a fair comparison you have to bring the two configurations as close as possible to the same set of features, even if the defaults happen to be different. The sketch after this list shows how those modules could be removed as well.
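
To make issues 2) and 3) concrete: on the .NET side both problems live in the ASP.NET configuration (disabling Windows authentication also requires allowing anonymous access in the IIS console, which a config file cannot show). A rough sketch of what an equalized web.config could look like, assuming the default module names from machine.config:

    <configuration>
      <system.web>
        <!-- match Tomcat's no-authentication setup instead of Windows Integrated authentication -->
        <authentication mode="None" />
        <httpModules>
          <!-- drop modules the benchmark service never exercises -->
          <remove name="Session" />
          <remove name="OutputCache" />
        </httpModules>
      </system.web>
    </configuration>

The alternative, of course, is to leave Windows authentication on and configure an equivalent authentication mechanism in Tomcat; either way, both sides should end up doing the same amount of work per request.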

So it looks like Sun has some homework to do. First, they need to fix the code to make the benchmark relevant. Then they have to publish both the updated code and the new results. You can't just publish results without the code, and then new code without the results.

Now, one more comment, just to clarify my own personal position. I don't really think that benchmarks are about politics. Nor can these benchmarks show an absolute superiority of .NET over Java or vice versa. In the real world, the truth is not that black-and-white.

But, ultimately, such benchmarks and procedures are intended to help you make a decision. So, even if this particular .NET solution consistently proves faster than its Java counterpart for this specific benchmark, that doesn't mean your homework is done. Assuming your (Java vs. .NET) decision is purely performance-driven, then to perform a true comparison you first have to develop a pilot benchmark for your specific project needs, and, more importantly, figure out what matters and what does not matter in your specific case. For example, you might care only about scalability as the number of clients grows, or you might just want to know how well your hardware is used (CPU load, memory, etc.). That said, such benchmarks can be a great starting point for developing your own pilot benchmark, and indirectly give you some assurance that you made the right decision.
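
To make that last point a bit more concrete, here is the kind of minimal, throwaway pilot driver I have in mind. Everything in it (the endpoint URL, the iteration counts) is a placeholder for your own scenario, and a real driver would issue the actual SOAP calls through generated stubs and run multiple client threads:

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class PilotBenchmark {
        // hypothetical endpoint - point this at the service you actually care about
        private static final String ENDPOINT = "http://localhost:8080/wstest/echo";
        private static final int WARMUP = 100;
        private static final int ITERATIONS = 1000;

        public static void main(String[] args) throws Exception {
            // warm up first so JIT compilation and server-side caches do not skew the numbers
            for (int i = 0; i < WARMUP; i++) {
                call();
            }
            long start = System.currentTimeMillis();
            for (int i = 0; i < ITERATIONS; i++) {
                call();
            }
            long elapsedMs = System.currentTimeMillis() - start;
            System.out.println("Requests/sec: " + (ITERATIONS * 1000.0 / elapsedMs));
            System.out.println("Avg latency (ms): " + ((double) elapsedMs / ITERATIONS));
        }

        private static void call() throws Exception {
            // issue one request and discard the response body
            HttpURLConnection conn = (HttpURLConnection) new URL(ENDPOINT).openConnection();
            conn.getInputStream().close();
        }
    }

The point is not the code itself but the habit: measure the operations and payload sizes your application will actually perform, on your own hardware, and watch CPU and memory on both the client and the server while you do it.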