One of the most popular .NET Compact Framework demos at MEDC 2006 was the .NET Compact Framework Remote Performance Monitor. While working at our booth on the expo floor, I used the Remote Performance Monitor to examine the performance of a simple application that I had written on the booth PC.
If you had the opportunity to attend Ryan Chapman's MEDC session on optimizing for performance, much of what I talk about today will be very familiar.
The application that I wrote was intentionally ill-performing -- in a tight loop, it created a strongly typed collection of integers and copied the collection's data into a collection that was not strongly typed (an ArrayList). Because integers are value types, I knew that boxing would be an issue with this application. What I didn't realize was just how much of an issue boxing could prove to be.
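As a reminder of what boxing means here, the following is a minimal sketch rather than code from the demo. The class and method names (`BoxingDemo`, `BoxesAreDistinct`) are made up for illustration. It shows that assigning an integer to an `object`, which is what `ArrayList.Add` does implicitly, produces a separate heap object each time, while `List<int>.Add` stores the value directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Hypothetical helper: returns true when two boxes of the same
    // value turn out to be two distinct heap objects.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;   // boxing: the value is copied into a new heap object
        object second = value;  // a second, separate box of the same value
        return !Object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        int n = 42;
        Console.WriteLine(BoxesAreDistinct(n));

        ArrayList untyped = new ArrayList();
        untyped.Add(n);              // Add(object): n is boxed here
        int back = (int)untyped[0];  // unboxing copies the value back out

        List<int> typed = new List<int>();
        typed.Add(n);                // Add(int): no box is created
        Console.WriteLine(back == typed[0]);
    }
}
```

Every box is a short-lived object that the garbage collector eventually has to reclaim, which is why the copy loop in this post generates so much collection work.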
Please note: What follows is a very contrived example intended to show a "worst case" scenario related to the boxing of value types. When designing your application, it is very important to measure performance of algorithms early and often. If your algorithm meets your performance goals, there is no need to dwell on worst case micro-benchmarks.
Example of extreme boxing
The snippet below is based on the example I demonstrated in our booth at MEDC.
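    List<Int32> numberList = new List<Int32>();
    for(Int32 i = 0; i < 13; i++)
        numberList.Add(i);       // loop body reconstructed: fill the typed list
    ArrayList listCopy = new ArrayList();
    for(Int32 i = 0; i < 100000; i++)
    {
        // make a copy of the collection
        foreach(Int32 num in numberList)
            listCopy.Add(num);   // reconstructed: Add(object) boxes each Int32
        // we no longer need the list contents
        listCopy.Clear();        // reconstructed: discard the boxed copies
    }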
Diagnosing the performance of the application
Before the release of the .NET Compact Framework version 2 Service Pack 1 beta, the way to see the implications of excessive boxing was to enable performance counters and examine the data in the .stat file. Let's take a look at some of the interesting data.
| Counter | Total | Last Datum | N | Mean | Min | Max |
|---|---|---|---|---|---|---|
| Total Program Run Time (ms) | 19476 | 0 | 0 | 0 | 0 | 0 |
| Boxed Value Types | 1300000 | - | - | - | - | - |
| Garbage Collections (GC) | 14 | - | - | - | - | - |
| Bytes Collected By GC | 15508684 | 1112444 | 14 | 1107763 | 1046912 | 1112444 |
| GC Latency Time (ms) | 300 | 19 | 14 | 21 | 19 | 25 |
This example boxed value types 1,300,000 times (13 list items copied in each of 100,000 iterations), and the Garbage Collector ran 14 times in under 20 seconds! That is an incredible rate of collection. From the GC latency time (about 21 ms per collection on average), we can see that the Garbage Collector was running very efficiently, since there were not many live objects at the time of the collections.
Please notice that even though we collected a large amount of data (15,508,684 bytes) during the runtime of the application, the device was not put under heavy memory pressure -- the GC Heap stayed at roughly 1 MB for the duration of the application.
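The figures above can be cross-checked with a little arithmetic. This is only a sketch using the numbers reported in the .stat data, and the class and method names (`StatArithmetic`, `BoxCount`, `MeanBytesPerCollection`) are hypothetical.

```csharp
using System;

public static class StatArithmetic
{
    // Number of boxes created: one per element copied per iteration.
    public static long BoxCount(int itemsInList, int iterations)
    {
        return (long)itemsInList * iterations;
    }

    // Average number of bytes reclaimed by each collection.
    public static double MeanBytesPerCollection(long totalBytes, int collections)
    {
        return (double)totalBytes / collections;
    }

    public static void Main()
    {
        long boxes = BoxCount(13, 100000);                            // 13 items, 100,000 passes
        double meanBytes = MeanBytesPerCollection(15508684, 14);      // from "Bytes Collected By GC"
        double collectionsPerSecond = 14 / (19476 / 1000.0);          // 14 GCs over the run time

        Console.WriteLine("Boxes created:             {0}", boxes);
        Console.WriteLine("Mean bytes per collection: {0:F0}", meanBytes);
        Console.WriteLine("Collections per second:    {0:F2}", collectionsPerSecond);
    }
}
```

The box count matches the Boxed Value Types counter, and the mean of roughly 1.1 million bytes per collection agrees with the Mean column reported for Bytes Collected By GC.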
Removing the boxing
If we modify the example to replace the ArrayList with a second List<Int32> and re-run the test, we see the following data.
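A sketch of what that modification might look like is shown here. It is a reconstruction wrapped in a hypothetical `NoBoxingCopy.Run` method so that it can be compiled on its own, not the exact code from the booth demo.

```csharp
using System;
using System.Collections.Generic;

public static class NoBoxingCopy
{
    // Copies a typed list into a second typed list 'iterations' times and
    // returns the total number of elements copied.
    public static long Run(int itemCount, int iterations)
    {
        List<Int32> numberList = new List<Int32>();
        for (Int32 i = 0; i < itemCount; i++)
            numberList.Add(i);

        List<Int32> listCopy = new List<Int32>();
        long copied = 0;
        for (Int32 i = 0; i < iterations; i++)
        {
            // make a copy of the collection
            foreach (Int32 num in numberList)
            {
                listCopy.Add(num);  // Add(Int32): the value is stored directly, no box
                copied++;
            }
            // we no longer need the list contents
            listCopy.Clear();       // keeps the backing array, so later passes need not reallocate
        }
        return copied;
    }

    public static void Main()
    {
        Console.WriteLine(Run(13, 100000));
    }
}
```

The only change from the boxing version is the type of `listCopy`. The loop structure and the amount of data copied are the same.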
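| Counter | Total | Last Datum | N | Mean | Min | Max |
|---|---|---|---|---|---|---|
| Total Program Run Time (ms) | 9408 | 0 | 0 | 0 | 0 | 0 |
| Boxed Value Types | 0 | - | - | - | - | - |
| Garbage Collections (GC) | 0 | - | - | - | - | - |
| Bytes Collected By GC | 0 | 0 | 0 | 0 | 0 | 0 |
| GC Latency Time (ms) | 0 | 0 | 0 | 0 | 0 | 0 |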
As we can see from this set of data, the second version took less than half as long to run as the first (9,408 ms versus 19,476 ms) -- all boxing has been eliminated and no garbage collections occurred. We also used considerably less memory (only 64 KB) since no boxed objects needed to be allocated.
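If you want to try a similar comparison without the .stat file, a rough harness along the following lines may help. It assumes the desktop .NET Framework, where `Stopwatch` and `GC.CollectionCount` are available (they may not exist on every Compact Framework version). The `CopyBenchmark` class and its `Measure` helper are hypothetical names, and the numbers it prints will differ from the device figures above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public static class CopyBenchmark
{
    private const int Items = 13;
    private const int Iterations = 100000;

    // Boxing variant: every Add(object) call allocates a box.
    public static void CopyToArrayList(List<int> source)
    {
        ArrayList copy = new ArrayList();
        for (int i = 0; i < Iterations; i++)
        {
            foreach (int num in source)
                copy.Add(num);
            copy.Clear();
        }
    }

    // Non-boxing variant: values are stored directly in the typed list.
    public static void CopyToGenericList(List<int> source)
    {
        List<int> copy = new List<int>();
        for (int i = 0; i < Iterations; i++)
        {
            foreach (int num in source)
                copy.Add(num);
            copy.Clear();
        }
    }

    // Hypothetical helper: runs one variant, prints the elapsed time and the
    // number of generation 0 collections observed, and returns that count.
    public static int Measure(string label, Action<List<int>> body, List<int> source)
    {
        int before = GC.CollectionCount(0);
        Stopwatch timer = Stopwatch.StartNew();
        body(source);
        timer.Stop();
        int collections = GC.CollectionCount(0) - before;
        Console.WriteLine("{0}: {1} ms, {2} gen-0 collections",
            label, timer.ElapsedMilliseconds, collections);
        return collections;
    }

    public static void Main()
    {
        List<int> source = new List<int>();
        for (int i = 0; i < Items; i++)
            source.Add(i);

        Measure("ArrayList", CopyToArrayList, source);
        Measure("List<int>", CopyToGenericList, source);
    }
}
```

A single pass like this is only a rough indication. Running each variant several times and discarding the first run gives steadier numbers.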
If you use the .NET Compact Framework Remote Performance Monitor, the data above becomes very easy to see in real time.
During the MEDC booth demo, I ran these examples in tight loops within a worker thread. After 20-30 minutes, the counter for Bytes Collected By GC pegged (it reached Int32.MaxValue). We were still collecting, but the counter could no longer track how much data had been collected. Even so, the GC Heap size never exceeded 1,114,112 bytes. The application was doing a huge number of collections, yet was still memory "friendly" on the device.
Please be sure to remember that this is an extreme micro-benchmark example. Value type boxing can be useful in moderation, and does not necessarily cause dramatic application performance issues by itself. It is important to measure application / algorithm performance early and often.
This posting is provided "AS IS" with no warranties, and confers no rights.
Some of the information contained within this post may be in relation to beta software. Any and all details are subject to change.