Optimizing 3D Collections in WPF

 

Background

 

Over the past few months we’ve been getting feedback that manipulating 3D collections can be slow.  It turns out there is more that one way to skin a … collection (horse references omitted).  In this article I profile many of the collection manipulation options for you, so you can make better choices when manipulating these things.  In addition, you should’ve noticed significant speedups after the May CTP: accessing the property system is 50-100% faster since the Feb. CTP.  That really matters when you’re, for example, trying to change Point3D values every 16.667 milliseconds (ms) (1000 ms / 60 frames per second).

 

Why 3D collections you ask?  Why not 2D?  After all, WPF is an integrated platform whose collections, 3D, 2D, or otherwise, are all pretty much the same.  The answer is two-fold.  First, 3D collections tend to be much larger.  We’re talking about tens of thousands of points, while 2D geometries often contain … tens.  So 3D collections stress the system a bit further.  Second, cool, spiffy 3D effects such as mesh deformations are often calculated at framerate in response to CompositionTarget.Render.  So not are we only changing tens of thousands of points, were trying to do it in a 16 ms budget.  That’s when the cost of manipulating collections becomes critical.

 

Finally, let me point out that WPF isn’t designed to be a game engine, or support that associated level of coolness (read: detail) we’ve all grown to expect in games.  What does that really mean to you? If this sounds uncomfortably close to your app, you need to profile & make sure that WPF can respond acceptably to the frequency of change requests you’re pushing into it.  If it can’t, it’s time to scale back either frame-rate or mesh complexity.  With that said, in addition to these optimizations and the work we’ve done making this scenario faster since the Feb. CTP, I can see more speedups happening in future versions.

 

Populating Collections

 

It’s Turtle vs. the Hare. Meet the Turtle.

 

   const int ARRAY_SIZE = 100000;

   MeshGeometry3D mesh1 = new MeshGeometry3D();

   for (int i = 0; i < ARRAY_SIZE; i++)

      mesh1.Positions.Add(new Point3D(1.0, 1.0, 1.0));

 

Profilers are our friends.  An instrumentation-based profile shows this code takes 196M cycles to run.  Compare that to adding the same number of points to a dynamically growing List<Point3D> (12.6M cycles), or a pre-sized List<Point3D> (1.4M cycles) and you’ll start to understand why people have been telling us about this.  At first glance, it appears to take 15.5X longer to populate a WPF collection than the equivalent dynamically grown List<Point3D>.

 

So, what the heck is WPF doing that’s taking all of this time?  Here’s the breakdown:

  • MeshGeometry3D::ctor: 80M cycles
  • Point3DCollection.Add: 68M cycles
  • get_Positions: 46.9M cycles

MeshGeometry3D::ctor

Out of the 80M cycles spent here, 77M cycles are spent creating static objects like Point3DCollection.Empty.  This cost is incurred only the first time the class is instantiated in this process, so it can be ignored (or easily worked around, by creating a 'warm-up' object at startup).  There is also a one-time cost associated with setting a certain property type for the first time, so that cost should be factored out as well when measuring results.

 

get_Positions

This is an easy cost to optimize.  Move get_Positions outside of the loop.

 

Point3DCollection.Add

You can't see it from the profile, but most of the time in this function is spent growing the collection.  By pre-sizing the collection we can avoid most of this cost.

 

Optimized Collection Population. Hare, enter stage right.

 

    // Instantiate a ‘warm-up’ MeshGeometry3D, set its

    // Position and TriangleIndicies properties. Then

    // run the following code.

   MeshGeometry3D mesh1 = new MeshGeometry3D();  

    // Populate a collection not yet connected to mesh1,

    // pre-sizing it ahead of time

   Point3DCollection collection = new Point3DCollection(ARRAY_SIZE);

   for (int i = 0; i < ARRAY_SIZE; i++)

      collection.Add(new Point3D(1.0, 1.0, 1.0));

    // To reduce working set, call collection.Freeze here before adding

    // the collection if your not planning on changing it

 

   mesh1.Positions = collection;

 

Total cost of this implementation is 3.9M cycles, down from 196M cycles.  This is only 2.7X slower than an pre-sized List<Point3D>, it's faster than an Point3D[] array, and has more functionality than either (e.g., changed notifications).  Here’s the breakdown:

  • Point3DCollection.Add: 3.6M cycles.  A 13.8X speedup due to not resizing the array.
  • Point3DCollection c’tor: 222.0K cycles.  Most of this time is spent pre-allocating the array
  • Point3DCollection.set_Positions: 103K cycles
  • MeshGeometry c’tor: 3.9K cycles

Finally, another option is to populate a List<> and the pass it to the IEnumerable collection constructor (i.e., public Point3DCollection(IEnumerable<Point3D> collection)).  Today this takes about 50% longer than calling .Add because of the IEnumerable cost. 

 

Modifying Collections

 

Slow Collection Modification

 

Just like populating a collection, there is a slow and fast way to modify collections.  The difference between modifying a WPF collection and modifying a CLR collection is realizing that changed notifications are involved.  You can get significant wins by managing these notifications optimally. 

 

Consider this sample of modifying this MeshGeometry3D, which exists within the simplest Visual tree: a single GeometryModel3D set on a ModelVisual3D within a Viewport3DVisual.

 

   Point3D point;

   Point3DCollection positions = mesh2.Positions;

   for (int i = 0; i < positions.Count; i++)

    {

      point = positions[i];

      point.X += 10;

      point.Y += 10;

      point.Z += 10;

      positions[i] = point;

    }

 

The author was being ‘smart’, in that the .Positions accessor was pulled outside of the loop.  Running this code on 100K points takes 6.6M cycles.  Here’s the breakdown:

  • Point3DCollection.set_Item: 5.5M
  • Point3DCollection.get_Item: 984K
  • Point3DCollection.get_Count: 154K
  • Point3DCollection.get_Positions: 1.1K

Why is set_Item so much more expensive than get_Item?  Because of changed notifiers of course – 4.43M cycles are spent fire changes in this simple Visual tree.  And this tree is extremely simple -- the more complex the tree, the more elements that will be involved in Changed notifications.

 

Optimized Collection Modification

 

In the following code sample, we ‘unhook’ the collection from the tree before modifying it:

 

   Point3D point;
Point3DCollection positions = mesh2.Positions;

   mesh2.Positions = null; // Unhook the collection

 

   int count = positions.Count;

   for (int i = 0; i < count; i++)
{
point = positions[i];
point.X += 10;
point.Y += 10;
point.Z += 10;
positions[i] = point;
}

   mesh2.Positions = positions; // Hookup the collection

 

This code runs in 836K cycles, a 7.9X improvement over the original code.  Here’s the breakdown:

  • Point3DCollection.set_Item: 391K
  • Point3DCollection.get_Item: 269K
  • Point3DCollection.set_Positions: 114K
  • Point3DCollection.get_Count: 57K
  • Point3DCollection.get_Positions: 3.3K

Summary

 

There is only a small handful of things you need to do to optimize WPF collection performance.  When populating WPF collections, pre-size the collection, move accessors outside of the loop whenever possible, and factor out startup costs when measuring.  When modifying collections, disconnect them from the live tree before changing them.  And as always, freeze collections you don't need to modify later on.

 

Here is the raw data for various collection operations:

 

Type

Method

Cycles (Millions)

Populating 100K Items

 

 

Point3DCollection

Unoptimized

196.00

Point3DCollection

Using IEnumerable ctor

7.00

Point3DCollection

Optimized

3.90

Point3D []

Initially sized

5.40

List<Point3D>

Not initially sized

12.60

List<Point3D>

Initially sized

1.40

 

 

 

Int32Collection

Unoptimized

144.49

Int32Collection

Using IEnumerable ctor

1.80

Int32Collection

Optimized

1.23

Int32[]

Initially sized

1.50

List<Int32>

Not initially sized

0.69

List<Int32>

Initially sized

0.08

 

 

 

Changing 100K Items

 

 

Point3DCollection

Unoptimized

6.60

Point3DCollection

Optimized

0.84

Point3D[]

N/A

4.20