Optimizing 3D Collections in WPF


 


Background


 


Over the past few months we’ve been getting feedback that manipulating 3D collections can be slow.  It turns out there is more that one way to skin a … collection (horse references omitted).  In this article I profile many of the collection manipulation options for you, so you can make better choices when manipulating these things.  In addition, you should’ve noticed significant speedups after the May CTP: accessing the property system is 50-100% faster since the Feb. CTP.  That really matters when you’re, for example, trying to change Point3D values every 16.667 milliseconds (ms) (1000 ms / 60 frames per second).


 


Why 3D collections you ask?  Why not 2D?  After all, WPF is an integrated platform whose collections, 3D, 2D, or otherwise, are all pretty much the same.  The answer is two-fold.  First, 3D collections tend to be much larger.  We’re talking about tens of thousands of points, while 2D geometries often contain … tens.  So 3D collections stress the system a bit further.  Second, cool, spiffy 3D effects such as mesh deformations are often calculated at framerate in response to CompositionTarget.Render.  So not are we only changing tens of thousands of points, were trying to do it in a 16 ms budget.  That’s when the cost of manipulating collections becomes critical.


 


Finally, let me point out that WPF isn’t designed to be a game engine, or support that associated level of coolness (read: detail) we’ve all grown to expect in games.  What does that really mean to you? If this sounds uncomfortably close to your app, you need to profile & make sure that WPF can respond acceptably to the frequency of change requests you’re pushing into it.  If it can’t, it’s time to scale back either frame-rate or mesh complexity.  With that said, in addition to these optimizations and the work we’ve done making this scenario faster since the Feb. CTP, I can see more speedups happening in future versions.


 


Populating Collections


 


It’s Turtle vs. the Hare.  Meet the Turtle.


 


   const int ARRAY_SIZE = 100000;


 


   MeshGeometry3D mesh1 = new MeshGeometry3D();            


 


   for (int i = 0; i < ARRAY_SIZE; i++)


      mesh1.Positions.Add(new Point3D(1.0, 1.0, 1.0));


 


Profilers are our friends.  An instrumentation-based profile shows this code takes 196M cycles to run.  Compare that to adding the same number of points to a dynamically growing List<Point3D> (12.6M cycles), or a pre-sized List<Point3D> (1.4M cycles) and you’ll start to understand why people have been telling us about this.  At first glance, it appears to take 15.5X longer to populate a WPF collection than the equivalent dynamically grown List<Point3D>.


 


So, what the heck is WPF doing that’s taking all of this time?  Here’s the breakdown:




  • MeshGeometry3D::ctor: 80M cycles


  • Point3DCollection.Add: 68M cycles


  • get_Positions: 46.9M cycles

MeshGeometry3D::ctor


Out of the 80M cycles spent here, 77M cycles are spent creating static objects like Point3DCollection.Empty.  This cost is incurred only the first time the class is instantiated in this process, so it can be ignored (or easily worked around, by creating a ‘warm-up’ object at startup).  There is also a one-time cost associated with setting a certain property type for the first time, so that cost should be factored out as well when measuring results.


 


get_Positions


This is an easy cost to optimize.  Move get_Positions outside of the loop.


 


Point3DCollection.Add


You can’t see it from the profile, but most of the time in this function is spent growing the collection.  By pre-sizing the collection we can avoid most of this cost.


 


Optimized Collection Population.  Hare, enter stage right.


 


   // Instantiate a ‘warm-up’ MeshGeometry3D, set its


   // Position and TriangleIndicies properties.  Then


   // run the following code.




   MeshGeometry3D mesh1 = new MeshGeometry3D(); 


   // Populate a collection not yet connected to mesh1,


   // pre-sizing it ahead of time


   


   Point3DCollection collection = new Point3DCollection(ARRAY_SIZE);


 


   for (int i = 0; i < ARRAY_SIZE; i++)


      collection.Add(new Point3D(1.0, 1.0, 1.0));


 


   // To reduce working set, call collection.Freeze here before adding


   // the collection if your not planning on changing it


 


   mesh1.Positions = collection;


 


Total cost of this implementation is 3.9M cycles, down from 196M cycles.  This is only 2.7X slower than an pre-sized List<Point3D>, it’s faster than an Point3D[] array, and has more functionality than either (e.g., changed notifications).  Here’s the breakdown:




  • Point3DCollection.Add: 3.6M cycles.  A 13.8X speedup due to not resizing the array.


  • Point3DCollection c’tor: 222.0K cycles.  Most of this time is spent pre-allocating the array


  • Point3DCollection.set_Positions: 103K cycles


  • MeshGeometry c’tor: 3.9K cycles

Finally, another option is to populate a List<> and the pass it to the IEnumerable collection constructor (i.e., public Point3DCollection(IEnumerable<Point3D> collection)).  Today this takes about 50% longer than calling .Add because of the IEnumerable cost. 


 


Modifying Collections


 


Slow Collection Modification


 


Just like populating a collection, there is a slow and fast way to modify collections.  The difference between modifying a WPF collection and modifying a CLR collection is realizing that changed notifications are involved.  You can get significant wins by managing these notifications optimally. 


 


Consider this sample of modifying this MeshGeometry3D, which exists within the simplest Visual tree: a single GeometryModel3D set on a ModelVisual3D within a Viewport3DVisual.


 


   Point3D point;


   Point3DCollection positions = mesh2.Positions;


 


   for (int i = 0; i < positions.Count; i++)


   {


      point = positions[i];


      point.X += 10;


      point.Y += 10;


      point.Z += 10;


      positions[i] = point;


   }


 


The author was being ‘smart’, in that the .Positions accessor was pulled outside of the loop.  Running this code on 100K points takes 6.6M cycles.  Here’s the breakdown:




  • Point3DCollection.set_Item: 5.5M


  • Point3DCollection.get_Item: 984K


  • Point3DCollection.get_Count: 154K


  • Point3DCollection.get_Positions: 1.1K

Why is set_Item so much more expensive than get_Item?  Because of changed notifiers of course – 4.43M cycles are spent fire changes in this simple Visual tree.  And this tree is extremely simple — the more complex the tree, the more elements that will be involved in Changed notifications.


 


Optimized Collection Modification


 


In the following code sample, we ‘unhook’ the collection from the tree before modifying it:


 


   Point3D point;
   Point3DCollection positions = mesh2.Positions;


   mesh2.Positions = null; // Unhook the collection


 


   int count = positions.Count;


  


   for (int i = 0; i < count; i++)
   {
      point = positions[i];
      point.X += 10;
      point.Y += 10;
      point.Z += 10;
      positions[i] = point;
   }


   mesh2.Positions = positions; // Hookup the collection


 


This code runs in 836K cycles, a 7.9X improvement over the original code.  Here’s the breakdown:




  • Point3DCollection.set_Item: 391K


  • Point3DCollection.get_Item: 269K


  • Point3DCollection.set_Positions: 114K


  • Point3DCollection.get_Count: 57K


  • Point3DCollection.get_Positions: 3.3K

Summary


 


There is only a small handful of things you need to do to optimize WPF collection performance.  When populating WPF collections, pre-size the collection, move accessors outside of the loop whenever possible, and factor out startup costs when measuring.  When modifying collections, disconnect them from the live tree before changing them.  And as always, freeze collections you don’t need to modify later on.


 


Here is the raw data for various collection operations:


 




















































































Type


Method


Cycles (Millions)


Populating 100K Items


 


 


Point3DCollection


Unoptimized


196.00


Point3DCollection


Using IEnumerable ctor


7.00


Point3DCollection


Optimized


3.90


Point3D []


Initially sized


5.40


List<Point3D>


Not initially sized


12.60


List<Point3D>


Initially sized


1.40


 


 


 


Int32Collection


Unoptimized


144.49


Int32Collection


Using IEnumerable ctor


1.80


Int32Collection


Optimized


1.23


Int32[]


Initially sized


1.50


List<Int32>


Not initially sized


0.69


List<Int32>


Initially sized


0.08


 


 


 


Changing 100K Items


 


 


Point3DCollection


Unoptimized


6.60


Point3DCollection


Optimized


0.84


Point3D[]


N/A


4.20


 


 

Comments (14)

  1. GreenReaper says:

    Thank you, Tim! With your advice about detaching objects from I was able to make significant gains with just twelve extra lines of code.

    We have an application in testing that has an arbitrary number of visuals on screen, parented to a container visual. Every one of those child visuals is moved every frame. The update pattern is such that it invalidates the whole of the screen.

    I added in the necessary code to detach the container visual from the visual tree while I was moving its children and then add it back later, and timed the results:

    At 150 visuals there is no obvious difference in performance – both are just under 50% CPU

    At 300 it’s possible to discern that there’s an increase in CPU to 70% for the non-detaching version, and it pauses for a frame or so every so often. The non-detaching version is still at 50% CPU and looking great.

    At 500 there is a visible difference – the detaching version is now at 65% CPU but still pretty smooth. The non-detaching version is visibly struggling, down to 15FPS, and is using up 70% CPU.

    At 1000, the detaching version is about where the other one was at 500, or maybe a little better – dragging windows is definitely choppy, though you could still use the system. The non-detaching version is at 85% and 8FPS.

    This may have made the difference between a feasible and infeasible design.

  2. One member of the WPF community sent me and the Performance team a very detailed question and analysis on 3D performance.  We greatly appreciate his effort.

    Tim has the full details here on how seemingly small things can make a lot of difference in

  3. joewood says:

    Thanks Tim, great advice.

    What about adding a method on these collections (or any part of the tree) to suspend notifications.  Then another method to resume.  Maybe use the IDisposable ‘using’ pattern…

  4. Henry Hahn says:

    Tim Cahill, a developer on the WPF Performance Team, posted a great article on optimizing 3D collections…

  5. I was recently involved in some analysis of a WPF application which used 3D heavily that wasn’t hitting

  6. If you need to manipulate the data in a MeshGeometry3D, you should read Tim&rsquo;s article on Optimizing

  7. Appunti di WPF: Ottimizzazione di collection 3D

  8. Charles Petzold has been experimenting with cell shading on his blog at the request of Chris Cavanagh

  9. Charles Petzold has been experimenting with cell shading on his blog at the request of Chris Cavanagh