Inline those vector constructors

Reading Chris To's article about Xbox CLR performance (see my
previous post) reminded me of a subtle optimization for constructing
new vector instances. Given some code like:

    DoCoolStuffWithVector(new Vector3(23, 42, -1));

You can rewrite this to avoid calling the Vector3 constructor:

    Vector3 temp = new Vector3();
    temp.X = 23;
    temp.Y = 42;
    temp.Z = -1;
    DoCoolStuffWithVector(temp);

It may not be obvious why this second version will be faster, since
it also calls a Vector3 constructor, albeit the one without any
arguments. The reason lies in a subtlety of how the CLR handles value
types like Vector3. The default constructor for value types is special
in .NET: it always exists, is always public, always just fills the
structure with zeros, and you aren't allowed to override it. These
restrictions make it trivial for the runtime to optimize the default
constructor, so all the vector construction code in my second example
can be inlined. The first example, on the other hand, contains a call
to a custom Vector3 constructor which will not be inlined, and thus
will be significantly slower.
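
To make the comparison concrete, here is a minimal sketch of the two construction paths side by side. The `Vector3` struct below is a hypothetical stand-in for the XNA type (public `X`, `Y`, `Z` fields plus a three-argument constructor), and `DoCoolStuffWithVector` is given a made-up body so the example compiles on its own. It only checks that both forms build the same value; it does not measure anything.

```csharp
using System;

// Hypothetical stand-in for XNA's Vector3 so the sketch is self-contained.
public struct Vector3
{
    public float X;
    public float Y;
    public float Z;

    // Custom constructor: a regular method that the jitter may not inline.
    public Vector3(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }
}

public static class ConstructorDemo
{
    // Made-up body, just so there is something to call.
    public static float DoCoolStuffWithVector(Vector3 v)
    {
        return v.X + v.Y + v.Z;
    }

    // Version 1: calls the custom three-argument constructor.
    public static float ViaCustomConstructor()
    {
        return DoCoolStuffWithVector(new Vector3(23, 42, -1));
    }

    // Version 2: the default constructor zero-fills the struct,
    // then each field is written directly.
    public static float ViaFieldWrites()
    {
        Vector3 temp = new Vector3();
        temp.X = 23;
        temp.Y = 42;
        temp.Z = -1;
        return DoCoolStuffWithVector(temp);
    }

    public static void Main()
    {
        // The default constructor leaves every field at zero.
        Vector3 zero = new Vector3();
        Console.WriteLine(zero.X == 0 && zero.Y == 0 && zero.Z == 0);

        // Both construction styles hand the same value to the callee.
        Console.WriteLine(ViaCustomConstructor() == ViaFieldWrites());
    }
}
```

Both checks should hold. The only difference between the two versions is whether a custom constructor call sits in the hot path.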

This is really just a special case of a more general principle:
manually inlining math computations can make things faster by avoiding
method calls. The Compact Framework jitter that we use on Xbox has
only limited inlining capabilities, so it can be useful to inline
critical methods by hand.
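
As a rough illustration of what inlining by hand can look like, the sketch below sums an array of vectors twice: once through a helper method, and once with the same arithmetic written out per component. The `Vector3` struct and its `Add` helper are hypothetical stand-ins defined here so the example is self-contained. They are not the XNA implementation, and whether the second version is actually faster depends on the jitter and the surrounding code.

```csharp
using System;

// Hypothetical stand-in for XNA's Vector3 so the sketch is self-contained.
public struct Vector3
{
    public float X;
    public float Y;
    public float Z;

    public Vector3(float x, float y, float z)
    {
        X = x;
        Y = y;
        Z = z;
    }

    // Helper method: convenient, but each use is a method call
    // (plus a constructor call) unless the jitter inlines it.
    public static Vector3 Add(Vector3 a, Vector3 b)
    {
        return new Vector3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }
}

public static class InlineDemo
{
    // Straightforward version: one Add call per element.
    public static Vector3 SumWithCalls(Vector3[] items)
    {
        Vector3 total = new Vector3();
        for (int i = 0; i < items.Length; i++)
        {
            total = Vector3.Add(total, items[i]);
        }
        return total;
    }

    // Hand-inlined version: the same arithmetic written out per
    // component, with no Vector3 method calls inside the loop.
    public static Vector3 SumInlined(Vector3[] items)
    {
        Vector3 total = new Vector3();
        for (int i = 0; i < items.Length; i++)
        {
            total.X += items[i].X;
            total.Y += items[i].Y;
            total.Z += items[i].Z;
        }
        return total;
    }

    public static void Main()
    {
        Vector3[] items = { new Vector3(1, 2, 3), new Vector3(4, 5, 6) };
        Vector3 a = SumWithCalls(items);
        Vector3 b = SumInlined(items);

        // The two versions should agree component by component.
        Console.WriteLine(a.X == b.X && a.Y == b.Y && a.Z == b.Z);
    }
}
```

The hand-inlined loop is longer and easier to get wrong than the one-line `Add` call, which is the trade-off to weigh before rewriting anything this way.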


Please ignore everything I wrote above...

Remember the #1 rule of optimization: don't do it.

Rewriting all your math code to avoid calling structure constructors
will make it more complicated, error prone, and hard to maintain. This
is almost certainly a bad idea.

It is good to know about this optimization in case you ever find
yourself having to speed up some critical piece of code that happens
thousands of times a frame, but you generally shouldn't worry about it.

Comments (4)

  1. zproxy says:

    This is correct, but with the Orcas release you will be able to write this:

        DoCoolStuffWithVector(new Vector3 { X = 23, Y = 42, Z = -1 });

    And this will compile exactly like the second example.

  2. Yeouch!  That’s painful and I will heed your warning.

    Will the Orcas change optimize code built for the Compact Framework too?

    It just felt like someone walked over my grave, but that shudder I just experienced came from the thought of coercing XNA Studio to run on today’s Orcas bits…

  3. Lewis Cowles says:

    How much of a performance boost can we expect from this code, for something that currently executes 33,000,000 times per second? Would it be a large performance boost? It's a few functions that are called for every object I construct and are also used for collision detection.

  4. ShawnHargreaves says:

    > how much of a performance boost can we expect from this code

    That entirely depends on the context. Every app is different, and the only way to find out for sure where your bottlenecks are is to profile your specific game and see where it is spending its time, then optimize those particular areas.

    It’s not possible to give generalized advice like "this particular optimization trick will save you X% of time", because this all depends on how long the thing you just optimized was taking in the first place!
