Rendertarget changes in XNA Game Studio 4.0


We made several changes to the rendertarget API in Game Studio 4.0, all with the goal of increasing usability and reducing error.

The most common cause of confusion is probably the RenderTargetUsage.DiscardContents behavior, but this is one thing we did not change. PreserveContents mode is just too slow on Xbox, and even slower on phone hardware, which typically uses some variant of tiled or binned rendering. Such hardware shares the Xbox preference for discard behavior, but has even less memory bandwidth to spend on the extra buffer copies if you do request preserve mode.

Making our API simple is well and good, but not if that is going to cost enormous amounts of performance! So discard mode rendertarget semantics are here to stay. Learn em, love em, live with em 🙂

Here are the things we did change:

 

Has-a versus Is-a

I often see people attempt something like:

    RenderTarget2D rt = new RenderTarget2D(...);
    List<Texture2D> textures = new List<Texture2D>();

    // Prerender animation frames
    for (int i = 0; i < 100; i++)
    {
        GraphicsDevice.SetRenderTarget(0, rt);
        DrawCharacterAnimationFrame(i);
        GraphicsDevice.SetRenderTarget(0, null);

        textures.Add(rt.GetTexture());
    }

This doesn’t work, because GetTexture returns an alias for the same surface memory as the rendertarget itself, rather than a separate copy of the data, so each drawing operation replaces the contents of all previously created textures. But these semantics are not at all obvious from the API! GetTexture returns a reference to shared data, yet the API makes it look like it could be returning a copy.

This is the classic has-a versus is-a distinction. Rendertargets are a special kind of texture, but our API made it look like they just had associated textures, or perhaps could be converted into textures.

We fixed this by removing the GetTexture method, and instead having RenderTarget2D inherit directly from Texture2D (and RenderTargetCube from TextureCube). It is harder to get these semantics wrong with the 4.0 API:

    List<Texture2D> textures = new List<Texture2D>();

    for (int i = 0; i < 100; i++)
    {
        RenderTarget2D rt = new RenderTarget2D(...);

        GraphicsDevice.SetRenderTarget(rt);
        DrawCharacterAnimationFrame(i);
        GraphicsDevice.SetRenderTarget(null);

        textures.Add(rt);
    }

 

Atomicity

How do you un-set a rendertarget? In previous versions of Game Studio we would often write:

    GraphicsDevice.SetRenderTarget(0, null);

That mostly works, but after using multiple rendertargets we must use this more complex version:

    for (int i = 0; i < HoweverManyRenderTargetsIJustUsed; i++)
    {
        GraphicsDevice.SetRenderTarget(i, null);
    }

Ugly, not to mention error prone if the un-set code does not loop enough times.

In Game Studio 4.0, we made SetRenderTarget an atomic method, so it always sets all the possible rendertargets at the same time. This call will always un-set all rendertargets, no matter how many were previously bound:

    GraphicsDevice.SetRenderTarget(null);

To set a single rendertarget, you no longer need to specify an index:

    GraphicsDevice.SetRenderTarget(renderTarget);

If multiple rendertargets were previously bound, this will change the first one to the specified value, then un-set the others.

To set multiple rendertargets (which is a HiDef feature, so not supported in the CTP), specify them all at the same time:

    GraphicsDevice.SetRenderTargets(diffuseRt, normalRt, depthRt);

That is a shortcut for this more flexible but verbose equivalent:

    RenderTargetBinding[] bindings =
    {
        new RenderTargetBinding(diffuseRt),
        new RenderTargetBinding(normalRt),
        new RenderTargetBinding(depthRt),
    };

    GraphicsDevice.SetRenderTargets(bindings);

Making the set call atomic has two main benefits:

  • It reduces the chance of accidentally forgetting to unset multiple rendertargets

  • It makes our validation code more efficient, as we now have a single place to validate MRT rules such as all surfaces being the same size and bit depth. Previously, SetRenderTarget had no way to know when it had the final state, or whether other calls were about to change rendertargets on different indices, so it had to just set a dirty flag which clued the next draw operation to validate and commit the new surfaces. This added a small but measurable overhead to all draw operations (even when MRT was not in use), which is no longer necessary now these operations are atomic.

 

Declarative depth

Our bloom sample contains a subtle bug in this line:

    renderTarget1 = new RenderTarget2D(GraphicsDevice, width, height, 1, format);

The problem is that when we later draw to this rendertarget, we do not explicitly un-set the depth buffer. Even though we are not using depth while rendering the bloom postprocess, the default depth buffer is still bound to the device, so must be compatible with the rendertarget we are using.

If you change the bloom sample by turning on multisampling, the default depth buffer will be multisampled, but the bloom rendertarget will not, so the two are no longer compatible and rendering will fail.

We could fix this by changing the bloom rendertarget to use the same multisample format as the backbuffer, or we could explicitly un-set the depth buffer before drawing bloom:

    DepthStencilBuffer previousDepth = GraphicsDevice.DepthStencilBuffer;
    GraphicsDevice.DepthStencilBuffer = null;

    DrawBloom();

    GraphicsDevice.DepthStencilBuffer = previousDepth;

This is ugly and far from obvious. We forgot to put this code in our sample, and I see other people making the same mistake all the time!

The more we thought about this, the more we realized:

  • Any time you change rendertarget without also changing depth buffer, that is almost certainly a bug waiting to bite.

  • Any time you un-set from a rendertarget to the backbuffer without also resetting the depth buffer, that’s another bug.

  • Many rendertargets are used in ways that do not actually require a depth buffer. The correct thing to do here is set the depth buffer to null. If you forget to do that, things will often still work, but can fail in subtle and confusing ways.

  • For rendertargets that do require a depth buffer, it can be a pain making sure your depth buffer has the same size and multisample format as the rendertarget. The XNA framework had much code dedicated to validating these rules, and I often see people getting this wrong.

  • Our DepthStencilBuffer class had no interesting methods or properties. In fact, the only thing you could do with it was to set it onto the graphics device, which always happened at the same time as setting a rendertarget.

We decided the DepthStencilBuffer class was so useless, we should get rid of it entirely! Instead, the depth format is now specified as part of each rendertarget. If I call:

    new RenderTarget2D(device, width, height);

I get a rendertarget with no associated depth buffer. If I want to use a depth buffer while drawing into my rendertarget, I use this constructor overload:

    new RenderTarget2D(device, width, height, false, SurfaceFormat.Color, DepthFormat.Depth24Stencil8);

Note: I could specify DepthFormat.None to use the full overload but get no depth buffer.

Note: when using MRT, the depth format is controlled by the first rendertarget.
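
For example, a G-buffer style setup like the earlier MRT snippet could be created like this (a sketch; the surface formats are illustrative choices, and only the first target carries the depth format):

    // Only the first rendertarget specifies a depth format: it controls
    // the depth buffer used while this MRT group is bound.
    RenderTarget2D diffuseRt = new RenderTarget2D(device, width, height, false,
                                                  SurfaceFormat.Color, DepthFormat.Depth24Stencil8);

    RenderTarget2D normalRt = new RenderTarget2D(device, width, height, false,
                                                 SurfaceFormat.Color, DepthFormat.None);

    RenderTarget2D depthRt = new RenderTarget2D(device, width, height, false,
                                                SurfaceFormat.Single, DepthFormat.None);

    GraphicsDevice.SetRenderTargets(diffuseRt, normalRt, depthRt);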

With this design, many previously common errors become impossible:

  • The depth buffer cannot fail to match the rendertarget size and multisample format, so we don’t even need to bother validating this (which speeds up all rendering).

  • When you are doing 2D work like the bloom sample, and not thinking about depth buffers at all, the device is automatically set to use null depth.

  • Whenever you un-set a rendertarget, the default depth buffer is automatically restored. No need to explicitly save and restore means no chance of getting that wrong!
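
Putting that together, the bloom fix sketched earlier collapses to nothing at all. Assuming renderTarget1 was created with the no-depth constructor, the save/restore code simply goes away:

    // renderTarget1 was created without a depth format, so null depth
    // is used automatically while it is bound.
    GraphicsDevice.SetRenderTarget(renderTarget1);
    DrawBloom();

    // Un-setting restores the backbuffer and its default depth buffer.
    GraphicsDevice.SetRenderTarget(null);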

Several of you expressed concern that this design could lead to wasted memory, as you can no longer share a single depth buffer between many rendertargets.

Not at all! The key shift here is from an imperative API, where you explicitly create depth buffer objects, manage their lifespan, and tell us which one to use at what times, to a declarative API, where you tell us what depth format you want to use, and we figure out how best to make that happen.

The two important pieces of information you need to provide are:

  • Do I want a depth buffer when using this rendertarget? If so, what format?

  • Do I want to be able to go back to this buffer later and continue drawing over it (RenderTargetUsage.PreserveContents), or am I only interested in the final texture image (RenderTargetUsage.DiscardContents)?

Armed with this data, we can choose the appropriate implementation strategy for different situations:

  • On Xbox, there isn’t really such a thing as a depth buffer in the first place: it’s actually all just a small piece of shared EDRAM, plus backing store if you request PreserveContents mode. So this design takes less memory than what we had before, as we no longer need to jump through hoops to give the illusion that these are real objects with actual memory of their own.

  • On Windows, if you request PreserveContents mode, we allocate a separate depth buffer per rendertarget.

  • On Windows, if you use the default DiscardContents mode, we can be smarter, and can do things like automatically sharing a single native depth buffer between many rendertargets (as long as they all have the same size and multisample format).

Honesty compels me to admit that we haven’t actually implemented this sharing optimization yet. It’s currently on the schedule for 4.0 RTM, but things can always change, so please don’t beat me up too hard if we for some reason fail to get that part done in time 🙂


Comments (44)

  1. Alejandro says:

    I really like that it’s no longer needed to manually set an index for each renderTarget, and that setting a render target unsets all the others.

    Also the bindings will help to order the code a lot more, you may create bindings according to some options: going to do motionBlur? going to do lighting?

    Then create the bindings with linearDepth, normals, and velocity renderTargets.

    SetRenderTargets may always be:

    SetRenderTargets(prePassBindings);

    I was concerned regarding the depthBuffers too… From what I understand, it would work like this:

    diff = new RenderTarget2D(device, width, height, false, SurfaceFormat.Color, DepthFormat.Depth24Stencil8); -> a common depth

    linearDepth = new RenderTarget2D(device, width, height, false, SurfaceFormat.Single, DepthFormat.None); -> no depth

    velocity = new RenderTarget2D(device, width, height, false, SurfaceFormat.HalfVector2, DepthFormat.None); -> no depth again

    And then:

    SetRenderTargets(diff, linearDepth, velocity);

    It would use the diff depthBuffer. However if I swap the order:

    SetRenderTargets(linearDepth, diff, velocity);

    It will be rendering without a depth, am I right?

    Seems a little bit counter-intuitive depending on your code practices; I feel like I’m setting redundant depthFormats, i.e. all render targets would have Depth24Stencil8, instead of SetRenderTargets(a, b, c, commonDepth) or even declaring the depth in the same bindings. However, that is not an issue.

    The main question I have is: can I use the depth buffer the API declared for a renderTarget of some size with another, smaller renderTarget? In current XNA, the requirement is that the depth buffer has to be equal or bigger (besides matching the MSAA type), so a depth buffer for a fullscreen renderTarget will work for half and quarter size render targets. Do we get the same behaviour?

    An example: using only a full screen depthBuffer not MSAA’ed. RT1 = fullScreen, RT2 = halfScreen.

    1. render full-screen color -> to RT1;

    2. downsample both color and depth to half size, rewriting the depth with the DEPTH0 semantic in the pixel shader -> to RT2

    3. render particles on the same RT2.

    4. Combine with RT1.

    A doubt I have always had: is it wasteful to write the depth with the DEPTH0 semantic? How does that compare with the PreserveContents option?

    In this case it wouldn’t work unless PreserveContents automatically downscales the data, but supposing there is no downscale, what do you do? A direct-to-the-metal byte-to-byte copy? (that would obviously be way faster).

    Anyways I’m really happy with how everything is coming out and the over-the-top amazing things you guys do, not scared anymore :). Several platforms, new gadgets, all within one framework… that’s just amazing, can’t wait to see what we will get in the future.

    Thanks in advance! Have a nice day.

  2. Very interesting. I was afraid before that 4.0 was going to be more limiting, but some of these render target issues are ones I had when I was just starting a year ago (not much of a problem now though), and I can see that this will make it much easier for new people to learn. It should make my code easier to read too 🙂

    You speak of performance improvements because it no longer has to validate many rendering states etc… what kind of performance boost did you see on the 360 with this? Significant?

    Anyway XNA 4.0 is looking good.

  3. Still Alarmed says:

    So how do you share depth buffer *data* between targets?

    As you know (and have written in many papers) using depth testing during deferred lighting improves performance.

    However in your new design this appears impossible.

    Forcing us to redraw the scene every target switch defeats the purpose of deferred rendering and is overly redundant.

    Why not provide the option to share (actually share) depth buffers on Windows?

    There are people legitimately using XNA on Windows who need the performance and have very little or no interest in Xbox or mobile.

    Should we expect the same focus on the "least common denominator" in future XNA versions?

  4. Kyle says:

    Re: depth buffers and render targets.

    Hmmm, sounds suspiciously similar to what I’ve been doing for a while…

    You didn’t happen to break into my computer and steal my codes did you?

    Seriously though, it sounds like it will make the whole process much easier and less confusing, with fewer gotchas for newcomers.

  5. p6 says:

    Hi Shawn,

    Thanks for another great post. I wanted to ask something. I’m currently doing as suggested (by you I think) while working on a landscape 2D game in the Windows Phone SDK (XNA). I’m rendering to a render target then rotating it to landscape on drawing. Everything works fine. However my co-ordinates for player position are based on X/Y as if the phone was in portrait, which is fine at the moment. When the SDK is updated to support explicit ‘landscape’ mode (I assume there will be a device call to set it so the resolution is 800×480 rather than 480×800), will this automatically take into account the X/Y switch and give you the full 800 pixels in the *X* direction? I assumed it would, or it wouldn’t really be a supported landscape mode, but wanted to check before I coded everything as X/Y only to find a real landscape mode would need to switch Y to X and X to Y.

    I could of course just code it now as X to Y and Y to X and draw it ‘on its side’ so to speak (i.e. not using a render target), but I like to think of it properly as a horizontal 800×480 viewport to code in.

    Hope I’ve explained my question ok.

    thanks

  6. Still Alarmed says:

    Sorry if my earlier post sounded a bit harsh, but we put a lot of work into our game. :/  And I’m sleep deprived.

    Would still like to know about my questions though. 😉

  7. Michael Wilson says:

    That all sounds good.

    I have to ask though, will XNA4 fix the whole-application 5-second freeze issue when trying to do anything with MediaPlayer on Windows 7? Even though I’m coding for the X360, it’s really frustrating the alpha testers.

  8. PolyVector says:

    Thanks for the insight Shawn, this is exactly how I was hoping depth buffers would be treated in 4.0 🙂

  9. Terry says:

    Great post. Thanks for clearing a lot of this stuff up 🙂

  10. David Black says:

    For depth buffers, what about situation such as:

    1) Depth pre pass + normals to colour channel

    2) other stuff… (eg AO)

    3) render solid objects using depth buffer from before and a new render target. (also reads things like AO)

    4) use depth buffer again for another pass(eg volume fog)

    With the new system, I don’t see a good way to do this on Windows without a copy of depth between render targets. Perhaps using MRT if all writes to colour are disabled on the first pass?

  11. mattbettcher says:

    Shawn, this is probably the best change to the whole XNA framework! I can’t describe how many times I fought with getting render targets to work only to not get the right output or throw an exception because I still had some other render target set! I’ve already used the texture inheritance and it’s genius. You guys really did break it good!

  12. Pete says:

    > "Honesty compels me to admit that we haven’t actually implemented this sharing optimization yet. It’s currently on the schedule for 4.0 RTM, … "

    Then one suggestion: not allowing to pass a null rendertarget to the set operation. And instead, when you want to unset rendertargets you call something like:

    GraphicsDevice.ResetRenderTargets();

    Imo it is less error-prone for the cases when a rendertarget is disposed by a bug in our code and then we have to chase why this is happening.

    Now when you call SetRenderTarget or SetRenderTargets you wouldn’t have to verify whether the render target is null; move that check to the internal operation you call last, and if a rendertarget is null, throw an exception.

    What would happen if I call "SetRenderTargets(rt1, null, rt3)"? Or "SetRenderTarget(rtNullByMistakeForWhateverReason)"?

    By not allowing null rendertargets to be passed to those operations, we could rapidly identify that a rendertarget we pass is becoming null somewhere in our code and fix it.

    Thoughts?

  13. YellPika says:

    Thank you for the post!

    I’ve also tried to use one render target to draw to an array of textures before. I was very surprised to find that, strangely, the contents of all of the textures were the same…

    Thank you for clearing that up. I’m looking forward to the stable release.

    These changes are getting me excited 😀

  14. FlyingWaffle says:

    hmm, like Still Alarmed mentions, my current implementation of deferred rendering assumes the depth buffer can be shared between render targets – first I render all the objects and create a depth buffer, then I render later passes (like light volume effects and analytical occlusion proxy boxes) against the previously generated d-buffer.

    It’s already a giant pain that a stencil buffer can’t be shared between render targets (prevents me from culling a significant amount of pixels when processing my light volumes and occlusion proxies).

    Shawn, I know you’ve stated many times that you don’t believe in deferred rendering, but now it seems that you really don’t want it to work for anyone… (rather than fixing the stencil buffer problem, you’re breaking the depth buffer now?!).

    Maybe I’m missing something, but I’m really disappointed about this.

  15. default_ex says:

    Just like StillAlarmed and FlyingWaffle I still have my concerns about the depth-stencil buffer.

    When I first started to learn deferred rendering I would set my geometry buffer, render into it, then set null. A separate component would set the light buffer, render lights, and then set null again. However, this proved counter-productive with lighting, because I could not use any depth or stencil test: it was invalidated after the double set render target operation (null and then light buffer).

    Lately I found that when compositing the image into the back buffer, if I set null to all render targets the depth-stencil buffer is invalidated. To get around this I simply employed yet another render target, set it in slot 1 (the second render target), and rendered the same image into the back buffer and that target. This actually became helpful, as now I don’t need to resolve the back buffer to perform post processing effects.

    Mostly it worries me between the geometry and lighting phases though, as being able to put the depth and stencil buffer to work in lighting has moved my lighting phase in a normal scene from a couple of milliseconds to hundredths of a millisecond. It would be a significant performance hit to have to go back to the old naive approach.

  16. FlyingWaffle says:

    In an old article, Shawn neatly described all the different depth/stencil tricks necessary to get light volumes to work in the context of deferred rendering:

    http://www.talula.demon.co.uk/DeferredShading.pdf

    It’d be really awesome if someone could come up with a simple description on how to get this to work with the new rendertarget framework of XNA 4.0 (assuming it’s still feasible).

  17. Pete says:

    Shawn, if sharing a depthbuffer is not possible among render-target switches, maybe it may help overloading the operation I suggested to:

    GraphicsDevice.ResetRenderTargets(bool shareDepthBuffer);

    I’m also getting my hands into Deferred Lighting & Light Pre-pass methods, so I’d love not to lose this info.

  18. ShawnHargreaves says:

    > You speak of performance improvements because it no longer has to validate many rendering states etc… what kind of performance boost did you see on the 360 with this? Significant?

    Not particularly significant (the validation logic was already pretty well optimized so not usually a big overhead) but it’s still slightly faster to do nothing at all, even compared to something that was already pretty quick.

  19. ShawnHargreaves says:

    > In current XNA, the requirement is that the Depth has to be equal or bigger (besides the MSAA type), so a DepthBuffer for a fullscreen renderTarget will work for half and quarter render targets. Do we get the same behaviour?

    This is actually a great example of where a declarative API can beat a more traditional imperative design.

    The detailed rules about what depth buffers are compatible with what surfaces are somewhat complex, and not at all consistent. Some platforms have this >= size behavior, while others require an exact size match. Some have bit depth or format restrictions where depth has to match color, while on other hardware these things are totally orthogonal.

    The nice thing about a declarative API is it gives the framework room to do the right thing for each platform, applying the appropriate rules to share as much as possible, without requiring you to understand or care exactly how these rules vary across platforms.

  20. ShawnHargreaves says:

    > For depth buffers, what about situation such as:

    > 1) Depth pre pass + normals to colour channel

    > 2) other stuff… (eg AO)

    > 3) render solid objects using depth buffer from before and a new render target. (also reads things like AO)

    > 4) use depth buffer again for another pass(eg volume fog)

    The biggest restriction of the new API is that you cannot share a single depth buffer across multiple different rendertargets. However, it is usually possible to achieve this kind of rendering architecture (on Windows, anyway) just by arranging things so that any time you want to reuse the depth buffer, you are drawing to the same rendertarget.

    For instance, the classic deferred shading optimization of using depth/stencil to cull light volumes is totally doable: you just need to arrange your buffer operations so that these light accumulation passes are done into the same rendertarget that was bound on index #0 during the initial scene rendering.
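
    A rough sketch of that arrangement (hypothetical names; on platforms where the rendertarget switch would otherwise discard it, sceneRt should be created with RenderTargetUsage.PreserveContents):

    // Scene pass: sceneRt in slot #0 owns the depth/stencil buffer.
    GraphicsDevice.SetRenderTargets(sceneRt, normalRt, depthRt);
    DrawScene();

    // Light accumulation: keep the same target in slot #0, so the
    // matching depth/stencil contents can cull the light volumes.
    GraphicsDevice.SetRenderTarget(sceneRt);
    DrawLightVolumes();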

  21. ShawnHargreaves says:

    > Should we expect the same focus on the "least common denominator" in future XNA versions?

    I don’t think "lowest common denominator" is entirely fair (for instance we introduced the concept of Reach vs. HiDef profiles specifically to give us a way of formalizing capability differences, in order to avoid forcing all platforms to match the least capable), but yes, we place a high value on cross platform consistency.

    Consistency is actually a multi-dimensional problem:

    – There is the obvious cross platform (Windows, Xbox, Phone)

    – There is cross devices within what a consumer would see as a single ‘platform’ (NVidia, AMD, Intel)

    – There is the time axis (DX9, DX10, DX11)

    Like most things in software engineering (not to mention life as a whole 🙂), this is something of a balancing act. We see a great deal of benefit in maximizing all three axes of consistency, but at the same time, there is also benefit in exposing the richness of specific platforms (for instance, we didn’t cut programmable shaders from Windows and Xbox just because we didn’t get them in this first Windows Phone release).

    Rendertargets are one of the areas that varies most across platforms, so this was definitely a tough area to rationalize. I think the 4.0 API does a good job of providing this consistency while still exposing enough richness (MRT, floating point formats, etc) to implement advanced rendering techniques (for instance it is totally possible to do deferred shading, including the various depth/stencil volume optimizations, with this API).

  22. ShawnHargreaves says:

    > And instead, when you want to unset rendertargets you call something like:

    > GraphicsDevice.ResetRenderTargets();

    > Imo is less error-prone, for the cases when a rendertarget is disposed by a bug in our code and then we have to chase why this is happening.

    We thought about that. The problem with separating "set" and "unset" into separate APIs, and disallowing null as a parameter to the "set" API, is this makes a couple of common usage patterns more awkward.

    For instance, it’s harder to implement a "save whatever was previously set, then restore it later" operation if the restore might have to call either of two different APIs.

    This can also be a pain for engine/middleware type APIs that want to implement operations which take in a rendertarget selection, use that if provided, or use the backbuffer if null. It’s handy if these things can just pass whatever parameter they are given straight through to SetRenderTarget, and have this do the appropriate thing regardless of whether or not that value is null.

  23. ShawnHargreaves says:

    > Shawn, I know you’ve stated many times that you don’t believe in deferred rendering

    Huh? That’s not accurate at all.

    I think that deferred shading is, like most things, a tool that is appropriate for some situations but not others.

    Some people assume that because I wrote one of the earlier papers about this technique, I must be a dyed-in-the-wool fan, which is definitely not true. I wrote that paper based on some tech research, which never made it into an actual game because deferred shading turned out to be a sub-optimal approach for the design and hardware we were working with at that time.

    But this doesn’t mean I don’t think deferred shading can be a great fit for different situations!

    I do think that the "classic" style of deferred shading, like I described in that early article, is a somewhat awkward fit for Xbox 360, thanks to the EDRAM hardware architecture. I’ve seen many people get awesome results from deferred-ish architectures on Xbox, but the most successful tend to be hybrid designs, and often rely more on CPU visibility culling rather than stencil for optimizing the light volumes.

    You can certainly do the classic depth volume optimization (like I described in my paper) on Xbox, including with Game Studio 4.0, if you specify PreserveContents rendertarget usage, but there is a significant cost to treating EDRAM in that way. But of course, PreserveContents mode is cheap on most Windows hardware…

  24. David Black says:

    >>The biggest restriction of the new API is that you cannot share a single depth buffer across multiple different rendertargets. However, it is usually possible to achieve this kind of rendering architecture (on Windows, anyway) just by arranging things so that any time you want to reuse the depth buffer, you are drawing to the same rendertarget.

    For instance, the classic deferred shading optimization of using depth/stencil to cull light volumes is totally doable: you just need to arrange your buffer operations so that these light accumulation passes are done into the same rendertarget that was bound on index #0 during the initial scene rendering.

    <<

    I am not actually doing full deferred rendering… (reducing bandwidth is king:-)

    But the initial occlusion/z pass is a big gain, since I have some rather heavy shaders. Plus most importantly it can cull lights quite well and avoid the need for rendering shadow maps.

    What concerns me is the cost of ensuring the target with the right depth buffer is always #0. So I have to copy colour data (twice the size of depth stencil when using HDR). Bandwidth is one of the largest bottlenecks, now and increasingly so in the future:-(

    Or I can always set target #0 as the same thing and disable colour writes. Apart from being a pain to adapt shaders for, I am concerned that hardware isn’t smart enough to avoid either consuming EDRAM or consuming other resources (eg ROP units).

    Also this assumes that MRT is present, which could be a big problem for apps targeting Reach…

    What would be really cool, assuming we are stuck with this design, would be a method to blit depth buffers. Perhaps not possible on Windows DX9 though.

    David

  25. All this is nice, but

    as I understand it, if we share a single depth buffer across all our rendertargets in our render pipeline,

    the depth rendertarget has to be stored in the special RAM on the Xbox 360, so you don’t have 10MB of RAM left for all your other rendertargets

    if we can not share the depth buffer, then no light, AO, and so on…

    we can do this by rendering a fresh copy of the depth buffer every time..

    Hey, we can only push 200 draw calls on the Xbox 360

    let us divide it up

    color, normal, depth,

    let’s say 40 3D models

    we are now at 40 draw calls

    and now some shadows

    we are now at 80 draw calls

    some light, AO and other postprocessing

    let’s say 7 draw calls

    now we need some explosions and some bullets

    we are now at 130 draw calls

    we need a sky and perhaps some water and a nice terrain, we are now at 133 draw calls

    we also need some spotlights and pointlights

    so we can have 70 lights before we hit the limit

    this is bad, we simply can not do a real 3D game with these limitations

    bad idea if we have to render a fresh copy of the depth buffer if we’re going to do AO as a postprocess…

  26. Still Alarmed says:

    > "You can certainly do the classic depth volume optimization … including with Game Studio 4.0, if you specify PreserveContents rendertarget usage … PreserveContents mode is cheap on most Windows hardware…"

    But only by keeping the same target in slot 0 for the buffer and lighting passes?

    This consumes fill rate even if target is not referenced in shader, and even if the ColorWrite mode is disabled.

    Never mind losing the ability to share the depth with the back buffer (for pre-pass, or just to reduce memory).

    > "Some people assume that because I wrote one of the earlier papers about this technique, I must be a dyed-in-the-wool fan, which is definitely not true."

    All we’re asking is for the XNA team to consider those people who DO use these techniques.

    > "I don’t think "lowest common denominator" is entirely fair"

    Can you elaborate on the XNA team’s goals with the platform?

    Every release further limits how the system can be used, so obviously serious games are not being considered when decisions are made.

    XNA is catering more and more to the 2D bubble-popper / 3D BasicEffect shovelware crowd.

    The team has to be aware of this, so there must be some drive behind it – why continue to limit the platform?

  27. Matan says:

    Hi Shawn, first of all I have to say thanks for the information, been following your posts lately.

    I also wanted to add that GS 4.0 seems to bring back a bug from GS 2.0 which I believe was fixed in GS 3.0:

    https://connect.microsoft.com/feedback/ViewFeedback.aspx?FeedbackID=318195&SiteID=226

    You guys probably want to fix this as soon as possible, hopefully before the official release.

  28. ShawnHargreaves says:

    > What would be really cool, assuming we are stuck with this design, would be a method to blit depth buffers. Perhaps not possible on Windows DX9, though.

    To be clear: for Game Studio 4.0, this design will ship exactly as described above.

    For future versions, we are definitely interested in hearing what additional scenarios you guys are interested in having supported, to make sure we are focusing our thinking and design efforts in the right areas!

  29. ShawnHargreaves says:

    > Can you elaborate on the XNA team’s goals with the platform?

    I thought Michael summed this up well in his MIX talk: this is a little corny but actually very accurate 🙂

    "Our goal is to create a game development platform which is powerful, productive, and portable".

    There is obviously some tension between those three goals, so balancing is required. If you only care about one or two out of the three, you will inevitably disagree with some of our decisions, in areas where we chose to prioritize a different goal to the one you care most about.

    It is absolutely not (and has never been) our goal to replicate the entire native DirectX API. We believe there is a vast and extremely interesting space at a somewhat higher level than typical native APIs, but still much lower level and closer to the metal than a game engine.

    > XNA is catering more and more to the 2D bubble-popper / 3D BasicEffect shovelware crowd.

    If that was true, we could have stopped at SpriteBatch, and skipped pretty much everything I’ve been working on the last couple of years!

    I don’t believe that simplified, more productive development and high quality games are in any way mutually exclusive, and I can assure you that high quality games are absolutely an important goal for our platform.

  30. ShawnHargreaves says:

    > I also wanted to add that GS 4.0 seem to bring back a bug from GS 2.0 which I believe was fixed in GS 3.0.

    > https://connect.microsoft.com/feedback/ViewFeedback.aspx?FeedbackID=318195&SiteID=226

    Hi Matan,

    This behavior should not have changed between GS3 and GS4. In both versions:

    – You can GetData from a texture at any time

    – You cannot SetData while the texture is set on the device

    If you are seeing something different to this with GS4, please file a bug on the Connect site (along with a repro app) so we can take a look at it.

    Is this behavior you are seeing with the Windows framework, or on Windows Phone, btw?

    Thanks!

  31. stringa says:

    I don't know where to ask this…

    Can we please have an HLSL QuickRef, similar to the GLSL quickref?

    thanks

    stringa

  32. Doug McNamara says:

    Hi Shawn,

    Why does the code below in 4.0 produce the output below? (I am using Reach because my Windows game will not run in HiDef with my graphics card, an ATI Radeon X1900.)

    This is a problem for me because I cannot get shadow mapping to work in a 4.0 Windows game; it works in 3.1.

    //---------- START CODE SEGMENT ----------

    ShadowTarget = new RenderTarget2D(GraphicsDevice, shadowMapSize, shadowMapSize, false, SurfaceFormat.Single, DepthFormat.Depth24Stencil8);

    Console.WriteLine(shadowMapSize + " : " + ShadowTarget.Format);

    SurfaceFormat qSF;
    DepthFormat qDF;
    int ms;

    GraphicsDevice.Adapter.QueryRenderTargetFormat(GraphicsProfile.HiDef, SurfaceFormat.Single, DepthFormat.Depth24Stencil8, 0, out qSF, out qDF, out ms);

    Console.WriteLine("{0} {1} {2}", qSF, qDF, ms);

    //---------- END CODE SEGMENT ----------

    //---------- OUTPUT ----------

    1024 : Color

    Single Depth24Stencil8 0

    Thanks,

    Doug

  33. Doug McNamara says:

    Hi Shawn,

    My mistake: I was querying with the HiDef profile. Does the Reach profile support Single surface formats? If not, it seems like a bug that my graphics card does not support HiDef, or that Reach does not support the Single format, since in 3.1 I can create Single format rendertargets.

    Thanks,

    Doug

  34. ShawnHargreaves says:

    Hey Doug,

    Reach profile does not support floating point surface formats, but HiDef does.

    If you have follow-on questions I would recommend the creators.xna.com forums: that's a better venue than blog comments for having a two-way conversation!

  35. Piotr says:

    I have a question.

    Let's say we want to use SurfaceFormat.Color instead of SurfaceFormat.Single for the depth buffer, so that blending stays enabled.

    I don't know if I'm thinking about this the right way. Is it possible to store the depth value across the RGBA channels (8 bits each) and keep the same precision as SurfaceFormat.Single? Store the first 8 bits in R, bits 8–16 in G, bits 16–24 in B, and bits 24–32 in A, then in the final combine effect read the depth back from all the channels and get the same result as with SurfaceFormat.Single? It should be possible with some math: multiply or divide by 256, 256*256, and 256*256*256, or something like that :). Is it possible?

    Sorry for my English :). I hope you know what I mean. :)

  36. Piotr says:

    To me it seems like shifting bits left or right in registers, as in assembler: multiply or divide by pow(2, bit_number).

  37. ShawnHargreaves says:

    Piotr: I recommend the create.msdn.com forums for this question – that's a better place than blog comments for tech support type discussions.

    In short: I don't entirely understand your question, but sure, you could write shader code to implement your own software-float format. That's going to be horribly complex and slow, though, and won't work if you apply any operations such as blending or filtering to these encoded values. Doesn't seem like a generally useful technique to me.
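    For what it's worth, the arithmetic Piotr is describing can be sketched language-neutrally. This is illustrative Python, not XNA shader code, and the function names are made up for the example: quantize a depth value in [0, 1) to 32 bits, split it into four 8-bit channels, and reassemble it.

    ```python
    def encode_depth_rgba(depth):
        """Pack a depth value in [0, 1) into four 8-bit channels (r, g, b, a)."""
        q = int(depth * 256 ** 4)   # quantize to 32 bits
        r = (q >> 24) & 255         # most significant 8 bits
        g = (q >> 16) & 255
        b = (q >> 8) & 255
        a = q & 255                 # least significant 8 bits
        return r, g, b, a

    def decode_depth_rgba(r, g, b, a):
        """Reassemble the packed channels back into a depth value."""
        return ((r << 24) | (g << 16) | (b << 8) | a) / 256 ** 4
    ```

    An actual shader would use frac() and multiplies rather than integer shifts, only has 32-bit floats to work with, and, as Shawn notes above, the encoded values are destroyed by any blending or texture filtering applied to the rendertarget.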

  38. Piotr says:

    Thanks Shawn. From my point of view it is strange that I can't use SurfaceFormat.Single with alpha blending enabled in the HiDef profile, especially since I'm not blending on that rendertarget, and in 3.1 it was possible. Another thing is that SurfaceFormat.Single has the same number of bits as SurfaceFormat.Color.

    I found a solution for my problem but hit another :) -> precision loss :).

    I described it here: forums.create.msdn.com/…/70326.aspx

  39. KDavila says:

    I have a question about using SurfaceFormat.HalfVector2. Is there any special consideration one has to make about rendertargets using this format when the device is lost? More specifically, when a game can switch between windowed and fullscreen modes. I get many strange errors when I toggle between modes, but only if I create and use rendertargets with this format.

  40. KDavila says:

    Well, it seems my problems were caused by that bug you mentioned in other forums about XNA 4.0 and cloned effects; although actually I didn't clone my effect, I just used the same reference as loaded by the content manager. I can tell that the bug with sampler states occurs not only when changing states on a cloned effect, but also tends to happen when using MRT. Strangely, the solution I found was to manually reset the sampler states that my MRT effect sets to null, not before using them but after drawing.

  41. JonnyOThan says:

    For anyone landing here and having trouble re-using a depth buffer, this might help:

    http://www.catalinzima.com/…/restoring-the-depth-buffer

  42. DJJ says:

    This tech talk is interesting, but in simple terms: my 11-year-old grandson purchased Terraria some time ago and put it on my computer. It worked just fine until very recently. It appears there was an update to a newer version, which neither he nor I can get to work. It loads, just starts, then crashes.

    The message that comes up says "System.ObjectDisposedException: Cannot access a disposed object. Object name: RenderTarget2D". I cannot find a solution to overcome this error. He is very unhappy that he can no longer use the game, but was quite happy with it before it was "fixed". What can be done to correct this problem? I understand it is a common problem with the new version. For his age, he is well experienced with computer games, so he wants to fix it. How to do it??

  43. ShawnHargreaves says:

    DJJ: you need to contact the Terraria developers for support with their program.  This blog is about XNA, not Terraria.
