Why can’t I copy the depth stencil information when using feature level 9_x on Windows 7 / 8?

One of our CAD partners ran across this problem a few months back, and it has taken us some time to get to the bottom of it. While this might not be a common problem, I thought blogging about it might give some context on the various Direct3D feature levels, in this case feature level 9_x.

D3D 10.0 shipped without support for copying the depth stencil. We quickly followed up with D3D 10.1 and added this feature. In Windows 8 we decided to move away from D3D 9 and introduce the feature level concept. Early in the design process it was decided that feature levels would inherit the features and capabilities of their parents. Since D3D 10 did not support copying the depth stencil buffer, feature level 9_x naturally lacks this capability as well. This is strictly a design decision and does not represent a corresponding technical limitation.

Those of you familiar with D3D 9 will note that copying the depth stencil is available in this legacy technology. Because of this, there is a disconnect between what D3D 9 can do and what D3D 11 at feature level 9_x can do. Be aware that D3D 11 feature level 9_x is not a complete backport of D3D 9, so not all D3D 9 features are available in it. We are updating the documentation with other feature parity problems as we encounter them, so keep your eyes on the docs for future updates.

This is where it gets really complicated. There are a number of workarounds that let you copy the depth stencil at feature level 9_x. Unfortunately, not all of them work on all platforms. On Windows 8 we document how you can use a pixel shader to copy the depth information (reference below), but this process does not work on Windows 7. Calling “ID3D11DeviceContext::CopyResource” will always fail for feature level 9_x on both Windows 7 and Windows 8.

Luckily, one of the brilliant minds on the DirectX team (not me) came up with an ingenious workaround that works on both Windows 7 and Windows 8. There are a couple of things to consider. Because it uses multiple render targets, you might see a performance hit when copying the depth information. It also requires feature level 9_3, because lower feature levels don’t support formats like R32_FLOAT. Experiment and then use your best judgment before committing to this workaround.
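To make the feature level requirement concrete, here is a minimal sketch of a check you might put in front of this workaround. The enum is a local stand-in that mirrors the `D3D_FEATURE_LEVEL` values from the Windows SDK so the snippet compiles on its own, and `CanUseMrtDepthCopy` is a hypothetical helper name, not part of any API.

```cpp
#include <cassert>

// Local mirror of the D3D_FEATURE_LEVEL values from the Windows SDK,
// repeated here only so this sketch compiles without the SDK headers.
enum class FeatureLevel : unsigned {
    Level9_1  = 0x9100,
    Level9_2  = 0x9200,
    Level9_3  = 0x9300,
    Level10_0 = 0xa000,
    Level10_1 = 0xa100,
    Level11_0 = 0xb000,
};

// Hypothetical helper: returns true when the device's feature level is
// high enough for the MRT-based depth copy (9_3 or above, since the
// workaround needs an extra R32_FLOAT render target).
bool CanUseMrtDepthCopy(FeatureLevel level)
{
    assert(level >= FeatureLevel::Level9_1 && "unexpected feature level");
    return level >= FeatureLevel::Level9_3;
}
```

In a real application the value would come from `ID3D11Device::GetFeatureLevel` rather than a hand-written enum.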

The workaround for the D3D 11 feature level 9_x depth stencil copy problem is to use multiple render targets (MRT) to generate a copy of the depth buffer’s contents in a separate render target (call it “temp”) during the initial rendering pass. Then, when a copy of the depth buffer is needed, bind “temp” as a texture and draw a quad sized 1:1 with temp, sampling each texel from temp and outputting the depth value (float depth : SV_Depth) from the pixel shader.

Note that in the initial rendering pass this requires binding an additional render target with a format like R32_FLOAT (supported in 9_3 but not 9_1). So whereas the original rendering pass would have had a depth buffer and some sort of color render target, there are now three buffers bound: the depth buffer, the color buffer, and an additional R32_FLOAT render target. The pixel shader outputs not only the color that it normally did but also the depth value, which goes to the R32_FLOAT render target.
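As a rough illustration of that initial pass, the sketch below holds a hypothetical HLSL vertex/pixel shader pair in a C++ string, the way it might be handed to a shader compiler. Everything named here (`kFirstPassHlsl`, `VSMain`, `PSMain`, `WorldViewProj`, the flat white color) is an assumption made for the example, not code from the DirectX team.

```cpp
#include <string>

// Hypothetical HLSL for the initial rendering pass (names are illustrative).
// The vertex shader passes clip-space z and w down as separate interpolants;
// the pixel shader divides them and writes the result to a second render
// target, assumed to be the R32_FLOAT "temp" surface.
const std::string kFirstPassHlsl = R"hlsl(
cbuffer PerObject : register(b0)
{
    float4x4 WorldViewProj;    // assumed transform constant
};

struct VSOut
{
    float4 pos   : SV_Position;
    float2 depth : TEXCOORD0;  // x = clip-space z, y = clip-space w
};

struct PSOut
{
    float4 color : SV_Target0; // the usual color render target
    float4 depth : SV_Target1; // extra R32_FLOAT target; only .x is stored
};

VSOut VSMain(float4 pos : POSITION)
{
    VSOut o;
    o.pos   = mul(pos, WorldViewProj);
    o.depth = o.pos.zw;        // interpolate z and w separately
    return o;
}

PSOut PSMain(VSOut i)
{
    PSOut o;
    o.color = float4(1, 1, 1, 1);  // placeholder for the real shading
    o.depth = float4(i.depth.x / i.depth.y, 0, 0, 0);  // z divided by w
    return o;
}
)hlsl";
```

A source like this would typically be compiled against the `vs_4_0_level_9_3` and `ps_4_0_level_9_3` targets, with the color view and the R32_FLOAT view both passed to `ID3D11DeviceContext::OMSetRenderTargets` alongside the depth stencil view.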
 
The R32_FLOAT surface can then be bound as an input texture in a “copy” rendering pass, which renders a single quad while the destination depth buffer is bound as the depth buffer. In this pass, the pixel shader does a .Sample (point sampling) on the current pixel in the (same sized) R32_FLOAT surface and writes the value out to the depth buffer using SV_Depth. When a pixel shader outputs SV_Depth, it replaces the normal interpolated depth value with an arbitrary shader-generated value. Configure the depth state so that the depth test always passes. The result of the “copy” rendering pass is that the contents of the R32_FLOAT “temp” surface get dumped into the destination depth buffer. It might be necessary to bind a dummy render target in this pass. When computing the depth value in the pixel shader, keep in mind that you should interpolate z and w separately from the vertex shader and divide z by w in the pixel shader. Also, this workaround won’t work at feature level 9_1. If you need to support FL 9_1 hardware, you will have to fall back to software rendering and target FL 9_3 as a minimum there.
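The “copy” pass might look something like the following sketch. Again the HLSL lives in a C++ string, and every name in it (`kCopyPassHlsl`, `TempDepth`, `PointSampler`, `QuadVertex`, `MakeFullScreenQuad`) is hypothetical. The quad helper assumes a four-vertex triangle strip with positions already in clip space.

```cpp
#include <array>
#include <string>

// Hypothetical HLSL for the "copy" pass: sample the R32_FLOAT "temp"
// surface and write the value out as the pixel's depth via SV_Depth.
const std::string kCopyPassHlsl = R"hlsl(
Texture2D    TempDepth    : register(t0); // the R32_FLOAT "temp" surface
SamplerState PointSampler : register(s0); // assumed point-filtered sampler

struct VSOut
{
    float4 pos : SV_Position;
    float2 uv  : TEXCOORD0;
};

struct PSOut
{
    float4 color : SV_Target0; // dummy render target, if one is needed
    float  depth : SV_Depth;   // replaces the interpolated depth value
};

VSOut VSMain(float2 pos : POSITION, float2 uv : TEXCOORD0)
{
    VSOut o;
    o.pos = float4(pos, 0.0f, 1.0f); // quad is supplied in clip space
    o.uv  = uv;
    return o;
}

PSOut PSMain(VSOut i)
{
    PSOut o;
    o.color = float4(0, 0, 0, 0);
    o.depth = TempDepth.Sample(PointSampler, i.uv).r;
    return o;
}
)hlsl";

// One vertex of the quad: clip-space position plus texture coordinate.
struct QuadVertex { float x, y, u, v; };

// Four vertices for a triangle strip covering the whole render target,
// so each pixel samples the matching texel of the same-sized surface.
std::array<QuadVertex, 4> MakeFullScreenQuad()
{
    return {{
        {-1.0f,  1.0f, 0.0f, 0.0f},  // top-left
        { 1.0f,  1.0f, 1.0f, 0.0f},  // top-right
        {-1.0f, -1.0f, 0.0f, 1.0f},  // bottom-left
        { 1.0f, -1.0f, 1.0f, 1.0f},  // bottom-right
    }};
}
```

For the “always pass” depth state, one plausible setup is a `D3D11_DEPTH_STENCIL_DESC` with `DepthEnable` set to TRUE, `DepthWriteMask` set to `D3D11_DEPTH_WRITE_MASK_ALL`, and `DepthFunc` set to `D3D11_COMPARISON_ALWAYS`.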
