Staging Texture in C++ AMP

C++ AMP in Visual Studio 2012 let you create a staging array, which helps reduce the cost of transferring data between the host and an accelerator_view. In C++ AMP in Visual Studio 2013, we have enhanced texture by adding the ability to create a staging texture, which brings the same benefits and has the same semantics as a staging array.

Creating Staging Texture

Like a staging array, a staging texture has the notion of an associated accelerator_view. The associated accelerator_view is a hint to the C++ AMP runtime about the intended source or destination of copies from and to the staging texture. For each constructor that creates a texture on an accelerator in C++ AMP in Visual Studio 2012, there is a corresponding constructor that creates a staging texture in C++ AMP in Visual Studio 2013. The only difference is that the constructors for staging textures (like those for staging arrays) take an additional parameter: the associated accelerator_view.

    // Here 'device' is an accelerator object, and WIDTH, HEIGHT and DEPTH are the texture dimensions.
    // The staging texture lives on the CPU accelerator's view; assoc_av is the associated accelerator_view.
    accelerator_view cpu_av = accelerator(accelerator::cpu_accelerator).default_view;
    accelerator_view assoc_av = device.create_view();

    extent<1> e1(WIDTH);
    texture<int, 1> tex_1(e1, cpu_av, assoc_av);

    unsigned int bpse = 16;  // bits per scalar element
    extent<2> e2(HEIGHT, WIDTH);
    texture<int, 2> tex_2(e2, bpse, cpu_av, assoc_av);

    extent<3> e3(DEPTH, HEIGHT, WIDTH);
    std::vector<int> vec(e3.size(), 10);
    texture<int, 3> tex_3(e3, vec.begin(), vec.end(), cpu_av, assoc_av);

Copy Constructing Staging Texture

A staging texture can be copy constructed either from another staging texture or from a texture on an accelerator, and a texture on an accelerator can likewise be copy constructed from a staging texture.

    // Copy construct a staging texture from another staging texture
    texture<int, 1> tex_copy_1(tex_1);

    // Copy construct a texture on accelerator from a staging texture
    accelerator_view av = device.create_view();
    texture<int, 2> tex_copy_2(tex_2, av);

    // Copy construct a staging texture from another staging texture,
    // specifying the accelerator_views explicitly
    accelerator_view cpu_av1 = accelerator(accelerator::cpu_accelerator).default_view;
    accelerator_view assoc_av1 = device.default_view;
    texture<int, 3> tex_copy_3(tex_3, cpu_av1, assoc_av1);

    // Create a texture on accelerator to copy from
    accelerator_view av2 = device.default_view;
    std::vector<int> data(e3.size(), 10);
    texture<int, 3> tex(e3, data.begin(), data.end(), av2);

    // Copy construct a staging texture from a texture on accelerator
    texture<int, 3> tex_copy_4(tex, cpu_av, assoc_av);

Accessing Staging Texture on CPU

A staging texture is designed primarily as a data storage medium that provides optimal copy performance. It is not intended for computation or data manipulation on the CPU. Hence, indexing and the get/set operations of a texture are supported only on an accelerator and are not available on a staging texture. However, you can obtain a pointer to the underlying data of a staging texture with the ‘data()’ member function and use it to modify that data on the host.

Because of memory alignment requirements, the underlying data of a staging texture is padded, so the size of each dimension cannot be used to navigate the raw data. Instead, a staging texture provides two properties, ‘row_pitch’ and ‘depth_pitch’, for navigating it. The ‘row_pitch’ of a 2D or 3D staging texture is the number of bytes from the start of one row to the start of the next row. The ‘depth_pitch’ of a 3D staging texture is the number of bytes from the start of one depth slice to the start of the next depth slice.

    // data() returns a pointer to the raw, padded storage of the staging texture
    int* ptr_data = static_cast<int*>(tex_3.data());
    int* ptr_slice = ptr_data;

    for(int i = 0; i < tex_3.extent[0]; i++)
    {
        int* ptr_row = ptr_slice;

        for(int j = 0; j < tex_3.extent[1]; j++)
        {
            for(int k = 0; k < tex_3.extent[2]; k++)
            {
                std::cout << ptr_row[k] << std::endl;
            }

            // Move to the next row (the pitch is in bytes, so step through a char*)
            ptr_row = reinterpret_cast<int*>(reinterpret_cast<char*>(ptr_row) + tex_3.row_pitch);
        }

        // Move to the next slice
        ptr_slice = reinterpret_cast<int*>(reinterpret_cast<char*>(ptr_slice) + tex_3.depth_pitch);
    }
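
The pitch arithmetic in the loop above can also be expressed as a direct byte-offset calculation. The following is only a minimal sketch in plain C++ that does not use C++ AMP at all: ‘PaddedVolume’, ‘make_volume’, ‘byte_offset’, ‘store_int’ and ‘load_int’ are hypothetical names introduced for illustration, and the row alignment passed to ‘make_volume’ is an assumed value. With a real staging texture, the pitches come from its ‘row_pitch’ and ‘depth_pitch’ properties and must not be computed by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the raw storage behind a 3D staging texture of int.
// This is NOT a C++ AMP type; it only mimics padded rows and slices.
struct PaddedVolume {
    std::size_t depth;                 // number of depth slices
    std::size_t height;                // rows per slice
    std::size_t width;                 // elements per row
    std::size_t row_pitch;             // bytes from one row to the next
    std::size_t depth_pitch;           // bytes from one slice to the next
    std::vector<unsigned char> bytes;  // raw storage, padding included
};

// Build a zero-filled volume whose rows are padded up to a multiple of
// row_alignment bytes. The alignment is an assumption made for this sketch;
// a real runtime chooses its own padding.
PaddedVolume make_volume(std::size_t depth, std::size_t height,
                         std::size_t width, std::size_t row_alignment) {
    assert(row_alignment > 0);
    PaddedVolume v;
    v.depth = depth;
    v.height = height;
    v.width = width;
    v.row_pitch = (width * sizeof(int) + row_alignment - 1)
                  / row_alignment * row_alignment;
    v.depth_pitch = v.row_pitch * height;
    v.bytes.assign(v.depth_pitch * depth, 0);
    return v;
}

// Byte offset of element (slice, row, col): slices advance by depth_pitch,
// rows by row_pitch, and columns by the element size.
std::size_t byte_offset(const PaddedVolume& v, std::size_t slice,
                        std::size_t row, std::size_t col) {
    assert(slice < v.depth && row < v.height && col < v.width);
    return slice * v.depth_pitch + row * v.row_pitch + col * sizeof(int);
}

// Write one element; memcpy avoids aliasing problems on the byte buffer.
void store_int(PaddedVolume& v, std::size_t slice, std::size_t row,
               std::size_t col, int value) {
    std::memcpy(v.bytes.data() + byte_offset(v, slice, row, col),
                &value, sizeof(int));
}

// Read one element back from the padded storage.
int load_int(const PaddedVolume& v, std::size_t slice, std::size_t row,
             std::size_t col) {
    int value;
    std::memcpy(&value, v.bytes.data() + byte_offset(v, slice, row, col),
                sizeof(int));
    return value;
}
```

For example, after ‘PaddedVolume v = make_volume(2, 3, 5, 16);’ the row pitch is a multiple of 16 bytes that is at least ‘5 * sizeof(int)’, ‘byte_offset(v, 1, 0, 0)’ equals ‘v.depth_pitch’, and a value written with ‘store_int(v, 1, 2, 4, 42)’ is read back by ‘load_int(v, 1, 2, 4)’. The same offset formula applies to the pointer returned by ‘data()’ when the pitches are taken from the staging texture itself.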

Interop with DirectX

Interop works the same way for staging textures as it does for textures created on accelerators. The only requirement for adopting a DirectX staging texture into C++ AMP is that it must be a read-write staging texture.

Creating Texture View on Staging Texture

You cannot create a texture view on top of a staging texture. A staging texture is not intended for texture-based computation on the CPU, so there is no CPU scenario that requires these types.

As with a staging array, a staging texture cannot be captured in a parallel_for_each kernel. Accessing the data underlying a staging texture while a copy operation is in progress can result in undefined behavior, and a pointer to the underlying data must not be cached across a copy operation involving the staging texture.

In Closing

I hope you now have a better understanding of staging textures that will enable you to use them efficiently in your code. As usual, I would love to read your comments below or in our MSDN forum.