norm and unorm in C++ AMP

04/02/2012

The concurrency::graphics namespace defines two new types called norm and unorm. Allow me to quote from our introductory blog post (which is prerequisite reading):

norm and unorm are wrappers over “float” and provide clamping behavior. norm and unorm clamp a floating point value into the range [-1.0, 1.0] and [0.0, 1.0] respectively.

The norm and unorm types are implemented in C++ and work in both cpu- and amp-restricted code. They are mainly used with C++ AMP textures, but nothing prevents you from using them wherever you want automatic clamping of float values.

Construction and Conversion

In general, norm/unorm values must be created using explicit construction or explicit casting. Clamping is performed at the time of construction. Here are some other characteristics of norm/unorm types and conversion:

Unlike primitive types in C++, norm/unorm variables have a default constructor that explicitly sets the value to zero (0.0f).
Implicit conversions (via construction or casting) exist for the following:

unorm -> norm
norm -> float
unorm -> float

When constructing a norm/unorm value from a non-float value (e.g. int, unsigned int or double), the value is first converted to a float and then clamped.

Here are some examples:

 // default values 
norm n1;          // n1 == 0.0f
unorm un1;        // un1 == 0.0f

// clamping
norm n2(-1.5f);   // n2 == -1.0f
unorm un2(1.66f); // un2 == 1.0f
unorm un3(-5.3f); // un3 == 0.0f

// explicit conversion from int with clamping
norm n4(5);       // n4 == 1.0f

// implicit conversion to float
float fn2 = n2;   // fn2 == -1.0f
float fun2 = un2; // fun2 == 1.0f
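If you want to experiment with these rules without a C++ AMP toolchain, the following is a minimal portable sketch. The norm_model and unorm_model structs and the check_construction function are hypothetical stand-ins that I made up for illustration; they are not the real concurrency::graphics types and only model clamping at construction and the implicit conversions listed above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for unorm: clamps to [0.0, 1.0] at construction.
struct unorm_model {
    float value = 0.0f;                        // default construction yields 0.0f
    unorm_model() = default;
    explicit unorm_model(float v) : value(std::min(1.0f, std::max(0.0f, v))) {}
    explicit unorm_model(int v) : unorm_model(static_cast<float>(v)) {}
    explicit unorm_model(double v) : unorm_model(static_cast<float>(v)) {}
    operator float() const { return value; }   // implicit unorm -> float
};

// Hypothetical stand-in for norm: clamps to [-1.0, 1.0] at construction.
struct norm_model {
    float value = 0.0f;                        // default construction yields 0.0f
    norm_model() = default;
    explicit norm_model(float v) : value(std::min(1.0f, std::max(-1.0f, v))) {}
    explicit norm_model(int v) : norm_model(static_cast<float>(v)) {}       // convert to float, then clamp
    explicit norm_model(double v) : norm_model(static_cast<float>(v)) {}    // convert to float, then clamp
    norm_model(const unorm_model& u) : value(u.value) {}                    // implicit unorm -> norm
    operator float() const { return value; }   // implicit norm -> float
};

// Mirrors the examples above using the stand-in types.
void check_construction() {
    norm_model n1;
    unorm_model un1;
    assert(n1.value == 0.0f && un1.value == 0.0f);

    norm_model n2(-1.5f);
    unorm_model un2(1.66f);
    unorm_model un3(-5.3f);
    assert(n2.value == -1.0f);
    assert(un2.value == 1.0f);
    assert(un3.value == 0.0f);

    norm_model n4(5);                          // int is converted to float, then clamped
    assert(n4.value == 1.0f);

    norm_model from_unorm = unorm_model(0.25f); // implicit unorm -> norm
    float as_float = n2;                        // implicit norm -> float
    assert(from_unorm.value == 0.25f);
    assert(as_float == -1.0f);
}
```

Marking the float, int and double constructors explicit is what forces the explicit construction described above, while the conversion operator and the unorm-to-norm constructor stay implicit.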

Operators

The norm/unorm types provide the same operators as float. Each operator performs the operation internally as a float and then clamps the result.

The following are operators defined for norm and unorm:

Assignment from the same type
Arithmetic and compound assignment: +,-,*,/,+=,-=,*=,/=
Unary: ++(postfix & prefix), --(postfix & prefix), - (negation only defined for norm)
Comparison operators: ==, !=, <, <=, >, >=

Note that while the subtraction operator is defined for unorm, the negation operator is not, as a negative value would not make sense for this type. If you negate a unorm anyway, the compiler converts it to a float and then performs the negation, so the result is a float.

Here’s an example of the negation operator:

 // Result of negation
auto neg_norm = -norm(0.8f);   // typeof(neg_norm) == norm
auto neg_unorm = -unorm(0.8f); // typeof(neg_unorm) == float
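To make the "compute as float, then clamp" rule and the negation behavior concrete, here is a small portable sketch. As before, norm_model, unorm_model and check_operators are hypothetical names of my own, not the real C++ AMP types, and only a handful of operators are modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for norm: clamps to [-1.0, 1.0] at construction.
struct norm_model {
    float value = 0.0f;
    norm_model() = default;
    explicit norm_model(float v) : value(std::min(1.0f, std::max(-1.0f, v))) {}
    operator float() const { return value; }
};

// Hypothetical stand-in for unorm: clamps to [0.0, 1.0] at construction.
struct unorm_model {
    float value = 0.0f;
    unorm_model() = default;
    explicit unorm_model(float v) : value(std::min(1.0f, std::max(0.0f, v))) {}
    operator float() const { return value; }
    // Deliberately no unary operator- for this type.
};

// Each operator computes in float and re-clamps by constructing a new value.
inline norm_model operator+(norm_model a, norm_model b) { return norm_model(a.value + b.value); }
inline norm_model operator-(norm_model a, norm_model b) { return norm_model(a.value - b.value); }
inline norm_model operator-(norm_model a) { return norm_model(-a.value); }
inline unorm_model operator-(unorm_model a, unorm_model b) { return unorm_model(a.value - b.value); }

void check_operators() {
    // 0.9f + 0.3f is about 1.2f as a float, which clamps to 1.0f.
    norm_model sum = norm_model(0.9f) + norm_model(0.3f);
    assert(sum.value == 1.0f);

    // 0.2f - 0.5f is negative as a float, which clamps to 0.0f for unorm.
    unorm_model diff = unorm_model(0.2f) - unorm_model(0.5f);
    assert(diff.value == 0.0f);

    // Negating the norm stand-in stays in the same type.
    auto neg_norm = -norm_model(0.8f);
    static_assert(std::is_same<decltype(neg_norm), norm_model>::value, "stays norm_model");
    assert(neg_norm.value == -0.8f);

    // Negating the unorm stand-in goes through the implicit float conversion.
    auto neg_unorm = -unorm_model(0.8f);
    static_assert(std::is_same<decltype(neg_unorm), float>::value, "becomes float");
    assert(neg_unorm == -0.8f);
}
```

Because the unorm stand-in has no unary minus of its own, overload resolution falls back to the built-in float negation via the implicit conversion, which is the same mechanism described above.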

Usage in C++ AMP

The norm/unorm data types can be used just like any other type in C++ AMP. You can create a concurrency::array or concurrency::array_view with these types. The storage for these types in this context has the same size and layout as a regular float. The same applies to a local or tile_static variable of type norm/unorm, and to a norm/unorm value captured in the lambda passed to parallel_for_each.

The only difference comes when using norm/unorm with concurrency::texture. Textures use a special bit format for storing these types, which will be covered in a future blog post when we look at textures in more detail.

Helpful Tips

The header file amp_graphics.h also defines some macros to help you code against the boundary values of these types:

NORM_ZERO -> norm(0.0f)
NORM_MIN -> norm(-1.0f)
NORM_MAX -> norm(1.0f)
UNORM_ZERO -> unorm(0.0f)
UNORM_MIN -> unorm(0.0f)
UNORM_MAX -> unorm(1.0f)

Comparison with HLSL data types

There are some differences between the C++ AMP norm/unorm types and the ‘snorm float’ and ‘unorm float’ types in Direct3D HLSL. If you are not interested in those, you can safely skip this section. For simplicity I’ll refer to just the C++ AMP norm type and the HLSL snorm type (the statements here also apply to the unorm versions).

In HLSL:

The snorm keyword is a modifier for the float type. It only has an effect when a value is stored into an RWTexture with element type ‘snorm float’ or ‘snorm floatN’, so you will only see clamping at that point. Any intermediate operation on variables of this type behaves the same as on a float: there is no clamping between operations, on construction or on conversion.

Take the following HLSL code for example:

 StructuredBuffer<snorm float> buff_a : register(t0);
RWStructuredBuffer<snorm float> results : register(u0);
RWTexture1D<snorm float> tex : register(u1);
...
void CSMain( uint3 DTid : SV_DispatchThreadID ) {
    snorm float v1 = buff_a[0]; // 0.9f
    snorm float v2 = buff_a[1]; // 0.3f
    snorm float v3 = buff_a[2]; // 0.4f
    
    snorm float result = (v1 + v2) - v3; // result == 0.8f
    float resultf =      (v1 + v2) - v3; // resultf == 0.8f

    // storage example
    result = 1.5f;            // result == 1.5f
    results[DTid.x] = result; // results[DTid.x] == 1.5f
    tex[DTid.x] = result;     // tex[DTid.x] == 1.0f
}

Here you see that only the assignment into an RWTexture causes a value to actually get clamped.

In C++ AMP:

In contrast, the C++ AMP norm and unorm types are first-class data types with explicitly defined operators. Again, clamping is performed at construction, conversion, assignment and after every operation.

The following C++ AMP code evaluates the same three-variable expression:

array_view<norm, 1> datav(...);
norm v1(0.9f); // construction from float must be explicit
norm v2(0.3f);
norm v3(0.4f);
parallel_for_each(datav.extent, [=](index<1> idx) restrict(amp) {
    norm result = (v1 + v2) - v3;   // result == 0.6f
    float resultf = (v1 + v2) - v3; // resultf == 0.6f
    datav[idx] = (v1 + v2) - v3;    // datav[idx] == 0.6f
});

The + operation (0.9f + 0.3f) clamped the value from 1.2f to 1.0f before performing the subtraction. This is why the same expression in HLSL can produce a different result in C++ AMP.
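The arithmetic behind the two results can be checked with plain floats. This is a sketch only: clamp_norm, eval_clamp_each_op, eval_clamp_on_store and check_evaluation_order are hypothetical helper names that I use here to mimic the two evaluation orders described above, not functions from C++ AMP or HLSL.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Clamp a float to the norm range [-1.0, 1.0].
inline float clamp_norm(float v) { return std::min(1.0f, std::max(-1.0f, v)); }

// C++ AMP-style evaluation: every input and every intermediate result is clamped.
inline float eval_clamp_each_op(float a, float b, float c) {
    float sum = clamp_norm(clamp_norm(a) + clamp_norm(b)); // 0.9 + 0.3 -> 1.2 -> 1.0
    return clamp_norm(sum - clamp_norm(c));                // 1.0 - 0.4 -> 0.6
}

// HLSL-style evaluation: plain float math, clamped once when the value is stored.
inline float eval_clamp_on_store(float a, float b, float c) {
    return clamp_norm((a + b) - c);                        // 1.2 - 0.4 -> 0.8
}

void check_evaluation_order() {
    const float eps = 1e-6f; // tolerance for float rounding
    assert(std::fabs(eval_clamp_each_op(0.9f, 0.3f, 0.4f) - 0.6f) < eps);
    assert(std::fabs(eval_clamp_on_store(0.9f, 0.3f, 0.4f) - 0.8f) < eps);
}
```

The two functions only disagree when an intermediate value leaves the [-1.0, 1.0] range; for inputs whose partial results stay in range they return the same value.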

Keep this in mind when porting existing HLSL code to C++ AMP. To preserve the behavior of the HLSL code, keep your temporary values and variables as float and convert to norm/unorm only just before writing them to a texture:

array_view<norm, 1> datav(...);
texture<norm, 1> tex(...);
norm v1(0.9f); // construction from float must be explicit
norm v2(0.3f);
norm v3(0.4f);
parallel_for_each(tex.extent, [=](index<1> idx) restrict(amp) {
    float f1(v1), f2(v2), f3(v3);   // convert to float
    float resultf = (f1 + f2) - f3; // resultf == 0.8f

    datav[idx] = norm(resultf); // datav[idx] == 0.8f
    tex[idx] = norm(resultf);   // tex[idx] == 0.8f
});

One impact of using norm/unorm is that you’ll generate more instructions (to perform the clamping), which may have a negative impact on performance.

Summary

In this post I have covered the norm and unorm data types defined in the concurrency::graphics namespace and shown some of the intricacies you may want to be aware of when using them. Please feel free to share your comments below or at our MSDN forum.