|
@@ -0,0 +1,976 @@
|
|
|
|
+DXC Cookbook: HLSL Coding Patterns for SPIR-V
|
|
|
|
+=============================================
|
|
|
|
+
|
|
|
|
+Author: Steven Perron
|
|
|
|
+
|
|
|
|
+Date: Oct 22, 2018
|
|
|
|
+
|
|
|
|
+Introduction
|
|
|
|
+============
|
|
|
|
+
|
|
|
|
+This document provides a set of examples that demonstrate what will and
|
|
|
|
+will not be accepted by the DXC compiler when generating SPIR-V. The
|
|
|
|
+difficulty in defining what is acceptable is that it cannot be specified
|
|
|
|
+by a grammar. The entire program must be taken into consideration.
|
|
|
|
+Hopefully this will be useful.
|
|
|
|
+
|
|
|
|
+We are interested in how global resources are used. For a SPIR-V shader
|
|
|
|
+to be valid, accesses to global resources like structured buffers and
|
|
|
|
+images must be done directly on the global resources. They cannot be
|
|
|
|
+copied or have their address returned from functions. However, in HLSL,
|
|
|
|
+it is possible to copy a global resource or to pass it by reference to a
|
|
|
|
+function. Since this can be arbitrarily complex, DXC can generate valid
|
|
|
|
+SPIR-V only if the compiler is able to remove all of these copies.
|
|
|
|
+
|
|
|
|
+The transformations that are used to remove the copies will be the same
|
|
|
|
+for both structured buffers and images, so we have chosen to focus on
|
|
|
|
+structured buffers. The process of transforming the code in this way is
|
|
|
|
+called *legalization*.
|
|
|
|
+
|
|
|
|
+Support evolves over time as the optimizations in SPIRV-Tools are
|
|
|
|
+improved. At GDC 2018, Greg Fischer from LunarG
|
|
|
|
+`presented <http://schedule.gdconf.com/session/hlsl-in-vulkan-there-and-back-again-presented-by-khronos-group/856616>`__
|
|
|
|
+earlier results in this space. The DXC, Glslang, and SPIRV-Tools
|
|
|
|
+maintainers work together to handle new HLSL code patterns. This
|
|
|
|
+document represents the state of the DXC compiler in October 2018.
|
|
|
|
+
|
|
|
|
+Glslang does legalization as well. However, what it is able to legalize
|
|
|
|
+is different from DXC because of features it chooses to support, and the
|
|
|
|
+optimizations from SPIRV-Tools it chooses to run. For example, Glslang
|
|
|
|
+does not support structured buffer aliasing yet, so many of these
|
|
|
|
+examples will not work with Glslang.
|
|
|
|
+
|
|
|
|
+All of the examples are available in the DXC repository, at
|
|
|
|
+https://github.com/Microsoft/DirectXShaderCompiler/tree/master/tools/clang/test/CodeGenSPIRV/legal-examples
|
|
|
|
+. To open a link to Tim Jones' Shader Playground for an example, you can
|
|
|
|
+follow the URL in the comments of each example.
|
|
|
|
+
|
|
|
|
+Examples for structured buffers
|
|
|
|
+===============================
|
|
|
|
+
|
|
|
|
+Desired code
|
|
|
|
+------------
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 0-copy-sbuf-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/e6af2bdce0c61ed07d3a826aa8a95d45
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ gRWSBuffer[i] = gSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+This example shows code that directly translates to valid SPIR-V. In
|
|
|
|
+this case, we have two structured buffers. When one of their elements is
|
|
|
|
+accessed, it is done by naming the resource from which to get the
|
|
|
|
+element.
|
|
|
|
+
|
|
|
|
+Note that it is fine to copy an element of the structured buffer.
|
|
|
|
+
|
|
|
|
+Single copy to a local
|
|
|
|
+----------------------
|
|
|
|
+
|
|
|
|
+Cases that can be easily legalized are those where there is exactly one
|
|
|
|
+assignment to the local copy of the structured buffer. In this context,
|
|
|
|
+a local is either a global static or a function scope symbol, that is,
|
|
|
|
+something that can be accessed by only a single instance of the shader. When you
|
|
|
|
+have a single copy to a local, it is obvious which global is actually being
|
|
|
|
+used. This allows the compiler to replace a reference to the local
|
|
|
|
+symbol with the global resource.
|
|
|
|
+
|
|
|
|
+Initialization of a static
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 1-copy-global-static-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/815543dc91a4e6855a8d0c6a345d4a5a
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ static StructuredBuffer<S> sSBuffer = gSBuffer;
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ gRWSBuffer[i] = sSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+This example shows an implicitly addressed structured buffer
|
|
|
|
+``gSBuffer`` assigned to a static ``sSBuffer``. This copy is treated
|
|
|
|
+like a shallow copy. This is implemented by making ``sSBuffer`` a
|
|
|
|
+pointer to ``gSBuffer``.
|
|
|
|
+
|
|
|
|
+This example can be legalized because the compiler is able to see that
|
|
|
|
+``sSBuffer`` points to ``gSBuffer``, which does not move, so uses of
|
|
|
|
+``sSBuffer`` can be replaced by ``gSBuffer``.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 2-write-global-static-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/1c65c467e395383945d219a60edbe10c
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ static RWStructuredBuffer<S> sRWSBuffer = gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ sRWSBuffer[i].f = 0.0;
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+This example is similar to the previous example, except in this case the
|
|
|
|
+shallow copy becomes important. ``sRWSBuffer`` is treated like a pointer
|
|
|
|
+to ``gRWSBuffer``. As before, the references to ``sRWSBuffer`` can be
|
|
|
|
+replaced by ``gRWSBuffer``. This means that the write that occurs will
|
|
|
|
+be visible outside of the shader.
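+
+After legalization, the shader behaves as if the static had never been
+introduced. The following is a hand-written sketch of the equivalent
+source, not actual compiler output:
+
+.. code-block:: hlsl
+
+ // Sketch only: hand-written equivalent of 2-write-global-static-ok.hlsl
+ // once uses of sRWSBuffer have been replaced by gRWSBuffer.
+
+ struct S {
+   float4 f;
+ };
+
+ int i;
+
+ RWStructuredBuffer<S> gRWSBuffer;
+
+ void main() {
+   // The write goes directly to the global resource.
+   gRWSBuffer[i].f = 0.0;
+ }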
|
|
|
|
+
|
|
|
|
+Copy to function scope
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 3-copy-local-struct-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/77dd20774e4943044c2f1b630c539f07
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ CombinedBuffers cb;
|
|
|
|
+ cb.SBuffer = gSBuffer;
|
|
|
|
+ cb.RWSBuffer = gRWSBuffer;
|
|
|
|
+ cb.RWSBuffer[i] = cb.SBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+It is also possible to copy a structured buffer to a function scope
|
|
|
|
+symbol. This is similar to a copy to a static scope symbol. The local
|
|
|
|
+copy is really a pointer to the original. This example demonstrates that
|
|
|
|
+DXC can legalize the copy even if it is a copy to part of a structure.
|
|
|
|
+There are no specific restrictions on the structure. The structured
|
|
|
|
+buffers can be anywhere in the structure, and there can be any number of
|
|
|
|
+members. Structured buffers can be in nested structures of any depth.
|
|
|
|
+The following is a more complicated example.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 4-copy-local-nested-struct-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/14f59ff2a28c0a0180daf6ce4393cf6b
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct S2 {
|
|
|
|
+ CombinedBuffers cb;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct S1 {
|
|
|
|
+ S2 s2;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ S1 s1;
|
|
|
|
+ s1.s2.cb.SBuffer = gSBuffer;
|
|
|
|
+ s1.s2.cb.RWSBuffer = gRWSBuffer;
|
|
|
|
+ s1.s2.cb.RWSBuffer[i] = s1.s2.cb.SBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+Function parameters
|
|
|
|
+~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 5-func-param-sbuf-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/aeb06f527c5390d82d63bdb4eafc9ae7
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ void foo(StructuredBuffer<S> pSBuffer) {
|
|
|
|
+ gRWSBuffer[i] = pSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ foo(gSBuffer);
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+It is possible to pass a structured buffer as a parameter to a function.
|
|
|
|
+As with the copies in the previous section, it is a pointer to the
|
|
|
|
+structured buffer that is actually being passed to ``foo``. This is the
|
|
|
|
+same way that arrays work in C/C++.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 6-func-param-rwsbuf-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/f4e0194ce78118c0a709d85080ccea93
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ void foo(RWStructuredBuffer<S> pRWSBuffer) {
|
|
|
|
+ pRWSBuffer[i] = gSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ foo(gRWSBuffer);
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+The same is true for RW structured buffers. So in this case, the write
|
|
|
|
+to ``pRWSBuffer`` is changing ``gRWSBuffer``. This means that the write
|
|
|
|
+to ``pRWSBuffer`` will be visible outside of the function, and outside
|
|
|
|
+of the shader.
|
|
|
|
+
|
|
|
|
+Return values
|
|
|
|
+~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+The next two examples show that structured buffers can be a function's
|
|
|
|
+return value. As before, the return value of ``foo`` is really a pointer
|
|
|
|
+to the global resource.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 7-func-ret-tmp-var-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/d6b706423f02dad58fbb01841282c6a1
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ RWStructuredBuffer<S> foo() {
|
|
|
|
+ return gRWSBuffer;
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ RWStructuredBuffer<S> lRWSBuffer = foo();
|
|
|
|
+ lRWSBuffer[i] = gSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+In this case, the compiler will replace ``lRWSBuffer`` by
|
|
|
|
+``gRWSBuffer``.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 8-func-ret-direct-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/6edbbc1aa6c6b6533c5a728135f87fb9
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> foo() {
|
|
|
|
+ return gSBuffer;
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ gRWSBuffer[i] = foo()[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+This example is similar to the previous, but shows that you do not have
|
|
|
|
+to use an explicit temporary value.
|
|
|
|
+
|
|
|
|
+Conditional control flow
|
|
|
|
+------------------------
|
|
|
|
+
|
|
|
|
+The examples so far do not have any conditional control flow. This
|
|
|
|
+makes it obvious which resources are being used. The introduction of
|
|
|
|
+conditional control flow makes the job of the compiler much harder, and
|
|
|
|
+in some cases impossible. Remember that the compiler is trying to
|
|
|
|
+determine at compile time which resource will be used at run time. In
|
|
|
|
+this section, we will look at how control flow affects the compiler's
|
|
|
|
+ability to do this. The bottom line is that the compiler has to be able
|
|
|
|
+to turn all of the conditional control flow that affects which resources
|
|
|
|
+are used into straight line code.
|
|
|
|
+
|
|
|
|
+Inputs in if-statement
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+The first example is one where the compiler cannot determine which
|
|
|
|
+resource is actually being accessed.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 9-if-stmt-select-fail.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/2896e95627fd8a6689ca96c81a5c7c68
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+ if (constant > i) { // Condition can't be computed at compile time.
|
|
|
|
+ lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan.
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+In this example, ``lSBuffer`` could be either ``gSBuffer1`` or
|
|
|
|
+``gSBuffer2``. It depends on the value of ``i`` which is a parameter to
|
|
|
|
+the shader and cannot be known at compile time. At this time, the
|
|
|
|
+compiler is not able to convert this code into something that drivers
|
|
|
|
+will accept.
|
|
|
|
+
|
|
|
|
+If this is the pattern in your code, I would suggest rewriting the
|
|
|
|
+code into the following:
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 10-if-stmt-select-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/5063d8a0a7ad1f9d0839cd34a6d94dd2
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+ if (constant > i) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ gRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ gRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+Notice that this involves replicating code. If the code that follows the
|
|
|
|
+if-statement is long, you could consider moving it to a function, and
|
|
|
|
+having two calls to that function.
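+
+A minimal sketch of that suggestion follows. The helper name
+``copyElement`` is hypothetical and this file is not one of the examples
+in the repository. It relies on the behaviour described under "Function
+parameters": each call passes a single, known global resource.
+
+.. code-block:: hlsl
+
+ // Sketch only: shared code moved into a helper that is called from
+ // each branch. copyElement is a made-up name for illustration.
+
+ struct S {
+   float4 f;
+ };
+
+ int i;
+
+ StructuredBuffer<S> gSBuffer1;
+ StructuredBuffer<S> gSBuffer2;
+ RWStructuredBuffer<S> gRWSBuffer;
+
+ #define constant 0
+
+ // Holds the code that used to follow the if-statement.
+ void copyElement(StructuredBuffer<S> pSBuffer) {
+   gRWSBuffer[i] = pSBuffer[i];
+ }
+
+ void main() {
+   if (constant > i) {
+     copyElement(gSBuffer1);  // This call only ever sees gSBuffer1.
+   } else {
+     copyElement(gSBuffer2);  // This call only ever sees gSBuffer2.
+   }
+ }
+
+The condition still depends on ``i``, but it now selects between two
+calls rather than between two resources, so each access names exactly
+one global.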
|
|
|
|
+
|
|
|
|
+If-statements with constants
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+Not all control flow is a problem. There are situations where the
|
|
|
|
+compiler is able to determine that a condition is always true or always
|
|
|
|
+false. For example, in the following code, the compiler looks at ``0 > 2``,
|
|
|
|
+and knows that it is always false.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 11-if-stmt-const-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/7ef5b89b3ec3d56c22e1bca45b40516a
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+ if (constant > 2) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+The compiler will turn this code into the equivalent of:
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ gRWSBuffer[i] = gSBuffer2[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+The two previous examples show that handling control flow depends on
|
|
|
|
+what the compiler can do. This depends on the amount of optimization
|
|
|
|
+that is done, and which optimizations are done. In general, when you are
|
|
|
|
+writing code that will select a resource, keep the conditions as simple
|
|
|
|
+as possible to make it as easy as possible for the compiler to determine
|
|
|
|
+which path is taken.
|
|
|
|
+
|
|
|
|
+Switch statements
|
|
|
|
+~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+Switch statements are similar to if-statements. If the selector is a
|
|
|
|
+constant, then the compiler will be able to propagate the copies.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 12-switch-stmt-select-fail.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/b079f878daeba5d77842725b90a476ca
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+ switch(i) { // Compiler can't determine which case will run.
|
|
|
|
+ case 0:
|
|
|
|
+ lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan.
|
|
|
|
+ break;
|
|
|
|
+ default:
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ break;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+The compiler is not able to remove the copies in this example because it
|
|
|
|
+does not know the value of ``i`` at compile time.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 13-switch-stmt-const-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/a46dd1f1a84eba38c047439741ec08ab
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ const static int constant = 0;
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+ switch(constant) {
|
|
|
|
+ case 0:
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ break;
|
|
|
|
+ default:
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ break;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+However, if the selector is turned into a constant, the compiler can
|
|
|
|
+replace uses of ``lSBuffer`` by ``gSBuffer1``.
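+
+When the selector really does depend on a shader input, as in
+``12-switch-stmt-select-fail.hlsl``, the rewrite suggested for
+if-statements can be applied to the switch as well. The following is a
+sketch under that assumption, not one of the examples in the repository:
+
+.. code-block:: hlsl
+
+ // Sketch only: the access is moved into each case so that every load
+ // names a global resource directly.
+
+ struct S {
+   float4 f;
+ };
+
+ int i;
+
+ StructuredBuffer<S> gSBuffer1;
+ StructuredBuffer<S> gSBuffer2;
+ RWStructuredBuffer<S> gRWSBuffer;
+
+ void main() {
+   switch (i) {  // The selector is still a run-time value.
+     case 0:
+       gRWSBuffer[i] = gSBuffer1[i];
+       break;
+     default:
+       gRWSBuffer[i] = gSBuffer2[i];
+       break;
+   }
+ }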
|
|
|
|
+
|
|
|
|
+Loop Induction Variables in conditions
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+Besides inputs, another type of variable that hinders the compiler is
|
|
|
|
+loop induction variables. These are variables that change value for each
|
|
|
|
+iteration of the loop. Consider this example.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 14-loop-var-fail.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/8df364770e3f425e6321e71f817bcd1a
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+
|
|
|
|
+ for( int j = 0; j < 2; j++ ) {
|
|
|
|
+ if (constant > j) { // Condition is different for different iterations
|
|
|
|
+ lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan.
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[j] = lSBuffer[j];
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+In this example, ``j`` is an induction variable. It takes on the values
|
|
|
|
+``0`` and ``1``. The information is there to be able to determine which
|
|
|
|
+path is taken in each iteration, but the compiler does not figure this
|
|
|
|
+out by default.
|
|
|
|
+
|
|
|
|
+If you want the compiler to be able to legalize this code, then you will
|
|
|
|
+have to direct the compiler to unroll this loop using the unroll
|
|
|
|
+attribute. The following example can be legalized by the compiler:
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 15-loop-var-unroll-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/3d0f6f830fc4a5102714e19c748e81c7
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+
|
|
|
|
+ [unroll]
|
|
|
|
+ for( int j = 0; j < 2; j++ ) {
|
|
|
|
+ if (constant > j) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[j] = lSBuffer[j];
|
|
|
|
+ }
|
|
|
|
+ }
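+
+Conceptually, once the loop is fully unrolled, ``j`` is a known constant
+in each copy of the body and the condition can be folded. The following
+hand-written sketch shows the equivalent source; it is not actual
+compiler output:
+
+.. code-block:: hlsl
+
+ // Sketch only: hand-written equivalent of 15-loop-var-unroll-ok.hlsl
+ // after unrolling and constant folding.
+
+ struct S {
+   float4 f;
+ };
+
+ StructuredBuffer<S> gSBuffer1;
+ StructuredBuffer<S> gSBuffer2;
+ RWStructuredBuffer<S> gRWSBuffer;
+
+ void main() {
+   // j == 0: (0 > 0) is false, so the else branch selects gSBuffer2.
+   gRWSBuffer[0] = gSBuffer2[0];
+   // j == 1: (0 > 1) is false, so the else branch selects gSBuffer2.
+   gRWSBuffer[1] = gSBuffer2[1];
+ }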
|
|
|
|
+
|
|
|
|
+Variable iteration counts
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+Adding the unroll attribute to loops does not guarantee that the
|
|
|
|
+compiler is able to legalize the code. The compiler has to be able to
|
|
|
|
+fully unroll the loop. That means the compiler will have to create a
|
|
|
|
+copy of the body of the loop for each iteration so that there is no loop
|
|
|
|
+anymore. That can only be done if the number of iterations can be known
|
|
|
|
+at compile time.
|
|
|
|
+
|
|
|
|
+This means that the compiler must be able to determine the initial
|
|
|
|
+value, the final value, and the step for the induction variable, ``j``
|
|
|
|
+in the example. None of ``foo1``, ``foo2``, or ``foo3`` can be legalized
|
|
|
|
+because the number of iterations cannot be known at compile time.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 16-loop-var-range-fail.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/376f5f985c3ceceea004ab58edb336f2
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void foo1() {
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+
|
|
|
|
+ [unroll]
|
|
|
|
+ for( int j = i; j < 2; j++ ) { // Compiler can't determine the initial value
|
|
|
|
+ if (constant > j) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[j] = lSBuffer[j];
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void foo2() {
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+
|
|
|
|
+ [unroll]
|
|
|
|
+ for( int j = 0; j < i; j++ ) { // Compiler can't determine the end value
|
|
|
|
+ if (constant > j) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[j] = lSBuffer[j];
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void foo3() {
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+
|
|
|
|
+ [unroll]
|
|
|
|
+ for( int j = 0; j < 2; j += i ) { // Compiler can't determine the step count
|
|
|
|
+ if (constant > j) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[j] = lSBuffer[j];
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ foo1(); foo2(); foo3();
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+As before, the compiler will try to simplify expressions to determine
|
|
|
|
+their value at compile time, but it may not always be successful. We
|
|
|
|
+would recommend that you keep the expressions for the loop bounds as
|
|
|
|
+simple as possible to increase the chances the compiler can figure it
|
|
|
|
+out.
|
|
|
|
+
|
|
|
|
+Other restrictions on unrolling
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+Not being able to determine the iteration count at compile time is a
|
|
|
|
+fundamental problem. No matter how good the compiler is, it will never
|
|
|
|
+be able to fully unroll the loop. However, due to the internal details
|
|
|
|
+(algorithms in the SPIRV-Tools optimizer), other cases cannot be
|
|
|
|
+handled. The most notable one is that the induction variable must be an
|
|
|
|
+integral type.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 17-loop-var-float-fail.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/d5d2598699378688684a4a074553dddf
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ struct CombinedBuffers {
|
|
|
|
+ StructuredBuffer<S> SBuffer;
|
|
|
|
+ RWStructuredBuffer<S> RWSBuffer;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> lSBuffer;
|
|
|
|
+
|
|
|
|
+ [unroll]
|
|
|
|
+ for( float j = 0; j < 2; j++ ) { // Can't infer floating point induction values
|
|
|
|
+ if (constant > j) {
|
|
|
|
+ lSBuffer = gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ lSBuffer = gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ gRWSBuffer[j] = lSBuffer[j];
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+This example cannot be legalized because ``j`` is a ``float``.
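+
+If the loop body needs a floating point value, one possible workaround
+is to keep an integral induction variable and derive the ``float`` from
+it inside the body. This is a sketch under that assumption: the name
+``fj`` and the multiplication are placeholders for illustration, and the
+file is not one of the examples in the repository.
+
+.. code-block:: hlsl
+
+ // Sketch only: integral induction variable, float derived from it.
+
+ struct S {
+   float4 f;
+ };
+
+ StructuredBuffer<S> gSBuffer1;
+ StructuredBuffer<S> gSBuffer2;
+ RWStructuredBuffer<S> gRWSBuffer;
+
+ #define constant 0
+
+ void main() {
+   StructuredBuffer<S> lSBuffer;
+
+   [unroll]
+   for (int j = 0; j < 2; j++) {  // Integral induction variable.
+     float fj = (float)j;         // Float value computed in the body.
+     if (constant > j) {
+       lSBuffer = gSBuffer1;
+     } else {
+       lSBuffer = gSBuffer2;
+     }
+     // Placeholder use of the float value.
+     gRWSBuffer[j].f = lSBuffer[j].f * fj;
+   }
+ }
+
+The loop has the same shape as ``15-loop-var-unroll-ok.hlsl``, so the
+same unrolling should apply.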
|
|
|
|
+
|
|
|
|
+Other interesting cases
|
|
|
|
+-----------------------
|
|
|
|
+
|
|
|
|
+Multiple calls to a function
|
|
|
|
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 18-multi-func-call-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/e7b3ac1262a291c92902fd3f1fd3343c
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer1;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer2;
|
|
|
|
+
|
|
|
|
+
|
|
|
|
+ void foo(RWStructuredBuffer<S> pRWSBuffer) {
|
|
|
|
+ pRWSBuffer[i] = gSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ foo(gRWSBuffer1);
|
|
|
|
+ foo(gRWSBuffer2);
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+In this example, we see the same function is called twice. Each call has
|
|
|
|
+a different parameter. This can look like a problem because
|
|
|
|
+``pRWSBuffer`` could be either ``gRWSBuffer1`` or ``gRWSBuffer2``.
|
|
|
|
+However, the compiler is able to work around this by creating a separate
|
|
|
|
+copy of ``foo`` for each call site. In fact, these copies will be placed
|
|
|
|
+inline.
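+
+The following hand-written sketch shows what the shader is equivalent to
+once both calls have been inlined. It is an illustration, not actual
+compiler output:
+
+.. code-block:: hlsl
+
+ // Sketch only: hand-written equivalent of 18-multi-func-call-ok.hlsl
+ // after each call to foo has been inlined.
+
+ struct S {
+   float4 f;
+ };
+
+ int i;
+
+ StructuredBuffer<S> gSBuffer;
+ RWStructuredBuffer<S> gRWSBuffer1;
+ RWStructuredBuffer<S> gRWSBuffer2;
+
+ void main() {
+   gRWSBuffer1[i] = gSBuffer[i];  // From foo(gRWSBuffer1).
+   gRWSBuffer2[i] = gSBuffer[i];  // From foo(gRWSBuffer2).
+ }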
|
|
|
|
+
|
|
|
|
+Multiple returns
|
|
|
|
+~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+As we have already seen, a return from a function is a copy. At this
|
|
|
|
+point, it would be fair to ask what happens if there are multiple
|
|
|
|
+returns.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 19-multi-func-ret-fail.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/922facb688a5ba09b153d64cf1fc4557
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer1;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer2;
|
|
|
|
+
|
|
|
|
+ RWStructuredBuffer<S> foo(int l) {
|
|
|
|
+ if (l == 0) { // Compiler does not know which branch will be taken:
|
|
|
|
+ // Branch taken depends on input i.
|
|
|
|
+ return gRWSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ return gRWSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ RWStructuredBuffer<S> lRWSBuffer = foo(i);
|
|
|
|
+ lRWSBuffer[i] = gSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+The compiler is not able to legalize this example because it does not
|
|
|
|
+know which value will be returned. However, if the compiler is able to
|
|
|
|
+determine which path will be taken, then it can be legalized.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 20-multi-func-ret-const-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/84b093c7cf9e3932c5f0d9691533bafe
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer1;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer2;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> foo(int l) {
|
|
|
|
+ if (l == 0) {
|
|
|
|
+ return gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ return gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ gRWSBuffer1[i] = foo(0)[i];
|
|
|
|
+ gRWSBuffer2[i] = foo(1)[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+For each call to ``foo``, the compiler is able to determine which value
|
|
|
|
+will be returned. In this case, the code can be legalized.
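+
+As a hand-written sketch (not actual compiler output), the result is
+equivalent to the following source, where each call has been replaced by
+the resource it returns:
+
+.. code-block:: hlsl
+
+ // Sketch only: hand-written equivalent of
+ // 20-multi-func-ret-const-ok.hlsl after inlining foo.
+
+ struct S {
+   float4 f;
+ };
+
+ int i;
+
+ StructuredBuffer<S> gSBuffer1;
+ StructuredBuffer<S> gSBuffer2;
+ RWStructuredBuffer<S> gRWSBuffer1;
+ RWStructuredBuffer<S> gRWSBuffer2;
+
+ void main() {
+   gRWSBuffer1[i] = gSBuffer1[i];  // foo(0): l == 0 returns gSBuffer1.
+   gRWSBuffer2[i] = gSBuffer2[i];  // foo(1): l != 0 returns gSBuffer2.
+ }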
|
|
|
|
+
|
|
|
|
+Combining elements
|
|
|
|
+~~~~~~~~~~~~~~~~~~
|
|
|
|
+
|
|
|
|
+Individually, these examples are simple; however, these elements can be
|
|
|
|
+combined in arbitrary ways. As one last example, consider this HLSL
|
|
|
|
+source code.
|
|
|
|
+
|
|
|
|
+.. code-block:: hlsl
|
|
|
|
+
|
|
|
|
+ // 21-combined-ok.hlsl
|
|
|
|
+ // http://shader-playground.timjones.io/9f00d2d359da0731cdf8d0b68520e2c4
|
|
|
|
+
|
|
|
|
+ struct S {
|
|
|
|
+ float4 f;
|
|
|
|
+ };
|
|
|
|
+
|
|
|
|
+ int i;
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> gSBuffer1;
|
|
|
|
+ StructuredBuffer<S> gSBuffer2;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer1;
|
|
|
|
+ RWStructuredBuffer<S> gRWSBuffer2;
|
|
|
|
+
|
|
|
|
+ #define constant 0
|
|
|
|
+
|
|
|
|
+ StructuredBuffer<S> bar() {
|
|
|
|
+ if (constant > 2) {
|
|
|
|
+ return gSBuffer1;
|
|
|
|
+ } else {
|
|
|
|
+ return gSBuffer2;
|
|
|
|
+ }
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void foo(RWStructuredBuffer<S> pRWSBuffer) {
|
|
|
|
+ StructuredBuffer<S> lSBuffer = bar();
|
|
|
|
+ pRWSBuffer[i] = lSBuffer[i];
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+ void main() {
|
|
|
|
+ foo(gRWSBuffer1);
|
|
|
|
+ foo(gRWSBuffer2);
|
|
|
|
+ }
|
|
|
|
+
|
|
|
|
+The compiler will do all of the transformations mentioned earlier
|
|
|
|
+to identify a single resource for each load and store from a resource.
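+
+Working through the example by hand: ``constant > 2`` is always false,
+so ``bar`` always returns ``gSBuffer2``, and each call to ``foo`` is
+inlined with its own argument. The following sketch of the equivalent
+source is an illustration, not actual compiler output:
+
+.. code-block:: hlsl
+
+ // Sketch only: hand-written equivalent of 21-combined-ok.hlsl.
+
+ struct S {
+   float4 f;
+ };
+
+ int i;
+
+ StructuredBuffer<S> gSBuffer1;
+ StructuredBuffer<S> gSBuffer2;
+ RWStructuredBuffer<S> gRWSBuffer1;
+ RWStructuredBuffer<S> gRWSBuffer2;
+
+ void main() {
+   gRWSBuffer1[i] = gSBuffer2[i];  // From foo(gRWSBuffer1).
+   gRWSBuffer2[i] = gSBuffer2[i];  // From foo(gRWSBuffer2).
+ }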
|
|
|
|
+
|
|
|
|
+Conclusion
|
|
|
|
+==========
|
|
|
|
+
|
|
|
|
+It is impossible to enumerate all of the possible code sequences that
|
|
|
|
+work or do not work, but hopefully this will give a guide as to what is
|
|
|
|
+possible or not. The general rule of thumb is that there must be a
|
|
|
|
+straightforward way to transform the code so that there are no copies of
|
|
|
|
+global resources.
|