- DXC Cookbook: HLSL Coding Patterns for SPIR-V
- =============================================
- Author: Steven Perron
- Date: Oct 22, 2018
- Introduction
- ============
- This document provides a set of examples that demonstrate what will and
- will not be accepted by the DXC compiler when generating SPIR-V. The
- difficulty in defining what is acceptable is that it cannot be specified
- by a grammar. The entire program must be taken into consideration.
- Hopefully this will be useful.
- We are interested in how global resources are used. For a SPIR-V shader
- to be valid, accesses to global resources like structured buffers and
- images must be done directly on the global resources. They cannot be
- copied or have their address returned from functions. However, in HLSL,
- it is possible to copy a global resource or to pass it by reference to a
- function. Since this can be arbitrarily complex, DXC can generate valid
- SPIR-V only if the compiler is able to remove all of these copies.
- The transformations that are used to remove the copies will be the same
- for both structured buffers and images, so we have chosen to focus on
- structured buffers. The process of transforming the code in this way is
- called *legalization.*
- Support evolves over time as the optimizations in SPIRV-Tools are
- improved. At GDC 2018, Greg Fischer from LunarG
- `presented <http://schedule.gdconf.com/session/hlsl-in-vulkan-there-and-back-again-presented-by-khronos-group/856616>`__
- earlier results in this space. The DXC, Glslang, and SPIRV-Tools
- maintainers work together to handle new HLSL code patterns. This
- document represents the state of the DXC compiler in October 2018.
- Glslang does legalization as well. However, what it is able to legalize
- is different from DXC because of features it chooses to support, and the
- optimizations from SPIRV-Tools it chooses to run. For example, Glslang
- does not support structured buffer aliasing yet, so many of these
- examples will not work with Glslang.
- All of the examples are available in the DXC repository:
- https://github.com/Microsoft/DirectXShaderCompiler/tree/master/tools/clang/test/CodeGenSPIRV/legal-examples
- To open an example in Tim Jones' Shader Playground, follow the URL in
- the comment at the top of that example.
- Examples for structured buffers
- ===============================
- Desired code
- ------------
- .. code-block:: hlsl
- // 0-copy-sbuf-ok.hlsl
- // http://shader-playground.timjones.io/e6af2bdce0c61ed07d3a826aa8a95d45
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- void main() {
- gRWSBuffer[i] = gSBuffer[i];
- }
- This example shows code that directly translates to valid SPIR-V. In
- this case, we have two structured buffers. When one of their elements is
- accessed, it is done by naming the resource from which to get the
- element.
- Note that it is fine to copy an element of the structured buffer.
- Single copy to a local
- ----------------------
- Cases that can be easily legalized are those where there is exactly one
- assignment to the local copy of the structured buffer. In this context,
- a local is either a global static or a function scope symbol, that is,
- something that can be accessed by only a single instance of the shader.
- When you have a single copy to a local, it is obvious which global is
- actually being used. This allows the compiler to replace a reference to
- the local symbol with the global resource.
- Initialization of a static
- ~~~~~~~~~~~~~~~~~~~~~~~~~~
- .. code-block:: hlsl
- // 1-copy-global-static-ok.hlsl
- // http://shader-playground.timjones.io/815543dc91a4e6855a8d0c6a345d4a5a
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- static StructuredBuffer<S> sSBuffer = gSBuffer;
- void main() {
- gRWSBuffer[i] = sSBuffer[i];
- }
- This example shows an implicitly addressed structured buffer
- ``gSBuffer`` assigned to a static ``sSBuffer``. This copy is treated
- like a shallow copy. This is implemented by making ``sSBuffer`` a
- pointer to ``gSBuffer``.
- This example can be legalized because the compiler is able to see that
- ``sSBuffer`` points to ``gSBuffer``, which does not move, so uses of
- ``sSBuffer`` can be replaced by ``gSBuffer``.
- .. code-block:: hlsl
- // 2-write-global-static-ok.hlsl
- // http://shader-playground.timjones.io/1c65c467e395383945d219a60edbe10c
- struct S {
- float4 f;
- };
- int i;
- RWStructuredBuffer<S> gRWSBuffer;
- static RWStructuredBuffer<S> sRWSBuffer = gRWSBuffer;
- void main() {
- sRWSBuffer[i].f = 0.0;
- }
- This example is similar to the previous example, except in this case the
- shallow copy becomes important. ``sRWSBuffer`` is treated like a pointer
- to ``gRWSBuffer``. As before, the references to ``sRWSBuffer`` can be
- replaced by ``gRWSBuffer``. This means that the write that occurs will
- be visible outside of the shader.
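- As an illustration only, the effect of legalization on this example can
- be pictured as the hand-written shader below, in which every use of
- ``sRWSBuffer`` has been replaced by ``gRWSBuffer``. This is a sketch of
- the idea, not actual compiler output.

```hlsl
// Hypothetical hand-written equivalent of 2-write-global-static-ok.hlsl
// after legalization. Illustrative only; not produced by DXC.

struct S {
  float4 f;
};

int i;

RWStructuredBuffer<S> gRWSBuffer;

void main() {
  // The static alias is gone: the write names the global resource directly.
  gRWSBuffer[i].f = 0.0;
}
```

- Written this way, the store has the same form as the direct access in
- the first example, which is why it is visible outside of the shader.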
- Copy to function scope
- ~~~~~~~~~~~~~~~~~~~~~~
- .. code-block:: hlsl
- // 3-copy-local-struct-ok.hlsl
- // http://shader-playground.timjones.io/77dd20774e4943044c2f1b630c539f07
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- void main() {
- CombinedBuffers cb;
- cb.SBuffer = gSBuffer;
- cb.RWSBuffer = gRWSBuffer;
- cb.RWSBuffer[i] = cb.SBuffer[i];
- }
- It is also possible to copy a structured buffer to a function scope
- symbol. This is similar to a copy to a static scope symbol. The local
- copy is really a pointer to the original. This example demonstrates that
- DXC can legalize the copy even if it is a copy to part of a structure.
- There are no specific restrictions on the structure. The structured
- buffers can be anywhere in the structure, and there can be any number of
- members. Structured buffers can be in nested structures of any depth.
- The following is a more complicated example.
- .. code-block:: hlsl
- // 4-copy-local-nested-struct-ok.hlsl
- // http://shader-playground.timjones.io/14f59ff2a28c0a0180daf6ce4393cf6b
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- struct S2 {
- CombinedBuffers cb;
- };
- struct S1 {
- S2 s2;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- void main() {
- S1 s1;
- s1.s2.cb.SBuffer = gSBuffer;
- s1.s2.cb.RWSBuffer = gRWSBuffer;
- s1.s2.cb.RWSBuffer[i] = s1.s2.cb.SBuffer[i];
- }
- Function parameters
- ~~~~~~~~~~~~~~~~~~~
- .. code-block:: hlsl
- // 5-func-param-sbuf-ok.hlsl
- // http://shader-playground.timjones.io/aeb06f527c5390d82d63bdb4eafc9ae7
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- void foo(StructuredBuffer<S> pSBuffer) {
- gRWSBuffer[i] = pSBuffer[i];
- }
- void main() {
- foo(gSBuffer);
- }
- It is possible to pass a structured buffer as a parameter to a function.
- As with the copies in the previous section, it is a pointer to the
- structured buffer that is actually being passed to ``foo``. This is the
- same way that arrays work in C/C++.
- .. code-block:: hlsl
- // 6-func-param-rwsbuf-ok.hlsl
- // http://shader-playground.timjones.io/f4e0194ce78118c0a709d85080ccea93
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- void foo(RWStructuredBuffer<S> pRWSBuffer) {
- pRWSBuffer[i] = gSBuffer[i];
- }
- void main() {
- foo(gRWSBuffer);
- }
- The same is true for RW structured buffers. So in this case, the write
- to ``pRWSBuffer`` is changing ``gRWSBuffer``. This means that the write
- to ``pRWSBuffer`` will be visible outside of the function, and outside
- of the shader.
- Return values
- ~~~~~~~~~~~~~
- The next two examples show that structured buffers can be a function's
- return value. As before, the return value of ``foo`` is really a pointer
- to the global resource.
- .. code-block:: hlsl
- // 7-func-ret-tmp-var-ok.hlsl
- // http://shader-playground.timjones.io/d6b706423f02dad58fbb01841282c6a1
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- RWStructuredBuffer<S> foo() {
- return gRWSBuffer;
- }
- void main() {
- RWStructuredBuffer<S> lRWSBuffer = foo();
- lRWSBuffer[i] = gSBuffer[i];
- }
- In this case, the compiler will replace ``lRWSBuffer`` by
- ``gRWSBuffer``.
- .. code-block:: hlsl
- // 8-func-ret-direct-ok.hlsl
- // http://shader-playground.timjones.io/6edbbc1aa6c6b6533c5a728135f87fb9
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer;
- StructuredBuffer<S> foo() {
- return gSBuffer;
- }
- void main() {
- gRWSBuffer[i] = foo()[i];
- }
- This example is similar to the previous, but shows that you do not have
- to use an explicit temporary value.
- Conditional control flow
- ------------------------
- The examples so far do not have any conditional control flow. This
- makes it obvious which resources are being used. The introduction of
- conditional control flow makes the job of the compiler much harder, and
- in some cases impossible. Remember that the compiler is trying to
- determine at compile time which resource will be used at run time. In
- this section, we will look at how control flow affects the compiler's
- ability to do this. The bottom line is that the compiler has to be able
- to turn all of the conditional control flow that affects which resources
- are used into straight line code.
- Inputs in if-statement
- ~~~~~~~~~~~~~~~~~~~~~~
- The first example is one where the compiler cannot determine which
- resource is actually being accessed.
- .. code-block:: hlsl
- // 9-if-stmt-select-fail.hlsl
- // http://shader-playground.timjones.io/2896e95627fd8a6689ca96c81a5c7c68
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- if (constant > i) { // Condition can't be computed at compile time.
- lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan.
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[i] = lSBuffer[i];
- }
- In this example, ``lSBuffer`` could be either ``gSBuffer1`` or
- ``gSBuffer2``. It depends on the value of ``i``, which is a parameter to
- the shader and cannot be known at compile time. At this time, the
- compiler is not able to convert this code into something that drivers
- will accept.
- If your code follows this pattern, I would suggest rewriting it as
- follows:
- .. code-block:: hlsl
- // 10-if-stmt-select-ok.hlsl
- // http://shader-playground.timjones.io/5063d8a0a7ad1f9d0839cd34a6d94dd2
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- if (constant > i) {
- lSBuffer = gSBuffer1;
- gRWSBuffer[i] = lSBuffer[i];
- } else {
- lSBuffer = gSBuffer2;
- gRWSBuffer[i] = lSBuffer[i];
- }
- }
- Notice that this involves replicating code. If the code that follows the
- if-statement is long, you could consider moving it to a function, and
- having two calls to that function.
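- A minimal sketch of that suggestion, assuming the shared code fits in a
- helper that takes the buffer as a parameter. The name ``copyElement`` is
- hypothetical and not part of the original examples. The sketch relies on
- the function parameter pattern shown earlier and has not been checked
- against a particular DXC release.

```hlsl
// Hypothetical variant of 10-if-stmt-select-ok.hlsl.
// copyElement is an illustrative helper name.

struct S {
  float4 f;
};

int i;

StructuredBuffer<S> gSBuffer1;
StructuredBuffer<S> gSBuffer2;

RWStructuredBuffer<S> gRWSBuffer;

#define constant 0

// Holds the code that would otherwise follow the if-statement.
void copyElement(StructuredBuffer<S> pSBuffer) {
  gRWSBuffer[i] = pSBuffer[i];
}

void main() {
  if (constant > i) {
    copyElement(gSBuffer1);  // This call site only ever passes gSBuffer1.
  } else {
    copyElement(gSBuffer2);  // This call site only ever passes gSBuffer2.
  }
}
```

- Each call site passes exactly one global, so the choice of resource no
- longer has to survive past the end of the if-statement.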
- If-statements with constants
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Not all control flow is a problem. There are situations where the
- compiler is able to determine that a condition is always true or always
- false. For example, in the following code, the compiler looks at
- ``0 > 2`` and knows that it is always false.
- .. code-block:: hlsl
- // 11-if-stmt-const-ok.hlsl
- // http://shader-playground.timjones.io/7ef5b89b3ec3d56c22e1bca45b40516a
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- if (constant > 2) {
- lSBuffer = gSBuffer1;
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[i] = lSBuffer[i];
- }
- The compiler will turn this code into:
- .. code-block:: hlsl
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- gRWSBuffer[i] = gSBuffer2[i];
- }
- The two previous examples show that handling control flow depends on
- what the compiler can do. This depends on the amount of optimization
- that is done, and which optimizations are done. In general, when you are
- writing code that will select a resource, keep the conditions as simple
- as possible to make it as easy as possible for the compiler to determine
- which path is taken.
- Switch statements
- ~~~~~~~~~~~~~~~~~
- Switch statements are similar to if-statements. If the selector is a
- constant, then the compiler will be able to propagate the copies.
- .. code-block:: hlsl
- // 12-switch-stmt-select-fail.hlsl
- // http://shader-playground.timjones.io/b079f878daeba5d77842725b90a476ca
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- switch(i) { // Compiler can't determine which case will run.
- case 0:
- lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan.
- break;
- default:
- lSBuffer = gSBuffer2;
- break;
- }
- gRWSBuffer[i] = lSBuffer[i];
- }
- The compiler is not able to remove the copies in this example because it
- does not know the value of ``i`` at compile time.
- .. code-block:: hlsl
- // 13-switch-stmt-const-ok.hlsl
- // http://shader-playground.timjones.io/a46dd1f1a84eba38c047439741ec08ab
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- const static int constant = 0;
- void main() {
- StructuredBuffer<S> lSBuffer;
- switch(constant) {
- case 0:
- lSBuffer = gSBuffer1;
- break;
- default:
- lSBuffer = gSBuffer2;
- break;
- }
- gRWSBuffer[i] = lSBuffer[i];
- }
- However, if the selector is turned into a constant, the compiler can
- replace uses of ``lSBuffer`` by ``gSBuffer1``.
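- If the selector has to remain a run-time value, one option is the same
- rewrite that was suggested for if-statements: move the access into each
- case so that no local copy is needed. The following is a sketch of that
- idea applied to the failing switch example, not one of the original
- test cases.

```hlsl
// Hypothetical rewrite of 12-switch-stmt-select-fail.hlsl.
// Illustrative only; not part of the original examples.

struct S {
  float4 f;
};

int i;

StructuredBuffer<S> gSBuffer1;
StructuredBuffer<S> gSBuffer2;

RWStructuredBuffer<S> gRWSBuffer;

void main() {
  switch (i) {  // The selector is still unknown at compile time.
    case 0:
      gRWSBuffer[i] = gSBuffer1[i];  // Names gSBuffer1 directly.
      break;
    default:
      gRWSBuffer[i] = gSBuffer2[i];  // Names gSBuffer2 directly.
      break;
  }
}
```

- Every load and store names a global resource directly, as in the first
- example, so there is no copy for the compiler to remove.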
- Loop Induction Variables in conditions
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Besides inputs, another type of variable that hinders the compiler is
- the loop induction variable. These are variables that change value on
- each iteration of the loop. Consider this example.
- .. code-block:: hlsl
- // 14-loop-var-fail.hlsl
- // http://shader-playground.timjones.io/8df364770e3f425e6321e71f817bcd1a
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- for( int j = 0; j < 2; j++ ) {
- if (constant > j) { // Condition is different for different iterations
- lSBuffer = gSBuffer1; // Will produce invalid SPIR-V for Vulkan.
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[j] = lSBuffer[j];
- }
- }
- In this example, ``j`` is an induction variable. It takes on the values
- ``0`` and ``1``. The information is there to be able to determine which
- path is taken in each iteration, but the compiler does not figure this
- out by default.
- If you want the compiler to be able to legalize this code, then you will
- have to direct the compiler to unroll this loop using the unroll
- attribute. The following example can be legalized by the compiler:
- .. code-block:: hlsl
- // 15-loop-var-unroll-ok.hlsl
- // http://shader-playground.timjones.io/3d0f6f830fc4a5102714e19c748e81c7
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- [unroll]
- for( int j = 0; j < 2; j++ ) {
- if (constant > j) {
- lSBuffer = gSBuffer1;
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[j] = lSBuffer[j];
- }
- }
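- To picture what full unrolling achieves here, the loop can be expanded
- by hand. With ``constant`` defined as ``0``, both ``0 > 0`` and
- ``0 > 1`` are false, so each iteration selects ``gSBuffer2``. The
- shader below is a hand-written sketch of that result, not actual
- compiler output.

```hlsl
// Hypothetical hand-unrolled equivalent of 15-loop-var-unroll-ok.hlsl.
// Illustrative only; not produced by DXC.

struct S {
  float4 f;
};

StructuredBuffer<S> gSBuffer1;
StructuredBuffer<S> gSBuffer2;

RWStructuredBuffer<S> gRWSBuffer;

void main() {
  // Iteration j == 0: (0 > 0) is false, so gSBuffer2 is selected.
  gRWSBuffer[0] = gSBuffer2[0];
  // Iteration j == 1: (0 > 1) is false, so gSBuffer2 is selected.
  gRWSBuffer[1] = gSBuffer2[1];
}
```

- Once the loop is gone, each condition is a comparison of constants, and
- the situation is the same as in the if-statement with constants above.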
- Variable iteration counts
- ~~~~~~~~~~~~~~~~~~~~~~~~~
- Adding the unroll attribute to loops does not guarantee that the
- compiler is able to legalize the code. The compiler has to be able to
- fully unroll the loop. That means the compiler will have to create a
- copy of the body of the loop for each iteration so that there is no loop
- anymore. That can only be done if the number of iterations can be known
- at compile time.
- This means that the compiler must be able to determine the initial
- value, the final value, and the step for the induction variable, ``j``
- in the example. None of ``foo1``, ``foo2``, or ``foo3`` can be legalized
- because the number of iterations cannot be known at compile time.
- .. code-block:: hlsl
- // 16-loop-var-range-fail.hlsl
- // http://shader-playground.timjones.io/376f5f985c3ceceea004ab58edb336f2
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- int i;
- #define constant 0
- void foo1() {
- StructuredBuffer<S> lSBuffer;
- [unroll]
- for( int j = i; j < 2; j++ ) { // Compiler can't determine the initial value
- if (constant > j) {
- lSBuffer = gSBuffer1;
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[j] = lSBuffer[j];
- }
- }
- void foo2() {
- StructuredBuffer<S> lSBuffer;
- [unroll]
- for( int j = 0; j < i; j++ ) { // Compiler can't determine the end value
- if (constant > j) {
- lSBuffer = gSBuffer1;
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[j] = lSBuffer[j];
- }
- }
- void foo3() {
- StructuredBuffer<S> lSBuffer;
- [unroll]
- for( int j = 0; j < 2; j += i ) { // Compiler can't determine the step
- if (constant > j) {
- lSBuffer = gSBuffer1;
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[j] = lSBuffer[j];
- }
- }
- void main() {
- foo1(); foo2(); foo3();
- }
- As before, the compiler will try to simplify expressions to determine
- their value at compile time, but it may not always be successful. We
- would recommend that you keep the expressions for the loop bounds as
- simple as possible to increase the chances the compiler can figure it
- out.
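- When a loop bound genuinely depends on a shader input, unrolling is not
- an option. A sketch of an alternative, following the earlier suggestion
- for if-statements, is to access the globals directly in each branch so
- that there is no local copy to remove. This is a hypothetical rewrite of
- ``foo2``, not one of the original test cases.

```hlsl
// Hypothetical rewrite of foo2 from 16-loop-var-range-fail.hlsl.
// Illustrative only; not part of the original examples.

struct S {
  float4 f;
};

StructuredBuffer<S> gSBuffer1;
StructuredBuffer<S> gSBuffer2;

RWStructuredBuffer<S> gRWSBuffer;

int i;

#define constant 0

void main() {
  // The end value still depends on the input i, so the loop stays a loop.
  for (int j = 0; j < i; j++) {
    if (constant > j) {
      gRWSBuffer[j] = gSBuffer1[j];  // Direct access, no local copy.
    } else {
      gRWSBuffer[j] = gSBuffer2[j];  // Direct access, no local copy.
    }
  }
}
```

- The cost is the same as before: the code that uses the buffer is
- replicated in each branch.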
- Other restrictions on unrolling
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Not being able to determine the iteration count at compile time is a
- fundamental problem. No matter how good the compiler is, it will never
- be able to fully unroll the loop. However, there are other cases that
- cannot be handled because of internal details of the algorithms in the
- SPIRV-Tools optimizer. The most notable one is that the induction
- variable must be an integral type.
- .. code-block:: hlsl
- // 17-loop-var-float-fail.hlsl
- // http://shader-playground.timjones.io/d5d2598699378688684a4a074553dddf
- struct S {
- float4 f;
- };
- struct CombinedBuffers {
- StructuredBuffer<S> SBuffer;
- RWStructuredBuffer<S> RWSBuffer;
- };
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer;
- #define constant 0
- void main() {
- StructuredBuffer<S> lSBuffer;
- [unroll]
- for( float j = 0; j < 2; j++ ) { // Can't infer floating point induction values
- if (constant > j) {
- lSBuffer = gSBuffer1;
- } else {
- lSBuffer = gSBuffer2;
- }
- gRWSBuffer[j] = lSBuffer[j];
- }
- }
- This example cannot be legalized because ``j`` is a ``float``.
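- One possible way around this restriction, sketched below, is to keep an
- integer induction variable and derive the floating point value from it
- inside the loop body. The variable ``fj`` and the multiplication are
- assumptions added for illustration. The loop has the same shape as the
- unrolled example above, so it is expected to legalize for the same
- reason, but this variant is not one of the original test cases.

```hlsl
// Hypothetical variant of 17-loop-var-float-fail.hlsl that uses an
// integer induction variable. Illustrative only.

struct S {
  float4 f;
};

StructuredBuffer<S> gSBuffer1;
StructuredBuffer<S> gSBuffer2;

RWStructuredBuffer<S> gRWSBuffer;

#define constant 0

void main() {
  StructuredBuffer<S> lSBuffer;

  [unroll]
  for (int j = 0; j < 2; j++) {  // Integer induction variable.
    float fj = (float)j;         // Float value derived from j.
    if (constant > j) {
      lSBuffer = gSBuffer1;
    } else {
      lSBuffer = gSBuffer2;
    }
    // fj is used only in arithmetic; j controls the loop and the indexing.
    gRWSBuffer[j].f = lSBuffer[j].f * fj;
  }
}
```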
- Other interesting cases
- -----------------------
- Multiple calls to a function
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- .. code-block:: hlsl
- // 18-multi-func-call-ok.hlsl
- // http://shader-playground.timjones.io/e7b3ac1262a291c92902fd3f1fd3343c
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer1;
- RWStructuredBuffer<S> gRWSBuffer2;
- void foo(RWStructuredBuffer<S> pRWSBuffer) {
- pRWSBuffer[i] = gSBuffer[i];
- }
- void main() {
- foo(gRWSBuffer1);
- foo(gRWSBuffer2);
- }
- In this example, we see the same function is called twice. Each call has
- a different parameter. This can look like a problem because
- ``pRWSBuffer`` could be either ``gRWSBuffer1`` or ``gRWSBuffer2``.
- However, the compiler is able to work around this by creating a separate
- copy of ``foo`` for each call site. In fact, these copies will be placed
- inline.
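- The effect can be pictured as the hand-written shader below, in which
- each call to ``foo`` has been replaced by its body with the argument
- substituted. This is a sketch of the idea, not actual compiler output.

```hlsl
// Hypothetical hand-inlined equivalent of 18-multi-func-call-ok.hlsl.
// Illustrative only; not produced by DXC.

struct S {
  float4 f;
};

int i;

StructuredBuffer<S> gSBuffer;

RWStructuredBuffer<S> gRWSBuffer1;
RWStructuredBuffer<S> gRWSBuffer2;

void main() {
  gRWSBuffer1[i] = gSBuffer[i];  // From foo(gRWSBuffer1).
  gRWSBuffer2[i] = gSBuffer[i];  // From foo(gRWSBuffer2).
}
```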
- Multiple returns
- ~~~~~~~~~~~~~~~~
- As we have already seen, a return from a function is a copy. At this
- point, it would be fair to ask what happens if there are multiple
- returns.
- .. code-block:: hlsl
- // 19-multi-func-ret-fail.hlsl
- // http://shader-playground.timjones.io/922facb688a5ba09b153d64cf1fc4557
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer;
- RWStructuredBuffer<S> gRWSBuffer1;
- RWStructuredBuffer<S> gRWSBuffer2;
- RWStructuredBuffer<S> foo(int l) {
- if (l == 0) { // Compiler does not know which branch will be taken:
- // Branch taken depends on input i.
- return gRWSBuffer1;
- } else {
- return gRWSBuffer2;
- }
- }
- void main() {
- RWStructuredBuffer<S> lRWSBuffer = foo(i);
- lRWSBuffer[i] = gSBuffer[i];
- }
- The compiler is not able to legalize this example because it does not
- know which value will be returned. However, if the compiler is able to
- determine which path will be taken, then it can be legalized.
- .. code-block:: hlsl
- // 20-multi-func-ret-const-ok.hlsl
- // http://shader-playground.timjones.io/84b093c7cf9e3932c5f0d9691533bafe
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer1;
- RWStructuredBuffer<S> gRWSBuffer2;
- StructuredBuffer<S> foo(int l) {
- if (l == 0) {
- return gSBuffer1;
- } else {
- return gSBuffer2;
- }
- }
- void main() {
- gRWSBuffer1[i] = foo(0)[i];
- gRWSBuffer2[i] = foo(1)[i];
- }
- For each call to ``foo``, the compiler is able to determine which value
- will be returned. In this case, the code can be legalized.
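- When the argument has to stay a run-time value, as in the failing
- example, one option is to stop returning the buffer and instead do the
- access inside the function, in each branch. The following is a sketch of
- that idea; the changed signature of ``foo`` is an assumption made for
- illustration.

```hlsl
// Hypothetical rewrite of 19-multi-func-ret-fail.hlsl in which foo
// performs the store instead of returning a buffer. Illustrative only.

struct S {
  float4 f;
};

int i;

StructuredBuffer<S> gSBuffer;

RWStructuredBuffer<S> gRWSBuffer1;
RWStructuredBuffer<S> gRWSBuffer2;

void foo(int l) {
  if (l == 0) {  // Still depends on the input i.
    gRWSBuffer1[i] = gSBuffer[i];  // Names gRWSBuffer1 directly.
  } else {
    gRWSBuffer2[i] = gSBuffer[i];  // Names gRWSBuffer2 directly.
  }
}

void main() {
  foo(i);
}
```

- No buffer is copied or returned here, so the branch on ``l`` does not
- affect which resource a given load or store refers to.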
- Combining elements
- ~~~~~~~~~~~~~~~~~~
- Individually, these examples are simple; however, these elements can be
- combined in arbitrary ways. As one last example, consider this HLSL
- source code.
- .. code-block:: hlsl
- // 21-combined-ok.hlsl
- // http://shader-playground.timjones.io/9f00d2d359da0731cdf8d0b68520e2c4
- struct S {
- float4 f;
- };
- int i;
- StructuredBuffer<S> gSBuffer1;
- StructuredBuffer<S> gSBuffer2;
- RWStructuredBuffer<S> gRWSBuffer1;
- RWStructuredBuffer<S> gRWSBuffer2;
- #define constant 0
- StructuredBuffer<S> bar() {
- if (constant > 2) {
- return gSBuffer1;
- } else {
- return gSBuffer2;
- }
- }
- void foo(RWStructuredBuffer<S> pRWSBuffer) {
- StructuredBuffer<S> lSBuffer = bar();
- pRWSBuffer[i] = lSBuffer[i];
- }
- void main() {
- foo(gRWSBuffer1);
- foo(gRWSBuffer2);
- }
- The compiler will do all of the transformations that were mentioned earlier
- to identify a single resource for each load and store from a resource.
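- Worked through by hand, ``bar`` always returns ``gSBuffer2`` because
- ``0 > 2`` is false, and each call to ``foo`` passes one known global.
- The shader below is a hand-written sketch of the result, not actual
- compiler output.

```hlsl
// Hypothetical hand-simplified equivalent of 21-combined-ok.hlsl.
// Illustrative only; not produced by DXC.

struct S {
  float4 f;
};

int i;

StructuredBuffer<S> gSBuffer1;
StructuredBuffer<S> gSBuffer2;

RWStructuredBuffer<S> gRWSBuffer1;
RWStructuredBuffer<S> gRWSBuffer2;

void main() {
  gRWSBuffer1[i] = gSBuffer2[i];  // From foo(gRWSBuffer1); bar() is gSBuffer2.
  gRWSBuffer2[i] = gSBuffer2[i];  // From foo(gRWSBuffer2); bar() is gSBuffer2.
}
```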
- Conclusion
- ==========
- It is impossible to enumerate all of the possible code sequences that
- work or do not work, but hopefully this will give a guide as to what is
- possible or not. The general rule of thumb is that there must be a
- straightforward way to transform the code so that there are no copies of
- global resources.