DXC Cookbook: HLSL Coding Patterns for SPIR-V
=============================================

:Author: Steven Perron
:Date: Oct 22, 2018

Introduction
============

This document provides a set of examples that demonstrate what will and
will not be accepted by the DXC compiler when generating SPIR-V. The
difficulty in defining what is acceptable is that it cannot be specified
by a grammar. The entire program must be taken into consideration.
Hopefully this will be useful.

We are interested in how global resources are used. For a SPIR-V shader
to be valid, accesses to global resources like structured buffers and
images must be done directly on the global resources. They cannot be
copied or have their address returned from functions. However, in HLSL,
it is possible to copy a global resource or to pass it by reference to a
function. Since this can be arbitrarily complex, DXC can generate valid
SPIR-V only if the compiler is able to remove all of these copies.

The transformations that are used to remove the copies are the same
for both structured buffers and images, so we have chosen to focus on
structured buffers. The process of transforming the code in this way is
called *legalization*.

Support evolves over time as the optimizations in SPIRV-Tools are
improved. At GDC 2018, Greg Fischer from LunarG
`presented <http://schedule.gdconf.com/session/hlsl-in-vulkan-there-and-back-again-presented-by-khronos-group/856616>`__
earlier results in this space. The DXC, Glslang, and SPIRV-Tools
maintainers work together to handle new HLSL code patterns. This
document represents the state of the DXC compiler in October 2018.

Glslang does legalization as well. However, what it is able to legalize
is different from DXC because of the features it chooses to support and
the optimizations from SPIRV-Tools it chooses to run. For example, Glslang
does not support structured buffer aliasing yet, so many of these
examples will not work with Glslang.

All of the examples are available in the DXC repository, at
https://github.com/Microsoft/DirectXShaderCompiler/tree/master/tools/clang/test/CodeGenSPIRV/legal-examples.
To open an example in Tim Jones' Shader Playground, follow the URL in
the comments at the top of that example.

Examples for structured buffers
===============================

Desired code
------------

.. code-block:: hlsl

  // 0-copy-sbuf-ok.hlsl
  // http://shader-playground.timjones.io/e6af2bdce0c61ed07d3a826aa8a95d45
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    gRWSBuffer[i] = gSBuffer[i];
  }

This example shows code that directly translates to valid SPIR-V. In
this case, we have two structured buffers. When one of their elements is
accessed, it is done by naming the resource from which to get the
element.

Note that it is fine to copy an element of the structured buffer.
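
To illustrate the difference between copying an element and copying the
buffer itself, the following minimal sketch reads one element into an
ordinary local value, modifies it, and writes it back. The local name
``element`` and the scaling by ``2.0`` are made up for this illustration;
the declarations are the same as in the example above.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    // 'element' is a plain value of type S, not a copy of the resource.
    S element = gSBuffer[i];
    element.f *= 2.0;
    // Both buffers are still accessed by naming the global resource.
    gRWSBuffer[i] = element;
  }
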

Single copy to a local
----------------------

Cases that can be easily legalized are those where there is exactly one
assignment to the local copy of the structured buffer. In this context,
a local is either a global static or a function scope symbol, that is,
something that can be accessed by only a single instance of the shader.
When you have a single copy to a local, it is obvious which global is
actually being used. This allows the compiler to replace a reference to
the local symbol with the global resource.

Initialization of a static
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: hlsl

  // 1-copy-global-static-ok.hlsl
  // http://shader-playground.timjones.io/815543dc91a4e6855a8d0c6a345d4a5a
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  static StructuredBuffer<S> sSBuffer = gSBuffer;

  void main() {
    gRWSBuffer[i] = sSBuffer[i];
  }

This example shows an implicitly addressed structured buffer
``gSBuffer`` assigned to a static ``sSBuffer``. This copy is treated
like a shallow copy. This is implemented by making ``sSBuffer`` a
pointer to ``gSBuffer``.

This example can be legalized because the compiler is able to see that
``sSBuffer`` points to ``gSBuffer``, which does not move, so uses of
``sSBuffer`` can be replaced by ``gSBuffer``.

.. code-block:: hlsl

  // 2-write-global-static-ok.hlsl
  // http://shader-playground.timjones.io/1c65c467e395383945d219a60edbe10c
  struct S {
    float4 f;
  };

  int i;

  RWStructuredBuffer<S> gRWSBuffer;

  static RWStructuredBuffer<S> sRWSBuffer = gRWSBuffer;

  void main() {
    sRWSBuffer[i].f = 0.0;
  }

This example is similar to the previous example, except in this case the
shallow copy becomes important. ``sRWSBuffer`` is treated like a pointer
to ``gRWSBuffer``. As before, the references to ``sRWSBuffer`` can be
replaced by ``gRWSBuffer``. This means that the write that occurs will
be visible outside of the shader.
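
Conceptually, once the reference to ``sRWSBuffer`` has been replaced, the
shader behaves as if it had been written as in the sketch below. This is
only an HLSL-level illustration of the effect of legalization; the
compiler's real output is SPIR-V, and no static alias needs to appear in
the source for this form to be valid.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    // The write goes directly to the global resource.
    gRWSBuffer[i].f = 0.0;
  }
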

Copy to function scope
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: hlsl

  // 3-copy-local-struct-ok.hlsl
  // http://shader-playground.timjones.io/77dd20774e4943044c2f1b630c539f07
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    CombinedBuffers cb;
    cb.SBuffer = gSBuffer;
    cb.RWSBuffer = gRWSBuffer;
    cb.RWSBuffer[i] = cb.SBuffer[i];
  }

It is also possible to copy a structured buffer to a function scope
symbol. This is similar to a copy to a static scope symbol. The local
copy is really a pointer to the original. This example demonstrates that
DXC can legalize the copy even if it is a copy to part of a structure.

There are no specific restrictions on the structure. The structured
buffers can be anywhere in the structure, and there can be any number of
members. Structured buffers can be in nested structures of any depth.
The following is a more complicated example.

.. code-block:: hlsl

  // 4-copy-local-nested-struct-ok.hlsl
  // http://shader-playground.timjones.io/14f59ff2a28c0a0180daf6ce4393cf6b
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  struct S2 {
    CombinedBuffers cb;
  };

  struct S1 {
    S2 s2;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    S1 s1;
    s1.s2.cb.SBuffer = gSBuffer;
    s1.s2.cb.RWSBuffer = gRWSBuffer;
    s1.s2.cb.RWSBuffer[i] = s1.s2.cb.SBuffer[i];
  }

Function parameters
~~~~~~~~~~~~~~~~~~~

.. code-block:: hlsl

  // 5-func-param-sbuf-ok.hlsl
  // http://shader-playground.timjones.io/aeb06f527c5390d82d63bdb4eafc9ae7
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  void foo(StructuredBuffer<S> pSBuffer) {
    gRWSBuffer[i] = pSBuffer[i];
  }

  void main() {
    foo(gSBuffer);
  }

It is possible to pass a structured buffer as a parameter to a function.
As with the copies in the previous section, it is a pointer to the
structured buffer that is actually being passed to ``foo``. This is the
same way that arrays work in C/C++.

.. code-block:: hlsl

  // 6-func-param-rwsbuf-ok.hlsl
  // http://shader-playground.timjones.io/f4e0194ce78118c0a709d85080ccea93
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  void foo(RWStructuredBuffer<S> pRWSBuffer) {
    pRWSBuffer[i] = gSBuffer[i];
  }

  void main() {
    foo(gRWSBuffer);
  }

The same is true for RW structured buffers. So in this case, the write
to ``pRWSBuffer`` is changing ``gRWSBuffer``. This means that the write
to ``pRWSBuffer`` will be visible outside of the function, and outside
of the shader.

Return values
~~~~~~~~~~~~~

The next two examples show that structured buffers can be a function's
return value. As before, the return value of ``foo`` is really a pointer
to the global resource.

.. code-block:: hlsl

  // 7-func-ret-tmp-var-ok.hlsl
  // http://shader-playground.timjones.io/d6b706423f02dad58fbb01841282c6a1
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  RWStructuredBuffer<S> foo() {
    return gRWSBuffer;
  }

  void main() {
    RWStructuredBuffer<S> lRWSBuffer = foo();
    lRWSBuffer[i] = gSBuffer[i];
  }

In this case, the compiler will replace ``lRWSBuffer`` by
``gRWSBuffer``.

.. code-block:: hlsl

  // 8-func-ret-direct-ok.hlsl
  // http://shader-playground.timjones.io/6edbbc1aa6c6b6533c5a728135f87fb9
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer;

  StructuredBuffer<S> foo() {
    return gSBuffer;
  }

  void main() {
    gRWSBuffer[i] = foo()[i];
  }

This example is similar to the previous, but shows that you do not have
to use an explicit temporary value.

Conditional control flow
------------------------

The examples so far do not have any conditional control flow. This
makes it obvious which resources are being used. The introduction of
conditional control flow makes the job of the compiler much harder, and
in some cases impossible. Remember that the compiler is trying to
determine at compile time which resource will be used at run time. In
this section, we will look at how control flow affects the compiler's
ability to do this. The bottom line is that the compiler has to be able
to turn all of the conditional control flow that affects which resources
are used into straight line code.

Inputs in if-statement
~~~~~~~~~~~~~~~~~~~~~~

The first example is one where the compiler cannot determine which
resource is actually being accessed.

.. code-block:: hlsl

  // 9-if-stmt-select-fail.hlsl
  // http://shader-playground.timjones.io/2896e95627fd8a6689ca96c81a5c7c68
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    if (constant > i) {       // Condition can't be computed at compile time.
      lSBuffer = gSBuffer1;   // Will produce invalid SPIR-V for Vulkan.
    } else {
      lSBuffer = gSBuffer2;
    }
    gRWSBuffer[i] = lSBuffer[i];
  }

In this example, ``lSBuffer`` could be either ``gSBuffer1`` or
``gSBuffer2``. It depends on the value of ``i``, which is a parameter to
the shader and cannot be known at compile time. At this time, the
compiler is not able to convert this code into something that drivers
will accept.

If your code follows this pattern, I would suggest rewriting it as
follows:

.. code-block:: hlsl

  // 10-if-stmt-select-ok.hlsl
  // http://shader-playground.timjones.io/5063d8a0a7ad1f9d0839cd34a6d94dd2
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    if (constant > i) {
      lSBuffer = gSBuffer1;
      gRWSBuffer[i] = lSBuffer[i];
    } else {
      lSBuffer = gSBuffer2;
      gRWSBuffer[i] = lSBuffer[i];
    }
  }

Notice that this involves replicating code. If the code that follows the
if-statement is long, you could consider moving it to a function, and
having two calls to that function.
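
The function-based variant of this rewrite could look like the sketch
below. The helper name ``copyElement`` is hypothetical, and the sketch
has not been checked against a specific DXC version. It relies on the
same mechanisms as passing a structured buffer as a function parameter:
each call names exactly one global resource, so each call site can be
resolved on its own.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  // Hypothetical helper holding the code that used to follow the
  // if-statement.
  void copyElement(StructuredBuffer<S> pSBuffer) {
    gRWSBuffer[i] = pSBuffer[i];
  }

  void main() {
    if (constant > i) {
      copyElement(gSBuffer1);   // This call only ever sees gSBuffer1.
    } else {
      copyElement(gSBuffer2);   // This call only ever sees gSBuffer2.
    }
  }

The run-time condition still selects a branch, but it no longer selects
a resource for a shared access after the if-statement.
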

If-statements with constants
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Not all control flow is a problem. There are situations where the
compiler is able to determine that a condition is always true or always
false. For example, in the following code, the compiler looks at
``0 > 2`` and knows that it is always false.

.. code-block:: hlsl

  // 11-if-stmt-const-ok.hlsl
  // http://shader-playground.timjones.io/7ef5b89b3ec3d56c22e1bca45b40516a
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    if (constant > 2) {
      lSBuffer = gSBuffer1;
    } else {
      lSBuffer = gSBuffer2;
    }
    gRWSBuffer[i] = lSBuffer[i];
  }

The compiler will turn this code into

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    gRWSBuffer[i] = gSBuffer2[i];
  }

The two previous examples show that handling control flow depends on
what the compiler can do. This depends on the amount of optimization
that is done, and which optimizations are done. In general, when you are
writing code that will select a resource, keep the conditions as simple
as possible to make it as easy as possible for the compiler to determine
which path is taken.

Switch statements
~~~~~~~~~~~~~~~~~

Switch statements are similar to if-statements. If the selector is a
constant, then the compiler will be able to propagate the copies.

.. code-block:: hlsl

  // 12-switch-stmt-select-fail.hlsl
  // http://shader-playground.timjones.io/b079f878daeba5d77842725b90a476ca
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    switch(i) {                   // Compiler can't determine which case will run.
      case 0:
        lSBuffer = gSBuffer1;     // Will produce invalid SPIR-V for Vulkan.
        break;
      default:
        lSBuffer = gSBuffer2;
        break;
    }
    gRWSBuffer[i] = lSBuffer[i];
  }

The compiler is not able to remove the copies in this example because it
does not know the value of ``i`` at compile time.

.. code-block:: hlsl

  // 13-switch-stmt-const-ok.hlsl
  // http://shader-playground.timjones.io/a46dd1f1a84eba38c047439741ec08ab
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  const static int constant = 0;

  void main() {
    StructuredBuffer<S> lSBuffer;
    switch(constant) {
      case 0:
        lSBuffer = gSBuffer1;
        break;
      default:
        lSBuffer = gSBuffer2;
        break;
    }
    gRWSBuffer[i] = lSBuffer[i];
  }

However, if the selector is turned into a constant, as in this second
example, the compiler can replace uses of ``lSBuffer`` with
``gSBuffer1``.
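
When the selector really does have to be a run-time value, the failing
switch can be restructured in the same way as the if-statement rewrite:
move the access into each case so that every access names a global
resource directly. The following is a minimal sketch of that idea,
reusing the declarations from the examples above.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    switch(i) {
      case 0:
        // The resource is named directly; no local copy is needed.
        gRWSBuffer[i] = gSBuffer1[i];
        break;
      default:
        gRWSBuffer[i] = gSBuffer2[i];
        break;
    }
  }
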

Loop Induction Variables in conditions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Besides inputs, another kind of variable that hinders the compiler is
the loop induction variable. These are variables that change value for
each iteration of the loop. Consider this example.

.. code-block:: hlsl

  // 14-loop-var-fail.hlsl
  // http://shader-playground.timjones.io/8df364770e3f425e6321e71f817bcd1a
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    for( int j = 0; j < 2; j++ ) {
      if (constant > j) {       // Condition is different for different iterations
        lSBuffer = gSBuffer1;   // Will produce invalid SPIR-V for Vulkan.
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }

In this example, ``j`` is an induction variable. It takes on the values
``0`` and ``1``. The information is there to be able to determine which
path is taken in each iteration, but the compiler does not figure this
out by default.

If you want the compiler to be able to legalize this code, then you will
have to direct the compiler to unroll this loop using the ``unroll``
attribute. The following example can be legalized by the compiler:

.. code-block:: hlsl

  // 15-loop-var-unroll-ok.hlsl
  // http://shader-playground.timjones.io/3d0f6f830fc4a5102714e19c748e81c7
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( int j = 0; j < 2; j++ ) {
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }
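
To see why unrolling helps, it can be useful to write out by hand what
the fully unrolled loop amounts to. The sketch below is an HLSL-level
illustration only, not the compiler's actual output: with ``constant``
defined as ``0``, the condition ``constant > j`` is false for both
``j = 0`` and ``j = 1``, so each iteration resolves to ``gSBuffer2``.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  void main() {
    // Iteration j = 0: (0 > 0) is false, so lSBuffer is gSBuffer2.
    gRWSBuffer[0] = gSBuffer2[0];
    // Iteration j = 1: (0 > 1) is false, so lSBuffer is gSBuffer2.
    gRWSBuffer[1] = gSBuffer2[1];
  }

Once each iteration is a separate copy of the body, the condition in
each copy involves only constants, which is the case described under
if-statements with constants.
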

Variable iteration counts
~~~~~~~~~~~~~~~~~~~~~~~~~

Adding the ``unroll`` attribute to loops does not guarantee that the
compiler is able to legalize the code. The compiler has to be able to
fully unroll the loop. That means the compiler will have to create a
copy of the body of the loop for each iteration so that there is no loop
anymore. That can only be done if the number of iterations can be known
at compile time.

This means that the compiler must be able to determine the initial
value, the final value, and the step for the induction variable, ``j``
in the example. None of ``foo1``, ``foo2``, or ``foo3`` can be legalized
because the number of iterations cannot be known at compile time.

.. code-block:: hlsl

  // 16-loop-var-range-fail.hlsl
  // http://shader-playground.timjones.io/376f5f985c3ceceea004ab58edb336f2
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  int i;

  #define constant 0

  void foo1() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( int j = i; j < 2; j++ ) {    // Compiler can't determine the initial value
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }

  void foo2() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( int j = 0; j < i; j++ ) {    // Compiler can't determine the end value
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }

  void foo3() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( int j = 0; j < 2; j += i ) { // Compiler can't determine the step
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }

  void main() {
    foo1(); foo2(); foo3();
  }

As before, the compiler will try to simplify expressions to determine
their value at compile time, but it may not always be successful. We
recommend that you keep the expressions for the loop bounds as simple as
possible to increase the chances that the compiler can figure them out.
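
As an illustration of simple loop bounds, the sketch below uses literal
values for the initial value and the step, and a ``static const`` integer
for the end value. The name ``kIterations`` is hypothetical, and the
sketch has not been checked against a specific DXC version; it assumes
the compiler propagates a ``static const`` loop bound in the same way
that it propagates the ``const static`` switch selector shown earlier.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  // Hypothetical compile-time constant used as the end value.
  static const int kIterations = 2;

  void main() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( int j = 0; j < kIterations; j++ ) {  // Initial value, end value, and step are all constants.
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }
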

Other restrictions on unrolling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Not being able to determine the iteration count at compile time is a
fundamental problem. No matter how good the compiler is, it will never
be able to fully unroll such a loop. However, there are other cases that
cannot be handled because of internal details of the algorithms in the
SPIRV-Tools optimizer. The most notable restriction is that the
induction variable must be an integral type.

.. code-block:: hlsl

  // 17-loop-var-float-fail.hlsl
  // http://shader-playground.timjones.io/d5d2598699378688684a4a074553dddf
  struct S {
    float4 f;
  };

  struct CombinedBuffers {
    StructuredBuffer<S> SBuffer;
    RWStructuredBuffer<S> RWSBuffer;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( float j = 0; j < 2; j++ ) {  // Can't infer floating point induction values
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      gRWSBuffer[j] = lSBuffer[j];
    }
  }

This example cannot be legalized because ``j`` is a ``float``.
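
If a floating point value is genuinely needed inside the loop, one
possible workaround is to keep the induction variable integral and derive
the ``float`` from it in the loop body. The sketch below assumes this is
acceptable for the shader in question; the local names ``fj`` and ``s``
and the scaling of the element are made up for illustration, and the
sketch has not been checked against a specific DXC version.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer;

  #define constant 0

  void main() {
    StructuredBuffer<S> lSBuffer;
    [unroll]
    for( int j = 0; j < 2; j++ ) {  // Integral induction variable, as in the unrolled example.
      float fj = (float)j;          // Hypothetical: float value derived from j.
      if (constant > j) {
        lSBuffer = gSBuffer1;
      } else {
        lSBuffer = gSBuffer2;
      }
      S s = lSBuffer[j];            // Copy of an element, not of the buffer.
      s.f *= fj;
      gRWSBuffer[j] = s;
    }
  }
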

Other interesting cases
-----------------------

Multiple calls to a function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: hlsl

  // 18-multi-func-call-ok.hlsl
  // http://shader-playground.timjones.io/e7b3ac1262a291c92902fd3f1fd3343c
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  void foo(RWStructuredBuffer<S> pRWSBuffer) {
    pRWSBuffer[i] = gSBuffer[i];
  }

  void main() {
    foo(gRWSBuffer1);
    foo(gRWSBuffer2);
  }

In this example, we see the same function is called twice. Each call has
a different parameter. This can look like a problem because
``pRWSBuffer`` could be either ``gRWSBuffer1`` or ``gRWSBuffer2``.
However, the compiler is able to work around this by creating a separate
copy of ``foo`` for each call site. In fact, these copies will be placed
inline.
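
Written out by hand, the effect of inlining a separate copy of ``foo``
at each call site is roughly the following. This is an HLSL-level
illustration of the idea, not the compiler's actual output.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  void main() {
    gRWSBuffer1[i] = gSBuffer[i];   // Inlined copy of foo(gRWSBuffer1).
    gRWSBuffer2[i] = gSBuffer[i];   // Inlined copy of foo(gRWSBuffer2).
  }
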

Multiple returns
~~~~~~~~~~~~~~~~

As we have already seen, a return from a function is a copy. At this
point, it would be fair to ask what happens if there are multiple
returns.

.. code-block:: hlsl

  // 19-multi-func-ret-fail.hlsl
  // http://shader-playground.timjones.io/922facb688a5ba09b153d64cf1fc4557
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  RWStructuredBuffer<S> foo(int l) {
    if (l == 0) {       // Compiler does not know which branch will be taken:
                        // Branch taken depends on input i.
      return gRWSBuffer1;
    } else {
      return gRWSBuffer2;
    }
  }

  void main() {
    RWStructuredBuffer<S> lRWSBuffer = foo(i);
    lRWSBuffer[i] = gSBuffer[i];
  }

The compiler is not able to legalize this example because it does not
know which value will be returned. However, if the compiler is able to
determine which path will be taken, then it can be legalized.

.. code-block:: hlsl

  // 20-multi-func-ret-const-ok.hlsl
  // http://shader-playground.timjones.io/84b093c7cf9e3932c5f0d9691533bafe
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  StructuredBuffer<S> foo(int l) {
    if (l == 0) {
      return gSBuffer1;
    } else {
      return gSBuffer2;
    }
  }

  void main() {
    gRWSBuffer1[i] = foo(0)[i];
    gRWSBuffer2[i] = foo(1)[i];
  }

For each call to ``foo``, the compiler is able to determine which value
will be returned. In this case, the code can be legalized.
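
Following the constant argument through each call by hand gives the
sketch below: ``foo(0)`` takes the ``l == 0`` branch and ``foo(1)`` takes
the ``else`` branch. As with the other hand-written results, this only
illustrates the effect in HLSL terms.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  void main() {
    gRWSBuffer1[i] = gSBuffer1[i];  // foo(0): (0 == 0) is true.
    gRWSBuffer2[i] = gSBuffer2[i];  // foo(1): (1 == 0) is false.
  }
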

Combining elements
~~~~~~~~~~~~~~~~~~

Individually, these examples are simple; however, these elements can be
combined in arbitrary ways. As one last example, consider this HLSL
source code.

.. code-block:: hlsl

  // 21-combined-ok.hlsl
  // http://shader-playground.timjones.io/9f00d2d359da0731cdf8d0b68520e2c4
  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  #define constant 0

  StructuredBuffer<S> bar() {
    if (constant > 2) {
      return gSBuffer1;
    } else {
      return gSBuffer2;
    }
  }

  void foo(RWStructuredBuffer<S> pRWSBuffer) {
    StructuredBuffer<S> lSBuffer = bar();
    pRWSBuffer[i] = lSBuffer[i];
  }

  void main() {
    foo(gRWSBuffer1);
    foo(gRWSBuffer2);
  }

The compiler will do all of the transformations mentioned earlier
to identify a single resource for each load and store from a resource.
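
Working through those transformations by hand: ``constant > 2`` is
``0 > 2``, which is always false, so ``bar`` always returns
``gSBuffer2``, and each call to ``foo`` is resolved separately for its
own argument. The result is roughly equivalent to the following sketch,
which is an HLSL-level illustration rather than the compiler's actual
output.

.. code-block:: hlsl

  struct S {
    float4 f;
  };

  int i;

  StructuredBuffer<S> gSBuffer1;
  StructuredBuffer<S> gSBuffer2;
  RWStructuredBuffer<S> gRWSBuffer1;
  RWStructuredBuffer<S> gRWSBuffer2;

  void main() {
    gRWSBuffer1[i] = gSBuffer2[i];  // foo(gRWSBuffer1), with bar() resolved to gSBuffer2.
    gRWSBuffer2[i] = gSBuffer2[i];  // foo(gRWSBuffer2), with bar() resolved to gSBuffer2.
  }
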

Conclusion
==========

It is impossible to enumerate all of the possible code sequences that
work or do not work, but hopefully this will give a guide as to what is
possible or not. The general rule of thumb is that there must be a
straightforward way to transform the code so that there are no copies of
global resources.