=====================================
HLSL to SPIR-V Feature Mapping Manual
=====================================

.. contents::
   :local:
   :depth: 3

Introduction
============

This document describes the mappings from HLSL features to SPIR-V for Vulkan
adopted by the SPIR-V codegen. For how to build, use, or contribute to the
SPIR-V codegen and its internals, please see the
`wiki <https://github.com/Microsoft/DirectXShaderCompiler/wiki/SPIR%E2%80%90V-CodeGen>`_
page.

`SPIR-V <https://www.khronos.org/registry/spir-v/>`_ is a binary intermediate
language for representing graphical-shader stages and compute kernels for
multiple Khronos APIs, such as Vulkan, OpenGL, and OpenCL. At the moment we
only intend to support the Vulkan flavor of SPIR-V.

DirectXShaderCompiler is the reference compiler for HLSL. Adding SPIR-V codegen
in DirectXShaderCompiler will enable the usage of HLSL as a frontend language
for Vulkan shader programming. Sharing the same code base also means we can
track the evolution of HLSL more closely and always deliver the best of HLSL to
developers. Moreover, developers will also have a unified compiler toolchain for
targeting both DirectX and Vulkan. We believe this effort will benefit the
general graphics ecosystem.

Note that this document is expected to be an ongoing effort and grow as we
implement more and more HLSL features.

Overview
========

Although they share the same basic concepts, DirectX and Vulkan are still
different graphics APIs with semantic gaps. HLSL is the native shading language
for DirectX, so certain HLSL features do not have corresponding mappings in
Vulkan, and certain Vulkan specific information has no native way to be
expressed in HLSL source code. This section describes the general translation
paradigms and how we close some of the major semantic gaps.

Note that the term "semantic" is overloaded. In HLSL, it can mean the string
attached to shader input or output. For such cases, we refer to it as "HLSL
semantic" or "semantic string". For other cases, we just use the normal
"semantic" term.

Shader entry function
---------------------

HLSL entry functions can read data from the previous shader stage and write
data to the next shader stage via function parameters and return value. In
contrast, Vulkan requires all SPIR-V entry functions to take no parameters and
return void. All data passing between stages should use global variables
in the ``Input`` and ``Output`` storage class.

To handle this difference, we emit a wrapper function as the SPIR-V entry
function around the HLSL source code entry function. The wrapper function is
responsible for reading data from SPIR-V ``Input`` global variables and
converting them to the types required by the source code entry function
signature, calling the source code entry function, decomposing the contents of
the return value (and ``out``/``inout`` parameters) into the types required by
the SPIR-V ``Output`` global variables, and then writing them out. For details
about the wrapper function, please refer to the `entry function wrapper`_
section.

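
As an illustration, here is a minimal sketch of a vertex shader whose entry
function takes a parameter and returns a value. The struct and field names are
made up for this example, and the comments describe the wrapper's work only
informally; the exact SPIR-V that is emitted is not shown here.

.. code:: hlsl

  // Hypothetical input/output structs, for illustration only.
  struct VSInput {
    float4 position : POSITION;
    float2 uv       : TEXCOORD0;
  };

  struct VSOutput {
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
  };

  // The source code entry function: data comes in through a parameter and
  // leaves through the return value.
  //
  // Conceptually, the emitted wrapper (the real SPIR-V entry function):
  //   1. loads the Input global variables and composes a VSInput value,
  //   2. calls this function,
  //   3. decomposes the returned VSOutput and stores each field into the
  //      matching Output global variable.
  VSOutput main(VSInput input) {
    VSOutput output;
    output.position = input.position;
    output.uv       = input.uv;
    return output;
  }
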
Shader stage IO interface matching
----------------------------------

HLSL leverages semantic strings to link variables and pass data between shader
stages. Great flexibility is allowed in how semantic strings are used.
They can appear on function parameters, function returns, and struct members.
In Vulkan, linking variables and passing data between shader stages is done via
numeric ``Location`` decorations on SPIR-V global variables in the ``Input`` and
``Output`` storage class.

To help handle such differences, we provide `Vulkan specific attributes`_ to
let developers express their intent precisely. The compiler will also try
its best to deduce the mapping from semantic strings to SPIR-V ``Location``
numbers when such explicit Vulkan specific attributes are absent. Please see the
`HLSL semantic and Vulkan Location`_ section for more details about the mapping
and ``Location`` assignment.

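
The following sketch shows one way to pin ``Location`` numbers explicitly on
both sides of a vertex-to-pixel interface. The struct names, entry function
names (``vsMain``, ``psMain``), and location numbers are assumptions chosen for
this example; each entry point would be compiled separately.

.. code:: hlsl

  // Hypothetical vertex shader output; locations are assigned by hand.
  struct VSOut {
    float4 pos                        : SV_Position;
    [[vk::location(0)]] float2 uv     : TEXCOORD0;
    [[vk::location(1)]] float3 normal : NORMAL;
  };

  // Hypothetical pixel shader input using the same locations and types.
  struct PSIn {
    [[vk::location(0)]] float2 uv     : TEXCOORD0;
    [[vk::location(1)]] float3 normal : NORMAL;
  };

  VSOut vsMain(float4 pos : POSITION, float2 uv : TEXCOORD0,
               float3 normal : NORMAL) {
    VSOut o;
    o.pos    = pos;
    o.uv     = uv;
    o.normal = normal;
    return o;
  }

  float4 psMain(PSIn input) : SV_Target {
    // Visualize the interpolated normal; uv is read to keep it in use.
    return float4(normalize(input.normal) * 0.5 + 0.5, input.uv.x);
  }
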
What makes the story complicated is Vulkan's strict requirements on interface
matching. Basically, a variable in the previous stage is considered a match to
a variable in the next stage if and only if they are decorated with the same
``Location`` number and have the exact same type, except for the outermost
arrayness in hull/domain/geometry shaders, which is ignored for the purposes of
interface matching. This causes problems when combined with the flexibility of
HLSL semantic strings.

Some HLSL system-value (SV) semantic strings will be mapped into SPIR-V
variables with builtin decorations, and some will not. HLSL non-SV semantic
strings should all be mapped to SPIR-V variables without builtin decorations
(but with ``Location`` decorations).

With these complications, if we are grouping multiple semantic strings in a
struct in the HLSL source code, that struct should be flattened and each of
its members should be mapped separately. For example, for the following:

.. code:: hlsl

  struct T {
    float2 clip0 : SV_ClipDistance0;
    float3 cull0 : SV_CullDistance0;
    float4 foo   : FOO;
  };

  struct S {
    float4 pos   : SV_Position;
    float2 clip1 : SV_ClipDistance1;
    float3 cull1 : SV_CullDistance1;
    float4 bar   : BAR;
    T      t;
  };

If we have an ``S`` input parameter in a pixel shader, we should flatten it
recursively to generate five SPIR-V ``Input`` variables. Three of them are
decorated with the ``Position``, ``ClipDistance``, and ``CullDistance``
builtins, and two of them are decorated with the ``Location`` decoration. (Note
that ``clip0`` and ``clip1`` are concatenated, as are ``cull0`` and ``cull1``.
The ``ClipDistance`` and ``CullDistance`` builtins are special and explained
in the `ClipDistance & CullDistance`_ section.)

Flattening is infectious because of Vulkan interface matching rules. If we
flatten a struct in the output of a previous stage, which may create multiple
variables decorated with different ``Location`` numbers, we also need to
flatten it in the input of the next stage; otherwise we may have a ``Location``
mismatch even if both stages share the same definition of the struct. Because
hull/domain/geometry shaders are optional, we can have different chains of
shader stages, which means we need to flatten all shader stage interfaces. For
hull/domain/geometry shaders, the inputs/outputs have an additional arrayness.
So if we see an array of structs in these shaders, we need to flatten it into
arrays of its fields.

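
To make the extra arrayness concrete, here is a small geometry shader sketch.
The struct names are hypothetical, and the comments describe the flattening
only roughly; they are not a dump of the generated SPIR-V.

.. code:: hlsl

  // Hypothetical per-vertex structs, for illustration only.
  struct GSIn {
    float4 pos   : SV_Position;
    float4 color : COLOR;
  };

  struct GSOut {
    float4 pos   : SV_Position;
    float4 color : COLOR;
  };

  // `input` is an array of three structs. Following the flattening rule
  // described above, it is not kept as one array-of-struct variable;
  // instead each field becomes its own array of three elements:
  //   pos   -> an array of three float4 values (builtin position data)
  //   color -> an array of three float4 values (with a Location decoration)
  [maxvertexcount(3)]
  void main(triangle GSIn input[3], inout TriangleStream<GSOut> stream) {
    for (int i = 0; i < 3; ++i) {
      GSOut o;
      o.pos   = input[i].pos;
      o.color = input[i].color;
      stream.Append(o);
    }
  }
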
Vulkan specific features
------------------------

We try to implement Vulkan specific features using the most intuitive and
non-intrusive ways in HLSL, which means we will prefer native language
constructs when possible. If that is inadequate, we then consider attaching
`Vulkan specific attributes`_ to them, or introducing new syntax.

Descriptors
~~~~~~~~~~~

The compiler provides multiple mechanisms to specify which Vulkan descriptor
a particular resource binds to.

In the source code, you can use the ``[[vk::binding(X[, Y])]]`` and
``[[vk::counter_binding(X)]]`` attributes. The native ``:register()`` attribute
is also respected.

On the command line, you can use the ``-fvk-{b|s|t|u}-shift`` or
``-fvk-bind-register`` option.

If you can modify the source code, the ``[[vk::binding(X[, Y])]]`` and
``[[vk::counter_binding(X)]]`` attributes give you fine-grained control over
descriptor assignment.

If you cannot modify the source code, you can use command-line options to change
how the ``:register()`` attribute is handled by the compiler.
``-fvk-bind-register`` lets you specify the descriptor for the resource at a
certain register. ``-fvk-{b|s|t|u}-shift`` lets you apply shifts to all register
numbers of a certain register type. They cannot be used together, though.

Without attributes and command-line options, ``:register(xX, spaceY)`` will be
mapped to binding ``X`` in descriptor set ``Y``. Note that register type ``x``
is ignored, so this may cause overlap.

The more specific a mechanism is, the higher precedence it has, and command-line
options take precedence over source code attributes.

For more details, see `HLSL register and Vulkan binding`_, `Vulkan specific
attributes`_, and `Vulkan-specific options`_.

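
The source-level mechanisms can be sketched as follows. The resource names,
the ``Particle`` struct, and the binding numbers are assumptions made up for
this example.

.. code:: hlsl

  // Hypothetical element type.
  struct Particle {
    float4 position;
    float4 velocity;
  };

  // Explicit attributes: binding 2 in descriptor set 1. The associated
  // counter goes to binding 3 in the same descriptor set.
  [[vk::binding(2, 1), vk::counter_binding(3)]]
  RWStructuredBuffer<Particle> particles;

  // No vk:: attribute: register(t4, space0) maps to binding 4 in set 0.
  Texture2D<float4> albedo : register(t4, space0);

  // The register type is ignored, so register(s4, space0) alone would also
  // land on binding 4 in set 0 and overlap with `albedo`. The explicit
  // attribute moves the sampler to binding 5 instead.
  [[vk::binding(5, 0)]]
  SamplerState albedoSampler : register(s4, space0);

  [numthreads(64, 1, 1)]
  void main(uint3 id : SV_DispatchThreadID) {
    // Using the counter is what makes the counter binding relevant.
    uint slot = particles.IncrementCounter();
    Particle p;
    p.position = albedo.SampleLevel(albedoSampler, float2(0.5, 0.5), 0);
    p.velocity = float4(0.0, 0.0, 0.0, 0.0);
    particles[slot] = p;
  }

When the source cannot be changed, the same kind of overlap could instead be
avoided from the command line with one of the ``-fvk-{b|s|t|u}-shift`` options
described above.
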
Subpass inputs
~~~~~~~~~~~~~~

Within a Vulkan `rendering pass <https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#renderpass>`_,
a subpass can write results to an output target that can then be read by the
next subpass as an input subpass. The "Subpass Input" feature concerns the
ability to read an output target.

Subpasses are read through two new builtin resource types, available only in
pixel shaders:

.. code:: hlsl

  class SubpassInput<T> {
    T SubpassLoad();
  };

  class SubpassInputMS<T> {
    T SubpassLoad(int sampleIndex);
  };

In the above, ``T`` is a scalar or vector type. If omitted, it defaults to
``float4``.

Subpass inputs are implicitly addressed by the pixel's (x, y, layer) coordinate.
These objects support reading the subpass input through the methods shown
above.

A subpass input is selected by using a new attribute ``vk::input_attachment_index``.
For example:

.. code:: hlsl

  [[vk::input_attachment_index(i)]] SubpassInput input;

A ``vk::input_attachment_index`` of ``i`` selects the ith entry in the input
pass list. (See Vulkan API spec for more information.)

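
Putting these pieces together, a pixel shader that reads two subpass inputs
might look like the sketch below. The variable names and attachment indices are
assumptions for illustration; they have to agree with how the render pass is
set up on the API side.

.. code:: hlsl

  // Hypothetical attachments: index 0 holds color, index 1 holds a
  // multisampled single-channel value.
  [[vk::input_attachment_index(0)]] SubpassInput<float4>  gColor;
  [[vk::input_attachment_index(1)]] SubpassInputMS<float> gCoverage;

  float4 main(uint sampleIndex : SV_SampleIndex) : SV_Target {
    // No coordinate is passed: the current pixel is addressed implicitly.
    float4 color = gColor.SubpassLoad();

    // The multisampled variant additionally takes the sample to read.
    float coverage = gCoverage.SubpassLoad(int(sampleIndex));

    return color * coverage;
  }
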
Push constants
~~~~~~~~~~~~~~

Vulkan push constant blocks are represented using normal global variables of
struct types in HLSL. The variables (not the underlying struct types) should be
annotated with the ``[[vk::push_constant]]`` attribute.

Please note that, as per the requirements of Vulkan, "there must be no more
than one push constant block statically used per shader entry point."

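
A minimal sketch, with a made-up struct and variable name, could look like
this. Note that the attribute sits on the variable, not on the struct type.

.. code:: hlsl

  // Hypothetical block layout, for illustration only.
  struct PushData {
    float4x4 mvp;
    float4   tint;
  };

  // The global variable (not the struct type) carries the attribute.
  [[vk::push_constant]] PushData pushData;

  float4 main(float4 pos : POSITION) : SV_Position {
    return mul(pushData.mvp, pos);
  }
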
Specialization constants
~~~~~~~~~~~~~~~~~~~~~~~~

To use Vulkan specialization constants, annotate global constants with the
``[[vk::constant_id(X)]]`` attribute. For example,

.. code:: hlsl

  [[vk::constant_id(1)]] const bool  specConstBool  = true;
  [[vk::constant_id(2)]] const int   specConstInt   = 42;
  [[vk::constant_id(3)]] const float specConstFloat = 1.5;

Builtin variables
~~~~~~~~~~~~~~~~~

Some of the Vulkan builtin variables have no equivalents in native HLSL
language. To support them, ``[[vk::builtin("<builtin>")]]`` is introduced.
Right now the following ``<builtin>`` values are supported:

* ``PointSize``: The GLSL equivalent is ``gl_PointSize``.
* ``HelperInvocation``: The GLSL equivalent is ``gl_HelperInvocation``.
* ``BaseVertex``: The GLSL equivalent is ``gl_BaseVertexARB``.
  Needs the ``SPV_KHR_shader_draw_parameters`` extension.
* ``BaseInstance``: The GLSL equivalent is ``gl_BaseInstanceARB``.
  Needs the ``SPV_KHR_shader_draw_parameters`` extension.
* ``DrawIndex``: The GLSL equivalent is ``gl_DrawIDARB``.
  Needs the ``SPV_KHR_shader_draw_parameters`` extension.
* ``DeviceIndex``: The GLSL equivalent is ``gl_DeviceIndex``.
  Needs the ``SPV_KHR_device_group`` extension.

Please see Vulkan spec. `14.6. Built-In Variables <https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#interfaces-builtin-variables>`_
for a detailed explanation of these builtins.

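
As a rough sketch, the attribute can be attached to a struct field or a
function parameter. The semantic strings ``PSIZE`` and ``A`` below are
placeholders picked for this example, as are the other names, and the types
used for the builtins are assumptions.

.. code:: hlsl

  struct VSOut {
    float4 pos : SV_Position;
    // Written to the Vulkan PointSize builtin.
    [[vk::builtin("PointSize")]] float pointSize : PSIZE;
  };

  VSOut main(float4 pos : POSITION,
             // Read from the Vulkan BaseInstance builtin; needs the
             // SPV_KHR_shader_draw_parameters extension.
             [[vk::builtin("BaseInstance")]] int baseInstance : A) {
    VSOut o;
    o.pos       = pos;
    o.pointSize = 1.0 + float(baseInstance);
    return o;
  }
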
Supported extensions
~~~~~~~~~~~~~~~~~~~~

* SPV_KHR_16bit_storage
* SPV_KHR_device_group
* SPV_KHR_multiview
* SPV_KHR_post_depth_coverage
* SPV_KHR_shader_draw_parameters
* SPV_EXT_descriptor_indexing
* SPV_EXT_fragment_fully_covered
* SPV_EXT_shader_stencil_export
* SPV_AMD_shader_explicit_vertex_parameter
* SPV_GOOGLE_hlsl_functionality1

Vulkan specific attributes
--------------------------

`C++ attribute specifier sequence <http://en.cppreference.com/w/cpp/language/attributes>`_
is a non-intrusive way of providing Vulkan specific information in HLSL.

The namespace ``vk`` will be used for all Vulkan attributes:

- ``location(X)``: For specifying the location (``X``) numbers for stage
  input/output variables. Allowed on function parameters, function returns,
  and struct fields.
- ``binding(X[, Y])``: For specifying the descriptor set (``Y``) and binding
  (``X``) numbers for resource variables. The descriptor set (``Y``) is
  optional; if missing, it will be set to 0. Allowed on global variables.
- ``counter_binding(X)``: For specifying the binding number (``X``) for the
  associated counter for RW/Append/Consume structured buffer. The descriptor
  set number for the associated counter is always the same as the main resource.
- ``push_constant``: For marking a variable as the push constant block. Allowed
  on global variables of struct type. At most one variable can be marked as
  ``push_constant`` in a shader.
- ``offset(X)``: For manually laying out struct members. Annotating a struct
  member with this attribute will force the compiler to put the member at offset
  ``X`` w.r.t. the beginning of the struct. Only allowed on struct members.
- ``constant_id(X)``: For marking a global constant as a specialization constant.
  Allowed on global variables of boolean/integer/float types.
- ``input_attachment_index(X)``: To associate the Xth entry in the input pass
  list to the annotated object. Only allowed on objects whose type is
  ``SubpassInput`` or ``SubpassInputMS``.
- ``builtin("X")``: For specifying that an entity should be translated into a
  certain Vulkan builtin variable. Allowed on function parameters, function
  returns, and struct fields.
- ``index(X)``: For specifying the index at a specific pixel shader output
  location. Used for dual-source blending.
- ``post_depth_coverage``: The input variable decorated with SampleMask will
  reflect the result of the EarlyFragmentTests. Only valid on pixel shader entry
  points.

Only ``vk::`` attributes in the above list are supported. Other attributes will
result in warnings and be ignored by the compiler. All C++11 attributes will
only trigger warnings and be ignored if not compiling towards SPIR-V.

For example, to specify the layout of resource variables and the location of
interface variables:

.. code:: hlsl

  struct S { ... };

  [[vk::binding(X, Y), vk::counter_binding(Z)]]
  RWStructuredBuffer<S> mySBuffer;

  [[vk::location(M)]] float4
  main([[vk::location(N)]] float4 input: A) : B
  { ... }

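
Two of the less common attributes, ``offset`` and ``index``, can be sketched as
follows. The struct names, the chosen byte offsets, and the buffer name are
assumptions made up for this example.

.. code:: hlsl

  // Hypothetical struct with hand-placed members.
  struct Material {
    float4 baseColor;                      // starts at offset 0
    [[vk::offset(32)]] float  roughness;   // forced to byte offset 32
    [[vk::offset(48)]] float4 emissive;    // forced to byte offset 48
  };

  StructuredBuffer<Material> materials;

  // Dual-source blending: both outputs share location 0 and are told
  // apart by their index.
  struct PSOut {
    [[vk::location(0), vk::index(0)]] float4 color0 : SV_Target0;
    [[vk::location(0), vk::index(1)]] float4 color1 : SV_Target1;
  };

  PSOut main() {
    Material m = materials[0];
    PSOut o;
    o.color0 = m.baseColor + m.emissive;
    o.color1 = float4(m.roughness, m.roughness, m.roughness, 1.0);
    return o;
  }
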
SPIR-V version and extension
----------------------------

SPIR-V CodeGen provides two command-line options for fine-grained SPIR-V target
environment (hence SPIR-V version) and SPIR-V extension control:

- ``-fspv-target-env=``: for specifying SPIR-V target environment
- ``-fspv-extension=``: for specifying allowed SPIR-V extensions

``-fspv-target-env=`` only accepts ``vulkan1.0`` and ``vulkan1.1`` right now.
If such an option is not given, the CodeGen defaults to ``vulkan1.0``. When
targeting ``vulkan1.0``, trying to use features that are only available
in Vulkan 1.1 (SPIR-V 1.3), like `Shader Model 6.0 wave intrinsics`_, will
trigger a compiler error.

If ``-fspv-extension=`` is not specified, the CodeGen will select suitable
SPIR-V extensions to translate the source code. Otherwise, only extensions
supplied via ``-fspv-extension=`` will be used. If that does not suffice, errors
will be emitted explaining what additional extensions are required to translate
what specific feature in the source code. If you want to allow all KHR
extensions, you can use ``-fspv-extension=KHR``.

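
For instance, the compute shader sketched below uses a wave intrinsic, so it
is expected to need the ``vulkan1.1`` target environment. The file name
``wave.hlsl`` and the buffer name are hypothetical, and the command lines in
the comments are only an illustration of how the options above combine.

.. code:: hlsl

  // Expected to compile (wave intrinsics need Vulkan 1.1):
  //   dxc -T cs_6_0 -E main -spirv -fspv-target-env=vulkan1.1 wave.hlsl
  //
  // Expected to be rejected, since vulkan1.0 is the default target:
  //   dxc -T cs_6_0 -E main -spirv wave.hlsl
  //
  // Restricting the allowed extensions to the KHR ones:
  //   dxc -T cs_6_0 -E main -spirv -fspv-target-env=vulkan1.1 \
  //       -fspv-extension=KHR wave.hlsl

  RWStructuredBuffer<uint> results;

  [numthreads(32, 1, 1)]
  void main(uint3 id : SV_DispatchThreadID) {
    // Sum of id.x over all active lanes in the wave.
    results[id.x] = WaveActiveSum(id.x);
  }
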
Legalization, optimization, validation
--------------------------------------

After initial translation of the HLSL source code, SPIR-V CodeGen will further
conduct legalization (if needed), optimization (if requested), and validation
(if not turned off). All three stages are outsourced to `SPIRV-Tools <https://github.com/KhronosGroup/SPIRV-Tools>`_.
Here are the options controlling these stages:

* ``-fcgl``: turn off legalization and optimization
* ``-Od``: turn off optimization
* ``-Vd``: turn off validation

Legalization
~~~~~~~~~~~~

HLSL is a fairly permissive language considering the flexibility it provides for
manipulating resource objects. The developer can create local copies and pass
them around as function parameters and return values, as long as after certain
transformations (function inlining, constant evaluation and propagation, dead
code elimination, etc.), the compiler can remove all temporary copies and
pinpoint all uses to unique global resource objects.

Because of this property of HLSL, if we translate into SPIR-V for
Vulkan literally from the input HLSL source code, we will sometimes generate
illegal SPIR-V. Certain transformations are needed to legalize the literally
translated SPIR-V. Performing such transformations at the frontend AST level
is cumbersome or impossible (e.g., function inlining). They are better
conducted at the SPIR-V level. Therefore, legalization is delegated to
SPIRV-Tools.

Specifically, we need to legalize the following HLSL source code patterns:

* Using resource types in struct types
* Creating aliases of global resource objects
* Control flows involving the above cases

Legalization transformations will not run unless the above patterns are
encountered in the source code.

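
The first two patterns can be illustrated with the sketch below; all names in
it are made up for this example. After inlining and copy removal, every
``Sample`` call can be traced back to ``gTexture`` and ``gSampler``, which is
the property legalization relies on.

.. code:: hlsl

  Texture2D<float4> gTexture;
  SamplerState      gSampler;

  // Pattern 1: resource types used inside a struct type.
  struct Material {
    Texture2D<float4> tex;
    SamplerState      samp;
  };

  // Resources travel through a function parameter here.
  float4 sampleMaterial(Material m, float2 uv) {
    return m.tex.Sample(m.samp, uv);
  }

  float4 main(float2 uv : TEXCOORD0) : SV_Target {
    Material m;
    m.tex  = gTexture;
    m.samp = gSampler;

    // Pattern 2: a local alias of a global resource object.
    Texture2D<float4> alias = gTexture;

    return sampleMaterial(m, uv) + alias.Sample(gSampler, uv);
  }
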
  302. Optimization
  303. ~~~~~~~~~~~~
  304. Optimization is also delegated to SPIRV-Tools. Right now there are no difference
  305. between optimization levels greater than zero; they will all invoke the same
  306. optimization recipe. That is, the recipe behind ``spirv-opt -O``. If you want to
  307. run a custom optimization recipe, you can do so using the command line option
  308. ``-Oconfig=`` and specifying a comma-separated list of your desired passes.
  309. The passes are invoked in the specified order.
  310. For example, you can specify ``-Oconfig=--loop-unroll,--scalar-replacement=300,--eliminate-dead-code-aggressive``
  311. to firstly invoke loop unrolling, then invoke scalar replacement of aggregates,
  312. lastly invoke aggressive dead code elimination. All valid options to
  313. ``spirv-opt`` are accepted as components to the comma-separated list.
  314. Here are the typical passes in alphabetical order:
  315. * ``--ccp``
  316. * ``--cfg-cleanup``
  317. * ``--convert-local-access-chains``
  318. * ``--copy-propagate-arrays``
  319. * ``--eliminate-dead-branches``
  320. * ``--eliminate-dead-code-aggressive``
  321. * ``--eliminate-dead-functions``
  322. * ``--eliminate-local-multi-store``
  323. * ``--eliminate-local-single-block``
  324. * ``--eliminate-local-single-store``
  325. * ``--flatten-decorations``
  326. * ``--if-conversion``
  327. * ``--inline-entry-points-exhaustive``
  328. * ``--local-redundancy-elimination``
  329. * ``--loop-fission``
  330. * ``--loop-fusion``
  331. * ``--loop-unroll``
  332. * ``--loop-unroll-partial=[<n>]``
  333. * ``--loop-peeling`` (requires ``--loop-peeling-threshold``)
  334. * ``--merge-blocks``
  335. * ``--merge-return``
  336. * ``--loop-unswitch``
  337. * ``--private-to-local``
  338. * ``--reduce-load-size``
  339. * ``--redundancy-elimination``
  340. * ``--remove-duplicates``
  341. * ``--replace-invalid-opcode``
  342. * ``--ssa-rewrite``
  343. * ``--scalar-replacement[=<n>]``
  344. * ``--simplify-instructions``
  345. * ``--vector-dce``
In addition, there are two special batch options, each of which stands for a
recommended recipe by itself:

* ``-O``: A set of passes in an appropriate order that attempt to improve the
  performance of the generated code. Same as ``spirv-opt -O``, and also the
  same as SPIR-V CodeGen's default recipe.
* ``-Os``: A set of passes in an appropriate order that attempt to reduce the
  size of the generated code. Same as ``spirv-opt -Os``.

So if you want to run loop unrolling after the default optimization
recipe, you can specify ``-Oconfig=-O,--loop-unroll``.
  355. For the whole list of accepted passes and details about each one, please see
  356. ``spirv-opt``'s help manual (``spirv-opt --help``), or the SPIRV-Tools `optimizer header file <https://github.com/KhronosGroup/SPIRV-Tools/blob/master/include/spirv-tools/optimizer.hpp>`_.
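As a rough illustration of how such a recipe string is assembled, the following
Python sketch joins a list of options into one ``-Oconfig=`` argument. The
helper name ``oconfig_flag`` is hypothetical and not part of DXC; it only does
string handling and does not check that each entry is a valid ``spirv-opt``
option.

```python
def oconfig_flag(passes):
    """Join spirv-opt style options into a single -Oconfig= argument.

    Hypothetical helper for build scripts; the order of `passes` is preserved
    because the passes are invoked in the specified order.
    """
    if not passes:
        raise ValueError("at least one pass or batch option is required")
    for option in passes:
        # A comma inside an option would be read as a list separator.
        if "," in option:
            raise ValueError(f"option must not contain a comma: {option!r}")
    return "-Oconfig=" + ",".join(passes)


# Default recipe followed by an extra loop-unrolling pass.
print(oconfig_flag(["-O", "--loop-unroll"]))
```

The resulting string can be passed to the compiler alongside the other
command-line options.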
  357. Validation
  358. ~~~~~~~~~~
  359. Validation is turned on by default as the last stage of SPIR-V CodeGen. Failing
  360. validation, which indicates there is a CodeGen bug, will trigger a fatal error.
  361. Please file an issue if you see that.
  362. Debugging
  363. ---------
  364. By default, the compiler will only emit names for types and variables as debug
  365. information, to aid reading of the generated SPIR-V. The ``-Zi`` option will
  366. let the compiler emit the following additional debug information:
  367. * Full path of the main source file using ``OpSource``
  368. * Preprocessed source code using ``OpSource`` and ``OpSourceContinued``
  369. * Line information for certain instructions using ``OpLine`` (WIP)
  370. * DXC Git commit hash using ``OpModuleProcessed`` (requires Vulkan 1.1)
  371. * DXC command-line options used to compile the shader using ``OpModuleProcessed``
  372. (requires Vulkan 1.1)
  373. We chose to embed preprocessed source code instead of original source code to
  374. avoid pulling in lots of contents unrelated to the current entry point, and
  375. boilerplate contents generated by engines. We may add a mode for selecting
  376. between preprocessed single source code and original separated source code in
  377. the future.
Note that, to keep the line numbers consistent with the embedded source, the
compiler is invoked twice: first to preprocess the source code, and then to
compile the preprocessed source code in full. Using ``-Zi`` therefore incurs
a compile-time performance penalty.
  383. If you want to have fine-grained control over the categories of emitted debug
  384. information, you can use ``-fspv-debug=``. It accepts:
  385. * ``file``: for emitting full path of the main source file
  386. * ``source``: for emitting preprocessed source code (turns on ``file`` implicitly)
  387. * ``line``: for emitting line information (turns on ``source`` implicitly)
  388. * ``tool``: for emitting DXC Git commit hash and command-line options
  389. ``-fspv-debug=`` overrules ``-Zi``. And you can provide multiple instances of
  390. ``-fspv-debug=``. For example, you can use ``-fspv-debug=file -fspv-debug=tool``
  391. to turn on emitting file path and DXC information; source code and line
  392. information will not be emitted.
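The implicit dependencies between the ``-fspv-debug=`` categories can be
modeled with a small Python sketch. The function name and the ``IMPLIES`` table
are illustrative assumptions that only restate the list above; they are not
taken from the compiler's implementation.

```python
# Which categories each -fspv-debug= value turns on implicitly.
IMPLIES = {
    "file": set(),
    "source": {"file"},
    "line": {"source"},
    "tool": set(),
}


def debug_categories(values):
    """Return the set of debug categories enabled by -fspv-debug= values."""
    enabled = set()
    pending = list(values)
    while pending:
        value = pending.pop()
        if value not in IMPLIES:
            raise ValueError(f"unknown -fspv-debug value: {value!r}")
        if value not in enabled:
            enabled.add(value)
            # Follow implicit dependencies transitively (line -> source -> file).
            pending.extend(IMPLIES[value])
    return enabled


print(sorted(debug_categories(["file", "tool"])))
print(sorted(debug_categories(["line"])))
```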
  393. Reflection
  394. ----------
  395. Making reflection easier is one of the goals of SPIR-V CodeGen. This section
  396. provides guidelines about how to reflect on certain facts.
Note that we generate ``OpName``/``OpMemberName`` instructions for various
types/variables both explicitly defined in the source code and internally created
by the compiler. These names are primarily for debugging purposes in the
compiler. They have "no semantic impact and can safely be removed" according
to the SPIR-V spec, and they are subject to change without notice. We therefore
do not recommend using them for reflection.
  403. Source code shader profile
  404. ~~~~~~~~~~~~~~~~~~~~~~~~~~
The source code shader profile version can be recovered from the "Version"
operand of the ``OpSource`` instruction. For ``*s_<major>_<minor>``, the "Version"
operand of ``OpSource`` is set to ``<major>`` * 100 + ``<minor>`` * 10.
For example, ``vs_5_1`` yields 510 and ``ps_6_2`` yields 620.
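The encoding can be checked with a short Python sketch. The function name
``opsource_version`` and the regular expression used to parse the profile
string are assumptions made for illustration; only the arithmetic comes from
the rule above.

```python
import re


def opsource_version(profile):
    """Compute the OpSource "Version" operand for a *s_<major>_<minor> profile."""
    match = re.fullmatch(r"[a-z]*s_(\d+)_(\d+)", profile)
    if match is None:
        raise ValueError(f"not a *s_<major>_<minor> profile: {profile!r}")
    major, minor = (int(group) for group in match.groups())
    return major * 100 + minor * 10


print(opsource_version("vs_5_1"))
print(opsource_version("ps_6_2"))
```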
  409. HLSL Semantic
  410. ~~~~~~~~~~~~~
HLSL semantic strings are by default not emitted into the SPIR-V binary module.
If you need them, specify ``-fspv-reflect``; the compiler will then use
the ``Op*DecorateStringGOOGLE`` instruction in the `SPV_GOOGLE_hlsl_functionality1 <https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.asciidoc>`_
extension to emit them.
  415. Counter buffers for RW/Append/Consume StructuredBuffer
  416. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The association between a counter buffer and its main RW/Append/Consume
StructuredBuffer is conveyed by the ``OpDecorateId <structured-buffer-id>
HLSLCounterBufferGOOGLE <counter-buffer-id>`` instruction from the
`SPV_GOOGLE_hlsl_functionality1 <https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/GOOGLE/SPV_GOOGLE_hlsl_functionality1.asciidoc>`_
extension. This information is missing by default; you need to specify
``-fspv-reflect`` to direct the compiler to emit it.
  423. Read-only vs. read-write resource types
  424. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  425. There are no clear and consistent decorations in the SPIR-V to show whether a
  426. resource type is translated from a read-only (RO) or read-write (RW) HLSL
  427. resource type. Instead, you need to use different checks for reflecting different
  428. resource types:
* HLSL samplers: RO.
* HLSL ``Buffer``/``RWBuffer``/``Texture*``/``RWTexture*``: Check the "Sampled"
  operand in the ``OpTypeImage`` instruction they are translated into. "2" means RW,
  "1" means RO.
* HLSL constant/texture/structured/byte buffers: Check both the ``Block``/``BufferBlock``
  and ``NonWritable`` decorations. If decorated with ``Block`` (``cbuffer`` &
  ``ConstantBuffer``), then RO; if decorated with ``BufferBlock`` and ``NonWritable``
  (``tbuffer``, ``TextureBuffer``, ``StructuredBuffer``), then RO; otherwise, RW.
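The checks above can be summarized as a decision function. The Python sketch
below assumes a reflection tool has already extracted the relevant facts from
the module; the ``kind`` labels, parameter names, and the function itself are
hypothetical and do not correspond to any existing reflection API.

```python
def is_read_only(kind, sampled=None, decorations=frozenset()):
    """Decide RO (True) vs. RW (False) for a reflected resource.

    kind:        "sampler", "image" (OpTypeImage based) or "buffer" (struct based)
    sampled:     the "Sampled" operand of OpTypeImage, for kind == "image"
    decorations: decoration names found on the buffer struct type and its members
    """
    if kind == "sampler":
        return True
    if kind == "image":
        if sampled == 1:
            return True
        if sampled == 2:
            return False
        raise ValueError(f"unexpected Sampled operand: {sampled!r}")
    if kind == "buffer":
        if "Block" in decorations:
            return True  # cbuffer / ConstantBuffer
        if "BufferBlock" in decorations:
            return "NonWritable" in decorations
        raise ValueError("buffer type has neither Block nor BufferBlock")
    raise ValueError(f"unknown resource kind: {kind!r}")


print(is_read_only("image", sampled=2))
print(is_read_only("buffer", decorations={"BufferBlock", "NonWritable"}))
```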
  437. HLSL Types
  438. ==========
  439. This section lists how various HLSL types are mapped.
  440. Normal scalar types
  441. -------------------
`Normal scalar types <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509646(v=vs.85).aspx>`_
in HLSL are relatively easy to handle and can be mapped directly to SPIR-V
type instructions:

=============================== ======================= ================== =========== =================================
HLSL                            Command Line Option     SPIR-V             Capability  Extension
=============================== ======================= ================== =========== =================================
``bool``                                                ``OpTypeBool``
``int``/``int32_t``                                     ``OpTypeInt 32 1``
``int16_t``                     ``-enable-16bit-types`` ``OpTypeInt 16 1`` ``Int16``
``uint``/``dword``/``uint32_t``                         ``OpTypeInt 32 0``
``uint16_t``                    ``-enable-16bit-types`` ``OpTypeInt 16 0`` ``Int16``
``half``                                                ``OpTypeFloat 32``
``half``/``float16_t``          ``-enable-16bit-types`` ``OpTypeFloat 16``             ``SPV_AMD_gpu_shader_half_float``
``float``/``float32_t``                                 ``OpTypeFloat 32``
``snorm float``                                         ``OpTypeFloat 32``
``unorm float``                                         ``OpTypeFloat 32``
``double``/``float64_t``                                ``OpTypeFloat 64`` ``Float64``
=============================== ======================= ================== =========== =================================

Please note that ``half`` is translated into 32-bit floating point numbers
when ``-enable-16bit-types`` is not specified, because MSDN says that "this
data type is provided only for language compatibility. Direct3D 10 shader
targets map all ``half`` data types to ``float`` data types."
  464. Minimal precision scalar types
  465. ------------------------------
  466. HLSL also supports various
  467. `minimal precision scalar types <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509646(v=vs.85).aspx>`_,
  468. which graphics drivers can implement by using any precision greater than or
  469. equal to their specified bit precision.
There are no direct mappings in SPIR-V for these types. We translate them into
the corresponding 16-bit or 32-bit scalar types with the ``RelaxedPrecision`` decoration.
We use the 16-bit variants if the ``-enable-16bit-types`` command line option is present.
For more information on these types, please refer to:
https://github.com/Microsoft/DirectXShaderCompiler/wiki/16-Bit-Scalar-Types

============== ======================= ================== ==================== ============ =================================
HLSL           Command Line Option     SPIR-V             Decoration           Capability   Extension
============== ======================= ================== ==================== ============ =================================
``min16float``                         ``OpTypeFloat 32`` ``RelaxedPrecision``
``min10float``                         ``OpTypeFloat 32`` ``RelaxedPrecision``
``min16int``                           ``OpTypeInt 32 1`` ``RelaxedPrecision``
``min12int``                           ``OpTypeInt 32 1`` ``RelaxedPrecision``
``min16uint``                          ``OpTypeInt 32 0`` ``RelaxedPrecision``
``min16float`` ``-enable-16bit-types`` ``OpTypeFloat 16``                                   ``SPV_AMD_gpu_shader_half_float``
``min10float`` ``-enable-16bit-types`` ``OpTypeFloat 16``                                   ``SPV_AMD_gpu_shader_half_float``
``min16int``   ``-enable-16bit-types`` ``OpTypeInt 16 1``                      ``Int16``
``min12int``   ``-enable-16bit-types`` ``OpTypeInt 16 1``                      ``Int16``
``min16uint``  ``-enable-16bit-types`` ``OpTypeInt 16 0``                      ``Int16``
============== ======================= ================== ==================== ============ =================================
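For tooling that needs to predict the translated type, the minimal precision
table can be expressed as a lookup. The Python sketch below is an assumption
built only from the table's rows (including the fact that the rows with
``-enable-16bit-types`` list no decoration); the names ``MIN_PRECISION`` and
``min_precision_mapping`` are hypothetical.

```python
# HLSL type -> (32-bit SPIR-V type, 16-bit SPIR-V type), as listed in the table.
MIN_PRECISION = {
    "min16float": ("OpTypeFloat 32", "OpTypeFloat 16"),
    "min10float": ("OpTypeFloat 32", "OpTypeFloat 16"),
    "min16int": ("OpTypeInt 32 1", "OpTypeInt 16 1"),
    "min12int": ("OpTypeInt 32 1", "OpTypeInt 16 1"),
    "min16uint": ("OpTypeInt 32 0", "OpTypeInt 16 0"),
}


def min_precision_mapping(hlsl_type, enable_16bit_types=False):
    """Return (SPIR-V type, decoration or None) for a minimal precision type."""
    wide, narrow = MIN_PRECISION[hlsl_type]
    if enable_16bit_types:
        # The table lists no decoration for the 16-bit variants.
        return narrow, None
    return wide, "RelaxedPrecision"


print(min_precision_mapping("min16float"))
print(min_precision_mapping("min12int", enable_16bit_types=True))
```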
  489. Vectors and matrices
  490. --------------------
`Vectors <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509707(v=vs.85).aspx>`_
and `matrices <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509623(v=vs.85).aspx>`_
are translated into:

==================================== ====================================================
HLSL                                 SPIR-V
==================================== ====================================================
``|type|N`` (``N`` > 1)              ``OpTypeVector |type| N``
``|type|1``                          The scalar type for ``|type|``
``|type|MxN`` (``M`` > 1, ``N`` > 1) ``%v = OpTypeVector |type| N`` ``OpTypeMatrix %v M``
``|type|Mx1`` (``M`` > 1)            ``OpTypeVector |type| M``
``|type|1xN`` (``N`` > 1)            ``OpTypeVector |type| N``
``|type|1x1``                        The scalar type for ``|type|``
==================================== ====================================================

The above table is for float matrices.

An MxN HLSL float matrix is translated into a SPIR-V matrix with M vectors, each with
N elements. Conceptually HLSL matrices are row-major while SPIR-V matrices are
column-major, thus all HLSL matrices are represented by their transposes.
  508. Doing so may require special handling of certain matrix operations:
  509. - **Indexing**: no special handling required. ``matrix[m][n]`` will still access
  510. the correct element since ``m``/``n`` means the ``m``-th/``n``-th row/column
  511. in HLSL but ``m``-th/``n``-th vector/element in SPIR-V.
  512. - **Per-element operation**: no special handling required.
  513. - **Matrix multiplication**: need to swap the operands. ``mat1 x mat2`` should
  514. be translated as ``transpose(mat2) x transpose(mat1)``. Then the result is
  515. ``transpose(mat1 x mat2)``.
  516. - **Storage layout**: ``row_major``/``column_major`` will be translated into
  517. SPIR-V ``ColMajor``/``RowMajor`` decoration. This is because HLSL matrix
  518. row/column becomes SPIR-V matrix column/row. If elements in a row/column are
  519. packed together, they should be loaded into a column/row correspondingly.
  520. See `Appendix A. Matrix Representation`_ for further explanation regarding these design choices.
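The operand swap for matrix multiplication relies on the identity
``transpose(mat1 x mat2) = transpose(mat2) x transpose(mat1)``. The following
Python sketch checks it on small integer matrices using plain lists; the helper
names are illustrative only and unrelated to the compiler's code.

```python
def transpose(m):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Multiply two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


mat1 = [[1, 2, 3], [4, 5, 6]]        # 2x3
mat2 = [[7, 8], [9, 10], [11, 12]]   # 3x2

# What the shader author wrote, stored the way SPIR-V stores it (transposed).
expected = transpose(matmul(mat1, mat2))
# What is computed on the transposed representations with operands swapped.
actual = matmul(transpose(mat2), transpose(mat1))

print(expected == actual)
```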
Since the ``Shader`` capability in SPIR-V does not allow matrix types to be
parameterized with non-floating-point types, a non-floating-point MxN matrix is
translated into an array with M elements, with each element being a vector with
N elements.
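The vector and matrix rules in this section can be condensed into one function.
The Python sketch below returns a descriptive string rather than real SPIR-V
assembly; the function name, its parameters, and the parenthesized notation are
assumptions made for illustration.

```python
def matrix_type(elem, rows, cols, is_float=True):
    """Describe the SPIR-V type chosen for an HLSL |elem|<rows>x<cols> matrix."""
    if rows == 1 and cols == 1:
        return elem                                  # plain scalar
    if rows == 1:
        return f"OpTypeVector {elem} {cols}"         # 1xN collapses to a vector
    if cols == 1:
        return f"OpTypeVector {elem} {rows}"         # Mx1 collapses to a vector
    row_vector = f"OpTypeVector {elem} {cols}"
    if is_float:
        return f"OpTypeMatrix ({row_vector}) {rows}"
    # Non-floating-point matrices become arrays of vectors.
    return f"OpTypeArray ({row_vector}) {rows}"


print(matrix_type("%float", 2, 3))
print(matrix_type("%int", 2, 3, is_float=False))
```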
  524. Structs
  525. -------
`Structs <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509668(v=vs.85).aspx>`_
in HLSL are defined in a format similar to C structs. They are translated
into SPIR-V ``OpTypeStruct``. Depending on the storage classes of the instances,
a single struct definition may generate multiple ``OpTypeStruct`` instructions
in SPIR-V. For example, for the following HLSL source code:

.. code:: hlsl

  struct S { ... };

  ConstantBuffer<S>   myCBuffer;
  StructuredBuffer<S> mySBuffer;

  float4 main() : A {
    S myLocalVar;
    ...
  }

Three different ``OpTypeStruct`` instructions will be generated, one for each
variable defined in the above source code. This is because the ``OpTypeStruct``
for both ``myCBuffer`` and ``mySBuffer`` will have layout decorations (``Offset``,
``MatrixStride``, ``ArrayStride``, ``RowMajor``, ``ColMajor``). However, their
layout rules are different (by default); ``myCBuffer`` will use vector-relaxed
OpenGL ``std140`` while ``mySBuffer`` will use vector-relaxed OpenGL ``std430``.
``myLocalVar`` will have its ``OpTypeStruct`` without layout decorations.
  546. Read more about storage classes in the `Constant/Texture/Structured/Byte Buffers`_
  547. section.
  548. Structs used as stage inputs/outputs will have semantics attached to their
  549. members. These semantics are handled in the `entry function wrapper`_.
Structs used as pixel shader inputs can have optional interpolation modifiers
for their members, which will be translated according to the following table:

=========================== ================= =====================
HLSL Interpolation Modifier SPIR-V Decoration SPIR-V Capability
=========================== ================= =====================
``linear``                  <none>
``centroid``                ``Centroid``
``nointerpolation``         ``Flat``
``noperspective``           ``NoPerspective``
``sample``                  ``Sample``        ``SampleRateShading``
=========================== ================= =====================
  561. Arrays
  562. ------
Sized (either explicitly or implicitly) arrays are translated into SPIR-V
``OpTypeArray``. Unsized arrays are translated into ``OpTypeRuntimeArray``.
Arrays, if used for external resources (residing in the SPIR-V ``Uniform`` or
``UniformConstant`` storage class), will need layout decorations like the SPIR-V
``ArrayStride`` decoration. For arrays of opaque types, e.g., HLSL textures
or samplers, we don't decorate with ``ArrayStride`` since there are
no meaningful strides. The same applies to arrays of structured/byte buffers.
  570. User-defined types
  571. ------------------
  572. `User-defined types <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509702(v=vs.85).aspx>`_
  573. are type aliases introduced by typedef. No new types are introduced and we can
  574. rely on Clang to resolve to the original types.
  575. Samplers
  576. --------
  577. All `sampler types <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509644(v=vs.85).aspx>`_
  578. will be translated into SPIR-V ``OpTypeSampler``.
SPIR-V ``OpTypeSampler`` is an opaque type that cannot be parameterized;
therefore state assignments on sampler types are not supported (yet).
  581. Textures
  582. --------
`Texture types <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509700(v=vs.85).aspx>`_
are translated into SPIR-V ``OpTypeImage``, with parameters:

======================= ==================== ===== =================== ========== ===== ======= == ======= ================ =================
HLSL                    Vulkan                     SPIR-V
----------------------- -------------------------- ------------------------------------------------------------------------------------------
Texture Type            Descriptor Type      RO/RW Storage Class       Dim        Depth Arrayed MS Sampled Image Format     Capability
======================= ==================== ===== =================== ========== ===== ======= == ======= ================ =================
``Texture1D``           Sampled Image        RO    ``UniformConstant`` ``1D``     2     0       0  1       ``Unknown``
``Texture2D``           Sampled Image        RO    ``UniformConstant`` ``2D``     2     0       0  1       ``Unknown``
``Texture3D``           Sampled Image        RO    ``UniformConstant`` ``3D``     2     0       0  1       ``Unknown``
``TextureCube``         Sampled Image        RO    ``UniformConstant`` ``Cube``   2     0       0  1       ``Unknown``
``Texture1DArray``      Sampled Image        RO    ``UniformConstant`` ``1D``     2     1       0  1       ``Unknown``
``Texture2DArray``      Sampled Image        RO    ``UniformConstant`` ``2D``     2     1       0  1       ``Unknown``
``Texture2DMS``         Sampled Image        RO    ``UniformConstant`` ``2D``     2     0       1  1       ``Unknown``
``Texture2DMSArray``    Sampled Image        RO    ``UniformConstant`` ``2D``     2     1       1  1       ``Unknown``      ``ImageMSArray``
``TextureCubeArray``    Sampled Image        RO    ``UniformConstant`` ``Cube``   2     1       0  1       ``Unknown``
``Buffer<T>``           Uniform Texel Buffer RO    ``UniformConstant`` ``Buffer`` 2     0       0  1       Depends on ``T`` ``SampledBuffer``
``RWBuffer<T>``         Storage Texel Buffer RW    ``UniformConstant`` ``Buffer`` 2     0       0  2       Depends on ``T`` ``SampledBuffer``
``RWTexture1D<T>``      Storage Image        RW    ``UniformConstant`` ``1D``     2     0       0  2       Depends on ``T``
``RWTexture2D<T>``      Storage Image        RW    ``UniformConstant`` ``2D``     2     0       0  2       Depends on ``T``
``RWTexture3D<T>``      Storage Image        RW    ``UniformConstant`` ``3D``     2     0       0  2       Depends on ``T``
``RWTexture1DArray<T>`` Storage Image        RW    ``UniformConstant`` ``1D``     2     1       0  2       Depends on ``T``
``RWTexture2DArray<T>`` Storage Image        RW    ``UniformConstant`` ``2D``     2     1       0  2       Depends on ``T``
======================= ==================== ===== =================== ========== ===== ======= == ======= ================ =================

The meanings of the headers in the above table are explained under ``OpTypeImage``
in the SPIR-V spec.
  609. Constant/Texture/Structured/Byte Buffers
  610. ----------------------------------------
There are several buffer types in HLSL:
  612. - ``cbuffer`` and ``ConstantBuffer``
  613. - ``tbuffer`` and ``TextureBuffer``
  614. - ``StructuredBuffer`` and ``RWStructuredBuffer``
  615. - ``AppendStructuredBuffer`` and ``ConsumeStructuredBuffer``
  616. - ``ByteAddressBuffer`` and ``RWByteAddressBuffer``
Note that ``Buffer`` and ``RWBuffer`` are considered texture objects in HLSL.
They are listed in the above section.
Please see the following sections for the details of each type. As a summary:

=========================== ================== ================================ ==================== =================
HLSL Type                   Vulkan Buffer Type Default Memory Layout Rule       SPIR-V Storage Class SPIR-V Decoration
=========================== ================== ================================ ==================== =================
``cbuffer``                 Uniform Buffer     Vector-relaxed OpenGL ``std140`` ``Uniform``          ``Block``
``ConstantBuffer``          Uniform Buffer     Vector-relaxed OpenGL ``std140`` ``Uniform``          ``Block``
``tbuffer``                 Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``TextureBuffer``           Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``StructuredBuffer``        Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``RWStructuredBuffer``      Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``AppendStructuredBuffer``  Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``ConsumeStructuredBuffer`` Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``ByteAddressBuffer``       Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
``RWByteAddressBuffer``     Storage Buffer     Vector-relaxed OpenGL ``std430`` ``Uniform``          ``BufferBlock``
=========================== ================== ================================ ==================== =================
  634. To know more about the Vulkan buffer types, please refer to the Vulkan spec
  635. `13.1 Descriptor Types <https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#descriptorsets-types>`_.
  636. Memory layout rules
  637. ~~~~~~~~~~~~~~~~~~~
  638. SPIR-V CodeGen supports three sets of memory layout rules for buffer resources
  639. right now:
  640. 1. Vector-relaxed OpenGL ``std140`` for uniform buffers and vector-relaxed
  641. OpenGL ``std430`` for storage buffers: these rules satisfy Vulkan `"Standard
  642. Uniform Buffer Layout" and "Standard Storage Buffer Layout" <https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#interfaces-resources-layout>`_,
  643. respectively.
  644. They are the default.
  645. 2. DirectX memory layout rules for uniform buffers and storage buffers:
  646. they allow packing data on the application side that can be shared with
  647. DirectX. They can be enabled by ``-fvk-use-dx-layout``.
  648. 3. Strict OpenGL ``std140`` for uniform buffers and strict OpenGL ``std430``
  649. for storage buffers: they allow packing data on the application side that
  650. can be shared with OpenGL. They can be enabled by ``-fvk-use-gl-layout``.
  651. In the above, "vector-relaxed OpenGL ``std140``/``std430``" rules mean OpenGL
  652. ``std140``/``std430`` rules with the following modification for vector type
  653. alignment:
  654. 1. The alignment of a vector type is set to be the alignment of its element type
  655. 2. If the above causes an `improper straddle <https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#interfaces-resources-layout>`_,
  656. the alignment will be set to 16 bytes.
As an example, for the following HLSL definition:

.. code:: hlsl

  struct S {
      float3 f;
  };

  struct T {
                float    a_float;
                float3   b_float3;
                S        c_S_float3;
                float2x3 d_float2x3;
      row_major float2x3 e_float2x3;
                int      f_int_3[3];
                float2   g_float2_2[2];
  };

We will have the following offsets for each member:

============== ====== ====== ====== ====== ====== ======
HLSL           Uniform Buffer       Storage Buffer
-------------- -------------------- --------------------
Member         1 (VK) 2 (DX) 3 (GL) 1 (VK) 2 (DX) 3 (GL)
============== ====== ====== ====== ====== ====== ======
``a_float``    0      0      0      0      0      0
``b_float3``   4      4      16     4      4      16
``c_S_float3`` 16     16     32     16     16     32
``d_float2x3`` 32     32     48     32     28     48
``e_float2x3`` 80     80     96     64     52     80
``f_int_3``    112    112    128    96     76     112
``g_float2_2`` 160    160    176    112    88     128
============== ====== ====== ====== ====== ====== ======
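The vector-relaxed alignment rule can be sketched in Python as follows. The
straddle test follows the Vulkan definition of an improper straddle (a vector
of at most 16 bytes crossing a 16-byte boundary, or a larger vector not
starting on one). The function names are hypothetical, and the sketch covers
only standalone vector members, not structs, arrays, or matrices.

```python
def round_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def improper_straddle(offset, size):
    """True if a vector of `size` bytes at `offset` straddles improperly."""
    if size <= 16:
        return offset // 16 != (offset + size - 1) // 16
    return offset % 16 != 0


def relaxed_vector_offset(next_free, elem_size, count):
    """Offset chosen for a vector of `count` elements under the relaxed rule."""
    # Rule 1: align to the element type.
    offset = round_up(next_free, elem_size)
    # Rule 2: fall back to 16-byte alignment on an improper straddle.
    if improper_straddle(offset, elem_size * count):
        offset = round_up(offset, 16)
    return offset


# b_float3 in struct T: a float3 placed right after a 4-byte float.
print(relaxed_vector_offset(4, 4, 3))
```

With these rules, ``b_float3`` lands at offset 4, which agrees with the "1 (VK)"
columns for that member.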
  685. If the above layout rules do not satisfy your needs and you want to manually
  686. control the layout of struct members, you can use either
  687. * The native HLSL ``:packoffset()`` attribute: only available for cbuffers; or
  688. * The Vulkan-specific ``[[vk::offset()]]`` attribute: applies to all resources.
``[[vk::offset]]`` overrules ``:packoffset``. Attaching ``[[vk::offset]]``
to a struct member affects all variables of the struct type in question. So
sharing the same struct definition having ``[[vk::offset]]`` annotations also
means sharing the layout.
  693. These attributes give great flexibility but also responsibility to the
  694. developer; the compiler will just take in what is specified in the source code
  695. and emit it to SPIR-V with no error checking.
  696. ``cbuffer`` and ``ConstantBuffer``
  697. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  698. These two buffer types are treated as uniform buffers using Vulkan's
  699. terminology. They are translated into an ``OpTypeStruct`` with the
  700. necessary layout decorations (``Offset``, ``ArrayStride``, ``MatrixStride``,
  701. ``RowMajor``, ``ColMajor``) and the ``Block`` decoration. The layout rule
  702. used is vector-relaxed OpenGL ``std140`` (by default). A variable declared as
  703. one of these types will be placed in the ``Uniform`` storage class.
For example, the following HLSL source code:

.. code:: hlsl

  struct T {
    float  a;
    float3 b;
  };

  ConstantBuffer<T> myCBuffer;

will be translated into

.. code:: spirv

  ; Layout decoration
  OpMemberDecorate %type_ConstantBuffer_T 0 Offset 0
  OpMemberDecorate %type_ConstantBuffer_T 1 Offset 4
  ; Block decoration
  OpDecorate %type_ConstantBuffer_T Block

  ; Types
  %type_ConstantBuffer_T = OpTypeStruct %float %v3float
  %_ptr_Uniform_type_ConstantBuffer_T = OpTypePointer Uniform %type_ConstantBuffer_T

  ; Variable
  %myCBuffer = OpVariable %_ptr_Uniform_type_ConstantBuffer_T Uniform
  723. ``tbuffer`` and ``TextureBuffer``
  724. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  725. These two buffer types are treated as storage buffers using Vulkan's
  726. terminology. They are translated into an ``OpTypeStruct`` with the
  727. necessary layout decorations (``Offset``, ``ArrayStride``, ``MatrixStride``,
  728. ``RowMajor``, ``ColMajor``) and the ``BufferBlock`` decoration. All the struct
  729. members are also decorated with ``NonWritable`` decoration. The layout rule
  730. used is vector-relaxed OpenGL ``std430`` (by default). A variable declared as
  731. one of these types will be placed in the ``Uniform`` storage class.
  732. ``StructuredBuffer`` and ``RWStructuredBuffer``
  733. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  734. ``StructuredBuffer<T>``/``RWStructuredBuffer<T>`` is treated as storage buffer
  735. using Vulkan's terminology. It is translated into an ``OpTypeStruct`` containing
  736. an ``OpTypeRuntimeArray`` of type ``T``, with necessary layout decorations
  737. (``Offset``, ``ArrayStride``, ``MatrixStride``, ``RowMajor``, ``ColMajor``) and
  738. the ``BufferBlock`` decoration. The default layout rule used is vector-relaxed
  739. OpenGL ``std430``. A variable declared as one of these types will be placed in
  740. the ``Uniform`` storage class.
  741. For ``RWStructuredBuffer<T>``, each variable will have an associated counter
  742. variable generated. The counter variable will be of ``OpTypeStruct`` type, which
  743. only contains a 32-bit integer. The counter variable takes its own binding
  744. number. ``.IncrementCounter()``/``.DecrementCounter()`` will modify this counter
  745. variable.
For example, the following HLSL source code:

.. code:: hlsl

  struct T {
    float  a;
    float3 b;
  };

  StructuredBuffer<T> mySBuffer;

will be translated into

.. code:: spirv

  ; Layout decoration
  OpMemberDecorate %T 0 Offset 0
  OpMemberDecorate %T 1 Offset 4
  OpDecorate %_runtimearr_T ArrayStride 16
  OpMemberDecorate %type_StructuredBuffer_T 0 Offset 0
  OpMemberDecorate %type_StructuredBuffer_T 0 NonWritable
  ; BufferBlock decoration
  OpDecorate %type_StructuredBuffer_T BufferBlock

  ; Types
  %T = OpTypeStruct %float %v3float
  %_runtimearr_T = OpTypeRuntimeArray %T
  %type_StructuredBuffer_T = OpTypeStruct %_runtimearr_T
  %_ptr_Uniform_type_StructuredBuffer_T = OpTypePointer Uniform %type_StructuredBuffer_T

  ; Variable
  %mySBuffer = OpVariable %_ptr_Uniform_type_StructuredBuffer_T Uniform
  770. ``AppendStructuredBuffer`` and ``ConsumeStructuredBuffer``
  771. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  772. ``AppendStructuredBuffer<T>``/``ConsumeStructuredBuffer<T>`` is treated as
  773. storage buffer using Vulkan's terminology. It is translated into an
  774. ``OpTypeStruct`` containing an ``OpTypeRuntimeArray`` of type ``T``, with
  775. necessary layout decorations (``Offset``, ``ArrayStride``, ``MatrixStride``,
  776. ``RowMajor``, ``ColMajor``) and the ``BufferBlock`` decoration. The default
  777. layout rule used is vector-relaxed OpenGL ``std430``.
  778. A variable declared as one of these types will be placed in the ``Uniform``
  779. storage class. Besides, each variable will have an associated counter variable
  780. generated. The counter variable will be of ``OpTypeStruct`` type, which only
  781. contains a 32-bit integer. The integer is the total number of elements in the
  782. buffer. The counter variable takes its own binding number.
  783. ``.Append()``/``.Consume()`` will use the counter variable as the index and
  784. adjust it accordingly.
For example, the following HLSL source code:

.. code:: hlsl

  struct T {
    float  a;
    float3 b;
  };

  AppendStructuredBuffer<T> myASbuffer;

will be translated into

.. code:: spirv

  ; Layout decorations
  OpMemberDecorate %T 0 Offset 0
  OpMemberDecorate %T 1 Offset 4
  OpDecorate %_runtimearr_T ArrayStride 16
  OpMemberDecorate %type_AppendStructuredBuffer_T 0 Offset 0
  OpDecorate %type_AppendStructuredBuffer_T BufferBlock
  OpMemberDecorate %type_ACSBuffer_counter 0 Offset 0
  OpDecorate %type_ACSBuffer_counter BufferBlock

  ; Binding numbers
  OpDecorate %myASbuffer DescriptorSet 0
  OpDecorate %myASbuffer Binding 0
  OpDecorate %counter_var_myASbuffer DescriptorSet 0
  OpDecorate %counter_var_myASbuffer Binding 1

  ; Types
  %T = OpTypeStruct %float %v3float
  %_runtimearr_T = OpTypeRuntimeArray %T
  %type_AppendStructuredBuffer_T = OpTypeStruct %_runtimearr_T
  %_ptr_Uniform_type_AppendStructuredBuffer_T = OpTypePointer Uniform %type_AppendStructuredBuffer_T
  %type_ACSBuffer_counter = OpTypeStruct %int
  %_ptr_Uniform_type_ACSBuffer_counter = OpTypePointer Uniform %type_ACSBuffer_counter

  ; Variables
  %myASbuffer = OpVariable %_ptr_Uniform_type_AppendStructuredBuffer_T Uniform
  %counter_var_myASbuffer = OpVariable %_ptr_Uniform_type_ACSBuffer_counter Uniform
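To make the role of the counter variable concrete, here is a simplified Python
model of how ``.Append()`` and ``.Consume()`` use it as an index. It is an
assumption-laden sketch: the class is hypothetical, a single object stands in
for both buffer kinds, and the atomicity of the real counter updates is
ignored.

```python
class AppendConsumeModel:
    """Toy model of a structured buffer plus its associated counter."""

    def __init__(self, capacity):
        self.data = [None] * capacity  # stands in for the runtime array
        self.counter = 0               # stands in for the 32-bit counter

    def append(self, value):
        # Use the current counter value as the index, then increment it.
        index = self.counter
        self.counter += 1
        self.data[index] = value

    def consume(self):
        # Decrement the counter, then read the element it now points to.
        self.counter -= 1
        return self.data[self.counter]


buf = AppendConsumeModel(capacity=4)
buf.append("a")
buf.append("b")
print(buf.counter, buf.consume(), buf.counter)
```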
  817. ``ByteAddressBuffer`` and ``RWByteAddressBuffer``
  818. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  819. ``ByteAddressBuffer``/``RWByteAddressBuffer`` is treated as storage buffer using
  820. Vulkan's terminology. It is translated into an ``OpTypeStruct`` containing an
  821. ``OpTypeRuntimeArray`` of 32-bit unsigned integers, with ``BufferBlock``
  822. decoration.
  823. A variable declared as one of these types will be placed in the ``Uniform``
  824. storage class.
For example, the following HLSL source code:

.. code:: hlsl

  ByteAddressBuffer   myBuffer1;
  RWByteAddressBuffer myBuffer2;

will be translated into

.. code:: spirv

  ; Layout decorations
  OpDecorate %_runtimearr_uint ArrayStride 4

  OpDecorate %type_ByteAddressBuffer BufferBlock
  OpMemberDecorate %type_ByteAddressBuffer 0 Offset 0
  OpMemberDecorate %type_ByteAddressBuffer 0 NonWritable

  OpDecorate %type_RWByteAddressBuffer BufferBlock
  OpMemberDecorate %type_RWByteAddressBuffer 0 Offset 0

  ; Types
  %_runtimearr_uint = OpTypeRuntimeArray %uint
  %type_ByteAddressBuffer = OpTypeStruct %_runtimearr_uint
  %_ptr_Uniform_type_ByteAddressBuffer = OpTypePointer Uniform %type_ByteAddressBuffer
  %type_RWByteAddressBuffer = OpTypeStruct %_runtimearr_uint
  %_ptr_Uniform_type_RWByteAddressBuffer = OpTypePointer Uniform %type_RWByteAddressBuffer

  ; Variables
  %myBuffer1 = OpVariable %_ptr_Uniform_type_ByteAddressBuffer Uniform
  %myBuffer2 = OpVariable %_ptr_Uniform_type_RWByteAddressBuffer Uniform

HLSL Variables and Resources
============================

This section lists how various HLSL variables and resources are mapped.

According to `Shader Constants <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509581(v=vs.85).aspx>`_,

  There are two default constant buffers available, $Global and $Param. Variables
  that are placed in the global scope are added implicitly to the $Global cbuffer,
  using the same packing method that is used for cbuffers. Uniform parameters in
  the parameter list of a function appear in the $Param constant buffer when a
  shader is compiled outside of the effects framework.

So all global externally-visible non-resource-type stand-alone variables will
be collected into a cbuffer named ``$Globals``, regardless of whether they are
statically referenced by the entry point. The ``$Globals`` cbuffer
follows the same layout rules as a normal cbuffer.
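
As a minimal sketch of the rule above, the following declarations show which
variables would be expected to land in ``$Globals``. All names are made up for
illustration, and the comments only restate the rule; they are not compiler
output.

.. code:: hlsl

  // Hypothetical names; comments restate the rule described above.
  float4       gTint;          // stand-alone global: collected into $Globals
  float        gUnused;        // never referenced, but still collected into $Globals
  Texture2D    gTexture;       // resource type: not part of $Globals
  static float sScale = 2.0;   // not externally visible: not part of $Globals

  float4 main(float4 color : COLOR) : SV_Target {
    return color * gTint * sScale;
  }

Here ``gTint`` and ``gUnused`` would share ``$Globals`` and be laid out with the
usual cbuffer rules.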

Storage class
-------------

Normal local variables (without any modifier) will be placed in the ``Function``
SPIR-V storage class. Normal global variables (without any modifier) will be
placed in the ``Uniform`` or ``UniformConstant`` storage class.

- ``static``

  - Global variables with the ``static`` modifier will be placed in the ``Private``
    SPIR-V storage class. Initializers of such global variables will be translated
    into SPIR-V ``OpVariable`` initializers if possible; otherwise, they will be
    initialized at the very beginning of the `entry function wrapper`_ using
    SPIR-V ``OpStore``.
  - Local variables with the ``static`` modifier will also be placed in the
    ``Private`` SPIR-V storage class. Initializers of such local variables will
    also be translated into SPIR-V ``OpVariable`` initializers if possible;
    otherwise, they will be initialized at the very beginning of the enclosing
    function. To make sure that such a local variable is only initialized once,
    a second boolean variable of the ``Private`` SPIR-V storage class will be
    generated to mark its initialization status.

- ``groupshared``

  - Global variables with the ``groupshared`` modifier will be placed in the
    ``Workgroup`` storage class.
  - Note that this modifier overrules ``static``; if both ``groupshared`` and
    ``static`` are applied to a variable, ``static`` will be ignored.

- ``uniform``

  - This does not affect codegen. Variables will be treated like normal global
    variables.

- ``extern``

  - This does not affect codegen. Variables will be treated like normal global
    variables.

- ``shared``

  - This is a hint to the compiler. It will be ignored.

- ``volatile``

  - This is a hint to the compiler. It will be ignored.
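
The modifiers listed above can be seen side by side in a small compute shader.
This is only an illustrative sketch: every identifier is hypothetical, and each
comment restates the storage class given in the list rather than quoting
generated SPIR-V.

.. code:: hlsl

  // Hypothetical names; each comment restates the storage class listed above.
  static float            gScale = 2.0;  // Private, with an OpVariable initializer
  groupshared float       gTile[64];     // Workgroup
  static groupshared uint gCount;        // Workgroup: static is ignored

  uint nextId() {
    static uint counter = 0;             // Private, initialized only once
    return counter++;
  }

  [numthreads(64, 1, 1)]
  void main(uint gi : SV_GroupIndex) {
    float local = gScale;                // Function
    gTile[gi] = local * nextId();
    InterlockedAdd(gCount, 1);
  }

A function-local ``static`` such as ``counter`` keeps its value across calls
within one invocation, which is why it cannot live in the ``Function`` storage
class.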

HLSL semantic and Vulkan ``Location``
-------------------------------------

Direct3D uses HLSL "`semantics <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509647(v=vs.85).aspx>`_"
to compose and match the interfaces between subsequent stages. These semantic
strings can appear after struct members, function parameters and return
values. E.g.,

.. code:: hlsl

  struct VSInput {
    float4 pos  : POSITION;
    float3 norm : NORMAL;
  };

  float4 VSMain(in  VSInput input,
                in  float4  tex   : TEXCOORD,
                out float4  pos   : SV_Position) : TEXCOORD {
    pos = input.pos;
    return tex;
  }

In contrast, Vulkan stage input and output interface matching is done via
explicit ``Location`` numbers. Details can be found `here <https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#interfaces-iointerfaces>`_.

To translate HLSL to SPIR-V for Vulkan, semantic strings need to be mapped to
Vulkan ``Location`` numbers properly. This can be done either explicitly via
information provided by the developer or implicitly by the compiler.

Explicit ``Location`` number assignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``[[vk::location(X)]]`` can be attached to the entities where semantics are
allowed (struct fields, function parameters, and function returns).
For the above example we can have:

.. code:: hlsl

  struct VSInput {
    [[vk::location(0)]] float4 pos  : POSITION;
    [[vk::location(1)]] float3 norm : NORMAL;
  };

  [[vk::location(1)]]
  float4 VSMain(in  VSInput input,
                [[vk::location(2)]]
                in  float4  tex   : TEXCOORD,
                out float4  pos   : SV_Position) : TEXCOORD {
    pos = input.pos;
    return tex;
  }

In the above, input ``POSITION``, ``NORMAL``, and ``TEXCOORD`` will be mapped to
``Location`` 0, 1, and 2, respectively, and output ``TEXCOORD`` will be mapped
to ``Location`` 1.

[TODO] Another explicit way: using command-line options

Please note that the compiler prohibits mixing the explicit and implicit
approaches for the same SigPoint to avoid complexity and fallibility. However,
within a given shader stage, it is permitted for one SigPoint to use the
explicit approach while another uses the implicit approach.
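
The permitted combination can be sketched as follows. This is a hypothetical
vertex shader (all names and ``Location`` values are illustrative) in which
every stage input carries an explicit ``Location`` while every stage output is
left to the compiler.

.. code:: hlsl

  // Hypothetical shader: explicit Locations on the inputs (VSIn SigPoint),
  // implicit Locations on the outputs (VSOut SigPoint).
  struct VSInput {
    [[vk::location(3)]] float4 pos : POSITION;
    [[vk::location(5)]] float2 uv  : TEXCOORD0;
  };

  struct VSOutput {
    float4 pos   : SV_Position;  // translated into a BuiltIn, not a Location
    float2 uv    : TEXCOORD0;    // Location left to the compiler
    float4 color : COLOR0;       // Location left to the compiler
  };

  VSOutput main(VSInput input) {
    VSOutput output;
    output.pos   = input.pos;
    output.uv    = input.uv;
    output.color = float4(1.0, 1.0, 1.0, 1.0);
    return output;
  }

Attaching ``[[vk::location(X)]]`` to only one of the two ``VSInput`` members
would mix both approaches within a single SigPoint, which is the case described
above as prohibited.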

Implicit ``Location`` number assignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Without hints from the developer, the compiler will try its best to map
semantics to ``Location`` numbers. However, there is no single rule for this
mapping; semantic strings should be handled case by case.

Firstly, under certain `SigPoints <https://github.com/Microsoft/DirectXShaderCompiler/blob/master/docs/DXIL.rst#hlsl-signatures-and-semantics>`_,
some system-value (SV) semantic strings will be translated into SPIR-V
``BuiltIn`` decorations:

.. table:: Mapping from HLSL SV semantic to SPIR-V builtin and execution mode

  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | HLSL Semantic             | SigPoint    | SPIR-V ``BuiltIn``       | SPIR-V Execution Mode | SPIR-V Capability           |
  +===========================+=============+==========================+=======================+=============================+
  |                           | VSOut       | ``Position``             | N/A                   | ``Shader``                  |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSCPIn      | ``Position``             | N/A                   | ``Shader``                  |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSCPOut     | ``Position``             | N/A                   | ``Shader``                  |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSCPIn      | ``Position``             | N/A                   | ``Shader``                  |
  | SV_Position               +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSOut       | ``Position``             | N/A                   | ``Shader``                  |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSVIn       | ``Position``             | N/A                   | ``Shader``                  |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSOut       | ``Position``             | N/A                   | ``Shader``                  |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``FragCoord``            | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | VSOut       | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSCPIn      | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSCPOut     | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSCPIn      | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  | SV_ClipDistance           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSOut       | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSVIn       | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSOut       | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``ClipDistance``         | N/A                   | ``ClipDistance``            |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | VSOut       | ``CullDistance``         | N/A                   | ``CullDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSCPIn      | ``CullDistance``         | N/A                   | ``CullDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSCPOut     | ``CullDistance``         | N/A                   | ``CullDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSCPIn      | ``CullDistance``         | N/A                   | ``CullDistance``            |
  | SV_CullDistance           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSOut       | ``CullDistance``         | N/A                   | ``CullDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSVIn       | ``CullDistance``         | N/A                   | ``CullDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSOut       | ``CullDistance``         | N/A                   | ``CullDistance``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``CullDistance``         | N/A                   | ``CullDistance``            |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_VertexID               | VSIn        | ``VertexIndex``          | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_InstanceID             | VSIn        | ``InstanceIndex``        | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_Depth                  | PSOut       | ``FragDepth``            | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_DepthGreaterEqual      | PSOut       | ``FragDepth``            | ``DepthGreater``      | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_DepthLessEqual         | PSOut       | ``FragDepth``            | ``DepthLess``         | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_IsFrontFace            | PSIn        | ``FrontFacing``          | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_DispatchThreadID       | CSIn        | ``GlobalInvocationId``   | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_GroupID                | CSIn        | ``WorkgroupId``          | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_GroupThreadID          | CSIn        | ``LocalInvocationId``    | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_GroupIndex             | CSIn        | ``LocalInvocationIndex`` | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_OutputControlPointID   | HSIn        | ``InvocationId``         | N/A                   | ``Tessellation``            |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_GSInstanceID           | GSIn        | ``InvocationId``         | N/A                   | ``Geometry``                |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_DomainLocation         | DSIn        | ``TessCoord``            | N/A                   | ``Tessellation``            |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSIn        | ``PrimitiveId``          | N/A                   | ``Tessellation``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PCIn        | ``PrimitiveId``          | N/A                   | ``Tessellation``            |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSIn        | ``PrimitiveId``          | N/A                   | ``Tessellation``            |
  | SV_PrimitiveID            +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSIn        | ``PrimitiveId``          | N/A                   | ``Geometry``                |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSOut       | ``PrimitiveId``          | N/A                   | ``Geometry``                |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``PrimitiveId``          | N/A                   | ``Geometry``                |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PCOut       | ``TessLevelOuter``       | N/A                   | ``Tessellation``            |
  | SV_TessFactor             +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSIn        | ``TessLevelOuter``       | N/A                   | ``Tessellation``            |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PCOut       | ``TessLevelInner``       | N/A                   | ``Tessellation``            |
  | SV_InsideTessFactor       +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | DSIn        | ``TessLevelInner``       | N/A                   | ``Tessellation``            |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_SampleIndex            | PSIn        | ``SampleId``             | N/A                   | ``SampleRateShading``       |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_StencilRef             | PSOut       | ``FragStencilRefEXT``    | N/A                   | ``StencilExportEXT``        |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_Barycentrics           | PSIn        | ``BaryCoord*AMD``        | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSOut       | ``Layer``                | N/A                   | ``Geometry``                |
  | SV_RenderTargetArrayIndex +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``Layer``                | N/A                   | ``Geometry``                |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSOut       | ``ViewportIndex``        | N/A                   | ``MultiViewport``           |
  | SV_ViewportArrayIndex     +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``ViewportIndex``        | N/A                   | ``MultiViewport``           |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``SampleMask``           | N/A                   | ``Shader``                  |
  | SV_Coverage               +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSOut       | ``SampleMask``           | N/A                   | ``Shader``                  |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  | SV_InnerCoverage          | PSIn        | ``FullyCoveredEXT``      | N/A                   | ``FragmentFullyCoveredEXT`` |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+
  |                           | VSIn        | ``ViewIndex``            | N/A                   | ``MultiView``               |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | HSIn        | ``ViewIndex``            | N/A                   | ``MultiView``               |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  | SV_ViewID                 | DSIn        | ``ViewIndex``            | N/A                   | ``MultiView``               |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | GSIn        | ``ViewIndex``            | N/A                   | ``MultiView``               |
  |                           +-------------+--------------------------+-----------------------+-----------------------------+
  |                           | PSIn        | ``ViewIndex``            | N/A                   | ``MultiView``               |
  +---------------------------+-------------+--------------------------+-----------------------+-----------------------------+

For entities (function parameters, function return values, struct fields) with
the above SV semantic strings attached, SPIR-V variables of the
``Input``/``Output`` storage class will be created. They will have the
corresponding SPIR-V ``BuiltIn`` decorations according to the above table.

SV semantic strings not translated into SPIR-V ``BuiltIn`` decorations will be
handled in the same way as non-SV (arbitrary) semantic strings: a SPIR-V
variable of the ``Input``/``Output`` storage class will be created for each
entity with such a semantic string. The compiler then sorts all semantic strings
in declaration order (the default, or if ``-fvk-stage-io-order=decl`` is given)
or alphabetical order (if ``-fvk-stage-io-order=alpha`` is given), and assigns
``Location`` numbers sequentially to the corresponding SPIR-V variables. Note
that this means flattening all structs if structs are used as function
parameters or returns.

There is an exception to the above rule for ``SV_Target[N]``. It will always be
mapped to ``Location`` number ``N``.
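
To make the two orderings concrete, here is a small sketch of a pixel shader
interface. The struct and member names are hypothetical, and the ``Location``
values in the comments are worked out by hand from the rule above, assuming
numbering starts at 0 and each member occupies a single ``Location``.

.. code:: hlsl

  // Hypothetical pixel shader interface; comments show the Location each
  // member would be expected to receive under the two orderings.
  struct PSInput {
    float4 pos    : SV_Position;  // BuiltIn FragCoord, no Location
    float2 uv     : TEXCOORD0;    // decl: 0    alpha: 2
    float3 normal : NORMAL;       // decl: 1    alpha: 1
    float4 color  : COLOR;        // decl: 2    alpha: 0
  };

  struct PSOutput {
    float4 albedo : SV_Target0;   // always Location 0
    float4 extra  : SV_Target2;   // always Location 2
  };

  PSOutput main(PSInput input) {
    PSOutput output;
    output.albedo = input.color * float4(input.normal, 1.0);
    output.extra  = float4(input.uv, 0.0, 1.0);
    return output;
  }

Because the alphabetical order depends only on the semantic strings, it stays
stable when struct members are reordered, whereas the declaration order does
not.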

``ClipDistance & CullDistance``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Variables decorated with ``SV_ClipDistanceX`` can be of float or vector of float
type. To map them into one float array in the struct, we first sort them
in ascending order of ``X``, and then concatenate them tightly. For example,

.. code:: hlsl

  struct T {
    float clip0: SV_ClipDistance0;
  };

  struct S {
    float3 clip5: SV_ClipDistance5;
    ...
  };

  void main(T t, S s, float2 clip2 : SV_ClipDistance2) { ... }

Then we have a float array of size (1 + 2 + 3 =) 6 for ``ClipDistance``, with
``clip0`` at offset 0, ``clip2`` at offset 1, ``clip5`` at offset 3.

Decorating a variable or struct member with the ``ClipDistance`` builtin but not
requiring the ``ClipDistance`` capability is legal as long as we don't read or
write the variable or struct member. But because of the way we handle the
`shader entry function`_, this condition is not satisfied: we need to read their
contents to prepare for the source code entry function call, or write them back
after the call. So annotating a variable or struct member with
``SV_ClipDistanceX`` means requiring the ``ClipDistance`` capability in the
generated SPIR-V.

Variables decorated with ``SV_CullDistanceX`` are mapped in the same way.

HLSL register and Vulkan binding
--------------------------------

In shaders for DirectX, resources are accessed via registers, while in shaders
for Vulkan, they are accessed via descriptor set and binding numbers. The
developer can explicitly annotate variables in HLSL to specify descriptor set
and binding numbers, or leave it to the compiler to derive them implicitly from
registers.

Explicit binding number assignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``[[vk::binding(X[, Y])]]`` can be attached to global variables to specify the
descriptor set as ``Y`` and binding number as ``X``. The descriptor set number
is optional; if missing, it will be zero. RW/append/consume structured buffers
have associated counters, which will occupy their own Vulkan descriptors.
``[[vk::counter_binding(Z)]]`` can be attached to an RW/append/consume
structured buffer to set the binding number of the associated counter to ``Z``.
Note that the set number of the counter is always the same as the main buffer.
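
The attributes described above might be used as in the following sketch. The
struct ``S`` and all variable names are hypothetical; the comments restate the
set and binding numbers implied by the rules in this section.

.. code:: hlsl

  // Hypothetical declarations; S and all variable names are illustrative.
  struct S { float4 value; };

  [[vk::binding(2, 1)]]              // binding #2 in descriptor set #1
  Texture2D<float4> albedo;

  [[vk::binding(4)]]                 // binding #4 in descriptor set #0 (set omitted)
  SamplerState linearSampler;

  [[vk::binding(0, 1)]]              // buffer: binding #0 in set #1
  [[vk::counter_binding(1)]]         // counter: binding #1, same set #1
  AppendStructuredBuffer<S> results;

The counter of ``results`` takes a descriptor of its own, so its binding number
must not clash with another resource in the same set.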

Implicit binding number assignment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Without explicit annotations, the compiler will try to deduce descriptor sets
and binding numbers in the following way:

If there is ``:register(xX, spaceY)`` specified for the given global variable,
the corresponding resource will be assigned to descriptor set ``Y`` and binding
number ``X``, regardless of the register type ``x``. Note that this will cause
binding number collisions if, say, two resources are of different register
types but have the same register number. To solve this problem, four
command-line options, ``-fvk-b-shift N M``, ``-fvk-s-shift N M``,
``-fvk-t-shift N M``, and ``-fvk-u-shift N M``, are provided to shift by ``N``
all binding numbers inferred for register type ``b``, ``s``, ``t``, and ``u`` in
space ``M``, respectively.

If there is no register specification, the corresponding resource will be
assigned to the next available binding number, starting from 0, in descriptor
set #0.
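
The collision mentioned above can be illustrated with two hypothetical
declarations. The binding numbers in the comments are derived by hand from the
rule in this section, not taken from compiler output.

.. code:: hlsl

  // Hypothetical declarations. Without any shift, both resources are deduced
  // as binding #0 in set #0, because the register type is ignored.
  Texture2D<float4> colorMap     : register(t0);  // set #0, binding #0
  SamplerState      colorSampler : register(s0);  // set #0, binding #0 (collision)

Compiling with ``-fvk-s-shift 10 0`` would, by the rule above, move
``colorSampler`` to binding #10 in set #0 while leaving ``colorMap`` at
binding #0.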

Summary
~~~~~~~

In summary, the compiler essentially assigns binding numbers in three passes.

- First, it handles all declarations with an explicit
  ``[[vk::binding(X[, Y])]]`` annotation.
- Then, the compiler processes all remaining declarations with a
  ``:register(xX, spaceY)`` annotation, applying the shift passed in via the
  command-line option ``-fvk-{b|s|t|u}-shift N M``, if provided.
- Finally, the compiler assigns the next available binding numbers to the rest
  in declaration order.

As an example, for the following code:

.. code:: hlsl

  struct S { ... };

  ConstantBuffer<S> cbuffer1 : register(b0);
  Texture2D<float4> texture1 : register(t0);
  Texture2D<float4> texture2 : register(t1, space1);
  SamplerState      sampler1;
  [[vk::binding(3)]]
  RWBuffer<float4> rwbuffer1 : register(u5, space2);

If we compile with ``-fvk-t-shift 10 0 -fvk-t-shift 20 1``:

- ``rwbuffer1`` will take binding #3 in set #0, since explicit binding
  assignment has precedence over the rest.
- ``cbuffer1`` will take binding #0 in set #0, since that is what is deduced
  from the register assignment, and there is no shift requested from the
  command line.
- ``texture1`` will take binding #10 in set #0, and ``texture2`` will take
  binding #21 in set #1, since we requested a shift of 10 on t-type registers
  in space 0 and a shift of 20 on t-type registers in space 1.
- ``sampler1`` will take binding #1 in set #0, since that is the next available
  binding number in set #0.

HLSL Expressions
================

Unless explicitly noted, matrix per-element operations will be conducted on
each component vector and then collected into the result matrix. The following
sections list the SPIR-V opcodes for scalars and vectors.

Arithmetic operators
--------------------

`Arithmetic operators <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509631(v=vs.85).aspx#Additive_and_Multiplicative_Operators>`_
(``+``, ``-``, ``*``, ``/``, ``%``) are translated into their corresponding
SPIR-V opcodes according to the following table.

+-------+-----------------------------+-------------------------------+--------------------+
|       | (Vector of) Signed Integers | (Vector of) Unsigned Integers | (Vector of) Floats |
+=======+=============================+===============================+====================+
| ``+`` |                         ``OpIAdd``                          | ``OpFAdd``         |
+-------+-------------------------------------------------------------+--------------------+
| ``-`` |                         ``OpISub``                          | ``OpFSub``         |
+-------+-------------------------------------------------------------+--------------------+
| ``*`` |                         ``OpIMul``                          | ``OpFMul``         |
+-------+-----------------------------+-------------------------------+--------------------+
| ``/`` | ``OpSDiv``                  | ``OpUDiv``                    | ``OpFDiv``         |
+-------+-----------------------------+-------------------------------+--------------------+
| ``%`` | ``OpSRem``                  | ``OpUMod``                    | ``OpFRem``         |
+-------+-----------------------------+-------------------------------+--------------------+

Note that for the modulo operation, SPIR-V has two sets of instructions:
``Op*Rem`` and ``Op*Mod``. For ``Op*Rem``, the sign of a non-0 result comes from
the first operand, while for ``Op*Mod``, the sign of a non-0 result comes from
the second operand. The HLSL documentation does not mandate which set of
instructions modulo operations should be translated into; it only says "the %
operator is defined only in cases where either both sides are positive or both
sides are negative." So technically it's undefined behavior to use the modulo
operation with operands of different signs. But considering HLSL's C heritage
and the behavior of the Clang frontend, we translate modulo operators into
``Op*Rem``, except for unsigned integers, which use ``OpUMod`` because there is
no ``OpURem``.

For multiplications of float vectors and float scalars, the dedicated SPIR-V
operation ``OpVectorTimesScalar`` will be used. Similarly, for multiplications
of float matrices and float scalars, ``OpMatrixTimesScalar`` will be generated.
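
As an illustrative sketch, the helper below exercises the cases discussed in
this section. The function and parameter names are hypothetical, and each
comment names the opcode expected from the table and notes above rather than
quoting generated SPIR-V.

.. code:: hlsl

  // Hypothetical helper; comments name the opcode expected from this section.
  float3 arithmeticSketch(float3 v, float3x3 m, float s,
                          int a, int b, uint c, uint d) {
    int      r  = a % b;   // OpSRem: a non-zero result takes the sign of a
    uint     q  = c % d;   // OpUMod
    float3   sv = v * s;   // OpVectorTimesScalar
    float3x3 sm = m * s;   // OpMatrixTimesScalar
    return sv + sm[0] + float3(r, q, 0.0);  // OpFAdd
  }

With ``a = -7`` and ``b = 3``, ``OpSRem`` semantics give ``-1``, whereas
``OpSMod`` would give ``2``; code that relies on either result is outside what
the HLSL documentation defines.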

Bitwise operators
-----------------

`Bitwise operators <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509631(v=vs.85).aspx#Bitwise_Operators>`_
(``~``, ``&``, ``|``, ``^``, ``<<``, ``>>``) are translated into their
corresponding SPIR-V opcodes according to the following table.

+--------+-----------------------------+-------------------------------+
|        | (Vector of) Signed Integers | (Vector of) Unsigned Integers |
+========+=============================+===============================+
| ``~``  |                          ``OpNot``                          |
+--------+-------------------------------------------------------------+
| ``&``  |                      ``OpBitwiseAnd``                       |
+--------+-------------------------------------------------------------+
| ``|``  |                       ``OpBitwiseOr``                       |
+--------+-----------------------------+-------------------------------+
| ``^``  |                      ``OpBitwiseXor``                       |
+--------+-----------------------------+-------------------------------+
| ``<<`` |                   ``OpShiftLeftLogical``                    |
+--------+-----------------------------+-------------------------------+
| ``>>`` | ``OpShiftRightArithmetic``  | ``OpShiftRightLogical``       |
+--------+-----------------------------+-------------------------------+

Note that for ``<<``/``>>``, the right-hand side will be culled: it is masked
with ``n - 1``, so only its ``log2(n)`` least significant bits are considered,
where ``n`` is the bitwidth of the left-hand side.
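
The culling of the shift amount can be pictured with the sketch below. It
assumes that, for a 32-bit left-hand side, the shift amount is ANDed with 31
before the shift; the function name is hypothetical and the comments describe
the expected behavior under that assumption.

.. code:: hlsl

  // Hypothetical helper: for a 32-bit left-hand side, the shift amount is
  // assumed to be ANDed with 31 before the shift is performed.
  uint shiftSketch(uint x, uint amount) {
    uint implicitMask = x << amount;         // OpShiftLeftLogical on the culled amount
    uint explicitMask = x << (amount & 31);  // the same culling written out by hand
    return implicitMask ^ explicitMask;      // expected to be 0 under the assumption
  }

Under this assumption, ``x << 33`` behaves like ``x << 1`` for a 32-bit ``x``.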

Comparison operators
--------------------

`Comparison operators <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509631(v=vs.85).aspx#Comparison_Operators>`_
(``<``, ``<=``, ``>``, ``>=``, ``==``, ``!=``) are translated into their
corresponding SPIR-V opcodes according to the following table.

+--------+-----------------------------+-------------------------------+------------------------------+
|        | (Vector of) Signed Integers | (Vector of) Unsigned Integers | (Vector of) Floats           |
+========+=============================+===============================+==============================+
| ``<``  | ``OpSLessThan``             | ``OpULessThan``               | ``OpFOrdLessThan``           |
+--------+-----------------------------+-------------------------------+------------------------------+
| ``<=`` | ``OpSLessThanEqual``        | ``OpULessThanEqual``          | ``OpFOrdLessThanEqual``      |
+--------+-----------------------------+-------------------------------+------------------------------+
| ``>``  | ``OpSGreaterThan``          | ``OpUGreaterThan``            | ``OpFOrdGreaterThan``        |
+--------+-----------------------------+-------------------------------+------------------------------+
| ``>=`` | ``OpSGreaterThanEqual``     | ``OpUGreaterThanEqual``       | ``OpFOrdGreaterThanEqual``   |
+--------+-----------------------------+-------------------------------+------------------------------+
| ``==`` |                        ``OpIEqual``                         | ``OpFOrdEqual``              |
+--------+-------------------------------------------------------------+------------------------------+
| ``!=`` |                       ``OpINotEqual``                       | ``OpFOrdNotEqual``           |
+--------+-------------------------------------------------------------+------------------------------+

Note that for comparison of (vectors of) floats, SPIR-V has two sets of
instructions: ``OpFOrd*``, ``OpFUnord*``. We translate into ``OpFOrd*`` ones.

Boolean math operators
----------------------

`Boolean math operators <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509631(v=vs.85).aspx#Boolean_Math_Operators>`_
(``&&``, ``||``, ``?:``) are translated into their corresponding SPIR-V opcodes
according to the following table.

+--------+----------------------+
|        | (Vector of) Booleans |
+========+======================+
| ``&&`` | ``OpLogicalAnd``     |
+--------+----------------------+
| ``||`` | ``OpLogicalOr``      |
+--------+----------------------+
| ``?:`` | ``OpSelect``         |
+--------+----------------------+

Please note that "unlike short-circuit evaluation of ``&&``, ``||``, and ``?:``
in C, HLSL expressions never short-circuit an evaluation because they are vector
operations. All sides of the expression are always evaluated."
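
A practical consequence of the quoted behavior is sketched below. The buffer
and function names are hypothetical; the comments describe what the quoted
documentation implies, not measured behavior.

.. code:: hlsl

  // Hypothetical resource and helpers illustrating the lack of short-circuiting.
  StructuredBuffer<float> values;

  bool isPositiveAt(uint i, uint count) {
    // Both operands are evaluated, so values[i] is read even when i >= count.
    return (i < count) && (values[i] > 0.0);
  }

  bool isPositiveAtGuarded(uint i, uint count) {
    // An explicit branch expresses the guard that && would provide in C.
    if (i >= count)
      return false;
    return values[i] > 0.0;
  }

The second form is the usual way to write a guard when the right-hand side must
not be evaluated for out-of-range inputs.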

Unary operators
---------------

For `unary operators <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509631(v=vs.85).aspx#Unary_Operators>`_:

- ``!`` is translated into ``OpLogicalNot``. Parsing will guarantee the operands
  are of boolean types by inserting necessary casts.
- ``+`` requires no additional SPIR-V instructions.
- ``-`` is translated into ``OpSNegate`` and ``OpFNegate`` for (vectors of)
  integers and floats, respectively.
  1280. Casts
  1281. -----
Casting between (vectors of) scalar types is translated according to the following table:
+------------+-------------------+-------------------+-------------------+-------------------+
| From \\ To | Bool              | SInt              | UInt              | Float             |
+============+===================+===================+===================+===================+
| Bool       | no-op             | select between one and zero                               |
+------------+-------------------+-------------------+-------------------+-------------------+
| SInt       |                   | no-op             | ``OpBitcast``     | ``OpConvertSToF`` |
+------------+                   +-------------------+-------------------+-------------------+
| UInt       | compare with zero | ``OpBitcast``     | no-op             | ``OpConvertUToF`` |
+------------+                   +-------------------+-------------------+-------------------+
| Float      |                   | ``OpConvertFToS`` | ``OpConvertFToU`` | no-op             |
+------------+-------------------+-------------------+-------------------+-------------------+
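
As a rough illustration of the two prose entries in the table, the following
sketch spells out at the HLSL source level what "select between one and zero"
and "compare with zero" compute. The function names are hypothetical and exist
only for this example; the compiler emits SPIR-V instructions directly rather
than rewriting the HLSL like this.

.. code:: hlsl

  // Bool -> SInt: select between one and zero.
  int BoolToIntSketch(bool b) { return b ? 1 : 0; }

  // Float -> Bool: compare with zero.
  bool FloatToBoolSketch(float f) { return f != 0.0; }
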
HLSL also allows casting a float matrix to another float matrix of a smaller size.
This is known as a matrix truncation cast. For instance, the following code casts a 3x4 matrix
into a 2x3 matrix.
.. code:: hlsl

  float3x4 m = { 1,  2,  3,  4,
                 5,  6,  7,  8,
                 9, 10, 11, 12 };

  float2x3 a = (float2x3)m;

Such a cast takes the upper-left corner of the original matrix to generate the result.
In the above example, matrix ``a`` will have 2 rows, with 3 columns each. The first row will be
``1, 2, 3`` and the second row will be ``5, 6, 7``.
  1305. Indexing operator
  1306. -----------------
  1307. The ``[]`` operator can also be used to access elements in a matrix or vector.
  1308. A matrix whose row and/or column count is 1 will be translated into a vector or
  1309. scalar. If a variable is used as the index for the dimension whose count is 1,
  1310. that variable will be ignored in the generated SPIR-V code. This is because
out-of-bounds indexing triggers undefined behavior anyway. For example, for a
1xN matrix ``mat``, ``mat[index][0]`` will be translated into
``OpAccessChain ... %mat %uint_0``. Similarly, a variable index into a size-1
vector will be ignored, and the only element will always be returned.
  1315. Assignment operators
  1316. --------------------
Assigning to a struct object may involve decomposing the source struct object and
assigning each element separately and recursively. This happens when the source
struct object has a different memory layout from the destination struct object.
  1320. For example, for the following source code:
.. code:: hlsl

  struct S {
    float a;
    float2 b;
    float2x3 c;
  };

  ConstantBuffer<S> cbuf;
  RWStructuredBuffer<S> sbuf;

  ...
  sbuf[0] = cbuf[0];
  ...

We need to assign each element because ``ConstantBuffer`` and
``RWStructuredBuffer`` have different memory layouts.
  1334. HLSL Control Flows
  1335. ==================
  1336. This section lists how various HLSL control flows are mapped.
  1337. Switch statement
  1338. ----------------
  1339. HLSL `switch statements <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509669(v=vs.85).aspx>`_
  1340. are translated into SPIR-V using:
  1341. - **OpSwitch**: if (all case values are integer literals or constant integer
  1342. variables) and (no attribute or the ``forcecase`` attribute is specified)
  1343. - **A series of if statements**: for all other scenarios (e.g., when
  1344. ``flatten``, ``branch``, or ``call`` attribute is specified)
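
The two cases above can be sketched as follows. The function names are
hypothetical; per the rules just listed, we would expect the first ``switch``
to be eligible for ``OpSwitch`` and the second to become a series of if
statements because of its attribute.

.. code:: hlsl

  // No attribute and only integer-literal case values: eligible for OpSwitch.
  float ViaOpSwitch(int mode, float x) {
    switch (mode) {
      case 0:  return x;
      case 1:  return x * 2.0;
      default: return 0.0;
    }
  }

  // The [branch] attribute selects the series-of-if-statements translation.
  float ViaIfSeries(int mode, float x) {
    [branch] switch (mode) {
      case 0:  return x;
      case 1:  return x * 2.0;
      default: return 0.0;
    }
  }
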
  1345. Loops (for, while, do)
  1346. ----------------------
  1347. HLSL `for statements <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509602(v=vs.85).aspx>`_,
  1348. `while statements <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509708(v=vs.85).aspx>`_,
  1349. and `do statements <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509593(v=vs.85).aspx>`_
  1350. are translated into SPIR-V by constructing all necessary basic blocks and using
  1351. ``OpLoopMerge`` to organize as structured loops.
  1352. The HLSL attributes for these statements are translated into SPIR-V loop control
  1353. masks according to the following table:
+-------------------------+--------------------------------------------------+
| HLSL loop attribute     | SPIR-V Loop Control Mask                         |
+=========================+==================================================+
| ``unroll(x)``           | ``Unroll``                                       |
+-------------------------+--------------------------------------------------+
| ``loop``                | ``DontUnroll``                                   |
+-------------------------+--------------------------------------------------+
| ``fastopt``             | ``DontUnroll``                                   |
+-------------------------+--------------------------------------------------+
| ``allow_uav_condition`` | Currently Unimplemented                          |
+-------------------------+--------------------------------------------------+
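
A minimal sketch of how these attributes appear in source code is shown below.
The function ``SumSketch`` and its parameter are made up for illustration; the
comments restate the mapping from the table rather than actual compiler output.

.. code:: hlsl

  float SumSketch(float values[8]) {
    float sum = 0.0;

    // [unroll(x)] maps to the Unroll loop control mask.
    [unroll(4)] for (int i = 0; i < 4; ++i) {
      sum += values[i];
    }

    // [loop] maps to the DontUnroll loop control mask.
    int j = 4;
    [loop] while (j < 8) {
      sum += values[j];
      ++j;
    }

    return sum;
  }
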
  1365. HLSL Functions
  1366. ==============
  1367. All functions reachable from the entry-point function will be translated into
  1368. SPIR-V code. Functions not reachable from the entry-point function will be
  1369. ignored.
  1370. Entry function wrapper
  1371. ----------------------
HLSL entry functions take in parameters and return values. These parameters
and return values can have semantics attached or, if they are of struct type,
the struct fields can have semantics attached. However, in Vulkan, the entry
function must be of the ``void(void)`` signature. To handle this difference,
for a given entry function ``main``, we will emit a wrapper function for it.
The wrapper function will take the name of the source code entry function,
while the source code entry function will have its name prefixed with "src.".
  1379. The wrapper function reads in stage input/builtin variables created according
  1380. to semantics and groups them into composites meeting the requirements of the
  1381. source code entry point. Then the wrapper calls the source code entry point.
  1382. The return value is extracted and components of it will be written to stage
  1383. output/builtin variables created according to semantics. For example:
.. code:: hlsl

  // HLSL source code

  struct S {
    bool a : A;
    uint2 b: B;
    float2x3 c: C;
  };

  struct T {
    S x;
    int y: D;
  };

  T main(T input) {
    return input;
  }

.. code:: spirv

  ; SPIR-V code

  %in_var_A = OpVariable %_ptr_Input_bool Input
  %in_var_B = OpVariable %_ptr_Input_v2uint Input
  %in_var_C = OpVariable %_ptr_Input_mat2v3float Input
  %in_var_D = OpVariable %_ptr_Input_int Input

  %out_var_A = OpVariable %_ptr_Output_bool Output
  %out_var_B = OpVariable %_ptr_Output_v2uint Output
  %out_var_C = OpVariable %_ptr_Output_mat2v3float Output
  %out_var_D = OpVariable %_ptr_Output_int Output

  ; Wrapper function starts

  %main = OpFunction %void None ...
  ... = OpLabel

  %param_var_input = OpVariable %_ptr_Function_T Function

  ; Load stage input variables and group into the expected composite

  %inA = OpLoad %bool %in_var_A
  %inB = OpLoad %v2uint %in_var_B
  %inC = OpLoad %mat2v3float %in_var_C
  %inS = OpCompositeConstruct %S %inA %inB %inC
  %inD = OpLoad %int %in_var_D
  %inT = OpCompositeConstruct %T %inS %inD
  OpStore %param_var_input %inT

  %ret = OpFunctionCall %T %src_main %param_var_input

  ; Extract component values from the composite and store into stage output variables

  %outS = OpCompositeExtract %S %ret 0
  %outA = OpCompositeExtract %bool %outS 0
  OpStore %out_var_A %outA
  %outB = OpCompositeExtract %v2uint %outS 1
  OpStore %out_var_B %outB
  %outC = OpCompositeExtract %mat2v3float %outS 2
  OpStore %out_var_C %outC
  %outD = OpCompositeExtract %int %ret 1
  OpStore %out_var_D %outD

  OpReturn
  OpFunctionEnd

  ; Source code entry point starts

  %src_main = OpFunction %T None ...

In this way, we can concentrate all stage input/output/builtin variable
manipulation in the wrapper function and handle the source code entry function
just like other normal functions.
  1438. Function parameter
  1439. ------------------
  1440. For a function ``f`` which has a parameter of type ``T``, the generated SPIR-V
  1441. signature will use type ``T*`` for the parameter. At every call site of ``f``,
  1442. additional local variables will be allocated to hold the actual arguments.
  1443. The local variables are passed in as direct function arguments. For example:
.. code:: hlsl

  // HLSL source code

  float4 f(float a, int b) { ... }

  void caller(...) {
    ...
    float4 result = f(...);
    ...
  }

.. code:: spirv

  ; SPIR-V code

  ...
  %i32PtrType = OpTypePointer Function %int
  %f32PtrType = OpTypePointer Function %float
  %fnType = OpTypeFunction %v4float %f32PtrType %i32PtrType
  ...

  %f = OpFunction %v4float None %fnType
  %a = OpFunctionParameter %f32PtrType
  %b = OpFunctionParameter %i32PtrType
  ...

  %caller = OpFunction ...
  ...
  %aAlloca = OpVariable %_ptr_Function_float Function
  %bAlloca = OpVariable %_ptr_Function_int Function
  ...
  OpStore %aAlloca ...
  OpStore %bAlloca ...
  %result = OpFunctionCall %v4float %f %aAlloca %bAlloca
  ...

  1472. This approach gives us unified handling of function parameters and local
  1473. variables: both of them are accessed via load/store instructions.
  1474. Intrinsic functions
  1475. -------------------
  1476. The following intrinsic HLSL functions have no direct SPIR-V opcode or GLSL
  1477. extended instruction mapping, so they are handled with additional steps:
  1478. - ``dot`` : performs dot product of two vectors, each containing floats or
  1479. integers. If the two parameters are vectors of floats, we use SPIR-V's
  1480. ``OpDot`` instruction to perform the translation. If the two parameters are
  1481. vectors of integers, we multiply corresponding vector elements using
  1482. ``OpIMul`` and accumulate the results using ``OpIAdd`` to compute the dot
  1483. product.
  1484. - ``mul``: performs multiplications. Each argument may be a scalar, vector,
  1485. or matrix. Depending on the argument type, this will be translated into
  1486. one of the multiplication instructions.
  1487. - ``all``: returns true if all components of the given scalar, vector, or
  1488. matrix are true. Performs conversions to boolean where necessary. Uses SPIR-V
  1489. ``OpAll`` for scalar arguments and vector arguments. For matrix arguments,
  1490. performs ``OpAll`` on each row, and then again on the vector containing the
  1491. results of all rows.
  1492. - ``any``: returns true if any component of the given scalar, vector, or matrix
  1493. is true. Performs conversions to boolean where necessary. Uses SPIR-V
  1494. ``OpAny`` for scalar arguments and vector arguments. For matrix arguments,
  1495. performs ``OpAny`` on each row, and then again on the vector containing the
  1496. results of all rows.
  1497. - ``asfloat``: converts the component type of a scalar/vector/matrix from float,
  1498. uint, or int into float. Uses ``OpBitcast``. This method currently does not
  1499. support taking non-float matrix arguments.
  1500. - ``asint``: converts the component type of a scalar/vector/matrix from float or
  1501. uint into int. Uses ``OpBitcast``. This method currently does not support
  1502. conversion into integer matrices.
- ``asuint``: converts the component type of a scalar/vector/matrix from float
  or int into uint. Uses ``OpBitcast``. This method currently does not support
  conversion into unsigned integer matrices.
- ``asuint``: Converts a double into two 32-bit unsigned integers. Uses SPIR-V ``OpBitcast``.
- ``asdouble``: Converts two 32-bit unsigned integers into a double, or four 32-bit unsigned
  integers into two doubles. Uses SPIR-V ``OpVectorShuffle`` and ``OpBitcast``.
  1509. - ``isfinite`` : Determines if the specified value is finite. Since ``OpIsFinite``
  1510. requires the ``Kernel`` capability, translation is done using ``OpIsNan`` and
  1511. ``OpIsInf``. A given value is finite iff it is not NaN and not infinite.
  1512. - ``clip``: Discards the current pixel if the specified value is less than zero.
  1513. Uses conditional control flow as well as SPIR-V ``OpKill``.
- ``rcp``: Calculates a fast, approximate, per-component reciprocal.
  Uses SPIR-V ``OpFDiv``.
- ``lit``: Returns a lighting coefficient vector. This vector is a float4 with
  components of (ambient, diffuse, specular, 1). How ``diffuse`` and ``specular``
  are calculated is explained `here <https://msdn.microsoft.com/en-us/library/windows/desktop/bb509619(v=vs.85).aspx>`_.
  1519. - ``D3DCOLORtoUBYTE4``: Converts a floating-point, 4D vector set by a D3DCOLOR to a UBYTE4.
  1520. This is achieved by performing ``int4(input.zyxw * 255.002)`` using SPIR-V ``OpVectorShuffle``,
  1521. ``OpVectorTimesScalar``, and ``OpConvertFToS``, respectively.
  1522. - ``dst``: Calculates a distance vector. The resulting vector, ``dest``, has the following specifications:
  1523. ``dest.x = 1.0``, ``dest.y = src0.y * src1.y``, ``dest.z = src0.z``, and ``dest.w = src1.w``.
  1524. Uses SPIR-V ``OpCompositeExtract`` and ``OpFMul``.
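
To make a few of these expansions concrete, the sketch below writes out in HLSL
what the described instruction sequences compute for the integer ``dot``,
``isfinite``, and ``rcp`` cases. The function names are hypothetical, and the
code only mirrors the arithmetic described in the list; it is not the
compiler's implementation.

.. code:: hlsl

  // Integer dot product: multiply corresponding elements (OpIMul) and
  // accumulate the products (OpIAdd).
  int DotIntSketch(int3 a, int3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
  }

  // isfinite: a value is finite iff it is neither NaN nor infinite.
  bool IsFiniteSketch(float x) {
    return !isnan(x) && !isinf(x);
  }

  // rcp: reciprocal expressed as a floating-point division (OpFDiv).
  float RcpSketch(float x) {
    return 1.0 / x;
  }
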
  1525. Using SPIR-V opcode
  1526. ~~~~~~~~~~~~~~~~~~~
  1527. The following intrinsic HLSL functions have direct SPIR-V opcodes for them:
==================================== =================================
HLSL Intrinsic Function              SPIR-V Opcode
==================================== =================================
``AllMemoryBarrier``                 ``OpMemoryBarrier``
``AllMemoryBarrierWithGroupSync``    ``OpControlBarrier``
``countbits``                        ``OpBitCount``
``DeviceMemoryBarrier``              ``OpMemoryBarrier``
``DeviceMemoryBarrierWithGroupSync`` ``OpControlBarrier``
``ddx``                              ``OpDPdx``
``ddy``                              ``OpDPdy``
``ddx_coarse``                       ``OpDPdxCoarse``
``ddy_coarse``                       ``OpDPdyCoarse``
``ddx_fine``                         ``OpDPdxFine``
``ddy_fine``                         ``OpDPdyFine``
``fmod``                             ``OpFMod``
``fwidth``                           ``OpFwidth``
``GroupMemoryBarrier``               ``OpMemoryBarrier``
``GroupMemoryBarrierWithGroupSync``  ``OpControlBarrier``
``InterlockedAdd``                   ``OpAtomicIAdd``
``InterlockedAnd``                   ``OpAtomicAnd``
``InterlockedOr``                    ``OpAtomicOr``
``InterlockedXor``                   ``OpAtomicXor``
``InterlockedMin``                   ``OpAtomicUMin``/``OpAtomicSMin``
``InterlockedMax``                   ``OpAtomicUMax``/``OpAtomicSMax``
``InterlockedExchange``              ``OpAtomicExchange``
``InterlockedCompareExchange``       ``OpAtomicCompareExchange``
``InterlockedCompareStore``          ``OpAtomicCompareExchange``
``isnan``                            ``OpIsNan``
``isinf``                            ``OpIsInf``
``reversebits``                      ``OpBitReverse``
``transpose``                        ``OpTranspose``
``CheckAccessFullyMapped``           ``OpImageSparseTexelsResident``
==================================== =================================
  1561. Using GLSL extended instructions
  1562. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1563. The following intrinsic HLSL functions are translated using their equivalent
  1564. instruction in the `GLSL extended instruction set <https://www.khronos.org/registry/spir-v/specs/1.0/GLSL.std.450.html>`_.
======================= ===================================
HLSL Intrinsic Function GLSL Extended Instruction
======================= ===================================
``abs``                 ``SAbs``/``FAbs``
``acos``                ``Acos``
``asin``                ``Asin``
``atan``                ``Atan``
``atan2``               ``Atan2``
``ceil``                ``Ceil``
``clamp``               ``SClamp``/``UClamp``/``FClamp``
``cos``                 ``Cos``
``cosh``                ``Cosh``
``cross``               ``Cross``
``degrees``             ``Degrees``
``distance``            ``Distance``
``radians``             ``Radians``
``determinant``         ``Determinant``
``exp``                 ``Exp``
``exp2``                ``Exp2``
``f16tof32``            ``UnpackHalf2x16``
``f32tof16``            ``PackHalf2x16``
``faceforward``         ``FaceForward``
``firstbithigh``        ``FindSMsb`` / ``FindUMsb``
``firstbitlow``         ``FindILsb``
``floor``               ``Floor``
``fma``                 ``Fma``
``frac``                ``Fract``
``frexp``               ``FrexpStruct``
``ldexp``               ``Ldexp``
``length``              ``Length``
``lerp``                ``FMix``
``log``                 ``Log``
``log10``               ``Log2`` (scaled by ``1/log2(10)``)
``log2``                ``Log2``
``mad``                 ``Fma``
``max``                 ``SMax``/``UMax``/``FMax``
``min``                 ``SMin``/``UMin``/``FMin``
``modf``                ``ModfStruct``
``normalize``           ``Normalize``
``pow``                 ``Pow``
``reflect``             ``Reflect``
``refract``             ``Refract``
``round``               ``Round``
``rsqrt``               ``InverseSqrt``
``saturate``            ``FClamp``
``sign``                ``SSign``/``FSign``
``sin``                 ``Sin``
``sincos``              ``Sin`` and ``Cos``
``sinh``                ``Sinh``
``smoothstep``          ``SmoothStep``
``sqrt``                ``Sqrt``
``step``                ``Step``
``tan``                 ``Tan``
``tanh``                ``Tanh``
``trunc``               ``Trunc``
======================= ===================================
  1621. Synchronization intrinsics
  1622. ~~~~~~~~~~~~~~~~~~~~~~~~~~
Synchronization intrinsics are translated into ``OpMemoryBarrier`` (for the
non-``WithGroupSync`` variants) or ``OpControlBarrier`` (for the ``WithGroupSync``
variants) instructions with the following parameters:
======================= ============ ===== ======= ========= ==============
HLSL                    SPIR-V       SPIR-V Memory Semantics
----------------------- ------------ --------------------------------------
Intrinsic               Memory Scope Image Uniform Workgroup AcquireRelease
======================= ============ ===== ======= ========= ==============
``AllMemoryBarrier``    Device       ✓     ✓       ✓         ✓
``DeviceMemoryBarrier`` Device       ✓     ✓                 ✓
``GroupMemoryBarrier``  Workgroup                  ✓         ✓
======================= ============ ===== ======= ========= ==============
For the ``*WithGroupSync`` intrinsics, the SPIR-V memory scope and semantics are the
same as those of their counterparts in the table above. They have an additional
execution scope:
==================================== ======================
HLSL Intrinsic                       SPIR-V Execution Scope
==================================== ======================
``AllMemoryBarrierWithGroupSync``    Workgroup
``DeviceMemoryBarrierWithGroupSync`` Workgroup
``GroupMemoryBarrierWithGroupSync``  Workgroup
==================================== ======================
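
For context, a typical use of one of these intrinsics is sketched below. The
resource names ``tile`` and ``output`` are made up for this example; per the
tables above, we would expect the call to become an ``OpControlBarrier`` with
``Workgroup`` execution scope and ``Workgroup`` memory scope.

.. code:: hlsl

  groupshared float tile[64];
  RWStructuredBuffer<float> output;

  [numthreads(64, 1, 1)]
  void main(uint3 gtid : SV_GroupThreadID, uint3 dtid : SV_DispatchThreadID) {
    tile[gtid.x] = (float)dtid.x;

    // Wait until every thread in the group has written its slot.
    GroupMemoryBarrierWithGroupSync();

    // Read a value written by another thread of the same group.
    output[dtid.x] = tile[63 - gtid.x];
  }
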
  1645. HLSL OO features
  1646. ================
An HLSL struct/class member method is translated into a normal SPIR-V function,
whose signature has an additional first parameter for the struct/class called
upon. Every call site of the method is generated to pass in the object as
the first argument.
  1651. HLSL struct/class static member variables are translated into SPIR-V variables
  1652. in the ``Private`` storage class.
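
Conceptually, the method translation resembles rewriting the method as a free
function that takes the object explicitly. The sketch below shows that idea in
HLSL; ``Counter`` and ``Counter_Next`` are hypothetical names, and the free
function is only an analogy for the generated SPIR-V function, not something
the compiler produces as HLSL.

.. code:: hlsl

  struct Counter {
    int value;

    // Member method as written in HLSL.
    int Next() { return value + 1; }
  };

  // Analogy for the translated form: the object is an explicit first parameter.
  int Counter_Next(inout Counter self) { return self.value + 1; }

  int UseCounter() {
    Counter c;
    c.value = 41;
    // Both calls compute the same result for the same object.
    return c.Next() + Counter_Next(c);
  }
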
  1653. HLSL Methods
  1654. ============
  1655. This section lists how various HLSL methods are mapped.
  1656. Buffers
  1657. -------
  1658. ``Buffer``
  1659. ~~~~~~~~~~
  1660. ``.Load()``
  1661. +++++++++++
  1662. Since Buffers are represented as ``OpTypeImage`` with ``Sampled`` set to 1
  1663. (meaning to be used with a sampler), ``OpImageFetch`` is used to perform this
  1664. operation. The return value of ``OpImageFetch`` is always a four-component
  1665. vector; so proper additional instructions are generated to truncate the vector
  1666. and return the desired number of elements.
  1667. If an output unsigned integer ``status`` argument is present, ``OpImageSparseFetch``
  1668. is used instead. The resulting SPIR-V ``Residency Code`` will be written to ``status``.
  1669. ``operator[]``
  1670. ++++++++++++++
Handled similarly to ``.Load()``.
  1672. ``.GetDimensions()``
  1673. ++++++++++++++++++++
  1674. Since Buffers are represented as ``OpTypeImage`` with dimension of ``Buffer``,
  1675. ``OpImageQuerySize`` is used to perform this operation.
  1676. ``RWBuffer``
  1677. ~~~~~~~~~~~~
  1678. ``.Load()``
  1679. +++++++++++
  1680. Since RWBuffers are represented as ``OpTypeImage`` with ``Sampled`` set to 2
  1681. (meaning to be used without a sampler), ``OpImageRead`` is used to perform this
  1682. operation. If an output unsigned integer ``status`` argument is present, ``OpImageSparseRead``
  1683. is used instead. The resulting SPIR-V ``Residency Code`` will be written to ``status``.
  1684. ``operator[]``
  1685. ++++++++++++++
Using ``operator[]`` for reading is handled similarly to ``.Load()``, while for
writing, the ``OpImageWrite`` instruction is generated.
  1688. ``.GetDimensions()``
  1689. ++++++++++++++++++++
  1690. Since RWBuffers are represented as ``OpTypeImage`` with dimension of ``Buffer``,
  1691. ``OpImageQuerySize`` is used to perform this operation.
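
The ``Buffer`` and ``RWBuffer`` methods described above can be exercised with a
short sketch such as the following. The resource names ``src`` and ``dst`` are
hypothetical, and the comments restate the instructions named in this section
rather than verified compiler output.

.. code:: hlsl

  Buffer<float4>   src;
  RWBuffer<float4> dst;

  [numthreads(1, 1, 1)]
  void main(uint3 id : SV_DispatchThreadID) {
    uint count;
    src.GetDimensions(count);        // OpImageQuerySize

    if (id.x < count) {
      float4 a = src.Load(id.x);     // Buffer load: OpImageFetch
      float4 b = dst[id.x];          // RWBuffer read: OpImageRead
      dst[id.x] = a + b;             // RWBuffer write: OpImageWrite
    }
  }
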
  1692. ``StructuredBuffer`` and ``RWStructuredBuffer``
  1693. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1694. ``.GetDimensions()``
  1695. ++++++++++++++++++++
  1696. Since StructuredBuffers/RWStructuredBuffers are represented as a struct with one
  1697. member that is a runtime array of structures, ``OpArrayLength`` is invoked on
  1698. the runtime array in order to find the dimension.
  1699. ``ByteAddressBuffer``
  1700. ~~~~~~~~~~~~~~~~~~~~~
  1701. ``.GetDimensions()``
  1702. ++++++++++++++++++++
  1703. Since ByteAddressBuffers are represented as a struct with one member that is a
  1704. runtime array of unsigned integers, ``OpArrayLength`` is invoked on the runtime array
  1705. in order to find the number of unsigned integers. This is then multiplied by 4 to find
  1706. the number of bytes.
  1707. ``.Load()``, ``.Load2()``, ``.Load3()``, ``.Load4()``
  1708. +++++++++++++++++++++++++++++++++++++++++++++++++++++
  1709. ByteAddressBuffers are represented as a struct with one member that is a runtime array of
  1710. unsigned integers. The ``address`` argument passed to the function is first divided by 4
  1711. in order to find the offset into the array (because each array element is 4 bytes). The
  1712. SPIR-V ``OpAccessChain`` instruction is then used to access that offset, and ``OpLoad`` is
  1713. used to load a 32-bit unsigned integer. For ``Load2``, ``Load3``, and ``Load4``, this is
  1714. done 2, 3, and 4 times, respectively. Each time the word offset is incremented by 1 before
  1715. performing ``OpAccessChain``. After all ``OpLoad`` operations are performed, a vector is
  1716. constructed with all the resulting values.
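
The address arithmetic above can be sketched in HLSL as follows. Here a
``StructuredBuffer<uint>`` named ``words`` is a hypothetical stand-in for the
runtime array of unsigned integers, and the function names are made up; the
sketch only mirrors the described steps for ``Load2`` and ``GetDimensions``.

.. code:: hlsl

  // Hypothetical stand-in for the runtime array of 32-bit unsigned integers.
  StructuredBuffer<uint> words;

  // Load2: convert the byte address to a word index, then load two
  // consecutive words and build a vector from them.
  uint2 Load2Sketch(uint address) {
    uint index  = address / 4;       // each array element is 4 bytes
    uint first  = words[index];
    uint second = words[index + 1];  // word offset incremented by 1
    return uint2(first, second);
  }

  // GetDimensions: the number of words times 4 gives the size in bytes.
  uint SizeInBytesSketch() {
    uint count, stride;
    words.GetDimensions(count, stride);
    return count * 4;
  }
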
  1717. ``RWByteAddressBuffer``
  1718. ~~~~~~~~~~~~~~~~~~~~~~~
  1719. ``.GetDimensions()``
  1720. ++++++++++++++++++++
  1721. Since RWByteAddressBuffers are represented as a struct with one member that is a
  1722. runtime array of unsigned integers, ``OpArrayLength`` is invoked on the runtime array
  1723. in order to find the number of unsigned integers. This is then multiplied by 4 to find
  1724. the number of bytes.
  1725. ``.Load()``, ``.Load2()``, ``.Load3()``, ``.Load4()``
  1726. +++++++++++++++++++++++++++++++++++++++++++++++++++++
  1727. RWByteAddressBuffers are represented as a struct with one member that is a runtime array of
  1728. unsigned integers. The ``address`` argument passed to the function is first divided by 4
  1729. in order to find the offset into the array (because each array element is 4 bytes). The
  1730. SPIR-V ``OpAccessChain`` instruction is then used to access that offset, and ``OpLoad`` is
  1731. used to load a 32-bit unsigned integer. For ``Load2``, ``Load3``, and ``Load4``, this is
  1732. done 2, 3, and 4 times, respectively. Each time the word offset is incremented by 1 before
  1733. performing ``OpAccessChain``. After all ``OpLoad`` operations are performed, a vector is
  1734. constructed with all the resulting values.
  1735. ``.Store()``, ``.Store2()``, ``.Store3()``, ``.Store4()``
  1736. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1737. RWByteAddressBuffers are represented as a struct with one member that is a runtime array of
  1738. unsigned integers. The ``address`` argument passed to the function is first divided by 4
  1739. in order to find the offset into the array (because each array element is 4 bytes). The
  1740. SPIR-V ``OpAccessChain`` instruction is then used to access that offset, and ``OpStore`` is
  1741. used to store a 32-bit unsigned integer. For ``Store2``, ``Store3``, and ``Store4``, this is
  1742. done 2, 3, and 4 times, respectively. Each time the word offset is incremented by 1 before
  1743. performing ``OpAccessChain``.
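
A store follows the same pattern as a load. In the sketch below,
``rwWords`` is a hypothetical ``RWStructuredBuffer<uint>`` standing in for the
runtime array, and ``Store2Sketch`` is a made-up name; it only illustrates the
steps described for ``Store2``.

.. code:: hlsl

  // Hypothetical stand-in for the runtime array of 32-bit unsigned integers.
  RWStructuredBuffer<uint> rwWords;

  // Store2: convert the byte address to a word index, then store two
  // consecutive words.
  void Store2Sketch(uint address, uint2 value) {
    uint index = address / 4;        // each array element is 4 bytes
    rwWords[index]     = value.x;
    rwWords[index + 1] = value.y;    // word offset incremented by 1
  }
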
  1744. ``.Interlocked*()``
  1745. +++++++++++++++++++
================================= =================================
HLSL Intrinsic Method             SPIR-V Opcode
================================= =================================
``.InterlockedAdd()``             ``OpAtomicIAdd``
``.InterlockedAnd()``             ``OpAtomicAnd``
``.InterlockedOr()``              ``OpAtomicOr``
``.InterlockedXor()``             ``OpAtomicXor``
``.InterlockedMin()``             ``OpAtomicUMin``/``OpAtomicSMin``
``.InterlockedMax()``             ``OpAtomicUMax``/``OpAtomicSMax``
``.InterlockedExchange()``        ``OpAtomicExchange``
``.InterlockedCompareExchange()`` ``OpAtomicCompareExchange``
``.InterlockedCompareStore()``    ``OpAtomicCompareExchange``
================================= =================================
  1759. ``AppendStructuredBuffer``
  1760. ~~~~~~~~~~~~~~~~~~~~~~~~~~
  1761. ``.Append()``
  1762. +++++++++++++
  1763. The associated counter number will be increased by 1 using ``OpAtomicIAdd``.
  1764. The return value of ``OpAtomicIAdd``, which is the original count number, will
  1765. be used as the index for storing the new element. E.g., for ``buf.Append(vec)``:
.. code:: spirv

  %counter = OpAccessChain %_ptr_Uniform_int %counter_var_buf %uint_0
  %index = OpAtomicIAdd %uint %counter %uint_1 %uint_0 %uint_1
  %ptr = OpAccessChain %_ptr_Uniform_v4float %buf %uint_0 %index
  %val = OpLoad %v4float %vec
  OpStore %ptr %val

  1772. ``.GetDimensions()``
  1773. ++++++++++++++++++++
  1774. Since AppendStructuredBuffers are represented as a struct with one member that
  1775. is a runtime array, ``OpArrayLength`` is invoked on the runtime array in order
  1776. to find the number of elements. The stride is also calculated based on GLSL
  1777. ``std430`` as explained above.
  1778. ``ConsumeStructuredBuffer``
  1779. ~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1780. ``.Consume()``
  1781. ++++++++++++++
The associated counter number will be decreased by 1 using ``OpAtomicISub``.
The return value of ``OpAtomicISub`` minus 1, which is the new count number,
will be used as the index for reading the element. E.g., for
``buf.Consume()``:

.. code:: spirv

  %counter = OpAccessChain %_ptr_Uniform_int %counter_var_buf %uint_0
  %prev = OpAtomicISub %uint %counter %uint_1 %uint_0 %uint_1
  %index = OpISub %uint %prev %uint_1
  %ptr = OpAccessChain %_ptr_Uniform_v4float %buf %uint_0 %index
  %val = OpLoad %v4float %ptr

  1793. ``.GetDimensions()``
  1794. ++++++++++++++++++++
  1795. Since ConsumeStructuredBuffers are represented as a struct with one member that
  1796. is a runtime array, ``OpArrayLength`` is invoked on the runtime array in order
  1797. to find the number of elements. The stride is also calculated based on GLSL
  1798. ``std430`` as explained above.
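
On the HLSL side, the append and consume operations look like the sketch below.
The resource names ``input`` and ``output`` are hypothetical, and the example
assumes the caller has made sure ``input`` holds enough elements for every
thread.

.. code:: hlsl

  ConsumeStructuredBuffer<float4> input;
  AppendStructuredBuffer<float4>  output;

  [numthreads(64, 1, 1)]
  void main() {
    // Consume: the counter is decremented, and the element at the new
    // count is read.
    float4 v = input.Consume();

    // Append: the counter is incremented, and the element is stored at
    // the previous count.
    output.Append(v * 2.0);
  }
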
  1799. Read-only textures
  1800. ------------------
Methods common to all texture types are explained in the "common texture methods"
section. Methods unique to a specific texture type are explained in the section
for that texture type.
  1804. Common texture methods
  1805. ~~~~~~~~~~~~~~~~~~~~~~
  1806. ``.Sample(sampler, location[, offset][, clamp][, Status])``
  1807. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1808. Not available to ``Texture2DMS`` and ``Texture2DMSArray``.
The ``OpImageSampleImplicitLod`` instruction is used to translate ``.Sample()``
since texture types are represented as ``OpTypeImage``. An ``OpSampledImage`` is
created based on the ``sampler`` passed to the function. The resulting sampled
image and the ``location`` passed to the function are used as arguments to
``OpImageSampleImplicitLod``, with the optional ``offset`` translated into the
additional SPIR-V image operand ``ConstOffset`` or ``Offset``. The optional
``clamp`` argument will be translated into the ``MinLod`` image operand.
  1816. If an output unsigned integer ``status`` argument is present,
  1817. ``OpImageSparseSampleImplicitLod`` is used instead. The resulting SPIR-V
  1818. ``Residency Code`` will be written to ``status``.
  1819. ``.SampleLevel(sampler, location, lod[, offset][, Status])``
  1820. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1821. Not available to ``Texture2DMS`` and ``Texture2DMSArray``.
  1822. The ``OpImageSampleExplicitLod`` instruction is used to translate this method.
  1823. An ``OpSampledImage`` is created based on the ``sampler`` passed to the function.
  1824. The resulting sampled image and the ``location`` passed to the function are used
as arguments to ``OpImageSampleExplicitLod``. The ``lod`` passed to the function
is attached to the instruction as the SPIR-V image operand ``Lod``. The optional
``offset`` is also translated into the additional SPIR-V image operand
``ConstOffset`` or ``Offset``.
  1829. If an output unsigned integer ``status`` argument is present,
  1830. ``OpImageSparseSampleExplicitLod`` is used instead. The resulting SPIR-V
  1831. ``Residency Code`` will be written to ``status``.
  1832. ``.SampleGrad(sampler, location, ddx, ddy[, offset][, clamp][, Status])``
  1833. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1834. Not available to ``Texture2DMS`` and ``Texture2DMSArray``.
Similarly to ``.SampleLevel``, the ``ddx`` and ``ddy`` parameters are attached to
the ``OpImageSampleExplicitLod`` instruction as the SPIR-V image operand
``Grad``. The optional ``clamp`` argument will be translated into the ``MinLod``
image operand.
  1839. If an output unsigned integer ``status`` argument is present,
  1840. ``OpImageSparseSampleExplicitLod`` is used instead. The resulting SPIR-V
  1841. ``Residency Code`` will be written to ``status``.
  1842. ``.SampleBias(sampler, location, bias[, offset][, clamp][, Status])``
  1843. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1844. Not available to ``Texture2DMS`` and ``Texture2DMSArray``.
The translation is similar to ``.Sample()``, with the ``bias`` parameter
attached to the ``OpImageSampleImplicitLod`` instruction as the SPIR-V image
operand ``Bias``.
  1848. If an output unsigned integer ``status`` argument is present,
  1849. ``OpImageSparseSampleImplicitLod`` is used instead. The resulting SPIR-V
  1850. ``Residency Code`` will be written to ``status``.
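
A small illustrative call, assuming hypothetical resources ``gTex`` and
``gSamp``:

.. code:: hlsl

  Texture2D<float4> gTex  : register(t0);
  SamplerState      gSamp : register(s0);

  float4 main(float2 uv : TEXCOORD0) : SV_Target {
    // bias = -1.0 is attached as the Bias image operand of
    // OpImageSampleImplicitLod.
    return gTex.SampleBias(gSamp, uv, -1.0);
  }
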
  1851. ``.SampleCmp(sampler, location, comparator[, offset][, clamp][, Status])``
  1852. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1853. Not available to ``Texture3D``, ``Texture2DMS``, and ``Texture2DMSArray``.
  1854. The translation is similar to ``.Sample()``, but the
``OpImageSampleDrefImplicitLod`` instruction is used.
  1856. If an output unsigned integer ``status`` argument is present,
  1857. ``OpImageSparseSampleDrefImplicitLod`` is used instead. The resulting SPIR-V
  1858. ``Residency Code`` will be written to ``status``.
  1859. ``.SampleCmpLevelZero(sampler, location, comparator[, offset][, Status])``
  1860. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1861. Not available to ``Texture3D``, ``Texture2DMS``, and ``Texture2DMSArray``.
  1862. The translation is similar to ``.Sample()``, but the
``OpImageSampleDrefExplicitLod`` instruction is used, with the additional
``Lod`` image operand set to 0.0.
  1865. If an output unsigned integer ``status`` argument is present,
  1866. ``OpImageSparseSampleDrefExplicitLod`` is used instead. The resulting SPIR-V
  1867. ``Residency Code`` will be written to ``status``.
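
To make the difference between the two comparison methods concrete, here is a
hedged sketch of a shadow-map lookup. The names ``gShadowMap`` and
``gShadowSamp`` and the input layout are assumptions for illustration only:

.. code:: hlsl

  Texture2D<float>       gShadowMap  : register(t0);
  SamplerComparisonState gShadowSamp : register(s0);

  float4 main(float3 shadowCoord : TEXCOORD0) : SV_Target {
    // Implicit LOD: expected to use OpImageSampleDrefImplicitLod.
    float lit0 = gShadowMap.SampleCmp(gShadowSamp, shadowCoord.xy,
                                      shadowCoord.z);
    // Explicit LOD: expected to use OpImageSampleDrefExplicitLod
    // with the Lod operand set to 0.0.
    float lit1 = gShadowMap.SampleCmpLevelZero(gShadowSamp, shadowCoord.xy,
                                               shadowCoord.z);
    float lit = 0.5 * (lit0 + lit1);
    return float4(lit, lit, lit, 1.0);
  }
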
  1868. ``.Gather()``
  1869. +++++++++++++
  1870. Available to ``Texture2D``, ``Texture2DArray``, ``TextureCube``, and
  1871. ``TextureCubeArray``.
  1872. The translation is similar to ``.Sample()``, but the ``OpImageGather``
instruction is used, with the component set to 0.
  1874. If an output unsigned integer ``status`` argument is present,
  1875. ``OpImageSparseGather`` is used instead. The resulting SPIR-V
  1876. ``Residency Code`` will be written to ``status``.
  1877. ``.GatherRed()``, ``.GatherGreen()``, ``.GatherBlue()``, ``.GatherAlpha()``
  1878. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1879. Available to ``Texture2D``, ``Texture2DArray``, ``TextureCube``, and
  1880. ``TextureCubeArray``.
The ``OpImageGather`` instruction is used to translate these functions, with
the component set to 0, 1, 2, and 3 respectively.

There are a few overloads for these functions:

- For the overloads taking four offset parameters, the offsets are conveyed as
  an additional ``ConstOffsets`` image operand to the instruction if they are
  all constants. Otherwise, four separate ``OpImageGather`` instructions are
  emitted to get each texel from each offset, using the ``Offset`` image
  operand.
- For the overloads with the ``status`` parameter, ``OpImageSparseGather``
  is used instead, and the resulting SPIR-V ``Residency Code`` is
  written to ``status``.
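
The constant versus non-constant offset cases can be sketched as follows. The
constant buffer ``Params`` and its member ``gDynamicOffset`` are hypothetical
names introduced only to obtain an offset that is not a compile-time constant:

.. code:: hlsl

  Texture2D<float4> gTex  : register(t0);
  SamplerState      gSamp : register(s0);

  cbuffer Params : register(b0) {
    int2 gDynamicOffset;  // only known at run time
  };

  float4 main(float2 uv : TEXCOORD0) : SV_Target {
    // All four offsets are compile-time constants, so a single
    // OpImageGather with ConstOffsets is expected.
    float4 r0 = gTex.GatherRed(gSamp, uv,
                               int2(0, 0), int2(1, 0), int2(0, 1), int2(1, 1));
    // One offset is not a constant, so four OpImageGather instructions
    // using the Offset operand are expected instead.
    float4 r1 = gTex.GatherRed(gSamp, uv,
                               gDynamicOffset, int2(1, 0), int2(0, 1),
                               int2(1, 1));
    return r0 + r1;
  }
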
  1892. ``.GatherCmp()``
  1893. ++++++++++++++++
  1894. Available to ``Texture2D``, ``Texture2DArray``, ``TextureCube``, and
  1895. ``TextureCubeArray``.
  1896. The translation is similar to ``.Sample()``, but the ``OpImageDrefGather``
  1897. instruction is used.
  1898. For the overload with the output unsigned integer ``status`` argument,
  1899. ``OpImageSparseDrefGather`` is used instead. The resulting SPIR-V
  1900. ``Residency Code`` will be written to ``status``.
  1901. ``.GatherCmpRed()``
  1902. +++++++++++++++++++
  1903. Available to ``Texture2D``, ``Texture2DArray``, ``TextureCube``, and
  1904. ``TextureCubeArray``.
  1905. The translation is the same as ``.GatherCmp()``.
  1906. ``.Load(location[, sampleIndex][, offset])``
  1907. ++++++++++++++++++++++++++++++++++++++++++++
The ``OpImageFetch`` instruction is used for translation because texture types
are represented as ``OpTypeImage``. The last element of the ``location``
parameter is used as the argument to the ``Lod`` SPIR-V image operand attached
to the ``OpImageFetch`` instruction, and the remaining elements are used as the
coordinate argument to the instruction. ``offset`` is handled similarly to
``.Sample()``.
The return value of ``OpImageFetch`` is always a four-component vector, so
additional instructions are generated to truncate the vector and return
the desired number of elements.
  1916. For the overload with the output unsigned integer ``status`` argument,
  1917. ``OpImageSparseFetch`` is used instead. The resulting SPIR-V
  1918. ``Residency Code`` will be written to ``status``.
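
As an illustration of how ``location`` is split, consider the sketch below. The
two-component texture ``gTex`` is a hypothetical resource chosen so that the
truncation step applies:

.. code:: hlsl

  Texture2D<float2> gTex : register(t0);

  float4 main(float4 pos : SV_Position) : SV_Target {
    // location.xy is the texel coordinate; location.z (mip level 1) is
    // split off and passed as the Lod image operand.
    int3 location = int3(int2(pos.xy) / 2, 1);
    // OpImageFetch yields four components; only two are kept for float2.
    float2 rg = gTex.Load(location);
    return float4(rg, 0.0, 1.0);
  }
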
  1919. ``operator[]``
  1920. ++++++++++++++
Handled similarly to ``.Load()``.
  1922. ``.mips[lod][position]``
  1923. ++++++++++++++++++++++++
  1924. Not available to ``TextureCube``, ``TextureCubeArray``, ``Texture2DMS``, and
  1925. ``Texture2DMSArray``.
This method is translated into the ``OpImageFetch`` instruction. The ``lod``
parameter is attached to the instruction as the argument to the ``Lod`` SPIR-V
image operand. The ``position`` parameter is used directly as the coordinate of
the instruction.
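
A short sketch of both subscript forms, assuming a hypothetical ``gTex``
resource with at least three mip levels:

.. code:: hlsl

  Texture2D<float4> gTex : register(t0);

  float4 main(float4 pos : SV_Position) : SV_Target {
    uint2 texel = uint2(pos.xy);
    // operator[]: handled like .Load().
    float4 base = gTex[texel];
    // .mips[lod][position]: lod = 2 becomes the Lod operand and the
    // position is used as the coordinate.
    float4 mip2 = gTex.mips[2][texel / 4];
    return 0.5 * (base + mip2);
  }
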
  1930. ``.CalculateLevelOfDetail()`` and ``.CalculateLevelOfDetailUnclamped()``
  1931. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1932. Not available to ``Texture2DMS`` and ``Texture2DMSArray``.
  1933. Since texture types are represented as ``OpTypeImage``, the ``OpImageQueryLod``
  1934. instruction is used for translation. An ``OpSampledImage`` is created based on
  1935. the ``SamplerState`` passed to the function. The resulting sampled image and
  1936. the coordinate passed to the function are used to invoke ``OpImageQueryLod``.
  1937. The result of ``OpImageQueryLod`` is a ``float2``. The first element contains
  1938. the mipmap array layer. The second element contains the unclamped level of detail.
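
For reference, the two methods might be called as in this sketch (resource
names are assumptions):

.. code:: hlsl

  Texture2D<float4> gTex  : register(t0);
  SamplerState      gSamp : register(s0);

  float4 main(float2 uv : TEXCOORD0) : SV_Target {
    // Both calls combine gTex and gSamp into an OpSampledImage and
    // invoke OpImageQueryLod with uv as the coordinate.
    float clamped   = gTex.CalculateLevelOfDetail(gSamp, uv);
    float unclamped = gTex.CalculateLevelOfDetailUnclamped(gSamp, uv);
    return float4(clamped, unclamped, 0.0, 1.0);
  }
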
  1939. ``Texture1D``
  1940. ~~~~~~~~~~~~~
  1941. ``.GetDimensions(width)`` or ``.GetDimensions(MipLevel, width, NumLevels)``
  1942. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1943. Since Texture1D is represented as ``OpTypeImage``, the ``OpImageQuerySizeLod`` instruction
  1944. is used for translation. If a ``MipLevel`` argument is passed to ``GetDimensions``, it will
be used as the ``Lod`` parameter of the query instruction. Otherwise, a ``Lod`` of ``0`` is used.
  1946. ``Texture1DArray``
  1947. ~~~~~~~~~~~~~~~~~~
  1948. ``.GetDimensions(width, elements)`` or ``.GetDimensions(MipLevel, width, elements, NumLevels)``
  1949. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1950. Since Texture1DArray is represented as ``OpTypeImage``, the ``OpImageQuerySizeLod`` instruction
  1951. is used for translation. If a ``MipLevel`` argument is present, it will be used as the
``Lod`` parameter of the query instruction. Otherwise, a ``Lod`` of ``0`` is used.
  1953. ``Texture2D``
  1954. ~~~~~~~~~~~~~
  1955. ``.GetDimensions(width, height)`` or ``.GetDimensions(MipLevel, width, height, NumLevels)``
  1956. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1957. Since Texture2D is represented as ``OpTypeImage``, the ``OpImageQuerySizeLod`` instruction
  1958. is used for translation. If a ``MipLevel`` argument is present, it will be used as the
``Lod`` parameter of the query instruction. Otherwise, a ``Lod`` of ``0`` is used.
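
The two ``Texture2D`` overloads can be sketched as below. The variable names
are illustrative; the other texture types follow the same pattern with their
own argument lists:

.. code:: hlsl

  Texture2D<float4> gTex : register(t0);

  float4 main() : SV_Target {
    uint width, height;
    // No MipLevel argument: the size query uses a Lod of 0.
    gTex.GetDimensions(width, height);

    uint mipWidth, mipHeight, numLevels;
    // MipLevel = 1 is used as the Lod parameter of the size query.
    gTex.GetDimensions(1, mipWidth, mipHeight, numLevels);

    return float4(width, height, mipWidth, mipHeight);
  }
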
  1960. ``Texture2DArray``
  1961. ~~~~~~~~~~~~~~~~~~
  1962. ``.GetDimensions(width, height, elements)`` or ``.GetDimensions(MipLevel, width, height, elements, NumLevels)``
  1963. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1964. Since Texture2DArray is represented as ``OpTypeImage``, the ``OpImageQuerySizeLod`` instruction
  1965. is used for translation. If a ``MipLevel`` argument is present, it will be used as the
``Lod`` parameter of the query instruction. Otherwise, a ``Lod`` of ``0`` is used.
  1967. ``Texture3D``
  1968. ~~~~~~~~~~~~~
  1969. ``.GetDimensions(width, height, depth)`` or ``.GetDimensions(MipLevel, width, height, depth, NumLevels)``
  1970. +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  1971. Since Texture3D is represented as ``OpTypeImage``, the ``OpImageQuerySizeLod`` instruction
  1972. is used for translation. If a ``MipLevel`` argument is present, it will be used as the
``Lod`` parameter of the query instruction. Otherwise, a ``Lod`` of ``0`` is used.
  1974. ``Texture2DMS``
  1975. ~~~~~~~~~~~~~~~
  1976. ``.sample[sample][position]``
  1977. +++++++++++++++++++++++++++++
This method is translated into the ``OpImageFetch`` instruction. The ``sample``
parameter is attached to the instruction as the argument to the ``Sample``
SPIR-V image operand. The ``position`` parameter is used directly as the
coordinate of the instruction.
  1982. ``.GetDimensions(width, height, numSamples)``
  1983. +++++++++++++++++++++++++++++++++++++++++++++
Since ``Texture2DMS`` is represented as ``OpTypeImage`` with ``MS`` of ``1``, the ``OpImageQuerySize`` instruction
is used to get the width and the height. Furthermore, ``OpImageQuerySamples`` is used to get ``numSamples``.
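
A manual resolve loop is one place where both methods above appear together.
The following is only a sketch; ``gMsTex`` is a hypothetical resource:

.. code:: hlsl

  Texture2DMS<float4> gMsTex : register(t0);

  float4 main(float4 pos : SV_Position) : SV_Target {
    uint width, height, numSamples;
    // width/height come from OpImageQuerySize, numSamples from
    // OpImageQuerySamples.
    gMsTex.GetDimensions(width, height, numSamples);

    int2 coord = int2(pos.xy);
    float4 sum = float4(0.0, 0.0, 0.0, 0.0);
    for (uint i = 0; i < numSamples; ++i) {
      // Each access is an OpImageFetch with the Sample operand set to i.
      sum += gMsTex.sample[i][coord];
    }
    return sum / float(numSamples);
  }
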
  1986. ``.GetSamplePosition(index)``
  1987. +++++++++++++++++++++++++++++
No SPIR-V instruction maps directly to this method. Right now, it
is translated into the SPIR-V code for the following HLSL source code:

.. code:: hlsl

  // count is the number of samples in the Texture2DMS(Array)
  // index is the index of the sample whose position we are trying to get

  static const float2 pos2[] = {
    { 4.0/16.0,  4.0/16.0 }, {-4.0/16.0, -4.0/16.0 },
  };

  static const float2 pos4[] = {
    {-2.0/16.0, -6.0/16.0 }, { 6.0/16.0, -2.0/16.0 }, {-6.0/16.0,  2.0/16.0 }, { 2.0/16.0,  6.0/16.0 },
  };

  static const float2 pos8[] = {
    { 1.0/16.0, -3.0/16.0 }, {-1.0/16.0,  3.0/16.0 }, { 5.0/16.0,  1.0/16.0 }, {-3.0/16.0, -5.0/16.0 },
    {-5.0/16.0,  5.0/16.0 }, {-7.0/16.0, -1.0/16.0 }, { 3.0/16.0,  7.0/16.0 }, { 7.0/16.0, -7.0/16.0 },
  };

  static const float2 pos16[] = {
    { 1.0/16.0,  1.0/16.0 }, {-1.0/16.0, -3.0/16.0 }, {-3.0/16.0,  2.0/16.0 }, { 4.0/16.0, -1.0/16.0 },
    {-5.0/16.0, -2.0/16.0 }, { 2.0/16.0,  5.0/16.0 }, { 5.0/16.0,  3.0/16.0 }, { 3.0/16.0, -5.0/16.0 },
    {-2.0/16.0,  6.0/16.0 }, { 0.0/16.0, -7.0/16.0 }, {-4.0/16.0, -6.0/16.0 }, {-6.0/16.0,  4.0/16.0 },
    {-8.0/16.0,  0.0/16.0 }, { 7.0/16.0, -4.0/16.0 }, { 6.0/16.0,  7.0/16.0 }, {-7.0/16.0, -8.0/16.0 },
  };

  float2 position = float2(0.0f, 0.0f);

  if (count == 2) {
    position = pos2[index];
  } else if (count == 4) {
    position = pos4[index];
  } else if (count == 8) {
    position = pos8[index];
  } else if (count == 16) {
    position = pos16[index];
  }

From the above, it's clear that the current implementation only supports standard
sample settings, i.e., with 1, 2, 4, 8, or 16 samples. For other cases, the
implementation just returns ``(float2)0``.
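
A usage sketch, assuming a hypothetical 4-sample texture ``gMsTex``. With the
tables above, index ``0`` should select ``pos4[0]``:

.. code:: hlsl

  Texture2DMS<float4> gMsTex : register(t0);

  float4 main(float4 pos : SV_Position) : SV_Target {
    // For a 4-sample texture this is expected to yield pos4[0],
    // i.e. (-2.0/16.0, -6.0/16.0). For a non-standard sample count
    // the result would be (0, 0).
    float2 samplePos = gMsTex.GetSamplePosition(0);
    return float4(samplePos, 0.0, 1.0);
  }
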
  2022. ``Texture2DMSArray``
  2023. ~~~~~~~~~~~~~~~~~~~~
  2024. ``.sample[sample][position]``
  2025. +++++++++++++++++++++++++++++
This method is translated into the ``OpImageFetch`` instruction. The ``sample``
parameter is attached to the instruction as the argument to the ``Sample``
SPIR-V image operand. The ``position`` parameter is used directly as the
coordinate of the instruction.
  2030. ``.GetDimensions(width, height, elements, numSamples)``
  2031. +++++++++++++++++++++++++++++++++++++++++++++++++++++++
Since ``Texture2DMSArray`` is represented as ``OpTypeImage`` with ``MS`` of ``1``, the ``OpImageQuerySize`` instruction
is used to get the width, the height, and the elements. Furthermore, ``OpImageQuerySamples`` is used to get ``numSamples``.
  2034. ``.GetSamplePosition(index)``
  2035. +++++++++++++++++++++++++++++
Handled the same way as for ``Texture2DMS``.
  2037. ``TextureCube``
  2038. ~~~~~~~~~~~~~~~
  2039. ``TextureCubeArray``
  2040. ~~~~~~~~~~~~~~~~~~~~
  2041. Read-write textures
  2042. -------------------
  2043. Methods common to all texture types are explained in the "common texture methods"
section. Methods unique to a specific texture type are explained in the section
  2045. for that texture type.
  2046. Common texture methods
  2047. ~~~~~~~~~~~~~~~~~~~~~~
  2048. ``.Load()``
  2049. +++++++++++
  2050. Since read-write texture types are represented as ``OpTypeImage`` with
  2051. ``Sampled`` set to 2 (meaning to be used without a sampler), ``OpImageRead`` is
  2052. used to perform this operation.
  2053. For the overload with the output unsigned integer ``status`` argument,
  2054. ``OpImageSparseRead`` is used instead. The resulting SPIR-V
  2055. ``Residency Code`` will be written to ``status``.
  2056. ``operator[]``
  2057. ++++++++++++++
Using ``operator[]`` for reading is handled similarly to ``.Load()``, while for
writing, the ``OpImageWrite`` instruction is generated.
  2060. ``RWTexture1D``
  2061. ~~~~~~~~~~~~~~~
  2062. ``.GetDimensions(width)``
  2063. +++++++++++++++++++++++++
  2064. The ``OpImageQuerySize`` instruction is used to find the width.
  2065. ``RWTexture1DArray``
  2066. ~~~~~~~~~~~~~~~~~~~~
  2067. ``.GetDimensions(width, elements)``
  2068. +++++++++++++++++++++++++++++++++++
  2069. The ``OpImageQuerySize`` instruction is used to get a uint2. The first element
  2070. is the width, and the second is the elements.
  2071. ``RWTexture2D``
  2072. ~~~~~~~~~~~~~~~
  2073. ``.GetDimensions(width, height)``
  2074. +++++++++++++++++++++++++++++++++
  2075. The ``OpImageQuerySize`` instruction is used to get a uint2. The first element is the width, and the second
  2076. element is the height.
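
The read, write, and size-query paths for a read-write texture can be seen
together in a small compute shader. This is a sketch under assumptions: the
resource ``gImage`` and the thread-group size are arbitrary choices.

.. code:: hlsl

  RWTexture2D<float4> gImage : register(u0);

  [numthreads(8, 8, 1)]
  void main(uint3 id : SV_DispatchThreadID) {
    uint width, height;
    // OpImageQuerySize returns a uint2: (width, height).
    gImage.GetDimensions(width, height);
    if (id.x >= width || id.y >= height) {
      return;
    }
    // Reading through operator[] is handled like .Load() (OpImageRead).
    float4 texel = gImage[id.xy];
    // Writing through operator[] generates OpImageWrite.
    gImage[id.xy] = float4(1.0 - texel.rgb, texel.a);
  }
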
  2077. ``RWTexture2DArray``
  2078. ~~~~~~~~~~~~~~~~~~~~
  2079. ``.GetDimensions(width, height, elements)``
  2080. +++++++++++++++++++++++++++++++++++++++++++
  2081. The ``OpImageQuerySize`` instruction is used to get a uint3. The first element is the width, the second
  2082. element is the height, and the third is the elements.
  2083. ``RWTexture3D``
  2084. ~~~~~~~~~~~~~~~
  2085. ``.GetDimensions(width, height, depth)``
  2086. ++++++++++++++++++++++++++++++++++++++++
  2087. The ``OpImageQuerySize`` instruction is used to get a uint3. The first element is the width, the second
  2088. element is the height, and the third element is the depth.
  2089. HLSL Shader Stages
  2090. ==================
  2091. Hull Shaders
  2092. ------------
Hull shaders correspond to Tessellation Control Shaders (TCS) in Vulkan.
  2094. This section describes how Hull shaders are translated to SPIR-V for Vulkan.
  2095. Hull Entry Point Attributes
  2096. ~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2097. The following HLSL attributes are attached to the main entry point of hull shaders
  2098. and are translated to SPIR-V execution modes according to the table below:

.. table:: Mapping from HLSL attribute to SPIR-V execution mode

  +-------------------------+---------------------+--------------------------+
  | HLSL Attribute          | value               | SPIR-V Execution Mode    |
  +=========================+=====================+==========================+
  |                         | ``quad``            | ``Quads``                |
  |                         +---------------------+--------------------------+
  | ``domain``              | ``tri``             | ``Triangles``            |
  |                         +---------------------+--------------------------+
  |                         | ``isoline``         | ``Isolines``             |
  +-------------------------+---------------------+--------------------------+
  |                         | ``integer``         | ``SpacingEqual``         |
  |                         +---------------------+--------------------------+
  |                         | ``fractional_even`` | ``SpacingFractionalEven``|
  | ``partitioning``        +---------------------+--------------------------+
  |                         | ``fractional_odd``  | ``SpacingFractionalOdd`` |
  |                         +---------------------+--------------------------+
  |                         | ``pow2``            | N/A                      |
  +-------------------------+---------------------+--------------------------+
  |                         | ``point``           | ``PointMode``            |
  |                         +---------------------+--------------------------+
  |                         | ``line``            | N/A                      |
  | ``outputtopology``      +---------------------+--------------------------+
  |                         | ``triangle_cw``     | ``VertexOrderCw``        |
  |                         +---------------------+--------------------------+
  |                         | ``triangle_ccw``    | ``VertexOrderCcw``       |
  +-------------------------+---------------------+--------------------------+
  |``outputcontrolpoints``  | ``n``               | ``OutputVertices n``     |
  +-------------------------+---------------------+--------------------------+

The ``patchconstantfunc`` attribute does not have a direct equivalent in SPIR-V.
It specifies the name of the Patch Constant Function. This function is run only
once per patch. This is further described below.
  2130. InputPatch and OutputPatch
  2131. ~~~~~~~~~~~~~~~~~~~~~~~~~~
Both ``InputPatch<T, N>`` and ``OutputPatch<T, N>`` are translated to an array
  2133. of constant size ``N`` where each element is of type ``T``.
  2134. InputPatch can be passed to the Hull shader main entry function as well as the
  2135. patch constant function. This would include information about each of the ``N``
  2136. vertices that are input to the tessellation control shader.
  2137. OutputPatch is an array containing ``N`` elements (where ``N`` is the number of
  2138. output vertices). Each element of the array contains information about an
  2139. output vertex. OutputPatch may also be passed to the patch constant function.
The SPIR-V ``InvocationId`` builtin (``SV_OutputControlPointID`` in HLSL) is used to index
into the InputPatch and OutputPatch arrays to read/write information for the given
vertex.
  2143. The hull main entry function in HLSL returns only one value (say, of type ``T``), but
  2144. that function is in fact executed once for each control point. The Vulkan spec requires that
  2145. "Tessellation control shader per-vertex output variables and blocks, and tessellation control,
  2146. tessellation evaluation, and geometry shader per-vertex input variables and blocks are required
  2147. to be declared as arrays, with each element representing input or output values for a single vertex
  2148. of a multi-vertex primitive". Therefore, we need to create a stage output variable that is an array
  2149. with elements of type ``T``. The number of elements of the array is equal to the number of
  2150. output control points. Each final output control point is written into the corresponding element in
  2151. the array using SV_OutputControlPointID as the index.
  2152. Patch Constant Function
  2153. ~~~~~~~~~~~~~~~~~~~~~~~
  2154. As mentioned above, the patch constant function is to be invoked only once per patch.
  2155. As a result, in the SPIR-V module, the `entry function wrapper`_ will first invoke the
  2156. main entry function, and then use an ``OpControlBarrier`` to wait for all vertex
processing to finish. After the barrier, *only* the first thread (with ``InvocationId`` of 0)
will invoke the patch constant function.
The information resulting from the patch constant function will also be returned
as stage output variables. The output struct of the patch constant function must include
``SV_TessFactor`` and ``SV_InsideTessFactor`` fields, which are translated to the
``TessLevelOuter`` and ``TessLevelInner`` builtin variables, respectively. The remaining
fields are flattened and translated into normal stage output variables, one for each field.
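
To tie the attributes, the patch types, and the patch constant function
together, here is a minimal pass-through hull shader. All struct and function
names (``VSOut``, ``HSOut``, ``PatchConst``, ``PatchConstants``) are
hypothetical, and the tessellation factors are arbitrary:

.. code:: hlsl

  struct VSOut { float4 pos : POSITION; };
  struct HSOut { float4 pos : POSITION; };

  struct PatchConst {
    float edges[3] : SV_TessFactor;        // -> TessLevelOuter
    float inside   : SV_InsideTessFactor;  // -> TessLevelInner
  };

  // Invoked once per patch, after the barrier, by invocation 0 only.
  PatchConst PatchConstants(InputPatch<VSOut, 3> patch) {
    PatchConst pc;
    pc.edges[0] = 4.0;
    pc.edges[1] = 4.0;
    pc.edges[2] = 4.0;
    pc.inside   = 4.0;
    return pc;
  }

  [domain("tri")]                    // Triangles
  [partitioning("fractional_odd")]   // SpacingFractionalOdd
  [outputtopology("triangle_cw")]    // VertexOrderCw
  [outputcontrolpoints(3)]           // OutputVertices 3
  [patchconstantfunc("PatchConstants")]
  HSOut main(InputPatch<VSOut, 3> patch,
             uint id : SV_OutputControlPointID) {
    // Executed once per output control point; the returned value is
    // written into element id of the output array.
    HSOut o;
    o.pos = patch[id].pos;
    return o;
  }
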
  2164. Geometry Shaders
  2165. ----------------
  2166. This section describes how geometry shaders are translated to SPIR-V for Vulkan.
  2167. Geometry Shader Entry Point Attributes
  2168. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following HLSL attributes are attached to the main entry point of geometry shaders
and are translated to SPIR-V execution modes as follows:

.. table:: Mapping from geometry shader HLSL attribute to SPIR-V execution mode

  +-------------------------+---------------------+--------------------------+
  | HLSL Attribute          | value               | SPIR-V Execution Mode    |
  +=========================+=====================+==========================+
  |``maxvertexcount``       | ``n``               | ``OutputVertices n``     |
  +-------------------------+---------------------+--------------------------+
  |``instance``             | ``n``               | ``Invocations n``        |
  +-------------------------+---------------------+--------------------------+

  2179. Translation for Primitive Types
  2180. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2181. Geometry shader vertex inputs may be qualified with primitive types. Only one primitive type
  2182. is allowed to be used in a given geometry shader. The following table shows the SPIR-V execution
  2183. mode that is used in order to represent the given primitive type.

.. table:: Mapping from geometry shader primitive type to SPIR-V execution mode

  +---------------------+-----------------------------+
  | HLSL Primitive Type | SPIR-V Execution Mode       |
  +=====================+=============================+
  |``point``            | ``InputPoints``             |
  +---------------------+-----------------------------+
  |``line``             | ``InputLines``              |
  +---------------------+-----------------------------+
  |``triangle``         | ``Triangles``               |
  +---------------------+-----------------------------+
  |``lineadj``          | ``InputLinesAdjacency``     |
  +---------------------+-----------------------------+
  |``triangleadj``      | ``InputTrianglesAdjacency`` |
  +---------------------+-----------------------------+

  2198. Translation of Output Stream Types
  2199. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  2200. Supported output stream types in geometry shaders are: ``PointStream<T>``,
  2201. ``LineStream<T>``, and ``TriangleStream<T>``. These types are translated as the underlying
  2202. type ``T``, which is recursively flattened into stand-alone variables for each field.
  2203. Furthermore, output stream objects passed to geometry shader entry points are
  2204. required to be annotated with ``inout``, but the generated SPIR-V only contains
  2205. stage output variables for them.
  2206. The following table shows the SPIR-V execution mode that is used in order to represent the
  2207. given output stream.

.. table:: Mapping from geometry shader output stream type to SPIR-V execution mode

  +---------------------+-----------------------------+
  | HLSL Output Stream  | SPIR-V Execution Mode       |
  +=====================+=============================+
  |``PointStream``      | ``OutputPoints``            |
  +---------------------+-----------------------------+
  |``LineStream``       | ``OutputLineStrip``         |
  +---------------------+-----------------------------+
  |``TriangleStream``   | ``OutputTriangleStrip``     |
  +---------------------+-----------------------------+

  2218. In other shader stages, stage output variables are only written in the `entry
  2219. function wrapper`_ after calling the source code entry function. However,
  2220. geometry shaders can output as many vertices as they wish, by calling the
  2221. ``.Append()`` method on the output stream object. Therefore, it is incorrect to
  2222. have only one flush in the entry function wrapper like other stages. Instead,
each time a ``*Stream<T>::Append()`` is encountered, all stage output variables
behind ``T`` are flushed before the SPIR-V ``OpEmitVertex`` instruction is
generated. ``.RestartStrip()`` method calls are translated into the SPIR-V
``OpEndPrimitive`` instruction.
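
A minimal pass-through geometry shader illustrates the pieces described in this
section. The struct names ``GSIn`` and ``GSOut`` are hypothetical:

.. code:: hlsl

  struct GSIn  { float4 pos : SV_Position; };
  struct GSOut { float4 pos : SV_Position; };

  [maxvertexcount(3)]                       // OutputVertices 3
  void main(triangle GSIn input[3],         // Triangles
            inout TriangleStream<GSOut> stream) {  // OutputTriangleStrip
    for (int i = 0; i < 3; ++i) {
      GSOut v;
      v.pos = input[i].pos;
      // Stage output variables are flushed, then OpEmitVertex is generated.
      stream.Append(v);
    }
    // Translated into OpEndPrimitive.
    stream.RestartStrip();
  }
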
  2227. Shader Model 6.0 Wave Intrinsics
  2228. ================================
Note that wave intrinsics require SPIR-V 1.3, which is supported by Vulkan 1.1.
If you use wave intrinsics in your source code, you will need to specify
``-fspv-target-env=vulkan1.1`` via the command line to target Vulkan 1.1.
Shader model 6.0 introduces a set of wave operations. Apart from
``WaveGetLaneCount()`` and ``WaveGetLaneIndex()``, which are translated into
loads from the SPIR-V builtin variables ``SubgroupSize`` and
``SubgroupLocalInvocationId`` respectively, the rest are translated into SPIR-V
group operations with ``Subgroup`` scope according to the following chart:

============= ============================ =================================== ======================
Wave Category Wave Intrinsics              SPIR-V Opcode                       SPIR-V Group Operation
============= ============================ =================================== ======================
Query         ``WaveIsFirstLane()``        ``OpGroupNonUniformElect``
Vote          ``WaveActiveAnyTrue()``      ``OpGroupNonUniformAny``
Vote          ``WaveActiveAllTrue()``      ``OpGroupNonUniformAll``
Vote          ``WaveActiveBallot()``       ``OpGroupNonUniformBallot``
Reduction     ``WaveActiveAllEqual()``     ``OpGroupNonUniformAllEqual``       ``Reduction``
Reduction     ``WaveActiveCountBits()``    ``OpGroupNonUniformBallotBitCount`` ``Reduction``
Reduction     ``WaveActiveSum()``          ``OpGroupNonUniform*Add``           ``Reduction``
Reduction     ``WaveActiveProduct()``      ``OpGroupNonUniform*Mul``           ``Reduction``
Reduction     ``WaveActiveBitAnd()``       ``OpGroupNonUniformBitwiseAnd``     ``Reduction``
Reduction     ``WaveActiveBitOr()``        ``OpGroupNonUniformBitwiseOr``      ``Reduction``
Reduction     ``WaveActiveBitXor()``       ``OpGroupNonUniformBitwiseXor``     ``Reduction``
Reduction     ``WaveActiveMin()``          ``OpGroupNonUniform*Min``           ``Reduction``
Reduction     ``WaveActiveMax()``          ``OpGroupNonUniform*Max``           ``Reduction``
Scan/Prefix   ``WavePrefixSum()``          ``OpGroupNonUniform*Add``           ``ExclusiveScan``
Scan/Prefix   ``WavePrefixProduct()``      ``OpGroupNonUniform*Mul``           ``ExclusiveScan``
Scan/Prefix   ``WavePrefixCountBits()``    ``OpGroupNonUniformBallotBitCount`` ``ExclusiveScan``
Broadcast     ``WaveReadLaneAt()``         ``OpGroupNonUniformBroadcast``
Broadcast     ``WaveReadLaneFirst()``      ``OpGroupNonUniformBroadcastFirst``
Quad          ``QuadReadAcrossX()``        ``OpGroupNonUniformQuadSwap``
Quad          ``QuadReadAcrossY()``        ``OpGroupNonUniformQuadSwap``
Quad          ``QuadReadAcrossDiagonal()`` ``OpGroupNonUniformQuadSwap``
Quad          ``QuadReadLaneAt()``         ``OpGroupNonUniformQuadBroadcast``
============= ============================ =================================== ======================

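
As a hedged illustration of a few rows of the chart, the compute shader below
uses one query, one reduction, and one prefix operation. The buffer name
``gData``, the file name in the comment, and the group size are assumptions;
the command-line flags are the ones described in this document.

.. code:: hlsl

  // Assumed invocation (the file name is hypothetical):
  //   dxc -T cs_6_0 -E main -spirv -fspv-target-env=vulkan1.1 wave.hlsl
  RWStructuredBuffer<uint> gData : register(u0);

  [numthreads(64, 1, 1)]
  void main(uint3 id : SV_DispatchThreadID) {
    uint value = gData[id.x];
    // Reduction over the active lanes (OpGroupNonUniform*Add, Reduction).
    uint total = WaveActiveSum(value);
    // Exclusive scan over lower-indexed active lanes
    // (OpGroupNonUniform*Add, ExclusiveScan).
    uint before = WavePrefixSum(value);
    // Only the first active lane takes this branch
    // (OpGroupNonUniformElect).
    if (WaveIsFirstLane()) {
      gData[id.x] = total;
    } else {
      gData[id.x] = before;
    }
  }
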
  2263. Supported Command-line Options
  2264. ==============================
  2265. Command-line options supported by SPIR-V CodeGen are listed below. They are
  2266. also recognized by the library API calls.
  2267. General options
  2268. ---------------
- ``-T``: Specifies shader profile
- ``-E``: Specifies entry point
- ``-D``: Defines macro
- ``-I``: Adds directory to include search path
- ``-O{|0|1|2|3}``: Specifies optimization level
- ``-enable-16bit-types``: Enables 16-bit types and disables min precision types
- ``-Zpc``: Packs matrices in column-major order by default
- ``-Zpr``: Packs matrices in row-major order by default
- ``-Fc``: Outputs SPIR-V disassembly to the given file
- ``-Fe``: Outputs warnings and errors to the given file
- ``-Fo``: Outputs SPIR-V code to the given file
- ``-Fh``: Outputs SPIR-V code as a header file
- ``-Vn``: Specifies the variable name for SPIR-V code in generated header file
- ``-Zi``: Emits more debug information (see `Debugging`_)
- ``-Cc``: Colorizes SPIR-V disassembly
- ``-No``: Adds instruction byte offsets to SPIR-V disassembly
- ``-H``: Shows header includes and nesting depth
- ``-Vi``: Shows details about the include process
- ``-Vd``: Disables SPIR-V verification
- ``-WX``: Treats warnings as errors
- ``-no-warnings``: Suppresses all warnings
- ``-flegacy-macro-expansion``: Expands the operands before performing
  token-pasting operation (fxc behavior)
  2292. Vulkan-specific options
  2293. -----------------------
  2294. The following command line options are added into ``dxc`` to support SPIR-V
  2295. codegen for Vulkan:
  2296. - ``-spirv``: Generates SPIR-V code.
- ``-fvk-b-shift N M``: Shifts by ``N`` the inferred binding numbers for all
  resources in b-type registers of space ``M``. Specifically, for a resource
  attached with ``:register(bX, spaceM)`` but not ``[vk::binding(...)]``,
  sets its Vulkan descriptor set to ``M`` and binding number to ``X + N``. If
  you need to shift the inferred binding numbers for more than one space,
  provide more than one such option. If more than one such option is provided
  for the same space, the last one takes effect. If you need to shift the
  inferred binding numbers for all sets, use ``all`` as ``M``.
  See `HLSL register and Vulkan binding`_ for explanation and examples.
- ``-fvk-t-shift N M``: Similar to ``-fvk-b-shift``, but for t-type registers.
- ``-fvk-s-shift N M``: Similar to ``-fvk-b-shift``, but for s-type registers.
- ``-fvk-u-shift N M``: Similar to ``-fvk-b-shift``, but for u-type registers.
- ``-fvk-bind-register xX Y N M`` (short alias: ``-vkbr``): Binds the resource
  at ``register(xX, spaceY)`` to descriptor set ``M`` and binding ``N``. This
  option cannot be used together with other binding assignment options.
  It requires that all resources in the source code have a ``:register()``
  attribute and that all registers have corresponding Vulkan descriptors
  specified using this option.
  2315. - ``-fvk-use-gl-layout``: Uses strict OpenGL ``std140``/``std430``
  2316. layout rules for resources.
  2317. - ``-fvk-use-dx-layout``: Uses DirectX layout rules for resources.
  2318. - ``-fvk-invert-y``: Negates (additively inverts) SV_Position.y before writing
  2319. to stage output. Used to accommodate the difference between Vulkan's
  2320. coordinate system and DirectX's. Only allowed in VS/DS/GS.
- ``-fvk-use-dx-position-w``: Reciprocates (multiplicatively inverts)
  SV_Position.w after reading from stage input. Used to accommodate the
  difference between Vulkan and DirectX: the w component of SV_Position in PS
  is stored as 1/w in Vulkan. Only recognized in PS; applying it to other
  stages is a no-op.
- ``-fvk-stage-io-order={alpha|decl}``: Assigns the stage input/output variable
  location number according to alphabetical order or declaration order. See
  `HLSL semantic and Vulkan Location`_ for more details.
- ``-fspv-reflect``: Emits additional SPIR-V instructions to aid reflection.
- ``-fspv-debug=<category>``: Controls what category of debug information
  should be emitted. Accepted values are ``file``, ``source``, ``line``, and
  ``tool``. See `Debugging`_ for more details.
- ``-fspv-extension=<extension>``: Only allows using ``<extension>`` in CodeGen.
  To allow multiple extensions, provide this option more than once. To allow
  *all* KHR extensions, use ``-fspv-extension=KHR``.
- ``-fspv-target-env=<env>``: Specifies the target environment for this
  compilation. The currently valid options are ``vulkan1.0`` and ``vulkan1.1``.
  If no target environment is provided, ``vulkan1.0`` is used as the default.
- ``-Wno-vk-ignored-features``: Suppresses warnings about features that are
  ignored because Vulkan does not support them, e.g., cbuffer member
  initializers.
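
The arithmetic behind ``-fvk-invert-y`` and ``-fvk-use-dx-position-w`` can be
sketched in a few lines. The Python model below is only an illustration of the
two adjustments described in the list above, not the compiler's implementation;
the function names and the sample position values are hypothetical.

```python
def invert_y(position):
    """Model of -fvk-invert-y: negate SV_Position.y before it is written
    to the stage output."""
    x, y, z, w = position
    return (x, -y, z, w)


def use_dx_position_w(frag_position):
    """Model of -fvk-use-dx-position-w: reciprocate SV_Position.w after it
    is read in the pixel shader, turning Vulkan's 1/w back into w."""
    x, y, z, w = frag_position
    return (x, y, z, 1.0 / w)


# Hypothetical sample values, chosen only to make the effect visible.
vs_out = invert_y((0.5, 0.25, 0.1, 2.0))
ps_in = use_dx_position_w((100.0, 50.0, 0.1, 0.5))
print(vs_out)  # (0.5, -0.25, 0.1, 2.0)
print(ps_in)   # (100.0, 50.0, 0.1, 2.0)
```

Only the ``y`` component changes in the first case and only the ``w`` component
in the second; all other components pass through untouched.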

Unsupported HLSL Features
=========================

The following HLSL language features are not supported in SPIR-V codegen,
either because they currently have no Vulkan equivalent or because they are
deprecated.

* Literal/immediate sampler state: deprecated feature. The compiler will
  emit a warning and ignore it.
* ``abort()`` intrinsic function: no Vulkan equivalent. The compiler will emit
  an error.
* ``GetRenderTargetSampleCount()`` intrinsic function: no Vulkan equivalent.
  (Its GLSL counterpart is ``gl_NumSamples``, which is not available in GLSL for
  Vulkan.) The compiler will emit an error.
* ``GetRenderTargetSamplePosition()`` intrinsic function: no Vulkan equivalent.
  (``gl_SamplePosition`` provides similar functionality, but only for the
  sample currently being processed.) The compiler will emit an error.
* ``tex*()`` intrinsic functions: deprecated features. The compiler will
  emit errors.
* ``.GatherCmpGreen()``, ``.GatherCmpBlue()``, ``.GatherCmpAlpha()`` intrinsic
  methods: no Vulkan equivalent. (The SPIR-V ``OpImageDrefGather`` instruction
  does not take a component as input.) The compiler will emit an error.
* Since ``StructuredBuffer``, ``RWStructuredBuffer``, ``ByteAddressBuffer``, and
  ``RWByteAddressBuffer`` are not represented as image types in SPIR-V, the
  output unsigned integer ``status`` argument of their ``Load*`` methods is not
  supported. Using these methods with the ``status`` argument will cause a
  compiler error.
* Applying the ``row_major`` or ``column_major`` attribute to a stand-alone
  matrix is ignored by the compiler, because the ``RowMajor`` and ``ColMajor``
  decorations in SPIR-V may only be applied to members of structures. The
  compiler will issue a warning.
* The Hull shader ``partitioning`` attribute may not have the ``pow2`` value.
  The compiler will emit an error. Other attribute values are supported and
  described in the `Hull Entry Point Attributes`_ section.
* ``cbuffer``/``tbuffer`` member initializer: no Vulkan equivalent. The compiler
  will emit a warning and ignore it.

Appendix
==========

Appendix A. Matrix Representation
---------------------------------

Consider a matrix in HLSL defined as ``float2x3 m;``. Conceptually, this is a matrix with 2 rows and 3 columns.
This means that you can access its elements via expressions such as ``m[i][j]``, where ``i`` can be ``{0, 1}`` and ``j`` can be ``{0, 1, 2}``.

Now let's look at how matrices are defined in SPIR-V:

.. code:: spirv

  %columnType = OpTypeVector %float <number of rows>
     %matType = OpTypeMatrix %columnType <number of columns>

As you can see, SPIR-V conceptually represents matrices as a collection of vectors where each vector is a *column*.

Now, let's represent our ``float2x3`` matrix in SPIR-V. If we choose a naive translation (3 columns, each of which is a vector of size 2), we get:

.. code:: spirv

      %v2float = OpTypeVector %float 2
  %mat3v2float = OpTypeMatrix %v2float 3

Now, let's use this naive translation to access the matrix (e.g. ``m[0][2]``). This is evaluated by first finding ``n = m[0]``, and then finding ``n[2]``.
Notice that in HLSL, ``m[0]`` represents a row, which is a vector of size 3. But accessing the first dimension of the SPIR-V matrix gives us
the first column, which is a vector of size 2.

.. code:: spirv

  ; n is a vector of size 2
  %n = OpAccessChain %v2float %m %int_0

Notice that in the HLSL access ``m[i][j]``, ``i`` can be ``{0, 1}`` and ``j`` can be ``{0, 1, 2}``.
But in the SPIR-V ``OpAccessChain`` access, the first index (``i``) can be ``{0, 1, 2}`` and the second index (``j``) can be ``{0, 1}``.
Therefore, the naive translation does not work well with indexing.

As a result, we must translate a given HLSL ``float2x3`` matrix (with 2 rows and 3 columns) as a SPIR-V matrix with 3 rows and 2 columns:

.. code:: spirv

      %v3float = OpTypeVector %float 3
  %mat2v3float = OpTypeMatrix %v3float 2

This way, all accesses into the matrix can be naturally handled correctly.

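
The index-range argument above can be checked with a small model. The Python
sketch below is not taken from the compiler; the helper names
``hlsl_index_ranges`` and ``spirv_index_ranges`` are hypothetical, and the model
assumes only the shape rules stated above (the first ``OpAccessChain`` index
selects one of the matrix's vectors, the second selects a component of it).

```python
def hlsl_index_ranges(rows, cols):
    """Valid (i, j) ranges for m[i][j] on an HLSL float<rows>x<cols>."""
    return range(rows), range(cols)


def spirv_index_ranges(vector_size, vector_count):
    """Valid index ranges for a two-index access into
    OpTypeMatrix(OpTypeVector(float, vector_size), vector_count).
    The first index picks a vector, the second picks a component."""
    return range(vector_count), range(vector_size)


rows, cols = 2, 3  # float2x3

hlsl = hlsl_index_ranges(rows, cols)
# Naive translation: 3 vectors of size 2 (%mat3v2float).
naive = spirv_index_ranges(vector_size=rows, vector_count=cols)
# Chosen translation: 2 vectors of size 3 (%mat2v3float).
swapped = spirv_index_ranges(vector_size=cols, vector_count=rows)

naive_matches = naive == hlsl
swapped_matches = swapped == hlsl
print(naive_matches, swapped_matches)  # False True
```

Only the second translation gives ``m[i][j]`` the same valid index ranges on
both sides, which is why HLSL indices can be passed through unchanged.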
Packing
~~~~~~~

The HLSL ``row_major`` and ``column_major`` type modifiers change the way packing is done.
The following table provides an example that should make our translation clearer:

+------------------+---------------------------+---------------------------+-----------------------------+-------------------+
| Host CPU Data    | HLSL Variable             | GPU (HLSL Representation) | GPU (SPIR-V Representation) | SPIR-V Decoration |
+==================+===========================+===========================+=============================+===================+
|``{1,2,3,4,5,6}`` | ``float2x3``              | ``[1 3 5]``               | ``[1 2]``                   |                   |
|                  |                           |                           |                             |                   |
|                  |                           | ``[2 4 6]``               | ``[3 4]``                   | ``RowMajor``      |
|                  |                           |                           |                             |                   |
|                  |                           |                           | ``[5 6]``                   |                   |
+------------------+---------------------------+---------------------------+-----------------------------+-------------------+
|``{1,2,3,4,5,6}`` | ``column_major float2x3`` | ``[1 3 5]``               | ``[1 2]``                   |                   |
|                  |                           |                           |                             |                   |
|                  |                           | ``[2 4 6]``               | ``[3 4]``                   | ``RowMajor``      |
|                  |                           |                           |                             |                   |
|                  |                           |                           | ``[5 6]``                   |                   |
+------------------+---------------------------+---------------------------+-----------------------------+-------------------+
|``{1,2,3,4,5,6}`` | ``row_major float2x3``    | ``[1 2 3]``               | ``[1 4]``                   |                   |
|                  |                           |                           |                             |                   |
|                  |                           | ``[4 5 6]``               | ``[2 5]``                   | ``ColMajor``      |
|                  |                           |                           |                             |                   |
|                  |                           |                           | ``[3 6]``                   |                   |
+------------------+---------------------------+---------------------------+-----------------------------+-------------------+
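
The rows of the table can be reproduced with a short model. The Python sketch
below is an illustration only, not the compiler's code; ``hlsl_matrix``,
``spirv_view``, and ``spirv_decoration`` are hypothetical helpers. It assumes,
as the first two table rows show, that an unqualified ``float2x3`` is packed
like ``column_major``.

```python
def hlsl_matrix(host, rows, cols, major="column_major"):
    """Arrange flat host data as the rows of an HLSL float<rows>x<cols>."""
    if major == "column_major":  # consecutive host values fill a column
        return [[host[c * rows + r] for c in range(cols)] for r in range(rows)]
    if major == "row_major":     # consecutive host values fill a row
        return [[host[r * cols + c] for c in range(cols)] for r in range(rows)]
    raise ValueError(f"unknown majorness: {major}")


def spirv_view(hlsl_rows):
    """Each HLSL row becomes one SPIR-V column vector, so the conceptual
    SPIR-V matrix is the transpose of the HLSL one."""
    return [list(col) for col in zip(*hlsl_rows)]


def spirv_decoration(major):
    """The SPIR-V decoration is the opposite of the HLSL modifier."""
    return {"column_major": "RowMajor", "row_major": "ColMajor"}[major]


host = [1, 2, 3, 4, 5, 6]
for major in ("column_major", "row_major"):
    m = hlsl_matrix(host, 2, 3, major)
    print(major, m, spirv_view(m), spirv_decoration(major))
# column_major [[1, 3, 5], [2, 4, 6]] [[1, 2], [3, 4], [5, 6]] RowMajor
# row_major [[1, 2, 3], [4, 5, 6]] [[1, 4], [2, 5], [3, 6]] ColMajor
```

Because the SPIR-V matrix is the transpose of the HLSL one, swapping the
majorness decoration keeps the host data layout unchanged.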