Merge branch 'master' into user/texr/rt-merge-rebase

Tex Riddell 7 years ago
parent
commit
cb5f27c080

+ 61 - 16
docs/SPIR-V.rst

@@ -338,6 +338,37 @@ Validation is turned on by default as the last stage of SPIR-V CodeGen. Failing
 validation, which indicates there is a CodeGen bug, will trigger a fatal error.
 Please file an issue if you see that.
 
+Reflection
+----------
+
+Making reflection easier is one of the goals of SPIR-V CodeGen. This section
+provides guidelines about how to reflect on certain facts.
+
+Note that we generate ``OpName``/``OpMemberName`` instructions for various
+types/variables both explicitly defined in the source code and interally created
+by the compiler. These names are primarily for debugging purposes in the
+compiler. They have "no semantic impact and can safely be removed" according
+to the SPIR-V spec. And they are subject to changes without notice. So we do
+not suggest to use them for reflection.
+
+Read-only vs. read-write resource types
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are no clear and consistent decorations in the SPIR-V to show whether a
+resource type is translated from a read-only (RO) or read-write (RW) HLSL
+resource type. Instead, you need to use different checks for reflecting different
+resource types:
+
+* HLSL samplers: RO.
+* HLSL ``Buffer``/``RWBuffer``/``Texture*``/``RWTexture*``: Check the "Sampled"
+  operand in the ``OpTypeImage`` instruction they translated into. "2" means RW,
+  "1" means RO.
+* HLSL constant/texture/structured/byte buffers: Check both ``Block``/``BufferBlock``
+  and ``NonWritable`` decoration. If decorated with ``Block`` (``cbuffer`` &
+  ``ConstantBuffer``), then RO; if decorated with ``BufferBlock`` and ``NonWritable``
+  (``tbuffer``, ``TextureBuffer``, ``StructuredBuffer``), then RO; Otherwise, RW.
+
+
 HLSL Types
 ==========
 
@@ -555,18 +586,30 @@ Please see the following sections for the details of each type. As a summary:
 =========================== ================== ========================== ==================== =================
          HLSL Type          Vulkan Buffer Type Default Memory Layout Rule SPIR-V Storage Class SPIR-V Decoration
 =========================== ================== ========================== ==================== =================
-``cbuffer``                   Uniform Buffer      GLSL ``std140``            ``Uniform``        ``Block``
-``ConstantBuffer``            Uniform Buffer      GLSL ``std140``            ``Uniform``        ``Block``
-``tbuffer``                   Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``TextureBuffer``             Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``StructuredBuffer``          Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``RWStructuredBuffer``        Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``AppendStructuredBuffer``    Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``ConsumeStructuredBuffer``   Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``ByteAddressBuffer``         Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
-``RWByteAddressBuffer``       Storage Buffer      GLSL ``std430``            ``Uniform``        ``BufferBlock``
+``cbuffer``                   Uniform Buffer    Relaxed GLSL ``std140``      ``Uniform``        ``Block``
+``ConstantBuffer``            Uniform Buffer    Relaxed GLSL ``std140``      ``Uniform``        ``Block``
+``tbuffer``                   Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``TextureBuffer``             Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``StructuredBuffer``          Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``RWStructuredBuffer``        Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``AppendStructuredBuffer``    Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``ConsumeStructuredBuffer``   Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``ByteAddressBuffer``         Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
+``RWByteAddressBuffer``       Storage Buffer    Relaxed GLSL ``std430``      ``Uniform``        ``BufferBlock``
 =========================== ================== ========================== ==================== =================
 
+In the above, "relaxed" GLSL ``std140``/``std430`` rules mean GLSL
+``std140``/``std430`` rules with the following modification for vector type
+alignment:
+
+1. The alignment of a vector type is set to be the alignment of its element type
+2. If the above causes an improper straddle (see Vulkan spec
+   `14.5.4. Offset and Stride Assignment <https://www.khronos.org/registry/vulkan/specs/1.0-extensions/html/vkspec.html#interfaces-resources-layout>`_),
+   the alignment will be set to 16 bytes.
+
+To use the conventional GLSL ``std140``/``std430`` rules for resources,
+you can use the ``-fvk-use-glsl-layout`` option.
+
 To know more about the Vulkan buffer types, please refer to the Vulkan spec
 `13.1 Descriptor Types <https://www.khronos.org/registry/vulkan/specs/1.0-wsi_extensions/html/vkspec.html#descriptorsets-types>`_.
 
@@ -577,8 +620,8 @@ These two buffer types are treated as uniform buffers using Vulkan's
 terminology. They are translated into an ``OpTypeStruct`` with the
 necessary layout decorations (``Offset``, ``ArrayStride``, ``MatrixStride``,
 ``RowMajor``, ``ColMajor``) and the ``Block`` decoration. The layout rule
-used is GLSL ``std140`` (by default). A variable declared as one of these
-types will be placed in the ``Uniform`` storage class.
+used is relaxed GLSL ``std140`` (by default). A variable declared as one of
+these types will be placed in the ``Uniform`` storage class.
 
 For example, for the following HLSL source code:
 
@@ -616,8 +659,8 @@ terminology. They are translated into an ``OpTypeStruct`` with the
 necessary layout decorations (``Offset``, ``ArrayStride``, ``MatrixStride``,
 ``RowMajor``, ``ColMajor``) and the ``BufferBlock`` decoration. All the struct
 members are also decorated with ``NonWritable`` decoration. The layout rule
-used is GLSL ``std430`` (by default). A variable declared as one of these
-types will be placed in the ``Uniform`` storage class.
+used is relaxed GLSL ``std430`` (by default). A variable declared as one of
+these types will be placed in the ``Uniform`` storage class.
 
 
 ``StructuredBuffer`` and ``RWStructuredBuffer``
@@ -627,7 +670,7 @@ types will be placed in the ``Uniform`` storage class.
 using Vulkan's terminology. It is translated into an ``OpTypeStruct`` containing
 an ``OpTypeRuntimeArray`` of type ``T``, with necessary layout decorations
 (``Offset``, ``ArrayStride``, ``MatrixStride``, ``RowMajor``, ``ColMajor``) and
-the ``BufferBlock`` decoration.  The default layout rule used is GLSL
+the ``BufferBlock`` decoration.  The default layout rule used is relaxed GLSL
 ``std430``. A variable declared as one of these types will be placed in the
 ``Uniform`` storage class.
 
@@ -678,7 +721,7 @@ storage buffer using Vulkan's terminology. It is translated into an
 ``OpTypeStruct`` containing an ``OpTypeRuntimeArray`` of type ``T``, with
 necessary layout decorations (``Offset``, ``ArrayStride``, ``MatrixStride``,
 ``RowMajor``, ``ColMajor``) and the ``BufferBlock`` decoration. The default
-layout rule used is GLSL ``std430``.
+layout rule used is relaxed GLSL ``std430``.
 
 A variable declared as one of these types will be placed in the ``Uniform``
 storage class. Besides, each variable will have an associated counter variable
@@ -2519,6 +2562,8 @@ codegen for Vulkan:
 - ``-fvk-ignore-unused-resources``: Avoids emitting SPIR-V code for resources
   defined but not statically referenced by the call tree of the entry point
   in question.
+- ``-fvk-use-glsl-layout``: Uses conventional GLSL ``std140``/``std430`` layout
+  rules for resources.
 - ``-fvk-invert-y``: Inverts SV_Position.y before writing to stage output.
   Used to accommodate the difference between Vulkan's coordinate system and
   DirectX's. Only allowed in VS/DS/GS.
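
The read-only vs. read-write checks listed in the new Reflection section of docs/SPIR-V.rst can be sketched as below. This is a minimal illustration and not part of the commit: the `Access` enum, the `BufferInfo` struct, and both function names are hypothetical, and the sketch assumes a reflection tool has already extracted the "Sampled" operand of `OpTypeImage` and the relevant decorations from the SPIR-V module.

```cpp
#include <cassert>
#include <cstdint>

enum class Access { ReadOnly, ReadWrite };

// Hypothetical summary of the decorations a reflection tool found on the
// struct type backing a constant/texture/structured/byte buffer.
struct BufferInfo {
  bool hasBlock;        // decorated with Block
  bool hasBufferBlock;  // decorated with BufferBlock
  bool hasNonWritable;  // decorated with NonWritable
};

// Buffer/RWBuffer/Texture*/RWTexture*: inspect the "Sampled" operand of the
// OpTypeImage instruction. Per the documentation, 1 means RO and 2 means RW.
Access classifyImage(uint32_t sampledOperand) {
  assert(sampledOperand == 1 || sampledOperand == 2);
  return sampledOperand == 2 ? Access::ReadWrite : Access::ReadOnly;
}

// Constant/texture/structured/byte buffers: Block is RO, BufferBlock with
// NonWritable is RO, anything else is RW.
Access classifyBuffer(const BufferInfo &info) {
  if (info.hasBlock)
    return Access::ReadOnly;
  if (info.hasBufferBlock && info.hasNonWritable)
    return Access::ReadOnly;
  return Access::ReadWrite;
}
```

HLSL samplers need no check under these rules, since they are always read-only.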

+ 1 - 1
include/dxc/Support/Global.h

@@ -187,7 +187,7 @@ inline void OutputDebugFormatA(_In_ _Printf_format_string_ _Null_terminated_ con
 //
 #define DXASSERT(exp, fmt, ...)\
   do { _Analysis_assume_(exp); if(!(exp)) {                              \
-    OutputDebugFormatA("Error: \t%s\t\nFile:\n%s(%d)\t\nFunc:\t%s.\n\t" fmt "\n", "!(" #exp ")", __FILE__, __LINE__, __FUNCTION__, __VA_ARGS__); \
+    OutputDebugFormatA("Error: \t%s\nFile:\n%s(%d)\nFunc:\t%s.\n\t" fmt "\n", "!(" #exp ")", __FILE__, __LINE__, __FUNCTION__, __VA_ARGS__); \
     __debugbreak();\
   } } while(0)
 

+ 42 - 41
include/dxc/Support/HLSLOptions.h

@@ -110,49 +110,49 @@ public:
   llvm::StringRef RootSignatureDefine; // OPT_rootsig_define
   llvm::StringRef FloatDenormalMode; // OPT_denorm
 
-  bool AllResourcesBound; // OPT_all_resources_bound
-  bool AstDump; // OPT_ast_dump
-  bool ColorCodeAssembly; // OPT_Cc
-  bool CodeGenHighLevel; // OPT_fcgl
-  bool DebugInfo; // OPT__SLASH_Zi
-  bool DebugNameForBinary; // OPT_Zsb
-  bool DebugNameForSource; // OPT_Zss
-  bool DumpBin;        // OPT_dumpbin
-  bool WarningAsError; // OPT__SLASH_WX
-  bool IEEEStrict;     // OPT_Gis
-  bool IgnoreLineDirectives; // OPT_ignore_line_directives
-  bool DefaultColMajor;  // OPT_Zpc
-  bool DefaultRowMajor;  // OPT_Zpr
-  bool DisableValidation; // OPT_VD
-  unsigned OptLevel;      // OPT_O0/O1/O2/O3
-  bool DisableOptimizations; // OPT_Od
-  bool AvoidFlowControl;     // OPT_Gfa
-  bool PreferFlowControl;    // OPT_Gfp
-  bool EnableStrictMode;     // OPT_Ges
-  unsigned long HLSLVersion; // OPT_hlsl_version (2015-2018)
-  bool Enable16BitTypes; // OPT_enable_16bit_types
-  bool OptDump; // OPT_ODump - dump optimizer commands
+  bool AllResourcesBound = false; // OPT_all_resources_bound
+  bool AstDump = false; // OPT_ast_dump
+  bool ColorCodeAssembly = false; // OPT_Cc
+  bool CodeGenHighLevel = false; // OPT_fcgl
+  bool DebugInfo = false; // OPT__SLASH_Zi
+  bool DebugNameForBinary = false; // OPT_Zsb
+  bool DebugNameForSource = false; // OPT_Zss
+  bool DumpBin = false;        // OPT_dumpbin
+  bool WarningAsError = false; // OPT__SLASH_WX
+  bool IEEEStrict = false;     // OPT_Gis
+  bool IgnoreLineDirectives = false; // OPT_ignore_line_directives
+  bool DefaultColMajor = false;  // OPT_Zpc
+  bool DefaultRowMajor = false;  // OPT_Zpr
+  bool DisableValidation = false; // OPT_VD
+  unsigned OptLevel = 0;      // OPT_O0/O1/O2/O3
+  bool DisableOptimizations = false; // OPT_Od
+  bool AvoidFlowControl = false;     // OPT_Gfa
+  bool PreferFlowControl = false;    // OPT_Gfp
+  bool EnableStrictMode = false;     // OPT_Ges
+  unsigned long HLSLVersion = 0; // OPT_hlsl_version (2015-2018)
+  bool Enable16BitTypes = false; // OPT_enable_16bit_types
+  bool OptDump = false; // OPT_ODump - dump optimizer commands
   bool OutputWarnings = true; // OPT_no_warnings
   bool ShowHelp = false;  // OPT_help
-  bool UseColor; // OPT_Cc
-  bool UseHexLiterals; // OPT_Lx
-  bool UseInstructionByteOffsets; // OPT_No
-  bool UseInstructionNumbers; // OPT_Ni
-  bool NotUseLegacyCBufLoad;  // OPT_not_use_legacy_cbuf_load
-  bool PackPrefixStable;  // OPT_pack_prefix_stable
-  bool PackOptimized;  // OPT_pack_optimized
-  bool DisplayIncludeProcess; // OPT__vi
-  bool RecompileFromBinary; // OPT _Recompile (Recompiling the DXBC binary file not .hlsl file)
-  bool StripDebug; // OPT Qstrip_debug
-  bool StripRootSignature; // OPT_Qstrip_rootsignature
-  bool StripPrivate; // OPT_Qstrip_priv
-  bool StripReflection; // OPT_Qstrip_reflect
-  bool ExtractRootSignature; // OPT_extractrootsignature
-  bool DisassembleColorCoded; // OPT_Cc
-  bool DisassembleInstNumbers; //OPT_Ni
-  bool DisassembleByteOffset; //OPT_No
-  bool DisaseembleHex; //OPT_Lx
-  bool LegacyMacroExpansion; // OPT_flegacy_macro_expansion
+  bool UseColor = false; // OPT_Cc
+  bool UseHexLiterals = false; // OPT_Lx
+  bool UseInstructionByteOffsets = false; // OPT_No
+  bool UseInstructionNumbers = false; // OPT_Ni
+  bool NotUseLegacyCBufLoad = false;  // OPT_not_use_legacy_cbuf_load
+  bool PackPrefixStable = false;  // OPT_pack_prefix_stable
+  bool PackOptimized = false;  // OPT_pack_optimized
+  bool DisplayIncludeProcess = false; // OPT__vi
+  bool RecompileFromBinary = false; // OPT _Recompile (Recompiling the DXBC binary file not .hlsl file)
+  bool StripDebug = false; // OPT Qstrip_debug
+  bool StripRootSignature = false; // OPT_Qstrip_rootsignature
+  bool StripPrivate = false; // OPT_Qstrip_priv
+  bool StripReflection = false; // OPT_Qstrip_reflect
+  bool ExtractRootSignature = false; // OPT_extractrootsignature
+  bool DisassembleColorCoded = false; // OPT_Cc
+  bool DisassembleInstNumbers = false; //OPT_Ni
+  bool DisassembleByteOffset = false; //OPT_No
+  bool DisaseembleHex = false; //OPT_Lx
+  bool LegacyMacroExpansion = false; // OPT_flegacy_macro_expansion
 
   bool IsRootSignatureProfile();
   bool IsLibraryProfile();
@@ -162,6 +162,7 @@ public:
   bool GenSPIRV; // OPT_spirv
   bool VkIgnoreUnusedResources; // OPT_fvk_ignore_used_resources
   bool VkInvertY; // OPT_fvk_invert_y
+  bool VkUseGlslLayout; // OPT_fvk_use_glsl_layout
   llvm::StringRef VkStageIoOrder; // OPT_fvk_stage_io_order
   llvm::SmallVector<uint32_t, 4> VkBShift; // OPT_fvk_b_shift
   llvm::SmallVector<uint32_t, 4> VkTShift; // OPT_fvk_t_shift

+ 2 - 0
include/dxc/Support/HLSLOptions.td

@@ -250,6 +250,8 @@ def fvk_u_shift : MultiArg<["-"], "fvk-u-shift", 2>, MetaVarName<"<shift> <space
   HelpText<"Specify Vulkan binding number shift for u-type register">;
 def fvk_invert_y: Flag<["-"], "fvk-invert-y">, Group<spirv_Group>, Flags<[CoreOption, DriverOption]>,
   HelpText<"Invert SV_Position.y in VS/DS/GS to accommodate Vulkan's coordinate system">;
+def fvk_use_glsl_layout: Flag<["-"], "fvk-use-glsl-layout">, Group<spirv_Group>, Flags<[CoreOption, DriverOption]>,
+  HelpText<"Use conventional GLSL std140/std430 layout for resources">;
 // SPIRV Change Ends
 
 //////////////////////////////////////////////////////////////////////////////

+ 3 - 2
include/llvm/llvm_assert/assert.h

@@ -33,12 +33,13 @@ extern "C" {
 #endif
 void llvm_assert(const char *_Message,
                  const char *_File,
-                 unsigned _Line);
+                 unsigned _Line,
+                 const char *_Function);
 #ifdef __cplusplus
 }
 #endif
 
-#define assert(_Expression) ((void)( (!!(_Expression)) || (llvm_assert(#_Expression, __FILE__, __LINE__), 0) ))
+#define assert(_Expression) ((void)( (!!(_Expression)) || (llvm_assert(#_Expression, __FILE__, __LINE__, __FUNCTION__), 0) ))
 
 #endif  /* NDEBUG */
 

+ 2 - 0
lib/DxcSupport/HLSLOptions.cpp

@@ -483,6 +483,7 @@ int ReadDxcOpts(const OptTable *optionTable, unsigned flagsToInclude,
 #ifdef ENABLE_SPIRV_CODEGEN
   const bool genSpirv = opts.GenSPIRV = Args.hasFlag(OPT_spirv, OPT_INVALID, false);
   opts.VkInvertY = Args.hasFlag(OPT_fvk_invert_y, OPT_INVALID, false);
+  opts.VkUseGlslLayout = Args.hasFlag(OPT_fvk_use_glsl_layout, OPT_INVALID, false);
   opts.VkIgnoreUnusedResources = Args.hasFlag(OPT_fvk_ignore_unused_resources, OPT_INVALID, false);
 
   // Collects the arguments for -fvk-{b|s|t|u}-shift.
@@ -522,6 +523,7 @@ int ReadDxcOpts(const OptTable *optionTable, unsigned flagsToInclude,
 #else
   if (Args.hasFlag(OPT_spirv, OPT_INVALID, false) ||
       Args.hasFlag(OPT_fvk_invert_y, OPT_INVALID, false) ||
+      Args.hasFlag(OPT_fvk_use_glsl_layout, OPT_INVALID, false) ||
       Args.hasFlag(OPT_fvk_ignore_unused_resources, OPT_INVALID, false) ||
       !Args.getLastArgValue(OPT_fvk_stage_io_order_EQ).empty() ||
       !Args.getLastArgValue(OPT_fvk_b_shift).empty() ||

+ 4 - 1
lib/Support/assert.cpp

@@ -9,9 +9,12 @@
 
 #include "assert.h"
 #include "windows.h"
+#include "dxc/Support/Global.h"
 
 void llvm_assert(_In_z_ const char *_Message,
                  _In_z_ const char *_File,
-                 _In_ unsigned _Line) {
+                 _In_ unsigned _Line,
+                 const char *_Function) {
+  OutputDebugFormatA("Error: assert(%s)\nFile:\n%s(%d)\nFunc:\t%s\n", _Message, _File, _Line, _Function);
   RaiseException(STATUS_LLVM_ASSERT, 0, 0, 0);
 }

+ 10 - 0
lib/Transforms/Scalar/Scalarizer.cpp

@@ -576,6 +576,16 @@ bool Scalarizer::visitShuffleVectorInst(ShuffleVectorInst &SVI) {
       Res[I] = Op0[Selector];
     else
       Res[I] = Op1[Selector - Op0.size()];
+    // HLSL Change Begins: (fix bug in upstream llvm)
+    if (ExtractElementInst *EA = dyn_cast<ExtractElementInst>(Res[I])) {
+      // Clone extractelement here, since it is associated with another inst.
+      // Otherwise it will be added to our Gather, and after the incoming
+      // instruction is processed, it will be replaced without updating our
+      // Gather entry.  This dead instruction will be accessed by finish(),
+      // causing assert or crash.
+      Res[I] = IRBuilder<>(SVI.getNextNode()).Insert(EA->clone());
+    }
+    // HLSL Change Ends
   }
   gather(&SVI, Res);
   return true;

+ 1 - 0
tools/clang/include/clang/SPIRV/EmitSPIRVOptions.h

@@ -20,6 +20,7 @@ struct EmitSPIRVOptions {
   bool defaultRowMajor;
   bool disableValidation;
   bool invertY;
+  bool useGlslLayout;
   bool ignoreUnusedResources;
   bool enable16BitTypes;
   llvm::StringRef stageIoOrder;

+ 10 - 7
tools/clang/lib/CodeGen/CGHLSLMS.cpp

@@ -27,6 +27,7 @@
 #include "clang/Lex/HLSLMacroExpander.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/StringSwitch.h"
+#include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/GetElementPtrTypeIterator.h"
@@ -3701,8 +3702,8 @@ static bool SimplifyBitCastGEP(GEPOperator *GEP, llvm::Type *FromTy, llvm::Type
   }
   return false;
 }
-
-static void SimplifyBitCast(BitCastOperator *BC, std::vector<Instruction *> &deadInsts) {
+typedef SmallPtrSet<Instruction *, 4> SmallInstSet;
+static void SimplifyBitCast(BitCastOperator *BC, SmallInstSet &deadInsts) {
   Value *Ptr = BC->getOperand(0);
   llvm::Type *FromTy = Ptr->getType();
   llvm::Type *ToTy = BC->getType();
@@ -3732,18 +3733,18 @@ static void SimplifyBitCast(BitCastOperator *BC, std::vector<Instruction *> &dea
     if (LoadInst *LI = dyn_cast<LoadInst>(U)) {
       if (SimplifyBitCastLoad(LI, FromTy, ToTy, Ptr)) {
         LI->dropAllReferences();
-        deadInsts.emplace_back(LI);
+        deadInsts.insert(LI);
       }
     } else if (StoreInst *SI = dyn_cast<StoreInst>(U)) {
       if (SimplifyBitCastStore(SI, FromTy, ToTy, Ptr)) {
         SI->dropAllReferences();
-        deadInsts.emplace_back(SI);
+        deadInsts.insert(SI);
       }
     } else if (GEPOperator *GEP = dyn_cast<GEPOperator>(U)) {
       if (SimplifyBitCastGEP(GEP, FromTy, ToTy, Ptr))
         if (Instruction *I = dyn_cast<Instruction>(GEP)) {
           I->dropAllReferences();
-          deadInsts.emplace_back(I);
+          deadInsts.insert(I);
         }
     } else if (CallInst *CI = dyn_cast<CallInst>(U)) {
       // Skip function call.
@@ -3991,7 +3992,7 @@ static Value * TryEvalIntrinsic(CallInst *CI, IntrinsicOp intriOp) {
 }
 
 static void SimpleTransformForHLDXIR(Instruction *I,
-                                     std::vector<Instruction *> &deadInsts) {
+                                     SmallInstSet &deadInsts) {
 
   unsigned opcode = I->getOpcode();
   switch (opcode) {
@@ -4048,11 +4049,13 @@ static void SimpleTransformForHLDXIR(Instruction *I,
 
 // Do simple transform to make later lower pass easier.
 static void SimpleTransformForHLDXIR(llvm::Module *pM) {
-  std::vector<Instruction *> deadInsts;
+  SmallInstSet deadInsts;
   for (Function &F : pM->functions()) {
     for (BasicBlock &BB : F.getBasicBlockList()) {
       for (BasicBlock::iterator Iter = BB.begin(); Iter != BB.end(); ) {
         Instruction *I = (Iter++);
+        if (deadInsts.count(I))
+          continue; // Skip dead instructions
         SimpleTransformForHLDXIR(I, deadInsts);
       }
     }
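
The switch from `std::vector` to a set in this file makes it cheap to ask whether an instruction was already marked dead by an earlier step of the same walk. The pattern can be sketched in isolation as below. This is a simplified stand-in, not the DXC code: `Inst`, `retires`, and `processAll` are hypothetical names, and `std::unordered_set` is used in place of `llvm::SmallPtrSet` so the sketch compiles without LLVM.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical instruction: processing it may retire one other instruction.
struct Inst {
  int id;
  Inst *retires;  // instruction made dead by processing this one, or nullptr
};

// Walks the instructions in order and returns the ids actually processed.
// Anything already recorded as dead is skipped instead of being visited again.
std::vector<int> processAll(std::vector<Inst> &insts) {
  std::unordered_set<const Inst *> dead;
  std::vector<int> processed;
  for (Inst &inst : insts) {
    if (dead.count(&inst))
      continue;  // skip dead instructions
    processed.push_back(inst.id);
    if (inst.retires)
      dead.insert(inst.retires);  // a set also ignores duplicate insertions
  }
  return processed;
}
```

With a vector, the same membership test would need a linear search, and an instruction marked dead twice would be stored twice.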

+ 50 - 6
tools/clang/lib/SPIRV/TypeTranslator.cpp

@@ -32,6 +32,15 @@ inline void roundToPow2(uint32_t *val, uint32_t pow2) {
   assert(pow2 != 0);
   *val = (*val + pow2 - 1) & ~(pow2 - 1);
 }
+
+/// Returns true if the given vector type (of the given size) crosses the
+/// 4-component vector boundary if placed at the given offset.
+bool improperStraddle(QualType type, int size, int offset) {
+  assert(TypeTranslator::isVectorType(type));
+  return size <= 16 ? offset / 16 != (offset + size - 1) / 16
+                    : offset % 16 != 0;
+}
+
 } // anonymous namespace
 
 bool TypeTranslator::isRelaxedPrecisionType(QualType type,
@@ -1079,11 +1088,13 @@ TypeTranslator::getLayoutDecorations(const DeclContext *decl, LayoutRule rule) {
     std::tie(memberAlignment, memberSize) =
         getAlignmentAndSize(fieldType, rule, isRowMajor, &stride);
 
+    alignUsingHLSLRelaxedLayout(fieldType, memberSize, &memberAlignment,
+                                &offset);
+
     // Each structure-type member must have an Offset Decoration.
     if (const auto *offsetAttr = field->getAttr<VKOffsetAttr>())
       offset = offsetAttr->getOffset();
-    else
-      roundToPow2(&offset, memberAlignment);
+
     decorations.push_back(Decoration::getOffset(*spirvContext, offset, index));
     offset += memberSize;
 
@@ -1330,6 +1341,37 @@ TypeTranslator::translateSampledTypeToImageFormat(QualType sampledType) {
   return spv::ImageFormat::Unknown;
 }
 
+void TypeTranslator::alignUsingHLSLRelaxedLayout(QualType fieldType,
+                                                 uint32_t fieldSize,
+                                                 uint32_t *fieldAlignment,
+                                                 uint32_t *currentOffset) {
+  bool fieldIsVecType = false;
+
+  if (!spirvOptions.useGlslLayout) {
+    // Adjust according to HLSL relaxed layout rules.
+    // Aligning vectors as their element types so that we can pack a float
+    // and a float3 tightly together.
+    QualType vecElemType = {};
+    if (fieldIsVecType = isVectorType(fieldType, &vecElemType)) {
+      uint32_t scalarAlignment = 0;
+      std::tie(scalarAlignment, std::ignore) =
+          getAlignmentAndSize(vecElemType, LayoutRule::Void, false, nullptr);
+      if (scalarAlignment <= 4)
+        *fieldAlignment = scalarAlignment;
+    }
+  }
+
+  roundToPow2(currentOffset, *fieldAlignment);
+
+  // Adjust according to HLSL relaxed layout rules.
+  // Bump to 4-component vector alignment if there is a bad straddle
+  if (!spirvOptions.useGlslLayout && fieldIsVecType &&
+      improperStraddle(fieldType, fieldSize, *currentOffset)) {
+    *fieldAlignment = kStd140Vec4Alignment;
+    roundToPow2(currentOffset, *fieldAlignment);
+  }
+}
+
 std::pair<uint32_t, uint32_t>
 TypeTranslator::getAlignmentAndSize(QualType type, LayoutRule rule,
                                     const bool isRowMajor, uint32_t *stride) {
@@ -1405,8 +1447,8 @@ TypeTranslator::getAlignmentAndSize(QualType type, LayoutRule rule,
         case BuiltinType::ULongLong:
           return {8, 8};
         default:
-          emitError("primitive type %0 unimplemented")
-              << builtinType->getTypeClassName();
+          emitError("alignment and size calculation for type %0 unimplemented")
+              << type;
           return {0, 0};
         }
   }
@@ -1463,10 +1505,12 @@ TypeTranslator::getAlignmentAndSize(QualType type, LayoutRule rule,
       std::tie(memberAlignment, memberSize) =
           getAlignmentAndSize(field->getType(), rule, isRowMajor, stride);
 
+      alignUsingHLSLRelaxedLayout(field->getType(), memberSize,
+                                  &memberAlignment, &structSize);
+
       // The base alignment of the structure is N, where N is the largest
       // base alignment value of any of its members...
       maxAlignment = std::max(maxAlignment, memberAlignment);
-      roundToPow2(&structSize, memberAlignment);
       structSize += memberSize;
     }
 
@@ -1504,7 +1548,7 @@ TypeTranslator::getAlignmentAndSize(QualType type, LayoutRule rule,
     return {alignment, size};
   }
 
-  emitError("type %0 unimplemented") << type->getTypeClassName();
+  emitError("alignment and size calculation for type %0 unimplemented") << type;
   return {0, 0};
 }
 

+ 10 - 1
tools/clang/lib/SPIRV/TypeTranslator.h

@@ -261,10 +261,19 @@ private:
   /// instructions and returns the <result-id>. Returns 0 on failure.
   uint32_t translateResourceType(QualType type, LayoutRule rule);
 
-  /// \bried For the given sampled type, returns the corresponding image format
+  /// \brief For the given sampled type, returns the corresponding image format
   /// that can be used to create an image object.
   spv::ImageFormat translateSampledTypeToImageFormat(QualType type);
 
+  /// \brief Aligns currentOffset properly to allow packing vectors in the HLSL
+  /// way: using the element type's alignment as the vector alignment, as long
+  /// as there is no improper straddle.
+  /// fieldSize and fieldAlignment are the original size and alignment
+  /// calculated without considering the HLSL vector relaxed rule.
+  void alignUsingHLSLRelaxedLayout(QualType fieldType, uint32_t fieldSize,
+                                   uint32_t *fieldAlignment,
+                                   uint32_t *currentOffset);
+
 public:
   /// \brief Returns the alignment and size in bytes for the given type
   /// according to the given LayoutRule.

+ 1 - 1
tools/clang/test/CodeGenSPIRV/method.append-structured-buffer.get-dimensions.hlsl

@@ -1,4 +1,4 @@
-// Run: %dxc -T vs_6_0 -E main
+// Run: %dxc -T vs_6_0 -E main -fvk-use-glsl-layout
 
 struct S {
     float a;

+ 1 - 1
tools/clang/test/CodeGenSPIRV/method.consume-structured-buffer.get-dimensions.hlsl

@@ -1,4 +1,4 @@
-// Run: %dxc -T vs_6_0 -E main
+// Run: %dxc -T vs_6_0 -E main -fvk-use-glsl-layout
 
 struct S {
     float a;

+ 1 - 1
tools/clang/test/CodeGenSPIRV/method.structured-buffer.get-dimensions.hlsl

@@ -1,4 +1,4 @@
-// Run: %dxc -T ps_6_0 -E main
+// Run: %dxc -T ps_6_0 -E main -fvk-use-glsl-layout
 
 struct SBuffer {
   float4   f1;

+ 1 - 1
tools/clang/test/CodeGenSPIRV/vk.layout.cbuffer.std140.hlsl

@@ -1,4 +1,4 @@
-// Run: %dxc -T vs_6_0 -E main
+// Run: %dxc -T vs_6_0 -E main -fvk-use-glsl-layout
 
 struct R {     // Alignment                           Offset     Size       Next
     float2 rf; // 8(vec2)                          -> 0        + 8(vec2)  = 8

+ 1 - 1
tools/clang/test/CodeGenSPIRV/vk.layout.push-constant.std430.hlsl

@@ -1,4 +1,4 @@
-// Run: %dxc -T vs_6_0 -E main
+// Run: %dxc -T vs_6_0 -E main -fvk-use-glsl-layout
 
 // CHECK: OpDecorate %_arr_v2float_uint_3 ArrayStride 8
 // CHECK: OpDecorate %_arr_mat3v2float_uint_2 ArrayStride 32

+ 1 - 1
tools/clang/test/CodeGenSPIRV/vk.layout.sbuffer.std430.hlsl

@@ -1,4 +1,4 @@
-// Run: %dxc -T ps_6_0 -E main
+// Run: %dxc -T ps_6_0 -E main -fvk-use-glsl-layout
 
 struct R {     // Alignment       Offset     Size       Next
     float2 rf; // 8(vec2)      -> 0        + 8(vec2)  = 8

+ 124 - 0
tools/clang/test/CodeGenSPIRV/vk.layout.vector.relaxed.hlsl

@@ -0,0 +1,124 @@
+// Run: %dxc -T ps_6_0 -E main
+
+    // For ConstantBuffer & cbuffer
+// CHECK: OpMemberDecorate %S 0 Offset 0
+// CHECK: OpMemberDecorate %S 1 Offset 4
+// CHECK: OpMemberDecorate %S 2 Offset 16
+// CHECK: OpMemberDecorate %S 3 Offset 28
+// CHECK: OpMemberDecorate %S 4 Offset 32
+// CHECK: OpMemberDecorate %S 5 Offset 36
+// CHECK: OpMemberDecorate %S 6 Offset 44
+// CHECK: OpMemberDecorate %S 7 Offset 48
+// CHECK: OpMemberDecorate %S 8 Offset 56
+// CHECK: OpMemberDecorate %S 9 Offset 64
+// CHECK: OpMemberDecorate %S 10 Offset 80
+// CHECK: OpMemberDecorate %S 11 Offset 92
+// CHECK: OpMemberDecorate %S 12 Offset 96
+// CHECK: OpMemberDecorate %S 13 Offset 112
+// CHECK: OpMemberDecorate %S 14 Offset 128
+// CHECK: OpMemberDecorate %S 15 Offset 140
+// CHECK: OpMemberDecorate %S 16 Offset 144
+// CHECK: OpMemberDecorate %S 17 Offset 160
+// CHECK: OpMemberDecorate %S 18 Offset 176
+// CHECK: OpMemberDecorate %S 19 Offset 192
+// CHECK: OpMemberDecorate %S 20 Offset 208
+// CHECK: OpMemberDecorate %S 21 Offset 240
+// CHECK: OpMemberDecorate %S 22 Offset 272
+// CHECK: OpMemberDecorate %S 23 Offset 304
+
+    // For StructuredBuffer & tbuffer
+// CHECK: OpMemberDecorate %S_0 0 Offset 0
+// CHECK: OpMemberDecorate %S_0 1 Offset 4
+// CHECK: OpMemberDecorate %S_0 2 Offset 16
+// CHECK: OpMemberDecorate %S_0 3 Offset 28
+// CHECK: OpMemberDecorate %S_0 4 Offset 32
+// CHECK: OpMemberDecorate %S_0 5 Offset 36
+// CHECK: OpMemberDecorate %S_0 6 Offset 44
+// CHECK: OpMemberDecorate %S_0 7 Offset 48
+// CHECK: OpMemberDecorate %S_0 8 Offset 56
+// CHECK: OpMemberDecorate %S_0 9 Offset 64
+// CHECK: OpMemberDecorate %S_0 10 Offset 80
+// CHECK: OpMemberDecorate %S_0 11 Offset 92
+// CHECK: OpMemberDecorate %S_0 12 Offset 96
+// CHECK: OpMemberDecorate %S_0 13 Offset 112
+// CHECK: OpMemberDecorate %S_0 14 Offset 128
+// CHECK: OpMemberDecorate %S_0 15 Offset 140
+// CHECK: OpMemberDecorate %S_0 16 Offset 144
+// CHECK: OpMemberDecorate %S_0 17 Offset 160
+// CHECK: OpMemberDecorate %S_0 18 Offset 176
+// CHECK: OpMemberDecorate %S_0 19 Offset 192
+// CHECK: OpMemberDecorate %S_0 20 Offset 196
+// CHECK: OpMemberDecorate %S_0 21 Offset 208
+// CHECK: OpMemberDecorate %S_0 22 Offset 240
+// CHECK: OpMemberDecorate %S_0 23 Offset 272
+
+// CHECK: OpDecorate %_runtimearr_T ArrayStride 288
+
+// CHECK:     %type_ConstantBuffer_T = OpTypeStruct %S
+// CHECK:                         %T = OpTypeStruct %S_0
+// CHECK:   %type_StructuredBuffer_T = OpTypeStruct %_runtimearr_T
+// CHECK: %type_RWStructuredBuffer_T = OpTypeStruct %_runtimearr_T
+// CHECK:              %type_TBuffer = OpTypeStruct %S_0
+
+// CHECK:   %MyCBuffer = OpVariable %_ptr_Uniform_type_ConstantBuffer_T Uniform
+// CHECK:   %MySBuffer = OpVariable %_ptr_Uniform_type_StructuredBuffer_T Uniform
+// CHECK: %MyRWSBuffer = OpVariable %_ptr_Uniform_type_RWStructuredBuffer_T Uniform
+// CHECK:     %CBuffer = OpVariable %_ptr_Uniform_type_ConstantBuffer_T Uniform
+// CHECK:     %TBuffer = OpVariable %_ptr_Uniform_type_TBuffer Uniform
+
+struct S {
+    float  f0;
+    float3 f1;
+
+    float3 f2;
+    float1 f3;
+
+    float  f4;
+    float2 f5;
+    float1 f6;
+
+    float2 f7;
+    float2 f8;
+
+    float2 f9;
+    float3 f10;
+    float  f11;
+
+    float1 f12;
+    float4 f13;
+    float3 f14;
+    float  f15;
+
+    float1 f16[1];
+    float3 f17[1];
+
+    float3 f18[1];
+    float  f19[1];
+
+    float1 f20[2];
+    float3 f21[2];
+
+    float3 f22[2];
+    float  f23[2];
+};
+
+struct T {
+    S s;
+};
+
+
+    ConstantBuffer<T> MyCBuffer;
+  StructuredBuffer<T> MySBuffer;
+RWStructuredBuffer<T> MyRWSBuffer;
+
+cbuffer CBuffer {
+    S CB_s;
+};
+
+tbuffer TBuffer {
+    S TB_s;
+};
+
+float4 main() : SV_Target {
+    return MyCBuffer.s.f0 + MySBuffer[0].s.f4 + CB_s.f11 + TB_s.f15;
+}

+ 1 - 0
tools/clang/tools/dxcompiler/dxcompilerobj.cpp

@@ -447,6 +447,7 @@ public:
           spirvOpts.codeGenHighLevel = opts.CodeGenHighLevel;
           spirvOpts.disableValidation = opts.DisableValidation;
           spirvOpts.invertY = opts.VkInvertY;
+          spirvOpts.useGlslLayout = opts.VkUseGlslLayout;
           spirvOpts.ignoreUnusedResources = opts.VkIgnoreUnusedResources;
           spirvOpts.defaultRowMajor = opts.DefaultRowMajor;
           spirvOpts.stageIoOrder = opts.VkStageIoOrder;

+ 9 - 2
tools/clang/unittests/HLSL/FileCheckerTest.cpp

@@ -188,8 +188,12 @@ static string trim(string value) {
       argStrings = hlsl::options::MainArgs(splitArgs);
       std::string errorString;
       llvm::raw_string_ostream errorStream(errorString);
-      IFT(ReadDxcOpts(hlsl::options::getHlslOptTable(), /*flagsToInclude*/ 0,
-                      argStrings, Opts, errorStream));
+      RunResult = ReadDxcOpts(hlsl::options::getHlslOptTable(), /*flagsToInclude*/ 0,
+                              argStrings, Opts, errorStream);
+      errorStream.flush();
+      if (RunResult) {
+        StdErr = errorString;
+      }
     }
 
     void FileRunCommandPart::RunDxc(const FileRunCommandPart *Prior) {
@@ -223,6 +227,9 @@ static string trim(string value) {
 
       HRESULT resultStatus;
 
+      if (RunResult)  // opt parsing already failed
+        return;
+
       IFT(DllSupport->CreateInstance(CLSID_DxcLibrary, &pLibrary));
       IFT(pLibrary->CreateBlobFromFile(CommandFileName, nullptr, &pSource));
       IFT(pLibrary->CreateIncludeHandler(&pIncludeHandler));

+ 5 - 0
tools/clang/unittests/SPIRV/CodeGenSPIRVTest.cpp

@@ -1217,6 +1217,11 @@ TEST_F(FileTest, VulkanLayout64BitTypesStd430) {
 TEST_F(FileTest, VulkanLayout64BitTypesStd140) {
   runFileTest("vk.layout.64bit-types.std140.hlsl");
 }
+TEST_F(FileTest, VulkanLayoutVectorRelaxedLayout) {
+  // Allows vectors to be aligned to their element type's alignment, as long
+  // as that does not cause an improper straddle
+  runFileTest("vk.layout.vector.relaxed.hlsl");
+}
 
 TEST_F(FileTest, VulkanLayoutPushConstantStd430) {
   runFileTest("vk.layout.push-constant.std430.hlsl");
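
Note on the test comment above: the "improper straddle" condition can be sketched as follows. This is a minimal illustration that assumes the rule as stated by Vulkan's relaxed block layout: a vector of 16 bytes or less must not cross a 16-byte boundary, and a larger vector must start on one. It is not the compiler's actual implementation, and the helper names `alignUp`, `improperlyStraddles`, and `placeVectorRelaxed` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `offset` up to the next multiple of `alignment` (a power of two).
inline uint32_t alignUp(uint32_t offset, uint32_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// Returns true if a vector of `size` bytes starting at `offset` would
// straddle improperly (assumed rule: vectors of 16 bytes or less must stay
// inside one 16-byte slot; larger vectors must start on a 16-byte boundary).
inline bool improperlyStraddles(uint32_t offset, uint32_t size) {
  if (size <= 16)
    return offset / 16 != (offset + size - 1) / 16;
  return offset % 16 != 0;
}

// Hypothetical helper: picks the offset for a vector with `elemCount`
// elements of `elemSize` bytes each, given the first free byte `offset`.
// The vector is first aligned to its element size only; it falls back to
// 16-byte alignment when that placement would straddle improperly.
inline uint32_t placeVectorRelaxed(uint32_t offset, uint32_t elemSize,
                                   uint32_t elemCount) {
  assert(elemSize == 4 || elemSize == 8);
  assert(elemCount >= 1 && elemCount <= 4);
  const uint32_t size = elemSize * elemCount;
  uint32_t placed = alignUp(offset, elemSize);
  if (improperlyStraddles(placed, size))
    placed = alignUp(placed, 16);
  return placed;
}
```

Under these assumptions, a `float3` that follows a single `float` can start at byte 4, because bytes 4 to 15 stay inside one 16-byte slot, while a `float3` whose first free byte is 12 is pushed to byte 16.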

+ 42 - 24
utils/hct/hctstart.cmd

@@ -72,6 +72,7 @@ doskey hctvs=%HLSL_SRC_DIR%\utils\hct\hctvs.cmd $*
 call :checksdk
 if errorlevel 1 (
   echo Windows SDK not properly installed. Build environment could not be set up correctly.
+  echo Please see the README.md instructions in the project root.
   exit /b 1
 )
 
@@ -177,40 +178,57 @@ goto :eof
 
 :checksdk 
 setlocal
-reg query "HKLM\SOFTWARE\Microsoft\Windows Kits\Installed Roots" /v KitsRoot10 1>nul
-if errorlevel 1 (
-  echo Unable to find Windows 10 SDK.
-  echo Please see the README.md instructions in the project root.
+set min_sdk_ver=14393
+
+set REG_QUERY=REG QUERY "HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Microsoft SDKs\Windows\v10.0"
+set kit_root=
+for /F "tokens=1,2*" %%A in ('%REG_QUERY% /v InstallationFolder') do (
+  if "%%A"=="InstallationFolder" (
+    rem echo Found Windows 10 SDK
+    rem echo   InstallationFolder: "%%C"
+    set kit_root=%%C
+  )
+)
+if ""=="%kit_root%" (
+  echo Did not find a Windows 10 SDK installation.
   exit /b 1
 )
-for /f "tokens=2* delims= " %%A in ('reg query "HKLM\SOFTWARE\Microsoft\Windows Kits\Installed Roots" /v KitsRoot10') do set kit_root=%%B
 if not exist "%kit_root%" (
   echo Windows 10 SDK was installed but is not accessible.
   exit /b 1
 )
-rem 10.0.16299.0, 10.0.15063.0 and 10.0.14393.0 will work properly. Reject 10586 and 10240 explicitly.
-if exist "%kit_root%\include\10.0.16299.0\um\d3d12.h" (
-  echo Found Windows SDK 10.0.16299.0
-  goto :eof
-)
-if exist "%kit_root%\include\10.0.15063.0\um\d3d12.h" (
-  echo Found Windows SDK 10.0.15063.0
-  goto :eof
-)
-if exist "%kit_root%\include\10.0.14393.0\um\d3d12.h" (
-  echo Found Windows SDK 10.0.14393.0
-  goto :eof
+
+set sdk_ver=
+set d3d12_sdk_ver=
+for /F "tokens=1-3" %%A in ('%REG_QUERY% /v ProductVersion') do (
+  if "%%A"=="ProductVersion" (
+    rem echo       ProductVersion: %%C
+    for /F "tokens=1-3 delims=." %%X in ("%%C") do (
+      set sdk_ver=%%Z
+      if exist "%kit_root%\include\10.0.%%Z.0\um\d3d12.h" (
+        set d3d12_sdk_ver=%%Z
+      )
+    )
+  )
+)
+if ""=="%sdk_ver%" (
+  echo Could not detect Windows 10 SDK version.
+  exit /b 1
 )
-if exist "%kit_root%\include\10.0.10586.0\um\d3d12.h" (
-  echo Found Windows SDK 10.0.10586.0 - no longer supported.
+if NOT %min_sdk_ver% LEQ %sdk_ver% (
+  echo Found unsupported Windows 10 SDK version 10.0.%sdk_ver%.0 installed.
+  echo Windows 10 SDK version 10.0.%min_sdk_ver%.0 or newer is required.
+  exit /b 1
 )
-if exist  "%kit_root%\include\10.0.10240.0\um\d3d12.h" (
-  echo Found Windows SDK 10.0.10240.0 - no longer supported.
+
+if ""=="%d3d12_sdk_ver%" (
+  echo Windows 10 SDK version 10.0.%sdk_ver%.0 installed, but did not find d3d12.h.
+  exit /b 1
+) else (
+  echo Found Windows 10 SDK 10.0.%d3d12_sdk_ver%.0
 )
-echo Unable to find a suitable SDK version under %kit_root%\include
-echo Please see the README.md instructions in the project root.
-exit /b 1
 endlocal
+goto :eof
 
 :checkcmake 
 cmake --version | findstr 3.4.3 1>nul 2>nul
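
Note on the `:checksdk` hunk above: the new check takes the third dot-separated field of the registry's `ProductVersion` value and compares it numerically with `min_sdk_ver` (14393). The same logic can be sketched in C++ as below. This is only an illustration of the comparison, not code from the repository, and the function names `sdkBuildNumber` and `isSupportedSdk` are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns the third dot-separated field of a version string such as
// "10.0.16299" as an integer, or -1 if the field is missing or not numeric.
inline int sdkBuildNumber(const std::string &productVersion) {
  std::istringstream in(productVersion);
  std::string field;
  for (int i = 0; i < 3; ++i) {
    if (!std::getline(in, field, '.'))
      return -1;
  }
  if (field.empty() || field.size() > 9)
    return -1;
  int value = 0;
  for (char c : field) {
    if (c < '0' || c > '9')
      return -1;
    value = value * 10 + (c - '0');
  }
  return value;
}

// Mirrors the `if NOT %min_sdk_ver% LEQ %sdk_ver%` test: the SDK is accepted
// only when its build number is at least `minBuild`.
inline bool isSupportedSdk(const std::string &productVersion,
                           int minBuild = 14393) {
  const int build = sdkBuildNumber(productVersion);
  return build >= 0 && build >= minBuild;
}
```

With this sketch, `isSupportedSdk("10.0.16299")` is true and `isSupportedSdk("10.0.10586")` is false, matching the versions the removed code accepted and rejected explicitly. The batch script additionally requires `um\d3d12.h` to exist under the matching include directory, which this sketch leaves out.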