Debug name part implementation (#264)

* Add documentation for source-level debugging with HLSL and DXIL.
* Fix trailing underscore in generated .rst documentation.
* Add container support for debug name part.
* Bump validator version to 1.1.
* Implement debug stripping in dxc, including /Fd dir-named behavior.
* Implement IDxcCompiler2 and CompileWithDebug.
Marcelo Lopez Ruiz 8 years ago
parent
commit
ad76d814a4

+ 145 - 143
docs/DXIL.rst

@@ -1901,149 +1901,149 @@ Opcodes are defined on a dense range and will be provided as enum in a header fi
 .. <py::lines('OPCODES-RST')>hctdb_instrhelp.get_opcodes_rst()</py>
 .. OPCODES-RST:BEGIN
 
-=== ============================== =================================================================================================================
-ID  Name                           Description
-=== ============================== =================================================================================================================
-0   TempRegLoad_                   Helper load operation
-1   TempRegStore_                  Helper store operation
-2   MinPrecXRegLoad_               Helper load operation for minprecision
-3   MinPrecXRegStore_              Helper store operation for minprecision
-4   LoadInput_                     Loads the value from shader input
-5   StoreOutput_                   Stores the value to shader output
-6   FAbs_                          returns the absolute value of the input value.
-7   Saturate_                      clamps the result of a single or double precision floating point value to [0.0f...1.0f]
-8   IsNaN_                         Returns true if x is NAN or QNAN, false otherwise.
-9   IsInf_                         Returns true if x is +INF or -INF, false otherwise.
-10  IsFinite_                      Returns true if x is finite, false otherwise.
-11  IsNormal_                      returns IsNormal
-12  Cos_                           returns cosine(theta) for theta in radians.
-13  Sin_                           returns sine(theta) for theta in radians.
-14  Tan_                           returns tan(theta) for theta in radians.
-15  Acos_                          Returns the arccosine of the specified value. Input should be a floating-point value within the range of -1 to 1.
-16  Asin_                          Returns the arcsine of the specified value. Input should be a floating-point value within the range of -1 to 1
-17  Atan_                          Returns the arctangent of the specified value. The return value is within the range of -PI/2 to PI/2.
-18  Hcos_                          returns the hyperbolic cosine of the specified value.
-19  Hsin_                          returns the hyperbolic sine of the specified value.
-20  Htan_                          returns the hyperbolic tangent of the specified value.
-21  Exp_                           returns 2^exponent
-22  Frc_                           extract fractional component.
-23  Log_                           returns log base 2.
-24  Sqrt_                          returns square root
-25  Rsqrt_                         returns reciprocal square root (1 / sqrt(src))
-26  Round_ne_                      floating-point round to integral float.
-27  Round_ni_                      floating-point round to integral float.
-28  Round_pi_                      floating-point round to integral float.
-29  Round_z_                       floating-point round to integral float.
-30  Bfrev_                         Reverses the order of the bits.
-31  Countbits_                     Counts the number of bits in the input integer.
-32  FirstbitLo_                    Returns the location of the first set bit starting from the lowest order bit and working upward.
-33  FirstbitHi_                    Returns the location of the first set bit starting from the highest order bit and working downward.
-34  FirstbitSHi_                   Returns the location of the first set bit from the highest order bit based on the sign.
-35  FMax_                          returns a if a >= b, else b
-36  FMin_                          returns a if a < b, else b
-37  IMax_                          IMax(a,b) returns a if a > b, else b
-38  IMin_                          IMin(a,b) returns a if a < b, else b
-39  UMax_                          unsigned integer maximum. UMax(a,b) = a > b ? a : b
-40  UMin_                          unsigned integer minimum. UMin(a,b) = a < b ? a : b
-41  IMul_                          multiply of 32-bit operands to produce the correct full 64-bit result.
-42  UMul_                          multiply of 32-bit operands to produce the correct full 64-bit result.
-43  UDiv_                          unsigned divide of the 32-bit operand src0 by the 32-bit operand src1.
-44  UAddc_                         unsigned add of 32-bit operand with the carry
-45  USubb_                         unsigned subtract of 32-bit operands with the borrow
-46  FMad_                          floating point multiply & add
-47  Fma_                           fused multiply-add
-48  IMad_                          Signed integer multiply & add
-49  UMad_                          Unsigned integer multiply & add
-50  Msad_                          masked Sum of Absolute Differences.
-51  Ibfe_                          Integer bitfield extract
-52  Ubfe_                          Unsigned integer bitfield extract
-53  Bfi_                           Given a bit range from the LSB of a number, places that number of bits in another number at any offset
-54  Dot2_                          Two-dimensional vector dot-product
-55  Dot3_                          Three-dimensional vector dot-product
-56  Dot4_                          Four-dimensional vector dot-product
-57  CreateHandle_                  creates the handle to a resource
-58  CBufferLoad_                   loads a value from a constant buffer resource
-59  CBufferLoadLegacy_             loads a value from a constant buffer resource
-60  Sample_                        samples a texture
-61  SampleBias_                    samples a texture after applying the input bias to the mipmap level
-62  SampleLevel_                   samples a texture using a mipmap-level offset
-63  SampleGrad_                    samples a texture using a gradient to influence the way the sample location is calculated
-64  SampleCmp_                     samples a texture and compares a single component against the specified comparison value
-65  SampleCmpLevelZero_            samples a texture and compares a single component against the specified comparison value
-66  TextureLoad_                   reads texel data without any filtering or sampling
-67  TextureStore_                  reads texel data without any filtering or sampling
-68  BufferLoad_                    reads from a TypedBuffer
-69  BufferStore_                   writes to a RWTypedBuffer
-70  BufferUpdateCounter_           atomically increments/decrements the hidden 32-bit counter stored with a Count or Append UAV
-71  CheckAccessFullyMapped_        determines whether all values from a Sample, Gather, or Load operation accessed mapped tiles in a tiled resource
-72  GetDimensions_                 gets texture size information
-73  TextureGather_                 gathers the four texels that would be used in a bi-linear filtering operation
-74  TextureGatherCmp_              same as TextureGather, except this instruction performs comparison on texels, similar to SampleCmp
-75  Texture2DMSGetSamplePosition_  gets the position of the specified sample
-76  RenderTargetGetSamplePosition_ gets the position of the specified sample
-77  RenderTargetGetSampleCount_    gets the number of samples for a render target
-78  AtomicBinOp_                   performs an atomic operation on two operands
-79  AtomicCompareExchange_         atomic compare and exchange to memory
-80  Barrier_                       inserts a memory barrier in the shader
-81  CalculateLOD_                  calculates the level of detail
-82  Discard_                       discard the current pixel
-83  DerivCoarseX_                  computes the rate of change per stamp in x direction.
-84  DerivCoarseY_                  computes the rate of change per stamp in y direction.
-85  DerivFineX_                    computes the rate of change per pixel in x direction.
-86  DerivFineY_                    computes the rate of change per pixel in y direction.
-87  EvalSnapped_                   evaluates an input attribute at pixel center with an offset
-88  EvalSampleIndex_               evaluates an input attribute at a sample location
-89  EvalCentroid_                  evaluates an input attribute at pixel center
-90  SampleIndex_                   returns the sample index in a sample-frequency pixel shader
-91  Coverage_                      returns the coverage mask input in a pixel shader
-92  InnerCoverage_                 returns underestimated coverage input from conservative rasterization in a pixel shader
-93  ThreadId_                      reads the thread ID
-94  GroupId_                       reads the group ID (SV_GroupID)
-95  ThreadIdInGroup_               reads the thread ID within the group (SV_GroupThreadID)
-96  FlattenedThreadIdInGroup_      provides a flattened index for a given thread within a given group (SV_GroupIndex)
-97  EmitStream_                    emits a vertex to a given stream
-98  CutStream_                     completes the current primitive topology at the specified stream
-99  EmitThenCutStream_             equivalent to an EmitStream followed by a CutStream
-100 GSInstanceID_                  GSInstanceID
-101 MakeDouble_                    creates a double value
-102 SplitDouble_                   splits a double into low and high parts
-103 LoadOutputControlPoint_        LoadOutputControlPoint
-104 LoadPatchConstant_             LoadPatchConstant
-105 DomainLocation_                DomainLocation
-106 StorePatchConstant_            StorePatchConstant
-107 OutputControlPointID_          OutputControlPointID
-108 PrimitiveID_                   PrimitiveID
-109 CycleCounterLegacy_            CycleCounterLegacy
-110 WaveIsFirstLane_               returns 1 for the first lane in the wave
-111 WaveGetLaneIndex_              returns the index of the current lane in the wave
-112 WaveGetLaneCount_              returns the number of lanes in the wave
-113 WaveAnyTrue_                   returns 1 if any of the lane evaluates the value to true
-114 WaveAllTrue_                   returns 1 if all the lanes evaluate the value to true
-115 WaveActiveAllEqual_            returns 1 if all the lanes have the same value
-116 WaveActiveBallot_              returns a struct with a bit set for each lane where the condition is true
-117 WaveReadLaneAt_                returns the value from the specified lane
-118 WaveReadLaneFirst_             returns the value from the first lane
-119 WaveActiveOp_                  returns the result of the operation across waves
-120 WaveActiveBit_                 returns the result of the operation across all lanes
-121 WavePrefixOp_                  returns the result of the operation on prior lanes
-122 QuadReadLaneAt_                reads from a lane in the quad
-123 QuadOp_                        returns the result of a quad-level operation
-124 BitcastI16toF16_               bitcast between different sizes
-125 BitcastF16toI16_               bitcast between different sizes
-126 BitcastI32toF32_               bitcast between different sizes
-127 BitcastF32toI32_               bitcast between different sizes
-128 BitcastI64toF64_               bitcast between different sizes
-129 BitcastF64toI64_               bitcast between different sizes
-130 LegacyF32ToF16_                legacy function to convert float (f32) to half (f16) (this is not related to min-precision)
-131 LegacyF16ToF32_                legacy function to convert half (f16) to float (f32) (this is not related to min-precision)
-132 LegacyDoubleToFloat_           legacy function to convert double to float
-133 LegacyDoubleToSInt32_          legacy function to convert double to int32
-134 LegacyDoubleToUInt32_          legacy function to convert double to uint32
-135 WaveAllBitCount_               returns the count of bits set to 1 across the wave
-136 WavePrefixBitCount_            returns the count of bits set to 1 on prior lanes
-137 AttributeAtVertex_             returns the values of the attributes at the vertex.
-138 ViewID_                        returns the view index
-=== ============================== =================================================================================================================
+=== ============================= =================================================================================================================
+ID  Name                          Description
+=== ============================= =================================================================================================================
+0   TempRegLoad_                  Helper load operation
+1   TempRegStore_                 Helper store operation
+2   MinPrecXRegLoad_              Helper load operation for minprecision
+3   MinPrecXRegStore_             Helper store operation for minprecision
+4   LoadInput_                    Loads the value from shader input
+5   StoreOutput_                  Stores the value to shader output
+6   FAbs_                         returns the absolute value of the input value.
+7   Saturate_                     clamps the result of a single or double precision floating point value to [0.0f...1.0f]
+8   IsNaN_                        Returns true if x is NAN or QNAN, false otherwise.
+9   IsInf_                        Returns true if x is +INF or -INF, false otherwise.
+10  IsFinite_                     Returns true if x is finite, false otherwise.
+11  IsNormal_                     returns IsNormal
+12  Cos_                          returns cosine(theta) for theta in radians.
+13  Sin_                          returns sine(theta) for theta in radians.
+14  Tan_                          returns tan(theta) for theta in radians.
+15  Acos_                         Returns the arccosine of the specified value. Input should be a floating-point value within the range of -1 to 1.
+16  Asin_                         Returns the arcsine of the specified value. Input should be a floating-point value within the range of -1 to 1
+17  Atan_                         Returns the arctangent of the specified value. The return value is within the range of -PI/2 to PI/2.
+18  Hcos_                         returns the hyperbolic cosine of the specified value.
+19  Hsin_                         returns the hyperbolic sine of the specified value.
+20  Htan_                         returns the hyperbolic tangent of the specified value.
+21  Exp_                          returns 2^exponent
+22  Frc_                          extract fractional component.
+23  Log_                          returns log base 2.
+24  Sqrt_                         returns square root
+25  Rsqrt_                        returns reciprocal square root (1 / sqrt(src))
+26  Round_ne_                     floating-point round to integral float.
+27  Round_ni_                     floating-point round to integral float.
+28  Round_pi_                     floating-point round to integral float.
+29  Round_z_                      floating-point round to integral float.
+30  Bfrev_                        Reverses the order of the bits.
+31  Countbits_                    Counts the number of bits in the input integer.
+32  FirstbitLo_                   Returns the location of the first set bit starting from the lowest order bit and working upward.
+33  FirstbitHi_                   Returns the location of the first set bit starting from the highest order bit and working downward.
+34  FirstbitSHi_                  Returns the location of the first set bit from the highest order bit based on the sign.
+35  FMax_                         returns a if a >= b, else b
+36  FMin_                         returns a if a < b, else b
+37  IMax_                         IMax(a,b) returns a if a > b, else b
+38  IMin_                         IMin(a,b) returns a if a < b, else b
+39  UMax_                         unsigned integer maximum. UMax(a,b) = a > b ? a : b
+40  UMin_                         unsigned integer minimum. UMin(a,b) = a < b ? a : b
+41  IMul_                         multiply of 32-bit operands to produce the correct full 64-bit result.
+42  UMul_                         multiply of 32-bit operands to produce the correct full 64-bit result.
+43  UDiv_                         unsigned divide of the 32-bit operand src0 by the 32-bit operand src1.
+44  UAddc_                        unsigned add of 32-bit operand with the carry
+45  USubb_                        unsigned subtract of 32-bit operands with the borrow
+46  FMad_                         floating point multiply & add
+47  Fma_                          fused multiply-add
+48  IMad_                         Signed integer multiply & add
+49  UMad_                         Unsigned integer multiply & add
+50  Msad_                         masked Sum of Absolute Differences.
+51  Ibfe_                         Integer bitfield extract
+52  Ubfe_                         Unsigned integer bitfield extract
+53  Bfi_                          Given a bit range from the LSB of a number, places that number of bits in another number at any offset
+54  Dot2_                         Two-dimensional vector dot-product
+55  Dot3_                         Three-dimensional vector dot-product
+56  Dot4_                         Four-dimensional vector dot-product
+57  CreateHandle                  creates the handle to a resource
+58  CBufferLoad                   loads a value from a constant buffer resource
+59  CBufferLoadLegacy             loads a value from a constant buffer resource
+60  Sample                        samples a texture
+61  SampleBias                    samples a texture after applying the input bias to the mipmap level
+62  SampleLevel                   samples a texture using a mipmap-level offset
+63  SampleGrad                    samples a texture using a gradient to influence the way the sample location is calculated
+64  SampleCmp                     samples a texture and compares a single component against the specified comparison value
+65  SampleCmpLevelZero            samples a texture and compares a single component against the specified comparison value
+66  TextureLoad                   reads texel data without any filtering or sampling
+67  TextureStore                  reads texel data without any filtering or sampling
+68  BufferLoad                    reads from a TypedBuffer
+69  BufferStore                   writes to a RWTypedBuffer
+70  BufferUpdateCounter           atomically increments/decrements the hidden 32-bit counter stored with a Count or Append UAV
+71  CheckAccessFullyMapped        determines whether all values from a Sample, Gather, or Load operation accessed mapped tiles in a tiled resource
+72  GetDimensions                 gets texture size information
+73  TextureGather                 gathers the four texels that would be used in a bi-linear filtering operation
+74  TextureGatherCmp              same as TextureGather, except this instruction performs comparison on texels, similar to SampleCmp
+75  Texture2DMSGetSamplePosition  gets the position of the specified sample
+76  RenderTargetGetSamplePosition gets the position of the specified sample
+77  RenderTargetGetSampleCount    gets the number of samples for a render target
+78  AtomicBinOp                   performs an atomic operation on two operands
+79  AtomicCompareExchange         atomic compare and exchange to memory
+80  Barrier                       inserts a memory barrier in the shader
+81  CalculateLOD                  calculates the level of detail
+82  Discard                       discard the current pixel
+83  DerivCoarseX_                 computes the rate of change per stamp in x direction.
+84  DerivCoarseY_                 computes the rate of change per stamp in y direction.
+85  DerivFineX_                   computes the rate of change per pixel in x direction.
+86  DerivFineY_                   computes the rate of change per pixel in y direction.
+87  EvalSnapped                   evaluates an input attribute at pixel center with an offset
+88  EvalSampleIndex               evaluates an input attribute at a sample location
+89  EvalCentroid                  evaluates an input attribute at pixel center
+90  SampleIndex                   returns the sample index in a sample-frequency pixel shader
+91  Coverage                      returns the coverage mask input in a pixel shader
+92  InnerCoverage                 returns underestimated coverage input from conservative rasterization in a pixel shader
+93  ThreadId                      reads the thread ID
+94  GroupId                       reads the group ID (SV_GroupID)
+95  ThreadIdInGroup               reads the thread ID within the group (SV_GroupThreadID)
+96  FlattenedThreadIdInGroup      provides a flattened index for a given thread within a given group (SV_GroupIndex)
+97  EmitStream                    emits a vertex to a given stream
+98  CutStream                     completes the current primitive topology at the specified stream
+99  EmitThenCutStream             equivalent to an EmitStream followed by a CutStream
+100 GSInstanceID                  GSInstanceID
+101 MakeDouble                    creates a double value
+102 SplitDouble                   splits a double into low and high parts
+103 LoadOutputControlPoint        LoadOutputControlPoint
+104 LoadPatchConstant             LoadPatchConstant
+105 DomainLocation                DomainLocation
+106 StorePatchConstant            StorePatchConstant
+107 OutputControlPointID          OutputControlPointID
+108 PrimitiveID                   PrimitiveID
+109 CycleCounterLegacy            CycleCounterLegacy
+110 WaveIsFirstLane               returns 1 for the first lane in the wave
+111 WaveGetLaneIndex              returns the index of the current lane in the wave
+112 WaveGetLaneCount              returns the number of lanes in the wave
+113 WaveAnyTrue                   returns 1 if any of the lane evaluates the value to true
+114 WaveAllTrue                   returns 1 if all the lanes evaluate the value to true
+115 WaveActiveAllEqual            returns 1 if all the lanes have the same value
+116 WaveActiveBallot              returns a struct with a bit set for each lane where the condition is true
+117 WaveReadLaneAt                returns the value from the specified lane
+118 WaveReadLaneFirst             returns the value from the first lane
+119 WaveActiveOp                  returns the result of the operation across waves
+120 WaveActiveBit                 returns the result of the operation across all lanes
+121 WavePrefixOp                  returns the result of the operation on prior lanes
+122 QuadReadLaneAt                reads from a lane in the quad
+123 QuadOp                        returns the result of a quad-level operation
+124 BitcastI16toF16               bitcast between different sizes
+125 BitcastF16toI16               bitcast between different sizes
+126 BitcastI32toF32               bitcast between different sizes
+127 BitcastF32toI32               bitcast between different sizes
+128 BitcastI64toF64               bitcast between different sizes
+129 BitcastF64toI64               bitcast between different sizes
+130 LegacyF32ToF16                legacy function to convert float (f32) to half (f16) (this is not related to min-precision)
+131 LegacyF16ToF32                legacy function to convert half (f16) to float (f32) (this is not related to min-precision)
+132 LegacyDoubleToFloat           legacy function to convert double to float
+133 LegacyDoubleToSInt32          legacy function to convert double to int32
+134 LegacyDoubleToUInt32          legacy function to convert double to uint32
+135 WaveAllBitCount               returns the count of bits set to 1 across the wave
+136 WavePrefixBitCount            returns the count of bits set to 1 on prior lanes
+137 AttributeAtVertex_            returns the values of the attributes at the vertex.
+138 ViewID                        returns the view index
+=== ============================= =================================================================================================================
 
 
 Acos
@@ -2902,6 +2902,8 @@ Support is provided in the Microsoft Windows family of operating systems, when r
 
 The HLSL language is versioned independently of DXIL, and currently follows an 'HLSL <year>' naming scheme. HLSL 2015 is the dialect supported by the d3dcompiler_47 library; a limited form of support is provided in the open source HLSL on LLVM project. HLSL 2016 is the version supported by the current HLSL on LLVM project, which removes some features (primarily effect framework syntax, backquote operator) and adds new ones (wave intrinsics and basic i64 support).
 
+.. _dxil_container_format:
+
 DXIL Container Format
 ---------------------
 

+ 3 - 0
docs/SourceLevelDebugging.rst

@@ -14,6 +14,9 @@ information takes <format>`, which is useful for those interested in creating
 front-ends or dealing directly with the information.  Further, this document
 provides specific examples of what debug information for C/C++ looks like.
 
+HLSL and DXIL-specific information is available in the :doc:`Source Level
+Debugging with HLSL <SourceLevelDebuggingHLSL>` document.
+
 Philosophy behind LLVM debugging information
 --------------------------------------------
 

+ 130 - 0
docs/SourceLevelDebuggingHLSL.rst

@@ -0,0 +1,130 @@
+================================
+Source Level Debugging with HLSL
+================================
+
+.. contents::
+   :local:
+
+Introduction
+============
+
+This document describes the specifics of source level debugging with HLSL. The
+basic infrastructure is based on :doc:`Source Level Debugging with LLVM
+<SourceLevelDebugging>`, so the focus here is on the specifics of DXIL
+programs compiled from HLSL.
+
+DXIL Debug Information Format
+=============================
+
+The debug information for an HLSL program in DXIL form is stored as an LLVM
+module with debug information represented according to the :doc:`Source Level
+Debugging with LLVM <SourceLevelDebugging>` document.
+
+The :ref:`dxil_container_format` describes how a single data structure
+holds a DXIL program, debug information, and other optional parts.
+
+There are three parts that are associated with debug information.
+
+* DFCC_DXIL ('DXIL'). This part holds a valid DXIL program, which has no debug
+  information. This is the program described by debug information.
+
+* DFCC_ShaderDebugInfoDXIL ('ILDB'). This is an LLVM module with debug
+  information. It's an augmented version of the original DXIL module. For
+  historical reasons, this is sometimes referred to as 'the PDB of the
+  program'.
+
+* DFCC_ShaderDebugName ('ILDN'). This is a name for an external entity holding
+  the debug information.
+
+Using Debug Information
+=======================
+
+The debug information can be used directly by looking up the debug information
+part and loading it into an LLVM module. There is full fidelity with debug
+information via this mechanism, although it requires linking in the LLVM
+supporting libraries.
+
+For compatibility, the dxcompiler.dll binary also exposes a limited
+implementation of the DIA APIs. To do this, a CLSID_DxcDiaDataSource class
+should be created via a call to DxcCreateInstance, and a loadDataFromIStream
+call with the debug part will initialize it.
+
+The DxcContext::Recompile implementation provides an example of how to
+initialize the diagnostic objects from debug information, extract high-level
+information and recreate the compilation options and inputs.
+
+Using Debug Names
+=================
+
+The only current use case for the debug name is as a relative path to a file
+that provides shader debug information. A debugging tool would typically have
+a list of paths to act as search roots.
+
+Command-Line Options
+====================
+
+The following command-line options are used with the DirectX Shader Compiler
+tools to work with debug information.
+
+* /Zi. Enables debug information during compilation.
+
+* /Zss. Builds debug names that consider source information.
+
+* /Zsb. Builds debug names that consider only the output binary.
+
+* /Fd. Extracts debug information to a different file.
+
+* /Qstrip_debug. Removes debug information from a container.
+
+The most common use cases are as follows.
+
+* Build debug information and leave it in the container. In this case, simply
+  compiling with /Zi will do the trick.
+
+* Build debug information and extract it to an auto-generated external
+  file. In this case, /Zi and /Fd should both be used, and the /Fd value
+  should end in a trailing backslash when using dxc, naming the target
+  directory in which to place the file. /Zss is the default, but /Zsb can be
+  used to deduplicate files. When using /Fd with a directory name,
+  /Qstrip_debug is implied.
+
+A less common use case is to specify an explicit name for the external
+file. In this case, the command line should include /Zi, /Fd with a specific
+name, and /Qstrip_debug.
+
+Implementation Notes
+====================
+
+The current implementation provides a few interesting behaviors worth noting.
+
+* The shader debug name is derived from either the DXIL or the ILDB parts by
+  hashing the byte contents, but it can be replaced programmatically.
+
+* Source content is included in the debug information blob by default. This
+  helps with scenarios where the code never exists on-disk, but is instead
+  generated on-the-fly.
+  
+* Typically the derivation is done from the ILDB part, which includes
+  source-specific information, and so two shaders with different sources will
+  have different debug information. However, the /Zsb option is provided
+  to include debug information that only takes into consideration the DXIL
+  binary. In this case, two shaders that compile to the same binary will have
+  the same debug information, which can be used to deduplicate content when
+  any equivalent source program is acceptable for debugging.
+
+Future Directions
+=================
+
+This section is purely speculative, but captures some of the thoughts about
+future debugging capabilities.
+
+* If driver-level constructs should be debugged, they need to be mapped to
+  DXIL first, and from there on to HLSL.
+
+* Including source content in debug information is convenient, especially when
+  sources are transient, but it is inefficient (again, especially for a large
+  number of transient sources). Deduplicating sources would be beneficial.
+
+* Integration with symbol servers and source servers can simplify some of the
+  developer workflows.
+

+ 5 - 0
docs/index.rst

@@ -170,6 +170,7 @@ For API clients and LLVM developers.
    MarkedUpDisassembly
    SystemLibrary
    SourceLevelDebugging
+   SourceLevelDebuggingHLSL
    Vectorizers
    WritingAnLLVMBackend
    WritingAnLLVMPass
@@ -204,6 +205,10 @@ For API clients and LLVM developers.
    This document describes the design and philosophy behind the LLVM
    source-level debugger.
 
+:doc:`Source Level Debugging with HLSL <SourceLevelDebuggingHLSL>`
+    This document describes specifics of using source-level debuggers for DXIL
+    and HLSL.
+
 :doc:`Vectorizers`
    This document describes the current status of vectorization in LLVM.
 

+ 55 - 4
include/dxc/HLSL/DxilContainer.h

@@ -76,6 +76,7 @@ enum DxilFourCC {
   DFCC_PatchConstantSignature   = DXIL_FOURCC('P', 'S', 'G', '1'),
   DFCC_ShaderStatistics         = DXIL_FOURCC('S', 'T', 'A', 'T'),
   DFCC_ShaderDebugInfoDXIL      = DXIL_FOURCC('I', 'L', 'D', 'B'),
+  DFCC_ShaderDebugName          = DXIL_FOURCC('I', 'L', 'D', 'N'),
   DFCC_FeatureInfo              = DXIL_FOURCC('S', 'F', 'I', '0'),
   DFCC_PrivateData              = DXIL_FOURCC('P', 'R', 'I', 'V'),
   DFCC_RootSignature            = DXIL_FOURCC('R', 'T', 'S', '0'),
@@ -105,9 +106,8 @@ static const uint64_t ShaderFeatureInfo_Int64Ops = 0x8000;
 
 static const unsigned ShaderFeatureInfoCount = 16;
 
-struct DxilShaderFeatureInfo
-{
-    uint64_t FeatureFlags;
+struct DxilShaderFeatureInfo {
+  uint64_t FeatureFlags;
 };
 
 // DXIL program information.
@@ -210,6 +210,15 @@ struct DxilProgramSignatureElement {
 // Easy to get this wrong. Earlier assertions can help determine
 static_assert(sizeof(DxilProgramSignatureElement) == 0x20, "else DxilProgramSignatureElement is misaligned");
 
+struct DxilShaderDebugName {
+  uint16_t Flags;       // Reserved, must be set to zero.
+  uint16_t NameLength;  // Length of the debug name, without null terminator.
+  // Followed by NameLength bytes of the UTF-8-encoded name.
+  // Followed by a null terminator.
+  // Followed by [0-3] zero bytes to align to a 4-byte boundary.
+};
+static const size_t MinDxilShaderDebugNameSize = sizeof(DxilShaderDebugName) + 4;
+
 #pragma pack(pop)
 
 /// Gets a part header by index.
@@ -376,6 +385,33 @@ inline uint32_t EncodeVersion(DXIL::ShaderKind shaderType, uint32_t major,
   return ((unsigned)shaderType << 16) | (major << 4) | minor;
 }
 
+inline bool IsDxilShaderDebugNameValid(const DxilPartHeader *pPart) {
+  if (pPart->PartFourCC != DFCC_ShaderDebugName) return false;
+  if (pPart->PartSize < MinDxilShaderDebugNameSize) return false;
+  const DxilShaderDebugName *pDebugNameContent = reinterpret_cast<const DxilShaderDebugName *>(GetDxilPartData(pPart));
+  uint32_t ExpectedSize = sizeof(DxilShaderDebugName) + pDebugNameContent->NameLength + 1; // 32-bit so a NameLength near UINT16_MAX cannot wrap
+  if (ExpectedSize & 0x3) {
+    ExpectedSize += 0x4;
+    ExpectedSize &= ~(0x3);
+  }
+  if (pPart->PartSize != ExpectedSize) return false;
+  return true;
+}
+
+inline bool GetDxilShaderDebugName(const DxilPartHeader *pDebugNamePart,
+  const char **ppUtf8Name, _Out_opt_ uint16_t *pUtf8NameLen) {
+  *ppUtf8Name = nullptr;
+  if (!IsDxilShaderDebugNameValid(pDebugNamePart)) {
+    return false;
+  }
+  const DxilShaderDebugName *pDebugNameContent = reinterpret_cast<const DxilShaderDebugName *>(GetDxilPartData(pDebugNamePart));
+  if (pUtf8NameLen) {
+    *pUtf8NameLen = pDebugNameContent->NameLength;
+  }
+  *ppUtf8Name = (const char *)(pDebugNameContent + 1);
+  return true;
+}
+
 class DxilPartWriter {
 public:
   virtual ~DxilPartWriter() {}
@@ -397,9 +433,24 @@ public:
 
 DxilContainerWriter *NewDxilContainerWriter();
 
+enum class SerializeDxilFlags {
+  None = 0,                     // No flags defined.
+  IncludeDebugInfoPart = 1,     // Include the debug info part in the container.
+  IncludeDebugNamePart = 2,     // Include the debug name part in the container.
+  DebugNameDependOnSource = 4   // Make the debug name depend on source (and not just final module).
+};
+inline SerializeDxilFlags& operator |=(SerializeDxilFlags& l, const SerializeDxilFlags& r) {
+  l = static_cast<SerializeDxilFlags>(static_cast<int>(l) | static_cast<int>(r));
+  return l;
+}
+inline int operator&(SerializeDxilFlags l, SerializeDxilFlags r) {
+  return static_cast<int>(l) & static_cast<int>(r);
+}
+
 void SerializeDxilContainerForModule(hlsl::DxilModule *pModule,
                                      AbstractMemoryStream *pModuleBitcode,
-                                     AbstractMemoryStream *pStream);
+                                     AbstractMemoryStream *pStream,
+                                     SerializeDxilFlags Flags);
 void SerializeDxilContainerForRootSignature(hlsl::RootSignatureHandle *pRootSigHandle,
                                      AbstractMemoryStream *pStream);
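
The ILDN payload layout described by `DxilShaderDebugName` above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this change: `DebugNameHeader`, `DebugNamePayloadSize`, and `BuildDebugNamePayload` are hypothetical names, and the sketch assumes the name length fits in 16 bits.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for DxilShaderDebugName (two 16-bit fields).
struct DebugNameHeader {
  uint16_t Flags;      // Reserved, zero.
  uint16_t NameLength; // UTF-8 byte count, excluding the null terminator.
};

// Payload size: header + name + null terminator, rounded up to 4 bytes.
inline uint32_t DebugNamePayloadSize(uint16_t nameLength) {
  uint32_t size = static_cast<uint32_t>(sizeof(DebugNameHeader)) + nameLength + 1u;
  return (size + 3u) & ~3u;
}

// Builds the payload bytes for a name (assumes name.size() <= 65535).
inline std::vector<uint8_t> BuildDebugNamePayload(const std::string &name) {
  DebugNameHeader header = {0, static_cast<uint16_t>(name.size())};
  // Zero-filled, so the terminator and padding bytes are already in place.
  std::vector<uint8_t> bytes(DebugNamePayloadSize(header.NameLength), 0);
  std::memcpy(bytes.data(), &header, sizeof(header));
  std::memcpy(bytes.data() + sizeof(header), name.data(), name.size());
  return bytes;
}
```

For a 32-character hash followed by a 4-character suffix, `NameLength` is 36 and the payload is 44 bytes, which matches the `DebugInfoContentLen` computed in the container assembler later in this change.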
 

+ 2 - 0
include/dxc/Support/HLSLOptions.h

@@ -113,6 +113,8 @@ public:
   bool ColorCodeAssembly; // OPT_Cc
   bool CodeGenHighLevel; // OPT_fcgl
   bool DebugInfo; // OPT__SLASH_Zi
+  bool DebugNameForBinary; // OPT_Zsb
+  bool DebugNameForSource; // OPT_Zss
   bool DumpBin;        // OPT_dumpbin
   bool WarningAsError; // OPT__SLASH_WX
   bool IEEEStrict;     // OPT_Gis

+ 7 - 3
include/dxc/Support/HLSLOptions.td

@@ -254,6 +254,10 @@ def Zpr : Flag<["-", "/"], "Zpr">, Flags<[CoreOption]>, Group<hlslcomp_Group>,
   HelpText<"Pack matrices in row-major order">;
 def Zpc : Flag<["-", "/"], "Zpc">, Flags<[CoreOption]>, Group<hlslcomp_Group>,
   HelpText<"Pack matrices in column-major order">;
+def Zss : Flag<["-", "/"], "Zss">, Flags<[CoreOption]>, Group<hlslcomp_Group>,
+  HelpText<"Build debug name considering source information">;
+def Zsb : Flag<["-", "/"], "Zsb">, Flags<[CoreOption]>, Group<hlslcomp_Group>,
+  HelpText<"Build debug name considering only output binary">;
 
 // deprecated /Gpp def Gpp : Flag<["-", "/"], "Gpp">, HelpText<"Force partial precision">;
 def Gfa : Flag<["-", "/"], "Gfa">, HelpText<"Avoid flow control constructs">, Flags<[CoreOption]>, Group<hlslcomp_Group>;
@@ -267,8 +271,8 @@ def Fo : JoinedOrSeparate<["-", "/"], "Fo">, MetaVarName<"<file>">, HelpText<"Ou
 def Fc : JoinedOrSeparate<["-", "/"], "Fc">, MetaVarName<"<file>">, HelpText<"Output assembly code listing file">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
 //def Fx : JoinedOrSeparate<["-", "/"], "Fx">, MetaVarName<"<file>">, HelpText<"Output assembly code and hex listing file">;
 def Fh : JoinedOrSeparate<["-", "/"], "Fh">, MetaVarName<"<file>">, HelpText<"Output header file containing object code">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
-def Fe : JoinedOrSeparate<["-", "/"], "Fe">, MetaVarName<"<file>">, HelpText<"Output warnings and errors to a specific file">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
-def Fd : JoinedOrSeparate<["-", "/"], "Fd">, MetaVarName<"<file>">, HelpText<"Extract LLVM Debug IR and write to given file">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
+def Fe : JoinedOrSeparate<["-", "/"], "Fe">, MetaVarName<"<file>">, HelpText<"Output warnings and errors to the given file">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
+def Fd : JoinedOrSeparate<["-", "/"], "Fd">, MetaVarName<"<file>">, HelpText<"Write debug information to the given file or directory; trail \\ to auto-generate and imply Qstrip_debug">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
 def Vn : JoinedOrSeparate<["-", "/"], "Vn">, MetaVarName<"<name>">, HelpText<"Use <name> as variable name in header file">, Flags<[DriverOption]>, Group<hlslcomp_Group>;
 def Cc : Flag<["-", "/"], "Cc">, HelpText<"Output color coded assembly listings">, Group<hlslcomp_Group>, Flags<[DriverOption]>;
 def Ni : Flag<["-", "/"], "Ni">, HelpText<"Output instruction numbers in assembly listings">, Group<hlslcomp_Group>, Flags<[DriverOption]>;
@@ -285,7 +289,7 @@ def dumpbin : Flag<["-", "/"], "dumpbin">, Flags<[DriverOption]>, Group<hlslutil
   HelpText<"Load a binary file rather than compiling">;
 def Qstrip_reflect : Flag<["-", "/"], "Qstrip_reflect">, Flags<[DriverOption]>, Group<hlslutil_Group>,
   HelpText<"Strip reflection data from shader bytecode  (must be used with /Fo <file>)">;
-def Qstrip_debug : Flag<["-", "/"], "Qstrip_debug">, Flags<[DriverOption]>, Group<hlslutil_Group>,
+def Qstrip_debug : Flag<["-", "/"], "Qstrip_debug">, Flags<[CoreOption]>, Group<hlslutil_Group>,
   HelpText<"Strip debug information from 4_0+ shader bytecode  (must be used with /Fo <file>)">;
 def Qstrip_priv : Flag<["-", "/"], "Qstrip_priv">, Flags<[DriverOption]>, Group<hlslutil_Group>,
   HelpText<"Strip private data from shader bytecode  (must be used with /Fo <file>)">;

+ 4 - 0
include/dxc/Support/Unicode.h

@@ -50,6 +50,10 @@ bool IsStarMatchUTF8(_In_reads_opt_(maskLen) const char *pMask, size_t maskLen,
 bool IsStarMatchUTF16(_In_reads_opt_(maskLen) const wchar_t *pMask, size_t maskLen,
                       _In_reads_opt_(nameLen) const wchar_t *pName, size_t nameLen);
 
+_Success_(return != false)
+bool UTF8BufferToUTF16ComHeap(_In_z_ const char *pUTF8,
+                              _Outptr_result_z_ wchar_t **ppUTF16) throw();
+
 _Success_(return != false)
 bool UTF8BufferToUTF16Buffer(
   _In_NLS_string_(cbUTF8) const char *pUTF8,

+ 22 - 0
include/dxc/Support/microcom.h

@@ -198,6 +198,28 @@ HRESULT DoBasicQueryInterface4(TObject* self, REFIID iid, void** ppvObject)
   return DoBasicQueryInterface3<TInterface, TInterface2, TInterface3, TObject>(self, iid, ppvObject);
 }
 
+/// <summary>
+/// Provides a QueryInterface implementation for a class that supports
+/// five interfaces in addition to IUnknown.
+/// </summary>
+/// <remarks>
+/// This implementation will also report the instance as not supporting
+/// marshaling. This will help catch marshaling problems early or avoid
+/// them altogether.
+/// </remarks>
+template <typename TInterface, typename TInterface2, typename TInterface3, typename TInterface4, typename TInterface5, typename TObject>
+HRESULT DoBasicQueryInterface5(TObject* self, REFIID iid, void** ppvObject)
+{
+  if (ppvObject == nullptr) return E_POINTER;
+  if (IsEqualIID(iid, __uuidof(TInterface5))) {
+    *(TInterface5**)ppvObject = self;
+    self->AddRef();
+    return S_OK;
+  }
+
+  return DoBasicQueryInterface4<TInterface, TInterface2, TInterface3, TInterface4, TObject>(self, iid, ppvObject);
+}
+
 template <typename T>
 HRESULT AssignToOut(T value, _Out_ T* pResult) {
   if (pResult == nullptr)

+ 20 - 0
include/dxc/dxcapi.h

@@ -156,12 +156,32 @@ IDxcCompiler : public IUnknown {
     _COM_Outptr_ IDxcOperationResult **ppResult   // Preprocessor output status, buffer, and errors
   ) = 0;
 
+  // Disassemble a program.
   virtual HRESULT STDMETHODCALLTYPE Disassemble(
     _In_ IDxcBlob *pSource,                         // Program to disassemble.
     _COM_Outptr_ IDxcBlobEncoding **ppDisassembly   // Disassembly text.
     ) = 0;
 };
 
+struct __declspec(uuid("A005A9D9-B8BB-4594-B5C9-0E633BEC4D37"))
+IDxcCompiler2 : public IDxcCompiler {
+  // Compile a single entry point to the target shader model with debug information.
+  virtual HRESULT STDMETHODCALLTYPE CompileWithDebug(
+    _In_ IDxcBlob *pSource,                       // Source text to compile
+    _In_opt_ LPCWSTR pSourceName,                 // Optional file name for pSource. Used in errors and include handlers.
+    _In_ LPCWSTR pEntryPoint,                     // Entry point name
+    _In_ LPCWSTR pTargetProfile,                  // Shader profile to compile
+    _In_count_(argCount) LPCWSTR *pArguments,     // Array of pointers to arguments
+    _In_ UINT32 argCount,                         // Number of arguments
+    _In_count_(defineCount) const DxcDefine *pDefines,  // Array of defines
+    _In_ UINT32 defineCount,                      // Number of defines
+    _In_opt_ IDxcIncludeHandler *pIncludeHandler, // user-provided interface to handle #include directives (optional)
+    _COM_Outptr_ IDxcOperationResult **ppResult,  // Compiler output status, buffer, and errors
+    _Outptr_opt_result_z_ LPWSTR *ppDebugBlobName,// Suggested file name for debug blob.
+    _COM_Outptr_opt_ IDxcBlob **ppDebugBlob       // Debug blob
+  ) = 0;
+};
+
 static const UINT32 DxcValidatorFlags_Default = 0;
 static const UINT32 DxcValidatorFlags_InPlaceEdit = 1;  // Validator is allowed to update shader blob in-place.
 static const UINT32 DxcValidatorFlags_RootSignatureOnly = 2;

+ 10 - 0
lib/DxcSupport/HLSLOptions.cpp

@@ -252,6 +252,8 @@ int ReadDxcOpts(const OptTable *optionTable, unsigned flagsToInclude,
   opts.GenSPIRV = Args.hasFlag(OPT_spirv, OPT_INVALID, false); // SPIRV change
   opts.CodeGenHighLevel = Args.hasFlag(OPT_fcgl, OPT_INVALID, false);
   opts.DebugInfo = Args.hasFlag(OPT__SLASH_Zi, OPT_INVALID, false);
+  opts.DebugNameForBinary = Args.hasFlag(OPT_Zsb, OPT_INVALID, false);
+  opts.DebugNameForSource = Args.hasFlag(OPT_Zss, OPT_INVALID, false);
   opts.VariableName = Args.getLastArgValue(OPT_Vn);
   opts.InputFile = Args.getLastArgValue(OPT_INPUT);
   opts.ForceRootSigVer = Args.getLastArgValue(OPT_force_rootsig_ver);
@@ -375,6 +377,14 @@ int ReadDxcOpts(const OptTable *optionTable, unsigned flagsToInclude,
     return 1;
   }
 
+  if (!opts.DebugNameForBinary && !opts.DebugNameForSource) {
+    opts.DebugNameForSource = true;
+  }
+  else if (opts.DebugNameForBinary && opts.DebugNameForSource) {
+    errors << "Cannot specify both /Zss and /Zsb";
+    return 1;
+  }
+
   opts.Args = std::move(Args);
   return 0;
 }
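
The defaulting and mutual-exclusion rule for /Zss and /Zsb in the hunk above can be sketched in isolation. This is a simplified illustration, not the actual `ReadDxcOpts` code: `DebugNameOptions` and `ResolveDebugNameOptions` are hypothetical names introduced here.

```cpp
#include <string>

// Hypothetical stand-in for the two DxcOpts fields involved.
struct DebugNameOptions {
  bool ForBinary; // set by /Zsb
  bool ForSource; // set by /Zss
};

// Returns true on success. On conflict, fills `error` and returns false.
inline bool ResolveDebugNameOptions(DebugNameOptions &opts, std::string &error) {
  if (opts.ForBinary && opts.ForSource) {
    error = "Cannot specify both /Zss and /Zsb";
    return false;
  }
  if (!opts.ForBinary && !opts.ForSource) {
    opts.ForSource = true; // Neither given: the name depends on source.
  }
  return true;
}
```

With neither flag given, the source-dependent name is selected, so callers only need to test `ForSource` afterwards.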

+ 16 - 0
lib/DxcSupport/Unicode.cpp

@@ -122,6 +122,22 @@ std::string UTF16ToUTF8StringOrThrow(_In_z_ const wchar_t *pUTF16) {
   return result;
 }
 
+_Use_decl_annotations_
+bool UTF8BufferToUTF16ComHeap(const char *pUTF8, wchar_t **ppUTF16) throw() {
+  *ppUTF16 = nullptr;
+  int c = ::MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, pUTF8, -1,
+                                nullptr, 0);
+  if (c == 0)
+    return false;
+  CComHeapPtr<wchar_t> p;
+  if (!p.Allocate(c))
+    return false;
+  DXVERIFY_NOMSG(0 < ::MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, pUTF8,
+                                           -1, p.m_pData, c));
+  *ppUTF16 = p.Detach();
+  return true;
+}
+
 _Use_decl_annotations_
 bool UTF8BufferToUTF16Buffer(const char *pUTF8, int cbUTF8, wchar_t **ppUTF16, size_t *pcUTF16) throw() {
   *ppUTF16 = nullptr;

+ 41 - 4
lib/HLSL/DxilContainerAssembler.cpp

@@ -13,6 +13,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/IR/DebugInfo.h"
 #include "llvm/Bitcode/ReaderWriter.h"
+#include "llvm/Support/MD5.h"
 #include "dxc/HLSL/DxilContainer.h"
 #include "dxc/HLSL/DxilModule.h"
 #include "dxc/HLSL/DxilShaderModel.h"
@@ -612,7 +613,8 @@ static void WriteProgramPart(const ShaderModel *pModel,
 
 void hlsl::SerializeDxilContainerForModule(DxilModule *pModule,
                                            AbstractMemoryStream *pModuleBitcode,
-                                           AbstractMemoryStream *pFinalStream) {
+                                           AbstractMemoryStream *pFinalStream,
+                                           SerializeDxilFlags Flags) {
   // TODO: add a flag to update the module and remove information that is not part
   // of DXIL proper and is used only to assemble the container.
 
@@ -680,9 +682,11 @@ void hlsl::SerializeDxilContainerForModule(DxilModule *pModule,
   if (HasDebugInfo(*pModule->GetModule())) {
     uint32_t debugInUInt32, debugPaddingBytes;
     GetPaddedProgramPartSize(pInputProgramStream, debugInUInt32, debugPaddingBytes);
-    writer.AddPart(DFCC_ShaderDebugInfoDXIL, debugInUInt32 * sizeof(uint32_t) + sizeof(DxilProgramHeader), [&](AbstractMemoryStream *pStream) {
-      WriteProgramPart(pModule->GetShaderModel(), pInputProgramStream, pStream);
-    });
+    if (Flags & SerializeDxilFlags::IncludeDebugInfoPart) {
+      writer.AddPart(DFCC_ShaderDebugInfoDXIL, debugInUInt32 * sizeof(uint32_t) + sizeof(DxilProgramHeader), [&](AbstractMemoryStream *pStream) {
+        WriteProgramPart(pModule->GetShaderModel(), pInputProgramStream, pStream);
+      });
+    }
 
     pProgramStream.Release();
 
@@ -694,6 +698,39 @@ void hlsl::SerializeDxilContainerForModule(DxilModule *pModule,
     IFT(CreateMemoryStream(pMalloc, &pProgramStream));
     raw_stream_ostream outStream(pProgramStream.p);
     WriteBitcodeToFile(pModule->GetModule(), outStream, true);
+
+    if (Flags & SerializeDxilFlags::IncludeDebugNamePart) {
+      CComPtr<AbstractMemoryStream> pHashStream;
+      // If the debug name should be specific to the sources, base the name on the debug
+      // bitcode, which will include the source references, line numbers, etc. Otherwise,
+      // do it exclusively on the target shader bitcode.
+      pHashStream = (int)(Flags & SerializeDxilFlags::DebugNameDependOnSource) ? pModuleBitcode : pProgramStream;
+      const uint32_t DebugInfoNameHashLen = 32;   // 32 chars of MD5
+      const uint32_t DebugInfoNameSuffix = 4;     // '.lld'
+      const uint32_t DebugInfoNameNullAndPad = 4; // '\0\0\0\0'
+      const uint32_t DebugInfoContentLen =
+          sizeof(DxilShaderDebugName) + DebugInfoNameHashLen +
+          DebugInfoNameSuffix + DebugInfoNameNullAndPad;
+      writer.AddPart(DFCC_ShaderDebugName, DebugInfoContentLen, [&](AbstractMemoryStream *pStream) {
+        DxilShaderDebugName NameContent;
+        NameContent.Flags = 0;
+        NameContent.NameLength = DebugInfoNameHashLen + DebugInfoNameSuffix;
+        IFT(WriteStreamValue(pStream, NameContent));
+
+        ArrayRef<uint8_t> Data((uint8_t *)pHashStream->GetPtr(), pHashStream->GetPtrSize());
+        llvm::MD5 md5;
+        llvm::MD5::MD5Result md5Result;
+        SmallString<32> Hash;
+        md5.update(Data);
+        md5.final(md5Result);
+        md5.stringifyResult(md5Result, Hash);
+
+        ULONG cbWritten;
+        IFT(pStream->Write(Hash.data(), Hash.size(), &cbWritten));
+        const char SuffixAndPad[] = ".lld\0\0\0";
+        IFT(pStream->Write(SuffixAndPad, _countof(SuffixAndPad), &cbWritten));
+      });
+    }
   }
 
   // Compute padded bitcode size.

+ 4 - 1
lib/HLSL/DxilValidation.cpp

@@ -4070,7 +4070,9 @@ static void ValidateUninitializedOutput(ValidationContext &ValCtx) {
 }
 
 void GetValidationVersion(_Out_ unsigned *pMajor, _Out_ unsigned *pMinor) {
-  // Bump these versions after 1.0 to account for additional validation rules.
+  // 1.0 is the first validator.
+  // 1.1 adds:
+  // - ILDN container part support
   *pMajor = 1;
   *pMinor = 1;
 }
@@ -4334,6 +4336,7 @@ HRESULT ValidateDxilContainerParts(llvm::Module *pModule,
     case DFCC_PrivateData:
     case DFCC_DXIL:
     case DFCC_ShaderDebugInfoDXIL:
+    case DFCC_ShaderDebugName:
       continue;
 
     case DFCC_Container:

+ 38 - 7
tools/clang/tools/dxc/dxc.cpp

@@ -78,6 +78,7 @@ private:
   DxcDllSupport &m_dxcSupport;
 
   int ActOnBlob(IDxcBlob *pBlob);
+  int ActOnBlob(IDxcBlob *pBlob, IDxcBlob *pDebugBlob, LPCWSTR pDebugBlobName);
   void UpdatePart(IDxcBlob *pBlob, IDxcBlob **ppResult);
   bool UpdatePartRequired();
   void WriteHeader(IDxcBlobEncoding *pDisassembly, IDxcBlob *pCode,
@@ -135,6 +136,10 @@ static void WritePartToFile(IDxcBlob *pBlob, hlsl::DxilFourCC CC,
 // This function is called either after the compilation is done or /dumpbin option is provided
 // Performing options that are used to process dxil container.
 int DxcContext::ActOnBlob(IDxcBlob *pBlob) {
+  return ActOnBlob(pBlob, nullptr, nullptr);
+}
+
+int DxcContext::ActOnBlob(IDxcBlob *pBlob, IDxcBlob *pDebugBlob, LPCWSTR pDebugBlobName) {
   int retVal = 0;
   // Text output.
   if (m_Opts.AstDump || m_Opts.OptDump) {
@@ -164,7 +169,14 @@ int DxcContext::ActOnBlob(IDxcBlob *pBlob) {
       "/Zi switch to generate debug "
       "information compiling this shader.");
 
-    WritePartToFile(pBlob, hlsl::DFCC_ShaderDebugInfoDXIL, m_Opts.DebugFile);
+    if (pDebugBlob != nullptr) {
+      IFTBOOLMSG(pDebugBlobName && *pDebugBlobName, E_INVALIDARG,
+        "/Fd was specified but no debug name was produced");
+      WriteBlobToFile(pDebugBlob, pDebugBlobName);
+    }
+    else {
+      WritePartToFile(pBlob, hlsl::DFCC_ShaderDebugInfoDXIL, m_Opts.DebugFile);
+    }
   }
 
   // Extract and write root signature information.
@@ -598,6 +610,8 @@ void DxcContext::Recompile(IDxcBlob *pSource, IDxcLibrary *pLibrary, IDxcCompile
 int DxcContext::Compile() {
   CComPtr<IDxcCompiler> pCompiler;
   CComPtr<IDxcOperationResult> pCompileResult;
+  CComPtr<IDxcBlob> pDebugBlob;
+  std::wstring debugName;
   {
     CComPtr<IDxcBlobEncoding> pSource;
 
@@ -632,11 +646,28 @@ int DxcContext::Compile() {
         TargetProfile = hlsl::ShaderModel::Get(SM->GetKind(), 6, 0)->GetName();
       }
 
-      IFT(pCompiler->Compile(pSource, StringRefUtf16(m_Opts.InputFile),
-        StringRefUtf16(m_Opts.EntryPoint),
-        StringRefUtf16(TargetProfile), args.data(),
-        args.size(), m_Opts.Defines.data(),
-        m_Opts.Defines.size(), pIncludeHandler, &pCompileResult));
+      if (!m_Opts.DebugFile.empty() && m_Opts.DebugFile.endswith(llvm::StringRef("\\"))) {
+        args.push_back(L"/Qstrip_debug"); // implied
+        CComPtr<IDxcCompiler2> pCompiler2;
+        CComHeapPtr<WCHAR> pDebugName;
+        IFT(pCompiler.QueryInterface(&pCompiler2));
+        IFT(pCompiler2->CompileWithDebug(
+            pSource, StringRefUtf16(m_Opts.InputFile),
+            StringRefUtf16(m_Opts.EntryPoint), StringRefUtf16(TargetProfile),
+            args.data(), args.size(), m_Opts.Defines.data(),
+            m_Opts.Defines.size(), pIncludeHandler, &pCompileResult,
+            &pDebugName, &pDebugBlob));
+        if (pDebugName.m_pData) {
+          Unicode::UTF8ToUTF16String(m_Opts.DebugFile.str().c_str(), &debugName);
+          debugName += pDebugName.m_pData;
+        }
+      } else {
+        IFT(pCompiler->Compile(pSource, StringRefUtf16(m_Opts.InputFile),
+          StringRefUtf16(m_Opts.EntryPoint),
+          StringRefUtf16(TargetProfile), args.data(),
+          args.size(), m_Opts.Defines.data(),
+          m_Opts.Defines.size(), pIncludeHandler, &pCompileResult));
+      }
     }
   }
 
@@ -657,7 +688,7 @@ int DxcContext::Compile() {
     pCompiler.Release();
     pCompileResult.Release();
     if (pProgram.p != nullptr) {
-      ActOnBlob(pProgram.p);
+      ActOnBlob(pProgram.p, pDebugBlob, debugName.c_str());
     }
   }
   return status;
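
The /Fd naming rule used in `DxcContext::Compile` above can be sketched on its own: a value ending in a backslash is treated as a directory and the compiler-suggested name is appended. This is a rough illustration only: `ResolveDebugOutputPath` is a hypothetical helper, and in the actual change the non-directory case extracts the ILDB part from the container rather than writing a separate blob.

```cpp
#include <string>

// Hypothetical helper: picks the file path for the debug blob.
// fdValue is the /Fd argument; suggestedName is the generated debug name.
inline std::wstring ResolveDebugOutputPath(const std::wstring &fdValue,
                                           const std::wstring &suggestedName) {
  if (!fdValue.empty() && fdValue.back() == L'\\') {
    return fdValue + suggestedName; // Directory form: auto-generated name.
  }
  return fdValue; // Explicit file name (or empty when /Fd was not given).
}
```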

+ 3 - 1
tools/clang/tools/dxcompiler/dxcassembler.cpp

@@ -121,7 +121,9 @@ HRESULT STDMETHODCALLTYPE DxcAssembler::AssembleToContainer(
     CComPtr<AbstractMemoryStream> pFinalStream;
     IFT(CreateMemoryStream(pMalloc, &pFinalStream));
 
-    SerializeDxilContainerForModule(&M->GetOrCreateDxilModule(), pOutputStream, pFinalStream);
+    SerializeDxilContainerForModule(&M->GetOrCreateDxilModule(), pOutputStream,
+                                    pFinalStream,
+                                    SerializeDxilFlags::IncludeDebugNamePart);
 
     CComPtr<IDxcBlob> pResultBlob;
     IFT(pFinalStream->QueryInterface(&pResultBlob));

+ 109 - 11
tools/clang/tools/dxcompiler/dxcompilerobj.cpp

@@ -120,6 +120,21 @@ HRESULT RunInternalValidator(_In_ IDxcValidator *pValidator,
                              _In_ IDxcBlob *pShader, UINT32 Flags,
                              _In_ IDxcOperationResult **ppResult);
 
+static HRESULT GetValidatorVersion(IDxcValidator *pValidator, UINT32 *pMajor,
+                                   UINT32 *pMinor) {
+  CComPtr<IDxcVersionInfo> pVersionInfo;
+  IFR(pValidator->QueryInterface(&pVersionInfo));
+  IFR(pVersionInfo->GetVersion(pMajor, pMinor));
+  return S_OK;
+}
+
+static bool DoesValidatorSupportDebugNamePart(IDxcValidator *pValidator) {
+  UINT32 Major, Minor;
+  if (FAILED((GetValidatorVersion(pValidator, &Major, &Minor))))
+    return false;
+  return Major > 1 || (Major == 1 && Minor >= 1);
+}
+
 enum class HandleKind {
   Special = 0,
   File = 1,
@@ -1949,13 +1964,17 @@ public:
   { }
 
   void CloneForDebugInfo() {
-      m_llvmModuleWithDebugInfo.reset(llvm::CloneModule(m_llvmModule.get()));
+    m_llvmModuleWithDebugInfo.reset(llvm::CloneModule(m_llvmModule.get()));
   }
 
- void WrapModuleInDxilContainer(IMalloc *pMalloc,  AbstractMemoryStream *pModuleBitcode, CComPtr<IDxcBlob> &pDxilContainerBlob) {
+  void WrapModuleInDxilContainer(IMalloc *pMalloc,
+                                 AbstractMemoryStream *pModuleBitcode,
+                                 CComPtr<IDxcBlob> &pDxilContainerBlob,
+                                 SerializeDxilFlags Flags) {
     CComPtr<AbstractMemoryStream> pContainerStream;
     IFT(CreateMemoryStream(pMalloc, &pContainerStream));
-    SerializeDxilContainerForModule(&m_llvmModule->GetOrCreateDxilModule(), pModuleBitcode, pContainerStream);
+    SerializeDxilContainerForModule(&m_llvmModule->GetOrCreateDxilModule(),
+                                    pModuleBitcode, pContainerStream, Flags);
 
     pDxilContainerBlob.Release();
     IFT(pContainerStream.QueryInterface(&pDxilContainerBlob));
@@ -1969,7 +1988,7 @@ private:
   std::unique_ptr<llvm::Module> m_llvmModuleWithDebugInfo;
 };
 
-class DxcCompiler : public IDxcCompiler, public IDxcLangExtensions, public IDxcContainerEvent, public IDxcVersionInfo {
+class DxcCompiler : public IDxcCompiler2, public IDxcLangExtensions, public IDxcContainerEvent, public IDxcVersionInfo {
 private:
   DXC_MICROCOM_REF_FIELD(m_dwRef)
   DxcLangExtensionsHelper m_langExtensionsHelper;
@@ -2028,7 +2047,7 @@ public:
   }
 
   HRESULT STDMETHODCALLTYPE QueryInterface(REFIID iid, void **ppvObject) {
-    return DoBasicQueryInterface4<IDxcCompiler, IDxcLangExtensions, IDxcContainerEvent, IDxcVersionInfo>(this, iid, ppvObject);
+    return DoBasicQueryInterface5<IDxcCompiler, IDxcCompiler2, IDxcLangExtensions, IDxcContainerEvent, IDxcVersionInfo>(this, iid, ppvObject);
   }
 
   // Compile a single entry point to the target shader model
@@ -2043,22 +2062,45 @@ public:
     _In_ UINT32 defineCount,                      // Number of defines
     _In_opt_ IDxcIncludeHandler *pIncludeHandler, // user-provided interface to handle #include directives (optional)
     _COM_Outptr_ IDxcOperationResult **ppResult   // Compiler output status, buffer, and errors
-    ) {
+  ) {
+    return CompileWithDebug(pSource, pSourceName, pEntryPoint, pTargetProfile,
+                            pArguments, argCount, pDefines, defineCount,
+                            pIncludeHandler, ppResult, nullptr, nullptr);
+  }
+
+  // Compile a single entry point to the target shader model with debug information.
+  __override HRESULT STDMETHODCALLTYPE CompileWithDebug(
+    _In_ IDxcBlob *pSource,                       // Source text to compile
+    _In_opt_ LPCWSTR pSourceName,                 // Optional file name for pSource. Used in errors and include handlers.
+    _In_ LPCWSTR pEntryPoint,                     // Entry point name
+    _In_ LPCWSTR pTargetProfile,                  // Shader profile to compile
+    _In_count_(argCount) LPCWSTR *pArguments,     // Array of pointers to arguments
+    _In_ UINT32 argCount,                         // Number of arguments
+    _In_count_(defineCount) const DxcDefine *pDefines,  // Array of defines
+    _In_ UINT32 defineCount,                      // Number of defines
+    _In_opt_ IDxcIncludeHandler *pIncludeHandler, // user-provided interface to handle #include directives (optional)
+    _COM_Outptr_ IDxcOperationResult **ppResult,  // Compiler output status, buffer, and errors
+    _Outptr_opt_result_z_ LPWSTR *ppDebugBlobName,// Suggested file name for debug blob.
+    _COM_Outptr_opt_ IDxcBlob **ppDebugBlob       // Debug blob
+  ) {
     if (pSource == nullptr || ppResult == nullptr ||
         (defineCount > 0 && pDefines == nullptr) ||
         (argCount > 0 && pArguments == nullptr) || pEntryPoint == nullptr ||
         pTargetProfile == nullptr)
       return E_INVALIDARG;
     *ppResult = nullptr;
+    AssignToOutOpt(nullptr, ppDebugBlobName);
+    AssignToOutOpt(nullptr, ppDebugBlob);
 
     HRESULT hr = S_OK;
     CComPtr<IDxcBlobEncoding> utf8Source;
+    CComPtr<AbstractMemoryStream> pOutputStream;
+    CComHeapPtr<wchar_t> DebugBlobName; // COM heap, matching UTF8BufferToUTF16ComHeap
     DxcEtw_DXCompilerCompile_Start();
     IFC(hlsl::DxcGetBlobAsUtf8(pSource, &utf8Source));
 
     try {
       CComPtr<IMalloc> pMalloc;
-      CComPtr<AbstractMemoryStream> pOutputStream;
       CComPtr<IDxcBlob> pOutputBlob;
       DxcArgsFileSystem *msfPtr;
       IFT(CreateDxcArgsFileSystem(utf8Source, pSourceName, pIncludeHandler, &msfPtr));
@@ -2149,7 +2191,8 @@ public:
 
       // NOTE: this calls the validation component from dxil.dll; the built-in
       // validator can be used as a fallback.
-      bool needsValidation = !opts.CodeGenHighLevel && !opts.DisableValidation;
+      bool produceFullContainer = !opts.CodeGenHighLevel && !opts.AstDump && !opts.OptDump && rootSigMajor == 0;
+      bool needsValidation = produceFullContainer && !opts.DisableValidation;
       bool internalValidator = false;
       CComPtr<IDxcValidator> pValidator;
       CComPtr<IDxcOperationResult> pValResult;
@@ -2241,6 +2284,19 @@ public:
         }
         outStream.flush();
 
+        SerializeDxilFlags SerializeFlags = SerializeDxilFlags::None;
+        if (opts.DebugInfo) {
+          if (DoesValidatorSupportDebugNamePart(pValidator))
+            SerializeFlags = SerializeDxilFlags::IncludeDebugNamePart;
+          // Unless we want to strip it right away, include it in the container.
+          if (!opts.StripDebug || ppDebugBlob == nullptr) {
+            SerializeFlags |= SerializeDxilFlags::IncludeDebugInfoPart;
+          }
+        }
+        if (opts.DebugNameForSource) {
+          SerializeFlags |= SerializeDxilFlags::DebugNameDependOnSource;
+        }
+
         // Don't do work to put in a container if an error has occurred
         if (compileOK) {
           HRESULT valHR = S_OK;
@@ -2257,7 +2313,7 @@ public:
 
           // Do not create a container when there is only a high-level representation in the module.
           if (!opts.CodeGenHighLevel)
-            llvmModule.WrapModuleInDxilContainer(pMalloc, pOutputStream, pOutputBlob);
+            llvmModule.WrapModuleInDxilContainer(pMalloc, pOutputStream, pOutputBlob, SerializeFlags);
 
           if (needsValidation) {
             // Important: in-place edit is required so the blob is reused and thus
@@ -2291,6 +2347,7 @@ public:
             }
             pValidator.Release();
           }
+
           // Callback after valid DXIL is produced
           if (SUCCEEDED(valHR)) {
             CComPtr<IDxcBlob> pTargetBlob;
@@ -2300,6 +2357,19 @@ public:
                 std::swap(pOutputBlob, pTargetBlob);
               }
             }
+
+            if (ppDebugBlobName && produceFullContainer) {
+              const DxilContainerHeader *pContainer = reinterpret_cast<DxilContainerHeader *>(pOutputBlob->GetBufferPointer());
+              DXASSERT(IsValidDxilContainer(pContainer, pOutputBlob->GetBufferSize()), "else invalid container generated");
+              auto it = std::find_if(begin(pContainer), end(pContainer),
+                DxilPartIsType(DFCC_ShaderDebugName));
+              if (it != end(pContainer)) {
+                const char *pDebugName;
+                if (GetDxilShaderDebugName(*it, &pDebugName, nullptr) && pDebugName && *pDebugName) {
+                  IFTBOOL(Unicode::UTF8BufferToUTF16ComHeap(pDebugName, &DebugBlobName), DXC_E_CONTAINER_INVALID);
+                }
+              }
+            }
           }
         }
       }
@@ -2309,6 +2379,19 @@ public:
 
       CreateOperationResultFromOutputs(pOutputBlob, msfPtr, warnings,
                                        compiler.getDiagnostics(), ppResult);
+
+      // On success, return values. After assigning ppResult, nothing should fail.
+      HRESULT status;
+      DXVERIFY_NOMSG(SUCCEEDED((*ppResult)->GetStatus(&status)));
+      if (SUCCEEDED(status)) {
+        if (opts.DebugInfo && ppDebugBlob) {
+          DXVERIFY_NOMSG(SUCCEEDED(pOutputStream.QueryInterface(ppDebugBlob)));
+        }
+        if (ppDebugBlobName) {
+          *ppDebugBlobName = DebugBlobName.Detach();
+        }
+      }
+
       hr = S_OK;
     }
     CATCH_CPP_ASSIGN_HRESULT();
@@ -2501,6 +2584,18 @@ public:
                          Stream, /*comment*/";");
         }
 
+        it = std::find_if(begin(pContainer), end(pContainer),
+                          DxilPartIsType(DFCC_ShaderDebugName));
+        if (it != end(pContainer)) {
+          const char *pDebugName;
+          if (!GetDxilShaderDebugName(*it, &pDebugName, nullptr)) {
+            Stream << "; shader debug name present; corruption detected\n";
+          }
+          else if (pDebugName && *pDebugName) {
+            Stream << "; shader debug name: " << pDebugName << "\n";
+          }
+        }
+
         it = std::find_if(begin(pContainer), end(pContainer),
                                            DxilPartIsType(DFCC_DXIL));
         if (it == end(pContainer)) {
@@ -2673,9 +2768,12 @@ public:
     else
       compiler.getCodeGenOpts().HLSLSignaturePackingStrategy = (unsigned)DXIL::PackingStrategy::Default;
 
-    // Constructing vector of wide strings to pass in to codegen. Just passing in pArguments will expose ownership of memory to both CodeGenOptions and this caller, which can lead to unexpected behavior.
+    // Constructing vector of wide strings to pass in to codegen. Just passing
+    // in pArguments will expose ownership of memory to both CodeGenOptions and
+    // this caller, which can lead to unexpected behavior.
     for (UINT32 i = 0; i != argCount; ++i) {
-      compiler.getCodeGenOpts().HLSLArguments.emplace_back(Unicode::UTF16ToUTF8StringOrThrow(pArguments[i]));
+      compiler.getCodeGenOpts().HLSLArguments.emplace_back(
+          Unicode::UTF16ToUTF8StringOrThrow(pArguments[i]));
     }
     // Overriding default set of loop unroll.
     if (Opts.PreferFlowControl)

+ 82 - 5
tools/clang/unittests/HLSL/DxilContainerTest.cpp

@@ -65,6 +65,7 @@ public:
     TEST_METHOD_PROPERTY(L"Priority", L"0")
   END_TEST_CLASS()
 
+  TEST_METHOD(CompileWhenDebugSourceThenSourceMatters)
   TEST_METHOD(CompileWhenOKThenIncludesFeatureInfo)
   TEST_METHOD(CompileWhenOKThenIncludesSignatures)
   TEST_METHOD(CompileWhenSigSquareThenIncludeSplit)
@@ -407,20 +408,59 @@ public:
                    __uuidof(ID3D12ShaderReflection), (void **)ppReflection));
   }
 
-  std::string DisassembleProgram(LPCSTR program, LPCWSTR entryPoint,
-                                 LPCWSTR target) {
+  void CompileToProgram(LPCSTR program, LPCWSTR entryPoint, LPCWSTR target,
+                        LPCWSTR *pArguments, UINT32 argCount,
+                        IDxcBlob **ppProgram) {
     CComPtr<IDxcCompiler> pCompiler;
     CComPtr<IDxcBlobEncoding> pSource;
     CComPtr<IDxcBlob> pProgram;
-    CComPtr<IDxcBlobEncoding> pDisassembly;
     CComPtr<IDxcOperationResult> pResult;
 
     VERIFY_SUCCEEDED(CreateCompiler(&pCompiler));
     CreateBlobFromText(program, &pSource);
     VERIFY_SUCCEEDED(pCompiler->Compile(pSource, L"hlsl.hlsl", entryPoint,
-                                        target, nullptr, 0, nullptr, 0, nullptr,
-                                        &pResult));
+                                        target, pArguments, argCount, nullptr,
+                                        0, nullptr, &pResult));
     VERIFY_SUCCEEDED(pResult->GetResult(&pProgram));
+    *ppProgram = pProgram.Detach();
+  }
+
+  bool DoesValidatorSupportDebugName() {
+    CComPtr<IDxcVersionInfo> pVersionInfo;
+    UINT Major, Minor;
+    HRESULT hrVer = m_dllSupport.CreateInstance(CLSID_DxcValidator, &pVersionInfo);
+    if (hrVer == E_NOINTERFACE) return false;
+    VERIFY_SUCCEEDED(hrVer);
+    VERIFY_SUCCEEDED(pVersionInfo->GetVersion(&Major, &Minor));
+    return Major == 1 && (Minor >= 1);
+  }
+
+  std::string CompileToDebugName(LPCSTR program, LPCWSTR entryPoint,
+                                 LPCWSTR target, LPCWSTR *pArguments, UINT32 argCount) {
+    CComPtr<IDxcBlob> pProgram;
+    CComPtr<IDxcBlob> pNameBlob;
+    CComPtr<IDxcContainerReflection> pContainer;
+    UINT32 index;
+
+    CompileToProgram(program, entryPoint, target, pArguments, argCount, &pProgram);
+    VERIFY_SUCCEEDED(m_dllSupport.CreateInstance(CLSID_DxcContainerReflection, &pContainer));
+    VERIFY_SUCCEEDED(pContainer->Load(pProgram));
+    if (FAILED(pContainer->FindFirstPartKind(hlsl::DFCC_ShaderDebugName, &index))) {
+      return std::string();
+    }
+    VERIFY_SUCCEEDED(pContainer->GetPartContent(index, &pNameBlob));
+    const hlsl::DxilShaderDebugName *pDebugName = (hlsl::DxilShaderDebugName *)pNameBlob->GetBufferPointer();
+    return std::string((const char *)(pDebugName + 1));
+  }
+
+  std::string DisassembleProgram(LPCSTR program, LPCWSTR entryPoint,
+                                 LPCWSTR target) {
+    CComPtr<IDxcCompiler> pCompiler;
+    CComPtr<IDxcBlob> pProgram;
+    CComPtr<IDxcBlobEncoding> pDisassembly;
+
+    CompileToProgram(program, entryPoint, target, nullptr, 0, &pProgram);
+    VERIFY_SUCCEEDED(CreateCompiler(&pCompiler));
     VERIFY_SUCCEEDED(pCompiler->Disassemble(pProgram, &pDisassembly));
     return BlobToUtf8(pDisassembly);
   }
@@ -481,6 +521,43 @@ public:
   }
 };
 
+TEST_F(DxilContainerTest, CompileWhenDebugSourceThenSourceMatters) {
+  char program1[] = "float4 main() : SV_Target { return 0; }";
+  char program2[] = "  float4 main() : SV_Target { return 0; }  ";
+  LPCWSTR Zi[] = { L"/Zi" };
+  LPCWSTR ZiZss[] = { L"/Zi", L"/Zss" };
+  LPCWSTR ZiZsb[] = { L"/Zi", L"/Zsb" };
+  
+  // No debug info, no debug name...
+  std::string noName = CompileToDebugName(program1, L"main", L"ps_6_0", nullptr, 0);
+  VERIFY_IS_TRUE(noName.empty());
+
+  if (!DoesValidatorSupportDebugName())
+    return;
+
+  // Debug info, default to source name.
+  std::string sourceName1 = CompileToDebugName(program1, L"main", L"ps_6_0", Zi, _countof(Zi));
+  VERIFY_IS_FALSE(sourceName1.empty());
+
+  // Deterministic naming.
+  std::string sourceName1Again = CompileToDebugName(program1, L"main", L"ps_6_0", Zi, _countof(Zi));
+  VERIFY_ARE_EQUAL_STR(sourceName1.c_str(), sourceName1Again.c_str());
+
+  // Changes in source become changes in name.
+  std::string sourceName2 = CompileToDebugName(program2, L"main", L"ps_6_0", Zi, _countof(Zi));
+  VERIFY_IS_FALSE(0 == strcmp(sourceName2.c_str(), sourceName1.c_str()));
+
+  // Source again, different because different switches are specified.
+  std::string sourceName1Zss = CompileToDebugName(program1, L"main", L"ps_6_0", ZiZss, _countof(ZiZss));
+  VERIFY_IS_FALSE(0 == strcmp(sourceName1Zss.c_str(), sourceName1.c_str()));
+
+  // Binary program 1 and 2 should be different from source and equal to each other.
+  std::string binName1 = CompileToDebugName(program1, L"main", L"ps_6_0", ZiZsb, _countof(ZiZsb));
+  std::string binName2 = CompileToDebugName(program2, L"main", L"ps_6_0", ZiZsb, _countof(ZiZsb));
+  VERIFY_ARE_EQUAL_STR(binName1.c_str(), binName2.c_str());
+  VERIFY_IS_FALSE(0 == strcmp(sourceName1Zss.c_str(), binName1.c_str()));
+}
+
 TEST_F(DxilContainerTest, CompileWhenOKThenIncludesSignatures) {
   char program[] =
     "struct PSInput {\r\n"

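The `CompileToDebugName` helper above reads the name by stepping past the part header (`pDebugName + 1`) and trusting the bytes that follow. A more defensive reader might look like the minimal sketch below. It assumes the header is two 16-bit fields (flags, then name length in bytes) followed by a null-terminated UTF-8 name; `DebugNameHeader` and `ReadDebugName` are hypothetical names for illustration and are not part of this change.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for hlsl::DxilShaderDebugName. The field layout is an
// assumption: a fixed-size header immediately followed by the name bytes.
struct DebugNameHeader {
  uint16_t Flags;
  uint16_t NameLength; // assumed: length in bytes, excluding the terminator
};

// Returns true and fills *name only if the buffer is large enough for the
// header, the declared name length, and a null terminator at the expected
// position. Otherwise returns false and leaves *name untouched.
bool ReadDebugName(const void *part, size_t partSize, std::string *name) {
  if (part == nullptr || name == nullptr || partSize < sizeof(DebugNameHeader))
    return false;

  DebugNameHeader header;
  std::memcpy(&header, part, sizeof(header)); // no alignment assumptions

  const size_t needed =
      sizeof(header) + static_cast<size_t>(header.NameLength) + 1;
  if (partSize < needed)
    return false;

  const char *text = static_cast<const char *>(part) + sizeof(header);
  if (text[header.NameLength] != '\0')
    return false;

  name->assign(text, header.NameLength);
  return true;
}
```

In the test helper, the buffer and size would come from `pNameBlob->GetBufferPointer()` and `pNameBlob->GetBufferSize()`. A corrupt part would then return false and could fail the test, rather than reading past the end of the blob.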
+ 4 - 1
utils/hct/hctdb_instrhelp.py

@@ -819,7 +819,10 @@ def get_opcodes_rst():
     rows = []
     rows.append(["ID", "Name", "Description"])
     for i in instrs:
-        rows.append([i.dxil_opid, i.dxil_op + "_", i.doc]) #append _ to enable internal hyperlink on rst files
+        op_name = i.dxil_op
+        if i.remarks:
+            op_name = op_name + "_" # append _ to enable internal hyperlink on rst files
+        rows.append([i.dxil_opid, op_name, i.doc])
     result = "\n\n" + format_rst_table(rows) + "\n\n"
     # Add detailed instruction information where available.
     instrs = sorted(instrs, key=lambda v : v.name)

+ 39 - 0
utils/hct/hcttestcmds.cmd

@@ -38,6 +38,40 @@ if %errorlevel% neq 0 (
   exit /b 1
 )
 
+rem When dxil.dll is present, /Fd with trailing backslash will not produce a name.
+if exist dxil.dll (
+  echo Skipping /Fd with trailing backslash when dxil.dll is present.
+  echo A future dxil.dll will provide this information.
+  goto :skipfdtrail
+)
+dxc.exe /T ps_6_0 smoke.hlsl /Zi /Fd %CD%\ /Fo smoke.hlsl.strip 1>nul
+if %errorlevel% neq 0 (
+  echo Failed - %CD%\dxc.exe /T ps_6_0 smoke.hlsl /Zi /Fd %CD%\
+  call :cleanup 2>nul
+  exit /b 1
+)
+rem .lld file should be produced
+dir %CD%\*.lld 1>nul
+if %errorlevel% neq 0 (
+  echo Failed to find some .lld file at %CD%
+  call :cleanup 2>nul
+  exit /b 1
+)
+rem /Fd with trailing backslash implies /Qstrip_debug
+dxc.exe -dumpbin smoke.hlsl.strip | findstr "shader debug name" 1>nul
+if %errorlevel% neq 0 (
+  echo Failed to find shader debug name.
+  call :cleanup 2>nul
+  exit /b 1
+)
+dxc.exe -dumpbin smoke.hlsl.strip | findstr "DICompileUnit" 1>nul
+if %errorlevel% equ 0 (
+  echo Found DICompileUnit after implied strip.
+  call :cleanup 2>nul
+  exit /b 1
+)
+
+:skipfdtrail
 dxc.exe /T ps_6_0 smoke.hlsl /Fe smoke.hlsl.e 1>nul
 if %errorlevel% neq 0 (
   echo Failed - %CD%\dxc.exe /T ps_6_0 smoke.hlsl /Fe %CD%\smoke.hlsl.e
@@ -435,6 +469,9 @@ if %errorlevel% neq 0 (
   exit /b 1
 )
 
+echo Smoke test for debug info extraction.
+dxc.exe smoke.hlsl
+
 call :cleanup
 exit /b 0
 
@@ -444,6 +481,7 @@ del %CD%\smoke.hlsl.c
 del %CD%\smoke.hlsl.d
 del %CD%\smoke.hlsl.e
 del %CD%\smoke.hlsl.h
+del %CD%\smoke.hlsl.strip
 del %CD%\smoke.cso
 del %CD%\NonUniform.cso
 del %CD%\private.cso
@@ -459,6 +497,7 @@ del %CD%\NonUniformNoRootSig.cso
 del %CD%\TextVS.cso
 del %CD%\smoke.ll
 del %CD%\smoke.cso.ll
+del %CD%\*.lld
 del %CD%\smoke.cso.plain.bc
 del %CD%\smoke.rebuilt-container.cso
 del %CD%\smoke.rebuilt-container2.cso

+ 4 - 5
utils/hct/hctvs.cmd

@@ -16,13 +16,12 @@ if not exist "%HLSL_BLD_DIR%\LLVM.sln" (
 )
 
 if not exist "%ProgramFiles(x86)%\Microsoft Visual Studio 14.0\Common7\IDE\devenv.exe" (
-  echo Missing Visual Studio at "%ProgramFiles(x86)%\Microsoft Visual Studio 14.0\Common7\IDE\devenv.exe"
-  exit /b 1
+  start %HLSL_BLD_DIR%\LLVM.sln
+) else (
+  start "%ProgramFiles(x86)%\Microsoft Visual Studio 14.0\Common7\IDE\devenv.exe" %HLSL_BLD_DIR%\LLVM.sln
 )
 
-start "%ProgramFiles(x86)%\Microsoft Visual Studio 14.0\Common7\IDE\devenv.exe" %HLSL_BLD_DIR%\LLVM.sln
-
-exit /b 0
+goto :eof
 
 :showhelp
 echo Launches Visual Studio and opens the solution file.