  1. ========
  2. TableGen
  3. ========
  4. .. contents::
  5. :local:
  6. .. toctree::
  7. :hidden:
  8. BackEnds
  9. LangRef
  10. LangIntro
  11. Deficiencies
  12. Introduction
  13. ============
  14. TableGen's purpose is to help a human develop and maintain records of
  15. domain-specific information. Because there may be a large number of these
  16. records, it is specifically designed to allow flexible descriptions and to
  17. factor out the common features of these records. This reduces the
  18. amount of duplication in the description, reduces the chance of error, and makes
  19. it easier to structure domain-specific information.
  20. The core part of TableGen parses a file, instantiates the declarations, and
  21. hands the result off to a domain-specific `backend`_ for processing.
  22. The current major users of TableGen are :doc:`../CodeGenerator`
  23. and the
  24. `Clang diagnostics and attributes <http://clang.llvm.org/docs/UsersManual.html#controlling-errors-and-warnings>`_.
  25. Note that if you work with TableGen much and use emacs or vim, you can find
  26. an emacs "TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and
  27. ``llvm/utils/vim`` directories of your LLVM distribution, respectively.
  28. .. _intro:
  29. The TableGen program
  30. ====================
  31. TableGen files are interpreted by the TableGen program, ``llvm-tblgen``, available
  32. in your build directory under ``bin``. It is not installed in the system (or where
  33. your sysroot is set to), since it has no use beyond LLVM's build process.
  34. Running TableGen
  35. ----------------
  36. TableGen runs just like any other LLVM tool. The first (optional) argument
  37. specifies the file to read. If a filename is not specified, ``llvm-tblgen``
  38. reads from standard input.
  39. To be useful, one of the `backends`_ must be used. These backends are
  40. selectable on the command line (type '``llvm-tblgen -help``' for a list). For
  41. example, to get a list of all of the definitions that subclass a particular type
  42. (which can be useful for building up an enum list of these records), use the
  43. ``-print-enums`` option:
  44. .. code-block:: bash
  45. $ llvm-tblgen X86.td -print-enums -class=Register
  46. AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX,
  47. ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP,
  48. MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D,
  49. R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15,
  50. R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI,
  51. RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7,
  52. XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5,
  53. XMM6, XMM7, XMM8, XMM9,
  54. $ llvm-tblgen X86.td -print-enums -class=Instruction
  55. ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri,
  56. ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8,
  57. ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm,
  58. ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
  59. ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...
  60. The default backend prints out all of the records.
  61. If you plan to use TableGen, you will most likely have to write a `backend`_
  62. that extracts the information specific to what you need and formats it in the
  63. appropriate way.
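To experiment with the commands above without a full target description, a
small input file is enough. The following is a minimal sketch, not taken from
the LLVM sources: the file name ``regs.td``, the ``Register`` class and its
``AsmName`` field are hypothetical stand-ins for the much richer X86 ones.

.. code-block:: llvm

  // regs.td (hypothetical): a toy register description.
  // "Register" here is a stand-in class, not the one from the LLVM sources.
  class Register<string n> {
    string AsmName = n;   // assembly spelling of the register
  }

  // Three concrete records that derive from Register.
  def AL  : Register<"al">;
  def AX  : Register<"ax">;
  def EAX : Register<"eax">;

Running ``llvm-tblgen regs.td`` on such a file should print the class followed
by the three definitions, while
``llvm-tblgen regs.td -print-enums -class=Register`` should list only the
names of the records that derive from ``Register``.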
  64. Example
  65. -------
  66. With no other arguments, ``llvm-tblgen`` parses the specified file and prints out all
  67. of the classes, then all of the definitions. This is a good way to see what the
  68. various definitions expand to fully. Running this on the ``X86.td`` file prints
  69. this (at the time of this writing):
  70. .. code-block:: llvm
  71. ...
  72. def ADD32rr { // Instruction X86Inst I
  73. string Namespace = "X86";
  74. dag OutOperandList = (outs GR32:$dst);
  75. dag InOperandList = (ins GR32:$src1, GR32:$src2);
  76. string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}";
  77. list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))];
  78. list<Register> Uses = [];
  79. list<Register> Defs = [EFLAGS];
  80. list<Predicate> Predicates = [];
  81. int CodeSize = 3;
  82. int AddedComplexity = 0;
  83. bit isReturn = 0;
  84. bit isBranch = 0;
  85. bit isIndirectBranch = 0;
  86. bit isBarrier = 0;
  87. bit isCall = 0;
  88. bit canFoldAsLoad = 0;
  89. bit mayLoad = 0;
  90. bit mayStore = 0;
  91. bit isImplicitDef = 0;
  92. bit isConvertibleToThreeAddress = 1;
  93. bit isCommutable = 1;
  94. bit isTerminator = 0;
  95. bit isReMaterializable = 0;
  96. bit isPredicable = 0;
  97. bit hasDelaySlot = 0;
  98. bit usesCustomInserter = 0;
  99. bit hasCtrlDep = 0;
  100. bit isNotDuplicable = 0;
  101. bit hasSideEffects = 0;
  102. InstrItinClass Itinerary = NoItinerary;
  103. string Constraints = "";
  104. string DisableEncoding = "";
  105. bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
  106. Format Form = MRMDestReg;
  107. bits<6> FormBits = { 0, 0, 0, 0, 1, 1 };
  108. ImmType ImmT = NoImm;
  109. bits<3> ImmTypeBits = { 0, 0, 0 };
  110. bit hasOpSizePrefix = 0;
  111. bit hasAdSizePrefix = 0;
  112. bits<4> Prefix = { 0, 0, 0, 0 };
  113. bit hasREX_WPrefix = 0;
  114. FPFormat FPForm = ?;
  115. bits<3> FPFormBits = { 0, 0, 0 };
  116. }
  117. ...
  118. This definition corresponds to the 32-bit register-register ``add`` instruction
  119. of the x86 architecture. ``def ADD32rr`` defines a record named
  120. ``ADD32rr``, and the comment at the end of the line indicates the superclasses
  121. of the definition. The body of the record contains all of the data that
  122. TableGen assembled for the record, indicating that the instruction is part of
  123. the "X86" namespace, the pattern indicating how the instruction is selected by
  124. the code generator, that it is a two-address instruction, has a particular
  125. encoding, etc. The contents and semantics of the information in the record are
  126. specific to the needs of the X86 backend, and are only shown as an example.
  127. As you can see, a lot of information is needed for every instruction supported
  128. by the code generator, and specifying it all manually would be unmaintainable,
  129. prone to bugs, and tiring to do in the first place. Because we are using
  130. TableGen, all of the information was derived from the following definition:
  131. .. code-block:: llvm
  132. let Defs = [EFLAGS],
  133. isCommutable = 1, // X = ADD Y,Z --> X = ADD Z,Y
  134. isConvertibleToThreeAddress = 1 in // Can transform into LEA.
  135. def ADD32rr : I<0x01, MRMDestReg, (outs GR32:$dst),
  136. (ins GR32:$src1, GR32:$src2),
  137. "add{l}\t{$src2, $dst|$dst, $src2}",
  138. [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;
  139. This definition makes use of the custom class ``I`` (extended from the custom
  140. class ``X86Inst``), which is defined in the X86-specific TableGen file, to
  141. factor out the common features that instructions of its class share. A key
  142. feature of TableGen is that it allows the end-user to define the abstractions
  143. they prefer to use when describing their information.
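The factoring described above can be tried on a much smaller scale. The
following is a simplified sketch under stated assumptions: the class ``Inst``
and its fields are hypothetical and only loosely modelled on the X86 classes,
and ``Defs`` is a list of strings here rather than a list of registers.

.. code-block:: llvm

  // Hypothetical base class: shared fields with conservative defaults.
  class Inst<bits<8> opcode, string asm> {
    bits<8> Opcode = opcode;
    string AsmString = asm;
    bit isCommutable = 0;        // default, overridden per record below
    list<string> Defs = [];      // simplified: names instead of Register records
  }

  // "let ... in" overrides the defaults for the definition that follows.
  let isCommutable = 1, Defs = ["EFLAGS"] in
  def ADDrr : Inst<0x01, "add">;

  // The same kind of override can be written inside the record body.
  def SUBrr : Inst<0x29, "sub"> {
    let Defs = ["EFLAGS"];
  }

Each ``def`` states only what distinguishes it from the other records; the
remaining fields come from the class.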
  144. Each ``def`` record has a special entry called "NAME". This is the name of the
  145. record ("``ADD32rr``" above). In the general case ``def`` names can be formed
  146. from various kinds of string processing expressions and ``NAME`` resolves to the
  147. final value obtained after resolving all of those expressions. The user may
  148. refer to ``NAME`` anywhere she desires to use the ultimate name of the ``def``.
  149. ``NAME`` should not be defined anywhere else in user code to avoid conflicts.
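As a small illustration of ``NAME``, the sketch below uses it inside a class
body so that every record derived from the class picks up its own name. The
class ``Described`` and the field ``Description`` are hypothetical names
chosen for this example.

.. code-block:: llvm

  // Hypothetical class: NAME is supplied by TableGen and is not declared here.
  class Described {
    string Description = "record " # NAME;
  }

  // In each record below, NAME should resolve to the name of that def.
  def ADD32rr : Described;
  def SUB32rr : Described;

Printing the records with ``llvm-tblgen`` is a quick way to check what
``Description`` expands to in each definition.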
  150. Syntax
  151. ======
  152. TableGen has a syntax that is loosely based on C++ templates, with built-in
  153. types and specification. In addition, TableGen's syntax introduces some
  154. automation concepts like multiclass, foreach, let, etc.
  155. Basic concepts
  156. --------------
  157. TableGen files consist of two key parts: 'classes' and 'definitions', both of
  158. which are considered 'records'.
  159. **TableGen records** have a unique name, a list of values, and a list of
  160. superclasses. The list of values is the main data that TableGen builds for each
  161. record; it is this that holds the domain-specific information for the
  162. application. The interpretation of this data is left to a specific `backend`_,
  163. but the structure and format rules are taken care of and are fixed by
  164. TableGen.
  165. **TableGen definitions** are the concrete form of 'records'. These generally do
  166. not have any undefined values, and are marked with the '``def``' keyword.
  167. .. code-block:: llvm
  168. def FeatureFPARMv8 : SubtargetFeature<"fp-armv8", "HasFPARMv8", "true",
  169. "Enable ARMv8 FP">;
  170. In this example, ``FeatureFPARMv8`` is a ``SubtargetFeature`` record initialised
  171. with some values. Classes such as ``SubtargetFeature`` are defined with the
  172. ``class`` keyword, either in the same file or in an included one. Most target
  173. TableGen files include the generic ones in ``include/llvm/Target``.
  174. **TableGen classes** are abstract records that are used to build and describe
  175. other records. These classes allow the end-user to build abstractions for
  176. either the domain they are targeting (such as "Register", "RegisterClass", and
  177. "Instruction" in the LLVM code generator) or for the implementor to help factor
  178. out common properties of records (such as "FPInst", which is used to represent
  179. floating point instructions in the X86 backend). TableGen keeps track of all of
  180. the classes that are used to build up a definition, so the backend can find all
  181. definitions of a particular class, such as "Instruction".
  182. .. code-block:: llvm
  183. class ProcNoItin<string Name, list<SubtargetFeature> Features>
  184. : Processor<Name, NoItineraries, Features>;
  185. Here, the class ``ProcNoItin`` takes a parameter ``Name`` of type ``string`` and
  186. a list of target features, and specializes the class ``Processor`` by passing
  187. both arguments down while hard-coding ``NoItineraries``.
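The ``ProcNoItin`` fragment depends on classes defined in the generic target
files. To try the same pattern in isolation, those classes can be replaced by
simplified stand-ins, as in the sketch below. The bodies of
``SubtargetFeature``, ``ProcessorItineraries`` and ``Processor`` are
assumptions made for this example and are far smaller than the real ones;
``FeatureFPU`` and the processor name ``"generic"`` are hypothetical.

.. code-block:: llvm

  // Simplified stand-ins for the generic classes (assumed for this sketch).
  class SubtargetFeature<string n> {
    string Name = n;
  }

  class ProcessorItineraries;
  def NoItineraries : ProcessorItineraries;

  class Processor<string n, ProcessorItineraries pi,
                  list<SubtargetFeature> f> {
    string Name = n;
    ProcessorItineraries ProcItin = pi;
    list<SubtargetFeature> Features = f;
  }

  // The specializing class from the text: it forwards Name and Features
  // and fixes the itineraries argument to NoItineraries.
  class ProcNoItin<string Name, list<SubtargetFeature> Features>
    : Processor<Name, NoItineraries, Features>;

  // A hypothetical feature and an anonymous processor record using it.
  def FeatureFPU : SubtargetFeature<"fpu">;
  def : ProcNoItin<"generic", [FeatureFPU]>;

Because TableGen tracks superclasses, the anonymous record is both a
``ProcNoItin`` and a ``Processor``, so a backend looking for all ``Processor``
records should find it.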
  188. **TableGen multiclasses** are groups of abstract records that are instantiated
  189. all at once. Each instantiation can result in multiple TableGen definitions.
  190. If a multiclass inherits from another multiclass, the definitions in the
  191. sub-multiclass become part of the current multiclass, as if they were declared
  192. in the current multiclass.
  193. .. code-block:: llvm
  194. multiclass ro_signed_pats<string T, string Rm, dag Base, dag Offset, dag Extend,
  195. dag address, ValueType sty> {
  196. def : Pat<(i32 (!cast<SDNode>("sextload" # sty) address)),
  197. (!cast<Instruction>("LDRS" # T # "w_" # Rm # "_RegOffset")
  198. Base, Offset, Extend)>;
  199. def : Pat<(i64 (!cast<SDNode>("sextload" # sty) address)),
  200. (!cast<Instruction>("LDRS" # T # "x_" # Rm # "_RegOffset")
  201. Base, Offset, Extend)>;
  202. }
  203. defm : ro_signed_pats<"B", Rm, Base, Offset, Extend,
  204. !foreach(decls.pattern, address,
  205. !subst(SHIFT, imm_eq0, decls.pattern)),
  206. i8>;
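The multiclass example above is an excerpt that depends on definitions from its
target. A smaller, self-contained sketch of the same mechanism follows; all
names in it (``Inst``, ``RegAndImm``, ``RegImmAndMem`` and the instruction
names) are hypothetical.

.. code-block:: llvm

  // Hypothetical record class shared by all generated definitions.
  class Inst<string asm, bit isImm> {
    string AsmString = asm;
    bit TakesImmediate = isImm;
  }

  // Each instantiation creates two records, suffixed "rr" and "ri".
  multiclass RegAndImm<string mnemonic> {
    def rr : Inst<mnemonic # " $dst, $src", 0>;
    def ri : Inst<mnemonic # " $dst, $imm", 1>;
  }

  // A multiclass that inherits from RegAndImm and adds a third record.
  multiclass RegImmAndMem<string mnemonic> : RegAndImm<mnemonic> {
    def rm : Inst<mnemonic # " $dst, [$addr]", 0>;
  }

  defm ADD : RegAndImm<"add">;      // defines ADDrr and ADDri
  defm SUB : RegImmAndMem<"sub">;   // defines SUBrr, SUBri and SUBrm

The name given to ``defm`` is prepended to each ``def`` name inside the
multiclass, which is how one line produces several related records.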
  207. See the :doc:`TableGen Language Introduction <LangIntro>` for more generic
  208. information on the usage of the language, and the
  209. :doc:`TableGen Language Reference <LangRef>` for more in-depth description
  210. of the formal language specification.
  211. .. _backend:
  212. .. _backends:
  213. TableGen backends
  214. =================
  215. TableGen files have no real meaning without a back-end. The default operation
  216. of running ``llvm-tblgen`` is to print the information in a textual format, but
  217. that's only useful for debugging the TableGen files themselves. The real power
  218. of TableGen is that it parses the source files into an internal
  219. representation from which a back-end can generate anything you want.
  220. TableGen is currently used to create huge include files with tables that you
  221. can either include directly (if the output is in the language you're coding in)
  222. or use in pre-processing via macros surrounding the include of the file.
  223. Direct output can be used if the back-end already prints a table in C format
  224. or if the output is just a list of strings (for error and warning messages).
  225. Pre-processed output should be used if the same information needs to be used
  226. in different contexts (like Instruction names), so your back-end should print
  227. a meta-information list that can be shaped into different compile-time formats.
  228. See the `TableGen BackEnds <BackEnds.html>`_ for more information.
  229. TableGen Deficiencies
  230. =====================
  231. Despite being very generic, TableGen has some deficiencies that have been
  232. pointed out numerous times. The common theme is that, while TableGen allows
  233. you to build Domain-Specific-Languages, the final languages that you create
  234. lack the power of other DSLs, which in turn considerably increases the size
  235. and complexity of TableGen files.
  236. At the same time, TableGen allows you to give the basic concepts virtually any
  237. meaning via custom-made back-ends, which can distort the original design and
  238. make it very hard for newcomers to understand an unfamiliar TableGen
  239. file.
  240. There are some in favour of extending the semantics even more, but making sure
  241. back-ends adhere to strict rules. Others are suggesting we should move to fewer,
  242. more powerful DSLs designed with specific purposes, or even re-using existing
  243. DSLs.
  244. Either way, this is a discussion that will likely span across several years,
  245. if not decades. You can read more in the `TableGen Deficiencies <Deficiencies.html>`_
  246. document.