123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388 |
- ===========================
- TableGen Language Reference
- ===========================
- .. contents::
- :local:
- .. warning::
- This document is extremely rough. If you find something lacking, please
- fix it, file a documentation bug, or ask about it on llvm-dev.
- Introduction
- ============
- This document is meant to be a normative spec about the TableGen language
- in and of itself (i.e. how to understand a given construct in terms of how
- it affects the final set of records represented by the TableGen file). If
- you are unsure if this document is really what you are looking for, please
- read the :doc:`introduction to TableGen <index>` first.
- Notation
- ========
- The lexical and syntax notation used here is intended to imitate
- `Python's`_. In particular, for lexical definitions, the productions
- operate at the character level and there is no implied whitespace between
- elements. The syntax definitions operate at the token level, so there is
- implied whitespace between tokens.
- .. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation
- Lexical Analysis
- ================
- TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
- comments.
- The following is a listing of the basic punctuation tokens::
- - + [ ] { } ( ) < > : ; . = ? #
- Numeric literals take one of the following forms:
- .. TableGen actually will lex some pretty strange sequences an interpret
- them as numbers. What is shown here is an attempt to approximate what it
- "should" accept.
- .. productionlist::
- TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
- DecimalInteger: ["+" | "-"] ("0"..."9")+
- HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
- BinInteger: "0b" ("0" | "1")+
- One aspect to note is that the :token:`DecimalInteger` token *includes* the
- ``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
- most languages do.
- Also note that :token:`BinInteger` creates a value of type ``bits<n>``
- (where ``n`` is the number of bits). This will implicitly convert to
- integers when needed.
- TableGen has identifier-like tokens:
- .. productionlist::
- ualpha: "a"..."z" | "A"..."Z" | "_"
- TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
- TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*
- Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
- begin with a number. In case of ambiguity, a token will be interpreted as a
- numeric literal rather than an identifier.
- TableGen also has two string-like literals:
- .. productionlist::
- TokString: '"' <non-'"' characters and C-like escapes> '"'
- TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"
- :token:`TokCodeFragment` is essentially a multiline string literal
- delimited by ``[{`` and ``}]``.
- .. note::
- The current implementation accepts the following C-like escapes::
- \\ \' \" \t \n
- TableGen also has the following keywords::
- bit bits class code dag
- def foreach defm field in
- int let list multiclass string
- TableGen also has "bang operators" which have a
- wide variety of meanings:
- .. productionlist::
- BangOperator: one of
- :!eq !if !head !tail !con
- :!add !shl !sra !srl !and
- :!cast !empty !subst !foreach !listconcat !strconcat
- Syntax
- ======
- TableGen has an ``include`` mechanism. It does not play a role in the
- syntax per se, since it is lexically replaced with the contents of the
- included file.
- .. productionlist::
- IncludeDirective: "include" `TokString`
- TableGen's top-level production consists of "objects".
- .. productionlist::
- TableGenFile: `Object`*
- Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`
- ``class``\es
- ------------
- .. productionlist::
- Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`
- A ``class`` declaration creates a record which other records can inherit
- from. A class can be parametrized by a list of "template arguments", whose
- values can be used in the class body.
- A given class can only be defined once. A ``class`` declaration is
- considered to define the class if any of the following is true:
- .. break ObjectBody into its consituents so that they are present here?
- #. The :token:`TemplateArgList` is present.
- #. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
- #. The :token:`BaseClassList` in the :token:`ObjectBody` is present.
- You can declare an empty class by giving and empty :token:`TemplateArgList`
- and an empty :token:`ObjectBody`. This can serve as a restricted form of
- forward declaration: note that records deriving from the forward-declared
- class will inherit no fields from it since the record expansion is done
- when the record is parsed.
- .. productionlist::
- TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
- Declarations
- ------------
- .. Omitting mention of arcane "field" prefix to discourage its use.
- The declaration syntax is pretty much what you would expect as a C++
- programmer.
- .. productionlist::
- Declaration: `Type` `TokIdentifier` ["=" `Value`]
- It assigns the value to the identifer.
- Types
- -----
- .. productionlist::
- Type: "string" | "code" | "bit" | "int" | "dag"
- :| "bits" "<" `TokInteger` ">"
- :| "list" "<" `Type` ">"
- :| `ClassID`
- ClassID: `TokIdentifier`
- Both ``string`` and ``code`` correspond to the string type; the difference
- is purely to indicate programmer intention.
- The :token:`ClassID` must identify a class that has been previously
- declared or defined.
- Values
- ------
- .. productionlist::
- Value: `SimpleValue` `ValueSuffix`*
- ValueSuffix: "{" `RangeList` "}"
- :| "[" `RangeList` "]"
- :| "." `TokIdentifier`
- RangeList: `RangePiece` ("," `RangePiece`)*
- RangePiece: `TokInteger`
- :| `TokInteger` "-" `TokInteger`
- :| `TokInteger` `TokInteger`
- The peculiar last form of :token:`RangePiece` is due to the fact that the
- "``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
- two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
- instead of "1", "-", and "5".
- The :token:`RangeList` can be thought of as specifying "list slice" in some
- contexts.
- :token:`SimpleValue` has a number of forms:
- .. productionlist::
- SimpleValue: `TokIdentifier`
- The value will be the variable referenced by the identifier. It can be one
- of:
- .. The code for this is exceptionally abstruse. These examples are a
- best-effort attempt.
- * name of a ``def``, such as the use of ``Bar`` in::
- def Bar : SomeClass {
- int X = 5;
- }
- def Foo {
- SomeClass Baz = Bar;
- }
- * value local to a ``def``, such as the use of ``Bar`` in::
- def Foo {
- int Bar = 5;
- int Baz = Bar;
- }
- * a template arg of a ``class``, such as the use of ``Bar`` in::
- class Foo<int Bar> {
- int Baz = Bar;
- }
- * value local to a ``multiclass``, such as the use of ``Bar`` in::
- multiclass Foo {
- int Bar = 5;
- int Baz = Bar;
- }
- * a template arg to a ``multiclass``, such as the use of ``Bar`` in::
- multiclass Foo<int Bar> {
- int Baz = Bar;
- }
- .. productionlist::
- SimpleValue: `TokInteger`
- This represents the numeric value of the integer.
- .. productionlist::
- SimpleValue: `TokString`+
- Multiple adjacent string literals are concatenated like in C/C++. The value
- is the concatenation of the strings.
- .. productionlist::
- SimpleValue: `TokCodeFragment`
- The value is the string value of the code fragment.
- .. productionlist::
- SimpleValue: "?"
- ``?`` represents an "unset" initializer.
- .. productionlist::
- SimpleValue: "{" `ValueList` "}"
- ValueList: [`ValueListNE`]
- ValueListNE: `Value` ("," `Value`)*
- This represents a sequence of bits, as would be used to initialize a
- ``bits<n>`` field (where ``n`` is the number of bits).
- .. productionlist::
- SimpleValue: `ClassID` "<" `ValueListNE` ">"
- This generates a new anonymous record definition (as would be created by an
- unnamed ``def`` inheriting from the given class with the given template
- arguments) and the value is the value of that record definition.
- .. productionlist::
- SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]
- A list initializer. The optional :token:`Type` can be used to indicate a
- specific element type, otherwise the element type will be deduced from the
- given values.
- .. The initial `DagArg` of the dag must start with an identifier or
- !cast, but this is more of an implementation detail and so for now just
- leave it out.
- .. productionlist::
- SimpleValue: "(" `DagArg` `DagArgList` ")"
- DagArgList: `DagArg` ("," `DagArg`)*
- DagArg: `Value` [":" `TokVarName`] | `TokVarName`
- The initial :token:`DagArg` is called the "operator" of the dag.
- .. productionlist::
- SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
- Bodies
- ------
- .. productionlist::
- ObjectBody: `BaseClassList` `Body`
- BaseClassList: [":" `BaseClassListNE`]
- BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
- SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
- DefmID: `TokIdentifier`
- The version with the :token:`MultiClassID` is only valid in the
- :token:`BaseClassList` of a ``defm``.
- The :token:`MultiClassID` should be the name of a ``multiclass``.
- .. put this somewhere else
- It is after parsing the base class list that the "let stack" is applied.
- .. productionlist::
- Body: ";" | "{" BodyList "}"
- BodyList: BodyItem*
- BodyItem: `Declaration` ";"
- :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"
- The ``let`` form allows overriding the value of an inherited field.
- ``def``
- -------
- .. TODO::
- There can be pastes in the names here, like ``#NAME#``. Look into that
- and document it (it boils down to ParseIDValue with IDParseMode ==
- ParseNameMode). ParseObjectName calls into the general ParseValue, with
- the only different from "arbitrary expression parsing" being IDParseMode
- == Mode.
- .. productionlist::
- Def: "def" `TokIdentifier` `ObjectBody`
- Defines a record whose name is given by the :token:`TokIdentifier`. The
- fields of the record are inherited from the base classes and defined in the
- body.
- Special handling occurs if this ``def`` appears inside a ``multiclass`` or
- a ``foreach``.
- ``defm``
- --------
- .. productionlist::
- Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"
- Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
- precede any ``class``'s that appear.
- ``foreach``
- -----------
- .. productionlist::
- Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
- :| "foreach" `Declaration` "in" `Object`
- The value assigned to the variable in the declaration is iterated over and
- the object or object list is reevaluated with the variable set at each
- iterated value.
- Top-Level ``let``
- -----------------
- .. productionlist::
- Let: "let" `LetList` "in" "{" `Object`* "}"
- :| "let" `LetList` "in" `Object`
- LetList: `LetItem` ("," `LetItem`)*
- LetItem: `TokIdentifier` [`RangeList`] "=" `Value`
- This is effectively equivalent to ``let`` inside the body of a record
- except that it applies to multiple records at a time. The bindings are
- applied at the end of parsing the base classes of a record.
- ``multiclass``
- --------------
- .. productionlist::
- MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
- : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
- BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
- MultiClassID: `TokIdentifier`
- MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
|