=============================
Introduction to the Clang AST
=============================

NOTE: this document applies to the original Clang project, not the DirectX
Compiler. It's made available for informational purposes only.

This document gives a gentle introduction to the mysteries of the Clang
AST. It is targeted at developers who either want to contribute to
Clang, or use tools that work based on Clang's AST, like the AST
matchers.

.. raw:: html

  <center><iframe width="560" height="315" src="http://www.youtube.com/embed/VqCkCDFLSsc?vq=hd720" frameborder="0" allowfullscreen></iframe></center>

`Slides <http://llvm.org/devmtg/2013-04/klimek-slides.pdf>`_

Introduction
============

Clang's AST is different from ASTs produced by some other compilers in
that it closely resembles both the written C++ code and the C++
standard. For example, parenthesized expressions and compile-time
constants are available in an unreduced form in the AST. This makes
Clang's AST a good fit for refactoring tools.

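To make "unreduced form" concrete, here is a minimal sketch of a toy expression tree that keeps a parenthesis node and an unfolded constant division, so the original spelling can still be reprinted. The ``toy`` namespace and every class and function in it are hypothetical names invented for this illustration; they are not Clang classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace toy {  // hypothetical model, not Clang's classes

struct Expr {
  virtual ~Expr() = default;
  virtual std::string print() const = 0;  // spelling as written in the source
  virtual int eval() const = 0;           // value after constant folding
};

struct IntLit : Expr {
  int value;
  explicit IntLit(int v) : value(v) {}
  std::string print() const override { return std::to_string(value); }
  int eval() const override { return value; }
};

// The parentheses are a node of their own instead of being dropped.
struct Paren : Expr {
  std::unique_ptr<Expr> inner;
  explicit Paren(std::unique_ptr<Expr> e) : inner(std::move(e)) {}
  std::string print() const override { return "(" + inner->print() + ")"; }
  int eval() const override { return inner->eval(); }
};

// The division stays in the tree even though both operands are constants.
struct Div : Expr {
  std::unique_ptr<Expr> lhs, rhs;
  Div(std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
      : lhs(std::move(l)), rhs(std::move(r)) {}
  std::string print() const override {
    return lhs->print() + " / " + rhs->print();
  }
  int eval() const override { return lhs->eval() / rhs->eval(); }
};

// Builds the tree for the source text "(84 / 42)".
inline std::unique_ptr<Expr> sample() {
  return std::make_unique<Paren>(std::make_unique<Div>(
      std::make_unique<IntLit>(84), std::make_unique<IntLit>(42)));
}

inline void example() {
  auto e = sample();
  assert(e->print() == "(84 / 42)");  // the written form is recoverable
  assert(e->eval() == 2);             // the folded value is still computable
}

}  // namespace toy
```

A tree that stored only the folded value ``2`` could not reproduce the text the user wrote, which is what a refactoring tool needs.
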
Documentation for all Clang AST nodes is available via the generated
`Doxygen <http://clang.llvm.org/doxygen>`_. The online Doxygen
documentation is also indexed by search engines, so a search for clang
and the AST node's class name usually turns up the Doxygen page of the
class you're looking for (for example, search for: clang ParenExpr).

Examining the AST
=================

A good way to familiarize yourself with the Clang AST is to actually look
at it on some simple example code. Clang has a built-in AST-dump mode,
which can be enabled with the flag ``-ast-dump``.

Let's look at a simple example AST:

::

    $ cat test.cc
    int f(int x) {
      int result = (x / 42);
      return result;
    }

    # Clang by default is a frontend for many tools; -Xclang is used to pass
    # options directly to the C++ frontend.
    $ clang -Xclang -ast-dump -fsyntax-only test.cc
    TranslationUnitDecl 0x5aea0d0 <<invalid sloc>>
    ... cutting out internal declarations of clang ...
    `-FunctionDecl 0x5aeab50 <test.cc:1:1, line:4:1> f 'int (int)'
      |-ParmVarDecl 0x5aeaa90 <line:1:7, col:11> x 'int'
      `-CompoundStmt 0x5aead88 <col:14, line:4:1>
        |-DeclStmt 0x5aead10 <line:2:3, col:24>
        | `-VarDecl 0x5aeac10 <col:3, col:23> result 'int'
        |   `-ParenExpr 0x5aeacf0 <col:16, col:23> 'int'
        |     `-BinaryOperator 0x5aeacc8 <col:17, col:21> 'int' '/'
        |       |-ImplicitCastExpr 0x5aeacb0 <col:17> 'int' <LValueToRValue>
        |       | `-DeclRefExpr 0x5aeac68 <col:17> 'int' lvalue ParmVar 0x5aeaa90 'x' 'int'
        |       `-IntegerLiteral 0x5aeac90 <col:21> 'int' 42
        `-ReturnStmt 0x5aead68 <line:3:3, col:10>
          `-ImplicitCastExpr 0x5aead50 <col:10> 'int' <LValueToRValue>
            `-DeclRefExpr 0x5aead28 <col:10> 'int' lvalue Var 0x5aeac10 'result' 'int'

The top-level declaration in
a translation unit is always the `translation unit
declaration <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_.
In this example, our first user-written declaration is the `function
declaration <http://clang.llvm.org/doxygen/classclang_1_1FunctionDecl.html>`_
of "``f``". The body of "``f``" is a `compound
statement <http://clang.llvm.org/doxygen/classclang_1_1CompoundStmt.html>`_,
whose child nodes are a `declaration
statement <http://clang.llvm.org/doxygen/classclang_1_1DeclStmt.html>`_
that declares our result variable, and the `return
statement <http://clang.llvm.org/doxygen/classclang_1_1ReturnStmt.html>`_.

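The tree drawing in the dump follows a simple rule: a child that has later siblings is introduced with ``|-``, the last child with `````-``, and each level of nesting adds a two-character prefix. The following is a minimal sketch that reproduces this layout for a toy tree. ``Node``, ``dumpTree``, ``dumpChildren`` and ``sampleBody`` are hypothetical names made up for this illustration; Clang's own dumper is a different implementation and prints far more per node.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy node: just a label and an ordered list of children.
struct Node {
  std::string label;
  std::vector<Node> children;
};

// Appends one line per descendant of `node` to `out`.
// `prefix` is the indentation inherited from the ancestors.
inline void dumpChildren(const Node &node, const std::string &prefix,
                         std::string &out) {
  for (std::size_t i = 0; i < node.children.size(); ++i) {
    const bool last = (i + 1 == node.children.size());
    out += prefix + (last ? "`-" : "|-") + node.children[i].label + "\n";
    // Below a non-last child the vertical bar continues; below the last
    // child the column is left blank.
    dumpChildren(node.children[i], prefix + (last ? "  " : "| "), out);
  }
}

inline std::string dumpTree(const Node &root) {
  std::string out = root.label + "\n";
  dumpChildren(root, "", out);
  return out;
}

// A cut-down version of the body of f() from the dump.
inline Node sampleBody() {
  return Node{"CompoundStmt",
              {Node{"DeclStmt", {Node{"VarDecl", {}}}},
               Node{"ReturnStmt", {}}}};
}

inline void example() {
  assert(dumpTree(sampleBody()) ==
         "CompoundStmt\n"
         "|-DeclStmt\n"
         "| `-VarDecl\n"
         "`-ReturnStmt\n");
}
```

Reading the real dump with this rule in mind makes it easy to see which node is the parent of which: follow the column of the ``|-`` or `````-`` marker upwards to the first line that starts two columns to the left.
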
AST Context
===========

All information about the AST for a translation unit is bundled up in
the class
`ASTContext <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html>`_.
It allows traversal of the whole translation unit starting from
`getTranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#abd909fb01ef10cfd0244832a67b1dd64>`_,
and gives access to Clang's `table of
identifiers <http://clang.llvm.org/doxygen/classclang_1_1ASTContext.html#a4f95adb9958e22fbe55212ae6482feb4>`_
for the parsed translation unit.

AST Nodes
=========

Clang's AST nodes are modeled on a class hierarchy that does not have a
common ancestor. Instead, there are multiple larger hierarchies for
basic node types like
`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_ and
`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_. Many
important AST nodes derive from
`Type <http://clang.llvm.org/doxygen/classclang_1_1Type.html>`_,
`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_,
`DeclContext <http://clang.llvm.org/doxygen/classclang_1_1DeclContext.html>`_
or `Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_, with
some classes deriving from both ``Decl`` and ``DeclContext``.

There are also many nodes in the AST that are not part of a
larger hierarchy, and are only reachable from specific other nodes, like
`CXXBaseSpecifier <http://clang.llvm.org/doxygen/classclang_1_1CXXBaseSpecifier.html>`_.

Thus, to traverse the full AST, one starts from the
`TranslationUnitDecl <http://clang.llvm.org/doxygen/classclang_1_1TranslationUnitDecl.html>`_
and then recursively traverses everything that can be reached from that
node. Which nodes can be reached has to be encoded for each specific
node type. This algorithm is encoded in the
`RecursiveASTVisitor <http://clang.llvm.org/doxygen/classclang_1_1RecursiveASTVisitor.html>`_.
See the `RecursiveASTVisitor
tutorial <http://clang.llvm.org/docs/RAVFrontendAction.html>`_.

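The recursive traversal can be sketched without any Clang headers. The toy below borrows only the general shape of a CRTP visitor: a base class template drives the recursion and calls a hook on the derived class, and the hook returns ``false`` to stop early. Everything in the ``toy`` namespace is a hypothetical name for this illustration. As a simplification, every toy node stores its children in one generic list, whereas in the real AST the reachable children differ per node type.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace toy {  // hypothetical model, not Clang's classes

struct Node {
  std::string kind;
  std::vector<std::unique_ptr<Node>> children;
};

// CRTP base: owns the recursion, defers per-node work to Derived::Visit.
template <typename Derived>
class RecursiveVisitor {
public:
  // Visits `node` first, then its children in order (pre-order).
  // Returns false as soon as any Visit call asks to stop.
  bool Traverse(const Node &node) {
    if (!derived().Visit(node))
      return false;
    for (const auto &child : node.children)
      if (!Traverse(*child))
        return false;
    return true;
  }

  // Default hook: do nothing and keep going.
  bool Visit(const Node &) { return true; }

private:
  Derived &derived() { return static_cast<Derived &>(*this); }
};

// Example client: counts the nodes of one kind.
class KindCounter : public RecursiveVisitor<KindCounter> {
public:
  explicit KindCounter(std::string kind) : wanted_(std::move(kind)) {}
  bool Visit(const Node &node) {
    if (node.kind == wanted_)
      ++count;
    return true;
  }
  int count = 0;

private:
  std::string wanted_;
};

inline std::unique_ptr<Node> make(std::string kind) {
  auto node = std::make_unique<Node>();
  node->kind = std::move(kind);
  return node;
}

// Builds a small tree with two DeclRefExpr nodes and counts them.
inline int exampleCount() {
  auto root = make("CompoundStmt");
  auto decl = make("DeclStmt");
  decl->children.push_back(make("DeclRefExpr"));
  auto ret = make("ReturnStmt");
  ret->children.push_back(make("DeclRefExpr"));
  root->children.push_back(std::move(decl));
  root->children.push_back(std::move(ret));

  KindCounter counter("DeclRefExpr");
  const bool finished = counter.Traverse(*root);
  assert(finished);  // KindCounter never aborts the walk
  return counter.count;
}

}  // namespace toy
```

The client only writes the ``Visit`` hook; the knowledge of how to reach child nodes lives in one place, which is the point of putting the algorithm into a reusable visitor.
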
The two most basic nodes in the Clang AST are statements
(`Stmt <http://clang.llvm.org/doxygen/classclang_1_1Stmt.html>`_) and
declarations
(`Decl <http://clang.llvm.org/doxygen/classclang_1_1Decl.html>`_). Note
that expressions
(`Expr <http://clang.llvm.org/doxygen/classclang_1_1Expr.html>`_) are
also statements in Clang's AST.

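The relationships described in this section can be summarized in a few lines of C++. This is a minimal sketch with toy classes that only reuse the names; they are hypothetical stand-ins, not Clang's definitions, and the real classes have intermediate bases and many more members. The sketch also assumes ``dynamic_cast`` for the cross-cast, whereas Clang itself uses LLVM's ``isa``/``dyn_cast`` helpers.

```cpp
#include <cassert>
#include <type_traits>

namespace toy {  // hypothetical model, not Clang's classes

// Independent roots: no class sits above Decl, Stmt and DeclContext.
struct Decl { virtual ~Decl() = default; };
struct Stmt { virtual ~Stmt() = default; };
struct DeclContext { virtual ~DeclContext() = default; };

struct Expr : Stmt {};                       // every expression is a statement
struct VarDecl : Decl {};                    // a declaration only
struct FunctionDecl : Decl, DeclContext {};  // a declaration that also
                                             // contains declarations

// Cross-cast from the Decl hierarchy into DeclContext.
inline bool isDeclContext(const Decl &d) {
  return dynamic_cast<const DeclContext *>(&d) != nullptr;
}

inline void example() {
  FunctionDecl fn;
  VarDecl var;
  assert(isDeclContext(fn));
  assert(!isDeclContext(var));
}

}  // namespace toy

static_assert(std::is_base_of<toy::Stmt, toy::Expr>::value,
              "an Expr is a Stmt");
static_assert(!std::is_base_of<toy::Decl, toy::Stmt>::value &&
                  !std::is_base_of<toy::Stmt, toy::Decl>::value,
              "Decl and Stmt are unrelated hierarchies");
static_assert(std::is_base_of<toy::Decl, toy::FunctionDecl>::value &&
                  std::is_base_of<toy::DeclContext, toy::FunctionDecl>::value,
              "FunctionDecl derives from both Decl and DeclContext");
```

Because there is no common root, code that walks the tree needs separate entry points for declarations and statements, while anything that accepts a statement also accepts an expression.
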