==========================
Pretokenized Headers (PTH)
==========================

NOTE: this document applies to the original Clang project, not the DirectX
Compiler. It's made available for informational purposes only.

This document first describes the low-level interface for using PTH and
then briefly elaborates on its design and implementation. If you are
interested in the end-user view, please see the :ref:`User's Manual
<usersmanual-precompiled-headers>`.

Using Pretokenized Headers with ``clang`` (Low-level Interface)
===============================================================

The Clang compiler frontend, ``clang -cc1``, supports three command-line
options for generating and using PTH files.

To generate PTH files using ``clang -cc1``, use the option ``-emit-pth``:

.. code-block:: console

  $ clang -cc1 test.h -emit-pth -o test.h.pth

This option is transparently used by ``clang`` when generating PTH
files. Similarly, PTH files can be used as prefix headers using the
``-include-pth`` option:

.. code-block:: console

  $ clang -cc1 -include-pth test.h.pth test.c -o test.s

Alternatively, Clang's PTH files can be used as a raw "token-cache" (or
"content" cache) of the source included by the original header file.
This means that the contents of the PTH file are searched as substitutes
for *any* source files that are used by ``clang -cc1`` to process a
source file. This is done by specifying the ``-token-cache`` option:

.. code-block:: console

  $ cat test.h
  #include <stdio.h>
  $ clang -cc1 -emit-pth test.h -o test.h.pth
  $ cat test.c
  #include "test.h"
  $ clang -cc1 test.c -o test -token-cache test.h.pth

In this example the contents of ``stdio.h`` (and the files it includes)
will be retrieved from ``test.h.pth``, as the PTH file is being used in
this case as a raw cache of the contents of ``test.h``. This low-level
interface is used both to implement the high-level PTH interface and to
provide alternative means of using PTH-style caching.
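
The token-cache idea can be illustrated with a small Python sketch. It is
a toy model and not Clang's implementation: the regular-expression lexer,
the ``TokenCache`` class, and its methods are all hypothetical names
invented for this example. The sketch only shows the lookup order, in
which a file is first searched for in the cache and lexed from source only
on a miss.

```python
import re

# Hypothetical, heavily simplified lexer: identifiers, integers, and
# single punctuation characters. Clang's real lexer is far richer.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def lex(text):
    """Split source text into a flat list of token strings."""
    return TOKEN_RE.findall(text)


class TokenCache:
    """Toy stand-in for a PTH file: maps a file path to pre-lexed tokens."""

    def __init__(self):
        self._tokens = {}
        self.hits = 0
        self.misses = 0

    def add(self, path, text):
        # Roughly analogous to generating the cache ahead of time.
        self._tokens[path] = lex(text)

    def tokens_for(self, path, read_source):
        # Use the cached tokens when present; otherwise lex the source.
        cached = self._tokens.get(path)
        if cached is not None:
            self.hits += 1
            return cached
        self.misses += 1
        return lex(read_source(path))


# In-memory "file system" so the example is self-contained.
sources = {
    "stdio.h": "int puts(const char *s);",
    "test.c": "int main(void) { return 0; }",
}

cache = TokenCache()
cache.add("stdio.h", sources["stdio.h"])

header_tokens = cache.tokens_for("stdio.h", sources.__getitem__)  # cache hit
main_tokens = cache.tokens_for("test.c", sources.__getitem__)     # cache miss
```

Here ``stdio.h`` is served from the cache while ``test.c`` is lexed on
demand, mirroring how the cached contents substitute for any matching
source file.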

PTH Design and Implementation
=============================

Unlike GCC's precompiled headers, which cache the full ASTs and
preprocessor state of a header file, Clang's pretokenized header files
mainly cache the raw lexer *tokens* that are needed to segment the
stream of characters in a source file into keywords, identifiers, and
operators. Consequently, PTH mainly speeds up the lexing and
preprocessing of a source file, while parsing and type-checking must be
completely redone every time a PTH file is used.

Basic Design Tradeoffs
----------------------

In the long term there are plans to provide an alternate PCH
implementation for Clang that also caches the work for parsing and type
checking the contents of header files. The current implementation of PCH
in Clang as pretokenized header files was motivated by the following
factors:

**Language independence**
    PTH files work with any language that
    Clang's lexer can handle, including C, Objective-C, and (in the early
    stages) C++. This means development on language features at the
    parsing level or above (which covers almost all of the interesting
    pieces) does not require PTH to be modified.

**Simple design**
    Relatively speaking, PTH has a simple design and
    implementation, making it easy to test. Further, because the
    machinery for PTH resides at the lower levels of the Clang library
    stack, it is fairly straightforward to profile and optimize.

Further, compared to GCC's PCH implementation (which is the dominant
precompiled header file implementation that Clang can be directly
compared against), the PTH design in Clang yields several attractive
features:

**Architecture independence**
    In contrast to GCC's PCH files (and
    those of several other compilers), Clang's PTH files are architecture
    independent, requiring only a single PTH file when building a
    program for multiple architectures.

    For example, on Mac OS X one may wish to compile a "universal binary"
    that runs on PowerPC, 32-bit Intel (i386), and 64-bit Intel
    architectures. In contrast, GCC requires a PCH file for each
    architecture, as the definitions of types in the AST are
    architecture-specific. Since a Clang PTH file essentially represents
    a lexical cache of header files, a single PTH file can be safely used
    when compiling for multiple architectures. This can also reduce
    compile times because only a single PTH file needs to be generated
    during a build instead of several.

**Reduced memory pressure**
    Similar to GCC, Clang reads PTH files
    via the use of memory mapping (i.e., ``mmap``). Clang, however,
    memory maps PTH files as read-only, meaning that multiple invocations
    of ``clang -cc1`` can share the same pages in memory from a
    memory-mapped PTH file. In comparison, GCC memory maps its PCH
    files but then modifies those pages in memory, incurring
    copy-on-write costs. The read-only nature of PTH can greatly reduce
    memory pressure for builds involving multiple cores, thus improving
    overall scalability.

**Fast generation**
    PTH files can be generated in a small fraction
    of the time needed to generate GCC's PCH files. Since PTH/PCH
    generation is a serial operation that typically blocks progress
    during a build, faster generation time leads to improved processor
    utilization with parallel builds on multicore machines.
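
The difference between a read-only mapping and a private, writable
(copy-on-write) mapping can be seen with Python's standard ``mmap``
module. This is only a sketch of the operating-system behaviour discussed
under *Reduced memory pressure*: the file contents are arbitrary bytes
rather than a real PTH or PCH file, and the variable names are chosen for
this example.

```python
import mmap
import os
import tempfile

# Create a small stand-in file (arbitrary bytes, not a real PTH file).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"cached tokens")

with open(path, "rb") as f:
    # Read-only mapping: the pages can never be dirtied, so the OS is free
    # to share them between every process that maps the same file.
    ro = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        ro[0:1] = b"X"
        write_rejected = False
    except TypeError:
        # Python refuses to modify a read-only memory map.
        write_rejected = True
    ro_bytes = ro[:]
    ro.close()

    # Private writable mapping: a write forces the OS to give this process
    # its own copy of the touched page (copy-on-write); the file on disk
    # is left unchanged.
    cow = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    cow[0:1] = b"X"
    cow_bytes = cow[:]
    cow.close()

with open(path, "rb") as f:
    on_disk = f.read()

os.remove(path)
```

The read-only mapping rejects the write outright, while the private
mapping accepts it at the cost of a per-process page copy, which is the
overhead the text attributes to modifying mapped pages.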

Despite these strengths, PTH's simple design suffers some algorithmic
handicaps compared to other PCH strategies such as those used by GCC.
While PTH can greatly speed up the processing time of a header file, the
amount of work required to process a header file is still roughly linear
in the size of the header file. In contrast, the amount of work done by
GCC to process a precompiled header is (theoretically) constant (the
ASTs for the header are literally memory mapped into the compiler). This
means that the compiler only needs to process the pieces of the header
file that are referenced by the source file including the header. While
GCC's particular implementation of PCH gives up some of these
algorithmic strengths through its use of copy-on-write pages, the
approach itself can fundamentally dominate at an algorithmic level,
especially when one considers header files of arbitrary size.

There is also a PCH implementation for Clang based on the lazy
deserialization of ASTs. This approach theoretically has the same
constant-time algorithmic advantages just mentioned but also retains some
of the strengths of PTH such as reduced memory pressure (ideal for
multi-core builds).
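
The algorithmic contrast can be made concrete with a toy cost model. The
sketch below is illustrative only: "work" is simply a count of tokens
touched, the header is synthetic, and the ``LazyDecls`` class is a
hypothetical stand-in for on-demand loading, not a model of GCC's or
Clang's actual PCH formats.

```python
# Synthetic header: 1000 declarations, six tokens each.
header = {
    f"func{i}": ["int", f"func{i}", "(", "void", ")", ";"]
    for i in range(1000)
}

# The including source file only refers to two of the declarations.
referenced = ["func3", "func42"]


def pth_style_work(decls):
    """Replay every cached token: cost grows with the size of the header."""
    work = 0
    for tokens in decls.values():
        work += len(tokens)
    return work


class LazyDecls:
    """Materialize a declaration only when the source file looks it up."""

    def __init__(self, stored):
        self._stored = stored
        self._loaded = {}

    def lookup(self, name):
        if name not in self._loaded:
            self._loaded[name] = self._stored[name]
        return self._loaded[name]

    @property
    def work(self):
        # Cost depends only on what was actually referenced.
        return sum(len(tokens) for tokens in self._loaded.values())


linear_work = pth_style_work(header)

lazy = LazyDecls(header)
for name in referenced:
    lazy.lookup(name)
lazy_work = lazy.work
```

Under this model the token-replay cost scales with the header (6000
tokens here), while the lazy cost depends only on the two referenced
declarations (12 tokens), regardless of how large the header grows.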

Internal PTH Optimizations
--------------------------

While the main optimization employed by PTH is to reduce lexing time of
header files by caching pre-lexed tokens, PTH also employs several other
optimizations to speed up the processing of header files:

- ``stat`` caching: PTH files cache information obtained via calls to
  ``stat`` that ``clang -cc1`` uses to resolve which files are included
  by ``#include`` directives. This greatly reduces the overhead
  involved in context-switching to the kernel to resolve included
  files.

- Fast skipping of ``#ifdef`` ... ``#endif`` chains: PTH files
  record the basic structure of nested preprocessor blocks. When the
  condition of the preprocessor block is false, all of its tokens are
  immediately skipped instead of requiring them to be handled by
  Clang's preprocessor.
  133. Clang's preprocessor.