- ==========================
- Pretokenized Headers (PTH)
- ==========================
- NOTE: this document applies to the original Clang project, not the DirectX
- Compiler. It's made available for informational purposes only.
- This document first describes the low-level interface for using PTH and
- then briefly elaborates on its design and implementation. If you are
- interested in the end-user view, please see the :ref:`User's Manual
- <usersmanual-precompiled-headers>`.
- Using Pretokenized Headers with ``clang`` (Low-level Interface)
- ===============================================================
- The Clang compiler frontend, ``clang -cc1``, supports three command line
- options for generating and using PTH files.
- To generate PTH files using ``clang -cc1``, use the option ``-emit-pth``:
- .. code-block:: console
- $ clang -cc1 test.h -emit-pth -o test.h.pth
- This option is transparently used by ``clang`` when generating PTH
- files. Similarly, PTH files can be used as prefix headers using the
- ``-include-pth`` option:
- .. code-block:: console
- $ clang -cc1 -include-pth test.h.pth test.c -o test.s
- Alternatively, Clang's PTH files can be used as a raw "token-cache" (or
- "content" cache) of the source included by the original header file.
- This means that the contents of the PTH file are searched as substitutes
- for *any* source files that are used by ``clang -cc1`` to process a
- source file. This is done by specifying the ``-token-cache`` option:
- .. code-block:: console
- $ cat test.h
- #include <stdio.h>
- $ clang -cc1 -emit-pth test.h -o test.h.pth
- $ cat test.c
- #include "test.h"
- $ clang -cc1 test.c -o test -token-cache test.h.pth
- In this example the contents of ``stdio.h`` (and the files it includes)
- will be retrieved from ``test.h.pth``, as the PTH file is being used in
- this case as a raw cache of the contents of ``test.h``. This is a
- low-level interface used both to implement the high-level PTH interface
- and to provide an alternative means of using PTH-style caching.
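
The substitution behaviour of ``-token-cache`` can be pictured with a small Python sketch. It is an illustration under stated assumptions, not Clang's implementation: the ``TokenCache`` class, its ``tokens_for`` method, and the ``fake_lex`` placeholder are hypothetical names invented here, and a real PTH file is a binary on-disk structure rather than a dictionary.

```python
from typing import Callable, Dict, List


class TokenCache:
    """Hypothetical stand-in for a PTH file used as a raw token cache."""

    def __init__(self, cached: Dict[str, List[str]]) -> None:
        # Maps a source file path to the tokens recorded for that file.
        self._cached = cached
        self.hits = 0
        self.misses = 0

    def tokens_for(self, path: str, lex: Callable[[str], List[str]]) -> List[str]:
        """Return cached tokens for ``path``, or lex the file on a miss."""
        if path in self._cached:
            self.hits += 1
            return self._cached[path]
        self.misses += 1
        return lex(path)


def fake_lex(path: str) -> List[str]:
    # Assumption: a placeholder for the real lexer, which would read the file.
    return [f"<lexed:{path}>"]


cache = TokenCache({
    "test.h": ["#", "include", "<stdio.h>"],
    "stdio.h": ["int", "printf", "(", ")", ";"],
})
from_cache = cache.tokens_for("stdio.h", fake_lex)  # served from the cache
from_disk = cache.tokens_for("test.c", fake_lex)    # not cached, lexed normally
```

Any file recorded in the cache is served from it, whichever file happens to include it; everything else falls back to ordinary lexing.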
- PTH Design and Implementation
- =============================
- Unlike GCC's precompiled headers, which cache the full ASTs and
- preprocessor state of a header file, Clang's pretokenized header files
- mainly cache the raw lexer *tokens* that are needed to segment the
- stream of characters in a source file into keywords, identifiers, and
- operators. Consequently, PTH mainly serves to speed up the
- lexing and preprocessing of a source file, while parsing and
- type-checking must be completely redone every time a PTH file is used.
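
To make "caching raw lexer tokens" concrete, the following Python sketch lexes a one-line header into (kind, text) pairs, stores them, and reloads them. Everything in it is an assumption made for illustration: the token categories, the tiny ``KEYWORDS`` set, the regular expression, and the use of JSON as a storage format have nothing to do with Clang's actual lexer or the PTH file layout.

```python
import json
import re
from typing import List, Tuple

KEYWORDS = {"int", "return", "if", "else", "while"}  # tiny illustrative subset

# One alternative per token category; whitespace is matched so it can be dropped.
TOKEN_RE = re.compile(r"""
    (?P<identifier>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<operator>[-+*/=<>!]=?|[(){};,])
  | (?P<space>\s+)
""", re.VERBOSE)


def lex(source: str) -> List[Tuple[str, str]]:
    """Segment ``source`` into (kind, text) tokens, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        kind, text = match.lastgroup, match.group()
        if kind != "space":
            if kind == "identifier" and text in KEYWORDS:
                kind = "keyword"
            tokens.append((kind, text))
        pos = match.end()
    return tokens


header = "int add(int a, int b) { return a + b; }"
cached = json.dumps(lex(header))                       # done once, up front
reloaded = [tuple(tok) for tok in json.loads(cached)]  # reused by later runs
```

Only the segmentation step is saved: a consumer of ``reloaded`` still has to parse and type-check the tokens from scratch, which matches the limitation described above.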
- Basic Design Tradeoffs
- ----------------------
- In the long term there are plans to provide an alternate PCH
- implementation for Clang that also caches the work for parsing and type
- checking the contents of header files. The current implementation of PCH
- in Clang as pretokenized header files was motivated by the following
- factors:
- **Language independence**
- PTH files work with any language that
- Clang's lexer can handle, including C, Objective-C, and (in the early
- stages) C++. This means that development of language features at the
- parsing level or above (which covers almost all of the interesting
- pieces) does not require PTH to be modified.
- **Simple design**
- Relatively speaking, PTH has a simple design and
- implementation, making it easy to test. Further, because the
- machinery for PTH resides at the lower levels of the Clang library
- stack, it is fairly straightforward to profile and optimize.
- Further, compared to GCC's PCH implementation (which is the dominant
- precompiled header file implementation that Clang can be directly
- compared against) the PTH design in Clang yields several attractive
- features:
- **Architecture independence**
- In contrast to GCC's PCH files (and
- those of several other compilers), Clang's PTH files are architecture
- independent, requiring only a single PTH file when building a
- program for multiple architectures.
- For example, on Mac OS X one may wish to compile a "universal binary"
- that runs on PowerPC, 32-bit Intel (i386), and 64-bit Intel
- architectures. In contrast, GCC requires a PCH file for each
- architecture, as the definitions of types in the AST are
- architecture-specific. Since a Clang PTH file essentially represents
- a lexical cache of header files, a single PTH file can be safely used
- when compiling for multiple architectures. This can also reduce
- compile times because only a single PTH file needs to be generated
- during a build instead of several.
- **Reduced memory pressure**
- Similar to GCC, Clang reads PTH files
- via the use of memory mapping (i.e., ``mmap``). Clang, however,
- memory maps PTH files as read-only, meaning that multiple invocations
- of ``clang -cc1`` can share the same pages in memory from a
- memory-mapped PTH file. In comparison, GCC memory maps its PCH
- files but then modifies those pages in memory, incurring
- copy-on-write costs. The read-only nature of PTH can greatly reduce
- memory pressure for builds involving multiple cores, thus improving
- overall scalability.
- **Fast generation**
- PTH files can be generated in a small fraction
- of the time needed to generate GCC's PCH files. Since PTH/PCH
- generation is a serial operation that typically blocks progress
- during a build, faster generation time leads to improved processor
- utilization with parallel builds on multicore machines.
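
The read-only mapping described under **Reduced memory pressure** can be illustrated with Python's standard ``mmap`` module. This is a sketch of the operating-system mechanism only, not of Clang's code; the temporary file merely stands in for a PTH file, and its contents are made up.

```python
import mmap
import os
import tempfile

# Assumption: a small temporary file stands in for a generated PTH file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"cached tokens")
    path = tmp.name

with open(path, "rb") as f:
    # ACCESS_READ requests a read-only mapping of the whole file.
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    contents = view[:]
    try:
        view[0:1] = b"X"       # any attempt to modify the mapping is refused
        write_rejected = False
    except TypeError:
        write_rejected = True
    view.close()

os.remove(path)
```

Because a read-only mapping is never written, the kernel has no reason to create private copies of its pages, which is what allows several processes mapping the same file to share them.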
- Despite these strengths, PTH's simple design suffers some algorithmic
- handicaps compared to other PCH strategies such as those used by GCC.
- While PTH can greatly speed up the processing time of a header file, the
- amount of work required to process a header file is still roughly linear
- in the size of the header file. In contrast, the amount of work done by
- GCC to process a precompiled header is (theoretically) constant (the
- ASTs for the header are literally memory mapped into the compiler). This
- means that only the pieces of the header file that are referenced by the
- source file including the header need to be processed by the compiler
- during actual compilation. While GCC's particular implementation
- of PCH loses some of these algorithmic advantages through its use of
- copy-on-write pages, the approach itself can fundamentally dominate at
- an algorithmic level, especially when one considers header files of
- arbitrary size.
- There is also a PCH implementation for Clang based on the lazy
- deserialization of ASTs. This approach theoretically has the same
- constant-time algorithmic advantages just mentioned but also retains some
- of the strengths of PTH such as reduced memory pressure (ideal for
- multi-core builds).
- Internal PTH Optimizations
- --------------------------
- While the main optimization employed by PTH is to reduce lexing time of
- header files by caching pre-lexed tokens, PTH also employs several other
- optimizations to speed up the processing of header files:
- - ``stat`` caching: PTH files cache information obtained via calls to
- ``stat`` that ``clang -cc1`` uses to resolve which files are included
- by ``#include`` directives. This greatly reduces the overhead
- involved in context-switching to the kernel to resolve included
- files.
- - Fast skipping of ``#ifdef`` ... ``#endif`` chains: PTH files
- record the basic structure of nested preprocessor blocks. When the
- condition of the preprocessor block is false, all of its tokens are
- immediately skipped instead of requiring them to be handled by
- Clang's preprocessor.
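
Both optimizations can be sketched in a few lines of Python. The sketch is hypothetical throughout: ``StatCache``, ``build_skip_table``, and ``preprocess`` are names invented for this example, tokens are plain strings, and only ``#ifdef`` and ``#endif`` are modelled. It shows the general idea of remembering ``stat`` results and of jumping over a false conditional block in one step, not how PTH stores this information.

```python
import os
from typing import Dict, List, Set


class StatCache:
    """Hypothetical cache of file-existence checks, keyed by path."""

    def __init__(self) -> None:
        self._results: Dict[str, bool] = {}
        self.syscalls = 0  # how many times os.stat was really called

    def exists(self, path: str) -> bool:
        if path not in self._results:
            self.syscalls += 1
            try:
                os.stat(path)
                self._results[path] = True
            except OSError:
                self._results[path] = False
        return self._results[path]


def build_skip_table(tokens: List[str]) -> Dict[int, int]:
    """Map the index of each '#ifdef' to the index of its matching '#endif'."""
    table: Dict[int, int] = {}
    stack: List[int] = []
    for i, tok in enumerate(tokens):
        if tok == "#ifdef":
            stack.append(i)
        elif tok == "#endif":
            table[stack.pop()] = i
    return table


def preprocess(tokens: List[str], skip: Dict[int, int], defined: Set[str]) -> List[str]:
    """Return the tokens that survive, jumping over false '#ifdef' blocks."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "#ifdef":
            if tokens[i + 1] in defined:
                i += 2             # true: step past '#ifdef NAME', keep the body
            else:
                i = skip[i] + 1    # false: jump straight past the '#endif'
        elif tok == "#endif":
            i += 1
        else:
            out.append(tok)
            i += 1
    return out


tokens = ["a", "#ifdef", "X", "b", "#ifdef", "Y", "c", "#endif", "#endif", "d"]
skip = build_skip_table(tokens)
with_x = preprocess(tokens, skip, {"X"})
without_x = preprocess(tokens, skip, set())

stats = StatCache()
first = stats.exists(os.getcwd())
second = stats.exists(os.getcwd())  # answered from the cache, no second stat
```

In the false case the tokens inside the block are never visited at all, and the second ``exists`` call never reaches the kernel.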