=================
DataFlowSanitizer
=================

.. toctree::
   :hidden:

   DataFlowSanitizerDesign

.. contents::
   :local:

Introduction
============

NOTE: this document applies to the original Clang project, not the DirectX
Compiler. It's made available for informational purposes only.

DataFlowSanitizer is a generalised dynamic data flow analysis.

Unlike other Sanitizer tools, this tool is not designed to detect a
specific class of bugs on its own.  Instead, it provides a generic
dynamic data flow analysis framework to be used by clients to help
detect application-specific issues within their own code.

Usage
=====

With no program changes, applying DataFlowSanitizer to a program
will not alter its behavior.  To use DataFlowSanitizer, the program
calls API functions to apply tags to data, which causes that data to be
tracked, and to check the tag of a specific data item.  DataFlowSanitizer
manages the propagation of tags through the program according to its data
flow.

The APIs are defined in the header file ``sanitizer/dfsan_interface.h``.
For further information about each function, please refer to the header
file.

ABI List
--------

DataFlowSanitizer uses a list of functions known as an ABI list to decide
whether a call to a specific function should use the operating system's native
ABI or whether it should use a variant of this ABI that also propagates labels
through function parameters and return values.  The ABI list file also controls
how labels are propagated in the former case.  DataFlowSanitizer comes with a
default ABI list which is intended to eventually cover the glibc library on
Linux, but it may become necessary for users to extend the ABI list in cases
where a particular library or function cannot be instrumented (e.g. because
it is implemented in assembly or another language which DataFlowSanitizer does
not support) or a function is called from a library or function which cannot
be instrumented.

DataFlowSanitizer's ABI list file is a :doc:`SanitizerSpecialCaseList`.
The pass treats every function in the ``uninstrumented`` category in the
ABI list file as conforming to the native ABI.  Unless the ABI list contains
additional categories for those functions, a call to one of those functions
will produce a warning message, as the labelling behavior of the function
is unknown.  The other supported categories are ``discard``, ``functional``
and ``custom``.

* ``discard`` -- To the extent that this function writes to (user-accessible)
  memory, it also updates labels in shadow memory (this condition is trivially
  satisfied for functions which do not write to user-accessible memory).  Its
  return value is unlabelled.
* ``functional`` -- Like ``discard``, except that the label of its return value
  is the union of the labels of its arguments.
* ``custom`` -- Instead of calling the function, a custom wrapper ``__dfsw_F``
  is called, where ``F`` is the name of the function.  This function may wrap
  the original function or provide its own implementation.  This category is
  generally used for uninstrumentable functions which write to user-accessible
  memory or which have more complex label propagation behavior.  The signature
  of ``__dfsw_F`` is based on that of ``F``, with one argument of type
  ``dfsan_label`` appended to the argument list for each argument of ``F``.
  If ``F`` has a non-void return type, a final argument of type
  ``dfsan_label *`` is appended, through which the custom function can store
  the label for the return value.  For example:

  .. code-block:: c++

    void f(int x);
    void __dfsw_f(int x, dfsan_label x_label);

    void *memcpy(void *dest, const void *src, size_t n);
    void *__dfsw_memcpy(void *dest, const void *src, size_t n,
                        dfsan_label dest_label, dfsan_label src_label,
                        dfsan_label n_label, dfsan_label *ret_label);

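The three categories differ mainly in how the label of the return value is
chosen.  The following is a toy model of that difference, not the real runtime.
It assumes a stand-in type ``toy_label`` in which each base label is one bit, so
that a union is a bitwise OR.  The real ``dfsan_label`` encoding is not
described here and may differ.  ``add_one`` and ``toy_dfsw_add_one`` are
hypothetical names that only mirror the shape of a ``__dfsw_F`` wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-in for dfsan_label: one bit per base label, so a union is a
// bitwise OR. This is an assumption of the sketch, not the real encoding.
using toy_label = std::uint16_t;

// `discard`: the return value is unlabelled whatever the arguments carry.
toy_label discard_return_label(const toy_label *, std::size_t) { return 0; }

// `functional`: the return label is the union of the argument labels.
toy_label functional_return_label(const toy_label *arg_labels, std::size_t n) {
  toy_label result = 0;
  for (std::size_t i = 0; i < n; ++i) result |= arg_labels[i];
  return result;
}

// Hypothetical uninstrumented function and a wrapper shaped like __dfsw_F:
// the original arguments, then one label per argument, then a pointer
// through which the wrapper reports the label of the return value.
int add_one(int x) { return x + 1; }

int toy_dfsw_add_one(int x, toy_label x_label, toy_label *ret_label) {
  *ret_label = x_label;  // The result depends only on x.
  return add_one(x);
}

// Exercises the three behaviors with two argument labels, 0x1 and 0x4.
void demo_categories() {
  const toy_label labels[] = {0x1, 0x4};
  assert(discard_return_label(labels, 2) == 0);
  assert(functional_return_label(labels, 2) == 0x5);

  toy_label ret = 0;
  assert(toy_dfsw_add_one(41, 0x2, &ret) == 42);
  assert(ret == 0x2);
}
```

A real wrapper would use ``dfsan_label`` from ``sanitizer/dfsan_interface.h``
and would also have to update shadow memory for anything it writes, which this
model leaves out.
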
If a function defined in the translation unit being compiled belongs to the
``uninstrumented`` category, it will be compiled so as to conform to the
native ABI.  Its arguments will be assumed to be unlabelled, but it will
propagate labels in shadow memory.

For example:

.. code-block:: none

  # main is called by the C runtime using the native ABI.
  fun:main=uninstrumented
  fun:main=discard

  # malloc only writes to its internal data structures, not user-accessible memory.
  fun:malloc=uninstrumented
  fun:malloc=discard

  # tolower is a pure function.
  fun:tolower=uninstrumented
  fun:tolower=functional

  # memcpy needs to copy the shadow from the source to the destination region.
  # This is done in a custom function.
  fun:memcpy=uninstrumented
  fun:memcpy=custom

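To make the entry format concrete, here is a minimal sketch that reads
``fun:NAME=CATEGORY`` lines into a map from function name to its set of
categories.  ``parse_abi_list`` and ``warns_on_call`` are hypothetical helpers
written for this illustration.  The real file is parsed as a
:doc:`SanitizerSpecialCaseList`, which supports more than this sketch handles
(it ignores anything that is not a plain ``fun:`` entry).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Function name -> categories listed for it (hypothetical representation).
using CategoryMap = std::map<std::string, std::set<std::string>>;

// Parses plain `fun:NAME=CATEGORY` lines; comments and other lines are skipped.
CategoryMap parse_abi_list(const std::string &text) {
  CategoryMap result;
  const std::string prefix = "fun:";
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    if (line.empty() || line[0] == '#') continue;
    if (line.compare(0, prefix.size(), prefix) != 0) continue;
    const std::size_t eq = line.find('=', prefix.size());
    if (eq == std::string::npos) continue;
    const std::string name = line.substr(prefix.size(), eq - prefix.size());
    result[name].insert(line.substr(eq + 1));
  }
  return result;
}

// True if `name` is listed as uninstrumented with no additional category,
// the case in which its labelling behavior is unknown.
bool warns_on_call(const CategoryMap &m, const std::string &name) {
  const auto it = m.find(name);
  if (it == m.end()) return false;
  return it->second.size() == 1 && it->second.count("uninstrumented") == 1;
}

// Checks two entries taken from the example list above.
void demo_abi_list() {
  const CategoryMap m = parse_abi_list(
      "# tolower is a pure function.\n"
      "fun:tolower=uninstrumented\n"
      "fun:tolower=functional\n"
      "fun:memcpy=uninstrumented\n"
      "fun:memcpy=custom\n");
  assert(m.at("tolower").count("functional") == 1);
  assert(m.at("memcpy").count("custom") == 1);
  assert(!warns_on_call(m, "memcpy"));
}
```

The pairing in the example list follows from this: every function gets
``uninstrumented`` to select the native ABI, plus a second category that says
how its labels should be handled.
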
Example
=======

The following program demonstrates label propagation by checking that
the correct labels are propagated.

.. code-block:: c++

  #include <sanitizer/dfsan_interface.h>
  #include <assert.h>

  int main(void) {
    int i = 1;
    dfsan_label i_label = dfsan_create_label("i", 0);
    dfsan_set_label(i_label, &i, sizeof(i));

    int j = 2;
    dfsan_label j_label = dfsan_create_label("j", 0);
    dfsan_set_label(j_label, &j, sizeof(j));

    int k = 3;
    dfsan_label k_label = dfsan_create_label("k", 0);
    dfsan_set_label(k_label, &k, sizeof(k));

    dfsan_label ij_label = dfsan_get_label(i + j);
    assert(dfsan_has_label(ij_label, i_label));
    assert(dfsan_has_label(ij_label, j_label));
    assert(!dfsan_has_label(ij_label, k_label));

    dfsan_label ijk_label = dfsan_get_label(i + j + k);
    assert(dfsan_has_label(ijk_label, i_label));
    assert(dfsan_has_label(ijk_label, j_label));
    assert(dfsan_has_label(ijk_label, k_label));

    return 0;
  }

Current status
==============

DataFlowSanitizer is a work in progress, currently under development for
x86\_64 Linux.

Design
======

Please refer to the :doc:`design document<DataFlowSanitizerDesign>`.