=======================================
LLVM's Optional Rich Disassembly Output
=======================================

.. contents::
   :local:

Introduction
============

LLVM's default disassembly output is raw text. To give consumers more ability
to introspect the instructions' textual representation, or to reformat it for a
more user-friendly display, there is an optional rich disassembly output.

This optional output is sufficient to reference individual portions of the
instruction text. It is intended for clients like disassemblers, list file
generators, and pretty-printers, which need more than the raw instructions and
the ability to print them.

To provide this functionality the assembly text is marked up with annotations.
The markup syntax is simple enough to be robust even in the case of version
mismatches between consumers and producers. That is, the syntax generally does
not carry semantics beyond "this text has an annotation," so consumers can
simply ignore annotations they do not understand or do not care about.

After calling ``LLVMCreateDisasm()`` to create a disassembler context, the
optional output is enabled with this call:

.. code-block:: c

  LLVMSetDisasmOptions(DC, LLVMDisassembler_Option_UseMarkup);

Subsequent calls to ``LLVMDisasmInstruction()`` then return output strings
with the marked-up annotations.

Instruction Annotations
=======================

.. _contextual markups:

Contextual markups
------------------

Annotated assembly display will supply contextual markup to help clients more
efficiently implement things like pretty printers. Most markup will be
target-independent, so clients can effectively provide good display without any
target-specific knowledge.

Annotated assembly goes through the normal instruction printer, but optionally
includes contextual tags on portions of the instruction string. An annotation
is any '<' '>' delimited section of text(1).

.. code-block:: bat

  annotation: '<' tag-name tag-modifier-list ':' annotated-text '>'
  tag-name: identifier
  tag-modifier-list: comma delimited identifier list

The tag-name is an identifier which gives the type of the annotation. For the
first pass, this will be very simple, with memory references, registers, and
immediates having the tag names "mem", "reg", and "imm", respectively.

The tag-modifier-list is typically additional target-specific context, such as
register class.

Clients should accept and ignore any tag-names or tag-modifiers they do not
understand, allowing the annotations to grow in richness without breaking older
clients.

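A client that understands no tags at all can still recover the plain
instruction text by dropping every annotation header and keeping only the
annotated text. The following is a minimal sketch of that idea, not LLVM code:
``strip_markup`` and its return convention are hypothetical names chosen for
illustration. It assumes that a doubled ``<<`` or ``>>`` is always an escaped
literal (see the note on escaping), so it does not try to disambiguate two
annotations that close back to back.

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical helper: copy `in` to `out`, keeping only the annotated
     text. Returns 0 on success, -1 if `out` is too small or the markup is
     unbalanced. */
  int strip_markup(const char *in, char *out, size_t out_size)
  {
      size_t n = 0;   /* characters written so far */
      int depth = 0;  /* number of currently open annotations */

      if (out_size == 0)
          return -1;

      while (*in) {
          char c = *in;
          if ((c == '<' || c == '>') && in[1] == c) {
              /* Doubled character: emit one literal '<' or '>'. */
              if (n + 1 >= out_size)
                  return -1;
              out[n++] = c;
              in += 2;
          } else if (c == '<') {
              /* Skip the tag-name and modifiers, up to and including ':'. */
              in++;
              while (*in && *in != ':')
                  in++;
              if (*in != ':')
                  return -1;
              in++;
              depth++;
          } else if (c == '>') {
              /* Close the innermost open annotation. */
              if (depth == 0)
                  return -1;
              depth--;
              in++;
          } else {
              /* Ordinary text, inside or outside an annotation. */
              if (n + 1 >= out_size)
                  return -1;
              out[n++] = c;
              in++;
          }
      }
      out[n] = '\0';
      return depth == 0 ? 0 : -1;
  }

Because the function never inspects the tag-name or the modifiers, it keeps
working when a producer starts emitting tags the client has never seen.
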
For example, an ARM load from a stack-relative location might be annotated as:

.. code-block:: nasm

  ldr <reg gpr:r0>, <mem regoffset:[<reg gpr:sp>, <imm:#4>]>

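To act on specific tags, a client only needs to read the tag-name that
follows each opening ``<``. The sketch below counts the annotations with a
given tag-name and ignores the modifiers. ``count_tag`` is a hypothetical
helper, not part of the LLVM C API, and it assumes (based on the example
above) that the tag-name is separated from the modifier list by a space.

.. code-block:: c

  #include <stddef.h>
  #include <string.h>

  /* Hypothetical helper: count annotations in `s` whose tag-name equals
     `tag`. Tag modifiers are ignored. */
  size_t count_tag(const char *s, const char *tag)
  {
      size_t count = 0;
      size_t tag_len = strlen(tag);

      while (*s) {
          if ((*s == '<' || *s == '>') && s[1] == *s) {
              /* Doubled character: an escaped literal, not markup. */
              s += 2;
          } else if (*s == '<') {
              const char *name = s + 1;
              size_t len = 0;
              /* Assumption: the tag-name ends at a space or at ':'. */
              while (name[len] && name[len] != ' ' && name[len] != ':')
                  len++;
              if (len == tag_len && strncmp(name, tag, len) == 0)
                  count++;
              s = name + len;
          } else {
              s++;
          }
      }
      return count;
  }

For the ARM example above, this counts two ``reg`` annotations, one ``mem``,
and one ``imm``. A query for ``gpr`` finds none, because ``gpr`` appears only
as a modifier.
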
1: For assembly dialects in which '<' and/or '>' are legal tokens, a literal
token is escaped by following immediately with a repeat of the character. For
example, a literal '<' character is output as '<<' in an annotated assembly
string.

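On the producer side, the escaping rule amounts to doubling each literal
``<`` or ``>`` before the text is placed in an annotated string. A minimal
sketch, with ``escape_literal`` as a hypothetical name that does not exist in
LLVM:

.. code-block:: c

  #include <stddef.h>

  /* Hypothetical helper: copy `text` to `out`, doubling every '<' and '>'.
     Returns 0 on success, -1 if `out` is too small. */
  int escape_literal(const char *text, char *out, size_t out_size)
  {
      size_t n = 0;

      for (; *text; text++) {
          size_t need = (*text == '<' || *text == '>') ? 2 : 1;
          /* Keep one byte free for the terminating '\0'. */
          if (n + need >= out_size)
              return -1;
          out[n++] = *text;
          if (need == 2)
              out[n++] = *text;
      }
      if (n >= out_size)
          return -1;
      out[n] = '\0';
      return 0;
  }
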
C API Details
-------------

The intended consumers of this information use the C API, so the option to
produce disassembled instructions with annotations is exposed there: call
``LLVMSetDisasmOptions()`` with the ``LLVMDisassembler_Option_UseMarkup``
option (see above).