jit-imt 6.4 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139
  1. IMT-based interface invocation support
  2. The mono JIT can use an IMT-style invocation system to call interface methods.
  3. This considerably reduces the runtime memory usage when many interface types
  4. are loaded, because the old system required an array in MonoVTable indexed
  5. by the interface id, which grows linearly as more interfaces are loaded.
  6. In some cases there are also speedups, since an interface call can reduce to
  7. a virtual call automatically.
  8. IMT instead uses a fixed-size table and hashes each method in the implemented
  9. interfaces to a slot in the IMT table. To be able to resolve collisions, at each
  10. callsite we store the interface MonoMethod to be called in a well-known register and
  11. the IMT table will contain a snippet of code that uses it to jump to the
  12. proper vtable slot. The interface invocation sequence becomes (in pseudo-code):
  13. mov magic_reg, interface_monomethod
  14. call vtable [imt_slot]
  15. The IMT table is stored at negative addresses in the vtable, like the old
  16. interface array used to be.
  17. A small note on the choice of magic_reg for different JIT backends: the IMT
  18. method identifier doesn't necessarily need to be stored in a register, though
  19. doing so is fast and the JIT code has already the infrastructure to handle this
  20. case in an arch-independent way. A JIT porter just needs to #define
  21. MONO_ARCH_IMT_REG to the choosen register. Note that this register should be
  22. part of the MONO_ARCH_CALLEE_REGS set as it will be handled by the local register
  23. allocator (see mini/inssel.brg) and it must not be part of the registers used for
  24. argument passing as you'd overwrite an argument in that case.
  25. Also note that the method-specific trampoline code should make sure to preserve
  26. this register (but it should already if it's in MONO_ARCH_CALLEE_REGS as
  27. it could have been used for a vtable indirect call).
  28. Note that in the case of a nono-colliding IMT slot, the interface call
  29. instruction sequence becomes equivalent to a virtual call, as the IMT slot
  30. will contain the direct trampoline for the method and the magic trampoline will
  31. set the slot to the method's native code address once it is compiled.
  32. In case of collisions in the IMT slot, the JIT performs a linear search if
  33. the colliding methods are few or a binary search otherwise.
  34. To make this easier for each JIT port, a sort of internal representation
  35. of the code is created: this is an array of MonoIMTCheckItem structures
  36. built in a way to allow easy generation of a bsearch, when the list of colliding
  37. methods becomes large.
  38. Each item in the array represents either a direct check for a method to be invoked
  39. or a bisection check in the bsearch algorithm.
  40. struct _MonoIMTCheckItem {
  41. MonoMethod *method;
  42. int check_target_idx;
  43. int vtable_slot;
  44. guint8 *jmp_code;
  45. guint8 *code_target;
  46. guint8 is_equals;
  47. guint8 compare_done;
  48. guint8 chunk_size;
  49. guint8 short_branch;
  50. };
  51. For a direct check, the is_equals value is non-zero and the emitted code
  52. should be equivalent to:
  53. if (magic_reg != item->method)
  54. jump_to_item (array [item->check_target_idx]);
  55. jump_to_vtable (item->vtable_slot);
  56. Note that if item->check_target_idx is 0, the jump should be omitted
  57. since this is the end of a linear sequence (you might want to insert a jump to
  58. a breakpoint, though, for debugging) and this would mean that we have an error:
  59. the IMT slot was asked to execute an interface method that the type doesn't implement.
  60. In the future we might want to handle this case not with a breakpoint or assert, but
  61. by either throwing an InvalidCast exception or by going into the runtime and
  62. adding support for the interface automagically to the type/vtable: this could be used
  63. both for tranparent proxies and for the implicit interfaces that vectors in 2.0
  64. provide.
  65. For a bisect check the code is even simpler:
  66. if (magic_reg >= item->method)
  67. jump_to_item (array [item->check_target_idx]);
  68. In this case item->check_target_idx is always non-zero.
  69. Note that in both cases item->method becomes an immediate constant in the
  70. jitted code.
  71. The other fields in the structure are there to provide to the backend
  72. common storage for data needed during emission.
  73. As each item's code is emitted, the start of it is stored in the code_target
  74. field. At the same time when a conditional branch is inserted, its address
  75. is stored in jmp_code: this way with a single forward pass on the array at
  76. the end of the emission phase the branches can be patched to point to the
  77. proper target item's code (this process would patch the jump_to_item pseudo
  78. instructions described above).
  79. chunk_size can be used to store the size of the code generated for the item: this
  80. can be used to optimize the short/long branch instructions, together with
  81. info stored in short_branch. It is also used to calculate the size of the
  82. code to allocate for the whole IMT thunk.
  83. The compare_done field can be used to avoid doing an additional compare
  84. in a is_equals item for the same MonoMethod that was just compared in a
  85. bisecting item. Suppose we have 4 methods colliding in a slot, A, B, C and D.
  86. The arch-independent code already took care of sorting them, so that:
  87. A < B < C < D
  88. The generated code will look like (M is the method to call):
  89. compare (C, M)
  90. goto upper_sequence if bigger_equals
  91. /* linear sequence */
  92. compare (M, A)
  93. goto B_found if not_equals
  94. jump to A's slot
  95. B_found:
  96. jump to B's slot
  97. upper_sequence:
  98. /* we just did a compare against C, no need to compare again */
  99. goto D_found if not_equals
  100. jump to C's slot
  101. D_found:
  102. jump to D's slot
  103. This optimization is of course valid for architectures with flags registers.
  104. As a further optimization to reduce memory usage, the Mono runtime sets the
  105. IMT slots initially to a single-instance magic trampoline so there is actually no
  106. memory used up by the thunks in the case of collisions. When an interface method is
  107. called the magic trampoline will fill-in the IMT slot with the proper thunk or
  108. trampoline, so later calls will use the fast path.
  109. This single-instance trampoline will use MONO_FAKE_IMT_METHOD as the method
  110. it's asking to be compiled and executed: the trampoline code does recognize
  111. this special value and retrives the interface method to call from the usual
  112. MONO_ARCH_IMT_REG saved by the trampoline code.
  113. Given that only the IMT slots that are actually used will be initialized, this saves
  114. quite a bit of memory, as it's unlikely that all the interface methods are called on
  115. all the different types.