24 years ago · 951a13548a
--- a/docs/jit-thoughts
+++ b/docs/jit-thoughts
@@ -11,7 +11,53 @@ We are designing a JIT compiler, so we have to consider two things:
 The current approach is to keep the JITer as simple as possible, and thus as
 fast as possible. The generated code quality will suffer from that.
 
-X86 register allocation:
+We do not map local variables to registers at the moment, and this makes the
+whole JIT much simpler: for example, we do not need to identify basic block
+boundaries or the lifetime of local variables, or select the variables which
+are worth putting into a register.
+
+Register allocation is thus done only inside the trees of the forest, and each
+tree can use the full set of registers. We simply split a tree if we run out of
+registers, for example the following tree:
+
+
+              add(R0)
+             /   \
+            /     \
+           a(R0)  add(R1)
+                 /   \
+                /     \
+               b(R1)  add(R2)
+                     /   \
+                    /     \
+                   c(R2)   b(R3)
+
+can be transformed to:
+
+
+       stloc(t1)         add(R0)
+         |              /   \
+         |             /     \
+        add(R0)       a(R0)  add(R1)
+       /   \                /   \
+      /     \              /     \
+     c(R0)   b(R1)        b(R1)  t1(R2)
+
+
+Note that each of the split trees uses fewer registers than the original
+tree.
+
+
+Register Allocation:
+====================
+
+With lcc you can assign a fixed register to a tree before register
+allocation. For example, this is needed by call, which always returns its value
+in EAX on x86. The current implementation works without such a system, due to
+special forest generation.
+
+
+X86 Register Allocation:
 ========================
 
 We can use 8bit or 16bit registers on the x86. If we use that feature we have
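A minimal sketch of the tree splitting described in the hunk above, written in Python only for illustration. It is not the Mono implementation: the names Node, need, fit, split and assign are hypothetical, the numbering rule (the left child reuses its parent's register, the right child starts at the next one) is inferred from the example trees, and spilling the deepest subtree that does not fit is just one possible policy. It assumes binary expression trees and at least two registers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    op: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    reg: Optional[int] = None  # filled in by assign()


def need(n: Node) -> int:
    """Registers used by a tree under the assumed numbering rule."""
    if n.left is None:           # leaf
        return 1
    if n.right is None:          # unary node such as stloc
        return need(n.left)
    return max(need(n.left), 1 + need(n.right))


def fit(n: Node, avail: int, forest: List[Node]) -> Node:
    """Make n evaluable with `avail` registers, spilling subtrees to temps."""
    if n.left is None or need(n) <= avail:
        return n
    # Try the children first, so that the smallest possible subtree is spilled.
    n.left = fit(n.left, avail, forest)
    n.right = fit(n.right, avail - 1, forest)
    if need(n) <= avail:
        return n
    # Still too large: evaluate n as its own tree and store the result.
    temp = f"t{len(forest) + 1}"
    forest.append(Node(f"stloc({temp})", left=n))
    return Node(temp)


def split(root: Node, nregs: int) -> List[Node]:
    """Return the forest: spilled trees first, then the (reduced) root."""
    forest: List[Node] = []
    top = fit(root, nregs, forest)
    forest.append(top)
    return forest


def assign(n: Node, base: int = 0) -> None:
    """Number registers: left child shares the parent's, right takes the next."""
    n.reg = base
    if n.left is not None:
        assign(n.left, base)
    if n.right is not None:
        assign(n.right, base + 1)


# The example tree from the notes: add(a, add(b, add(c, b)))
inner = Node("add", Node("c"), Node("b"))
root = Node("add", Node("a"), Node("add", Node("b"), inner))
forest = split(root, nregs=3)
for tree in forest:
    assign(tree)
```

With three registers the sketch produces the same two trees as the notes: a stloc(t1) tree holding add(c, b), and the original root with t1 in place of the spilled subtree.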
			
@@ -27,17 +73,28 @@ Most processors have more that one register set, at least one for floating
 point values, and one for integers. Should we support architectures with more
 that two sets? Does someone knows such an architecture?
 
-Register Allocation:
-====================
+64bit Integer Values:
+=====================
+
+I can imagine two different implementations. One possibility would be to treat
+long (64bit) values simply like any other value type. This implies that we
+call class methods for ALU operations like add or sub. Admittedly, this approach
+will be a bit inefficient.
+
+The more performant solution is to allocate two 32bit registers for each 64bit
+value. We add a new non terminal to the monoburg grammar called long_reg. The
+register allocation routines take care of this non terminal and allocate two
+registers for it.
 
-With lcc you can assign a fixed register to a tree before register
-allocation. For example this is needed by call, which return the value always
-in EAX on x86. The current implementation works without such system (due to
-special forest generation), and I wonder if we really need this feature?
 
 Forest generation:
 ==================
 
+It seems that trees generated from the CIL language have some special
+properties, i.e. the trees already represent basic blocks, so there can be no
+branches to the inside of such a tree. All results of those trees are stored to
+memory.
+
 One idea was to drive the code generation directly from the CIL code, without
 generating an intermediate forest of trees. I think this is not possible,
 because you always have to gather some attributes and attach it to the
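For the long_reg idea in the "64bit Integer Values" section of the hunk above, the following sketch shows what an add on a pair of 32bit registers has to compute: add the low words, then add the high words plus the carry out of the low addition (on x86 this corresponds to an ADD followed by an ADC). It is only an illustration in Python; the helper names to_pair, from_pair and add_long are hypothetical and do not come from the Mono sources.

```python
MASK32 = 0xFFFFFFFF


def to_pair(value):
    """Split a 64bit value into (low, high) 32bit words."""
    return (value & MASK32, (value >> 32) & MASK32)


def from_pair(pair):
    """Join (low, high) 32bit words back into one 64bit value."""
    low, high = pair
    return low | (high << 32)


def add_long(a, b):
    """Add two (low, high) pairs with 64bit wrap-around."""
    low = a[0] + b[0]
    carry = low >> 32                    # 1 if the low addition overflowed
    high = (a[1] + b[1] + carry) & MASK32
    return (low & MASK32, high)
```

A sub on a register pair would work the same way, with a borrow in place of the carry.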
			
@@ -46,8 +103,6 @@ tree is the right thing and that also works perfectly with monoburg. IMO we
 would not get any benefit from trying to feed monoburg directly with CIL
 instructions. 
 
-We can also speedup the tree generation by using alloca instead of malloc.
-
 DAG handling:
 =============