Enable JIT compiler for x64.

Only works on Linux/x64 and Windows/x64 right now.
Force an x64 build on Linux/x64 with: make CC="gcc -m64"
NYI: handle on-trace OOM errors.
NYI: improve register allocation for x64.
Mike Pall 15 years ago
parent
commit
2e22d33d9d
7 changed files with 25 additions and 31 deletions
  1. doc/changes.html (+2 -1)
  2. doc/faq.html (+2 -3)
  3. doc/install.html (+6 -13)
  4. doc/luajit.html (+3 -2)
  5. doc/status.html (+3 -4)
  6. src/Makefile (+9 -7)
  7. src/lj_arch.h (+0 -1)

+ 2 - 1
doc/changes.html

@@ -58,7 +58,7 @@ to see whether newer versions are available.
 <li>CPU support:
 <ul>
 <li>Port integrated memory allocator to Linux/x64 and Windows/x64.</li>
-<li>Port the interpreter to x64.</li>
+<li>Port interpreter and JIT compiler to x64.</li>
 <li>Port DynASM to x64.</li>
 <li>Many 32/64 bit cleanups in the VM.</li>
 <li>Allow building the interpreter with either x87 or SSE2 arithmetics.</li>
@@ -80,6 +80,7 @@ to see whether newer versions are available.
 </ul></li>
 <li>Structural and performance enhancements:
 <ul>
+<li>Improve heuristics for bytecode penalties and blacklisting.</li>
 <li>Split CALL/FUNC recording and clean up fast function call semantics.</li>
 <li>Major redesign of internal function call handling.</li>
 <li>Improve FOR loop const specialization and integerness checks.</li>

+ 2 - 3
doc/faq.html

@@ -137,9 +137,8 @@ The compiler will happily optimize away such indirections.</dd>
 machine code. This means the code generator must be ported to each
 architecture. And the fast interpreter is written in assembler and
 must be ported, too. This is quite an undertaking.<br> Currently only
-x86 CPUs are supported. x64 support is in the works. Other
-architectures will follow with sufficient demand and/or
-sponsoring.</dd>
+x86 and x64 CPUs are supported. Other architectures will follow based
+on sufficient user demand and/or sponsoring.</dd>
 </dl>
 
 <dl>

+ 6 - 13
doc/install.html

@@ -54,16 +54,10 @@ LuaJIT currently builds out-of-the box on all popular x86 systems
 (Linux, Windows, OSX etc.). It builds and runs fine as a 32&nbsp;bit
 application under x64-based systems, too.
 </p>
-<p class="indent" style="color: #00a000;">
-The x64 port of LuaJIT is still experimental and not enabled by default.
-It only contains the interpreter and only builds on Linux/x64 and WIN64
-right now. If you want to give it a try, follow the special build instructions
-below.
-</p>
-<p class="indent" style="color: #00a000;">
-Note that the pure interpreter is quite a bit faster than Lua, but of
-course not as fast as the x86 JIT compiler. Work on the x64 JIT compiler
-is still ongoing.
+<p style="color: #00a000;">
+The x64 port of LuaJIT is still preliminary and not enabled by default.
+It only builds on Linux/x64 and Windows/x64 right now. If you want to
+give it a try, please follow the special build instructions below.
 </p>
 
 <h2>Configuring LuaJIT</h2>
@@ -119,8 +113,7 @@ make
 </pre>
 <div style="color: #00a000;">
 <p>
-You can force a build of the x64 interpreter on Linux/x64 with the
-following command:
+You can force a native x64 build on Linux/x64 with the following command:
 </p>
 <pre class="code">
 make CC="gcc -m64"
@@ -212,7 +205,7 @@ setenv /release /x86
 </pre>
 <div style="color: #00a000;">
 <p>
-Or select the x64 compiler (this only builds the interpreter right now):
+Or select the x64 compiler:
 </p>
 <pre class="code">
 setenv /release /x64

+ 3 - 2
doc/luajit.html

@@ -63,8 +63,9 @@ standard Lua interpreter and can be deployed as a drop-in replacement.
 <p>
 LuaJIT offers more performance, at the expense of portability. It
 currently runs on all popular operating systems based on <b>x86 CPUs</b>
-(Linux, Windows, OSX etc.). A port to x64 CPUs is currently ongoing &mdash;
-you can follow its progress in the <a href="http://luajit.org/download.html"><span class="ext">&raquo;</span>&nbsp;git repository</a>.
+(Linux, Windows, OSX etc.). A preliminary port to Linux/x64 and Windows/x64
+is already available (follow the <a href="install.html">build instructions</a>
+to enable it).
 Other platforms will be supported in the future, based on user demand
 and sponsoring.
 </p>

+ 3 - 4
doc/status.html

@@ -148,9 +148,8 @@ trace linking heuristics prevent this, but in the worst case this
 means the code always falls back to the interpreter.
 </li>
 <li>
-<b>Trace management</b> needs more tuning: better blacklisting of aborted
-traces, less drastic countermeasures against trace explosion and better
-heuristics in general.
+<b>Trace management</b> needs more tuning: less drastic countermeasures
+against trace explosion and better heuristics in general.
 </li>
 <li>
 Some checks are missing in the JIT-compiled code for obscure situations
@@ -199,7 +198,7 @@ Nonetheless, it compiles to native code and needs to be adapted to each
 architecture. Porting the compiler backend is probably the easier task,
 but a key element of its design is the fast interpreter, written in
 machine-specific assembler.<br>
-An x64 port is already in the works, thanks to the
+A preliminary x64 port is already available, thanks to the
 <a href="sponsors.html">LuaJIT sponsorship program</a>.
 Other ports will follow &mdash; companies which are
 interested in sponsoring a port to a particular architecture, please

+ 9 - 7
src/Makefile

@@ -20,9 +20,6 @@ NODOTABIVER= 51
 # Turn any of the optional settings on by removing the '#' in front of them.
 # You need to 'make clean' and 'make' again, if you change any options.
 #
-# Note: LuaJIT can only be compiled for x86, and not for x64 (yet)!
-#       In the meantime, the x86 binary runs fine under a x64 OS.
-#
 # It's recommended to compile at least for i686. By default the assembler part
 # of the interpreter makes use of CMOV/FCOMI*/FUCOMI* instructions, anyway.
 CC= gcc -m32 -march=i686
@@ -30,10 +27,14 @@ CC= gcc -m32 -march=i686
 # binaries to a different machine:
 #CC= gcc -m32 -march=native
 #
+# Currently LuaJIT builds by default as a 32 bit binary. Use this to force
+# a native x64 build on Linux/x64:
+#CC= gcc -m64
+#
 # Since the assembler part does NOT maintain a frame pointer, it's pointless
-# to slow down the C part by not omitting it. Debugging and tracebacks are
-# not affected -- the assembler part has frame unwind information and GCC
-# emits it with -g (see CCDEBUG below).
+# to slow down the C part by not omitting it. Debugging, tracebacks and
+# unwinding are not affected -- the assembler part has frame unwind
+# information and GCC emits it where needed (x64) or with -g (see CCDEBUG).
 CCOPT= -O2 -fomit-frame-pointer
 # Use this if you want to generate a smaller binary (but it's slower):
 #CCOPT= -Os -fomit-frame-pointer
@@ -75,7 +76,8 @@ XCFLAGS=
 #
 # Use the system provided memory allocator (realloc) instead of the
 # bundled memory allocator. This is slower, but sometimes helpful for
-# debugging. It's mandatory for Valgrind's memcheck tool, too.
+# debugging. It's helpful for Valgrind's memcheck tool, too. This option
+# cannot be enabled on x64, since the built-in allocator is mandatory.
 #XCFLAGS+= -DLUAJIT_USE_SYSMALLOC
 #
 # This define is required to run LuaJIT under Valgrind. The Valgrind

+ 0 - 1
src/lj_arch.h

@@ -48,7 +48,6 @@
 #define LJ_TARGET_X64		1
 #define LJ_TARGET_X86ORX64	1
 #define LJ_PAGESIZE		4096
-#define LJ_ARCH_NOJIT		1	/* NYI */
 #else
 #error "No target architecture defined"
 #endif
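
The one-line removal in src/lj_arch.h is what turns the JIT on for x64: an architecture section stops opting out, so whatever feature switch is derived from that define flips. The rest of the header is not shown in this diff, so the following is only a minimal sketch of that general pattern. All `DEMO_*` names and `demo_jit_status()` are hypothetical and are not LuaJIT's actual macros or functions.

```c
/* Hypothetical per-architecture section of a config header.
 * Uncomment the define below to mimic a port whose JIT backend is not
 * implemented yet (the state of x64 before this commit). */
/* #define DEMO_ARCH_NOJIT 1 */

/* Derive one feature flag from the possible opt-out defines:
 * either a user-requested disable or an architecture-level "NYI". */
#if defined(DEMO_DISABLE_JIT) || defined(DEMO_ARCH_NOJIT)
#define DEMO_HASJIT 0
#else
#define DEMO_HASJIT 1
#endif

/* Report which execution mode this build would use. */
const char *demo_jit_status(void)
{
#if DEMO_HASJIT
  return "interpreter + JIT compiler";
#else
  return "interpreter only";
#endif
}
```

With the opt-out define left commented out, `demo_jit_status()` reports the JIT-enabled mode; defining `DEMO_ARCH_NOJIT` (or passing `-DDEMO_DISABLE_JIT` to the compiler) switches the same source to the interpreter-only branch without touching any other code.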