Not yet complete. Only Sections 1, 2, 3, and 4 have been rewritten.
Kouya Shimura
Fujitsu Laboratories
kouya@flab.fujitsu.co.jp
Translation by Satoshi Matsuoka
Tokyo Institute of Technology
matsu@is.titech.ac.jp
Friday, October 29, 1999
This internal specification covers the runtime portion of the OpenJIT backend compiler for SPARC V8 CPU and IA32(x86) CPU. We outline the structure of this specification document below. The OpenJIT Backend compiler largely consists of the compiler part and the runtime part; this document covers the runtime part.
The Java JDK has several APIs for JIT compilers. OpenJIT is plugged into a given JDK using this API. We first explain the API, and describe the implementation in OpenJIT.
When the JDK starts up, it first reads the Java system class (java.lang.System), and then loads the java.lang.Compiler class. java.lang.Compiler is defined as follows:
    public final class Compiler {
        private Compiler() {}    // don't make instances
        private static native void initialize();
        static {
            try {
                String library = System.getProperty("java.compiler");
                if (library != null) {
                    System.loadLibrary(library);
                    initialize();
                }
            } catch (Throwable e) {
            }
        }
        ...
    }
When this class is loaded, the class initializer is executed and looks up the property "java.compiler". On JDK 1.1.x and JDK 1.2.x, the user specifies the compiler either via the command-line option "-Djava.compiler=XXX" or by setting the environment variable JAVA_COMPILER, allowing the system to dynamically link the library via System.loadLibrary. Then the native method initialize() is invoked. This method is defined in C as follows:
    void java_lang_Compiler_initialize (Hjava_lang_Compiler *this)
    {
        void *address = (void *)sysDynamicLink("java_lang_Compiler_start");
        if (address != 0) {
            (*(void (*) (void **)) address) (CompiledCodeLinkVector);
        }
        compilerInitialized = TRUE;
    }
NOTE: The above is the definition on JDK 1.1.x; it differs on JDK 1.2.x. The difference is the exact type of CompiledCodeLinkVector, which is not significant for our purposes, so we define a version-independent structure JITlink to hide the type-related differences.
By defining the function java_lang_Compiler_start() in the OpenJIT native library, we have the JVM invoke this function, allowing proper initialization of OpenJIT. The argument CompiledCodeLinkVector passes the necessary values for JIT compilation; it is essentially a vector of hook functions, and by modifying the vector appropriately, the JVM invokes the JIT compiler when needed. Some essential hook functions are described in Table 1.
Function name | Feature |
---|---|
InitializeForCompiler | On class load |
invokeCompiledMethod | On method invocation |
CompiledCodeSignalHandler | On signal occurrence |
CompilerFreeClass | When class is no longer needed |
CompilerCompileClass | Compilation of specified class |
CompilerCompileClasses | Compilation of specified classes |
CompilerEnable | Enable Compiler |
CompilerDisable | Disable Compiler |
ReadInCompiledCode | Load pre-compiled code |
PCinCompiledCode | For exception handling |
CompiledCodePC | For exception handling |
CompiledFramePrev | For exception handling |
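As a rough illustration of how a start routine might populate such a hook vector, the sketch below models the vector as a plain array of function pointers. The slot numbers, the hook_fn type, and the on_* and sketch_* functions are all hypothetical; the real layout of CompiledCodeLinkVector is defined by the JDK headers and differs between JDK 1.1.x and JDK 1.2.x.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot numbers; the real vector layout comes from the JDK. */
enum {
    SLOT_INITIALIZE_FOR_COMPILER,
    SLOT_COMPILER_FREE_CLASS,
    SLOT_COMPILER_ENABLE,
    SLOT_COMPILER_DISABLE,
    SLOT_COUNT
};

typedef void (*hook_fn)(void *arg);

static int classes_seen;      /* number of class-load notifications */
static int compiler_enabled;  /* toggled by the enable/disable hooks */

static void on_class_load(void *cb)  { (void)cb; classes_seen++; }
static void on_free_class(void *cb)  { (void)cb; }
static void on_enable(void *unused)  { (void)unused; compiler_enabled = 1; }
static void on_disable(void *unused) { (void)unused; compiler_enabled = 0; }

/* Stand-in for java_lang_Compiler_start(): fill in the hook slots. */
void sketch_compiler_start(hook_fn *vector)
{
    assert(vector != NULL);
    vector[SLOT_INITIALIZE_FOR_COMPILER] = on_class_load;
    vector[SLOT_COMPILER_FREE_CLASS]     = on_free_class;
    vector[SLOT_COMPILER_ENABLE]         = on_enable;
    vector[SLOT_COMPILER_DISABLE]        = on_disable;
}

/* Stand-in for the JVM side: call a hook only if the JIT installed one. */
void sketch_fire_hook(hook_fn *vector, int slot, void *arg)
{
    assert(vector != NULL && slot >= 0 && slot < SLOT_COUNT);
    if (vector[slot] != NULL)
        vector[slot](arg);
}
```

Before sketch_compiler_start() runs, every slot is empty and firing a hook does nothing; afterwards, each class load reaches on_class_load().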
Figure 1 illustrates the overview of the JIT compilation process. When Java is invoked with a JIT compiler specified, the native portion of the JDK is dynamically linked into the JVM. By setting the hook functions appropriately as described above, the class loader will always call the InitializeForCompiler() function on each class load.
InitializeForCompiler, in turn, modifies the invoker of all the methods defined for the class (except for native and abstract methods) so that the JIT compiler is invoked to dynamically compile the bytecode into native code. Once the method is compiled, the method invocation flag is modified and control is passed on to the generated native code. Thereafter, by virtue of the flag, the JVM invokes the compiled native code directly.
When compiled native code calls other compiled native code, the CompiledCode field of the methodblock structure is read and control is transferred directly via a jump instruction.
Compiled native methods accumulate in memory. The space is reclaimed when the JDK actually deletes the class from memory and calls the hook function CompilerFreeClass, which in turn frees the memory for the native code as well.
We have now given an overview of the JIT compilation process. As we can see, the unit of dynamic compilation is the individual method, and compilation happens the first time the method is invoked. Of course, one could instead adopt adaptive strategies, such as compiling on the nth invocation.
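The invoker replacement, and the "compile on the nth invocation" variant mentioned above, can be sketched as follows. This is only a minimal model: struct method is a hypothetical, much-reduced stand-in for the JDK's methodblock, COMPILE_THRESHOLD is an assumed tuning parameter, and setting the compiled flag stands in for actually running the compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct method;
typedef bool (*invoker_fn)(struct method *m);

/* Hypothetical, much-reduced stand-in for the JDK's methodblock. */
struct method {
    invoker_fn invoker;  /* called on every invocation */
    int calls;           /* invocations seen while still interpreted */
    bool compiled;
};

enum { COMPILE_THRESHOLD = 3 };  /* assumed; 1 means "compile on first call" */

static bool invoke_compiled(struct method *m)    { (void)m; return true; }
static bool invoke_interpreted(struct method *m) { (void)m; return true; }

/* Counting invoker: interpret until the threshold, then compile and switch. */
bool invoke_counting(struct method *m)
{
    assert(m != NULL);
    if (++m->calls < COMPILE_THRESHOLD)
        return invoke_interpreted(m);
    m->compiled = true;            /* stands in for running the compiler */
    m->invoker = invoke_compiled;  /* later calls bypass this stub */
    return m->invoker(m);
}
```

With the threshold set to 1, the sketch degenerates to the first-invocation policy described in the text.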
(java_lang_Compiler_start)
In OpenJIT, the initialization routine java_lang_Compiler_start performs the following tasks:
_compile_lock
), which is initialized here.org.OpenJIT.Sparc
or org.OpenJIT.X86
class
org.OpenJIT.Compile
for each method in order to compile. So the native runtime library needs to know the class before compilation started. At this time, we have machine-dependent concrete classes org.OpenJIT.Sparc
for SPARC and org.OpenJIT.X86
for x86.
org.OpenJIT.ExceptionHandler
class
org.OpenJIT.ExceptionHandler
class by loading it.OpenJIT_compile()
function
ResolveClassConstant()
function, obtaining the methodblock
structure for OpenJIT_compile()
.org.OpenJIT.Compile
class
CompiledCodeLinkVector
CompiledCodeVector
. Instead, we must re-initialize the classes already loaded by resetting the method invocation functions etc. of the already loaded classes, in order to allow them to be compiled as well.
(OpenJIT_InitializeForCompiler)
static void OpenJIT_InitializeForCompiler(ClassClass *cb)
Given a class, set the invoker functions of all the methods except for native methods, abstract methods and class initializer methods so that they are dynamically compiled on invocation. Also, in order to allow calls to a compiled method from another compiled method, the CompiledCode field of the methodblock structure must be set.
Also, we set the CompiledCodeFlags of the methodblock structure depending on the type of the return value of the method. This value is used in the dispatchVM() function when the compiled native code calls the JVM for interpretive execution.
Furthermore, by setting values in the OpenJIT.properties file, we can restrict which classes or methods are compiled.
(OpenJIT_SignalHandler)
static void OpenJIT_SignalHandler(int sig, siginfo_t *info, ucontext_t *uc) -- for SPARC
static void OpenJIT_SignalHandler(int sig, void *info, void *uc) -- for x86
Called when a signal occurs. OpenJIT generates the appropriate Java exceptions from Unix signals. This function checks whether the signal corresponds to a Java exception and, if so, routes control to the appropriate handler. If the signal is irrelevant to Java exceptions, it simply returns. For details of exception handling, please refer to Section 5.
(OpenJIT_CompilerFreeClass)
static void OpenJIT_CompilerFreeClass(ClassClass *cb)
When the JDK decides that a certain class is no longer necessary, this function is called to free the space occupied by the compiled native code.
(OpenJIT_CompilerCompileClass)
static bool_t OpenJIT_CompilerCompileClass(ClassClass *cb)
Called when a Java user program invokes java.lang.Compiler.compileClass(clazz). This forces compilation of all methods in the given class that have not yet been compiled.
(OpenJIT_CompilerEnable)
static void OpenJIT_CompilerEnable()
Called when a Java user program invokes java.lang.Compiler.enable(). Methods called after this call will be compiled.
(OpenJIT_CompilerDisable)
static void OpenJIT_CompilerDisable()
Called when a Java user program invokes java.lang.Compiler.disable(). Methods called after this call will not be compiled. Note that, in the default OpenJIT implementation where each method is compiled on its first invocation, the caller of this method will already have been compiled and thus will execute as compiled native code.
(OpenJIT_PCinCompiledCode)
static bool_t OpenJIT_PCinCompiledCode(caddr_t *pc, struct methodblock *mb)
Determines, from the program counter and the methodblock structure, whether execution is currently within the given method. The JDK uses this function when an exception occurs and it walks and displays the stack trace. It returns TRUE if the method is being executed, and FALSE otherwise.
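One plausible way to implement such a test is a simple address-range check. The sketch below assumes that the start address and length of each method's generated code are recorded somewhere; struct code_range and pc_in_code are hypothetical names, not OpenJIT's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of where a method's native code lives. */
struct code_range {
    uintptr_t start;   /* address of the first instruction */
    size_t    length;  /* size of the generated code in bytes */
};

/* True if pc falls inside [start, start + length). */
bool pc_in_code(uintptr_t pc, const struct code_range *r)
{
    assert(r != NULL);
    return pc >= r->start && pc - r->start < r->length;
}
```

Comparing pc - start against the length, rather than pc against start + length, avoids overflow when the code sits near the top of the address space.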
(OpenJIT_CompiledCodePC)
static unsigned char *OpenJIT_CompiledCodePC(JavaFrame *frame, struct methodblock *mb)
Returns the value of the program counter given the frame and the methodblock. The JDK uses this function when an exception occurs and it walks and displays the stack trace.
methodblock
. Everything seems to work fine under this simplification for JDK 1.1.x and JDK 1.2.x, but other JDKs might break this assumption.
(OpenJIT_CompiledFramePrev)
static JavaFrame *OpenJIT_CompiledFramePrev(JavaFrame *frame, JavaFrame *buf)
Converts the stack frame of compiled native code into the Java stack frame used by the JDK (JavaFrame). The JDK uses this function when an exception occurs and it walks and displays the stack trace.
The code generated by OpenJIT follows the C stack frame convention, and this function performs the conversion under that assumption. For JDK 1.1.x, the JavaFrame structure only utilizes the current method and the vars frame; thus, in practice these are the only two fields set by the function. As the converted frame must use the buf memory region, the function sets the values and returns buf.
For each method, both JIT compilation and the transfer of control to the native code happen at the point where the method is invoked.
The JVM interpreter loop is structured as follows. When a method is invoked, the invoker function of the methodblock structure mb is called. Under interpretive execution, this in turn calls the JVM to generate a new Java stack frame. The first argument o of invoker() is a pointer to a class object for static method calls, and a pointer to the invoked object for normal method calls. The second argument mb is a pointer to the methodblock structure, and the third argument args_size indicates the size of the arguments. The fourth argument ee is a pointer to the execution environment structure ExecEnv.
    while(1) {
        get opcode from pc
        switch(opcode) {
        ...(various implementation of the JVM bytecodes)
        callmethod:
            mb->invoker(o, mb, args_size, ee);
            frame = ee->current_frame;   /* setup java frame */
            pc = frame->lastpc;          /* setup pc */
            break;
        }
    }
As mentioned earlier, we replace the value of the invoker with OpenJIT_invoke when the class is loaded. The OpenJIT_invoke function is defined as follows in C:
bool_t OpenJIT_invoke(JHandle *o, struct methodblock *mb, int args_size, ExecEnv *ee);
This function in turn upcalls OpenJIT_compile to dynamically compile the method. Thereafter, it calls mb->invoker, transferring control to the just-compiled method.
A method is compiled, and the invoker and CompiledCode fields of the methodblock structure are initialized.
void OpenJIT_compile(struct methodblock *mb)
This function performs the following:
CompiledCode
fields in the methodblock
structure. When a method is invoked during compilation, we set the invoker and CompiledCode
so that the interpreter is invoked. This allows natural handling of recursive self-compilation of OpenJIT compiler classes.OpenJIT_Sparc_compile()
). The Java method is upcalled to perform the actual compilation.methodblock
structure so that the compiled native code is invoked.methodblock
field values are restored to their original values.
The CompileMethod function sets the value of the invoker to one of the following stub functions according to the return type of the method. The last letter of the function name indicates the return type: V: void, I: int, J: long, F: float, D: double. Other types that fit in one word, such as Object, short, char, or byte, use I by default.
    bool_t invokeCompiledCodeV(JHandle *o, struct methodblock *mb, int args_size, ExecEnv *ee)
    bool_t invokeCompiledCodeI(JHandle *o, struct methodblock *mb, int args_size, ExecEnv *ee)
    bool_t invokeCompiledCodeJ(JHandle *o, struct methodblock *mb, int args_size, ExecEnv *ee)
    bool_t invokeCompiledCodeF(JHandle *o, struct methodblock *mb, int args_size, ExecEnv *ee)
    bool_t invokeCompiledCodeD(JHandle *o, struct methodblock *mb, int args_size, ExecEnv *ee)
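The choice of stub by return type can be illustrated with a small helper that inspects a JVM method signature such as "(I)J". This is a sketch only: stub_suffix_for_signature is a hypothetical helper, and how OpenJIT itself determines the return type is not shown in this document.

```c
#include <assert.h>
#include <string.h>

/* Return the stub suffix (V, I, J, F or D) for a JVM method signature
 * such as "(ILjava/lang/String;)J"; returns '\0' if it is malformed. */
char stub_suffix_for_signature(const char *signature)
{
    const char *ret;

    assert(signature != NULL);
    ret = strrchr(signature, ')');   /* the return type follows ')' */
    if (ret == NULL || ret[1] == '\0')
        return '\0';
    switch (ret[1]) {
    case 'V': case 'J': case 'F': case 'D':
        return ret[1];
    default:
        return 'I';   /* objects, arrays, int, short, char, byte, boolean */
    }
}
```

For example, a method returning java.lang.Object has a return descriptor starting with 'L' and therefore maps to the I stub, matching the one-word default described above.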
These stub functions perform the following:
invokeCompiledCode
function itself has only four arguments, and arguments given for method to be called are contained in previous native frame which calls invokeCompiledCode
. To pass method's arguments to callee, we first expand native frame of invokeCompiledCode
by manipulating stack pointer, and then copy arguments into expanded native frame.
push
ed using inline assembler so stack is automatically expanded.methodblock
structure into appropriate place: Required for compiled code calling convention. For details, refer to the descriptions of the compiled code to compiled code details.
CompiledCode
.
invokeCompiledCode
function.Steps 2~4 are already packaged in the Macro INVOKE_COMPILED_CODE
.
As mentioned earlier, control is passed to the address in the CompiledCode field of the methodblock. The C-style description of the call is as follows:
mb->CompiledCode(obj, arg0, arg1, arg2, ...)
The arguments are passed via the SPARC function call convention, i.e., the first 6 arguments are passed in the registers %o0 ~ %o5. Note that, for Java native code, %o0 is reserved for the object or the class pointer of the call, so the register usage is actually shifted by one. As an example, consider the following Java program:
obj.method(arg0, arg1, arg2, arg3, arg4, arg5)
%o0 will hold obj, %o1 will be assigned arg0, ..., %o5 will be assigned arg4, whereas arg5 is passed on the stack.
Similarly, method return values follow the SPARC function call convention: integers are returned in %i0, longs (64 bits) in %i0 and %i1, floats in %f0, and doubles in %f0 and %f1.
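The shifted register assignment described above can be written down as a small mapping function. The sketch assumes every argument occupies one word (long and double take two), and the stack slot numbering starting at 0 is an assumption made for illustration; struct arg_location and sparc_arg_location are hypothetical names.

```c
#include <assert.h>

enum arg_place { IN_REGISTER, ON_STACK };

struct arg_location {
    enum arg_place place;
    int index;  /* %o register number, or stack slot number (assumed 0-based) */
};

/* Location of Java argument number n (0-based), one-word arguments only.
 * %o0 holds the object or class pointer, so arguments start at %o1. */
struct arg_location sparc_arg_location(int n)
{
    struct arg_location loc;

    assert(n >= 0);
    if (n < 5) {
        loc.place = IN_REGISTER;
        loc.index = n + 1;
    } else {
        loc.place = ON_STACK;
        loc.index = n - 5;
    }
    return loc;
}
```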
What differs from C function calls is that we must always set the methodblock of the method to be called into the %g3 register. This value is necessary for virtual function calls (OpenJIT_invokevirtual, OpenJIT_invokevirtualobject_quick), as well as for exception handling, upon which the value must be saved in the native stack.
The arguments are passed via the UNIX/x86 function call convention: all arguments are passed on the native stack. The method's return value is treated just as in the C function call convention, i.e., return values are returned in registers.
Register usage for return values
- %eax: for objects, int, and the lower half of long
- %edx: for the higher half of long
- %f0 (floating-point stack top): for float and double
What differs from C function calls is that we must always push the methodblock of the method to be called onto the top of the native stack before the actual call. This value is necessary for virtual function calls (OpenJIT_invokevirtual, OpenJIT_invokevirtualobject_quick), as well as for exception handling, upon which the value must be saved in the native stack.
Here is an example for the following Java program:
obj.method(arg0, arg1, arg2)
On the caller side, the arguments and the methodblock pointer are pushed before the actual call. Then the method call is performed. (See Native stack usage below.)
| just before method call | | | just after callee prologue | | |
|---|---|---|---|---|---|
| register | | stack | register | | stack |
| | | | %esp | -> | ...callee's frame... |
| | | | %ebp | -> | %ebp of caller |
| | | | | | return address to caller |
| %esp | -> | methodblock for method to be called | | | methodblock for method to be called |
| | | obj | | | obj |
| | | arg0 | | | arg0 |
| | | arg1 | | | arg1 |
| | | arg2 | | | arg2 |
| | | ...caller's frame... | | | ...caller's frame... |
| %ebp | -> | %ebp of caller's caller | | | %ebp of caller's caller |
| | | return address to caller's caller | | | return address to caller's caller |
| | | ...arguments given to caller... | | | ...arguments given to caller... |
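Read from the callee's side, the stack layout in the table gives fixed offsets from %ebp. The following sketch spells them out, assuming 4-byte stack slots, a conventional prologue that saves %ebp, and one-word arguments; the constant and function names are hypothetical.

```c
#include <assert.h>

enum { WORD = 4 };  /* IA-32 stack slot size in bytes */

/* Offsets from %ebp after the callee prologue, following the table. */
enum {
    OFF_SAVED_EBP   = 0 * WORD,  /* %ebp of caller */
    OFF_RETURN_ADDR = 1 * WORD,  /* return address to caller */
    OFF_METHODBLOCK = 2 * WORD,  /* methodblock pushed last by the caller */
    OFF_OBJ         = 3 * WORD   /* object or class pointer */
};

/* Offset of the n-th one-word argument (0-based) from %ebp. */
int x86_arg_offset(int n)
{
    assert(n >= 0);
    return OFF_OBJ + (n + 1) * WORD;
}
```

Under these assumptions arg0 is found at 16(%ebp) and arg2 at 24(%ebp).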
The methods to be compiled by the JIT already have the value of the CompiledCode field in the methodblock changed to dispatchJITCompiler at class loading time. When a method is invoked for the first time, even if it is invoked from compiled code, dispatchJITCompiler is invoked. So no special treatment is needed to compile the method.
When the CompiledCode field of the methodblock structure is set to dispatchJVM, the following function is called to transfer control to the interpreter. This is used when compilation fails, or when compilation is restricted by a compiler option.
void dispatchJVM(? arg0, ? arg1, ? arg2, ...)
The JVM provides the following function for calling a Java method from native code in general:
long do_execute_java_method(ExecEnv *ee, void *obj, char *method_name, char *signature, struct methodblock *mb, bool_t isStaticCall, ...);
dispatchJVM bridges the calling convention of compiled native methods and this function:
methodblock
structure from %g3methodblock
structure of the caller into the dummy Java frame. This is required for reflection and exception handling.do_execute_java_method
When the method is native to begin with, the code field of the methodblock points to the native code to be executed. A call can thus be made in the following way in C:
optop = (*(stack_item *(*)(stack_item*, ExecEnv*))mb->code)(optop, ee)
The first argument is a pointer to the operand stack, and the second argument is a pointer to the execution environment ExecEnv. As a result, when we call a native method from compiled native code, we must copy the register arguments onto the operand stack, as for a call to the JVM. Also, the value returned on the operand stack must be placed back in a register. For this purpose, we define five stub functions according to the return type:
    void dispatchNormNativeV(...)
    int dispatchNormNativeI(...)
    int64_t dispatchNormNativeJ(...)
    float dispatchNormNativeF(...)
    double dispatchNormNativeD(...)
When the native method is a synchronized method, it further requires monitor lock/unlock operations. For efficiency, we define a separate set of stub functions to cover this case:
    void dispatchSynchNativeV(...)
    int dispatchSynchNativeI(...)
    int64_t dispatchSynchNativeJ(...)
    float dispatchSynchNativeF(...)
    double dispatchSynchNativeD(...)
These functions perform the following:
methodblock
from %g3(on SPARC) or stack(on x86) to obtain the starting address of the native method.of ExecEnv
methodblock
of the caller, and set to the dummy JVM frame. Again, this is required for reflection and exception handling.MonitorEnter
MonitorExit
handle_exception
, which actually controls the transfer entirely and never returns.mb->code
from the operand stack and place it appropriately into the register according to the return type.DISPATCH_NATIVE
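The core of such a dispatch stub (spill the arguments to an operand stack, call through the native entry point, fetch the result) can be sketched in portable C. The types below are hypothetical simplifications, monitor handling and exception checks are omitted, and the convention that the native code leaves its result just below the returned stack top is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-word operand stack slot and execution environment. */
typedef union { int32_t i; void *p; } stack_item;
typedef struct { int unused; } ExecEnvSketch;

/* Same shape as the call through mb->code: takes the operand stack
 * pointer and returns the new stack top. */
typedef stack_item *(*native_code_fn)(stack_item *optop, ExecEnvSketch *ee);

/* Example native method: adds its two int arguments in place. */
static stack_item *native_add(stack_item *optop, ExecEnvSketch *ee)
{
    (void)ee;
    optop[0].i = optop[0].i + optop[1].i;
    return optop + 1;  /* one result word left on the stack */
}

/* Sketch of an int-returning dispatch stub. */
int32_t dispatch_native_int(native_code_fn code, ExecEnvSketch *ee,
                            const int32_t *args, int nargs)
{
    stack_item stack[16];
    stack_item *optop;
    int k;

    assert(code != NULL && nargs >= 0 && nargs <= 16);
    for (k = 0; k < nargs; k++)   /* "register" arguments -> operand stack */
        stack[k].i = args[k];
    optop = code(stack, ee);
    assert(optop > stack);        /* an int result must have been pushed */
    return optop[-1].i;           /* result -> return "register" */
}
```

Calling dispatch_native_int(native_add, &ee, args, 2) with args holding 2 and 40 yields 42.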
Java bytecode refers to classes, instance variables, and methods via symbol names. Symbols are stored in a structure called the constant pool. Thus, during bytecode execution, one must search the constant pool with the given symbol as a key and obtain the actual address. This search is quite costly, as one must lock the constant pool region for multithreaded execution. Moreover, constant pool references occur quite frequently, and the cost of the search could dominate the overall execution time.
The JVM implementation solves this problem by modifying the bytecode on the fly. That is to say, the bytecode for constant pool access is modified to an equivalent so-called quick bytecode, which refers to the absolute address after the name has been resolved. For example, the bytecode instruction:
getfield #22 <Field Obj var>
pushes the object variable value onto the operand stack. When this instruction is first executed, the constant pool is always searched for the constant pool index #22. As a result, when we find that this variable can be accessed at a 4-byte offset from the object header, then the JVM modifies the code at runtime to the following quick version:
getfield_quick 4
then subsequently re-executes the instruction. From that point on, the quick instruction is always used, since the constant pool does not change over the execution of the program, effectively eliminating the lookup cost.
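The rewriting just described can be mimicked on a small byte array. In this sketch the opcode values are illustrative, resolve_field_offset is a hypothetical resolver that hard-codes the example above (constant pool index 22 resolves to offset 4), and the locking a real JVM needs is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_GETFIELD = 0xb4, OP_GETFIELD_QUICK = 0xce };  /* illustrative */

static int resolve_calls;  /* how many slow lookups happened */

/* Hypothetical resolver: constant pool index -> field offset. */
static uint8_t resolve_field_offset(uint16_t cp_index)
{
    resolve_calls++;
    return (uint8_t)(cp_index == 22 ? 4 : 0);
}

/* Execute the field access at code[pc] and return the offset used.
 * The first execution resolves the index and rewrites the instruction. */
uint8_t exec_field_access(uint8_t *code, int pc)
{
    assert(code != NULL && pc >= 0);
    if (code[pc] == OP_GETFIELD) {
        uint16_t index = (uint16_t)((code[pc + 1] << 8) | code[pc + 2]);
        code[pc + 1] = resolve_field_offset(index);
        code[pc + 2] = 0;
        code[pc] = OP_GETFIELD_QUICK;  /* opcode is written last */
    }
    assert(code[pc] == OP_GETFIELD_QUICK);
    return code[pc + 1];
}
```

Executing the same instruction twice triggers only one lookup; the second execution takes the quick path.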
However, for native code, it is more difficult to eliminate this cost of lookup by naive application of a similar technique. A simple method would be to search all the possible symbol values in the constant pool and resolve them at once at compile time. However, constant pool resolution is not mere simple symbol resolution, but rather incurs other processing such as class loading and initialization; thus, this strategy could change the semantics of the program by changing the initialization order of classes.
The viable option is to change the native instruction code in the same manner as the interpreter. However, it is much more difficult to do for native code, which involves several native instructions per each bytecode. Since the length of the instruction sequence cannot change, this could involve insertion of several NOP instructions. Moreover, the change must be atomic, requiring some form of mutual exclusion. Moreover, the change must propagate across code caches on different processors in a multiprocessor environment.
The basic solution is as follows. For SPARCs, we place a CALL instruction to the constant pool resolution routine, followed by a delay slot which contains the MOV instruction that sets the index number of the symbol in a register:
    call resolve
    mov #22,%o0
When this sequence is executed, the resolve() routine is called. There, after the constant pool is searched and the address corresponding to the index is found, the two instructions are rewritten so that the instruction sequence now places the address (or the offset) value into a register:
    sethi %hi(offset), %o0
    or %o0, %lo(offset),%o0
Then, the resolve() routine returns with the resolved address in the %o0 register. The instruction immediately following the two rewritten instructions merely accesses the memory using %o0. For SPARCs, we could further optimize this, as small offsets can be encoded into one instruction and the load instruction contains a displacement field.
That is the basic idea. In practice, however, the SPARC MOV instruction only accepts a signed 13-bit immediate as the index value. In the JVM, the index value can be as large as 16 bits, so we employ the SETHI instruction instead of the MOV instruction, which allows 22 bits. One caveat is that the value placed in the register is the index shifted left by 10 bits.
    call resolver         -> sethi %hi(offset),%o0
    sethi (index<<10),%o0 -> or %o0,%lo(offset),%o0
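To make the rewriting concrete, the sketch below builds the two replacement instruction words using the SPARC V8 encodings of SETHI and OR-with-immediate. The function names are hypothetical and this is not OpenJIT's actual patching macro; a real implementation must also make the store atomic with respect to other threads and flush the instruction cache, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_O0 = 8 };  /* %o0 is integer register 8 */

/* sethi imm22, rd : op=00, op2=100 */
uint32_t encode_sethi(unsigned rd, uint32_t imm22)
{
    assert(rd < 32 && imm22 < (1u << 22));
    return ((uint32_t)rd << 25) | (4u << 22) | imm22;
}

/* or rs1, imm, rd : op=10, op3=000010, i=1 (non-negative immediates only) */
uint32_t encode_or_imm(unsigned rd, unsigned rs1, uint32_t imm)
{
    assert(rd < 32 && rs1 < 32 && imm < (1u << 12));
    return (2u << 30) | ((uint32_t)rd << 25) | (2u << 19)
         | ((uint32_t)rs1 << 14) | (1u << 13) | imm;
}

/* Recover the constant pool index from the placeholder sethi word. */
uint32_t placeholder_index(uint32_t sethi_insn)
{
    return sethi_insn & 0x3fffffu;
}

/* Rewrite the two-instruction site (call; sethi index) so that it
 * loads the resolved 32-bit value into %o0. */
void sketch_patch_set_o0(uint32_t site[2], uint32_t value)
{
    site[0] = encode_sethi(REG_O0, value >> 10);             /* %hi(value) */
    site[1] = encode_or_imm(REG_O0, REG_O0, value & 0x3ffu); /* %lo(value) */
}
```

Since SETHI deposits its 22-bit immediate into the upper bits of the register, the resolver sees index<<10 in %o0 at run time, which is the caveat noted above.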
The following functions actually search the constant pool and resolve the offsets:
int OpenJIT_resolve_getField(int index)
The JVM getfield instruction is translated to call this function, which performs the following:
CHECK_SELF_MODIFYING
)methodblock
of the caller. We obtain the address of the constant pool from this methodblock
structure. Also, we set the address value into the JVM dummy frame in case exception happens.RESOLVE_CLASS_CONST
)PATCH_SET_O0
)
int OpenJIT_resolve_putField(int index)
The JVM putfield instruction is translated to call this function, which performs steps similar to OpenJIT_resolve_getField.
int OpenJIT_resolveStaticField(int index)
The JVM putstatic
bytecode is translated to call this function.
CHECK_SELF_MODIFYING
)index
by 10bitsmethodblock
of the caller. We obtain the address of the constant pool from this methodblock
structure. Also, we set the address value into the JVM dummy frame in case exception happens.RESOLVE_CLASS_CONST
)PATCH_SET_O0
)
u.static_address
of the fieldblock
.PATCH_SET_O0
)u.static_value
of the fieldblock
.
void OpenJIT_getstatic(int index)
The JVM getstatic
instruction is translated to call
this function which performs the similar steps to OpenJIT_resolveStaticField
.
int OpenJIT_resolveString(int index)
For JVM bytecodes ldc
and ldc_w
, if the constant pool type is CONSTANT_String
, then the bytecodes are translated to call the function.
CHECK_SELF_MODIFYING
)index
by 10bitsmethodblock
of the caller. We obtain the address of the constant pool from this methodblock
structure. Also, we set the address value into the JVM dummy frame in case exception happens.RESOLVE_CLASS_CONST
)PATCH_SET_O0
)We modify only the CALL instruction. This is only employed when a new class is loaded on method call. This series of instructions is a combination of setting the pointer to the method block into the %g3 register, and CALLing the invoker function. Here, only the CALL instruction is modified.
    sethi %hi(mb),%g3    -> sethi %hi(mb),%g3
    call old_invoker     -> call new_invoker
    or %g3,%lo(mb),%g3   -> or %g3,%lo(mb),%g3
The functions below are subject to such one-instruction modification:
int OpenJIT_invokeinterface(...)
The JVM invokeinterface
bytecode is translated to call this function, which performs the followings:
CHECK_SELF_MODIFYING
)methodblock
of the caller.RESOLVE_CLASS_CONST
)OpenJIT_invokeinterface_quick
OpenJIT_invokeinterface_quick
.
The JVM invokeinterface_quick instruction is translated to invoke this function. Also, as indicated in Section 4.1.2.1, it is invoked subsequently to the invocation of OpenJIT_invokeinterface. The %g3 register must contain the predicted value of the method table. The procedure is almost the same as when the JVM processes the invokeinterface_quick bytecode, but differs in the following points:
We modify the instruction that sets the %g3 register, preceding the call instruction which called this function; the modification is such that the predicted value is shifted by 24 bits and set to the upper 8 bits of the %g3 register. Since the predicted value is allowed to be old and stale, we do not lock the instruction upon modification.
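The role of the predicted value can be illustrated with a per-call-site guess that is always verified before use. In this sketch the guess lives in an ordinary struct field rather than in a patched instruction, and the table layout and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct imethod { const char *name; int id; };
struct itable  { const struct imethod *methods; int count; };

/* Per-call-site state; in generated code the guess sits in an instruction. */
struct call_site { int guess; };

static int full_searches;  /* counts slow-path lookups */

const struct imethod *lookup_interface_method(struct call_site *site,
                                              const struct itable *table,
                                              const char *name)
{
    int k;

    assert(site != NULL && table != NULL && name != NULL);
    /* Fast path: check the predicted slot first. */
    if (site->guess >= 0 && site->guess < table->count &&
        strcmp(table->methods[site->guess].name, name) == 0)
        return &table->methods[site->guess];

    /* Slow path: search the whole table and record the new prediction. */
    full_searches++;
    for (k = 0; k < table->count; k++) {
        if (strcmp(table->methods[k].name, name) == 0) {
            site->guess = k;  /* unsynchronized on purpose: only a hint */
            return &table->methods[k];
        }
    }
    return NULL;
}
```

Because the guess is re-checked on every call, a stale or concurrently overwritten value only costs an extra search, which is why the update needs no lock.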
void OpenJIT_invokespecial(...)
The JVM invokespecial
, invokenonvirtual_quick
bytecodes are translated to call this function. The function is effectively used when the method does not change for the given call site irrespective of the type of the object.
CHECK_SELF_MODIFYING
)methodblock
of the caller.methodblock
structure, and find out whether we are invoking the methods in the ancestor classes (super).OpenJIT_invokesuper_quick
, and jump to OpenJIT_invokesuper_quick()
.RESOLVE_NATIVE_OR_COMPILE
)mb->CompiledCode
.void OpenJIT_invokesuper_quick(...)
The JVM invokesuper_quick
bytecode is translated to call this function. Also, it might be called after a call to the OpenJIT_invokespecial
function. It performs essentially the same procedure as the JVM for this instruction.
void OpenJIT_invokestatic(...)
The JVM invokestatic
instruction (in case it does not require constant pool resolution) and the invokestatic_quick
instruction are translated to call this function, which performs the followings, allowing direct, fast calls to static methods:
CHECK_SELF_MODIFYING
)methodblock
of the caller.RESOLVE_NATIVE_OR_COMPILE
).mb->CompiledCode
mb->CompiledCode
.
We modify an instruction sequence consisting of three instructions, which involves a method invocation with constant pool resolution. The general sequence is as follows:
    call old_invoker     -> sethi %hi(mb),%g3
    sethi %g3,index<<10  -> call new_invoker
    illtrap              -> or %g3,%lo(mb),%g3 /* delay slot */
The following functions are subject to three-instruction modification:
void OpenJIT_invokespecial_resolve(...)
The JVM invokespecial
bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokespecial
function.
void OpenJIT_invokestatic_resolve(...)
The JVM invokestatic
bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokestatic
function.
void OpenJIT_invokevirtual_resolve(...)
The JVM invokevirtual
and invokevirtual_quick
bytecode are translated to call this function. After constant pool resolution,
we perform the process similar to the OpenJIT_invokestatic
, but the
call target of the self-modified code differs in the following way:
OpenJIT_invokestatic
to make a jump to mb->CompiledCode
. This allows direct jump to the target method.java.lang.Object
methods:OpenJIT_invokevirtualobject_quick
OpenJIT_invokevirtual_quick
Subsequently, the case analysis becomes unnecessary, speeding up the virtual call.
We modify the instruction sequence consisting of three instructions, which involves an access to class object with constant pool resolution. The general sequence is as follows. For this kind of sequences, for the modified target of the call instruction, the first argument of the call is the resolved address of the class object.
    call old_func        -> sethi %hi(mb),%o0
    sethi %o0,index<<10  -> call new_func
    illtrap              -> or %o0,%lo(mb),%o0 /* delay slot */
The following functions are subject to 3 instruction modification with %o0:
HObject *OpenJIT_new(int index)
The JVM new
bytecode is translated to call this function. It performs the following steps:
CHECK_SELF_MODIFYING
)index
right by 10 bitsRESOLVE_CLASS_CONST
)OpenJIT_new_quick
(PATCH_SET_O0_and_CALL
)OpenJIT_new_quick
)HArrayOfObject *OpenJIT_anewarray(int index, int size)
The JVM anewarray
bytecode is translated to call this function. It performs steps similar to OpenJIT_new
, self-modifies the target to OpenJIT_anewarray_quick
, and jumps to the OpenJIT_anewarray_quick
.
HObject *OpenJIT_multianewarray(int index, int dimensions, ...)
The JVM multianewarray
bytecode is translated to call this function. It performs steps similar to OpenJIT_new
, self-modifies the target to OpenJIT_multianewarray_quick
, and jumps to the OpenJIT_multianewarray_quick
.
void OpenJIT_checkcast(int index, JHandle *h)
The JVM checkcast
bytecode is translated to call this function. It performs steps similar to OpenJIT_new
, self-modifies the target to OpenJIT_checkcast_quick
, and jumps to the OpenJIT_checkcast_quick
.
bool_t OpenJIT_instanceof(int index, JHandle *h)
The JVM instanceof
bytecode is translated to call this function. It performs steps similar to OpenJIT_new
, self-modifies the target to OpenJIT_instanceof_quick
, and jumps to the OpenJIT_instanceof_quick
.
We modify the CALL instruction which takes an argument as an index to a field or a String constant. The general sequence is as follows:
    mov %eax,index  -> mov %eax,index
    call resolver   -> mov %eax,offset
int OpenJIT_resolve_getField(int index)
The JVM getfield
instruction is translated to call
this function, which performs the followings:
methodblock
of the caller. We obtain the address of the constant pool from this methodblock
structure. Also, we set the address value into the JVM dummy frame in case exception happens.RESOLVE_CLASS_CONST
)PATCH_CODE_FIELD_ACCESS
)
int OpenJIT_resolve_putField(int index)
The JVM putfield instruction is translated to call this function, which performs steps similar to OpenJIT_resolve_getField.
int OpenJIT_resolveStaticField(int index)
The JVM putstatic
instruction is translated to call this function.
methodblock
of the caller. We obtain the address of the constant pool from this methodblock
structure. Also, we set the address value into the JVM dummy frame in case exception happens.RESOLVE_CLASS_CONST
)PATCH_CODE_FIELD_ACCESS
)
u.static_address
of the fieldblock
.PATCH_CODE_FIELD_ACCESS
)
u.static_value
of the fieldblock
.void OpenJIT_getstatic(int index)
The JVM getstatic
instruction is translated to call
this function which performs the similar steps to OpenJIT_resolveStaticField
.
int OpenJIT_resolveString(int index)
For the JVM bytecodes ldc and ldc_w, if the constant pool type is CONSTANT_String, the bytecodes are translated to call this function, which performs the following steps:
1. Obtain the methodblock of the caller. We obtain the address of the constant pool from this methodblock structure. We also set the address value into the JVM dummy frame in case an exception occurs.
2. Resolve the constant pool entry (RESOLVE_CLASS_CONST).
3. Self-modify the calling code (PATCH_CODE_FIELD_ACCESS).
We modify only the CALL instruction. This is employed only when a new class is loaded on a method call. The instruction sequence sets the pointer to the method block into the %eax register and then CALLs the invoker function; only the CALL instruction is modified.
Before | After |
---|---|
mov %eax,mb | mov %eax,mb |
call old_invoker | call new_invoker |
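Retargeting a CALL on IA32 means recomputing its 32-bit displacement, which is relative to the address of the next instruction. The following sketch shows that arithmetic on a byte buffer; `retarget_call` and `call_target` are hypothetical names, addresses are simulated as plain 32-bit numbers, and no atomicity or instruction-cache handling is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_CALL_REL32 = 0xE8, CALL_LEN = 5 };

/* Decode the absolute target of the `call rel32` stored at `site`,
 * assuming the instruction lives at (simulated) address call_addr. */
uint32_t call_target(const uint8_t *site, uint32_t call_addr)
{
    uint32_t rel;

    assert(site != NULL && site[0] == OP_CALL_REL32);
    rel = (uint32_t)site[1] | ((uint32_t)site[2] << 8) |
          ((uint32_t)site[3] << 16) | ((uint32_t)site[4] << 24);
    /* The displacement is relative to the end of the CALL instruction. */
    return call_addr + CALL_LEN + rel;
}

/* Rewrite only the displacement so that the CALL reaches new_target.
 * The opcode byte and the preceding `mov` are left untouched. */
void retarget_call(uint8_t *site, uint32_t call_addr, uint32_t new_target)
{
    uint32_t rel;

    assert(site != NULL && site[0] == OP_CALL_REL32);
    rel = new_target - (call_addr + CALL_LEN); /* wraps modulo 2^32 */
    site[1] = (uint8_t)(rel & 0xFF);
    site[2] = (uint8_t)((rel >> 8) & 0xFF);
    site[3] = (uint8_t)((rel >> 16) & 0xFF);
    site[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

Unsigned wrap-around makes the same code work for both forward and backward targets.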
int OpenJIT_invokeinterface(...)
The JVM invokeinterface instruction is translated to call this function, which performs the following steps:
1. Obtain the methodblock of the caller.
2. Resolve the constant pool entry (RESOLVE_CLASS_CONST).
3. Self-modify the call target to OpenJIT_invokeinterface_quick.
4. Jump to OpenJIT_invokeinterface_quick.
The JVM invokeinterface_quick instruction is translated to invoke OpenJIT_invokeinterface_quick. As indicated in Section 4.2.2.1, this function is also invoked immediately after OpenJIT_invokeinterface. The %eax register must contain the predicted value of the method table. The procedure is almost the same as the one the JVM uses to process the invokeinterface_quick bytecode, but differs in that, when the method is found, we modify the instruction in order to set the predicted value. The instruction we modify is the one that sets the %eax register, preceding the call instruction that called this function. Because the predicted value is allowed to be old and stale, we do not lock the instruction when modifying it.
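The idea of a predicted value that is checked first and refreshed on a miss can be sketched as follows. This is a simplified model, not the OpenJIT or JDK data layout: `struct imethod`, `struct imethodtable`, and `lookup_with_guess` are hypothetical, and the guess is kept in a variable rather than in a patched instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified interface method table. */
struct imethod {
    int interface_id; /* identifies the interface method being sought */
    int slot;         /* method table slot to call */
};

struct imethodtable {
    size_t count;
    const struct imethod *entries;
};

/* Look up interface_id, trying the predicted index *guess first.
 * On a miss the table is searched and *guess is refreshed, which
 * stands for re-patching the instruction that loads %eax.
 * Returns the slot, or -1 if the method is not found. */
int lookup_with_guess(const struct imethodtable *t, int interface_id,
                      size_t *guess)
{
    size_t i;

    assert(t != NULL && guess != NULL);
    if (*guess < t->count && t->entries[*guess].interface_id == interface_id)
        return t->entries[*guess].slot; /* prediction hit */

    for (i = 0; i < t->count; i++) {
        if (t->entries[i].interface_id == interface_id) {
            *guess = i; /* a stale guess is simply overwritten */
            return t->entries[i].slot;
        }
    }
    return -1;
}
```

A stale guess only costs one extra search, which is consistent with the decision not to lock the instruction during the update.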
void OpenJIT_invokespecial(...)
The JVM invokespecial and invokenonvirtual_quick instructions are translated to call this function. The function is used effectively when the method does not change for the given call site, irrespective of the type of the object. It performs the following steps:
1. Obtain the methodblock of the caller.
2. Examine the methodblock structure, and find out whether we are invoking a method in an ancestor class (super).
3. If so, self-modify the call target to OpenJIT_invokesuper_quick, and jump to OpenJIT_invokesuper_quick().
4. Otherwise, resolve the native method or compile the method (RESOLVE_NATIVE_OR_COMPILE).
5. Self-modify the call target to mb->CompiledCode.
void OpenJIT_invokesuper_quick(...)
The JVM invokesuper_quick instruction is translated to call this function. It may also be called after a call to the OpenJIT_invokespecial function. It performs essentially the same procedure as the JVM does for this instruction.
void OpenJIT_invokestatic(...)
The JVM invokestatic instruction (in case it does not require constant pool resolution) and the invokestatic_quick instruction are translated to call this function, which performs the following steps, allowing direct, fast calls to static methods:
1. Obtain the methodblock of the caller.
2. Resolve the native method or compile the method (RESOLVE_NATIVE_OR_COMPILE).
3. Self-modify the call target to mb->CompiledCode.
4. Jump to mb->CompiledCode.
We modify the instruction sequence that performs a method invocation with constant pool resolution. The general sequence is as follows:
Before | After |
---|---|
push index | push mb |
call old_invoker | call new_invoker |
void OpenJIT_invokespecial_resolve(...)
The JVM invokespecial bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokespecial function.
void OpenJIT_invokestatic_resolve(...)
The JVM invokestatic bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokestatic function.
void OpenJIT_invokevirtual_resolve(...)
The JVM invokevirtual and invokevirtual_quick bytecodes are translated to call this function. After constant pool resolution, we perform a process similar to OpenJIT_invokestatic, but the call target of the self-modified code differs in the following way:
- In the same way as OpenJIT_invokestatic, to make a jump to mb->CompiledCode. This allows a direct jump to the target method.
- For java.lang.Object methods: OpenJIT_invokevirtualobject_quick.
- OpenJIT_invokevirtual_quick, but on X86 we inline the function body at the call site.
Subsequently, the case analysis becomes unnecessary, speeding up the virtual call.
We modify the instruction sequence that accesses a class object with constant pool resolution. The general sequence is as follows. In this kind of sequence, the first argument passed to the modified call target is the resolved address of the class object.
Before | After |
---|---|
push index | push class |
call old_invoker | call new_invoker |
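The resolve-once pattern behind this sequence can be modelled without touching machine code by treating the call site as a small record holding a call target and an operand. Everything in the sketch below is hypothetical (`struct call_site`, `new_resolve`, `new_quick`, the two-entry `constant_pool`); it only illustrates that the slow path runs once, replaces both the operand and the target, and then continues in the quick routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a resolved class object. */
struct klass {
    const char *name;
    int instance_size;
};

static const struct klass constant_pool[] = { {"A", 16}, {"B", 24} };
static int resolve_calls; /* counts how often the slow path runs */

struct call_site;
typedef int (*invoker_fn)(struct call_site *site);

/* Models "push operand; call target" at one call site. */
struct call_site {
    invoker_fn target;
    union {
        size_t index;            /* before patching: constant pool index */
        const struct klass *cls; /* after patching: resolved class */
    } operand;
};

/* Quick routine: expects an already resolved class. */
int new_quick(struct call_site *site)
{
    return site->operand.cls->instance_size;
}

/* Slow routine: resolve, patch the call site, then continue in new_quick. */
int new_resolve(struct call_site *site)
{
    size_t index = site->operand.index;

    assert(index < sizeof constant_pool / sizeof constant_pool[0]);
    resolve_calls++;
    site->operand.cls = &constant_pool[index]; /* push index -> push class */
    site->target = new_quick;                  /* call old -> call new */
    return new_quick(site);
}
```

After the first call through `site.target`, later calls reach `new_quick` directly and `resolve_calls` stays at 1.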
HObject *OpenJIT_new(int index)
The JVM new bytecode is translated to call this function. It performs the following steps:
1. Obtain the methodblock of the caller.
2. Resolve the constant pool entry (RESOLVE_CLASS_CONST).
3. Self-modify the call target to OpenJIT_new_quick (PATCH_CODE_PUSH_AND_CALL).
4. Jump to the quick function (OpenJIT_new_quick).
HArrayOfObject *OpenJIT_anewarray(int index, int size)
The JVM anewarray bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the target to OpenJIT_anewarray_quick, and jumps to OpenJIT_anewarray_quick.
HObject *OpenJIT_multianewarray(int index, int dimensions, ...)
The JVM multianewarray bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the target to OpenJIT_multianewarray_quick, and jumps to OpenJIT_multianewarray_quick.
void OpenJIT_checkcast(int index, JHandle *h)
The JVM checkcast bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the target to OpenJIT_checkcast_quick, and jumps to OpenJIT_checkcast_quick.
bool_t OpenJIT_instanceof(int index, JHandle *h)
The JVM instanceof bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the target to OpenJIT_instanceof_quick, and jumps to OpenJIT_instanceof_quick.
Since the JVM is inherently multithreaded, care is required to update a sequence of multiple instructions atomically. If the self-modification is not atomic, other threads might execute a half-written instruction sequence, resulting in a critical error.
The current JVM supports two types of thread systems: green threads and native threads. Green threads only work on uniprocessor machines, and context switching occurs only at fixed, safe locations, so such problems do not occur. With native threads, however, multiple threads might be executing on different processors, so a partially rewritten instruction sequence could be executed. Thus, it is extremely important to guarantee the atomicity of self-modification in an efficient manner. OpenJIT implements such an atomic update in the following way:
CHECK_SELF_MODIFYING()
This macro checks whether the call instruction to the function that uses the macro has been modified. If it has been modified, control returns to the modified instruction at the call site, which is then re-executed.
PATCH_CODE(CODE, OFFSET)
This macro modifies the instruction located at offset OFFSET from the call instruction that called the function using this macro. Subsequently, the instruction cache is flushed. For example, PATCH_CODE(code,4) modifies the delay slot of the call site that called the function.
CHECK_PATCHING_CODE()
This macro is intended to check whether the code is being patched by another thread, but it is not implemented yet.
PATCH_CODE_FIELD_ACCESS(ADDR_OR_OFFSET)
This macro modifies the instructions that access fields or String constants so that ADDR_OR_OFFSET is placed into the register %eax.
PATCH_CODE_CALL(FUNC)
This macro, mainly used in method resolution, modifies the call instruction to call FUNC.
PATCH_CODE_PUSH_AND_CALL(FUNC, ARG0)
This macro, mainly used in class resolution, modifies the call instruction to call FUNC with register %eax set to ARG0.
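As an illustration of the kind of test CHECK_SELF_MODIFYING() describes, the sketch below inspects the five bytes before a return address and reports whether they still encode a CALL to the routine that is currently running. It assumes the IA32 `call rel32` encoding (opcode 0xE8) and simulated addresses; `call_site_was_modified` is a hypothetical name, and the actual macro may be implemented quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_CALL_REL32 = 0xE8, CALL_LEN = 5 };

/* code      : bytes of the compiled method
 * code_base : simulated address of code[0]
 * ret_addr  : return address of the running runtime routine
 * self_addr : simulated address of that runtime routine
 * Returns 1 if the call site no longer calls self_addr, else 0. */
int call_site_was_modified(const uint8_t *code, uint32_t code_base,
                           uint32_t ret_addr, uint32_t self_addr)
{
    const uint8_t *site;
    uint32_t rel;

    assert(code != NULL && ret_addr >= code_base + CALL_LEN);
    site = code + (ret_addr - CALL_LEN - code_base);
    if (site[0] != OP_CALL_REL32)
        return 1; /* the CALL itself has been replaced */

    rel = (uint32_t)site[1] | ((uint32_t)site[2] << 8) |
          ((uint32_t)site[3] << 16) | ((uint32_t)site[4] << 24);
    /* The displacement is relative to the return address. */
    return (ret_addr + rel) != self_addr;
}
```

When the function reports a modification, the caller would resume at `ret_addr - CALL_LEN` so that the patched instruction is executed instead.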
In this section, we describe one example of an atomic modification of multiple instructions. We assume that the first instruction of the sequence to be modified is a CALL instruction, followed by NOP instructions. The function called by the CALL instruction modifies the instruction sequence.
We illustrate this scheme below:
Rewriting a sequence of multiple instructions
label | Step 1 | Step 2 | Step 3 | Step 4 |
---|---|---|---|---|
 | ... | ... | ... | ... |
 | Inst0 | Inst0 | Inst0 | Inst0 |
modify: | CALL A | Branch modify | Branch modify | Inst1 |
 | NOP | NOP | Inst2 | Inst2 |
 | NOP | NOP | Inst3 | Inst3 |
 | NOP | NOP | Inst4 | Inst4 |
 | Inst5 | Inst5 | Inst5 | Inst5 |
 | ... | ... | ... | ... |
One potential problem with this scheme arises when multiple threads execute the CALL A instruction. However, since all of these threads modify the instruction sequence (Inst1, ..., Inst4) identically, this does not cause a problem (it is a little more subtle than this***).
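The ordering of the four steps can be sketched with C11 atomics on an array of fixed-width instruction words. This is a model only: the opcode values (`CALL_A`, `BRANCH_SELF`, `NOP`) and the function `rewrite_sequence` are hypothetical, real code would also have to flush the instruction cache, and variable-length IA32 instructions need more care than this word-per-instruction model suggests.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-word "instructions" for the model. */
enum { CALL_A = 1, BRANCH_SELF = 2, NOP = 3, SEQ_LEN = 4 };

/* Rewrite site[0..3] (CALL A, NOP, NOP, NOP) into new_insts[0..3]
 * (Inst1..Inst4) so that a concurrent reader of site[0] never runs
 * into a half-written tail. */
void rewrite_sequence(_Atomic uint32_t *site, const uint32_t new_insts[SEQ_LEN])
{
    int i;

    assert(site != NULL && new_insts != NULL);

    /* Step 2: replace CALL A by a branch to itself, so that other
     * threads arriving here spin on the first word. */
    atomic_store(&site[0], BRANCH_SELF);

    /* Step 3: fill in Inst2..Inst4 while the first word is parked. */
    for (i = 1; i < SEQ_LEN; i++)
        atomic_store(&site[i], new_insts[i]);

    /* Step 4: publish Inst1 last; only now can the tail be reached. */
    atomic_store(&site[0], new_insts[0]);
}
```

The default sequentially consistent stores keep the three phases in program order for every observer, which is the property the scheme relies on.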
For efficient execution, the OpenJIT backend generally does not check for exceptions explicitly, except in a few instances where explicit runtime checks are required. Instead, exceptions are detected and processed using the Unix signal mechanism.
Figure 2 indicates how the compiled native code executes. We must check for exception occurrence whenever control transfers between the compiled native code and other native code, such as the JVM interpreter, the runtime routines, and native methods. For example, in the figure, we must check for an exception at each point in the control flow marked by a star. On the other hand, for exceptions occurring within the compiled native code, we generally rely on Unix signals and do not explicitly check for exception occurrence.
Through the Java Native Code API, we arrange for the following function to be called when a Unix signal occurs:
static void OpenJIT_SignalHandler(int sig, siginfo_t *info, ucontext_t *uc)
Below are the exceptions that may occur at runtime. Any other signal does not correspond to a JVM exception, but rather indicates a compiler or JVM bug.
Signal | Purpose |
---|---|
SIGFPE | zero division |
SIGSEGV | null pointer, stack overflow |
SIGILL | array index out of bounds |
Within the OpenJIT_SignalHandler function, in order to verify that the signal was indeed generated by a Java exception and not by a compiler or JVM bug, we check the instruction that caused the exception and its operand address. For each type of exception, we perform the check as follows and, by calling the setcontext() system call, set up the calling frame so that the instruction causing the exception behaves as if it had called the exception generation function.
Zero division: The signal SIGFPE is raised, and the exception code info->si_code is either FPE_INTOVF or FPE_INTDIV. In that case, the signal handler sets up the context so that it appears as if the following function had been called from the instruction that caused the exception.
void catchZeroDivide(unsigned char *pc)
Null pointer: The signal SIGSEGV is raised, and the exception code info->si_code is SEGV_MAPERR. In addition, the base register of the instruction that caused the exception contains 0. Here is the exception generation function:
void catchNullPointer(unsigned char *pc)
Stack overflow: The signal SIGSEGV is raised, and the exception code info->si_code is SEGV_MAPERR. In addition, the instruction that caused the exception is ld [%sp + constant]. Here is the exception generation function:
void catchStackOverflow(unsigned char *pc)
Array index out of bounds: The signal SIGILL is raised, the instruction that caused the exception is a trap instruction, and the trap code is ST_RANGE_CHECK. Here is the exception generation function:
void catchArrayIndexOutOfBounds(unsigned char *pc, int index)
This function is slightly different in that the index of the array must be given as the second parameter. For this reason, before we perform setcontext, we must read the value of the register that was used as an operand to calculate the out-of-bounds condition.
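The case analysis above can be summarized as a pure classification function. The sketch below uses only the POSIX signal numbers and si_code constants named in the text; `struct fault_info`, `enum java_exc`, and `classify_signal` are hypothetical, and the three flags stand for facts that a real handler would have to decode from the faulting instruction and the saved registers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

enum java_exc {
    EXC_NONE, /* not a Java exception: compiler or JVM bug */
    EXC_ZERO_DIVIDE,
    EXC_NULL_POINTER,
    EXC_STACK_OVERFLOW,
    EXC_ARRAY_INDEX
};

/* Hypothetical summary of the decoded faulting instruction. */
struct fault_info {
    int base_register_is_zero; /* memory access through a zero base register */
    int is_stack_probe;        /* load relative to the stack pointer */
    int is_range_check_trap;   /* trap instruction with the range-check code */
};

enum java_exc classify_signal(int sig, int si_code, const struct fault_info *f)
{
    assert(f != NULL);
    switch (sig) {
    case SIGFPE:
        if (si_code == FPE_INTDIV || si_code == FPE_INTOVF)
            return EXC_ZERO_DIVIDE;
        break;
    case SIGSEGV:
        if (si_code == SEGV_MAPERR) {
            if (f->base_register_is_zero)
                return EXC_NULL_POINTER;
            if (f->is_stack_probe)
                return EXC_STACK_OVERFLOW;
        }
        break;
    case SIGILL:
        if (f->is_range_check_trap)
            return EXC_ARRAY_INDEX;
        break;
    default:
        break;
    }
    return EXC_NONE;
}
```

Keeping the decision separate from the context manipulation makes it easy to check that unexpected signals fall through to `EXC_NONE`.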
FIND_EXCEPTION_FRAME(pc, ee)
This is a macro used by the exception generation functions described above in order to identify the method that caused the exception. It performs the following steps:
1. Set the methodblock structure into the dummy JVM frame; see fillInStackTrace (described later in Section 5.4).
2. Find the exception handler and jump to it (handle_exception).
bool_t handle_exception (ExecEnv *execEnv)
This function traces the compiled native code stack, finds the corresponding exception handler, and jumps to it. As with the C longjmp(), it makes a jump that leapfrogs the nested function calls. Because SPARC has register windows, they must be restored appropriately during the leapfrogging. Here are the steps:
    while (1) {
        /* delete the stackframe of the runtime routine */
        while (%i7 (address of the caller) is within the runtime routine) {
            restore    /* recover the register window */
        }
        if (%i7 (return address) is not compiled native code) {
            /* Return to the JVM interpreter loop */
            return FALSE;
        }
        /* Set the lastpc. Needed when returning to the interpreter loop? */
        ee->current_frame->lastpc = %i7
        Extract the pointer to the methodblock structure from %fp, and set it to the variable mb
        /* Find the exception handler for the caught exception within mb */
        new_pc = JITProcedureFindThrowTag(ee, mb, ee->exception.exc, %i7)
        if (new_pc != 0) {
            /* An exception handler is found! */
            exceptionClear(ee)    /* Clear the exception flag */
            /*
             * The exception handler for the compiled native code assumes
             * that the pointer to the object that caused the exception
             * is in %i7
             */
            %i7 = ee->exception.exc
            restore        /* restore the register window */
            jump new_pc    /* Jump to the exception handler */
        }
        /* Exception handler is not found */
        if (mb is a synchronized method) {
            /* unlock the monitor lock */
            /* The monitor object is stored in %fp[-1] */
            monitorExit(%fp[-1])
        }
        restore
    }
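The core of this loop, searching each frame's handler table for the current pc and moving to the caller when nothing matches, can be modelled in portable C. The sketch is an assumption-laden simplification: `struct handler_entry`, `struct native_frame`, `find_handler`, and `unwind` are hypothetical, exception types are compared by plain equality rather than by a subclass test, and register windows and monitor unlocking are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CATCH_ANY 0 /* hypothetical marker for a catch-all handler */

struct handler_entry {
    uint32_t start_pc;   /* inclusive */
    uint32_t end_pc;     /* exclusive */
    uint32_t handler_pc; /* where to continue */
    int catch_type;      /* CATCH_ANY or an exception type id */
};

struct native_frame {
    const struct handler_entry *handlers;
    size_t handler_count;
    uint32_t pc;                       /* pc at which this frame is suspended */
    const struct native_frame *caller; /* NULL at the bottom of the stack */
};

/* Return the handler pc covering f->pc for exc_type, or 0 if none. */
uint32_t find_handler(const struct native_frame *f, int exc_type)
{
    size_t i;

    for (i = 0; i < f->handler_count; i++) {
        const struct handler_entry *h = &f->handlers[i];
        if (f->pc >= h->start_pc && f->pc < h->end_pc &&
            (h->catch_type == CATCH_ANY || h->catch_type == exc_type))
            return h->handler_pc;
    }
    return 0;
}

/* Walk outward from `top`; on success store the handling frame in
 * *found and return the handler pc, otherwise return 0. */
uint32_t unwind(const struct native_frame *top, int exc_type,
                const struct native_frame **found)
{
    const struct native_frame *f;

    assert(found != NULL);
    for (f = top; f != NULL; f = f->caller) {
        uint32_t new_pc = find_handler(f, exc_type);
        if (new_pc != 0) {
            *found = f;
            return new_pc;
        }
    }
    *found = NULL;
    return 0;
}
```

A return value of 0 corresponds to the case in which control has to go back to the interpreter loop.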
fillInStackTrace
The JDK calls the SignalError function when an exception occurs. This function in turn calls fillInStackTrace(). In addition, the java.lang.Throwable class has a method fillInStackTrace, which allows a user program to obtain the status of the current Java method and the trace of the stack frames.
The code generated by the OpenJIT compiler does not create a Java frame when compiled native code is called from other native code. As a result, the JVM cannot trace the stack frames. To solve this problem, the JDK provides the following API:
JavaFrame *JITCompiledFramePrev(JavaFrame *frame, JavaFrame *buf)
Besides fillInStackTrace, this function is also used wherever a trace of the stack frames is needed. The JVM basically uses the following algorithm to walk the stack and obtain the trace:
    {
        JavaFrame *frame, buf;
        frame = ExecEnv->current_frame;
        while (frame) {
            if (frame->current_method->fb.access & ACC_MACHINE_COMPILED) {
                /* compiled code: ask the JIT for the previous frame */
                frame = CompiledFramePrev(frame, &buf);
            } else {
                /* interpreted code: follow the ordinary frame link */
                frame = frame->prev;
            }
        }
    }
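The role of the `buf` argument in this walk can be seen in a small self-contained model. It assumes, purely for illustration, that one JVM-visible frame may stand for several nested compiled calls and that the JIT reports each of them by filling in the caller-supplied buffer; `struct frame`, `compiled_frame_prev`, `count_frames`, and the flag value are hypothetical and do not reflect the JDK structures.

```c
#include <assert.h>
#include <stddef.h>

#define ACC_MACHINE_COMPILED 0x4000u /* placeholder flag value for this sketch */

/* Hypothetical, simplified frame. */
struct frame {
    unsigned access;    /* ACC_MACHINE_COMPILED if the method is compiled */
    int nested_calls;   /* compiled calls represented by this frame */
    struct frame *prev; /* previous JVM-visible frame */
};

/* Model of the JIT callback: report the previous frame, synthesizing
 * it in *buf while nested compiled calls remain. */
struct frame *compiled_frame_prev(struct frame *frame, struct frame *buf)
{
    assert(frame != NULL && buf != NULL);
    if (frame->nested_calls > 1) {
        *buf = *frame; /* frame may already be buf; an exact overlap is fine */
        buf->nested_calls--;
        return buf;
    }
    return frame->prev;
}

/* The walking loop from the text, counting the frames it visits. */
int count_frames(struct frame *top)
{
    struct frame *frame = top, buf;
    int n = 0;

    while (frame) {
        n++;
        if (frame->access & ACC_MACHINE_COMPILED)
            frame = compiled_frame_prev(frame, &buf);
        else
            frame = frame->prev;
    }
    return n;
}
```

The original frame is never modified, since every synthesized frame lives in the walker's own buffer.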
Thus, before JITCompiledFramePrev is called, the current_frame of ExecEnv (the execution environment structure) must hold the Java frame of the compiled native code. For this purpose, whenever an exception might occur when an OpenJIT runtime routine calls a JVM function, we must also set the JVM frame in ExecEnv->current_frame.
For OpenJIT, we judged that generating a JVM frame each time this happens is too expensive. Instead, we generate a dummy JVM frame only when control transfers from the compiled native code into the internals of the JVM, and set it in ExecEnv->current_frame. When an OpenJIT runtime routine calls a JVM function, we merely set the current_method of the dummy frame.
The table below lists the other OpenJIT runtime functions that are called from the compiled native code generated by the OpenJIT compiler. The compiled native code may also call C library functions or JVM functions; the Library column indicates where each called function is defined.
JVM Instruction | Runtime Function | Library |
---|---|---|
anewarray_quick | HArrayOfObject *OpenJIT_anewarray_quick(ClassClass *array_cb, int size) | OpenJIT |
athrow | void OpenJIT_athrow(HJava_lang_Object *obj) | OpenJIT |
checkcast_quick | void OpenJIT_checkcast_quick(ClassClass *cb, JHandle *h) | OpenJIT |
d2l | int64_t __dtoll(double d) | C |
dcmpg | int OpenJIT_dcmpg(stack_item *p) | OpenJIT |
dcmpl | int OpenJIT_dcmpl(stack_item *p) | OpenJIT |
drem | double OpenJIT_drem(stack_item *p) | OpenJIT |
f2l | int64_t __ftoll(float f) | C |
fcmpg | bool_t OpenJIT_fcmpg(float *p) | OpenJIT |
fcmpl | bool_t OpenJIT_fcmpl(float *p) | OpenJIT |
frem | float OpenJIT_frem(float *args) | OpenJIT |
instanceof | bool_t OpenJIT_instanceof(int index, JHandle *h) | OpenJIT |
l2d | double OpenJIT_l2d(signed hi, unsigned lo) | OpenJIT |
l2f | float OpenJIT_l2f(signed hi, unsigned lo) | OpenJIT |
lcmp | bool_t OpenJIT_lcmp(long long x, long long y) | OpenJIT |
ldiv | int64_t __div64(int64_t x, int64_t y) | C |
lmul | int64_t __mul64(int64_t x, int64_t y) | C |
lrem | int64_t __rem64(int64_t x, int64_t y) | C |
lshl | uint64_t longOpenJIT_lshl(signed hi, unsigned lo, unsigned b) | OpenJIT |
lshr | uint64_t longOpenJIT_lshr(signed hi, unsigned lo, unsigned b) | OpenJIT |
lushr | uint64_t longOpenJIT_lushr(signed hi, unsigned lo, unsigned b) | OpenJIT |
monitorenter | void monitorEnter(unsigned int key) | JDK |
monitorexit | void monitorExit(unsigned int key) | JDK |
multianewarray_quick | HObject *OpenJIT_multianewarray_quick(ClassClass *array_cb, int dimensions, stack_item *optop) | OpenJIT |
new_quick | HObject *OpenJIT_new_quick(ClassClass *cb) | OpenJIT |
newarray | JHandle *OpenJIT_newarray(int type, int size) | OpenJIT |
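To make the semantics behind a few of these helpers concrete, here is a sketch of the behaviour that the JVM specification requires for lcmp, dcmpl, dcmpg, and lshl. The function names and signatures (`jvm_lcmp`, `jvm_dcmpl`, `jvm_dcmpg`, `jvm_lshl`) are hypothetical and differ from the OpenJIT entries in the table, which take stack pointers or split hi/lo words; only the computed results are meant to correspond.

```c
#include <math.h>
#include <stdint.h>

/* lcmp: -1, 0, or 1 according to the signed 64-bit comparison. */
int jvm_lcmp(int64_t x, int64_t y)
{
    return (x > y) - (x < y);
}

/* dcmpl: like a three-way compare, but yields -1 if either operand is NaN. */
int jvm_dcmpl(double x, double y)
{
    if (isnan(x) || isnan(y))
        return -1;
    return (x > y) - (x < y);
}

/* dcmpg: identical except that NaN yields 1. */
int jvm_dcmpg(double x, double y)
{
    if (isnan(x) || isnan(y))
        return 1;
    return (x > y) - (x < y);
}

/* lshl on a value passed as two 32-bit halves; the JVM uses only the
 * low six bits of the shift count. */
uint64_t jvm_lshl(int32_t hi, uint32_t lo, unsigned b)
{
    uint64_t v = ((uint64_t)(uint32_t)hi << 32) | lo;
    return v << (b & 63u);
}
```

The two floating-point comparisons differ only in their NaN result, which is what lets a compiler pick the variant that makes a following branch fail safely.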
We have covered the runtime structure of the OpenJIT backend system. For details of how the JVM instructions are translated and how the runtime functions are called, readers are referred to the source file org/OpenJIT/Sparc.java. The layout of the stack frame of the compiled native code is described in the companion document, OpenJIT Backend Compiler Internal Specification.