OpenJIT

OpenJIT Backend Compiler (Runtime) Internal Specification version 1.1.7

Newer specification draft (version 1.1.15) -- not completed.

Kouya Shimura
Fujitsu Laboratories
kouya@flab.fujitsu.co.jp

Translation by Satoshi Matsuoka
Tokyo Institute of Technology
matsu@is.titech.ac.jp

Friday, October 29, 1999

1. Introduction

This internal specification covers the runtime portion of the OpenJIT backend compiler for the SPARC V8 CPU. We outline the structure of this specification document below. The OpenJIT backend compiler largely consists of a compiler part and a runtime part; this document covers the runtime part.

2. Java Native Code API

The Java JDK has several APIs for JIT compilers. OpenJIT is plugged into a given JDK using these APIs. We first explain the API, and then describe its implementation in OpenJIT.

2.1 JIT Compiler Initialization

When the JDK starts up, it first reads the Java system class (java.lang.System), and then loads the java.lang.Compiler class. java.lang.Compiler is defined as follows:

public final class Compiler {
    private Compiler() {} // don't make instances
    private static native void initialize();
    static {
        try {
            String library = System.getProperty("java.compiler");
            if (library != null) {
                System.loadLibrary(library);
                initialize();
            }
        } catch (Throwable e) {
        }
    }
...
}

When this class is loaded, the class initializer is executed, and looks for the property "java.compiler". On JDK 1.1.x, the user either specifies the compiler via a command-line option "-Djava.compiler=XXX", or sets the environment variable JAVA_COMPILER, allowing the system to dynamically link the library via System.loadLibrary. Then, the native method initialize() is invoked. This method is defined in C as follows:

void java_lang_Compiler_initialize (Hjava_lang_Compiler *this)
{
    void *address = (void *)sysDynamicLink("java_lang_Compiler_start");
    if (address != 0) {
        (*(void (*) (void **)) address) (CompiledCodeLinkVector);
    }
    compilerInitialized = TRUE;
}

Because the OpenJIT native library defines the function java_lang_Compiler_start(), the JVM invokes it at this point, allowing proper initialization of OpenJIT. The argument CompiledCodeLinkVector passes the values necessary for JIT compilation; it is essentially a vector of hook functions, and by modifying the vector appropriately, the JIT compiler arranges to be invoked by the JDK when needed. Some essential hook functions are described in Table 1.

Table 1: Some essential hook functions

Function name	Feature
`InitializeForCompiler`	On class load
`invokeCompiledMethod`	On method invocation
`CompiledCodeSignalHandler`	On signal occurrence
`CompilerFreeClass`	When class is no longer needed
`CompilerCompileClass`	Compilation of specified class
`CompilercompileClasses`	Compilation of specified classes
`CompilerEnable`	Enable Compiler
`CompilerDisable`	Disable Compiler
`ReadInCompiledCode`	Load pre-compiled code
`PCinCompiledCode`	For exception handling
`CompiledCodePC`	For exception handling
`CompiledFramePrev`	For exception handling
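
The hook installation described in this section can be pictured with a small sketch. It is not the JDK's actual layout: the slot numbers, the `demo_` names, and the assumption that each vector entry holds the address of a JVM-side hook variable are all illustrative choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot numbers; the real CompiledCodeLinkVector layout is
   defined by the JDK and is not reproduced here. */
enum { SLOT_INITIALIZE_FOR_COMPILER, SLOT_COMPILER_FREE_CLASS, SLOT_COUNT };

typedef void (*class_hook_t)(void *cb);

int classes_seen;    /* counts class-load notifications */
int classes_freed;   /* counts class-free notifications */

void demo_initialize_for_compiler(void *cb) { (void)cb; classes_seen++; }
void demo_compiler_free_class(void *cb)     { (void)cb; classes_freed++; }

/* Stand-ins for the JVM-side hook variables that the vector points at. */
class_hook_t jvm_on_class_load;
class_hook_t jvm_on_class_free;

/* Plays the role of java_lang_Compiler_start(): in this sketch each vector
   entry is the address of a hook variable, and the JIT stores its own
   function through it. */
void demo_compiler_start(void **vector)
{
    assert(vector != NULL);
    *(class_hook_t *)vector[SLOT_INITIALIZE_FOR_COMPILER] = demo_initialize_for_compiler;
    *(class_hook_t *)vector[SLOT_COMPILER_FREE_CLASS]     = demo_compiler_free_class;
}

/* Builds a vector the way the JVM side would and runs the start routine. */
void demo_install_hooks(void)
{
    void *vector[SLOT_COUNT];
    vector[SLOT_INITIALIZE_FOR_COMPILER] = &jvm_on_class_load;
    vector[SLOT_COMPILER_FREE_CLASS]     = &jvm_on_class_free;
    demo_compiler_start(vector);
}
```

After `demo_install_hooks()` runs, a call through `jvm_on_class_load` reaches the JIT's function, which is the effect the hook table is meant to achieve.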

2.2 Overview of JIT compilation process

Figure 1: Flow of JIT compilation

Figure 1 gives an overview of the JIT compilation process. When Java is invoked with a JIT compiler specified, the native portion of the JIT compiler is dynamically linked into the JVM. Once the hook functions are set appropriately as described above, the class loader calls the InitializeForCompiler() function on every class load.

InitializeForCompiler, in turn, modifies the invoker of every method defined in the class (except for native and abstract methods) so that the JIT compiler is invoked to compile the bytecode into native code dynamically. Once the method is compiled, the method invocation flag is modified so that the compiled code is executed directly, and control passes to the generated native code. Thereafter, the JVM invokes the compiled native code directly by virtue of the flag.

When a compiled native code calls another compiled native code, the CompiledCode field of the methodblock structure is read and the control is transferred directly via a jump instruction.

Compiled native methods accumulate in memory. The space is reclaimed when the JDK actually deletes the class from memory and calls the hook function CompilerFreeClass, which in turn frees the memory for the native code as well.

This concludes the overview of the JIT compilation process. As we can see, the unit of dynamic compilation is the individual method, and compilation happens the first time the method is invoked. By all means one can have adaptive strategies that compile on the nth invocation, etc.
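
The invoker replacement and the "compile on the nth invocation" idea can be sketched as follows. This is a toy model, not OpenJIT code: `struct demo_method`, its fields, and the `demo_` functions are hypothetical, and "compilation" is simulated by swapping a function pointer.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_method;
typedef int (*demo_invoker_t)(struct demo_method *m, int arg);

/* Toy stand-in for the methodblock structure; all fields are hypothetical. */
struct demo_method {
    demo_invoker_t invoker;   /* what the "JVM" calls on every invocation */
    int calls;                /* invocations seen while still interpreted */
    int threshold;            /* compile on this invocation (1 = first call) */
    bool compiled;
};

/* Pretend interpreter and pretend native code compute the same result. */
int demo_interpret(struct demo_method *m, int arg) { (void)m; return arg + arg; }
int demo_native(struct demo_method *m, int arg)    { (void)m; return arg * 2; }

/* Initial invoker: counts calls, "compiles" at the threshold, and then
   replaces itself so later calls go straight to the native code. */
int demo_compiling_invoker(struct demo_method *m, int arg)
{
    assert(m->threshold >= 1);
    if (++m->calls < m->threshold)
        return demo_interpret(m, arg);
    m->compiled = true;
    m->invoker = demo_native;
    return m->invoker(m, arg);
}
```

With `threshold` set to 1 the sketch behaves like the default described above (compile on first invocation); larger values model an adaptive strategy.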

2.3 JIT Compiler Initialization (`java_lang_Compiler_start`)

In OpenJIT, the initialization routine java_lang_Compiler_start performs the following tasks:

Initialization of the compiler lock variable:
In a multithreaded environment one must prevent OpenJIT from compiling the same method with multiple threads at the same time. We thus use a lock variable (_compile_lock), which is initialized here.
Load org.OpenJIT.Sparc class
In order to obtain the pointer to the OpenJIT class structure (ClassClass structure), the OpenJIT classes are loaded. OpenJIT instantiates the OpenJIT compiler classes before it starts compilation of a Java method; the pointer is required for this (*** translator's note: the meaning here is not quite clear)
Loading of the org.OpenJIT.ExceptionHandler class
Similarly, in order to instantiate at compile time, we obtain the pointer to org.OpenJIT.ExceptionHandler class by loading it.
Obtain pointer to the OpenJIT_compile() function
We next obtain the pointer to the upcall entry point for OpenJIT Java classes by calling the ResolveClassConstant() function, obtaining the methodblock structure for OpenJIT_compile().
Initialization of the org.OpenJIT.Compile class
We initialize several internal native variables such as debugging info variables, floating point constants, and class variables for entry point (address) of runtime routines (***).
Initialization of the CompiledCodeLinkVector
We set the essential hook functions as described in 2.1.
Resetting of method invocation functions of the classes already loaded
Although this is not a problem for standard JIT compilers that are present from the point of invocation of the JDK***, for OpenJIT several system classes are already loaded when this initialization function is called. If left alone, such classes are never compiled when we merely set the hook functions in the CompiledCodeLinkVector. Instead, we must re-initialize the classes already loaded by resetting their method invocation functions etc., so that they can be compiled as well.

2.4 Class Initialization (`OpenJIT_InitializeForCompiler`)

static void OpenJIT_InitializeForCompiler(ClassClass *cb)

Given a class, set the invoker functions of all the methods except for the native methods and the abstract methods so that they are dynamically compiled on invocation. Also, in order to allow calls to a compiled method from another compiled method, the compilation must set the CompiledCode field of the methodblock structure.

Also, we set the CompiledCodeFlags of the methodblock structure depending on the type of the return value of the method. This value is used in the dispatchJVM() function when the compiled native code calls the JVM for interpretive execution.

Furthermore, by setting a command-line option, we can restrict the classes and methods to be compiled by calling the function match_compile_methods. We pass the methodblock structure to the function, and if the return value is TRUE then the method is subject to compilation, whereas if FALSE then the method is not to be compiled.

2.5 Signal Handler (`OpenJIT_SignalHandler`)

static void OpenJIT_SignalHandler(int sig, siginfo_t *info, ucontext_t *uc)

Called when a signal occurs. OpenJIT generates Java exceptions with Unix signals. This function looks at the signal for Java exceptions, and routes control to an appropriate handler. If it receives a signal that is irrelevant to Java exceptions, it simply returns. For details of the exception handling please refer to Section 5.

2.6 Freeing of Class (`OpenJIT_CompilerFreeClass`)

static void OpenJIT_CompilerFreeClass(ClassClass *cb)

When JDK decides that a certain class is no longer necessary, this function is called, freeing the space occupied by the compiled native code.

2.7 Class Compilation (`OpenJIT_CompilerCompileClass`)

static bool_t OpenJIT_CompilerCompileClass(ClassClass *cb)

Called from a Java user program with the following: java.lang.Compiler.compileClass(clazz). This forces compilation of all methods in the given class that have not yet been compiled.

2.8 Enabling of the JIT compiler (`OpenJIT_CompilerEnable`)

static void OpenJIT_CompilerEnable()

Called from a Java user program with the following: java.lang.Compiler.enable(). Methods that are invoked after this call will be compiled.

2.9 Disabling of the JIT compiler (`OpenJIT_CompilerDisable`)

static void OpenJIT_CompilerDisable()

Called from a Java user program with the following: java.lang.Compiler.disable(). Methods invoked after this call will not be compiled. Note that, for the default OpenJIT implementation where each method is compiled on its first invocation, the caller of this method will already have been compiled and thus will execute as compiled native code.

2.10 Compiled code execution test (`OpenJIT_PCinCompiledCode`)

static bool_t OpenJIT_PCinCompiledCode(caddr_t *pc, struct methodblock *mb)

Determines whether the current execution is within a given method, using the program counter and the methodblock structure. This function is used by the JDK when an exception occurs and it traces and displays the stack trace of execution. If the method is being executed, it returns TRUE, otherwise FALSE.
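
One plausible way to implement such a test is an address range check. The sketch below assumes the runtime records the start address and length of each method's native code; the source does not say how OpenJIT stores this, so `struct demo_code_range` and its fields are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of where a method's native code lives. */
struct demo_code_range {
    const unsigned char *start;  /* entry address of the compiled code */
    size_t length;               /* size of the compiled code in bytes */
};

/* True when pc lies in [start, start + length). */
bool demo_pc_in_compiled_code(const unsigned char *pc, const struct demo_code_range *r)
{
    assert(r != NULL);
    if (r->start == NULL)
        return false;            /* method has no compiled code yet */
    uintptr_t p = (uintptr_t)pc;
    uintptr_t lo = (uintptr_t)r->start;
    return p >= lo && p - lo < r->length;
}
```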

2.11 The Value of the Program Counter (`OpenJIT_CompiledCodePC`)

static unsigned char *OpenJIT_CompiledCodePC(JavaFrame *frame, struct methodblock *mb)

Returns the value of the program counter given the frame and the methodblock. This function is used by the JDK when an exception occurs and it traces and displays the stacktrace of execution.

NOTE: For OpenJIT, for simplification this function does not return the correct value of the PC, but rather returns the entry address of the compiled native code of given methodblock. Everything seems to work fine under this simplification for JDK 1.1.x and JDK 1.2, but other JDKs might break this assumption.

2.12 Generating Java Stack Frame (`OpenJIT_CompiledFramePrev`)

static JavaFrame *OpenJIT_CompiledFramePrev(JavaFrame *frame, JavaFrame *buf)

Converts the native compiled code stackframe into Java stackframe used by the JDK (JavaFrame). This function is used by the JDK when an exception occurs and it traces and displays the stacktrace of execution.

The code generated by OpenJIT follows the C stackframe convention, and this function performs the conversion under that assumption. For JDK 1.1.x, the JavaFrame structure only utilizes the current method and the vars frame; thus, in practice these are the only two fields set by the function. As the converted frame must use the buf memory region, the function sets the values and returns buf.

3. Method Invocation

For each method, both JIT compilation and transfer of control to the native method happen at the point where the method is invoked.

The JVM interpreter loop is structured as follows. When a method is invoked, the invoker function of the methodblock structure mb is called. Under interpretive execution, this in turn calls the JVM to generate a new Java stack frame. The first argument o of invoker() is a pointer to the class object for static method calls, and a pointer to the object on which the method is invoked for normal method calls. The second argument mb is a pointer to the methodblock structure, and the third argument args_size indicates the size of the arguments. The fourth argument ee is a pointer to the execution environment structure ExecEnv.

while(1) {
    get opcode from pc
    switch(opcode) {
    ...(various implementation of the JVM bytecodes)
    callmethod:
        mb->invoker(o, mb, args_size, ee);
        frame = ee->current_frame; /* setup java frame */
        pc = frame->lastpc; /* setup pc*/
        break;
    }
}

3.1 Invoking the Compiler from the JDK

As mentioned earlier, we replace the value of the invoker with OpenJIT_invoke when the class is loaded. The OpenJIT_invoke function is defined as follows in C:

bool_t OpenJIT_invoke(JHandle *o, struct methodblock *mb,
                      int args_size, ExecEnv *ee);

This function in turn upcalls OpenJIT_compile to dynamically compile the method. Thereafter, control is transferred to mb->invoker, which now leads to the just-compiled method.

3.2 OpenJIT compile

A method is compiled and the invoker as well as the CompiledCode fields of the methodblock structure are initialized.

void OpenJIT_compile(struct methodblock *mb)

This function performs the following:

Mutual exclusion to prevent simultaneous compilation of the same method. As mentioned earlier, we use a compile lock so that multiple threads do not compile the same method at the same time.
Setup of invoker and CompiledCode fields in the methodblock structure. When a method is invoked during compilation, we set the invoker and CompiledCode so that the interpreter is invoked. This allows natural handling of recursive self-compilation of OpenJIT compiler classes.
Invocation of the dynamic compiler (OpenJIT_Sparc_compile()). The Java method is upcalled to perform the actual compilation.
When the compilation is successful: We set the stub function in the invoker field of the methodblock structure so that the compiled native code is invoked.
When compilation fails: The methodblock field values are restored to their original values.

3.3 Invoking the compiled code from the JVM

The compileMethod function sets the value of the invoker to one of the following stub functions according to the return type of the method. The last letter of the function name indicates the return type: V: void, I: int, J: long, F: float, D: double. Other types that fit in one word, such as Object, short, char, or byte, use I by default.

bool_t invokeCompiledCodeV(JHandle *o, struct methodblock *mb,
                           int args_size, ExecEnv *ee)
bool_t invokeCompiledCodeI(JHandle *o, struct methodblock *mb,
                           int args_size, ExecEnv *ee)
bool_t invokeCompiledCodeJ(JHandle *o, struct methodblock *mb,
                           int args_size, ExecEnv *ee)
bool_t invokeCompiledCodeF(JHandle *o, struct methodblock *mb,
                           int args_size, ExecEnv *ee)
bool_t invokeCompiledCodeD(JHandle *o, struct methodblock *mb,
                           int args_size, ExecEnv *ee)
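
The selection rule described above can be sketched as a function of the JVM method signature. The function name `demo_stub_suffix` is hypothetical; the only fact relied on is that the return type descriptor follows the closing parenthesis of a method signature.

```c
#include <assert.h>
#include <string.h>

/* Returns the suffix letter of the stub for a JVM method signature such as
   "(II)J": one of 'V', 'I', 'J', 'F', 'D'. One-word types map to 'I'. */
char demo_stub_suffix(const char *signature)
{
    assert(signature != NULL);
    const char *close = strchr(signature, ')');
    assert(close != NULL);        /* a method signature always contains ')' */
    switch (close[1]) {
    case 'V': return 'V';
    case 'J': return 'J';
    case 'F': return 'F';
    case 'D': return 'D';
    default:  return 'I';         /* int, short, char, byte, references, ... */
    }
}
```

For example, a method with signature `(II)J` would get the `J` stub, while one returning an object reference would get the `I` stub.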

These stub functions perform the following:

1. Saving of the Java frame.
2. Setting up of the equivalent native code stackframe.
3. Calling of the function makeCompiledFrame. A dummy JVM frame is created for exception handling, Java reflection, etc.
4. Necessary extension of the native frame by manipulating the stack pointer %sp.
   invokeCompiledCode itself takes four arguments; should there be more arguments in the call, the save area for the arguments must be allocated on the stack by bumping %sp.
5. Set pointer to the methodblock structure in register %g3: Required by the compiled code calling convention. For details, refer to the description of calls from compiled code to compiled code (Section 3.4).
6. Calling of the CompiledCode.
   For SPARCs, we pass up to 6 arguments via registers. As such, we fetch 6 values from the JVM operand stack and store them into registers prior to the call. Note that in the current implementation we ALWAYS pass six values, rather than case-analyzing for fewer arguments. In practice this has proven to be effective.
7. After method execution, the return value is pushed onto the JVM frame according to the return type. The JVM stack is adjusted as well, and the frame is restored.
8. Check for exceptions. If an exception has occurred, return TRUE, otherwise FALSE (***to whom?)

Steps 2 to 4 are packaged in the macro INVOKE_COMPILED_CODE.

3.4 Invoking Compiled Code from Compiled Code

As mentioned earlier, control is passed to the CompiledCode field of the methodblock. The C-style description of the call is as follows:

mb->CompiledCode(obj, arg0, arg1, arg2, ...)

The arguments are passed via the SPARC function call convention, i.e., the first 6 arguments are passed in the registers %o0 ~ %o5. Note that, for Java native code, %o0 is reserved for the object or the class pointer of the call, so the register usage is actually shifted by one. As an example, for the following Java program:

obj.method(arg0, arg1, arg2, arg3, arg4, arg5)

%o0 will hold obj, %o1 will be assigned arg0, ..., %o5 will be assigned arg4, whereas arg5 is passed on the stack.
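
The shift by one can be made concrete with a small mapping function. This is only a sketch: it assumes one-word arguments throughout, and `struct demo_arg_slot` and `demo_arg_location` are names invented for the example.

```c
#include <assert.h>

/* Where a value lives at a compiled-to-compiled call site. */
struct demo_arg_slot {
    int in_register;   /* 1: passed in %o<index>; 0: passed on the stack */
    int index;         /* register number, or position among stack arguments */
};

/* Position 0 is the receiver (or class pointer); Java argument k is
   position k + 1. Assumes every argument occupies one word. */
struct demo_arg_slot demo_arg_location(int position)
{
    assert(position >= 0);
    struct demo_arg_slot slot;
    if (position < 6) {
        slot.in_register = 1;
        slot.index = position;        /* %o0 .. %o5 */
    } else {
        slot.in_register = 0;
        slot.index = position - 6;    /* first overflow value is stack slot 0 */
    }
    return slot;
}
```

For the call `obj.method(arg0, ..., arg5)`, `obj` is position 0 and `arg5` is position 6, so `arg5` is the first value that no longer fits in a register.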

Similarly, method return values also follow the SPARC function call convention: an integer is returned in %i0, a long (64 bits) in %i0 and %i1, a float in %f0, and a double in %f0 and %f1.

Unlike C function calls, we must always set the methodblock of the method to be called into the %g3 register. This value is necessary for virtual function calls (OpenJIT_invokevirtual, OpenJIT_invokevirtualobject_quick), as well as for exception handling, for which the value must be saved on the native stack.

3.5 Invoking the Compiler from the Compiled Code

The methods to be compiled by the JIT already have the value of the CompiledCode field in the methodblock changed to dispatchJITCompiler at class loading time. When such a method is invoked for the first time from within compiled code, dispatchJITCompiler is invoked. (***) This function, if written in C, would have the following interface:

? dispatchJITCompiler(? arg0, ? arg1, ? arg2, ...)

The types of the arguments and the return value depend on the method, and are indeterminate at compile time. In practice, we code it in assembly for efficiency, in the following way:

dispatchJITCompiler:
    save %sp,-112,%sp
    call compileMethod,0                ! compileMethod(mb)
    mov %g3,%o0                         ! delay slot: pass mb as the argument
    ld [%g3+.off_CompiledCode],%l1      ! load mb->CompiledCode
    jmp %l1                             ! jump to mb->CompiledCode
    restore                             ! delay slot: restore the register window

Firstly, compileMethod is invoked. The compileMethod does the compilation and sets the CompiledCode field. Then, this field is re-invoked with the same arguments, effectively executing the compiled native code.

3.6 Invoking the Interpreter (JVM) from the compiled code

When the CompiledCode field of the methodblock structure is set to be dispatchJVM, then the following function is called to transfer the control to the interpreter. This is used when compilation fails, or the compilation is restricted due to the compiler option.

void dispatchJVM(? arg0, ? arg1, ? arg2, ...)

The JVM provides the following general-purpose function for calling a Java method from native code:

long do_execute_java_method(ExecEnv *ee, void *obj, char *method_name,
                            char *signature, struct methodblock *mb,
                            bool_t isStaticCall, ...);

dispatchJVM bridges the calling convention of the compiled native methods and this function:

Extract the methodblock structure from %g3
Save the argument registers onto the stack
Set the methodblock structure of the caller into the dummy Java frame. This is required for reflection and exception handling.
Call do_execute_java_method
Set the return values into registers according to the return type.

3.7 Invoking Native Method from Compiled Code

When the method is native to begin with, the code field of the methodblock points to the native code to be executed. A call can thus be made in the following way in C:

optop = (*(stack_item *(*)(stack_item*, ExecEnv*))mb->code)(optop, ee)

The first argument is a pointer to the operand stack, and the second argument is a pointer to the execution environment ExecEnv. As a result, when we call a native method from the compiled native code, we must assign the register arguments into operand stack similarly to the call to the JVM. Also, the returned value on the operand stack must be placed back in the register. For this purpose, we define 5 stub functions according to the return type:

void dispatchNormNativeV(...)
int dispatchNormNativeI(...)
int64_t dispatchNormNativeJ(...)
float dispatchNormNativeF(...)
double dispatchNormNativeD(...)

When the native method is a synchronized method, it further requires the monitor lock/unlock operations. For efficiency, we further define a set of separate stub functions to cover this case:

void dispatchSynchNativeV(...)
int dispatchSynchNativeI(...)
int64_t dispatchSynchNativeJ(...)
float dispatchSynchNativeF(...)
double dispatchSynchNativeD(...)

These functions perform the following:

1. Obtain the value of the methodblock from %g3 to obtain the starting address of the native method.
2. Save the register arguments onto the native stack (not the Java operand stack). These will become the second argument of the native call.
3. Obtain the value of ExecEnv
4. Obtain the value of the methodblock of the caller, and set it in the dummy JVM frame. Again, this is required for reflection and exception handling.
5. For synchronized methods, call MonitorEnter
6. Call the native method
7. For synchronized methods, call MonitorExit
8. Check whether an exception has occurred. If so, call handle_exception, which takes over control entirely and never returns.
9. Obtain the return value of mb->code from the operand stack and place it appropriately into the register according to the return type.

NOTE: Steps 2 ~ 8 are defined as a macro DISPATCH_NATIVE
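
The core of such a stub (spill the register arguments, call through the stack-based interface, fetch the result) can be sketched with toy types. Everything here is hypothetical: `demo_stack_item` and `demo_exec_env` are simplified stand-ins for the JDK's stack_item and ExecEnv, and the sketch assumes the native code leaves an int result just below the operand stack top it returns. Monitor handling, the dummy frame, and exception checks are omitted.

```c
#include <assert.h>

typedef union { int i; void *p; } demo_stack_item;
typedef struct { int exception_pending; } demo_exec_env;
typedef demo_stack_item *(*demo_native_code)(demo_stack_item *optop, demo_exec_env *ee);

/* A sample "native method" in the stack-based style: it reads two ints from
   the operand stack, writes the sum over the first one, and returns the
   new stack top. */
demo_stack_item *demo_native_add(demo_stack_item *optop, demo_exec_env *ee)
{
    (void)ee;
    optop[0].i = optop[0].i + optop[1].i;
    return optop + 1;
}

/* Sketch of an int-returning stub: spill the register arguments to a local
   area, call the native code, then read the result below the returned top. */
int demo_dispatch_native_i(demo_native_code code, demo_exec_env *ee, int a0, int a1)
{
    demo_stack_item area[2];
    area[0].i = a0;
    area[1].i = a1;
    demo_stack_item *top = code(area, ee);
    assert(top > area);           /* an int result occupies one slot */
    return top[-1].i;
}
```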

4. Self Modifying Code

Java bytecodes refer to classes, instance variables, and methods via symbol names. Symbols are stored in a structure called the constant pool. Thus, on bytecode execution, one must search the constant pool with the given symbol as a key, and obtain the actual address. This search is quite costly, as one must lock the constant pool region for multithreaded execution. Moreover, constant pool references occur quite frequently, and the cost of the search could dominate the overall execution time.

The JVM implementation solves this problem by modifying the bytecode on the fly. That is to say, the bytecode for constant pool access is modified to an equivalent so-called quick bytecode, which refers to the absolute address after the name has been resolved. For example, the bytecode instruction:

getfield #22 <Field Obj var>

pushes the object variable value onto the operand stack. When this instruction is first executed, the constant pool is always searched for the constant pool index #22. As a result, when we find that this variable can be accessed at a 4-byte offset from the object header, then the JVM modifies the code at runtime to the following quick version:

getfield_quick 4

then subsequently re-executes the instruction. From that point on, the quick instruction is always used, since the constant pool does not change over the execution of the program, effectively eliminating the lookup cost.
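
The rewriting of getfield into getfield_quick can be sketched with a toy interpreter step. The opcode numbers, the one-byte operand, and the `demo_` names are simplifications made for this example and do not match the real JVM encoding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { OP_GETFIELD = 1, OP_GETFIELD_QUICK = 2 };   /* toy opcode numbers */

/* Toy constant pool: entry i resolves to a byte offset inside the object. */
struct demo_pool {
    const int *field_offset;   /* indexed by constant pool index */
    int lookups;               /* how many slow resolutions happened */
};

int demo_resolve(struct demo_pool *pool, int index)
{
    pool->lookups++;
    return pool->field_offset[index];
}

/* Executes the two-byte instruction at code[0..1] against the object memory
   and returns the loaded 32-bit field. */
int32_t demo_exec_getfield(uint8_t *code, struct demo_pool *pool, const uint8_t *object)
{
    if (code[0] == OP_GETFIELD) {
        int offset = demo_resolve(pool, code[1]);
        assert(offset >= 0 && offset <= 255);   /* must fit the one-byte operand */
        code[1] = (uint8_t)offset;              /* operand: pool index -> offset */
        code[0] = OP_GETFIELD_QUICK;            /* opcode: slow form -> quick form */
    }
    assert(code[0] == OP_GETFIELD_QUICK);
    int32_t value;
    memcpy(&value, object + code[1], sizeof value);
    return value;
}
```

Running the same instruction twice performs only one constant pool lookup; the second execution goes straight through the quick path.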

However, for native code, it is more difficult to eliminate this cost of lookup by naive application of a similar technique. A simple method would be to search all the possible symbol values in the constant pool and resolve them at once at compile time. However, constant pool resolution is not mere simple symbol resolution, but rather incurs other processing such as class loading and initialization; thus, this strategy could change the semantics of the program by changing the initialization order of classes.

The viable option is to change the native instruction code in the same manner as the interpreter. However, this is much more difficult to do for native code, which involves several native instructions for each bytecode. Since the length of the instruction sequence cannot change, this could involve insertion of several NOP instructions. Moreover, the change must be atomic, requiring some form of mutual exclusion. Furthermore, the change must propagate across code caches on different processors in a multiprocessor environment.

The basic solution is as follows. For SPARCs, we place the CALL instruction to the constant pool resolution routine preceding a delay slot which contains the MOV instruction to set the index number of the symbol onto a register:

call  resolve
mov   #22,%o0

When this sequence is executed, the resolve() routine is called. There, after the constant pool is searched and the address corresponding to the index is found, the two instructions are rewritten so that the instruction sequence now places the address (or the offset) value into a register:

sethi %hi(offset), %o0
or    %o0, %lo(offset),%o0

Then, the resolve() routine returns with the resolved address in the %o0 register. The instruction immediately following the two rewritten instructions merely accesses the memory using %o0. For SPARCs, we could further optimize this as small offsets can be encoded into one instruction, and the load instruction contains a displacement field.
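
The rewritten pair can be sketched by building the two instruction words directly. The encodings follow the SPARC V8 instruction formats (SETHI: op=0, op2=4, 22-bit immediate; OR with immediate: op=2, op3=0x02, i=1); the `demo_` function names are invented for the example, and the sketch leaves out the atomicity and cache-flushing concerns discussed earlier.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_O0 = 8 };   /* %o0 is integer register r8 on SPARC */

/* sethi imm22, rd : op=0, op2=4 */
uint32_t demo_sethi(uint32_t imm22, unsigned rd)
{
    assert(imm22 < (1u << 22) && rd < 32);
    return ((uint32_t)rd << 25) | (4u << 22) | imm22;
}

/* or rs1, simm13, rd : op=2, op3=0x02, i=1 (non-negative immediates only) */
uint32_t demo_or_imm(unsigned rs1, uint32_t simm13, unsigned rd)
{
    assert(simm13 < (1u << 12) && rs1 < 32 && rd < 32);
    return (2u << 30) | ((uint32_t)rd << 25) | (0x02u << 19) |
           ((uint32_t)rs1 << 14) | (1u << 13) | simm13;
}

/* Overwrites the two-instruction site with code that loads value into %o0. */
void demo_patch_set_o0(uint32_t site[2], uint32_t value)
{
    site[0] = demo_sethi(value >> 10, REG_O0);              /* %hi(value) */
    site[1] = demo_or_imm(REG_O0, value & 0x3ffu, REG_O0);  /* %lo(value) */
}

/* Decodes the patched pair again, to check what %o0 would receive. */
uint32_t demo_value_loaded(const uint32_t site[2])
{
    uint32_t hi = (site[0] & 0x3fffffu) << 10;
    uint32_t lo = site[1] & 0x1fffu;
    return hi | lo;
}
```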

4.1 Two Instruction Modification

The basic idea is as given above; in practice, however, the SPARC MOV instruction only accepts a signed 13-bit immediate as the index value. In the JVM, the index value can be as large as 16 bits, so we employ the SETHI instruction instead of the MOV instruction, which allows the use of 22 bits. One caveat is that the value the resolver receives is the index shifted left by 10 bits.

call resolver           -> sethi %hi(offset),%o0
sethi (index<<10),%o0   -> or %o0,%lo(offset),%o0
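
How the index travels through SETHI can be sketched as follows. The SETHI encoding is the SPARC V8 one (rd = 8 for %o0); the `demo_` names and the constant `DEMO_MOV_MAX` are introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest non-negative value a signed 13-bit MOV immediate can carry. */
enum { DEMO_MOV_MAX = 4095 };

/* Builds a SETHI to %o0 whose 22-bit immediate field holds the index. */
uint32_t demo_sethi_index(uint16_t index)
{
    return (8u << 25) | (4u << 22) | (uint32_t)index;
}

/* What %o0 holds after the instruction executes: imm22 shifted left by 10. */
uint32_t demo_o0_after_sethi(uint32_t insn)
{
    assert((insn >> 22) == ((8u << 3) | 4u));   /* must be "sethi ..., %o0" */
    return (insn & 0x3fffffu) << 10;
}

/* What the resolver does first: shift right by 10 to recover the index. */
uint16_t demo_index_from_o0(uint32_t o0)
{
    return (uint16_t)(o0 >> 10);
}
```

Any 16-bit constant pool index survives the round trip, including values above `DEMO_MOV_MAX` that a MOV immediate could not encode.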

The following functions actually search the constant pool and resolve the offsets:

4.1.1 OpenJIT_resolveField

int OpenJIT_resolveField(int index)

The JVM getfield and putfield instructions are translated to call this function, which performs the following:

Check for self modification (CHECK_SELF_MODIFYING)
Right shift the index by 10 bits
Extract the methodblock of the caller. We obtain the address of the constant pool from this methodblock structure. Also, we set the address value into the JVM dummy frame in case exception happens.
Search and resolve the constant in the pool (RESOLVE_CLASS_CONST)
Check whether we have access rights to the field. If the field is static then generate an exception.
Obtain the offset, and modify the instruction as stated above (PATCH_SET_O0)
Return the offset value

4.1.2 OpenJIT_resolveStaticField

int OpenJIT_resolveStaticField(int index)

The JVM getstatic and putstatic bytecodes are translated to call this function.

Check for self modification (CHECK_SELF_MODIFYING)
Right shift the index by 10 bits
Extract the methodblock of the caller. We obtain the address of the constant pool from this methodblock structure. Also, we set the address value into the JVM dummy frame in case exception happens.
Search and resolve the constant in the pool (RESOLVE_CLASS_CONST)
Check whether we have access rights to the field. If the field is NOT static then generate an exception.
If the field type is long or double (64 bits):
    Obtain the offset, and modify the instruction as stated above (PATCH_SET_O0)
    Return the u.static_address of the fieldblock.
If the field type is not 64 bits:
    Obtain the offset, and modify the instruction as stated above (PATCH_SET_O0)

4.1.3 OpenJIT_resolveString

int OpenJIT_resolveString(int index)

For JVM bytecodes ldc and ldc_w, if the constant pool type is CONSTANT_String, then the bytecodes are translated to call the function.

Check for self modification (CHECK_SELF_MODIFYING)
Right shift the index by 10 bits
Extract the methodblock of the caller. We obtain the address of the constant pool from this methodblock structure. Also, we set the address value into the JVM dummy frame in case exception happens.
Search and resolve the constant in the pool (RESOLVE_CLASS_CONST)
Obtain the offset, and modify the instruction as stated above (PATCH_SET_O0)
Return the address value

4.2 One Instruction Modification

We modify only the CALL instruction. This is only employed when a new class is loaded on method call. This series of instructions is a combination of setting the pointer to the method block into the %g3 register, and CALLing the invoker function. Here, only the CALL instruction is modified.

sethi %hi(mb),%g3       -> sethi %hi(mb),%g3
call old_invoker        -> call new_invoker
or %g3,%lo(mb),%g3      -> or %g3,%lo(mb),%g3
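
Retargeting a CALL amounts to rewriting its 30-bit displacement. The sketch below uses the SPARC V8 CALL format (op=1, word displacement relative to the address of the CALL itself) on a simulated three-word site; addresses are plain 32-bit numbers, the `demo_` names are invented for the example, and cache flushing is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* call target : op=1 in the top two bits, then a 30-bit word displacement
   relative to the address of the CALL instruction itself. */
uint32_t demo_encode_call(uint32_t call_addr, uint32_t target)
{
    assert((call_addr & 3u) == 0 && (target & 3u) == 0);
    uint32_t disp30 = ((target - call_addr) >> 2) & 0x3fffffffu;
    return (1u << 30) | disp30;
}

/* Recovers the target of a CALL located at call_addr. */
uint32_t demo_call_target(uint32_t call_addr, uint32_t insn)
{
    assert((insn >> 30) == 1u);
    return call_addr + (insn << 2);   /* shift drops the opcode; wraps mod 2^32 */
}

/* One-instruction modification: only the word holding the CALL is rewritten;
   the SETHI before it and the OR in its delay slot stay as they are. */
void demo_retarget_call(uint32_t site[3], uint32_t site_addr, uint32_t new_target)
{
    site[1] = demo_encode_call(site_addr + 4, new_target);
}
```

Because a single aligned word changes, a thread executing the site sees either the old or the new target, which is what makes this the simplest of the modification schemes.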

The functions below are subject to such one-instruction modification:

4.2.1 OpenJIT_invokeinterface

int OpenJIT_invokeinterface(...)

The JVM invokeinterface bytecode is translated to call this function, which performs the following:

Check for self modification (CHECK_SELF_MODIFYING)
Extract the methodblock of the caller.
Search and resolve the constant in the pool (RESOLVE_CLASS_CONST)
Patch the instruction to call OpenJIT_invokeinterface_quick
Jump to OpenJIT_invokeinterface_quick.

4.2.2 OpenJIT_invokeinterface_quick

The JVM invokeinterface_quick instruction is translated to invoke this function. Also, as indicated in Section 4.2.1, it is invoked following an invocation of OpenJIT_invokeinterface. The %g3 register must contain the predicted value of the method table. The procedure is almost the same as when the JVM processes the invokeinterface_quick bytecode, but differs in the following points:

The predicted value is obtained by shifting %g3 right by 24 bits
When the method is found, we modify the instruction in order to set the predicted value.

We modify the instruction that sets the %g3 register, which precedes the call instruction that called this function; the modification shifts the predicted value left by 24 bits and sets it in the upper 8 bits of the %g3 register. Since the predicted value may in any case be old and stale, we do not lock the instruction upon modification.
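
The use of the upper 8 bits as a prediction can be sketched as follows. The table of ints stands in for the interface method table, and what the lower 24 bits of %g3 carry is not modeled; all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 8 bits of the %g3 value: the predicted slot. */
unsigned demo_guess_of(uint32_t g3)
{
    return g3 >> 24;
}

/* Stores a new prediction in the upper 8 bits, leaving the low 24 bits alone. */
uint32_t demo_with_guess(uint32_t g3, unsigned guess)
{
    assert(guess < 256u);
    return (g3 & 0x00ffffffu) | ((uint32_t)guess << 24);
}

/* Looks for wanted in a small table, trying the predicted slot first.
   Returns the slot found, or -1. A stale guess only costs the slow search. */
int demo_find_slot(const int *table, int count, unsigned guess, int wanted)
{
    if ((int)guess < count && table[guess] == wanted)
        return (int)guess;                 /* prediction hit */
    for (int i = 0; i < count; i++)
        if (table[i] == wanted)
            return i;                      /* miss: caller re-patches the guess */
    return -1;
}
```

A wrong prediction never produces a wrong result here, only a slower lookup, which is consistent with the remark above that the patch needs no lock.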

4.2.3 OpenJIT_invokespecial

void OpenJIT_invokespecial(...)

The JVM invokespecial and invokenonvirtual_quick bytecodes are translated to call this function. The function is effectively used when the method does not change for the given call site irrespective of the type of the object.

Check for self modification (CHECK_SELF_MODIFYING)
Extract the methodblock of the caller.
Check the class and the methodblock structure, and find out whether we are invoking the methods in the ancestor classes (super).
For super calls, modify the instruction to be a call to OpenJIT_invokesuper_quick, and jump to OpenJIT_invokesuper_quick().
If it is not a super call, either compile the method to be called, or load it in the case it is a native method. (RESOLVE_NATIVE_OR_COMPILE)
Modify the instruction to directly call mb->CompiledCode.
Jump to mb->CompiledCode

4.2.4 OpenJIT_invokesuper_quick

void OpenJIT_invokesuper_quick(...)

The JVM invokesuper_quick bytecode is translated to call this function. Also, it might be called after a call to the OpenJIT_invokespecial function. It performs essentially the same procedure as the JVM for this instruction.

4.2.5 OpenJIT_invokestatic

void OpenJIT_invokestatic(...)

The JVM invokestatic instruction (in case it does not require constant pool resolution) and the invokestatic_quick instruction are translated to call this function, which performs the following, allowing direct, fast calls to static methods:

Check for self modification (CHECK_SELF_MODIFYING)
Extract the methodblock of the caller.
Either compile the callee method or load the native method as specified in the classfile (RESOLVE_NATIVE_OR_COMPILE).
Modify the instruction to make a direct jump to mb->CompiledCode
Make the jump to mb->CompiledCode.

4.2.6 OpenJIT_invokevirtual

void OpenJIT_invokevirtual(...)

For invokevirtual bytecodes that does not require constant pool resolution, and the invokevirtual_quick instructions are translated to call this function. The procedure is similar to OpenJIT_invokestatic, but the call target of the self-modified code differs in the following way:

When the method is private:: We rewrite the call in the same manner as OpenJIT_invokestatic so that it jumps to mb->CompiledCode. This allows a direct jump to the target method.
java.lang.Object methods:: We rewrite the target of the call to OpenJIT_invokevirtualobject_quick
Other cases:: We rewrite the target of the call to OpenJIT_invokevirtual_quick

Subsequently, the case analysis becomes unnecessary, speeding up the virtual call.
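
The case analysis above could be sketched as a small selection function. All names here (SketchMethodInfo, SketchVirtualTarget, sketch_pick_virtual_target) are hypothetical, and the order in which the cases are tested is an assumption taken from the order of the list.

```c
#include <stdbool.h>

/* Hypothetical summary of the properties the case analysis looks at. */
typedef struct {
    bool is_private;          /* callee is a private method */
    bool declared_in_object;  /* callee is a java.lang.Object method */
} SketchMethodInfo;

/* The three possible new targets of the patched call instruction. */
typedef enum {
    SKETCH_DIRECT_COMPILED_CODE,       /* call mb->CompiledCode directly */
    SKETCH_INVOKEVIRTUALOBJECT_QUICK,
    SKETCH_INVOKEVIRTUAL_QUICK
} SketchVirtualTarget;

/* Decide once which target the call site is rewritten to. */
static SketchVirtualTarget sketch_pick_virtual_target(SketchMethodInfo m)
{
    if (m.is_private)
        return SKETCH_DIRECT_COMPILED_CODE;
    if (m.declared_in_object)
        return SKETCH_INVOKEVIRTUALOBJECT_QUICK;
    return SKETCH_INVOKEVIRTUAL_QUICK;
}
```

Because the decision is baked into the rewritten call, later executions of the same call site never repeat it.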

4.3 Three Instruction Modification

We modify a sequence of three instructions that performs a method invocation requiring constant pool resolution. In the general sequence below, the left column shows the original instructions and the right column shows the instructions after modification:

call old_invoker        -> sethi %hi(mb),%g3
sethi %g3,index<<10     -> call new_invoker
illtrap                 -> or %g3,%lo(mb),%g3	/* delay slot */
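
As an illustration, the three words on the right-hand side can be assembled from the standard SPARC V8 instruction formats. This is only a sketch: the helper names are hypothetical, addresses are modelled as plain 32-bit integers, and nothing here is taken from the OpenJIT sources.

```c
#include <stdint.h>

enum { SKETCH_REG_G3 = 3 };  /* SPARC register number of %g3 */

/* sethi %hi(value), rd : op=0, op2=4, imm22 = upper 22 bits of value */
static uint32_t sketch_sethi(uint32_t value, unsigned rd)
{
    return 0x01000000u | ((uint32_t)rd << 25) | (value >> 10);
}

/* or rs1, %lo(value), rd : op=2, op3=0x02, i=1, simm13 = lower 10 bits */
static uint32_t sketch_or_lo(unsigned rs1, uint32_t value, unsigned rd)
{
    return 0x80102000u | ((uint32_t)rd << 25) | ((uint32_t)rs1 << 14)
         | (value & 0x3ffu);
}

/* call target : op=1, disp30 = word displacement from the call itself */
static uint32_t sketch_call(uint32_t call_addr, uint32_t target)
{
    return 0x40000000u | (((target - call_addr) >> 2) & 0x3fffffffu);
}

/* Builds the modified three-word sequence for a site starting at `site`. */
static void sketch_build_patched_invoke(uint32_t site, uint32_t mb,
                                        uint32_t new_invoker, uint32_t out[3])
{
    out[0] = sketch_sethi(mb, SKETCH_REG_G3);
    out[1] = sketch_call(site + 4, new_invoker);  /* call is the 2nd word */
    out[2] = sketch_or_lo(SKETCH_REG_G3, mb, SKETCH_REG_G3); /* delay slot */
}
```

The or instruction sits in the delay slot of the call, so %g3 holds the complete mb address by the time the new invoker starts executing.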

The following functions are subject to 3 instruction modification:

4.3.1 OpenJIT_invokespecial_resolve

void OpenJIT_invokespecial_resolve(...)

The JVM invokespecial bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokespecial function.

4.3.2 OpenJIT_invokestatic_resolve

void OpenJIT_invokestatic_resolve(...)

The JVM invokestatic bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokestatic function.

4.3.3 OpenJIT_invokevirtual_resolve

void OpenJIT_invokevirtual_resolve(...)

The JVM invokevirtual bytecode is translated to call this function. After constant pool resolution, we perform the same process as the OpenJIT_invokevirtual function.

4.4 Three Instruction Modification (Modification of %o0)

We modify a sequence of three instructions that accesses a class object and requires constant pool resolution. The general sequence is shown below. In this kind of sequence, the first argument (%o0) passed to the modified call target is the resolved address of the class object.

call old_func           -> sethi %hi(mb),%o0
sethi %o0,index<<10     -> call new_func
illtrap                 -> or %o0,%lo(mb),%o0	/* delay slot */

The following functions are subject to 3 instruction modification with %o0:

4.4.1 OpenJIT_new

HObject *OpenJIT_new(int index)

The JVM new bytecode is translated to call this function. It performs the following steps:

Check for self modification (CHECK_SELF_MODIFYING)
Shift the index right by 10 bits
Constant pool resolution (RESOLVE_CLASS_CONST)
Check the access rights to the class, and for illegal access generate an exception
Self-modify the call so that the new target is OpenJIT_new_quick (PATCH_SET_O0_and_CALL)
Physically allocate an object from memory (same as OpenJIT_new_quick)
Return the pointer to the new object.
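
The "shift the index right by 10 bits" step follows from how a SPARC sethi instruction loads its 22-bit immediate into the upper bits of the destination register. The sketch below assumes, as the unmodified sequence suggests, that the constant pool index arrives in %o0 as that immediate; both function names are hypothetical.

```c
#include <stdint.h>

/* Value that "sethi imm22, rd" leaves in rd: imm22 in the upper 22 bits. */
static uint32_t sketch_sethi_result(uint32_t imm22)
{
    return (imm22 & 0x3fffffu) << 10;
}

/* Recover the constant pool index from the incoming %o0 value. */
static uint32_t sketch_index_from_o0(uint32_t o0)
{
    return o0 >> 10;
}
```

For example, an index of 42 would reach the function as 42 << 10 and be recovered by the right shift.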

4.4.2 OpenJIT_anewarray

HArrayOfObject *OpenJIT_anewarray(int index, int size)

The JVM anewarray bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the call target to OpenJIT_anewarray_quick, and then jumps to OpenJIT_anewarray_quick.

4.4.3 OpenJIT_multianewarray

HArrayOfObject *OpenJIT_multianewarray(int index, int dimensions,
                                       stack_item *optop)

The JVM multianewarray bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the call target to OpenJIT_multianewarray_quick, and then jumps to OpenJIT_multianewarray_quick.

4.4.4 OpenJIT_checkcast

void OpenJIT_checkcast(int index, JHandle *h)

The JVM checkcast bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the call target to OpenJIT_checkcast_quick, and then jumps to OpenJIT_checkcast_quick.

4.4.5 OpenJIT_instanceof

bool_t OpenJIT_instanceof(int index, JHandle *h)

The JVM instanceof bytecode is translated to call this function. It performs steps similar to OpenJIT_new, self-modifies the call target to OpenJIT_instanceof_quick, and then jumps to OpenJIT_instanceof_quick.

4.5 Self-modifying code in OpenJIT

Since the JVM is inherently multithreaded, care is required to update a sequence of multiple successive instructions atomically. If the self-modification is not atomic, other threads might execute a half-modified instruction sequence, resulting in a critical error.

The current JVM supports two types of thread systems: green threads and native threads. Green threads work only on uniprocessor machines, and context switching occurs only at fixed, safe locations, so such problems do not arise. With native threads, however, multiple threads may be executing on different processors, so a partially rewritten instruction sequence could be executed. It is therefore extremely important to guarantee the atomicity of self-modification in an efficient manner. OpenJIT implements such atomic updates in the following way:

4.5.1 Macro `CHECK_SELF_MODIFYING()`

This macro checks whether the call instruction that invoked the enclosing function has since been modified. If it has, control returns to the call site and the modified instruction is re-executed.
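
The specification does not say how the check is carried out. One plausible reading, sketched below purely as an assumption, is that the word at the call site is decoded and compared with a call to the running function. The function names are hypothetical and addresses are modelled as 32-bit integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decode the target address of a SPARC "call" word located at call_addr. */
static uint32_t sketch_call_target(uint32_t call_addr, uint32_t insn)
{
    return call_addr + (insn << 2);   /* disp30 is a word displacement */
}

/* True when the word at the call site is no longer "call self_addr",
 * i.e. another thread already patched it and it should be re-executed.
 * ASSUMPTION: the real macro may detect modification differently. */
static bool sketch_site_was_modified(uint32_t call_addr, uint32_t insn,
                                     uint32_t self_addr)
{
    bool is_call = (insn >> 30) == 1u;            /* op field == 01 */
    return !is_call || sketch_call_target(call_addr, insn) != self_addr;
}
```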

4.5.2 Macro `PATCH_CODE(CODE, OFFSET)`

This macro modifies the instruction located OFFSET bytes from the call instruction that invoked the enclosing function, and then flushes the instruction cache. For example, PATCH_CODE(code,4) modifies the delay slot of the call site that called the function.
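
A minimal sketch of the same idea, written as a function rather than a macro, is shown below. The name sketch_patch_code is hypothetical, and the cache synchronization uses the GCC/Clang builtin __builtin___clear_cache as a portable stand-in; that choice is an assumption and is not how OpenJIT itself flushes the instruction cache.

```c
#include <stdint.h>

/* Overwrite the instruction word at byte `offset` from the call site and
 * then ask the compiler runtime to synchronize the instruction cache. */
static void sketch_patch_code(uint32_t *call_site, uint32_t code,
                              unsigned offset)
{
    uint32_t *slot = call_site + offset / 4;   /* 4 bytes per instruction */
    *slot = code;
#if defined(__GNUC__)
    /* GCC/Clang builtin; an assumption, not what OpenJIT itself uses. */
    __builtin___clear_cache((char *)slot, (char *)(slot + 1));
#endif
}
```

With this sketch, an offset of 0 would replace the call itself and an offset of 4 would replace its delay slot, matching the example in the text.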

4.5.3 Modifying multiple instructions atomically

Multiple instructions are modified atomically in the following way. We assume that the first instruction of the sequence to be modified is a CALL instruction, followed by NOP instructions. The function called by that CALL instruction modifies the instruction sequence.

First, the CALL instruction is changed to an unconditional branch to itself, which effectively acts as a spin lock on that instruction. This modification is atomic, and any thread that executes the branch spins until the branch is replaced.
Change the NOP instructions to the desired instruction sequence.
Change the unconditional branch instruction to the desired first instruction. The threads that had been spinning on the branch then resume with the new instruction sequence.

We illustrate this scheme below:

	Rewriting sequence of multiple instructions
label	Step 1	->	Step 2	->	Step 3	->	Step 4
	...		...		...		...
	Inst0		Inst0		Inst0		Inst0
modify:	CALL A		Branch modify		Branch modify		Inst1
	NOP		NOP		Inst2		Inst2
	NOP		NOP		Inst3		Inst3
	NOP		NOP		Inst4		Inst4
	Inst5		Inst5		Inst5		Inst5
	...		...		...		...

A potential problem with this scheme arises when multiple threads execute the CALL A instruction concurrently. However, since all of these threads write the identical instruction sequence (Inst1, ..., Inst4), this does not cause a problem (although the actual situation is slightly more subtle than this).
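
The ordering of the writes can be modelled in portable C11, treating the code as an array of atomic words. This is a sketch under several assumptions: sketch_patch_sequence is a hypothetical name, the branch is encoded as "ba,a ." (branch always to itself with the delay slot annulled), which may not be the form OpenJIT emits, and instruction cache flushing is left out.

```c
#include <stdatomic.h>
#include <stdint.h>

/* SPARC "ba,a ." : branch-always to itself with the delay slot annulled.
 * Which branch form OpenJIT really emits is an assumption of this sketch. */
#define SKETCH_BRANCH_TO_SELF 0x30800000u

/* Rewrite site[0..n-1] with insns[0..n-1]; site[0] initially holds CALL A. */
static void sketch_patch_sequence(_Atomic uint32_t *site,
                                  const uint32_t *insns, int n)
{
    int i;
    /* Step 2: lock the entry point; late arrivals spin on this word. */
    atomic_store(&site[0], SKETCH_BRANCH_TO_SELF);
    /* Step 3: fill in every word after the first one. */
    for (i = 1; i < n; i++)
        atomic_store(&site[i], insns[i]);
    /* Step 4: write the first instruction last, releasing the spinners. */
    atomic_store(&site[0], insns[0]);
}
```

The essential point is that the first word is written twice, once to close the entry and once to reopen it, and that every other word is written in between.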

5. Exception Handling

For efficient execution, OpenJIT backend does not generally check for exceptions except for a few instances where explicit runtime checks are required. Instead, exceptions are checked and processed using the Unix signaling mechanism.

Figure 2: Flow of Execution

Figure 2 shows how the compiled native code executes. We must check for the occurrence of an exception whenever control transfers between the compiled native code and other native code, such as the JVM interpreter, runtime routines, and native methods. For example, in Figure 2, we must check for an exception at each point in the control flow marked by a star. On the other hand, for exceptions occurring within the compiled native code, we generally rely on Unix signals and do not explicitly check for exception occurrence.

5.1 Checking for Exceptions using Unix Signals

The following function is registered through the Java Native Code API and is called when a Unix signal occurs:

static void OpenJIT_SignalHandler(int sig, siginfo_t *info, ucontext_t *uc)

The signals below correspond to the exceptions that may occur at runtime. Any other signal is not a JVM exception but rather indicates a compiler or JVM bug.

Signal	Purpose
SIGFPE	zero division
SIGSEGV	null pointer, stack overflow
SIGILL	array index out of bounds

Within the OpenJIT_SignalHandler function, we examine the instruction that caused the signal and its operand address to confirm that the signal was indeed generated by a Java exception and not by a compiler or JVM bug. The check for each type of exception is described below. Once the exception is confirmed, we call setcontext() to set up the calling frame so that the faulting instruction behaves as if it had called the corresponding exception generation function.

5.1.1 Zero Division

The signal SIGFPE is raised, and the exception code info->si_code is either FPE_INTOVF or FPE_INTDIV. In this case, the signal handler sets up the context so that it appears as if the following function had been called from the instruction that caused the exception.

void catchZeroDivide(unsigned char *pc)

5.1.2 Null Pointer

The signal SIGSEGV is raised, and the exception code info->si_code is SEGV_MAPERR. In addition, the base register of the instruction that caused the exception is 0. Here is the exception generation function:

void catchNullPointer(unsigned char *pc)

5.1.3 Stack Overflow

The signal SIGSEGV is raised, and the exception code info->si_code is SEGV_MAPERR. In addition, the instruction that caused the exception is ld [%sp + constant]. Here is the exception generation function:

void catchStackOverflow(unsigned char *pc)

5.1.4 Array index out of bounds

The signal SIGILL is raised, and the instruction that caused the exception is a trap instruction, and the trap code is ST_RANGE_CHECK. Here is the exception generation function:

void catchArrayIndexOutOfBounds(unsigned char *pc, int index)

This function is slightly different, in that the index of the array must be given as the second parameter. For this reason, before we perform setcontext, we must read the value of the register that was used as an operand to calculate the out-of-bounds condition.
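
The four checks described in Sections 5.1.1 to 5.1.4 can be summarized as one classification function. The sketch below uses the POSIX signal numbers and si_code constants named in the text; the SketchInsnFacts structure, which stands for whatever the handler decodes from the faulting instruction, and the order in which the SIGSEGV cases are tested are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>

typedef enum {
    SKETCH_NOT_JAVA_EXCEPTION,   /* compiler or JVM bug */
    SKETCH_ZERO_DIVIDE,
    SKETCH_NULL_POINTER,
    SKETCH_STACK_OVERFLOW,
    SKETCH_ARRAY_INDEX
} SketchFault;

/* Hypothetical facts the handler would decode from the faulting instruction. */
typedef struct {
    bool base_register_is_zero;   /* memory access through a null base */
    bool is_sp_relative_load;     /* ld [%sp + constant] */
    bool is_range_check_trap;     /* trap with code ST_RANGE_CHECK */
} SketchInsnFacts;

/* Map (signal, si_code, instruction facts) to the kind of Java exception. */
static SketchFault sketch_classify(int sig, int si_code, SketchInsnFacts f)
{
    if (sig == SIGFPE && (si_code == FPE_INTDIV || si_code == FPE_INTOVF))
        return SKETCH_ZERO_DIVIDE;
    if (sig == SIGSEGV && si_code == SEGV_MAPERR) {
        if (f.is_sp_relative_load)
            return SKETCH_STACK_OVERFLOW;
        if (f.base_register_is_zero)
            return SKETCH_NULL_POINTER;
    }
    if (sig == SIGILL && f.is_range_check_trap)
        return SKETCH_ARRAY_INDEX;
    return SKETCH_NOT_JAVA_EXCEPTION;
}
```

Anything that falls through to SKETCH_NOT_JAVA_EXCEPTION corresponds to the "compiler or JVM bug" case and would not be turned into a Java exception.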

5.2 Macro `FIND_EXCEPTION_FRAME(pc, ee)`

This is a macro used by the exception generation functions described above in order to identify the method that caused the exception. It performs the following steps:

Flush the register window
Trace the native stack
Walk the stack until the frame for the compiled native code is found.
Set the pointer to the methodblock structure into the dummy JVM frame.
Set up for fillInStackTrace (described later in Section 5.4).

5.3 Jumping into an exception handler (`handle_exception`)

bool_t handle_exception (ExecEnv *execEnv)

This function walks the stack of the compiled native code, finds the corresponding exception handler, and jumps to it. As with C's longjmp(), the jump leapfrogs over the nested function calls. Because SPARC has register windows, they must be restored appropriately while leapfrogging. The steps are as follows:

while(1) {
    /* delete the stackframe of the runtime routine */
    while(%i7(address of the caller) is within the runtime routine) {
        restore /* recover the register window */
    }
    if (%i7(return address) is not a compiled native code) {
        /* Return to the JVM interpreter loop */
        return FALSE;
    }

    /* Set the lastpc. Needed when returning to the interpreter loop? */
    ee->current_frame->lastpc = %i7

    Extract the pointer to the methodblock structure from %fp, and
    set it to the variable mb

    /* Find the exception handler for the caught exception within mb */
    new_pc = JITProcedureFindThrowTag(ee, mb, ee->exception.exc, %i7)
    if (new_pc != 0) {
        /* An exception handler is found! */
        exceptionClear(ee) /* Clear the exception flag */

        /*
         * The exception handler for the compiled native code assumes
         * that the pointer to the object that caused the exception
         * is in %i7
         */
        %i7 = ee->exception.exc
        restore /* restore the register window */
        jump new_pc /* Jump to the exception handler */
    }
    /* Exception handler is not found */
    if (mb is a synchronized method) {
        /* unlock the monitor lock */
        /* The monitor object is stored in %fp[-1] */
        monitorExit(%fp[-1])
    }
    restore
}
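
The control flow of the loop above can be modelled without register windows by treating the native stack as an array of frame records, innermost first. Everything in this sketch (SketchFrame, sketch_unwind, the integer handler addresses) is hypothetical and only mirrors the order of the decisions; it does not manipulate a real stack.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SKETCH_RUNTIME, SKETCH_COMPILED, SKETCH_OTHER } SketchKind;

/* Hypothetical stand-in for one native stack frame, innermost first. */
typedef struct {
    SketchKind kind;       /* who owns the return address */
    int handler_pc;        /* 0 when the method has no matching handler */
    bool is_synchronized;  /* method holds a monitor that must be released */
} SketchFrame;

/* Returns the handler pc to jump to, or 0 to fall back to the interpreter.
 * *unlocked counts the monitors released while unwinding. */
static int sketch_unwind(const SketchFrame *frames, size_t n, int *unlocked)
{
    size_t i = 0;
    *unlocked = 0;
    while (i < n) {
        while (i < n && frames[i].kind == SKETCH_RUNTIME)
            i++;                          /* drop runtime-routine frames */
        if (i == n || frames[i].kind != SKETCH_COMPILED)
            return 0;                     /* back to the interpreter loop */
        if (frames[i].handler_pc != 0)
            return frames[i].handler_pc;  /* handler found */
        if (frames[i].is_synchronized)
            (*unlocked)++;                /* models monitorExit */
        i++;                              /* pop this compiled frame */
    }
    return 0;
}
```

Each increment of i in the model corresponds to one restore instruction in the real routine.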

5.4 `fillInStackTrace`

The JDK calls the SignalError function when an exception occurs. This function in turn calls fillInStackTrace(). In addition, the java.lang.Throwable class has a fillInStackTrace method, which allows the user program to obtain the state of the current Java method and a trace of the stack frames.

The code generated by the OpenJIT compiler does not generate a Java frame when compiled native code is called from another native code. As a result, JVM cannot trace the stackframe. To solve this problem, the JDK prepares the following API:

JavaFrame *JITCompiledFramePrev(JavaFrame *frame, JavaFrame *buf)

Besides fillInStackTrace, this function is used wherever a trace of the stack frames is obtained. The JVM basically uses the following algorithm to walk the stack:

{
  JavaFrame *frame, buf;
  frame = ExecEnv->current_frame;
  while(frame) {
    if (frame->current_method->fb.access & ACC_MACHINE_COMPILED) {
      frame = CompiledFramePrev(frame, &buf);
    } else {
      frame = frame->prev;
    }
  }
}

Thus, before JITCompiledFramePrev is called, the current_frame field of ExecEnv (the execution environment structure) must hold the Java frame of the compiled native code. For this reason, whenever an exception may occur as a result of calling a JVM function from an OpenJIT runtime routine, we must also set the JVM frame in ExecEnv->current_frame.

For OpenJIT, we judged that it is too expensive to generate a JVM frame each time this happens. Instead, we generate a dummy JVM frame only when the control flow transfers from the compiled native code into the internals of the JVM, and set it to ExecEnv->current_frame. When the OpenJIT runtime routine calls a JVM function, we merely set the current_method of the dummy frame.
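
The dummy-frame idea can be sketched with simplified stand-ins for the JDK structures. The types and function names below are hypothetical, and linking the dummy frame to the previous current_frame is an assumption made for the sake of a complete example.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified versions of the JDK structures. */
typedef struct SketchMethod { const char *name; } SketchMethod;
typedef struct SketchJavaFrame {
    struct SketchJavaFrame *prev;
    const SketchMethod *current_method;
} SketchJavaFrame;
typedef struct { SketchJavaFrame *current_frame; } SketchExecEnv;

/* Done once, when control leaves compiled code for the JVM internals. */
static void sketch_enter_jvm(SketchExecEnv *ee, SketchJavaFrame *dummy)
{
    dummy->prev = ee->current_frame;   /* ASSUMPTION: chained to old frame */
    dummy->current_method = NULL;
    ee->current_frame = dummy;
}

/* Done by a runtime routine before each call into a JVM function. */
static void sketch_before_jvm_call(SketchExecEnv *ee, const SketchMethod *mb)
{
    ee->current_frame->current_method = mb;  /* cheap: a single store */
}
```

The cost per runtime call is thus one pointer store instead of the construction of a full JVM frame.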

6. Other Runtime Functions

We show the other OpenJIT runtime functions that are called from the compiled native code that the OpenJIT compiler generates. The compiled native code may also call a C library function or a JVM function. The table below indicates where the called functions are being defined.

JVM Instruction	Runtime Function	Library
anewarray_quick	HArrayOfObject OpenJIT_anewarray_quick(ClassClass array_cb, int size)	OpenJIT
athrow	void OpenJIT_athrow(HJava_lang_Object *obj)	OpenJIT
checkcast_quick	void OpenJIT_checkcast_quick(ClassClass cb, JHandle h)	OpenJIT
d2l	int64_t __dtoll(double d)	C
dcmpg	int OpenJIT_dcmpg(stack_item *p)	OpenJIT
dcmpl	int OpenJIT_dcmpl(stack_item *p)	OpenJIT
drem	double OpenJIT_drem(stack_item *p)	OpenJIT
f2l	int64_t __ftoll(float f)	C
fcmpg	bool_t OpenJIT_fcmpg(float *p)	OpenJIT
fcmpl	bool_t OpenJIT_fcmpl(float *p)	OpenJIT
frem	float OpenJIT_frem(float *args)	OpenJIT
instanceof	bool_t OpenJIT_instanceof(int index, JHandle *h)	OpenJIT
l2d	double OpenJIT_l2d(signed hi, unsigned lo)	OpenJIT
l2f	float OpenJIT_l2f(signed hi, unsigned lo)	OpenJIT
lcmp	bool_t OpenJIT_lcmp(long long x, long long y)	OpenJIT
ldiv	int64_t __div64(int64_t x, int64_t y)	C
lmul	int64_t __mul64(int64_t x, int64_t y)	C
lrem	int64_t __rem64(int64_t x, int64_t y)	C
lshl	uint64_t OpenJIT_lshl(signed hi, unsigned lo, unsigned b)	OpenJIT
lshr	uint64_t OpenJIT_lshr(signed hi, unsigned lo, unsigned b)	OpenJIT
lushr	uint64_t OpenJIT_lushr(signed hi, unsigned lo, unsigned b)	OpenJIT
monitorEnter	void monitorEnter(unsigned int key)	JDK
monitorExit	void monitorExit(unsigned int key)	JDK
multianewarray_quick	HObject OpenJIT_multianewarray_quick(ClassClass array_cb, int dimensions, stack_item *optop)	OpenJIT
new_quick	HObject OpenJIT_new_quick(ClassClass cb)	OpenJIT
newarray	JHandle *OpenJIT_newarray(int type, int size)	OpenJIT
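
To give a feel for what some of these helpers compute, the sketch below models the JVM semantics of lcmp, dcmpg, dcmpl, l2d and lshl in portable C. The sketch_ functions are hypothetical: they take their operands directly rather than through a stack_item pointer, and they are not the OpenJIT implementations.

```c
#include <math.h>
#include <stdint.h>

/* lcmp: -1, 0 or 1 depending on the ordering of the two longs. */
static int sketch_lcmp(int64_t x, int64_t y) { return (x > y) - (x < y); }

/* dcmpg / dcmpl differ only in the result for NaN operands. */
static int sketch_dcmpg(double a, double b)
{
    if (isnan(a) || isnan(b)) return 1;
    return (a > b) - (a < b);
}
static int sketch_dcmpl(double a, double b)
{
    if (isnan(a) || isnan(b)) return -1;
    return (a > b) - (a < b);
}

/* l2d with the long passed as two 32-bit halves, as in the table. */
static double sketch_l2d(int32_t hi, uint32_t lo)
{
    int64_t v = (int64_t)(((uint64_t)(uint32_t)hi << 32) | lo);
    return (double)v;
}

/* lshl: the JVM uses only the low 6 bits of the shift count. */
static uint64_t sketch_lshl(int32_t hi, uint32_t lo, unsigned b)
{
    uint64_t v = ((uint64_t)(uint32_t)hi << 32) | lo;
    return v << (b & 63u);
}
```

Passing a long as separate hi and lo words reflects the 32-bit SPARC V8 target, where a 64-bit value occupies a pair of registers.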

7. Conclusion

We covered the runtime structure of the OpenJIT backend system. For the details of how the JVM instructions are translated, and runtime functions are called, the readers are referred to the files in org/OpenJIT/Sparc.java. The layout of the stackframe of the compiled native code is described in a companion document OpenJIT Backend Compiler Internal Specification.

openjit@is.titech.ac.jp

Last modified: Sat Oct 30 18:31:00 JST 1999