Chapter 34
Java Under the Hood: Inside the
Virtual Machine
by Stephen Ingram
This chapter takes an in-depth look at the internals of the Java
virtual machine (VM). Although an understanding of Java's internals
is not required to be an effective Java programmer, comprehension
of this chapter provides the basis for making the transition to
expert-level Java coding. In any event, the VM internals shed
light on the mindset of Java's original designers. Exploration
at this level is a fascinating journey because of the elegance
behind the Java architecture.
Looking at a virtual machine from the outside in is probably the
best way to understand its workings. Incremental learning results
when you move from the known to the unknown, so this chapter starts
with the item you are most familiar with: the class file.
The class file is similar to standard language object modules.
When a C language file is compiled, the output is an object module.
Multiple object modules are linked together to form an executable
program. In Java, the class file replaces the object module and
the Java virtual machine replaces the executable program.
You'll find all the information needed to execute a class contained
within the class file; you'll also find extra information that
aids debugging and source file tracking. Remember that Java has
no "header" include files, so the class file format
also has to fully convey class layout and members. Parsing a class
file yields a wealth of class information, not just its runtime
architecture.
The overall layout of the class file uses an outer structure and
a series of substructures that contain an ever-increasing amount
of detail. The outer layer is described by the following ClassFile
structure:
ClassFile
{
u4 magic;
u2 minor_version;
u2 major_version;
u2 constant_pool_count;
cp_info constant_pool[constant_pool_count - 1];
u2 access_flags;
u2 this_class;
u2 super_class;
u2 interfaces_count;
u2 interfaces[interfaces_count];
u2 fields_count;
field_info fields[fields_count];
u2 methods_count;
method_info methods[methods_count];
u2 attributes_count;
attribute_info attributes[attributes_count];
}
In addition to the generic class information (this_class,
super_class, version, and so on), there are
three major substructures: cp_info, field_info,
and method_info. The attribute_info structure
is considered a minor substructure because attributes recur throughout
the class file at various levels. Fields and methods contain their
own set of attributes and some individual attributes also contain
their own private attribute arrays.
The symbols u2 and u4 represent unsigned 2-byte
and unsigned 4-byte quantities.
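To make the outer layer concrete, here is a minimal sketch that reads the first four items of the ClassFile structure from a byte array. The ClassHeader class and its method names are hypothetical and are not part of the Viewer tool. The sketch relies on the fact that multibyte quantities in a class file are stored high byte first, which is also the default byte order of java.nio.ByteBuffer.

```java
import java.nio.ByteBuffer;

// Hypothetical helper (not part of the Viewer tool): reads the fixed-size
// items at the start of a ClassFile structure.
final class ClassHeader {
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    private ClassHeader(int minor, int major, int poolCount) {
        minorVersion = minor;
        majorVersion = major;
        constantPoolCount = poolCount;
    }

    // u2 and u4 are unsigned, so mask them into a wider signed Java type.
    static int u2(ByteBuffer in) { return in.getShort() & 0xFFFF; }
    static long u4(ByteBuffer in) { return in.getInt() & 0xFFFFFFFFL; }

    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes); // big-endian by default
        if (u4(in) != 0xCAFEBABEL) {
            throw new IllegalArgumentException("bad magic number");
        }
        int minor = u2(in);      // minor_version comes before major_version
        int major = u2(in);
        int poolCount = u2(in);
        return new ClassHeader(minor, major, poolCount);
    }

    public static void main(String[] args) {
        // Hand-built bytes (not read from a real file): magic, version 45.3,
        // and a constant_pool_count of 42.
        byte[] sample = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x03, 0x00, 0x2D, 0x00, 0x2A
        };
        ClassHeader h = parse(sample);
        System.out.println("version: " + h.majorVersion + "." + h.minorVersion);
        System.out.println("constant_pool_count: " + h.constantPoolCount);
    }
}
```

The masking in u2 and u4 matters because Java has no unsigned short or int types; without it, a magic number such as 0xcafebabe would read back as a negative value.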
Simply regurgitating class file structures does not provide the
best basis for actual learning. To better convey the overall class
file structure, I wrote an interactive Java application that presents
a class file in a tree format. Figure 34.1 shows the tool in action.
With this tool, you can view any class and save it in ASCII format.
Navigation is performed with the keyboard: The arrow keys provide
movement and the spacebar expands and contracts the nodes.
Figure 34.1: The class Viewer application in action.
The Viewer.class tool and source code are provided on the CD-ROM
that accompanies this book. Because it's an application, run the
tool from the command line with this statement: java Viewer
Now that you have a tool, you need a simple class to explore.
Here's one, just waiting for you:
public class test
{
public static int st_one;
public test()
{
st_one = 100;
}
public test(int v)
{
st_one = v;
}
public native boolean getData(int data[] );
public int do_countdown()
{
int x = st_one;
System.out.println("Performing countdown:");
while ( x-- != 0 )
System.out.println(x);
return st_one;
}
public int do_countdown(int x)
{
int save = x;
System.out.println("Performing countdown:");
while ( x-- != 0 )
System.out.println(x);
return save;
}
}
This class doesn't actually do very much, but it does provide
a basis for class file exploration. Once compiled, the outer layer
of the resulting class file is as follows:
----file name: test.class
|....magic: cafebabe
|....version: 45.3
|+---ConstantPool [42]
|....access: PUBLIC
|....this: Class: test
|....parent: Class: java/lang/Object
|....No Interfaces
|+---Fields [1]
|+---Methods [5]
|+---Attributes [1]
The magic number of a class is 0xcafebabe. If
a class file does not begin with this number, the Java virtual machine
assumes the file is invalid and refuses to load the class. The current
major version is 45; the current minor version is 3. Future Java compilers
will increment these numbers when the underlying class file format
changes. Version numbers allow a Java machine to recognize
class files in a format it does not support and to disallow their execution.
Access Flags
Access flags are used throughout the class file to convey the
access characteristics of various items. The flag itself is a
collection of 11 individual bits. Table 34.1 lays out the masks.
Table 34.1. Access flag bit values.
Flag Value | Indication
ACC_PUBLIC = 0x0001 | Global visibility
ACC_PRIVATE = 0x0002 | Local class visibility
ACC_PROTECTED = 0x0004 | Subclass visibility
ACC_STATIC = 0x0008 | One occurrence per class (not per object)
ACC_FINAL = 0x0010 | No changes allowed
ACC_SYNCHRONIZED = 0x0020 | Access with a monitor
ACC_VOLATILE = 0x0040 | No local caching
ACC_TRANSIENT = 0x0080 | Not a persistent value
ACC_NATIVE = 0x0100 | Native method implementation
ACC_INTERFACE = 0x0200 | Class is an interface
ACC_ABSTRACT = 0x0400 | Class or method is abstract
Access flags are present for a class and its fields and methods.
Only a subset of values appears in any given item. Some bits apply
only to fields (for example, VOLATILE and TRANSIENT);
others apply only to methods (for example, SYNCHRONIZED
and NATIVE).
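Because each flag is a single bit, decoding an access_flags value is a matter of testing the masks in Table 34.1 one at a time. The following sketch shows one way to do this; the AccessFlags class is a hypothetical helper, and the two sample values are inferred from the PUBLIC STATIC and PUBLIC NATIVE labels that the Viewer prints for the test class.

```java
import java.util.StringJoiner;

// Hypothetical helper: turns an access_flags value into the names from Table 34.1.
final class AccessFlags {
    // Ordered by bit position: index 0 is mask 0x0001, index 10 is mask 0x0400.
    private static final String[] NAMES = {
        "PUBLIC", "PRIVATE", "PROTECTED", "STATIC", "FINAL", "SYNCHRONIZED",
        "VOLATILE", "TRANSIENT", "NATIVE", "INTERFACE", "ABSTRACT"
    };

    static String describe(int flags) {
        StringJoiner out = new StringJoiner(" ");
        for (int bit = 0; bit < NAMES.length; bit++) {
            if ((flags & (1 << bit)) != 0) {
                out.add(NAMES[bit]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0009)); // flags assumed for the st_one field
        System.out.println(describe(0x0101)); // flags assumed for the getData method
    }
}
```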
Attributes
Attributes, like access flags, appear throughout a class file.
They have the following form:
GenericAttribute_info
{
u2 attribute_name;
u4 attribute_length;
u1 info[attribute_length];
}
A generic structure exists to enable loaders to skip over attributes
they don't understand. The actual attribute has a unique structure
that can be read if the loader understands the format. As an example,
the following structure specifies the format of a source file
attribute:
SourceFile_attribute
{
u2 attribute_name;
u4 attribute_length;
u2 sourcefile_index;
}
The name of an attribute is an index into the constant pool. You
learn about the constant pool in the next section. If a loader
does not understand the source file attribute structure, it can
skip the data by reading the number of bytes specified in the
length parameter. For the source file attribute, the
length is 2.
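The skipping behavior can be sketched in a few lines. In this hypothetical example, the caller passes in the constant pool index of the name it understands (a real loader would look the name up in the constant pool), and the sketch assumes attribute_length fits in a positive Java int.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: reads one attribute and skips it when the name is unknown.
final class AttributeSkipper {
    // Returns sourcefile_index for a SourceFile attribute, or -1 after
    // skipping an attribute this reader does not understand.
    static int readAttribute(ByteBuffer in, int sourceFileNameIndex) {
        int nameIndex = in.getShort() & 0xFFFF;   // u2 attribute_name
        int length = in.getInt();                 // u4 attribute_length (assumed small)
        if (nameIndex == sourceFileNameIndex && length == 2) {
            return in.getShort() & 0xFFFF;        // u2 sourcefile_index
        }
        in.position(in.position() + length);      // skip info[attribute_length]
        return -1;
    }

    public static void main(String[] args) {
        // Hand-built data: an unknown 3-byte attribute, then a SourceFile attribute.
        byte[] raw = {
            0, 7,   0, 0, 0, 3,   1, 2, 3,
            0, 25,  0, 0, 0, 2,   0, 40
        };
        ByteBuffer in = ByteBuffer.wrap(raw);
        System.out.println(readAttribute(in, 25)); // unknown attribute is skipped
        System.out.println(readAttribute(in, 25)); // SourceFile attribute is read
    }
}
```

The reader never needs to know what the three unknown bytes mean; the length field alone is enough to land on the start of the next attribute.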
Constant Pool
The constant pool forms the basis for all numbers and strings
within a class file. Nowhere else do you find strings or numbers.
Any time you need to reference a string or number, you substitute
an index into the constant pool. Consequently, the constant pool
is the dominant feature of a class. The pool is even used directly
within the virtual machine itself.
There are 12 different types of constants:
- CONSTANT_Utf8 = 1
- CONSTANT_Unicode = 2
- CONSTANT_Integer = 3
- CONSTANT_Float = 4
- CONSTANT_Long = 5
- CONSTANT_Double = 6
- CONSTANT_Class = 7
- CONSTANT_String = 8
- CONSTANT_Fieldref = 9
- CONSTANT_Methodref = 10
- CONSTANT_InterfaceMethodref = 11
- CONSTANT_NameAndType = 12
Each constant structure leads off with a tag identifying the structure
type. Following the type is data specific to each individual structure.
The layout of each constant structure is shown in Listing 34.1.
Listing 34.1. The layout of all 12 constant structures.
CONSTANT_Utf8_info
{
u1 tag;
u2 length;
u1 bytes[length];
}
CONSTANT_Unicode_info
{
u1 tag;
u2 length;
u2 words[length];
}
CONSTANT_Integer_info
{
u1 tag;
u4 bytes;
}
CONSTANT_Float_info
{
u1 tag;
u4 bytes;
}
CONSTANT_Long_info
{
u1 tag;
u4 high_bytes;
u4 low_bytes;
}
CONSTANT_Double_info
{
u1 tag;
u4 high_bytes;
u4 low_bytes;
}
CONSTANT_Class_info
{
u1 tag;
u2 name_index;
}
CONSTANT_String_info
{
u1 tag;
u2 string_index;
}
CONSTANT_Fieldref_info
{
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
CONSTANT_Methodref_info
{
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
CONSTANT_InterfaceMethodref_info
{
u1 tag;
u2 class_index;
u2 name_and_type_index;
}
CONSTANT_NameAndType_info
{
u1 tag;
u2 name_index;
u2 signature_index;
}
The CONSTANT_Utf8 structure contains text strings in UTF-8
encoding, so standard ASCII text is stored one byte per character.
These strings are not null-terminated because they use
an explicit length parameter. Notice that most of the
constants reference other constants for information. Methods,
for example, specify a class and type by providing indexes to
other constant pool members. Constant pool cross-references eliminate
repetition of data.
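The cross-referencing is easiest to see in code. The following sketch loads a constant pool and then follows a CONSTANT_Class entry to the CONSTANT_Utf8 entry that holds its name. The PoolSketch class is hypothetical and makes several simplifying assumptions: it decodes text as standard UTF-8 (adequate for ASCII names), it keeps most entries as raw numbers, and it rejects CONSTANT_Unicode (tag 2). One detail comes from the published virtual machine specification rather than from this chapter: a long or double constant occupies two pool slots.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: loads a constant pool and resolves a Class entry to its name.
final class PoolSketch {
    static Object[] readPool(ByteBuffer in, int constantPoolCount) {
        Object[] pool = new Object[constantPoolCount];    // slot 0 is never used
        for (int i = 1; i < constantPoolCount; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: {                                  // CONSTANT_Utf8
                    byte[] text = new byte[in.getShort() & 0xFFFF];
                    in.get(text);
                    pool[i] = new String(text, StandardCharsets.UTF_8);
                    break;
                }
                case 7: case 8:                            // Class, String: one u2 index
                    pool[i] = Integer.valueOf(in.getShort() & 0xFFFF);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12:
                    pool[i] = Integer.valueOf(in.getInt()); // 4 data bytes, kept raw
                    break;
                case 5: case 6:                            // Long, Double: 8 data bytes
                    pool[i] = Long.valueOf(in.getLong());
                    i++;                                   // occupies two pool slots
                    break;
                default:
                    throw new IllegalArgumentException("unrecognized tag " + tag);
            }
        }
        return pool;
    }

    static String className(Object[] pool, int classIndex) {
        int nameIndex = (Integer) pool[classIndex];        // CONSTANT_Class -> name_index
        return (String) pool[nameIndex];                   // -> CONSTANT_Utf8 text
    }

    public static void main(String[] args) {
        // Hand-built pool with two entries: #1 Class -> #2, #2 Utf8 "test".
        byte[] raw = {7, 0, 2, 1, 0, 4, 't', 'e', 's', 't'};
        Object[] pool = readPool(ByteBuffer.wrap(raw), 3);
        System.out.println("Class: " + className(pool, 1));
    }
}
```

Note that the loop runs from 1 to constant_pool_count - 1, which matches the array bound in the ClassFile structure.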
The constant pool for the sample test class appears as
follows:
|---file name: test.class
|...magic: cafebabe
|...version: 45.3
|---ConstantPool [42]
| |...String #28 -> Performing countdown:
| |...Class: java/lang/System
| |...Class: java/lang/Object
| |...Class: java/io/PrintStream
| |...Class: test
| |...Method: java/lang/Object.<init>()V
| |...Field: test.st_one I
| |...Field: java/lang/System.out Ljava/io/PrintStream;
| |...Method: java/io/PrintStream.println(I)V
| |...Method: java/io/PrintStream.println(Ljava/lang/String;)V
| |...NameAndType: st_one I
| |...NameAndType: println (I)V
| |...NameAndType: out Ljava/io/PrintStream;
| |...NameAndType: println (Ljava/lang/String;)V
| |...NameAndType: <init> ()V
| |...Utf8 [7] println
| |...Utf8 [4] (I)V
| |...Utf8 [3] ()I
| |...Utf8 [13] ConstantValue
| |...Utf8 [4] (I)I
| |...Utf8 [19] java/io/PrintStream
| |...Utf8 [10] Exceptions
| |...Utf8 [15] LineNumberTable
| |...Utf8 [1] I
| |...Utf8 [10] SourceFile
| |...Utf8 [14] LocalVariables
| |...Utf8 [4] Code
| |...Utf8 [21] Performing countdown:
| |...Utf8 [3] out
| |...Utf8 [21] (Ljava/lang/String;)V
| |...Utf8 [16] java/lang/Object
| |...Utf8 [6] <init>
| |...Utf8 [21] Ljava/io/PrintStream;
| |...Utf8 [16] java/lang/System
| |...Utf8 [12] do_countdown
| |...Utf8 [5] ([I)Z
| |...Utf8 [6] st_one
| |...Utf8 [7] getData
| |...Utf8 [9] test.java
| |...Utf8 [3] ()V
| |...Utf8 [4] test
|...access: PUBLIC
|...this: Class: test
|...parent: Class: java/lang/Object
|...No Interfaces
|+--Fields [1]
|+--Methods [5]
|+--Attributes [1]
NOTE: The Viewer tool substitutes pool indexes with actual pool data whenever possible.
Fields
Field structures contain the individual data members of a class.
Any class item that is not a method is placed into the fields
section of the class file. The field structure looks
like this:
field_info
{
u2 access_flags;
u2 name_index;
u2 signature_index;
u2 attributes_count;
attribute_info attributes[attributes_count];
}
The sample test class contains one field:
|---file name: test.class
|...magic: cafebabe
|...version: 45.3
|+--ConstantPool [42]
|...access: PUBLIC
|...this: Class: test
|...parent: Class: java/lang/Object
|...No Interfaces
|---Fields [1]
| |---st_one
| |...I
| |...PUBLIC STATIC
| |...No attributes
|+--Methods [5]
|+--Attributes [1]
Methods
The method section of the class file contains all the
executable content of a class. In addition to the method name
and signature, the structure contains a set of attributes. One
of these attributes has the actual bytecodes that the virtual
machine will execute. The method structure is shown here:
method_info
{
u2 access_flags;
u2 name_index;
u2 signature_index;
u2 attributes_count;
attribute_info attributes[attributes_count];
}
The sample test class contains the following method
section:
|---file name: test.class
|...magic: cafebabe
|...version: 45.3
|+--ConstantPool [42]
|...access: PUBLIC
|...this: Class: test
|...parent: Class: java/lang/Object
|...No Interfaces
|+--Fields [1]
|---Methods [5]
| |---<init>
| | |...()V
| | |...PUBLIC
| | |---Attributes [1]
| | |---Code
| | |...max_stack = 1
| | |...max_locals = 1
| | |+--Byte Codes [10]
| | |...No exceptions
| | |+--Attributes [1]
| |---<init>
| | |...(I)V
| | |...PUBLIC
| | |---Attributes [1]
| | |---Code
| | |...max_stack = 1
| | |...max_locals = 2
| | |+--Byte Codes [9]
| | |...No exceptions
| | |+--Attributes [1]
| |---getData
| | |...([I)Z
| | |...PUBLIC NATIVE
| | |...No attributes
| |---do_countdown
| | |...()I
| | |...PUBLIC
| | |---Attributes [1]
| | |---Code
| | |...max_stack = 2
| | |...max_locals = 2
| | |+--Byte Codes [33]
| | |...No exceptions
| | |+--Attributes [1]
| |---do_countdown
| |...(I)I
| |...PUBLIC
| |---Attributes [1]
| |---Code
| |...max_stack = 2
| |...max_locals = 3
| |+--Byte Codes [29]
| |...No exceptions
| |+--Attributes [1]
|+--Attributes [1]
Each method has a name and signature. Signatures are used by Java
to determine calling arguments and return types. The format of
a signature is as follows:
"(args*)return_type"
Arguments can be any combination of the characters listed in Table
34.2. Class name arguments are written as follows:
Lclass_name;
The semicolon (;) signals the end of the class name, just as the
right parenthesis signals the end of an argument list. Arrays
are written as a left bracket ([) followed by the array type:
[B for an array of bytes
[Ljava/lang/String; for an array of objects (in this case, Strings)
Table 34.2. Method signature symbols.
Type | Signature Character
byte | B
char | C
class | L
end of class | ;
float | F
double | D
function | (
end of function | )
int | I
long | J
short | S
void | V
boolean | Z
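The signature symbols in Table 34.2 are simple enough to decode with a small recursive reader. The following sketch converts a signature string into a readable form; the SignatureReader class and its output format are hypothetical choices made for illustration, and the sketch assumes the input is a well-formed method signature.

```java
// Hypothetical helper: turns a method signature such as "([I)Z" into readable text.
final class SignatureReader {
    private final String sig;
    private int pos;

    private SignatureReader(String sig) { this.sig = sig; }

    static String describe(String signature) {
        SignatureReader r = new SignatureReader(signature);
        StringBuilder args = new StringBuilder();
        r.pos = 1;                                    // skip '('
        while (r.sig.charAt(r.pos) != ')') {
            if (args.length() > 0) args.append(", ");
            args.append(r.nextType());
        }
        r.pos++;                                      // skip ')'
        return r.nextType() + " (" + args + ")";      // return type, then arguments
    }

    private String nextType() {
        char c = sig.charAt(pos++);
        switch (c) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'V': return "void";
            case 'Z': return "boolean";
            case '[': return nextType() + "[]";       // array of the type that follows
            case 'L': {                               // class name runs up to ';'
                int end = sig.indexOf(';', pos);
                String name = sig.substring(pos, end).replace('/', '.');
                pos = end + 1;
                return name;
            }
            default:
                throw new IllegalArgumentException("bad signature character " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("([I)Z"));                  // getData
        System.out.println(describe("(Ljava/lang/String;)V"));  // println(String)
    }
}
```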
All the methods except getData() have a code attribute.
This method is marked as NATIVE, so the Java virtual
machine expects the code to be in a native library. Each non-native
method contains a code attribute that has the following format:
Code_attribute
{
u2 attribute_name;
u4 attribute_length;
u2 max_stack;
u2 max_locals;
u4 code_length;
u1 code[code_length];
u2 exception_table_length;
ExceptionItem exceptions[exception_table_length];
u2 attributes_count;
attribute_info attributes[attributes_count];
}
Code attributes contain a private list of other attributes. Typically,
these are debugging lists, such as line-number information.
The pc register points to the next bytecode to execute.
Whenever an exception is thrown, the method's exception table
is searched for a handler. Each exception table entry has this
format:
ExceptionItem
{
u2 start_pc;
u2 end_pc;
u2 handler_pc;
u2 catch_type;
}
If the pc register is within the proper range and the
thrown exception is the proper type, the entry's handler code
block is executed. If no handler is found, the exception propagates
up to the calling method. The procedure repeats itself until either
a valid handler is found or the program exits.
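The handler search for a single method can be sketched as follows. The HandlerSearch class is hypothetical, and it simplifies two things: catch_type is modeled as a Java Class object rather than a constant pool index, and the range test assumes that start_pc is inclusive and end_pc is exclusive, as in the published virtual machine specification.

```java
// Hypothetical sketch of the handler search; the names are illustrative only.
final class HandlerSearch {
    static final class ExceptionItem {
        final int startPc;
        final int endPc;
        final int handlerPc;
        // Stands in for catch_type, which in a class file is a constant pool index.
        final Class<? extends Throwable> catchType;

        ExceptionItem(int startPc, int endPc, int handlerPc,
                      Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // Returns the handler_pc of the first matching entry, or -1 to tell the
    // caller that the exception must propagate to the calling method.
    static int findHandler(ExceptionItem[] table, int pc, Throwable thrown) {
        for (ExceptionItem item : table) {
            boolean inRange = pc >= item.startPc && pc < item.endPc;
            if (inRange && item.catchType.isInstance(thrown)) {
                return item.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ExceptionItem[] table = {
            new ExceptionItem(0, 10, 20, ArithmeticException.class)
        };
        System.out.println(findHandler(table, 5, new ArithmeticException()));
        System.out.println(findHandler(table, 5, new NullPointerException()));
    }
}
```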
Now that you've hit the code attribute, it's time to jump into
the virtual machine.
The Java virtual machine interprets Java bytecodes that are contained
in code attributes. The virtual machine is stack based. Most computer
architectures perform their operations on a mixture of memory
locations and registers. The Java virtual machine performs its
operations exclusively on a stack. This was done primarily to
support portability. No assumptions could be made about the size
or number of registers in a given CPU. Intel microprocessors are
especially limited in their register composition.
The virtual machine does contain some registers, but these are
used for tracking the current state of the machine:
- The pc register points to the next bytecode to execute
- The vars register points to the local variables for
a method
- The optop register points to the operand stack
- The frame register points to the execution environment
All these registers are 32 bits wide and point into separate storage
blocks. The blocks, however, can be allocated all at once because
the code attribute specifies the size of the operand stack, the
number of local variables, and the length of the bytecodes.
Most Java bytecodes work on the operand stack. For example, to
add two integers together, each integer is pushed onto the operand
stack. The addition operator removes the top two integers, adds
them, and places the result in their place back on the stack:
..., 4, 5 -> ..., 9
NOTE: Operand stack notation is used throughout the remainder of this chapter. The stack reads from left to right, with the stack top on the extreme right. Ellipses indicate indeterminate data buried on the stack. The arrow indicates an operation; the data to the right of the arrow represents the stack after the operation is performed.
Each stack location is 32 bits wide. Longs and doubles are 64
bits wide, so they take up two stack locations.
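A small model makes the slot arithmetic concrete. The OperandStack class below is a hypothetical sketch, not the virtual machine's actual implementation. It stores every value in 32-bit slots, and it assumes the high word of a long is pushed first, which matches the (high), (low) ordering used in this chapter's stack notation.

```java
// Hypothetical model of an operand stack made of 32-bit slots.
final class OperandStack {
    private final int[] slots;
    private int top;                        // number of occupied slots

    OperandStack(int maxStack) { slots = new int[maxStack]; }

    void pushInt(int v) { slots[top++] = v; }
    int popInt() { return slots[--top]; }

    // A long occupies two slots; this sketch pushes the high word first.
    void pushLong(long v) {
        pushInt((int) (v >>> 32));
        pushInt((int) v);
    }

    long popLong() {
        long low = popInt() & 0xFFFFFFFFL;  // mask to avoid sign extension
        long high = popInt();
        return (high << 32) | low;
    }

    // ..., a, b -> ..., a+b
    void iadd() { pushInt(popInt() + popInt()); }

    int depth() { return top; }

    public static void main(String[] args) {
        OperandStack stack = new OperandStack(4);
        stack.pushInt(4);
        stack.pushInt(5);
        stack.iadd();
        System.out.println(stack.popInt());       // the sum of 4 and 5
        stack.pushLong(5000000000L);
        System.out.println(stack.depth());        // slots used by one long
        System.out.println(stack.popLong());
    }
}
```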
Each code attribute specifies the size of the local variables.
A local variable is 32 bits wide, so long and double primitives
take up two variable slots. Unlike C, all method arguments appear
as local variables. The operand stack is reserved exclusively
for operations.
Before detailing the bytecodes, I think a more holistic view of
the virtual machine's operations would be instructive.
Consider the following method from the sample test class:
public int do_countdown(int x)
{
int save = x;
System.out.println("Performing countdown:");
while ( x-- != 0 )
System.out.println(x);
return save;
}
The class file specifies that this method has a maximum operand
stack of two and a local variable block of three. Figure 34.2
shows the initial state of these two blocks.
Figure 34.2: The initial state of the operand stack and local variable blocks.
Notice how the method's input argument is placed into the local
variable block. Java contains an extensive set of bytecodes dedicated
to moving data between the local variable block and the operand
stack. All data must be moved to the stack before it can be used.
This is an important distinction between Java machine code and
register-based machine architectures. Register machines reference
memory directly from the contents of a register. Java must first
move the contents of a variable to its operand stack before it
can reference the location.
The do_countdown(int x) method contains 29 bytecodes.
These two instructions store the input argument x into
the local variable save:
0 iload_1 Move argument x onto the operand stack
1 istore_2 Store top of stack (x) into variable save
The next three bytecodes first load a target object (System.out)
and a String onto the operand stack. At this point, println(String)
is invoked. Notice that before the invocation, both elements of
the operand stack are occupied. After the invocation, the operand
stack is completely empty because invokevirtual removes
all the stack elements it uses:
2 getstatic #8 <Field: java/lang/System.out Ljava/io/PrintStream;>
5 ldc1 #1 <String #28 -> Performing countdown:>
7 invokevirtual #10 <Method: java/io/PrintStream.println(Ljava/lang/String;)V>
This instruction transfers control to the test portion of the
method's while loop. Most modern compilers code a loop
with the loop test at the bottom:
10 goto 20
The loop body calls println(int):
13 getstatic #8 <Field: java/lang/System.out Ljava/io/PrintStream;>
16 iload_1 Move argument x onto the operand stack
17 invokevirtual #9 <Method: java/io/PrintStream.println(I)V>
Okay, this next bit is a little tricky. First, the current value
of x is placed onto the stack. Next, 1 is subtracted
from variable x. Finally, the value on the stack is checked
for zero. Notice that the value on the stack is x before
it was decremented. If the stack value is not zero, the body of
the loop will be executed:
20 iload_1 Move argument x onto the operand stack
21 iinc 1 -1 Decrement argument x by one
24 ifne 13 Branch if stack top (x before decrement) not zero
The final two bytecodes move the value of variable save
onto the stack and then return this value to the calling method:
27 iload_2 Move variable save onto the operand stack
28 ireturn Move integer on stack onto calling method's operand stack
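To tie the walk-through together, the following sketch traces the same bytecodes in Java, using the pc offsets from the listing as switch labels. The CountdownTrace class is hypothetical and simplifies the method invocations: each getstatic, load, and invokevirtual sequence is collapsed into a single step that records the printed text in a list, so the stack effects of those three instructions are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trace of do_countdown(int x); println calls are recorded in a list.
final class CountdownTrace {
    static int run(int x, List<String> printed) {
        int[] vars = new int[3];        // max_locals = 3: this, x, save
        int[] stack = new int[2];       // max_stack = 2
        int optop = 0;                  // number of occupied stack slots
        vars[1] = x;                    // the argument arrives as local variable 1
        int pc = 0;
        while (true) {
            switch (pc) {
                case 0:  stack[optop++] = vars[1]; pc = 1; break;      // iload_1
                case 1:  vars[2] = stack[--optop]; pc = 2; break;      // istore_2
                case 2:  // getstatic, ldc1, invokevirtual println(String)
                    printed.add("Performing countdown:"); pc = 10; break;
                case 10: pc = 20; break;                               // goto 20
                case 13: // getstatic, iload_1, invokevirtual println(int)
                    printed.add(Integer.toString(vars[1])); pc = 20; break;
                case 20: stack[optop++] = vars[1]; pc = 21; break;     // iload_1
                case 21: vars[1] += -1; pc = 24; break;                // iinc 1 -1
                case 24: pc = (stack[--optop] != 0) ? 13 : 27; break;  // ifne 13
                case 27: stack[optop++] = vars[2]; pc = 28; break;     // iload_2
                case 28: return stack[--optop];                        // ireturn
                default: throw new IllegalStateException("bad pc " + pc);
            }
        }
    }

    public static void main(String[] args) {
        List<String> printed = new ArrayList<>();
        int result = run(3, printed);
        System.out.println(printed + " returned " + result);
    }
}
```

Tracing run(3, ...) by hand shows why the loop prints 2, 1, and 0: the value tested by ifne is the copy of x pushed before iinc ran, while println sees the decremented variable.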
When a class is loaded, it is passed through a bytecode verifier
before it is executed. The verifier checks the internal consistency
of the class and the validity of the code. Java uses a late binding
scheme that puts the code at risk. In traditional languages, the
object linker binds all the method calls and variable accesses
to specific addresses. In Java, the virtual machine doesn't perform
this service until the last possible moment. As a result, it is
possible for a called class to have changed since the original
class was compiled. Method names or their arguments may have been
altered, or the access levels may have been changed. One of the
verifier's jobs is to make sure that all external object references
are correct and allowed.
No assumptions can be made about the origin of bytecodes. A hostile
compiler could be used to create executable bytecodes that conform
to the class file format, but specify illegal codes.
The verifier uses a conservative four-pass verification algorithm
to check bytecodes.
Pass 1
The first pass through the verifier reads in the class file and
ensures that it is valid. The magic number must be present
and all the class data must be present with no truncation or extra
data after the end of the class. Any recognized attributes must
have the correct length and the constant pool must not have any
unrecognized entries.
Pass 2
The second pass through the verifier involves validating class
features other than the bytecodes. All methods and fields must
have a valid name and signature, and every class must have a super
class. Signatures are not actually checked, but they must appear
valid. The next pass is more specific.
Pass 3
The third pass is the most complex because the bytecodes are validated.
The bytecodes are analyzed to make sure that they have the correct
type and number of arguments. In addition, a data-flow analysis
is performed to determine each path through the method. Each path
must arrive at a given point with the same stack size and types.
Each path must call methods with the proper arguments, and fields
must be modified with values of the appropriate type. Class accesses
are not checked in this pass. Only the return type of external
functions is verified.
Forcing all paths to arrive with the same stack and registers
can lead the verifier to fail some otherwise legitimate bytecodes.
This is a small price to pay for this high level of security.
Pass 4
The fourth pass loads externally referenced classes and checks
that the method name and signatures match. This pass also validates
that the current class has access rights to the external class.
After complete validation, each instruction is replaced with a
_quick alternative. These _quick bytecodes indicate
that the class has been verified and need not be checked again.
The bytecodes can be divided into 11 major categories:
- Pushing constants onto the stack
- Moving local variable contents to and from the stack
- Managing arrays
- Generic stack instructions (dup, swap, pop,
and nop)
- Arithmetic and logical instructions
- Conversion instructions
- Control transfer and function return
- Manipulating object fields
- Method invocation
- Miscellaneous operations
- Monitors
Each bytecode has a unique tag and is followed by a fixed number
of additional arguments. Notice that there is no way to work directly
with class fields or local variables. They must be moved to the
operand stack before any operations can be performed on the contents.
Generally, there are multiple formats for each individual operation.
The addition operation provides a good example. There are actually
four forms of addition: iadd, ladd, fadd,
and dadd. Each type assumes the top two stack items are
of the correct format: integers, longs, floats, or doubles.
Pushing Constants onto the Stack
Java uses the following instructions to push constant values onto the
operand stack. The value is either encoded in the instruction itself
or taken from the constant pool.
Push One-Byte Signed Integer:
bipush=16 byte1 Stack: ... -> ..., byte1
Push Two-Byte Signed Integer:
sipush=17 byte1 byte2 Stack: ... -> ..., word1
Push Item from the Constant Pool (8-bit index):
ldc1=18 indexbyte1 Stack: ... -> ..., item
Push Item from the Constant Pool (16-bit index):
ldc2=19 indexbyte1 indexbyte2 Stack: ... -> ..., item
Push Long or Double from Constant Pool (16-bit index):
ldc2w=20 indexbyte1 indexbyte2 Stack: ... -> ..., word1, word2
Push Null Object:
aconst_null=1 Stack: ... -> ..., null
Push Integer Constant -1:
iconst_m1=2 Stack: ... -> ..., -1
Push Integer Constants:
iconst_0=3 Stack: ... -> ..., 0
iconst_1=4 Stack: ... -> ..., 1
iconst_2=5 Stack: ... -> ..., 2
iconst_3=6 Stack: ... -> ..., 3
iconst_4=7 Stack: ... -> ..., 4
iconst_5=8 Stack: ... -> ..., 5
Push Long Constants:
lconst_0=9 Stack: ... -> ..., 0, 0
lconst_1=10 Stack: ... -> ..., 0, 1
Push Float Constants:
fconst_0=11 Stack: ... -> ..., 0
fconst_1=12 Stack: ... -> ..., 1
fconst_2=13 Stack: ... -> ..., 2
Push Double Constants:
dconst_0=14 Stack: ... -> ..., 0, 0
dconst_1=15 Stack: ... -> ..., 0, 1
Accessing Local Variables
The most commonly referenced local variables are at the first
four offsets from the vars register. Because of this,
Java provides single-byte instructions to access these variables
for both reading and writing. A two-byte instruction is needed
to reference variables more than four deep. In a nonstatic method,
the variable at location zero is the object reference itself (the
this pointer).
Load Integer from Local Variable:
iload=21 vindex Stack: ... -> ..., contents of variable at vars[vindex]
iload_0=26 Stack: ... -> ..., contents of variable at vars[0]
iload_1=27 Stack: ... -> ..., contents of variable at vars[1]
iload_2=28 Stack: ... -> ..., contents of variable at vars[2]
iload_3=29 Stack: ... -> ..., contents of variable at vars[3]
Load Long Integer from Local Variable:
lload=22 vindex Stack: ... -> ..., word1, word2 from vars[vindex] & vars[vindex+1]
lload_0=30 Stack: ... -> ..., word1, word2 from vars[0] & vars[1]
lload_1=31 Stack: ... -> ..., word1, word2 from vars[1] & vars[2]
lload_2=32 Stack: ... -> ..., word1, word2 from vars[2] & vars[3]
lload_3=33 Stack: ... -> ..., word1, word2 from vars[3] & vars[4]
Load Float from Local Variable:
fload=23 vindex Stack: ... -> ..., contents from vars[vindex]
fload_0=34 Stack: ... -> ..., contents from vars[0]
fload_1=35 Stack: ... -> ..., contents from vars[1]
fload_2=36 Stack: ... -> ..., contents from vars[2]
fload_3=37 Stack: ... -> ..., contents from vars[3]
Load Double from Local Variable:
dload=24 vindex Stack: ... -> ..., word1, word2 from vars[vindex] & vars[vindex+1]
dload_0=38 Stack: ... -> ..., word1, word2 from vars[0] & vars[1]
dload_1=39 Stack: ... -> ..., word1, word2 from vars[1] & vars[2]
dload_2=40 Stack: ... -> ..., word1, word2 from vars[2] & vars[3]
dload_3=41 Stack: ... -> ..., word1, word2 from vars[3] & vars[4]
Load Object from Local Variable:
aload=25 vindex Stack: ... -> ..., object from vars[vindex]
aload_0=42 Stack: ... -> ..., object from vars[0]
aload_1=43 Stack: ... -> ..., object from vars[1]
aload_2=44 Stack: ... -> ..., object from vars[2]
aload_3=45 Stack: ... -> ..., object from vars[3]
Store Integer into Local Variable:
istore=54 vindex Stack: ..., INT -> ... into vars[vindex]
istore_0=59 Stack: ..., INT -> ... into vars[0]
istore_1=60 Stack: ..., INT -> ... into vars[1]
istore_2=61 Stack: ..., INT -> ... into vars[2]
istore_3=62 Stack: ..., INT -> ... into vars[3]
Store Long Integer into Local Variable:
lstore=55 vindex Stack: ..., word1, word2 -> ... into vars[vindex] & vars[vindex+1]
lstore_0=63 Stack: ..., word1, word2 -> ... into vars[0] & vars[1]
lstore_1=64 Stack: ..., word1, word2 -> ... into vars[1] & vars[2]
lstore_2=65 Stack: ..., word1, word2 -> ... into vars[2] & vars[3]
lstore_3=66 Stack: ..., word1, word2 -> ... into vars[3] & vars[4]
Store Float into Local Variable:
fstore=56 vindex Stack: ..., FLOAT -> ... into vars[vindex]
fstore_0=67 Stack: ..., FLOAT -> ... into vars[0]
fstore_1=68 Stack: ..., FLOAT -> ... into vars[1]
fstore_2=69 Stack: ..., FLOAT -> ... into vars[2]
fstore_3=70 Stack: ..., FLOAT -> ... into vars[3]
Store Double into Local Variable:
dstore=57 vindex Stack: ..., word1, word2 -> ... into vars[vindex] & vars[vindex+1]
dstore_0=71 Stack: ..., word1, word2 -> ... into vars[0] & vars[1]
dstore_1=72 Stack: ..., word1, word2 -> ... into vars[1] & vars[2]
dstore_2=73 Stack: ..., word1, word2 -> ... into vars[2] & vars[3]
dstore_3=74 Stack: ..., word1, word2 -> ... into vars[3] & vars[4]
Store Object into Local Variable:
astore=58 vindex Stack: ..., OBJ -> ... into vars[vindex]
astore_0=75 Stack: ..., OBJ -> ... into vars[0]
astore_1=76 Stack: ..., OBJ -> ... into vars[1]
astore_2=77 Stack: ..., OBJ -> ... into vars[2]
astore_3=78 Stack: ..., OBJ -> ... into vars[3]
Increment Local Variable (incrementing applies only to integers):
iinc=132 vindex constant Stack: ... -> ... vars[vindex] += constant
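The choice between the one-byte and two-byte forms can be sketched as a small encoder. The LoadEncoder class is hypothetical; it uses the iload opcode values listed above and assumes the variable index fits in a single unsigned byte.

```java
// Hypothetical encoder: picks the shortest instruction that loads an int local.
final class LoadEncoder {
    static final int ILOAD = 21;     // two-byte form: opcode plus vindex
    static final int ILOAD_0 = 26;   // iload_0 through iload_3 are 26 through 29

    static byte[] encodeIload(int vindex) {
        if (vindex < 0 || vindex > 255) {
            throw new IllegalArgumentException("vindex must fit in one byte");
        }
        if (vindex <= 3) {
            return new byte[] { (byte) (ILOAD_0 + vindex) };   // single-byte form
        }
        return new byte[] { (byte) ILOAD, (byte) vindex };
    }

    public static void main(String[] args) {
        System.out.println(encodeIload(1).length);   // one byte: iload_1
        System.out.println(encodeIload(5).length);   // two bytes: iload 5
    }
}
```

The same pattern applies to the other load and store families, because each one places its four short forms at consecutive opcode values.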
Managing Arrays
Arrays are treated as objects, but they don't use a method table
pointer. Because of this uniqueness, arrays have special bytecodes
to create and access them.
Allocate a New Array:
newarray=188 type Stack: ..., size -> ..., OBJ
Allocate a New Array of Objects:
anewarray=189 classindex1 classindex2 Stack: ..., size -> ..., OBJ
Allocate a New Multidimensional Array:
multianewarray=197 indexbyte1 indexbyte2 dimensions Stack: ..., size1, size2, etc. -> ..., OBJ
Get the Array Length:
arraylength=190 Stack: ..., OBJ -> ..., length
Load Primitives from the Array:
iaload=46 Stack: ..., OBJ, index -> ..., INT
laload=47 Stack: ..., OBJ, index -> ..., LONG1, LONG2
faload=48 Stack: ..., OBJ, index -> ..., FLOAT
daload=49 Stack: ..., OBJ, index -> ..., DOUBLE1, DOUBLE2
aaload=50 Stack: ..., OBJ, index -> ..., OBJ
baload=51 Stack: ..., OBJ, index -> ..., BYTE
caload=52 Stack: ..., OBJ, index -> ..., CHAR
saload=53 Stack: ..., OBJ, index -> ..., SHORT
Store Primitives into the Array:
iastore=79 Stack: ..., OBJ, index, INT -> ...
lastore=80 Stack: ..., OBJ, index, LONG1, LONG2 -> ...
fastore=81 Stack: ..., OBJ, index, FLOAT -> ...
dastore=82 Stack: ..., OBJ, index, DOUBLE1, DOUBLE2 -> ...
aastore=83 Stack: ..., OBJ, index, OBJ -> ...
bastore=84 Stack: ..., OBJ, index, BYTE -> ...
castore=85 Stack: ..., OBJ, index, CHAR -> ...
sastore=86 Stack: ..., OBJ, index, SHORT -> ...
Generic Stack Instructions
Following are the basic operations that alter the stack.
Do Nothing:
nop=0 Stack: ... -> ...
Pop Stack Values:
pop=87 Stack: ..., VAL -> ...
pop2=88 Stack: ..., VAL1, VAL2 -> ...
Duplicate Stack Values and Possibly Insert Below Stack Top:
dup=89 Stack: ..., V -> ..., V, V
dup2=92 Stack: ..., V1, V2 -> ..., V1, V2, V1, V2
dup_x1=90 Stack: ..., V1, V2 -> ..., V2, V1, V2
dup2_x1=93 Stack: ..., V1, V2, V3 -> ..., V2, V3, V1, V2, V3
dup_x2=91 Stack: ..., V1, V2, V3 -> ..., V3, V1, V2, V3
dup2_x2=94 Stack: ..., V1, V2, V3, V4 -> ..., V3, V4, V1, V2, V3, V4
Swap Two Stack Items:
swap=95 Stack: ..., V1, V2 -> ..., V2, V1
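The duplicate-and-insert forms are easier to follow when you can run them. This hypothetical StackOps sketch models the stack as a list whose last element is the stack top, which matches the left-to-right notation used in this chapter. It covers only dup, dup_x1, and swap on single-word values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: the last list element is the stack top.
final class StackOps {
    // ..., V -> ..., V, V
    static void dup(List<Integer> s) {
        s.add(s.get(s.size() - 1));
    }

    // ..., V1, V2 -> ..., V2, V1, V2
    static void dupX1(List<Integer> s) {
        int n = s.size();
        s.add(n - 2, s.get(n - 1));   // insert a copy of the top below V1
    }

    // ..., V1, V2 -> ..., V2, V1
    static void swap(List<Integer> s) {
        int n = s.size();
        Integer top = s.get(n - 1);
        s.set(n - 1, s.get(n - 2));
        s.set(n - 2, top);
    }

    public static void main(String[] args) {
        List<Integer> stack = new ArrayList<>(Arrays.asList(1, 2));
        dupX1(stack);
        System.out.println(stack);
        swap(stack);
        System.out.println(stack);
    }
}
```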
Arithmetic and Logical Instructions
All the arithmetic operations operate on four possible types:
integer, long, float, or double. Logical instructions operate
only on integer and long types.
Addition:
iadd=96 Stack: ..., INT1, INT2 -> ..., INT1+INT2
ladd=97 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1+L2 (high), L1+L2 (low)
fadd=98 Stack: ..., FLOAT1, FLOAT2 -> ..., FLOAT1+FLOAT2
dadd=99 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., D1+D2 (high), D1+D2 (low)
Subtraction:
isub=100 Stack: ..., INT1, INT2 -> ..., INT1-INT2
lsub=101 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1-L2 (high), L1-L2 (low)
fsub=102 Stack: ..., FLOAT1, FLOAT2 -> ..., FLOAT1-FLOAT2
dsub=103 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., D1-D2 (high), D1-D2 (low)
Multiplication:
imul=104 Stack: ..., INT1, INT2 -> ..., INT1*INT2
lmul=105 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1*L2 (high), L1*L2 (low)
fmul=106 Stack: ..., FLOAT1, FLOAT2 -> ..., FLOAT1*FLOAT2
dmul=107 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., D1*D2 (high), D1*D2 (low)
Division:
idiv=108 Stack: ..., INT1, INT2 -> ..., INT1/INT2
ldiv=109 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1/L2 (high), L1/L2 (low)
fdiv=110 Stack: ..., FLOAT1, FLOAT2 -> ..., FLOAT1/FLOAT2
ddiv=111 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., D1/D2 (high), D1/D2 (low)
Remainder:
irem=112 Stack: ..., INT1, INT2 -> ..., INT1%INT2
lrem=113 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1%L2 (high), L1%L2 (low)
frem=114 Stack: ..., FLOAT1, FLOAT2 -> ..., FLOAT1%FLOAT2
drem=115 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., D1%D2 (high), D1%D2 (low)
Negation:
ineg=116 Stack: ..., INT -> ..., -INT
lneg=117 Stack: ..., LONG1, LONG2 -> ..., -LONG1, -LONG2
fneg=118 Stack: ..., FLOAT -> ..., -FLOAT
dneg=119 Stack: ..., DOUBLE1, DOUBLE2 -> ..., -DOUBLE1, -DOUBLE2
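The long and double entries above show each 64-bit value as two 32-bit stack words. As a rough sketch of what that notation implies, the following code splits a long into a high and a low word and models ladd on the word pairs. LongWords is a hypothetical helper, and treating the first word of each pair as the high-order half is an assumption taken from the "(high), (low)" labels in the listing.

```java
// Hypothetical helper for illustration; a real VM may store 64-bit values differently.
final class LongWords {
    // High-order 32 bits of a 64-bit value.
    static int high(long v) {
        return (int) (v >>> 32);
    }

    // Low-order 32 bits of a 64-bit value.
    static int low(long v) {
        return (int) v;
    }

    // Rebuilds the 64-bit value; the mask prevents sign extension of the low word.
    static long join(int high, int low) {
        return ((long) high << 32) | (low & 0xffffffffL);
    }

    // Models ladd: ..., L1_1, L1_2, L2_1, L2_2 -> ..., sum (high), sum (low)
    static int[] ladd(int h1, int l1, int h2, int l2) {
        long sum = join(h1, l1) + join(h2, l2);
        return new int[] { high(sum), low(sum) };
    }

    public static void main(String[] args) {
        // 0x00000001FFFFFFFF + 1 carries into the high word.
        int[] r = ladd(1, 0xffffffff, 0, 1);
        System.out.println(r[0] + " " + r[1]); // prints "2 0"
    }
}
```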
Integer Logical Instructions:
>>> denotes an unsigned right shift
ishl=120 Stack: ..., INT1, INT2 -> ..., INT1<<(INT2 & 0x1f)
ishr=122 Stack: ..., INT1, INT2 -> ..., INT1>>(INT2 & 0x1f)
iushr=124 Stack: ..., INT1, INT2 -> ..., INT1>>>(INT2 & 0x1f)
Long Integer Logical Instructions:
>>> denotes an unsigned right shift
lshl=121 Stack: ..., L1, L2, INT -> ..., L1<<(INT & 0x3f), L2<<(INT & 0x3f)
lshr=123 Stack: ..., L1, L2, INT -> ..., L1>>(INT & 0x3f), L2>>(INT & 0x3f)
lushr=125 Stack: ..., L1, L2, INT -> ..., L1>>>(INT & 0x3f), L2>>>(INT & 0x3f)
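The masks in the shift entries are visible from Java source, because the shift operators use only the low five bits of the count for an int and the low six bits for a long. The sketch below spells the mask out explicitly; ShiftMask and its method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper that writes out the masks shown in the shift listings.
final class ShiftMask {
    // ishl: only the low 5 bits of the count are used.
    static int ishl(int value, int count) {
        return value << (count & 0x1f);
    }

    // lshl: only the low 6 bits of the count are used.
    static long lshl(long value, int count) {
        return value << (count & 0x3f);
    }

    public static void main(String[] args) {
        System.out.println(1 << 33);    // prints 2, because 33 & 0x1f is 1
        System.out.println(ishl(1, 33)); // prints 2
        System.out.println(1L << 33);   // prints 8589934592
        System.out.println(1L << 65);   // prints 2, because 65 & 0x3f is 1
        System.out.println(-8 >> 1);    // prints -4 (sign bit is copied in)
        System.out.println(-8 >>> 28);  // prints 15 (zeros are shifted in)
    }
}
```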
Integer Boolean Operations:
iand=126 Stack: ..., INT1, INT2 -> ..., INT1&INT2
ior=128 Stack: ..., INT1, INT2 -> ..., INT1|INT2
ixor=130 Stack: ..., INT1, INT2 -> ..., INT1^INT2
Long Integer Boolean Operations:
land=127 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1_1&L2_1, L1_2&L2_2
lor=129 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1_1|L2_1, L1_2|L2_2
lxor=131 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., L1_1^L2_1, L1_2^L2_2
Conversion Instructions
Because most of the previous bytecodes expect the stack to contain
a homogeneous set of operands, Java uses conversion instructions.
In source code, you can add a float and an integer, but the compiler
first converts the integer to a float before performing the addition.
Integer Conversions:
i2l=133 Stack: ..., INT -> ..., LONG1, LONG2
i2f=134 Stack: ..., INT -> ..., FLOAT
i2d=135 Stack: ..., INT -> ..., DOUBLE1, DOUBLE2
int2byte=145 Stack: ..., INT -> ..., BYTE
int2char=146 Stack: ..., INT -> ..., CHAR
int2short=147 Stack: ..., INT -> ..., SHORT
Long Integer Conversions:
l2i=136 Stack: ..., LONG1, LONG2 -> ..., INT
l2f=137 Stack: ..., LONG1, LONG2 -> ..., FLOAT
l2d=138 Stack: ..., LONG1, LONG2 -> ..., DOUBLE1, DOUBLE2
Float Conversions:
f2i=139 Stack: ..., FLOAT -> ..., INT
f2l=140 Stack: ..., FLOAT -> ..., LONG1, LONG2
f2d=141 Stack: ..., FLOAT -> ..., DOUBLE1, DOUBLE2
Double Conversions:
d2i=142 Stack: ..., DOUBLE1, DOUBLE2 -> ..., INT
d2l=143 Stack: ..., DOUBLE1, DOUBLE2 -> ..., LONG1, LONG2
d2f=144 Stack: ..., DOUBLE1, DOUBLE2 -> ..., FLOAT
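The effect of these conversions is easiest to see from source-level casts. The sketch below uses a hypothetical ConversionDemo class; the comments naming a bytecode for each cast are inferred from the listing above and are assumptions about what a compiler emits, not output from a disassembler.

```java
// Hypothetical demo of the value changes caused by conversions.
final class ConversionDemo {
    static byte toByte(int v)   { return (byte) v;  } // presumably int2byte: keeps the low 8 bits
    static char toChar(int v)   { return (char) v;  } // presumably int2char: keeps the low 16 bits, unsigned
    static short toShort(int v) { return (short) v; } // presumably int2short: keeps the low 16 bits, signed
    static int floatToInt(float v)     { return (int) v;  } // presumably f2i: rounds toward zero
    static long doubleToLong(double v) { return (long) v; } // presumably d2l: clamps out-of-range values

    // The int operand is widened to float (presumably i2f) before the addition.
    static float addMixed(int i, float f) {
        return i + f;
    }

    public static void main(String[] args) {
        System.out.println(toByte(200));          // prints -56
        System.out.println(toShort(70000));       // prints 4464
        System.out.println(floatToInt(3.9f));     // prints 3
        System.out.println(floatToInt(Float.NaN)); // prints 0
        System.out.println(addMixed(1, 2.5f));    // prints 3.5
    }
}
```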
Control Transfer and Function Return
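Unless noted otherwise, branch offsets are signed 16-bit values,
built from branch1 and branch2 and taken relative to the current
pc register. The goto_w and jsr_w forms build a signed 32-bit
offset from four bytes instead.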
Comparisons with Zero:
ifeq=153 branch1 branch2 Stack: ..., INT -> ...
ifne=154 branch1 branch2 Stack: ..., INT -> ...
iflt=155 branch1 branch2 Stack: ..., INT -> ...
ifge=156 branch1 branch2 Stack: ..., INT -> ...
ifgt=157 branch1 branch2 Stack: ..., INT -> ...
ifle=158 branch1 branch2 Stack: ..., INT -> ...
Comparisons with Null:
ifnull=198 branch1 branch2 Stack: ..., OBJ -> ...
ifnonnull=199 branch1 branch2 Stack: ..., OBJ -> ...
Compare Two Integers:
if_icmpeq=159 branch1 branch2 Stack: ..., INT1, INT2 -> ...
if_icmpne=160 branch1 branch2 Stack: ..., INT1, INT2 -> ...
if_icmplt=161 branch1 branch2 Stack: ..., INT1, INT2 -> ...
if_icmpge=162 branch1 branch2 Stack: ..., INT1, INT2 -> ...
if_icmpgt=163 branch1 branch2 Stack: ..., INT1, INT2 -> ...
if_icmple=164 branch1 branch2 Stack: ..., INT1, INT2 -> ...
Compare Two Long Integers:
lcmp=148 Stack: ..., L1_1, L1_2, L2_1, L2_2 -> ..., INT (One of [-1, 0, 1])
Compare Two Floats:
If either operand is NaN, fcmpl pushes -1 and fcmpg pushes 1.
fcmpl=149 Stack: ..., FLOAT1, FLOAT2 -> ..., INT (One of [-1, 0, 1])
fcmpg=150 Stack: ..., FLOAT1, FLOAT2 -> ..., INT (One of [-1, 0, 1])
Compare Two Doubles:
If either operand is NaN, dcmpl pushes -1 and dcmpg pushes 1.
dcmpl=151 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., INT (One of [-1, 0, 1])
dcmpg=152 Stack: ..., D1_1, D1_2, D2_1, D2_2 -> ..., INT (One of [-1, 0, 1])
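The two variants of each floating-point comparison differ only in how they treat NaN. A minimal sketch of that rule follows; FloatCompare is a hypothetical class, and its methods model the float versions only (the double versions would follow the same pattern).

```java
// Hypothetical model of fcmpl and fcmpg; not taken from any VM source.
final class FloatCompare {
    // fcmpl: pushes -1 when either operand is NaN.
    static int fcmpl(float a, float b) {
        return compare(a, b, -1);
    }

    // fcmpg: pushes 1 when either operand is NaN.
    static int fcmpg(float a, float b) {
        return compare(a, b, 1);
    }

    private static int compare(float a, float b, int nanResult) {
        if (Float.isNaN(a) || Float.isNaN(b)) {
            return nanResult;
        }
        if (a > b) {
            return 1;
        }
        if (a < b) {
            return -1;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(fcmpl(1.0f, Float.NaN)); // prints -1
        System.out.println(fcmpg(1.0f, Float.NaN)); // prints 1
        System.out.println(fcmpl(2.0f, 1.0f));      // prints 1
    }
}
```

Having both variants lets a compiler pick the one whose NaN result makes the following ifXX branch fail, so any comparison involving NaN evaluates to false.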
Compare Two Objects:
if_acmpeq=165 branch1 branch2 Stack: ..., OBJ1, OBJ2 -> ...
if_acmpne=166 branch1 branch2 Stack: ..., OBJ1, OBJ2 -> ...
Unconditional Branching (16-bit and 32-bit branching):
goto=167 branch1 branch2 Stack: ... -> ...
goto_w=200 branch1 branch2 branch3 branch4 Stack: ... -> ...
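The branch bytes that follow these opcodes combine into a single signed offset. The sketch below shows one way to decode them; BranchOffset is a hypothetical helper, and it assumes that branch1 is the high-order byte, in keeping with the big-endian layout of the class file.

```java
// Hypothetical decoder for branch operands; assumes branch1 is the high-order byte.
final class BranchOffset {
    // 16-bit form (goto, jsr, ifXX): the cast to short restores the sign.
    static int offset16(int branch1, int branch2) {
        return (short) (((branch1 & 0xff) << 8) | (branch2 & 0xff));
    }

    // 32-bit form (goto_w, jsr_w): four bytes fill an int exactly.
    static int offset32(int b1, int b2, int b3, int b4) {
        return ((b1 & 0xff) << 24) | ((b2 & 0xff) << 16)
             | ((b3 & 0xff) << 8) | (b4 & 0xff);
    }

    // The branch target is the current pc plus the signed offset.
    static int target(int pc, int offset) {
        return pc + offset;
    }

    public static void main(String[] args) {
        int back = offset16(0xff, 0xfb);
        System.out.println(back);             // prints -5
        System.out.println(target(20, back)); // prints 15
        System.out.println(offset16(0x00, 0x10)); // prints 16
    }
}
```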
Jump Subroutine (16-bit and 32-bit jumps):
jsr=168 branch1 branch2 Stack: ... -> ..., returnAddress
jsr_w=201 branch1 branch2 branch3 branch4 Stack: ... -> ..., returnAddress
Return from Subroutine:
The return address is retrieved from a local variable, not the
stack.
ret=169 vindex Stack: ... -> ... (returnAddress <- vars[vindex])
ret_w=209 vindex1 vindex2 Stack: ... -> ... (returnAddress <- vars[vindex])
Return Primitives:
The current stack frame is destroyed. The value on top of its
operand stack, if the instruction returns one, is pushed onto the
caller's operand stack.
ireturn=172 Stack: ..., INT -> [destroyed]
lreturn=173 Stack: ..., LONG1, LONG2 -> [destroyed]
freturn=174 Stack: ..., FLOAT -> [destroyed]
dreturn=175 Stack: ..., DOUBLE1, DOUBLE2 -> [destroyed]
areturn=176 Stack: ..., OBJ -> [destroyed]
return=177 Stack: ... -> [destroyed]
Calling the Breakpoint Handler:
breakpoint=202 Stack: ... -> ...
Manipulating Object Fields
A 16-bit index into the constant pool is used to retrieve the
class and field name. These names are used to determine the field
offset and width. For getfield and putfield, the object reference
on the stack is the source or target; getstatic and putstatic
operate on a class's static fields and take no object reference.
Values are 32 or 64 bits, depending on the field information in
the constant pool.
getstatic=178 index1 index2 Stack: ... -> ..., VAL
putstatic=179 index1 index2 Stack: ..., VAL -> ...
getfield=180 index1 index2 Stack: ..., OBJ -> ..., VAL
putfield=181 index1 index2 Stack: ..., OBJ, VAL -> ...
Method Invocation
There are four types of method invocation:
- invokevirtual=182: This is the normal method dispatch
in Java. Use the index bytes to create a 16-bit index into the
constant pool of the current class. Extract the method name and
signature. Search the method table of the object on the stack to
determine the method address. Use the method signature to remove
the method arguments from the operand stack and transfer them to
the new method's local variables.
- invokenonvirtual=183: Used when a method is called
with the super keyword. Use the index bytes to create
a 16-bit index into the constant pool of the current class. Extract
the method name and signature. Search the named class's method
table to determine the method address. Extract the object and
arguments and place them in the new method's local variables.
- invokestatic=184: Used to call static methods. Create
a 16-bit index into the current class's constant pool. Extract
the method and search the named class's method table for the address.
Transfer the arguments as before. There is no object to pass.
- invokeinterface=185: Invokes an interface method.
Again, a 16-bit index is created to find the method name and signature.
This time, however, the number of arguments is taken from the
nargs byte in the bytecode, not from the signature.
invokevirtual=182 index1 index2 Stack: ..., OBJ, [arg1, [arg2, ...]] -> ...
invokenonvirtual=183 index1 index2 Stack: ..., OBJ, [arg1, [arg2, ...]] -> ...
invokestatic=184 index1 index2 Stack: ..., [arg1, [arg2, ...]] -> ...
invokeinterface=185 index1 index2 nargs resv Stack: ..., OBJ, [arg1, [arg2, ...]] -> ...
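The four kinds of invocation correspond to four kinds of call in source code. The following sketch uses hypothetical classes (Shape, Circle, Named, Tag); the comments naming the instruction for each call follow the descriptions above and are assumptions about compiler output, which you would need to confirm by inspecting the generated class file.

```java
// Hypothetical classes showing one source-level call per invocation type.
final class InvokeDemo {
    static class Shape {
        String describe() { return "shape"; }
        static String category() { return "geometry"; }
    }

    static class Circle extends Shape {
        @Override
        String describe() {
            // A super call names Shape's method directly (presumably invokenonvirtual).
            return "circle, a " + super.describe();
        }
    }

    interface Named {
        String name();
    }

    static class Tag implements Named {
        public String name() { return "tag"; }
    }

    public static void main(String[] args) {
        Shape s = new Circle();
        // Dispatches on the runtime type of s (presumably invokevirtual).
        System.out.println(s.describe());     // prints "circle, a shape"
        // No object reference is involved (presumably invokestatic).
        System.out.println(Shape.category()); // prints "geometry"
        Named n = new Tag();
        // Called through an interface type (presumably invokeinterface).
        System.out.println(n.name());         // prints "tag"
    }
}
```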
Miscellaneous Operations
The following instructions don't fall under any other heading;
they deal with generic object operations, such as creation and
casting.
Throw Exception:
athrow=191 Stack: ..., OBJ -> [undefined]
Create a New Object:
new=187 index1 index2 Stack: ... -> ..., OBJ
Check a Cast Operation:
checkcast=192 index1 index2 Stack: ..., OBJ -> ..., OBJ
Instanceof:
instanceof=193 index1 index2 Stack: ..., OBJ -> ..., INT (1 or 0)
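The difference between checkcast and instanceof shows up directly in source code: a failed cast throws, while a failed instanceof test simply yields false. The sketch below uses a hypothetical CastDemo class; returning 1 or 0 mirrors the INT result in the listing and is a choice made for this illustration.

```java
// Hypothetical demo contrasting a cast with an instanceof test.
final class CastDemo {
    // Mirrors the instanceof stack result: 1 if o is a String, otherwise 0.
    static int isString(Object o) {
        return (o instanceof String) ? 1 : 0;
    }

    // The cast leaves the reference unchanged on success and throws
    // ClassCastException on failure.
    static String castToString(Object o) {
        return (String) o;
    }

    public static void main(String[] args) {
        System.out.println(isString("text")); // prints 1
        System.out.println(isString(null));   // prints 0
        try {
            castToString(Integer.valueOf(42));
        } catch (ClassCastException e) {
            System.out.println("cast rejected"); // prints "cast rejected"
        }
    }
}
```

Note that a null reference passes the cast but fails the instanceof test.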
Monitors
Monitor instructions are used for synchronization.
Enter a Monitored Region of Code:
monitorenter=194 Stack: ..., OBJ -> ...
Exit a Monitored Region of Code:
monitorexit=195 Stack: ..., OBJ -> ...
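In source code, a monitored region is written as a synchronized block, and the compiler brackets the block with a monitorenter and monitorexit pair on the lock object. The following sketch uses a hypothetical MonitorDemo class with a shared counter; the thread and iteration counts are arbitrary values chosen for the example.

```java
// Hypothetical counter protected by a monitor on a private lock object.
final class MonitorDemo {
    private final Object lock = new Object();
    private int count;

    void increment() {
        synchronized (lock) { // the monitor on lock is entered here
            count++;
        }                     // and exited here, even if an exception is thrown
    }

    int count() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts the given number of threads and returns the final count.
    static int runThreads(int threads, int perThread) {
        final MonitorDemo demo = new MonitorDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    demo.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return demo.count();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10000)); // prints 40000
    }
}
```

Without the synchronized block, concurrent increments could be lost and the total would vary from run to run.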
This chapter revealed the internal structure of the Java virtual
machine through use of the Viewer tool. You learned to interpret
signatures and read Java machine code. The elegance of the Java
class file was also presented.
Keep this information in mind when you are working with Java.
An appreciation for the internal structure will help you become
an expert Java programmer. I hope this introduction has heightened
your curiosity; feel free to use the Viewer program to explore
some of your own Java classes.
Copyright 1998
EarthWeb Inc., All rights reserved.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.