Internet Security Professional Reference:Java Security


The Java Virtual Machine

The Java Virtual Machine (JVM) is an abstract specification that implementers can build an interpreter against. It does not force a specific implementation, yet it ensures that every program written in Java will run on any system that follows the design. The JVM gives concrete definitions for several aspects of an implementation, particularly for the distribution of Java code through an interchange specification. This specification includes the opcode and operand syntax along with their values, the values of any identifiers, the layout of structures such as the constant pool, and the layout of the Java object format as implemented in the class file. These definitions give other developers the information they need to implement their own JVM interpreters, which opens the Java specification to outside development. The designers hoped to free Java from the restrictions of a proprietary language and let developers use it as they wish.

By creating a virtual machine from the ground up for Java, the developers at Sun were able to build many security features into the entire language architecture. Two areas in which the Java Virtual Machine comes into play are the garbage-collected heap and the memory area. Garbage collection has been mentioned before; by simplifying the programmer’s job, it reduces the errors introduced into programs. The memory area in Java is implemented in such a way that programmers cannot tell where in memory they are, and thus cannot use this information to gain access to sensitive code.

Why a New Machine Code Specification?

The JVM provides the hardware platform specification to which all Java code is compiled. Every computer is built around a specific processor known as the CPU, or central processing unit. A host of different CPUs give each machine its computing power: Intel’s x86, Apple/IBM/Motorola’s PowerPC, DEC’s Alpha, the MIPS R series, Sun’s SPARC chips, and many others. Each of these chips has a different way of doing things, so software must be written for each individual machine type to run properly. To overcome this portability problem, the Java developers picked a single machine to compile for, and then interpret that machine’s code on all the others. Which chip did they choose to write Java for? None.

The JVM is a hypothetical CPU that can be easily implemented on a host of computers without being too close to any of them. The virtual machine must overcome differences among many CPUs. The Intel CPUs, for example, are all CISC (Complex Instruction Set Computing). They supply a host of instructions that the CPU can perform. The idea is that the more functions the chip provides in microcode (essentially small software inside a chip), the shorter the code it needs to execute can be. Providing many functions, however, costs the CPU in performance because executing microcode is slower than executing functions that are hardwired.

RISC (Reduced Instruction Set Computing) chips take the opposite philosophy. Rather than providing a host of instructions, the RISC computer provides only the very basics needed to execute a program. Thus, a RISC program may be larger than a CISC program that does the same thing, because it must execute many more instructions to duplicate the functionality found in a single CISC instruction. All the instructions on a RISC processor, however, are hardwired into silicon, enabling them to run at very high speeds and so make up for the longer code.

Picking one design over the other would make it difficult for the system not chosen to interpret the commands effectively. Instead, the Java designers selected their own specification for a chip’s instruction set. These opcodes are closely related to the Java language and can be considered an intermediate step between leaving the files as uncompiled source code, which would be the ultimate in portability, and compiling for each individual hardware system, which would provide the best possible speed. By providing a neutral intermediate specification, the JVM attempts to make a compromise between these two important aspects of distributed systems: portability and performance.

The Java Virtual Machine Description

The JVM consists of the following five specifications that control the implementation and interpretation of Java code.

  The instruction set
  The register set
  The stack
  The garbage-collected heap
  The memory area

How you implement each of these features does not matter, as long as the implementation follows the specifications laid out by the designers for running all Java code. This means you could choose to interpret the Java bytecodes, creating a system similar to the Java or HotJava executables. Or, you could recompile the incoming Java code into native machine format to benefit from native code performance. If you really need the best possible speed, you could even implement the JVM in silicon. Of course, it would then be a JM rather than a JVM.

The Instruction Set

The instruction set for the JVM plays exactly the same role as the instruction set for a CPU. When you compile Java source code into binary, you are in essence creating a machine-language program, just as a C compiler does for a real CPU. Each instruction in Java consists of an opcode followed by optional operands. Example opcodes include the following:

  Loading integers from memory (iload loads an integer from a local variable)
  Managing arrays (anewarray allocates a new array of object references)
  Logical operators (iand performs a bitwise AND of two integers)
  Flow control (ireturn returns an integer from a method call)
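As a rough illustration of how source code maps onto opcodes like these, consider the small method below. The class and method names are made up for this example, and the listing in the comment is what a typical javac produces for such a method (it can be inspected with javap -c); the exact output may differ between compilers.

```java
public class AndExample {
    // Bitwise AND of two ints.
    // A typical javac compiles the body to four one-byte instructions:
    //   iload_0   push the first argument onto the operand stack
    //   iload_1   push the second argument
    //   iand      pop both values, push their bitwise AND
    //   ireturn   return the int on top of the stack to the caller
    static int mask(int a, int b) {
        return a & b;
    }

    public static void main(String[] args) {
        System.out.println(mask(12, 10)); // prints 8 (1100 & 1010 = 1000)
    }
}
```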

Each opcode is represented by an 8-bit number, followed by variable-length operands. These operands give the data each opcode needs, such as where to jump or what number to use in a computation. Many opcodes do not have any operands.
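The one-byte opcode followed by a variable number of operand bytes can be sketched with a toy decoder. This is a minimal illustration, not a real JVM: the class name TinyBytecodeDemo and the run method are hypothetical, only five opcodes are handled, and the opcode values are the ones assigned by the JVM specification (bipush 0x10, sipush 0x11, iadd 0x60, iand 0x7e, ireturn 0xac).

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyBytecodeDemo {
    static final int BIPUSH  = 0x10; // one operand byte: signed 8-bit constant
    static final int SIPUSH  = 0x11; // two operand bytes: signed 16-bit constant
    static final int IADD    = 0x60; // no operands
    static final int IAND    = 0x7e; // no operands
    static final int IRETURN = 0xac; // no operands

    // Executes the code array and returns the value passed to ireturn.
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // index of the next byte to read
        while (pc < code.length) {
            int opcode = code[pc++] & 0xff; // opcodes are unsigned 8-bit values
            switch (opcode) {
                case BIPUSH: {
                    stack.push((int) code[pc]); // Java bytes are signed, so this sign-extends
                    pc += 1;
                    break;
                }
                case SIPUSH: {
                    // Operands are stored high byte first.
                    int value = (short) (((code[pc] & 0xff) << 8) | (code[pc + 1] & 0xff));
                    stack.push(value);
                    pc += 2;
                    break;
                }
                case IADD: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case IAND: {
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a & b);
                    break;
                }
                case IRETURN:
                    return stack.pop();
                default:
                    throw new IllegalArgumentException(
                            "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        throw new IllegalStateException("code ended without ireturn");
    }

    public static void main(String[] args) {
        byte[] code = {
            0x10, 12,          // bipush 12   (opcode + 1 operand byte)
            0x11, 0x01, 0x00,  // sipush 256  (opcode + 2 operand bytes)
            0x60,              // iadd        (opcode only)
            (byte) 0xac        // ireturn     (opcode only)
        };
        System.out.println(run(code)); // prints 268
    }
}
```

Because the decoder must look at each opcode to learn how many operand bytes follow, it cannot skip ahead in fixed-size steps.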

In computing, it is typical to align all opcodes and operands to 32- or 64-bit words. This enables the machine to move through the code in constant jumps, knowing exactly where the next instruction will be. Because the opcodes are only 8 bits and the operands vary in size, however, aligning to anything larger than 8 bits would waste space (see fig. 13.2). The amount of wasted space would depend on the average operand size and on how much larger than 8 bits the alignment was. The Java designers specifically chose byte alignment, deciding that compactness was more important than the performance hit incurred.
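The space trade-off can be made concrete with a little arithmetic. The sketch below assumes a simple model in which each instruction (opcode plus operands) would be padded up to the next multiple of the alignment; the class name, method names, and the sample instruction lengths are invented for illustration and are not taken from the figure.

```java
public class AlignmentCost {
    // Total bytes when instructions are packed back to back (byte alignment).
    static int packedSize(int[] lengths) {
        int total = 0;
        for (int len : lengths) {
            total += len;
        }
        return total;
    }

    // Total bytes when each instruction is padded up to a multiple of `alignment` bytes.
    static int alignedSize(int[] lengths, int alignment) {
        int total = 0;
        for (int len : lengths) {
            total += ((len + alignment - 1) / alignment) * alignment; // round up
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical sequence: three 1-byte instructions, one 2-byte, one 3-byte.
        int[] lengths = {1, 2, 3, 1, 1};
        int packed = packedSize(lengths);      // 1 + 2 + 3 + 1 + 1 = 8
        int aligned = alignedSize(lengths, 4); // 5 instructions * 4 bytes = 20
        System.out.println(packed + " bytes packed, " + aligned + " bytes aligned, "
                + (aligned - packed) + " bytes of padding");
    }
}
```

Under this model the same five instructions take 8 bytes packed but 20 bytes when aligned to 32-bit words, so more than half of the aligned code would be padding.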

