Bottom of Chapter     Contents     Previous Chapter     Next Chapter

4
: A Comparable Code-Based Interpreter

The second step to comparing the efficiency of tree-based and instruction-based interpreters, having built a tree-based interpreter, is to create the instruction-based interpreter. This chapter explores the design choices taken in the augmenting the pJTI to create the Picojava Hybrid Interpreter (pJHI), so that it can interpret program trees which are composed of instructions as well as those composed of statements.

The design and implementation choices made here are all made with a view to providing an instruction-based interpreter which is as similar as possible to the existing statement-based interpreter. By doing this, it can be subsequently argued that any comparisons made between the two interpreters reflect only the differences arising from the structural differences of the two approaches, and not the underlying implementation detail.

4.1 Design Requirements

An instruction-based interpreter is still tree-based in thes sense that it requires a representation of the program tree. The abstract grammar of the Picojava assembly language, discussed alongside the Picojava source language in Chapter 2 and included in its entirety in Appendix B, gives the structure of this tree. The difference comes only at the level of Block:

As well as the program tree we also need some mechanism for keeping track of the dynamic state of a program. Instead of seeking to emulate the implementation choices with a real instruction-based Java VM, we choose to use a symbol table which mimics as closely as possible that used by the pJTI.

The differences will come as there will now be different requirements for interpreter state, the third and final ingredient of the interpreter. The design will now need to incorporate a program counter and an explicit operand stack, as instructions will need to interact with both these entities. In the statement interpreter, the program counter was successfully hidden in the implementation of Block, and the operand stack was implicit from the invocation stack of statements in the statement tree.

4.1.1 Symbol Table Requirements

While the Java VM does not have a readily identifiable symbol table, at least not for variables[7], in order to fairly compare the statement interpreter and instruction interpreter, this element of implementation is retained as it is described in §3.1.1.

4.1.2 Interpreter State Requirements

The Java VM keeps a set of method invocation frames in order to keep track of the stack of methods which are currently executing, with the most recently invoked method being on top of the stack. Indeed, any instruction interpreter will need to keep track of method invocations, and to make it comparable to the pJTI, block invocations will also be required. There will be instruction invocations, but these are much less imporant than statement invocations, as they will not be directly nested in the same way as statement invocations must be.

In the Java VM, there is no such thing as a statement, and each method invocation frame has associated with it a program counter which indicates the current instruction executing in that method and an operand stack which keeps track of computations within each method. These will need to be included in an explicit way in an instruction interpreter, as they are inherent to the behaviour of such an interpreter, but they will be associated with Blocks and not Methods. This is in line with the way in which the pJTI has been designed.

4.1.3 Program Tree Requirements

In the Java VM, at the level of a method, the tree becomes a linearized structure where instructions are bytecoded. The interpreter suggested here does not necessarily use bytecodes, but it does use a linearized structure at the level of Block to represent Instructions.

Like a statement-based program tree, an instruction-based program tree is a structure which at its root has a single CompilationUnit node. Each node is in turn composed of further nodes, as given by the rules Picojava assembly language grammar of Appendix B.

4.2 Design Choices

4.2.1 Symbol Table Design

The symbol table will be a class, and will act primarily as a dictionary matching qualified identifiers with variables (of primitive types or class instances), classes and methods. This is as for the pJTI, and the design is more fully discussed in §3.2.1.

4.2.2 Interpreter State Design

Whereas the stack of statement invocations could be used within the pJTI to record the current progress of a computation, the same is not possible in an instruction interpreter. An explicit operand stack is needed in order to record the results from previous instructions so that they may be used by subsequent instructions; i.e. instructions are not composed further instructions in the same way as statements can be composed of further statements.

Therefore, in line with the way we augmented the symbol table designed in §3.2 with interpreter state, we will further augment the symbol table with operations for dealing with an explicit operand stacks:

We also augment the symbol table with operations for maintaining a program counter. This will need to be altered by branch instructions in order to effect flow control, so it is not sufficient to include it in the implementation of the Blocks themselves (as was the case in the pJTI).

4.2.3 Program Tree Design

The following classes will be used to construct an instruction-based program tree:

4.3 Implementation Overview

Where functions of existing classes from the pJTI need to be augmented, the new classes will be defined as subclasses of the existing classes, prefixed with the word Code. It is assumed that such subclassing would not be naïvely implemented by any Java VM used to run any comparison, and as a result the use of subclasses carries no overhead.[8]

4.3.1 Symbol Table Implementation

The new methods required in the implementation of CodeSymbolTable to provide for an operand stack are:

To maintain a program counter, it is necessary to add methods for:

4.3.2 Interpreter State Implementation

The implementation of interpreter state is shared between the CodeSymbolTable class and the Java language itself. Some interpreter state is implicit in the nested method invocations and associated frames of variables within the Java environment, while the remainder is explicitly given, either in the variables used by executing methods, or within explicit repositories provided by the CodeSymbolTable.

4.3.3 Program Tree Implementation

All but the terminal components of an instruction-based program tree have a list method. This list method takes one parameter, the CodeSymbolTable which is associated with that program tree. The CodeSymbolTable contains all the information required to produce a concrete listing for a program tree: all that is necessary is to call the list method for the root CompilationUnit object of that tree with the appropriate CodeSymbolTable object as parameter.

Some components of the program tree are executable, and therefore also have a run method. Again, the run method only takes one parameter, the CodeSymbolTable. The CodeSymbolTable is passed from method to method as the program tree is traversed, and is dynamically updated with program state information, particularly with values of variables, the values on operand stacks and the value of the current program counter.

A run method in the instruction interpreter may have a return type, but it is not appropriate for any member of the Instruction class to have a return type for its run method.

The executable components of an instruction-based program tree (i.e. those with a run method) are:

4.4 Implementation Details

4.4.1 Implementation Of Identifiers

Identifiers and QualifiedIdentifiers are implemented in the pJHI as they are implemented in the pJTI.

Identifiers for all but class and method references are eliminated in the Java VM, but in order to ensure that the two interpretive approaches can be structurally compared, we need to ensure that they are dealing with the same basic entities. For the instruction interpreter to deal with pointers while the chosen implementation for the statement interpreter is forced to look up identifiers in tables would be unacceptable. See §4.4.3 below for a further explanation of this.

4.4.2 Implementation Of Types And Values

The AlgebraicOrStructured class and its subclasses are implemented in the pJHI as they are implemented in the pJTI. This is because the data types in both the statement and instruction purposes need to be the same in order to write equivalent programs.

4.4.3 Implementation Of Symbol Dictionaries

There class dictionary and variable dictionary are implemented in the pJHI as they are implemented in the pJTI.

While the Java VM would use a class dictionary also, the function of a variable dictionary is eliminated through the use of pointers within instructions to reference a specific location in the current method invocation frame. For instance, Java's equivalent of the Instruction_Action_Load instruction looks like:


iload #3


This instruction means "load the integer from index three in the current method invocation frame". There is no reason why a more sophisticated version of the pJTI couldn't use a similar scheme, but for the purposes of the current comparison, it is sufficient to ensure that both interpreters use the same dictionary implementations. For example, one can imagine a statement like "reference the value associated with location three in the current block invocation frame".

4.4.4 Implementation Of Symbol Stacks

The pJHI uses symbol stacks in the same way as the pJTI, and the implementation is also the same.

Strictly speaking, because the instruction intepreter evaluates actual parameters before the time of method call instead of during the time of method call, IcyStacks (§3.4.4) are not needed. However, it is convenient to use the same implementation of symbol stacks in the instruction interpreter, and it removes any possible advantage which might be gained by a different implementation.

4.4.5 Implementation Of The Return Variable

The return variable is used in the same way and implemented in the same way in both the pJTI and pJHI.

4.4.6 Implementation Of Concrete Listing Generation

Once again, there is no change between the use and implementation of concrete listing generation between the pJTI and pJHI. Of course, the list methods list statements and instructions in different ways, but the mechanisms for that listing are the same.

4.4.7 Implementation Of Operand Stack And Program Counter

As well as the methods provided for convenience (§3.4.7), which are implemented in the same way in both interpreters, the pJHI has additional methods for managing the explicit operand stack and program counter:

The stack of operand stacks, and the operand stacks themselves are both implemented using instances of java.util.Stack.

4.4.8 Implementation Of The Compilation Unit

At the level of CompilationUnit, there is no functional change from the implementation in the pJTI to that in the pJHI.

4.4.9 Implementation Of Classes

Again, there is no functional change in Classes from the implementation in the pJTI to that in the pJHI.

4.4.10 Implementation Of Methods

CodeMethods are constructed using CodeMethod(Identifier, CodeBlock) while constructors are constructed using CodeMethod(CodeBlock). Parameters are then added using addParameter(Parameter), with a Parameter having no identifier being identified as the return parameter.

As for Methods, the Parameters of a method are stored in an instance of java.util.Vector.

4.4.11 Implementation Of Blocks

CodeBlocks are constructed using Block(). Declarations are then added using addDeclaration(Declaration). Instructions are added using addInstruction (Instruction).

As was the case for Blocks, the Declarations and Instructions of a block are stored in separate instances of java.util.Vector.

4.4.12 Implementation Of Flow Control Instructions

The construction of a Instruction_Action_Branch, Instruction_Action_Branch_ ConditionalTrue or Instruction_Action_Branch_ConditionalFalse requires only a single parameter indexing the instruction in the current block which is to be branched to.

The implementation of the run method requires only that the condition (if any) be tested, and the branch be taken if the condition has the required value. A straight Instruction_Action_Branch has no condition to test- it branches uncoditionally.

As an example of implementation, consider Instruction_Action_Branch_ ConditionalTrue:


class Instruction_Branch_ConditionalTrue extends 
    Instruction_Branch
{

  public Instruction_Branch_ConditionalTrue

      (int lineNumber)
  {

    super(lineNumber);
  }

  public void run(CodeSymbolTable symbolTable)
  {
    if (((Algebraic_Boolean)symbolTable.pop()) 
         .getValue())
    {
      symbolTable.branch(destination);
    }
  }

  public void list(CodeSymbolTable symbolTable)
  {
    System.out.print("iftrue "+destination);
    symbolTable.newListingLine();
  }
}

Here, destination is the protected instance variable from the parent class Instruction_Action_Branch, which looks like this:


class Instruction_Branch extends Instruction
{

  protected int destination;

  public Instruction_Branch(int lineNumber)
  {
    destination=lineNumber;
  }

  public void run(CodeSymbolTable symbolTable)
  {
    symbolTable.branch(destination);
  }

  public void list(CodeSymbolTable symbolTable)
  {
    System.out.print("goto "+destination);
    symbolTable.newListingLine();
  }
}

4.4.13 Implementation Of Constant Value Generation

A constant value generation instruction is simply implemented by placing the constant value encoded within the instruction on the operand stack. This is achieved using the push method within the augmented CodeSymbolTable class (§4.4.7). Consider Instruction_Action_ Evaluate_Constant_SignedInteger5 as an example:


abstract class Instruction_Action_Evaluate_Constant_ 

    Algebraic extends Instruction_Action_Evaluate_ 
    Constant
{

  protected Algebraic value;

  public Instruction_Action_Evaluate_Constant_ 
      Algebraic(Algebraic newValue)
  {
    value=newValue;
  }

  public void run(CodeSymbolTable symbolTable)
  {
    symbolTable.push((AlgebraicOrStructured)value);
  }
}

class Instruction_Action_Evaluate_Constant_Algebraic_ 

    SignedInteger5 extends Instruction_Action_
    Evaluate_Constant_Algebraic
{

  public Instruction_Action_Evaluate_Constant_ 
      Algebraic_SignedInteger5(Algebraic newValue)
  {
    super(newValue);
  }

  public void list(CodeSymbolTable symbolTable)
  {
    System.out.print("iconst ("+((Algebraic_
                                  SignedInteger5)
                                  value)
                                 .getValue()+")");
    symbolTable.newListingLine();
  }
}

4.4.14 Implementation Of Unary And Binary Operations

The unary and binary operations are implemented by code similar to that for Instruction_Action_Evaluate_Constant (§4.4.13), except that these two operations must also use the CodeSymbolTable's pop operation to obtain their operands.

If we consider the run method for Instruction_Action_Evaluate_BinaryOp_Sum, we will see how such operations are constructed:


  public void run
      (CodeSymbolTable symbolTable)
  {
    symbolTable.push((AlgebraicOrStructured)
                     new Algebraic_SignedInteger5(
                       ((Algebraic_SignedInteger5)
                        symbolTable.pop()).getValue()+
                       ((Algebraic_SignedInteger5)
                        symbolTable.pop()).getValue()
                     )
                    );
  }

Each operation also has a list method which outputs the instruction mnemonic.

4.4.15 Implementation Of Data Transfer Instructions

The data transfer instructions operate on a value from the variable dictionary, either replacing it with a new value taken from the operand stack (Instruction_Action_Store), or placing its value on the operand stack for use in a computation (Instruction_Action_Load). They therefore use the pop and push methods of CodeSymbolTable, used for manipulating the current operand stack.

The implementation of Instruction_Action_Load is:


class Instruction_Action_Load extends Instruction_ 
    Action
{

  protected QualifiedIdentifier identifier;
  ...

  public void run(CodeSymbolTable symbolTable)
  {
    symbolTable.push(symbolTable.getQualifiedValue
                       (identifier));
  }

  ...
}

There is also a destructor for listing a load instruction, list.

The implementation of Instruction_Action_Store is:


class Instruction_Action_Store extends Instruction_ 
    Action
{

  protected QualifiedIdentifier identifier;
  ...

  public void run(CodeSymbolTable symbolTable)
  {

    if (identifier==null) {
      symbolTable.setReturnValue(symbolTable.pop());
    } else {
      symbolTable.setQualifiedValue(identifier,
                                    symbolTable.pop()
                                   );
    }
  }

  ...
}

You will notice that there is provision in the code for the parameter to this instruction being null. This is the special case of the store statement being used to assign the return variable wihin a method, as discussed in §2.15.3.

Instruction_Action_Store also has a destructor for listing, list.

4.4.16 Implementation Of Method Invocation Instructions

Whereas method invocation statements (§3.4.16) needed storage within themselves for the actual parameter expressions, method invocation instructions do not carry such information. The parameters are taken from the operand stack.


class Instruction_Action_Evaluate_Method extends

    Instruction_Action_Evaluate
{

  private QualifiedIdentifier identifier;

  private int actualParameters;
  // Constructor to create an instruction which calls 
  // a specified method

  public Instruction_Action_Evaluate_Method(
    QualifiedIdentifier newIdentifier,
    CodeMethod newMethod)
  {
    identifier=newIdentifier;
    actualParameters=newMethod.getNumberParameters();
  }

  // Destructor which pushes the result of invoking a 
  // method

  public void run(CodeSymbolTable symbolTable)
  {
    AlgebraicOrStructured result;
    ObjectInterface object;
    CodeMethod method;

    // Obtain reference to object and make current
    object=symbolTable.setCurrentObject(
             symbolTable.getParentObject(identifier)
           );

    // Obtain reference to method
    method=(CodeMethod)(object.getParentClass())
                       .getMethod(identifier
                                  .getIdentifier());

    // Set formal parameters for method, popping them 
    // off operand stack

    for (int i=0; i<actualParameters; i++)
      method.setParameter(symbolTable,i,
                          symbolTable.pop());

    // Execute method
    result=method.run(symbolTable);

    // Pop current object
    symbolTable.popCurrentObject();

    // Push result onto operand stack
    symbolTable.push(result);
  }

  // Destructor which writes a method call to the 
  // output stream

  public void list(CodeSymbolTable symbolTable)
  {
    System.out.print("invokevirtual_"+
                     actualParameters+" ("+
                     symbolTable
                     .getQualifiedSymbolName
                     (identifier)+")");
    symbolTable.newListingLine();
  }
}

4.4.17 Implementation Of Class Instance Creation Instructions

Whereas class instance creation statements (§3.4.17) needed storage within themselves for the actual parameter expressions for calling the constructor, class instance creation instructions do not carry such information. The parameters are taken from the operand stack.


class Instruction_Action_Create_Object extends 
    Instruction_Action_Create
{

  private Identifier prototypeClass;

  private int constructorSignature;

  private int constructionParameters;


  // Constructor to make an instruction to create an 
  // object

  public Instruction_Action_Create_Object(
    CodeSymbolTable symbolTable,
    Identifier newPrototypeClass,

    int newConstructorSignature)
  {
    prototypeClass=newPrototypeClass;
    constructorSignature=newConstructorSignature;
    constructionParameters=((symbolTable.getClass
                               (prototypeClass))
                            .getMethodAt
                               (constructorSignature)
                           ).getNumberParameters();
  }

  // Destructor which pushes a new object onto the 
  // operand stack

  public void run(CodeSymbolTable symbolTable)
  {
    ClassInterface prototype;
    ObjectInterface object;
    CodeMethod method;

    // Obtain reference to prototype class
    prototype=symbolTable.getClass(prototypeClass);

    // Create object
    object=symbolTable.setCurrentObject(new 
             Structured_Object(prototype));

    // Obtain reference to method
    method=(CodeMethod)prototype.getMethodAt
             (constructorSignature);

    // Set formal parameters for method

    for (int i=0; i<constructionParameters; i++)
      method.setParameter(symbolTable,i,
                          symbolTable.pop());

    // Execute method
    method.run(symbolTable);

    // Push created object onto operand stack
    symbolTable.push((AlgebraicOrStructured)
                     symbolTable.popCurrentObject());
  }

  // Destructor which writes a method call to the 
  // output stream

  public void list(CodeSymbolTable symbolTable)
  {
    // Write identifier of prototype class
    System.out.print("new_"+constructionParameters+
                     " ("+symbolTable.getSymbolName
                            (prototypeClass)+")");
    symbolTable.newListingLine();
  }
}




[7]  The Java VM obviously must have a symbol table for classes.

[8]  In truth, the final implementation of the pJHI has SymbolTable as a superinterface, TreeSymbolTable and CodeSymbolTable as subinterfaces, and HashingTreeSymbolTable and HashingCodeSymbolTable as subclasses of HashingSymbolTable which implement these respective interfaces. The other classes which are changed are subclassed and subinterfaced in similar ways.



   Previous Chapter     Next Chapter     Top Of Chapter

Exit: Java- Trees Versus Bytes; Kasoft Typesetting; Archer


Java- Trees Versus Bytes is a BComp Honours thesis written by Kade Hansson.

Copyright 1998 Kade "Archer" Hansson; e-mail: archer@dialix.com.au

Last updated: Monday 12th February 2001