Bottom of Chapter     Contents     Previous Chapter     Next Chapter

6
: Trees Compared To Code

In this chapter, we will look at the empirical results obtained by running multiple trials of the same benchmark under the three interpreter implementations. In this way, we will be able to determine which implementation of each benchmark is more efficient, and by examining all these results together, we will hopefully be able to determine which implementation was most successful.

In determining the most efficient implementation, and comparing the differences in efficiency between implementations, we should be able to answer the question posed in §1.4: Are tree representations able to be interpreted faster than code representations? We will also be able to make comments about in what circumstances the difference is widened or narrowed.

All trials were conducted using the same Power Macintosh 6100/60 running MacOS 8 and Metrowerks Java. The trials were conducted while the system was under the same load conditions, and the elapsed times are for the system clock. Shaded trials have not been included in averages.

6.1 Empty Loop Results

Twenty trials were carried out for each of the three implementations. The results were as follows (all times are in seconds):

Trial set 1
Code 7.671 6.172 6.157 6.169 7.373 6.184 6.218 7.05 6.136 6.155
Tree 6.232 6.077 6.076 6.143 6.166 6.106 6.054 6.118 6.035 6.141
Opt. Tree 1.485 1.588 1.45 1.651 1.597 1.432 1.56 1.442 1.54 1.419

Trial set 2
Code 7.892 6.245 6.272 6.677 6.264 6.226 7.503 6.239 6.303 6.258
Tree 6.644 6.428 6.447 6.397 6.575 6.411 6.421 6.516 6.377 6.435
Opt. Tree 2.736 1.723 1.615 1.757 1.783 1.592 1.756 1.705 1.591 1.74

Average of trial set 1 Average of trial set 2 Average overall
Code 6.402 6.443 6.422
Tree 6.102 6.445 6.274
Opt. Tree 1.520 1.696 1.608

↑ Table 6.1 EmptyLoop Results

For EmptyLoop, the stacking tree interpreter is 2.32% faster than the code interpreter, while the optimized tree interpreter is 74.96% faster.

6.2 Deep Empty Loop Results

Twenty trials were carried out for each of the three implementations. The results were as follows (all times are in seconds):

Trial set 1
Code 16.862 16.871 16.979 16.828 16.849 16.911 16.781 16.893 16.832 16.782
Tree 16.785 16.596 16.652 16.797 16.617 16.605 16.656 16.682 16.653 16.653
Opt. Tree 4.867 4.799 4.699 4.794 4.741 4.799 4.772 4.733 4.81 4.728

Trial set 2
Code 16.882 16.733 16.883 16.815 16.796 16.839 16.817 16.909 16.772 16.791
Tree 16.763 16.569 16.651 16.785 16.579 16.65 16.688 16.619 16.645 16.659
Opt. Tree 5.002 4.799 4.81 4.914 4.828 4.94 4.805 4.831 4.881 4.888

Average of trial set 1 Average of trial set 2 Average overall
Code 16.858 16.817 16.838
Tree 16.657 16.649 16.653
Opt. Tree 4.774 4.870 4.822

↑ Table 6.2 DeepEmptyLoop Results

For DeepEmptyLoop, the stacking tree interpreter is 1.10% faster than the code interpreter, while the optimized tree interpreter is 71.36% faster.

6.3 Field Access Results

Twenty trials were carried out for each of the three implementations. The results were as follows (all times are in seconds):

Trial set 1
Code 8.306 8.275 8.283 8.225 8.313 8.251 8.181 8.278 8.276 8.308
Tree 8.182 8.167 8.077 8.033 8.118 8.136 8.134 8.12 8.143 8.109
Opt. Tree 2.994 2.919 2.956 3.045 2.947 3.02 2.996 2.955 2.924 2.98

Trial set 2
Code 8.557 8.49 8.79 8.475 8.483 8.53 8.428 8.506 8.522 8.671
Tree 8.457 8.503 8.555 8.485 8.546 8.399 8.411 8.496 8.529 8.453
Opt. Tree 3.134 3.188 3.085 3.111 3.142 3.089 3.19 3.081 3.187 3.137

Average of trial set 1 Average of trial set 2 Average overall
Code 8.270 8.545 8.407
Tree 8.122 8.483 8.303
Opt. Tree 2.974 3.134 3.054

↑ Table 6.3 FieldAccess Results

For FieldAccess, the stacking tree interpreter is 1.25% faster than the code interpreter, while the optimized tree interpreter is 63.67% faster.

6.4 Method Call Results

Twenty trials were carried out for each of the three implementations. The results were as follows (all times are in seconds):

Trial set 1
Code 11.088 12.335 10.992 12.011 11.038 11.179 11.058 11.207 11.086 11.152
Tree 11.485 11.342 11.501 11.316 11.434 11.304 11.433 11.447 11.402 11.37
Opt. Tree 4.867 4.818 4.807 4.781 4.84 4.898 4.77 4.782 4.747 4.847

Trial set 2
Code 11.336 12.084 12.163 11.297 11.299 12.081 11.982 11.402 11.444 11.336
Tree 11.531 11.312 11.498 11.379 11.434 11.322 11.515 11.449 11.46 11.299
Opt. Tree 4.896 4.728 4.761 4.74 4.83 4.819 4.741 4.745 4.712 4.858

Average of trial set 1 Average of trial set 2 Average overall
Code 11.315 11.642 11.479
Tree 11.403 11.420 11.412
Opt. Tree 4.816 4.783 4.799

↑ Table 6.4 MethodCall Results

For MethodCall, the stacking tree interpreter is 0.58% faster than the code interpreter, while the optimized tree interpreter is 58.19% faster.

6.5 Often True Results

Twenty trials were carried out for each of the three implementations. The results were as follows (all times are in seconds):

Trial set 1
Code 14.797 14.688 14.853 14.731 14.798 14.677 14.824 14.758 14.727 14.797
Tree 14.696 14.559 14.656 14.632 14.527 14.636 14.636 14.542 14.577 14.593
Opt. Tree 3.581 3.56 3.481 3.62 3.507 3.56 3.568 3.505 3.595 3.479

Trial set 2
Code 14.796 14.605 14.779 14.669 14.791 14.685 14.798 14.726 14.811 14.815
Tree 14.669 14.527 14.689 14.586 14.526 14.617 14.669 14.483 14.595 14.64
Opt. Tree 3.536 3.566 3.462 3.621 3.517 3.474 3.5 3.491 3.577 3.461

Average of trial set 1 Average of trial set 2 Average overall
Code 14.765 14.748 14.746
Tree 14.605 14.600 14.603
Opt. Tree 3.546 3.521 3.533

↑ Table 6.5 OftenTrue Results

For OftenTrue, the stacking tree interpreter is 1.04% faster than the code interpreter, while the optimized tree interpreter is 76.06% faster.

6.6 Fibonacci Results

Twenty trials were carried out for each of the three implementations. The results were as follows (all times are in seconds):

Trial set 1
Code 1.601 1.376 1.47 1.356 1.59 1.475 1.364 1.547 1.39 1.462
Tree 1.559 1.485 1.324 1.428 1.4 1.46 1.32 1.425 1.331 1.509
Opt. Tree 0.83 0.917 0.737 0.74 0.836 0.743 0.735 0.878 0.805 0.731

Trial set 2
Code 1.396 1.558 1.522 1.38 1.586 1.385 1.55 1.406 1.51 1.529
Tree 1.593 1.38 1.497 1.517 1.449 1.484 1.374 1.523 1.384 1.488
Opt. Tree 0.766 0.985 0.746 0.751 0.756 0.958 0.75 0.753 0.94 0.746

Average of trial set 1 Average of trial set 2 Average overall
Code 1.463 1.482 1.473
Tree 1.424 1.469 1.447
Opt. Tree 0.795 0.815 0.805

↑ Table 6.6 Fibonacci Results

For Fibonacci, the stacking tree interpreter is 1.78% faster than the code interpreter, while the optimized tree interpreter is 45.33% faster.

6.7 Summary

As expected, EmptyLoop showed the greatest speed up, both in the best-case and worst-case tree interpreters. This indicates that statement-based interpreters do derive an advantage from not having to explicitly execute branch instructions.

OftenTrue was the best performer in terms of potential speed-up, and this could be attributed to its heavy use of the operand stack. This would advantage the optimized tree interpreter significantly, but would have no effect at all on the stacking tree interpreter (as observed in the results).

DeepEmptyLoop showed a high potential speed up also, but in the stacking tree interpreter this was lost. The results for this are diluted by the sheer amount of data transfer going on relative to the amount of loop mechanics.

Clearly loops benefit most from a tree interpreter, as expected. Even the "real world" example of Fibonacci showed a 45% speed up in the best case, and a little shy of 2% speed up in the worst case.


   Previous Chapter     Next Chapter     Top of Chapter