Evaluation of aspa patches for the core JRE 7 runtime library

Eduardo R. B. Marques, LaSIGE/FCUL, 2013
http://www.dcc.fc.up.pt/~edrdo/aspa

Last updated: April 10, 2013 (work in progress)

1  Test set

As test subject we used the rt.jar archive that is bundled with Oracle’s Java Runtime Environment (JRE) for Java 7. The archive contains the core JRE runtime library, including packages such as java.lang and java.util, among other well-known standard Java packages.

We downloaded all JRE 7 versions, that is, the initial JRE 7 release plus all subsequent updates available from Oracle’s J2SE homepage as of Apr. 4, 2013.

The rt.jar archive contains only JVM class files, except for the JAR manifest. Its size is reasonably stable across the various versions: a mean size of 52.8 MB with a standard deviation of just 102 KB.

2  Test methodology

2.1  aspa patches

We derived aspa patches for rt.jar using the jardiff.sh utility script included in aspa-0.2 for each pair of successive JRE 7 releases.

2.2  bsdiff, xdelta and git patches

For comparison, we also derived binary difference patches generated by bsdiff, xdelta and git.

2.3  Patch input preparation

JAR files are in fact ZIP files. However, rt.jar and other JAR files are included in the JRE distribution with compression disabled. This means that class files are stored in plain JVM format inside the JAR file, but some ZIP format dependencies remain (entry headers, CRC, etc.).
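The claim that entries are stored uncompressed can be checked with a short script. The following is a minimal sketch using Python's standard zipfile module; the function name all_entries_stored is hypothetical and the check is not part of aspa.

```python
import zipfile

def all_entries_stored(jar_path):
    """Return True if every entry in the JAR is stored without compression."""
    with zipfile.ZipFile(jar_path) as jar:
        # ZIP_STORED is the ZIP method for uncompressed entries.
        return all(info.compress_type == zipfile.ZIP_STORED
                   for info in jar.infolist())
```

Calling all_entries_stored("rt.jar") on an archive such as the one described above would be expected to return True, assuming the file is in the current directory.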

The ZIP format data and the order of ZIP entries (class files) inside the JAR do not affect aspa, but they do affect the binary difference tools, which handle byte streams sequentially. For this reason we fed bsdiff, xdelta and git a file containing only the concatenation of all JVM class files inside rt.jar, in package-class lexicographical name ordering.
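The input preparation step can be sketched as follows. This is an illustration only, not the script used in the evaluation: the function name concat_class_files is hypothetical, and sorting full entry names by code point is assumed to approximate the package-class ordering.

```python
import zipfile

def concat_class_files(jar_path):
    """Return the bytes of all .class entries, concatenated in name order."""
    with zipfile.ZipFile(jar_path) as jar:
        # Sorting by entry name removes any dependency on ZIP entry order.
        names = sorted(n for n in jar.namelist() if n.endswith(".class"))
        # jar.read() yields only the entry payload, without ZIP headers or CRCs;
        # non-class entries such as the manifest are skipped.
        return b"".join(jar.read(n) for n in names)
```

The resulting byte string would then be written to a file and passed to each binary difference tool as its input.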

In this manner, we compare all tools fairly, in that patch derivation depends strictly on the JVM file format, and we minimize the advantage aspa gains from its insensitivity to ZIP entry order (the order of classes inside the JAR file).

2.4  Use of bzip2 compression

After derivation, all aspa, xdelta, and git patches were compressed using the bzip2 format with the maximum compression level (9) enabled. bsdiff already uses the bzip2 library internally in this manner, hence the compression step was skipped for it. Note that xdelta has an internal compression scheme that was disabled (-0 option), and that aspa and git use no built-in compression.
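For illustration, the compression step corresponds to something like the sketch below, which uses Python's standard bz2 module. The function name compress_patch is hypothetical; the evaluation itself may have used the bzip2 command-line tool instead.

```python
import bz2

def compress_patch(patch_bytes):
    """Compress a patch with bzip2 at the maximum compression level."""
    # compresslevel=9 is the highest level bzip2 supports.
    return bz2.compress(patch_bytes, compresslevel=9)
```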

3  Results

3.1  Summary evolution of rt.jar

Using aspa we were able to obtain a report of patched, added and deleted JVM classes in each successive rt.jar JRE 7 update versus the preceding version. The characterization of changes at this level of granularity is shown in Figure 1.


Figure 1: Changed, added and deleted class files
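A class-level change report of this kind can be approximated independently of aspa. The sketch below is an assumption-laden illustration, not aspa's method: it treats a class as changed when the SHA-256 digest of its bytes differs between two JARs, and the function name class_changes is hypothetical.

```python
import hashlib
import zipfile

def class_changes(old_jar, new_jar):
    """Return sorted lists (changed, added, deleted) of .class entry names."""
    def digests(path):
        with zipfile.ZipFile(path) as jar:
            return {n: hashlib.sha256(jar.read(n)).digest()
                    for n in jar.namelist() if n.endswith(".class")}
    old, new = digests(old_jar), digests(new_jar)
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    # A class present in both JARs counts as changed if its bytes differ.
    changed = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return changed, added, deleted
```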

3.2  Patch sizes

Figure 2 shows the size of patches for each of the JRE versions and tools considered. This information is complemented by Figure 3, where we explicitly depict the size ratio of bsdiff, xdelta and git patches versus the corresponding aspa patches.


Figure 2: Patch size


Figure 3: Patch size ratio vs. aspa

4  Evaluation

4.1  Patch size

The general conclusion is that aspa patches are significantly smaller than the patches generated by all the binary difference tools we considered.

bsdiff is the tool that compares most favorably with aspa. Still, bsdiff patches were at least 1.3 times larger than aspa patches (the minimum, for JRE updates u03 and u15). The maximum patch size ratio vs. aspa was 2.2 (for update u05), and the mean ratio was 1.6 with a standard deviation of 0.26.

xdelta and git clearly fared worse than bsdiff. The mean patch size ratio vs. aspa was 3.9 for xdelta (std. dev. 1.31) and 7.7 for git (std. dev. 2.55).
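The ratio statistics quoted in this section can be computed as in the following sketch. The function name ratio_stats is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def ratio_stats(tool_sizes, aspa_sizes):
    """Return (min, max, mean, std. dev.) of per-version patch size ratios."""
    # One ratio per JRE version transition: tool patch size / aspa patch size.
    ratios = [t / a for t, a in zip(tool_sizes, aspa_sizes)]
    # stdev() is the sample standard deviation (assumed, see above).
    return min(ratios), max(ratios), mean(ratios), stdev(ratios)
```

Both arguments are lists of patch sizes in the same unit, one entry per version transition and in the same order.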

4.2  Correlation with class changes

We also derived, for each tool, the correlation coefficient between the patch sizes and the update sizes, where the update size is expressed simply as the sum of changed, added, and deleted files for each version of rt.jar (see Figure 1).

The correlation coefficients were 0.94 for aspa, 0.90 for bsdiff, and 0.88 for both xdelta and git. A value of 1 would express the maximum possible correlation. Despite the coarse granularity of this measure, this suggests (as expected) that aspa patch sizes correlate more directly with the change set in the JRE 7 version transitions.
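As an illustration of this computation, the sketch below derives a correlation coefficient between two samples. It assumes the Pearson coefficient, which the text does not state explicitly, and the function name pearson is hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)
```

Here xs would hold the patch sizes of one tool and ys the corresponding update sizes (changed plus added plus deleted class files), one entry per version transition.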
