Monday, 14 January 2013

Improving Finally Block Handling

In my last post I described a heuristic for de-duplicating finally blocks within Java bytecode. Since then I have done some more work on this which is worth writing up.

Cleaning Up My Code

My initial implementation of the heuristic I described can be seen here. This code isn't very clear at all. My initial instinct was to improve this by adding better comments. However, I recently watched a talk which was based on a book called Clean Code which made a very interesting point:

Every Comment is a Failure (This isn't quite a hard and fast rule - comments are sometimes appropriate to express high level design decisions)

In light of this I decided to avoid comments and instead focus on improving the names of types, names of variables and names of functions. Here are some of the examples of changes which I made:
List<Instruction> => Sequence
List<List<Instruction>> => Cluster
List<List<Instruction>> cluster = findCluster(clusters, p); 
=> Cluster cluster = findClusterContainingInsn(clusters, p);
The full diff can be seen here. I think the result is much easier to read but I still need to improve or ideally strip out some of my existing comments. In some places I am still trying to use a comment to describe the following block of code which suggests I should actually be extracting out more methods.

Improving Finally Block Dedup Heuristic

My initial heuristic wasn't quite good enough. Consider the following simple Java code:
    try {
      System.out.println("A");
    } catch (RuntimeException ex) {
      System.out.println("B");
    } finally {
      System.out.println("C");
    }
This compiles to the following bytecode (with catch blocks and coverage probes annotated).
 0 getstatic #16 <java/lang/System.out>
 3 ldc #22 <A>
 5 invokevirtual #24 <java/io/PrintStream.println>
 8 goto 42 (+34)

catch (RuntimeException) => 11

11 astore_1
12 getstatic #16 <java/lang/System.out>
15 ldc #30 <B>
17 invokevirtual #24 <java/io/PrintStream.println>

catch (Anything) => 31

20 getstatic #16 <java/lang/System.out>
23 ldc #32 <C>
25 invokevirtual #24 <java/io/PrintStream.println>

coverageProbe[0] = true

28 goto 50 (+22)

31 astore_2
32 getstatic #16 <java/lang/System.out>
35 ldc #32 <C>
37 invokevirtual #24 <java/io/PrintStream.println>
40 aload_2

coverageProbe[1] = true

41 athrow

42 getstatic #16 <java/lang/System.out>
45 ldc #32 <C>
47 invokevirtual #24 <java/io/PrintStream.println>

coverageProbe[2] = true

50 return
Assuming no exception was thrown coverageProbe[2] would be hit naturally. The finally block dedup heuristic would also mark coverageProbe[1] and coverageProbe[0] as hit. Working backwards from these probes would lead the RuntimeException block to be marked as covered even though no RuntimeException has been thrown.

My solution is to treat copied probes differently. For these probes coverage is only propagated up until a line where the number of Sequence duplicates changes. This prevents coverage from a finally block leaking outside the finally block. This appears to achieve the desired result.