Cleaning Up My Code
My initial implementation of the heuristic I described can be seen here. This code isn't very clear at all. My initial instinct was to improve this by adding better comments. However, I recently watched a talk which was based on a book called Clean Code which made a very interesting point:
Every Comment is a Failure (This isn't quite a hard and fast rule - comments are sometimes appropriate to express high level design decisions)
In light of this I decided to avoid comments and instead focus on improving the names of types, names of variables and names of functions. Here are some of the examples of changes which I made:
List<Instruction> => Sequence List<List<Instruction>> => Cluster List<List<Instruction>> cluster = findCluster(clusters, p); => Cluster cluster = findClusterContainingInsn(clusters, p);The full diff can be seen here. I think the result is much easier to read but I still need to improve or ideally strip out some of my existing comments. In some places I am still trying to use a comment to describe the following block of code which suggests I should actually be extracting out more methods.
Improving Finally Block Dedup Heuristic
My initial heuristic wasn't quite good enough. Consider the following simple Java code:
try { System.out.println("A"); } catch (RuntimeException ex) { System.out.println("B"); } finally { System.out.println("C"); }This compiles to the following bytecode (with catch blocks and coverage probes annotated).
0 getstatic #16 <java/lang/System.out> 3 ldc #22 <A> 5 invokevirtual #24 <java/io/PrintStream.println> 8 goto 42 (+34) catch (RuntimeException) => 11 11 astore_1 12 getstatic #16 <java/lang/System.out> 15 ldc #30 <B> 17 invokevirtual #24 <java/io/PrintStream.println> catch (Anything) => 31 20 getstatic #16 <java/lang/System.out> 23 ldc #32 <C> 25 invokevirtual #24 <java/io/PrintStream.println> coverageProbe[0] = true 28 goto 50 (+22) 31 astore_2 32 getstatic #16 <java/lang/System.out> 35 ldc #32 <C> 37 invokevirtual #24 <java/io/PrintStream.println> 40 aload_2 coverageProbe[1] = true 41 athrow 42 getstatic #16 <java/lang/System.out> 45 ldc #32 <C> 47 invokevirtual #24 <java/io/PrintStream.println> coverageProbe[2] = true 50 returnAssuming no exception was thrown coverageProbe[2] would be hit naturally. The finally block dedup heuristic would also mark coverageProbe[1] and coverageProbe[0] as hit. Working backwards from these probes would lead the RuntimeException block to be marked as covered even though no RuntimeException has been thrown.
My solution is to treat copied probes differently. For these probes coverage is only propagated up until a line where the number of Sequence duplicates changes. This prevents coverage from a finally block leaking outside the finally block. This appears to achieve the desired result.