Finally Blocks
It turns out there is no explicit support for this Java language
feature so in bytecode it's use has unfortunate consequences. Consider the
following Java code:
try { try { System.out.println("A"); } catch (RuntimeException ex) { System.out.println("B"); } finally { System.out.println("C"); } } catch (RuntimeException ex) { System.out.println("D"); } finally { System.out.println("E"); }
This will end up producing the following bytecode (annotated with line numbers and exception handlers):
try{ try{ try{ try{ L3: 0 getstatic <java/lang/System.out> 3 ldc <A> 5 invokevirtual <java/io/PrintStream.println> 8 goto 42 (+34) } catch (RuntimeException) => 11 L4: 11 astore_1 L5: 12 getstatic <java/lang/System.out> 15 ldc <B> 17 invokevirtual <java/io/PrintStream.println> } catch (Anything) => 31 L7: 20 getstatic <java/lang/System.out> 23 ldc <C> 25 invokevirtual <java/io/PrintStream.println> 28 goto 84 (+56) L6: 31 astore_2 L7: 32 getstatic <java/lang/System.out> 35 ldc <C> 37 invokevirtual <java/io/PrintStream.println> L8: 40 aload_2 41 athrow L7: 42 getstatic <java/lang/System.out> 45 ldc <C> 47 invokevirtual <java/io/PrintStream.println> L8: 50 goto 84 (+34) } catch (RuntimeException) => 53 L9: 53 astore_1 L10: 54 getstatic <java/lang/System.out> 57 ldc <D> 59 invokevirtual <java/io/PrintStream.println> L12: 62 getstatic <java/lang/System.out> } catch (Anything) => 73 65 ldc <E> 67 invokevirtual <java/io/PrintStream.println> 70 goto 92 (+22) L11: 73 astore_3 L12: 74 getstatic <java/lang/System.out> 77 ldc <E> 79 invokevirtual <java/io/PrintStream.println> L13: 82 aload_3 83 athrow L12: 84 getstatic <java/lang/System.out> 87 ldc <E> 89 invokevirtual <java/io/PrintStream.println> L14: 92 return
From
a code coverage point of view this is a complete nightmare. For any
branch within a finally block the number of actual branches in the
bytecode is multiplied by the number of times that the finally block
is duplicated (at least two).
Technically this
is correct as to cover all the possible paths through the bytecode you
must consider all of the possible exception paths. However, in practice
the developer is likely only concerned with ensuring that all the
branches in their finally blocks are covered at least once and don't
want to worry about the possible paths into the finally block.
Implementing
the filtering required to deduplicate coverage of finally blocks is
made tricky by the fact that the start/end of the duplicate blocks are not explicitly
annotated in any way in the bytecode. The closest I could get was to
write a heuristic which works during the analysis of a method:
- When a new line number marker is seen a new list of instructions is started.
- Instructions are added to this list as they are seen. The instruction arguments are ignored.
- After all the instructions in a method have been visited:
- For each line number which was seen:
- The instruction sequences seen with that line number are "clustered". Sequences which start with the same instruction are grouped together and assumed to be copies caused by the line being within a finally block.
- For each covered probe:
- Find the instruction sequence which contains the probe.
- Mark the corresponding probe in every other sequence in the cluster as covered.
- Run the existing probe coverage propagation.
- For each cluster:
- If the last instruction of any sequence in the cluster is covered, mark all other sequences in the cluster as covered.
- Disable coverage reporting for all but one sequence.
int i = 0; for (i++; i < 10; i++) { System.out.println("A"); }
This produces the following bytecode:
L1: 0 iconst_0 1 istore_1 L2: 2 iinc 1 by 1 5 goto 19 (+14) L3: 8 getstatic #16 <java/lang/System.out> 11 ldc #22 <A> 13 invokevirtual #24 <java/io/PrintStream.println> L2: 16 iinc 1 by 1 19 iload_1 20 bipush 10 22 if_icmplt 8 (-14) L5: 25 return
There are two blocks of code tagged as being on line 2 and they both start with the same instruction. With my heuristic in place one of these blocks will be ignored as they are assumed to be duplicates. This is a shame but in practice I have yet to think of an example of realistic code which would be matched by my heuristic.
Conclusions
I'm pleased that my heuristic works as well as it does but I would love to come up with a better mechanism. If you have any suggestions I would love to hear them!