The next instrumentation tool which I wrote (also started in 2010) was motivated by a session using the excellent VisualVM memory profiling to investigate why an application was performing profligate memory allocation. The results in VisualVM are presented as a tree of allocation stack traces.
This was frustrating in some cases. For example a string constructed from several parts results in stack traces through StringBuilder.append() as well as through StringBuilder.toString(). In practice I would prefer to see both of these allocations presented together as the methods are called from the same site. I also wanted to get a summary of the lifetime of the objects being allocated grouped together into buckets. This is something which VisualVM doesn't offer at all - the closest thing available is an indication of the number of GCs which an object has survived.
I managed both of these by using the JVM TI API to write an agent that added an InMemProfiler method call into the java.lang.Object constructor and after every array allocation (as these don't use the Object constructor). This instrumentation is done with the native JVM TI API.
When profiling object lifetimes the profiler code creates a Java weak reference for each allocated object to detect when it has been collected. The tracking of collection could be done much more efficiently using the native JVM TI object tagging mechanism but I wanted to do as much as possible using Java.