Call Graph Visualisation with AspectJ and Dot
27 September 2008
Posted in Coding, Visualisation
One of my favourite tools to render graphs is GraphViz Dot and in an earlier entry I described how to use it to visualise Spring contexts. Today I want to showcase a different application.
Call graphs show how methods call each other, which can be useful for a variety of reasons. The example I use here is the graph rooted in a unit test suite, and in this case the graph gives an understanding of how localised the unit tests are, how much they are real unit tests or how close they are to mini-integration tests. In an ideal case the test method should call the method under test and nothing else. However, even with mock objects that's not always practical. And if, like myself, you fall into the classicist camp of unit testers, as described by Martin Fowler in Mocks aren't Stubs, you might actually not be too fussed about a few objects being involved in a single test. In either case, looking at the call graph shows you exactly which methods are covered by which unit tests.
There are several ways to generate calls graphs and I'm opting for dynamic analysis, which simply records the call graph while the code is being executed. A good theoretical reason is that dynamic analysis can handle polymorphism but a more practical reason is that it's actually really easy to do dynamic analysis; provided you use the right tools. The approach I describe in this article uses Eclipse AJDT to run the unit tests with a simple Java aspect that records the call graph and writes it out into a format that can be rendered more or less directly with Dot. Of course, this technique is not limited to creating graphs for unit test; it only depends on weaving an AspectJ aspect into a Java application.
Let's start with the result. The following diagram shows the call graph for a few simple unit tests in the test suite for the CruiseControl dashboard. The tests methods are in the leftmost column and method invocations occur from left to right. It's clearly visible that the tests are quite localised.
The next example shows a section that's a bit more intertwined. Some tests exercise the same method, some tests call more than one method, and some methods are called indirectly through different paths. All in all pretty reasonable, though.
Now, I don't want to showcase only the good parts. There are also sections in the graph that show how some of the test and dependencies are, well, a bit messy. In fairness, though, this is partly caused by some domain objets that are used across multiple tests.
How hard can it be to create these graphs? Not very, as it turns out.
Step 1 is to create an aspect that intercepts all interesting method calls, which normally means calls to methods that are part of the project. It's also possible, though, to include calls to libraries and frameworks. Depending on your experience with AspectJ this is as simple as follows:
The first part of the aspect is the definition of the pointcut: it matches all public methods in "com.example" and sub-packages but excludes calls to the two classes that form part of the call graph logger itself.
The before advice, which is run before each method that matches the pointcut, places the signature of the current method on a stack maintained by the graph logger class. With the current method signature as the topmost element and the one that called it as the second from the top, the advice asks the logger to log the corresponding call. The after advice simply pops the topmost call from the stack.
This is almost all the excitement. The implementation of
CallLogger deals with maintaining the stack and writing out the call in the format required by Dot. The complete implementation with some comments follows below.
An instance is made available as a public singleton. The main reason for using a singleton, rather than creating an instance in the aspect, is that if coherent logging is required across multiple projects in Eclipse it's easiest to add a copy of the aspect to each project and have all aspects use the same logger instance.
The fields hold the actual stack of method names, a set that contains all calls that have been logged already, and a writer to which the actual output is written. The set is a simple optimisation to prevent the same call from being written over and over again.
If anyone knows a better way to turn the AspectJ signature into a pretty string, please leave a comment.
Following a guard statement that protects the code from trying to log a call when only one method is on the stack, the code creates a description of the call in the format required by Dot for a graph edge, ie. the origin node in double quotes, followed by a stylised arrow, followed by the target node, again in double quotes.
If this call is not in the call log, the code writes it to the writer and then stores the call in the log so that it does not get written again.
Running some code, the unit tests in this example, will now create a file named "calls.txt" that contains all calls to methods that match the pointcut description in the aspect. The calls are written in Dot format but to make the output a valid input file for Dot a header and a closing bracket at the end are required.
It would be easy to write the required header when the writer is initialised but realistically it's not possible to know when
logCall is called for the last time, and thus it's not possible to write the closing bracket from the Java code. Therefore...
Step 2 is a small shell script, or a script in your favourite scripting language, that wraps the required header and footer around the raw graph edges.
I have an admission to make: This script is simplified and would not create the output I've shown further up. With a graph definition like this Dot would not put all unit tests nodes into the leftmost column but instead it would move them just left of the method they call. If, for example, test 1 called method A, which in turn called method B, and test 2 called method B directly, then test 1 would be in the first column, method A and test 2 in the second column and method B in the third column. The following UNIX command line wizardry, when added just below the
cat calls.txt line in the script above, solves this problem. The explanation is left as an exercise to the reader...
I hope this article shows how two simple but powerful tools combined make it easy to create a really useful visualisation; and how clever but cryptic UNIX command lines can be.