We all like metrics, statistics, and measurements that let us know how we’re doing. Released about a week ago, the 2009 Standish CHAOS Report should have plenty of numbers. Jim Johnson, the guy behind it all, cites some numbers pulled from the report. It appears failure is on the rise. In fact, it’s the highest failure rate in over a decade.
I’ve always liked the Standish report. It’s packed full of interesting, high-impact statistics that can really help drive a point home - especially when you’re on stage presenting. But you also have to be careful about reading too deeply into those numbers. It’s very easy to metriculate, and some have challenged the accuracy of the numbers within prior reports.
Filed Under Agile, Development, Metrics | Leave a Comment
It’s very common for teams to use averages to communicate a measurement. The average number of lines of code per method or the average number of features delivered per iteration are simple examples. Most of the time, the average represents the mean, which is the arithmetic average of all samples from a chosen population. But average can also be used to refer to the median (the middle value in a series) and mode (the most frequently occurring value in a series). This can be misleading, and the result is metriculation.
Software development teams often use cyclomatic complexity (CCN) to measure the complexity of their code. Using a well-chosen average, it’s easy to misinterpret, or miscommunicate, the results. Let’s say we want to calculate the average CCN for a system. Consider seven methods with the following CCN:
Method 1: 3
Method 2: 3
Method 3: 3
Method 4: 12
Method 5: 120
Method 6: 85
Method 7: 15
Using this sample, the mean CCN is about 34, the median is 12, and the mode is 3. While average CCN typically uses the mean, in this situation the mean provides a false positive indicating poor code quality, because two outliers (120 and 85) pull it well above the typical method. Given a different sample, the results could be skewed in the opposite direction - providing a false positive indicating high quality.
Combining mean with median and mode may serve as a warning indicator. With a mode of 3 and a mean of 34, we might suspect a wide range of values. Another way to determine if the mean is an accurate representation of the sample is to calculate the standard deviation or variance. These represent how spread out a distribution is. In this case, the sample standard deviation is almost 48, a number way too high for the mean CCN to provide an accurate representation of code quality.
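The figures above are easy to check. Here is a minimal sketch using Python's standard `statistics` module; the variable names are my own, and it assumes the "almost 48" figure is the sample standard deviation (n - 1 in the denominator), which is what the numbers work out to.

```python
import statistics

# CCN values for the seven methods listed above
ccn = [3, 3, 3, 12, 120, 85, 15]

mean = statistics.mean(ccn)      # arithmetic average of all samples
median = statistics.median(ccn)  # middle value of the sorted series
mode = statistics.mode(ccn)      # most frequently occurring value
stdev = statistics.stdev(ccn)    # sample standard deviation (n - 1 denominator)

print(f"mean={mean:.1f} median={median} mode={mode} stdev={stdev:.1f}")
# prints: mean=34.4 median=12 mode=3 stdev=47.8
```

A standard deviation larger than the mean itself is a quick hint that a single "average" is hiding a very uneven distribution.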
The point here is that we have to be careful with how we use metrics as a measurement. Sometimes, additional analysis is required before we make a decision in how to proceed. While this example uses CCN, it would be easy to imagine other examples where this form of metriculation - the well-chosen average - might take place.
Metriculation is derived from statisticulation, a term introduced in How to Lie with Statistics.
Filed Under Agile, Development, Metrics | Leave a Comment
Metrics can be used to garner a lot of feedback that’s valuable to the software development team. And they can also be used as a convincing argument to push an agenda. You have to be careful that metrics are used legitimately, and avoid metriculation. Metriculation is a term I use to describe how metrics can lie. One form of metriculation is the faulty assumption.
A faulty assumption occurs when we draw some conclusion based on the occurrence of simultaneous events, but have no substantive evidence correlating the events. Faulty assumptions are based on the logical fallacy of correlation proves causation. In software development, we have to be very careful not to make decisions based on faulty assumptions. Let’s take a couple of simple examples.
On any software development team, the number of developers who eat lunch while working at their desks may be related to the number of developers who get stuck in the elevator. The conclusion then is that you should not eat lunch at your desk if you want to avoid getting stuck in the elevator. Even the less than astute individual recognizes the absurdity of that conclusion.
But what about this more plausible scenario? The number of developers using Fancy Framework X is related to the number of developers with fewer defects in their code. Our conclusion now is that if we want to reduce software defects, then all developers should use Fancy Framework X. Possibly. But is there a causal relationship between using Fancy Framework X and code with fewer defects? Or have we falsely assumed that there is?
Proving causal relationships is difficult, if not impossible. The slightest possibility that there may not be a causal relationship means the conclusion can always be challenged. Proving causality requires statistical analysis, which may not always be feasible. Sometimes the best we can do is apply inductive reasoning to arrive at our conclusion based on the likelihood of causality. The moral here is that we should not be awestruck by metrics. Nor should we be naive. Instead, leveraging metrics requires critical thinking.
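To make the Fancy Framework X scenario concrete, here is a small sketch with entirely invented numbers. It assumes a hidden third factor, developer experience, that influences both who adopts the framework and how many defects they produce. The records, the `avg_defects` helper, and the experience labels are all hypothetical and only illustrate how a correlation can appear without any causal link.

```python
from statistics import mean

# Hypothetical, hand-made records: (experience, uses_framework, defects).
# The numbers are invented purely to illustrate a confounding factor.
developers = [
    ("senior", True, 2), ("senior", True, 2), ("senior", True, 2),
    ("senior", False, 2),
    ("junior", True, 8),
    ("junior", False, 8), ("junior", False, 8), ("junior", False, 8),
]

def avg_defects(records, uses_framework, experience=None):
    """Mean defect count for a group, optionally limited to one experience level."""
    return mean(
        defects
        for exp, fw, defects in records
        if fw == uses_framework and (experience is None or exp == experience)
    )

# Across the whole team, framework users appear to write better code.
overall_with = avg_defects(developers, True)
overall_without = avg_defects(developers, False)
print("overall:", overall_with, "vs", overall_without)

# Within each experience level, the framework makes no difference at all.
for level in ("senior", "junior"):
    print(level, avg_defects(developers, True, level),
          avg_defects(developers, False, level))
```

In this made-up team, framework users average 3.5 defects against 6.5 for everyone else, yet seniors and juniors each produce the same number of defects with or without the framework. The apparent benefit comes only from the fact that seniors adopted it more often.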
Filed Under Architecture & Design, Java, Metrics, OSGi, Platforms | Leave a Comment
I took the liberty of running JarAnalyzer on the OSGi bundles deployed as part of Spring 2.5.6. These are the JAR files found in the /dist/modules directory. Click the image at right to reveal the relationships between JAR files.
It’s interesting to see the dependency relationships and layering of the framework. Note that there are no cycles. I have always felt a significant advantage of Spring is the way development teams can incrementally adopt the framework. Start using core for basic dependency injection and move up the stack to JDBC and declarative transactions, ORM integration, and integration with your favorite web framework. It’s the flexible architecture of Spring that allows this. I’m guessing that when the Spring team went about modularizing the framework around OSGi, the architectural flexibility already embodied in previous versions of Spring made their job easier.
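For anyone curious what "no cycles" means mechanically, here is a rough sketch of a cycle check over a JAR dependency graph, using a depth-first search. This is not how JarAnalyzer itself is implemented (I'm not showing its internals here), and the JAR names and dependencies below are hypothetical placeholders, not the actual Spring 2.5.6 bundle graph.

```python
def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GREY:  # back edge: dep is already on the current path
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

# Hypothetical layered graph: each JAR maps to the JARs it depends on.
layered = {
    "web.jar": ["orm.jar", "core.jar"],
    "orm.jar": ["jdbc.jar"],
    "jdbc.jar": ["core.jar"],
    "core.jar": [],
}
# Same graph with one bad dependency from the bottom layer back to the top.
tangled = dict(layered, **{"core.jar": ["web.jar"]})

print(find_cycle(layered))  # None
print(find_cycle(tangled))
```

A layered, acyclic graph like the first one is what makes incremental adoption possible: you can take the bottom layer on its own without dragging in everything above it.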
For those of you who want the more detailed information, you can view the JarAnalyzer html report for Spring 2.5.6 showing a variety of metrics related to design quality.
I just got off the phone with Capers Jones, founder of SPR. We had a great conversation on software metrics, and he definitely gave me some interesting bits to chew on. One aspect of metrics that I’ve been particularly interested in is how IT can use metrics to show their value to the business. He was pretty crisp in his response when he stated that it isn’t IT’s job to measure their value to the business.
This sent me swirling for a moment, and I’m sure the depth of the conversation that followed shortly thereafter was lost on me. His point was that IT doesn’t really know how to measure their value. Instead, it’s the business people who understand what value they hope to get from the software, so it must be the business people who measure that value. I think Capers’ point is this. If I spend $100 on a product, it’s my responsibility to ensure I’ve gotten $100 worth of value out of that product. The company I purchased the product from has no way of measuring the value I received from the product.
But I’ll take that one step further. The company does have the ability to measure the perceived value, and there are a lot of ways to do that. They can follow up with me directly to obtain qualitative measurements on my satisfaction with the product. They can also monitor various sales channels to obtain quantitative measurements. They can combine this qualitative and quantitative data to create meaningful measurements that allow them to gauge the value of their product to the consumer. And really, as value goes up, so too do sales.
This translates to IT. IT does have the ability to gather qualitative data from their customers. They also have the ability to obtain loads of quantitative information. The key is that IT cannot measure their value to the business without working closely with their business partners who will provide them with the information they need to make those measurements.
Capers went on to say that IT must be able to demonstrate competency in the products and services they deliver, and that there is business value in that. Examples included showing that my productivity rates exceed those of my outsourcing counterparts, that I’m able to build software with fewer defects, or that I can deliver software that isn’t vulnerable to security breaches.
I sum it up like this. If I’m able to increase my internal efficiencies and also able to measure and improve my effectiveness, that translates into my advantage because I’m able to deliver higher quality products and services more quickly than my competitors. And that’s an advantage to my customers, which keeps them coming back to me. The key is that I must be able to measure it.
A post on the Burton Group APS blog discusses how metrics can be used to push an agenda from a specific perspective. Metriculation is a term I introduce, derived from statisticulation, where metrics are made to lie.
I wonder if Andy has seen this.

Updated (11/02/07) : Please note the responses from Alberto and Bob attached to this post. They’ve offered some assurance that Crap4J does not transmit any code to their servers, and that the licensing snafu was due to a simple oversight. They also resolve to correct the licensing agreements. Thank you, Alberto and Bob! : End Update
I went to install the Crap4J Eclipse plug-in. As part of this plug-in, there are four separate features, and I happened to actually read the license agreement for each of them. In a nutshell, for three of the features (Agitar JUnit Runner, Agitar JUnit4 Support, and Public API for Generated Tests), the license agreement states that the software is experimental and primarily for academic, research, and open source use. But that’s not the alarming part. It also says that it transmits your code over the open internet to be analyzed on non-secure Agitar computers shared by multiple users. Here’s the exact text:
THIS SOFTWARE IS INTENDED PRIMARILY FOR ACADEMIC, RESEARCH, AND OPEN SOURCE
USE. WHILE COMMERCIAL USE IS ALLOWED, PLEASE BE AWARE THAT YOUR CODE
IS TRANSMITTED OVER THE OPEN INTERNET AND ANALYZED ON NON-SECURE
COMPUTERS SHARED BY MULTIPLE USERS.
I don’t like that much, and it seems a bit sneaky to hide that rather important note in a license agreement that I doubt many folks read. There should be a more noticeable disclaimer somewhere. Also, I found no such notice in the Ant Task distribution (in fact, couldn’t find a license agreement included at all). But that’s not saying the Ant Task does or does not transmit your code.
I don’t know the internal behavior of Crap4J. Maybe it doesn’t send your code anywhere. But the license agreement indicates that Crap4J does, or at the very least, that they have the right to do so. Maybe, giving them the benefit of the doubt, they didn’t fully review the license agreement, and aren’t aware of what it says. Either way, the fact that this important note is buried within a license agreement without any other public disclaimer is very alarming and deceiving.
JarAnalyzer has always had the ability to create a dot-compliant output file that could be used with GraphViz to generate a component diagram. In the past, this had always been done using the DOTSummary class. Unfortunately, this meant that if you wanted to generate output files in both xml and dot, you had to run JarAnalyzer twice. Now, thanks to a stylesheet that I graciously stole from JDepend and modified to work with JarAnalyzer, there’s a new way to generate a dot-compliant output file that is much nicer than what you’ll get when using DOTSummary. Plus, you only have to run JarAnalyzer once, then apply two stylesheets to the xml file generated to get both the html report and component diagram.
In addition to being a bit more efficient, it’s also cleaner. The old component diagram is shown at left on top, while the new component diagram using the stylesheet is shown at bottom left. The stylesheet avoids the confusion where DOTSummary changed the name of the .jar file and stripped off the .jar extension. As seen on the diagrams, a .jar file named bill.jar now actually appears as bill.jar on the component diagram, not bill.
The new stylesheet isn’t part of the JarAnalyzer distribution…yet, but you can download the stylesheet. To run JarAnalyzer as part of your Ant build script and get both the html and component diagram output, drop the stylesheet in the directory containing JarAnalyzer (the same directory with jaranalyzer.xsl), and modify your build script similar to the following (you need GraphViz installed to run dot):
<target name="dotanalyzerapp.new" depends="bundle">
    <!-- Register the JarAnalyzer Ant task -->
    <taskdef name="jaranalyzer"
             classname="com.kirkk.analyzer.textui.JarAnalyzerTask">
        <classpath>
            <pathelement path="${buildlib}/jaranalyzer-1.2.jar"/>
            <pathelement path="${buildlib}/lib/bcel-5.2.jar"/>
            <pathelement path="${buildlib}/lib/jakarta-regexp-1.3.jar"/>
            <pathelement path="${buildlib}/lib"/>
        </classpath>
    </taskdef>
    <!-- Run JarAnalyzer once to produce the xml report -->
    <jaranalyzer srcdir="${buildstats}"
                 destfile="${buildstats}/appdependencies.xml"
                 summaryclass="com.kirkk.analyzer.textui.XMLUISummary"/>
    <!-- Apply both stylesheets to the same xml file -->
    <style in="${buildstats}/appdependencies.xml"
           out="${buildstats}/appdependencies.html"
           style="${buildlib}/jaranalyzer.xsl">
    </style>
    <style in="${buildstats}/appdependencies.xml"
           out="${buildstats}/appdependencies.grph"
           style="${buildlib}/jaranalyzer2dot.xsl">
    </style>
    <!-- Render the dot file as a png (requires GraphViz on the path) -->
    <exec executable="dot">
        <arg line="-Tpng -Nshape=box -Nfontsize=30 -Nwidth=1.5 -Nheight=1.25 ./buildstats/appdependencies.grph -o ./buildstats/appdependencies.png"/>
    </exec>
</target>
Filed Under Architecture & Design, Java, Metrics, Platforms | 4 Comments
I’ve updated JarAnalyzer to correct some of the problems reported when analyzing applications written on J2SE 5.0. The issues were primarily surrounding the use of Generics. If you’ve been experiencing any of the following problems when running JarAnalyzer 1.1, upgrading to JarAnalyzer 1.2 should make them go away and put you back on track.
A few other bug fixes and enhancements also found their way into version 1.2.
You can download the most current version of JarAnalyzer from the JarAnalyzer home.