It’s very common for teams to use averages to communicate a measurement. The average number of lines of code per method or the average number of features delivered per iteration are simple examples. Most of the time, the average represents the mean, which is the arithmetic average of all samples from a chosen population. But average can also be used to refer to the median (the middle value in a series) and mode (the most frequently occurring value in a series). This can be misleading, and the result is metriculation.
Software development teams often use cyclomatic complexity (CCN) to measure the complexity of their code. Using a well-chosen average, it’s easy to misinterpret, or miscommunicate, the results. Let’s say we want to calculate the average CCN for a system. Consider seven methods with the following CCN:
Method 1: 3 Method 2: 3 Method 3: 3 Method 4: 12 Method 5: 120 Method 6: 85 Method 7: 15
Using this sample, the mean CCN is 34, the median is 12, and the mode is 3. While average CCN typically uses the mean, in this situation, the mean provides a false positive indicating poor code quality. Given a different sample, the results could be skewed in the opposite direction - providing a false positive indicating high quality.
Combining mean with median and mode may serve as a warning indicator. With a mode of 3 and a mean of 34, we might suspect a wide range of values. Another way to determine if the mean is an accurate representation of the sample is to calculate the standard deviation or variance. These represent the how spread out a distribution is. In this case, the standard deviation is almost 48. A number way to high for the mean CCN to provide an accurate representation of code quality.
The point here is that we have to be careful with how we use metrics as a measurement. Sometimes, additional analysis is required before we make a decision in how to proceed. While this example uses CCN, it would be easy to imagine other examples where this form of metriculation - the well-chosen average - might take place.
Metriculation is derived from statisticulation, a term introduced in How to Lie with Statistics.
Leave a Reply