The usual caveat: correlation is not causation. The purpose of this analysis is to find objective correlations between properties and costs, and it is true that such a correlation does not necessarily mean that reducing those properties also reduces costs. But the absence of any objective correlation does not establish cause either, and this is the situation we find ourselves in regarding most principles we use today. Consider this post an attempt to examine our beliefs rather than to definitively justify them.
The actual measurements made were as follows. If measuring size, for example, then we measured the size of a class or package each time it was updated across multiple releases. Let's say a class was updated five times. Then the two numbers correlated for this class were the number of times it was updated (5) and its average size over those 5 updates.
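As a minimal sketch of that pairing, assuming the sizes recorded at each update are already available as a list (the function name and the sample sizes are hypothetical, not taken from the analysis tooling):

```python
from statistics import mean

def update_pair(sizes_at_each_update):
    """Return (number of updates, average size over those updates)."""
    if not sizes_at_each_update:
        raise ValueError("the class or package was never updated")
    return len(sizes_at_each_update), mean(sizes_at_each_update)

# Hypothetical sizes of one class, recorded at each of its five updates.
sizes = [120, 130, 125, 140, 135]
pair = update_pair(sizes)  # (update count, average size)
```

One such pair is produced per class (or per package), and the full set of pairs for a program is what gets correlated.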
Also, it was noticed that some Java bytecode changes even when the source code does not (and even when the compiler version does not change). As there was no way around this, it introduces some error into the analysis.
As usual, the caveat must be made that the only way to ensure that a class or package has been updated between project releases is to manually check it, and this was NOT done in this case. Instead, the project releases were sieved through a code analyzer which checked automatically for changes by comparing the before and after bytecode of each method. This will not catch all method changes, and could flag unchanged methods as changed. Hence the results of this experiment are not definitive.
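The automatic check can be pictured roughly as follows. This is only a sketch under assumptions: each release is modelled as a hypothetical mapping from method signature to raw bytecode, only methods present in both releases are compared, and a class counts as updated if any of its methods changed. The actual analyzer may treat added or removed methods differently.

```python
def changed_methods(before, after):
    """Return the methods present in both releases whose bytecode differs."""
    shared = before.keys() & after.keys()
    return sorted(name for name in shared if before[name] != after[name])

def class_updated(before, after):
    """Assumption: a class counts as updated if any shared method changed."""
    return bool(changed_methods(before, after))

# Hypothetical bytecode for the methods of one class in two successive releases.
release_a = {"Foo.bar()": b"\x2a\xb1", "Foo.baz()": b"\x03\xac"}
release_b = {"Foo.bar()": b"\x2a\xb1", "Foo.baz()": b"\x04\xac", "Foo.qux()": b"\xb1"}

changed = changed_methods(release_a, release_b)
```

A byte-for-byte comparison like this is exactly why spurious bytecode differences show up as false positives.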
The programs analyzed were:
Where programs were very large, only the core jar files were analyzed.
The analysis requires that successive releases use the same Java compiler version, as whether a method has changed is identified by whether its bytecode changed. Hence Hadoop releases, for example, are split in two around a compiler update.
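A split of this kind could be sketched as below, assuming each release is tagged with the compiler that built it. The release names and compiler labels are made up for illustration and are not the real Hadoop data.

```python
from itertools import groupby

def split_by_compiler(releases):
    """Group consecutive (release, compiler) pairs into runs sharing a compiler."""
    return [
        [name for name, _ in run]
        for _, run in groupby(releases, key=lambda release: release[1])
    ]

# Hypothetical release history with one compiler update in the middle.
releases = [
    ("1.0", "javac-6"),
    ("1.1", "javac-6"),
    ("2.0", "javac-7"),
    ("2.1", "javac-7"),
]
groups = split_by_compiler(releases)
```

Each resulting group is then analyzed as if it were a separate program, which is how a single project ends up with `_1` and `_2` rows in the tables.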
All data available on request.
The properties investigated were:
The Spearman correlation coefficients at class level are:
Program | S. | D. f. | C. c. | T. d. | I. s. | A. p. c. | A. | C. d. | Impd. S. | C. | M. | D. | D. o. |
JUnit | 0.38 | 0.13 | 0.35 | 0.2 | -0.04 | 0.47 | 0.15 | -0.1 | 0.13 | -0.13 | -0.06 | 0.14 | 0.22 |
ActiveMQ | 0.25 | 0.23 | 0.22 | 0.19 | 0.04 | 0.23 | 0.24 | 0.17 | 0.18 | -0.12 | -0.13 | 0.15 | 0.18 |
Camel_1 | 0.3 | 0.22 | 0.25 | 0.22 | 0.07 | 0.35 | 0.21 | 0.11 | 0.17 | -0.08 | -0.11 | 0.11 | 0.2 |
Camel_2 | 0.25 | 0.16 | 0.22 | 0.15 | 0.06 | 0.29 | 0.19 | 0.11 | 0.14 | -0.11 | -0.14 | 0.1 | 0.15 |
FitNesse2 | 0.31 | 0.17 | 0.23 | 0.23 | 0.04 | 0.39 | 0.24 | 0.08 | 0.19 | -0.04 | -0.03 | 0.07 | 0.22 |
Hadoop_1 | 0.42 | 0.37 | 0.35 | 0.27 | 0.11 | 0.56 | 0.37 | 0.19 | 0.22 | -0.21 | -0.27 | 0.22 | 0.33 |
Hadoop_2 | 0.27 | 0.2 | 0.27 | 0.21 | -0.06 | 0.45 | 0.27 | 0.15 | 0.29 | -0.18 | -0.24 | 0.16 | 0.37 |
Log4j_2 | 0.46 | 0.42 | 0.44 | 0.38 | 0.28 | 0.48 | 0.21 | 0.14 | 0.01 | 0.06 | 0.02 | 0.31 | 0.05 |
Lucene_1 | 0.33 | 0.21 | 0.3 | 0.29 | 0.06 | 0.42 | 0.23 | 0.17 | 0.18 | -0.09 | -0.05 | 0.2 | 0.17 |
Lucene_2 | 0.18 | -0.02 | 0.1 | 0.07 | -0.08 | 0.13 | 0.14 | 0.12 | 0.08 | -0.23 | -0.19 | -0.02 | 0.14 |
Maven | 0.25 | 0.1 | 0.24 | 0.12 | 0.04 | 0.36 | -0.06 | 0.04 | -0.04 | 0.05 | 0.06 | 0.12 | -0.02 |
Struts | 0.25 | 0.21 | 0.13 | 0.14 | 0.12 | 0.31 | 0.11 | -0.01 | -0.02 | -0.04 | -0.04 | 0.07 | 0.01 |
Zookeeper | 0.47 | 0.34 | 0.43 | 0.39 | 0.12 | 0.55 | 0.36 | 0.3 | 0.29 | -0.16 | -0.14 | 0.24 | 0.31 |
Coyote | 0.27 | 0.19 | 0.23 | 0.27 | 0.07 | 0.37 | 0.23 | 0.22 | 0.16 | -0.09 | -0.1 | 0.25 | 0.24 |
Derby | 0.4 | 0.27 | 0.36 | 0.24 | 0.11 | 0.37 | 0.19 | 0.18 | 0.13 | -0.11 | -0.11 | 0.23 | 0.16 |
Avg. | 0.32 | 0.21 | 0.27 | 0.23 | 0.06 | 0.38 | 0.2 | 0.12 | 0.14 | -0.1 | -0.1 | 0.16 | 0.18 |
The Spearman correlation coefficients at package level are:
Program | S. | D. f. | C. c. | T. d. | I. s. | A. p. c. | A. | C. d. | Impd. S. | C. | M. | D. | D. o. |
JUnit | 0.3 | 0.3 | 0.24 | 0.33 | -0.15 | 0.3 | 0.26 | 0.3 | 0.32 | 0.01 | 0 | 0.18 | 0.22 |
ActiveMQ | 0.21 | 0.2 | 0.2 | 0.18 | 0.04 | 0.2 | 0.24 | 0.16 | 0.17 | -0.1 | -0.1 | 0.1 | 0.2 |
Camel_1 | 0.55 | 0.38 | 0.48 | 0.51 | -0.06 | 0.61 | 0.35 | 0.35 | 0.47 | -0.04 | -0.04 | 0.52 | 0.42 |
Camel_2 | 0.39 | 0.26 | 0.38 | 0.32 | 0.02 | 0.44 | 0.29 | 0.19 | 0.35 | 0.04 | 0.03 | 0.36 | 0.32 |
FitNesse2 | 0.41 | 0.26 | 0.36 | 0.19 | 0.09 | 0.4 | 0.28 | 0.18 | 0.16 | -0.09 | -0.1 | 0.29 | 0.08 |
Hadoop_1 | 0.54 | 0.74 | 0.55 | 0.27 | 0.49 | 0.65 | 0.25 | 0.18 | 0.51 | 0.36 | 0.36 | 0.34 | 0.04 |
Hadoop_2 | 0.29 | 0.35 | 0.27 | 0.35 | 0.24 | 0.21 | 0.29 | 0.33 | 0.44 | 0.17 | 0.16 | 0.06 | -0.03 |
Log4j_2 | 0.66 | 0.41 | 0.66 | 0.45 | -0.01 | 0.63 | 0.14 | 0.54 | 0.12 | -0.06 | -0.08 | 0.54 | 0.15 |
Lucene_1 | 0.36 | 0.21 | 0.39 | 0.32 | -0.35 | 0.39 | 0.32 | 0.24 | 0.31 | -0.02 | -0.03 | 0.33 | 0.31 |
Lucene_2 | 0.42 | 0.07 | 0.4 | 0.26 | -0.12 | 0.35 | 0.33 | 0.25 | 0.35 | -0.24 | -0.23 | 0.23 | 0.32 |
Maven | 0.42 | 0.36 | 0.46 | 0.45 | 0.21 | 0.45 | 0.4 | 0.2 | 0.42 | 0.03 | 0.03 | 0.31 | 0.44 |
Struts | 0.29 | 0.35 | 0.2 | 0.16 | -0.07 | 0.27 | 0.11 | 0.03 | 0.04 | 0.15 | 0.14 | 0.3 | 0.06 |
Zookeeper | 0.32 | 0.25 | 0.32 | 0.15 | 0.26 | 0.36 | 0.26 | 0.1 | 0.1 | 0.04 | 0.04 | 0.28 | 0.14 |
Coyote | 0.46 | 0.24 | 0.46 | 0.43 | 0.15 | 0.52 | 0.19 | 0 | 0.09 | -0.04 | -0.05 | 0.41 | 0.11 |
Derby | 0.5 | 0.44 | 0.45 | 0.29 | 0.04 | 0.52 | 0.28 | 0.21 | 0.16 | -0.17 | -0.19 | 0.42 | 0.13 |
Avg. | 0.41 | 0.32 | 0.39 | 0.31 | 0.05 | 0.42 | 0.27 | 0.22 | 0.27 | 0 | -0 | 0.31 | 0.19 |
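For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. The sketch below is a plain-Python illustration of that definition, not the code used to produce the tables, and the sample data are hypothetical.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Pearson correlation of the rank vectors (undefined if either is constant)."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical data: update counts and average sizes for four classes.
updates = [5, 2, 8, 1]
avg_sizes = [130, 40, 90, 60]
rho = spearman(updates, avg_sizes)
```

Because only ranks matter, the coefficient is insensitive to how skewed the raw sizes or update counts are, which suits data where a few classes are far larger than the rest.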