Virtual Machinery - Sidebar 3 - WMC, CBO, RFC, LCOM, DIT, NOC

Sidebar 3 - WMC, CBO, RFC, LCOM, DIT, NOC - 'The Chidamber and Kemerer Metrics'


Aside from the Halstead Metrics this is probably the best known set of metrics. The difference with the 'C&K' metrics (as they are sometimes abbreviated) is that they were specifically designed with Object Oriented code in mind. In general their calculation has been viewed as straightforward and non-controversial (the possible exception being LCOM). I'm not so sure of this for reasons outlined below. The metrics were first proposed in 1994 when the only commercially significant Object-Oriented languages were Smalltalk and C++. C&K used two commercial applications (one in C++ and one in Smalltalk) to assess the value of their metrics.
What was their aim?
Although reuse is frequently mentioned, this was not the initial aim of Chidamber and Kemerer's original suite of metrics. They had read Grady Booch's 'Object Oriented Design with Applications' and were looking for a set of metrics that would assist this design approach. The idea was that the metrics would allow a designer to compare one potential design against another and predict which would be better. This meant that the metrics would have to be capable of being calculated from a design rather than from code. Another interesting point was that C&K firmly believed in a practical commercial approach, stating 'OO design metrics should offer needed insights into whether developers are following OO principles in their designs. This use of metrics may be an especially critical one as organizations begin the process of migrating their staffs toward the adoption of OO principles.' They took three stages of Booch's recommended OOD process (Booch's fourth stage was implementation) and identified their metrics as applying to these -

Identification of Classes - WMC, DIT, NOC

Semantics of Classes - WMC, RFC, LCOM

Relationships between classes - RFC, CBO

C&K used a formal process based on Weyuker's criteria for evaluating software metrics to evaluate their own suite of metrics. In general they felt that their metrics encouraged a proliferation of classes but that more classes made the software more complex under Weyuker's criteria. Not only that but the C++ and Smalltalk developers involved in the applications used in their study felt the same - more classes made it more difficult to track down errors. This runs counter to our current (and C&K's) view that we give each class a single coherent responsibility (usually for a small set of data and its associated methods) and that if that results in more classes then so be it. Perhaps the problem with C&K's metrics is that they encourage more classes but not necessarily the right ones? C&K also felt that DIT and LCOM violated one of Weyuker's criteria and it is interesting that LCOM is one of the metrics whose definition is frequently changed.

In my analysis of these metrics I'm taking the view of a Java programmer analysing code that has already been written. This is how we tend to apply metrics to code these days. C&K wrote these metrics when C++ and Smalltalk were the only viable OO languages, and OO itself was in its infancy (well, Smalltalk was a spotty teenager). They also wrote them to be calculated at design time. So if my judgements seem a little harsh let them be read in that light.

Let's now look at the individual metrics to see how they are calculated and whether we have truly fixed definitions for each - back into the swamp of definition!

CBO - Coupling between Objects
Coupling between objects (CBO) is a count of the number of classes that are coupled to a particular class i.e. where the methods of one class call the methods or access the variables of the other. These calls need to be counted in both directions so the CBO of class A is the size of the set of classes that class A references and those classes that reference class A. Since this is a set - each class is counted only once even if the reference operates in both directions i.e. if A references B and B references A, B is only counted once.
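The set-based counting described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name CboSketch, the method cbo and the idea of feeding it a pre-built map of 'class name to the classes it references' are all hypothetical, and a real tool would have to build that map by parsing source or bytecode. Dropping self-references is also my assumption rather than something C&K specify. It needs Java 9 or later for Map.of and Set.of.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class CboSketch {

    /**
     * Counts the classes coupled to cls in either direction.
     * references maps a class name to the names of the classes it calls
     * or whose variables it accesses (hypothetical input format).
     */
    static int cbo(String cls, Map<String, Set<String>> references) {
        // Outgoing couplings: everything cls itself references.
        Set<String> coupled = new HashSet<>(references.getOrDefault(cls, Set.of()));
        // Incoming couplings: every class that references cls.
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            if (entry.getValue().contains(cls)) {
                coupled.add(entry.getKey());
            }
        }
        // Assumption: a class referring to itself is not counted as coupling.
        coupled.remove(cls);
        // Being a set, B is counted once even though A -> B and B -> A both exist.
        return coupled.size();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> references = Map.of(
                "A", Set.of("B", "C"),
                "B", Set.of("A"),
                "D", Set.of("A"));
        System.out.println("CBO of A = " + cbo("A", references));
    }
}
```

In the example map A references B and C, while B and D reference A, so the coupled set for A is B, C and D. Whether library classes appear in the map at all is exactly the open question discussed for this metric.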

C&K's view was that CBO should be as low as possible for three reasons -

Increased coupling increases interclass dependencies, making the code less modular and less suitable for reuse. In other words if you wanted to package up the code for reuse you might end up having to include code that was not really fundamental to the core functionality.

More coupling means that the code becomes more difficult to maintain since an alteration to code in one area runs a higher risk of affecting code in another (linked) area.

The more links between classes the more complex the code and the more difficult it will be to test.

The only consideration in calculating this metric is whether only the classes written for the project should be considered or whether library classes should be included.

DIT - Depth of Inheritance Tree
Depth of Inheritance Tree (DIT) is a count of the classes that a particular class inherits from. C&K suggested the following consequences based on the depth of inheritance -

The deeper a class is in the hierarchy, the greater the number of methods it is likely to inherit, making it more complex to predict its behavior

Deeper trees constitute greater design complexity, since more methods and classes are involved

The deeper a particular class is in the hierarchy, the greater the potential reuse of inherited methods

The first two points seem to be interpreted as 'bad' indicators while the third seems to be suggested as 'good' - making it difficult to decide whether DIT should be large or small. I would suggest that a deep hierarchy is a good thing - you don't really want to punish OO programmers for using inheritance.

There are a couple of considerations here -

Should the inheritance tree stop at the boundary of the system under examination, or should it also include library classes (and even Object, since all classes ultimately inherit from it)? As C&K took the view that it was the number of inherited methods that increased the potential complexity, possibly all of the classes (inside and outside the code under examination) should be included.

Java classes have a single inheritance tree, but should this also include interfaces that are implemented, which may bring in data defined in those interfaces such as statics? See the point about the longest chain below.

Java interfaces have multiple inheritance. Should the depth of the tree be a count of all the interfaces or simply the longest chain of interfaces? C&K use the longest chain in their calculations, but if the issue is the greater amount of inherited functionality (data and methods), how does this make sense?
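To make the choices above concrete, here is a minimal sketch using Java reflection. The names DitSketch, classDepth and longestInterfaceChain, and the small nested example types, are hypothetical. The sketch makes two of the decisions discussed above explicit rather than settling them: classDepth follows the superclass chain all the way up and counts Object and library classes, and longestInterfaceChain takes the 'longest chain' reading for interfaces.

```java
final class DitSketch {

    // Hypothetical example hierarchy used only for illustration.
    interface Shape {}
    interface Drawable extends Shape {}
    static class Base implements Drawable {}
    static class Derived extends Base {}

    /**
     * Number of superclasses above c, counting library classes and Object.
     * Object itself has depth 0; interfaces also report 0 because
     * getSuperclass() returns null for them.
     */
    static int classDepth(Class<?> c) {
        int depth = 0;
        for (Class<?> s = c.getSuperclass(); s != null; s = s.getSuperclass()) {
            depth++;
        }
        return depth;
    }

    /**
     * Longest chain of interfaces starting from the interfaces that c
     * declares directly. Interfaces inherited only through a superclass
     * are not visited, because getInterfaces() lists direct ones only.
     */
    static int longestInterfaceChain(Class<?> c) {
        int longest = 0;
        for (Class<?> iface : c.getInterfaces()) {
            longest = Math.max(longest, 1 + longestInterfaceChain(iface));
        }
        return longest;
    }

    public static void main(String[] args) {
        System.out.println("Derived class depth: " + classDepth(Derived.class));
        System.out.println("Base interface chain: " + longestInterfaceChain(Base.class));
    }
}
```

Derived sits below Base and Object, so its class depth is 2 under this counting. Base implements Drawable, which extends Shape, giving an interface chain of 2. Stopping at the project boundary instead would mean ending the loop at the first superclass that is not part of the code under examination.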

LCOM - Lack of Cohesion of Methods
LCOM is probably the most controversial and argued over of the C&K metrics. In their original incarnation C&K defined LCOM based on the numbers of pairs of methods that shared references to instance variables. Every method pair combination in the class was assessed. If the pair do not share references to any instance variable then the count is increased by 1 and if they do share any instance variables then the count is decreased by 1. LCOM is viewed as a measure of how well the methods of the class co-operate to achieve the aims of the class. A low LCOM value suggests the class is more cohesive and is viewed as better. C&K's rationale for the LCOM method was as follows -

Cohesiveness of methods within a class is desirable, since it promotes encapsulation.

Lack of cohesion implies classes should probably be split into two or more subclasses.

Any measure of disparateness of methods helps identify flaws in the design of classes.

Low cohesion increases complexity, thereby increasing the likelihood of errors during the development process.
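The pair counting behind the original LCOM can be sketched as follows. The names Lcom1Sketch and lcom1 and the input format (a map from method name to the instance variables that method touches) are hypothetical. The sketch also assumes that a negative result is reported as zero, which is how the 1994 formulation is usually stated. It needs Java 9 or later for Map.of and Set.of.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class Lcom1Sketch {

    /**
     * fieldsUsedByMethod maps each method name to the instance variables
     * it references (hypothetical input, normally produced by a parser).
     */
    static int lcom1(Map<String, Set<String>> fieldsUsedByMethod) {
        List<Set<String>> uses = new ArrayList<>(fieldsUsedByMethod.values());
        int disjointPairs = 0; // pairs sharing no instance variable
        int sharingPairs = 0;  // pairs sharing at least one instance variable
        for (int i = 0; i < uses.size(); i++) {
            for (int j = i + 1; j < uses.size(); j++) {
                if (Collections.disjoint(uses.get(i), uses.get(j))) {
                    disjointPairs++;
                } else {
                    sharingPairs++;
                }
            }
        }
        // Assumption: the difference is floored at zero.
        return Math.max(disjointPairs - sharingPairs, 0);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> point = Map.of(
                "getX", Set.of("x"),
                "getY", Set.of("y"),
                "move", Set.of("x", "y"));
        System.out.println("LCOM = " + lcom1(point));
    }
}
```

In the example getX and getY share nothing (one disjoint pair) while move shares a variable with each of them (two sharing pairs), so the difference is negative and the reported value is 0. Three methods that each touch a different variable would give three disjoint pairs and an LCOM of 3.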

One of the difficulties with the original C&K LCOM measurement was that its maximum value was dependent on the number of method pairs - with 3 methods you can have 3 method pairs but with 4 you can have 6. This means that to assess whether your value of LCOM is low you have to take into account the number of method pairs - an LCOM of 2 with 3 method pairs should be considered differently to an LCOM of 2 with 100 method pairs. This was addressed by Henderson-Sellers when he introduced LCOM* (sometimes known as LCOM3 with C&K's LCOM being known as LCOM1). LCOM* is calculated by (numMethods - numAccesses/numInstVars)/(numMethods - 1) where numMethods is the number of methods in the class, numInstVars is the number of instance variables and numAccesses is the total, taken over all the instance variables, of the number of methods that access each one. If the number of attributes or methods is zero then LCOM* is zero. LCOM* can be in the range 0 to 2 with values over 1 being viewed as being suggestive of poor design.
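A worked version of the LCOM* formula may help here. This is a sketch only: the names LcomStarSketch and lcomStar and the map-based input are hypothetical, and returning zero for a class with a single method is my own assumption to avoid dividing by zero, since the formula as given does not cover that case. It needs Java 9 or later for Map.of and Set.of.

```java
import java.util.Map;
import java.util.Set;

final class LcomStarSketch {

    /**
     * fieldsUsedByMethod maps each method name to the instance variables it
     * accesses; fields is the full set of instance variables of the class.
     */
    static double lcomStar(Map<String, Set<String>> fieldsUsedByMethod, Set<String> fields) {
        int numMethods = fieldsUsedByMethod.size();
        int numInstVars = fields.size();
        if (numMethods == 0 || numInstVars == 0) {
            return 0.0; // defined as zero when there are no methods or no attributes
        }
        if (numMethods == 1) {
            return 0.0; // assumption: avoids division by zero for a single method
        }
        // Total over all instance variables of the number of methods accessing each.
        int numAccesses = 0;
        for (Set<String> used : fieldsUsedByMethod.values()) {
            for (String field : used) {
                if (fields.contains(field)) {
                    numAccesses++;
                }
            }
        }
        return (numMethods - (double) numAccesses / numInstVars) / (numMethods - 1);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> point = Map.of(
                "getX", Set.of("x"),
                "getY", Set.of("y"),
                "move", Set.of("x", "y"));
        System.out.println("LCOM* = " + lcomStar(point, Set.of("x", "y")));
    }
}
```

For the example class there are 3 methods, 2 instance variables and 4 accesses in total, so LCOM* = (3 - 4/2)/(3 - 1) = 0.5. If every method used every variable the result would be 0, and two methods that touch none of the variables give the upper end of the range, 2.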

There are several other calculations of LCOM and these are reviewed by Etzkorn et al. Their review covers areas such as the inclusion of inherited variables in the calculation and the inclusion of constructor and destructor methods, and looks at eight possible implementations of LCOM. Their conclusion was that a version of LCOM produced by Li and Henry that included constructor methods, but not inherited variables, gave a result that was closest to cohesion assessed visually from the code.

Although there is a fair amount of debate about how to calculate LCOM, and it features in a lot of metrics sets, an increasing number of researchers (including Henderson-Sellers) suggest that it is not a particularly useful metric. Perhaps this is also reflected in how little discussion there is of how to interpret it and how it fits in with other metrics.

NOC - Number of Children
Number of Children (NOC) is defined by C&K as the number of immediate subclasses of a class. C&K's view was that -

The greater the number of children, the greater the level of reuse, since inheritance is a form of reuse.

The greater the number of children, the greater the likelihood of improper abstraction of the parent class. If a class has a large number of children, it may be a case of misuse of subclassing.

The number of children gives an idea of the potential influence a class has on the design. If a class has a large number of children, it may require more testing of the methods in that class.

So their view was that increasing the number of children was improving reusability but if you didn't do it correctly you could make a mess of your design and increase the burden on your testers. This would suggest that this metric might point you towards code that you should look at - but you were then going to have to make a qualitative judgement about that code.

The calculation itself seems very simple. But should you perhaps include all the subclasses that ultimately have that class as an ancestor? And what about interfaces? What relationship does a class that implements the interface have to that interface - is it a child or an instance? And if you include classes that implement the interface, do you include the classes that implement the child interfaces that inherit from the root interface?
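The two readings of 'children' raised above can be put side by side in a short reflection-based sketch. The names NocSketch, noc and descendants, the nested example classes and the UNIVERSE list are all hypothetical. Reflection cannot enumerate subclasses by itself, so the sketch assumes the caller supplies the set of classes under examination. It needs Java 9 or later for List.of.

```java
import java.util.Collection;
import java.util.List;

final class NocSketch {

    // Hypothetical example hierarchy used only for illustration.
    static class Shape {}
    static class Circle extends Shape {}
    static class Square extends Shape {}
    static class RoundedSquare extends Square {}

    /** The classes under examination; a real tool would discover these. */
    static final List<Class<?>> UNIVERSE =
            List.of(Shape.class, Circle.class, Square.class, RoundedSquare.class);

    /** C&K's reading: immediate subclasses only. */
    static int noc(Class<?> parent, Collection<Class<?>> universe) {
        int count = 0;
        for (Class<?> c : universe) {
            if (c.getSuperclass() == parent) {
                count++;
            }
        }
        return count;
    }

    /**
     * Alternative reading: every class that ultimately has parent as an
     * ancestor. For an interface this also counts implementing classes
     * and sub-interfaces, since isAssignableFrom covers both.
     */
    static int descendants(Class<?> parent, Collection<Class<?>> universe) {
        int count = 0;
        for (Class<?> c : universe) {
            if (c != parent && parent.isAssignableFrom(c)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("NOC of Shape: " + noc(Shape.class, UNIVERSE));
        System.out.println("All descendants of Shape: " + descendants(Shape.class, UNIVERSE));
    }
}
```

Shape has two immediate children (Circle and Square) but three descendants once RoundedSquare is included, which shows how quickly the two readings diverge even on a tiny hierarchy.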

RFC - Response For Class
This is the size of the Response set of a class. The Response set for a class is defined by C&K as 'a set of methods that can potentially be executed in response to a message received by an object of that class'. That means all the methods in the class and all the methods that are called by methods in that class. As it is a set each called method is only counted once no matter how many times it is called. C&K's view was that -

If a large number of methods can be invoked in response to a message, the testing and debugging of the class becomes more complicated since it requires a greater level of understanding on the part of the tester.

The larger the number of methods that can be invoked from a class, the greater the complexity of the class

A worst case value for possible responses will assist in appropriate allocation of testing time

This is probably the most clear cut of the metrics - a high RFC suggests that something is wrong. There might be some debate about what a high level of RFC is but if you started at the classes with the highest RFC level in your code and worked down you would probably eliminate a lot of 'bad smells'.
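Since the response set is just a union, the calculation is easy to sketch. The names RfcSketch and rfc and the input format (each method of the class mapped to the methods it calls directly) are hypothetical, and the method names in the example are invented for illustration. The sketch follows the reading given above, going one level deep: the class's own methods plus the methods they call. It needs Java 9 or later for Map.of and Set.of.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class RfcSketch {

    /**
     * callsByMethod maps every method defined in the class to the set of
     * methods it calls directly (in this class or in other classes).
     */
    static int rfc(Map<String, Set<String>> callsByMethod) {
        // Start with the methods of the class itself.
        Set<String> responseSet = new HashSet<>(callsByMethod.keySet());
        // Add everything those methods call; the set removes repeat calls.
        for (Set<String> called : callsByMethod.values()) {
            responseSet.addAll(called);
        }
        return responseSet.size();
    }

    public static void main(String[] args) {
        Map<String, Set<String>> account = Map.of(
                "Account.deposit", Set.of("Account.validate", "Ledger.record"),
                "Account.withdraw", Set.of("Account.validate", "Ledger.record"),
                "Account.validate", Set.<String>of());
        System.out.println("RFC = " + rfc(account));
    }
}
```

The example class has three methods of its own and calls one external method, Ledger.record, from two places. Because the response set is a set, that external method is counted once and the RFC is 4.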

WMC - Weighted Methods per Class
Weighted Methods per Class (WMC) was originally proposed by C&K as the sum of all the complexities of the methods in the class. Rather than use Cyclomatic Complexity they assigned each method a complexity of one, making WMC equal to the number of methods in the class. Most 'classic' implementations follow this rule. C&K's view of WMC was -

The number of methods and the complexity of methods involved is a predictor of how much time and effort is required to develop and maintain the class

The larger the number of methods in a class the greater the potential impact on children, since children will inherit all the methods defined in the class

Classes with large numbers of methods are likely to be more application specific, limiting the possibility of reuse.

I have quite a few issues with this metric, starting with the name - don't call a metric Weighted Methods per Class if you are immediately going to assign each method an equal and arbitrary value of 1. Their view that classes with more methods are less likely to be reused seems strange in that, potentially, there is more there to reuse! I think that Number of Methods is a better name for this metric and that Total Cyclomatic Complexity for a class is a more useful pointer to one or more potential 'bad smells'.
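The difference between the two readings is easy to see in code. In this sketch the names WmcSketch, wmcUnitWeight and totalCyclomaticComplexity are hypothetical, as are the method names and complexity values in the example. The cyclomatic complexity of each method is assumed to have been measured already by some other tool. It needs Java 9 or later for Map.of.

```java
import java.util.Map;

final class WmcSketch {

    /** 'Classic' WMC: every method weighs 1, so this is the method count. */
    static int wmcUnitWeight(Map<String, Integer> cyclomaticByMethod) {
        return cyclomaticByMethod.size();
    }

    /** Weighted alternative: the sum of the per-method cyclomatic complexities. */
    static int totalCyclomaticComplexity(Map<String, Integer> cyclomaticByMethod) {
        int total = 0;
        for (int complexity : cyclomaticByMethod.values()) {
            total += complexity;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical per-method complexities for an Account class.
        Map<String, Integer> account = Map.of(
                "getBalance", 1,
                "deposit", 2,
                "withdraw", 4);
        System.out.println("WMC (unit weights) = " + wmcUnitWeight(account));
        System.out.println("Total cyclomatic complexity = " + totalCyclomaticComplexity(account));
    }
}
```

With unit weights the example class scores 3, the same as any other three-method class. Summing the complexities gives 7, which at least distinguishes it from a class with three trivial accessors.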

Overview
So where does this leave us with the C&K metrics? In summary -

CBO - Straightforward calculation. Low values are good.

DIT - In general, but not always, a high DIT is viewed as a good thing - so you can use this as an indicator and then use your judgement after a visual examination of the class. There is some debate about calculation and, in the case of Java, what do we do about interfaces?

LCOM - Wide variety of measurement approaches. Not sure what it indicates. In general high levels are viewed as bad.

NOC - A high level may be either a good or a bad thing, potential for confusion in measurement, particularly with regard to interfaces

RFC - A high level is a useful indicator of potential problems, uncontroversial measurement

WMC - Doesn't really do what it claims to do - replace with either number of methods or total cyclomatic complexity

References

Chidamber, S. and Kemerer, C. (1994). A metrics suite for object oriented design. IEEE Transactions on Software Engineering, 20(6)

Shyam R. Chidamber, David P. Darcy and Chris F. Kemerer (1998). Managerial use of metrics for object-oriented software: an exploratory analysis. IEEE Transactions on Software Engineering, vol. 24, pp. 629-640

E. Weyuker (1988). "Evaluating software complexity measures". IEEE Transactions on Software Engineering, vol. 14, pp. 1357-1365

Letha Etzkorn, Carl Davis, and Wei Li (1997). "A Statistical Comparison of Various Definitions of the LCOM Metric". Technical Report TR-UAH-CS-1997-02, Computer Science Dept., Univ. Alabama in Huntsville, 1997

