Sidebar 1 - Comments, Lines of Code, Statements and Expressions.

Lines of Code and Statements

There is probably no area of metrics more likely to generate controversy than this, and with good reason: it is the most abused level of metrics. It’s the domain of the pointy-headed manager who views Lines of Code as a productivity measure and a selling point for the software – ‘3.5 million lines of code you know’. Programmers are paid to be smart – ‘you want lines of code – I’ll give you lines of code’. Spaghetti ensues.

Strangely enough I like lines of code as a measure. It has a strong correlation with Cyclomatic Complexity and is easier to measure. It also has a strong correlation with how we feel about the difficulty of a piece of code when we view it. Faced with a three to five line method we probably grasp its meaning quickly. Faced with twenty lines we have to scan up and down a method repeatedly to get its meaning.

Many years ago I wrote a small accounting package for someone. It was relatively simple so I decided to write as low maintenance a piece of code as I could. The language that I used (MUMPS) was procedural so I put all my utilities into separate procedures. I then embarked on an exercise that I described as ‘Aggressive code reduction’. I turned everything that was used more than once into a procedure. The process was successful – in the dozen years that the code ran for there were only four errors. It was also easy to maintain. There wasn’t a procedure with more than half a dozen lines in it.

I’ve followed that principle since and when I started programming in Smalltalk I found that it not only suited the language but it was ‘the way’ as espoused by the masters of the language like Kent Beck and Ron Jeffries. I’ve carried it through to Java as well – so the first thing I look at when I review a piece of code (either mine or somebody else’s) is the size of the methods.

When you decide that lines of code are a good measure – you then have to decide what a line of code is. Let’s look at a piece of code and see what we get –

public double calculateBalanceWithInterest(double rate, double minForInterest) {
	if (balance > minForInterest) {
		double interest = balance*rate;
		balance = balance +interest;
	}
	return balance;
}

You could count that as seven lines of code. If the pointy-headed boss has listened to me and changed his tune, I can make it fewer lines of code with a quick bit of editing.

public double calculateBalanceWithInterest(double rate, double minForInterest) {
     if (balance > minForInterest) {double interest = balance*rate; balance = balance +interest;}
     return balance;}

Bingo! – I have three lines of code – my method is now more than twice as good in the land of the pointy heads.

I’ve always taken the view that the important measure is the number of Java statements. So in this case I view the method declaration (including its closing brace) as one statement and the if statement (including its closing brace) as another. Taking any line that ends in a semicolon as a statement means that the two lines of code inside the if statement and the return statement count as three more. That gives me five statements. The nice thing about this is that programmers do not get punished for formatting their code clearly. For example, there is a common house style that would format our example code as follows:

public double calculateBalanceWithInterest(double rate, double minForInterest)
{
	if (balance > minForInterest)
	{
		double interest = balance*rate;
		balance = balance +interest;
	}
	return balance;
}

Under the approach described above this is still five statements.
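
The counting rule described above can be approximated in a few lines of Java. What follows is only a rough sketch, not how JHawk or any other tool actually counts statements: the class and method names are invented for illustration, and it assumes the source contains no comments, string literals, for-loop headers or array initialisers (all of which would confuse a character-based count). It treats every semicolon as the end of a simple statement and every opening brace as the start of a declaration or compound statement.

```java
class StatementCountSketch {

    /** Counts non-blank physical lines. */
    static int countLines(String source) {
        int lines = 0;
        for (String line : source.split("\n")) {
            if (!line.trim().isEmpty()) {
                lines++;
            }
        }
        return lines;
    }

    /**
     * Crude statement count: each ';' ends a simple statement and each '{'
     * opens a declaration or a compound statement such as an if.
     * Assumption: no comments, string literals, for-loop headers or
     * array initialisers appear in the source.
     */
    static int countStatements(String source) {
        int statements = 0;
        for (char c : source.toCharArray()) {
            if (c == ';' || c == '{') {
                statements++;
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // The example method with braces on the same line as the declaration.
        String sameLineBraces =
                "public double calculateBalanceWithInterest(double rate, double minForInterest) {\n"
                + "\tif (balance > minForInterest) {\n"
                + "\t\tdouble interest = balance*rate;\n"
                + "\t\tbalance = balance +interest;\n"
                + "\t}\n"
                + "\treturn balance;\n"
                + "}\n";
        // The same method in the house style with braces on their own lines.
        String ownLineBraces =
                "public double calculateBalanceWithInterest(double rate, double minForInterest)\n"
                + "{\n"
                + "\tif (balance > minForInterest)\n"
                + "\t{\n"
                + "\t\tdouble interest = balance*rate;\n"
                + "\t\tbalance = balance +interest;\n"
                + "\t}\n"
                + "\treturn balance;\n"
                + "}\n";
        System.out.println(countLines(sameLineBraces) + " lines, "
                + countStatements(sameLineBraces) + " statements");
        System.out.println(countLines(ownLineBraces) + " lines, "
                + countStatements(ownLineBraces) + " statements");
    }
}
```

The line count changes with the layout (seven lines against nine), while the statement count stays at five for both versions.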

Comments

Comments are another point of contention. The general view is that comments are a mitigation of complex code. There is a school of thought that says that code can be written to be self-documenting. I have no problem with the theory, but the problem is that one person’s self-documenting code means nothing to someone else. Here we have a sensibly named method (to anybody who knows what Menelaus’ theorem is) -

public boolean useMenelausTheorem(Point[] coords, Point[] lines) {
	…………………………….
	return result;
}

We can’t parse comments to get their meaning so we must take it on trust that the comment is valuable.

public double calculateBalanceWithInterest(double rate, double minForInterest) {
	//Only calculate interest if the balance is over the minimum amount
	if (balance > minForInterest) {
		double interest = balance*rate;
		balance = balance +interest;
	}
	return balance;
}

That’s valuable.

public double calculateBalanceWithInterest(double rate, double minForInterest) {
	//!!!! Remember to pick up kids at four !!!!!!!
	if (balance > minForInterest) {
		double interest = balance*rate;
		balance = balance +interest;
	}
	return balance;
}

That isn’t – well it probably was the day it was written, but it isn't valuable in the long term.

We have to be careful how we count comment lines as well – otherwise we could encourage counter-productive behaviour. If you count the number of comment lines you encourage people into elaborate strategies to magnify the figures. You also discourage sensible strategies like end-of-line comments that are less obtrusive to the reader.

My view is that if you must count comments then it is better to count the number of comments rather than the number of lines of comments. Some metrics, like the Maintainability Index, include comments as part of their calculation. I’m not sure of the value of this, and it seems that others feel the same, as there is a version of the Maintainability Index that does not include comments in the calculation. My preference is to use other metrics (like method size or cyclomatic complexity) to guide you to code that has fallen outside your range of acceptable values. Once you get to the code you can make a judgement as to whether the number and quality of comments mitigate the level of the metric.
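
To make the difference between counting comments and counting lines of comments concrete, here is a minimal sketch of a scanner that reports both figures. It is an illustration only: the class and method names are invented, it is not how any particular metrics tool works, and it makes simplifying assumptions (it skips ordinary string literals but does not handle character literals or text blocks, and it treats each `//` comment as a separate comment even when several follow one another).

```java
class CommentCountSketch {

    /**
     * Returns a two-element array: [0] is the number of comments,
     * [1] is the number of physical lines occupied by comment text.
     */
    static int[] countComments(String source) {
        int comments = 0;
        int commentLines = 0;
        int i = 0;
        int n = source.length();
        while (i < n) {
            char c = source.charAt(i);
            if (c == '"') {
                // Skip a string literal so that "//" inside it is not counted.
                i++;
                while (i < n && source.charAt(i) != '"') {
                    if (source.charAt(i) == '\\') {
                        i++; // skip the escaped character
                    }
                    i++;
                }
                i++; // closing quote
            } else if (c == '/' && i + 1 < n && source.charAt(i + 1) == '/') {
                // End-of-line comment: one comment, one line.
                comments++;
                commentLines++;
                while (i < n && source.charAt(i) != '\n') {
                    i++;
                }
            } else if (c == '/' && i + 1 < n && source.charAt(i + 1) == '*') {
                // Block comment: one comment, however many lines it spans.
                comments++;
                commentLines++;
                i += 2;
                while (i < n && !(source.charAt(i) == '*'
                        && i + 1 < n && source.charAt(i + 1) == '/')) {
                    if (source.charAt(i) == '\n') {
                        commentLines++;
                    }
                    i++;
                }
                i += 2; // closing "*/"
            } else {
                i++;
            }
        }
        return new int[] {comments, commentLines};
    }

    public static void main(String[] args) {
        String padded = "/*\n * Adds interest.\n *\n *\n */\nbalance = balance + interest;\n";
        String endOfLine = "balance = balance + interest; // add interest\n";
        int[] a = countComments(padded);
        int[] b = countComments(endOfLine);
        System.out.println("padded: " + a[0] + " comment(s), " + a[1] + " line(s)");
        System.out.println("end of line: " + b[0] + " comment(s), " + b[1] + " line(s)");
    }
}
```

Both snippets contain one comment, but the padded block comment scores five comment lines against one for the end-of-line comment, which is exactly the kind of inflation a lines-of-comments count rewards.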

Expressions

Expressions add another level of complexity to the reading of a piece of code. First we need to define an expression. An expression in Java is a piece of code that evaluates to a value. Examples in our code would be ‘balance*rate’ and ‘balance+interest’. We can combine several expressions in one statement, but this may add to the perceived complexity of the code. So we could take the two statements inside the if block of our example and combine them into a single statement with two expressions –

balance = balance + balance * rate;

This is a fairly trivial example as the meaning is quite clear since we all understand (don’t we?) that Java’s operator precedence will ensure that the line will be interpreted as:

balance = balance + (balance * rate);

but without the brackets we will have to think for a second – does it perhaps mean –

balance = (balance + balance) * rate;

That doubt increases our complexity. So we have saved one statement but we have made another more complex by increasing the number of expressions in it. When we measure the complexity of our code we need to reflect this – perhaps by measuring the average number of expressions per statement. Those of us who remember C++ will no doubt also remember the pride that was taken in cramming as much functionality as possible into a single line of C++. This often made the line unintelligible to anyone but its author and contributed to the poor reputation that C++ code has for maintainability.
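
One very rough way to put a number on the idea of expressions per statement is to count arithmetic operators and divide by the number of statements. The sketch below does only that, and it is an assumption-laden stand-in rather than a real measure: the names are invented, each of `+`, `-`, `*` and `/` is taken as a proxy for one expression, statements are counted by semicolons, and the input is assumed to contain no comments, string literals or compound operators such as `++` or `+=`. A proper implementation would work from a parse tree.

```java
class ExpressionDensitySketch {

    /**
     * Average number of arithmetic operators per statement.
     * Assumption: each operator character stands for one expression and
     * each ';' ends one statement.
     */
    static double operatorsPerStatement(String source) {
        int operators = 0;
        int statements = 0;
        for (char c : source.toCharArray()) {
            if (c == '+' || c == '-' || c == '*' || c == '/') {
                operators++;
            } else if (c == ';') {
                statements++;
            }
        }
        return statements == 0 ? 0.0 : (double) operators / statements;
    }

    public static void main(String[] args) {
        String twoStatements = "double interest = balance*rate; balance = balance +interest;";
        String oneStatement = "balance = balance + balance * rate;";
        System.out.println(operatorsPerStatement(twoStatements)); // one operator in each statement
        System.out.println(operatorsPerStatement(oneStatement));  // two operators in one statement
    }
}
```

The two-statement version averages one operator per statement and the combined version two, so the statement we saved shows up as extra density in the one that remains.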

We can take a parallel with the English language to illustrate this concept of expressions and statements. Take the following two sentences -

I walked back from the shops with my Brother and my Father. My Father had sore legs.

That’s pretty clear. Now look at this sentence -

I walked back from the shops with my Brother and my Father who had sore legs.

It’s not clear who has the sore legs here and the rules of the English language are not strong enough to allow us to be sure. By concatenating the two sentences we have increased the complexity and lost meaning.

If you want to find out more about Java metrics you can read our full tutorial here, or you can read the individual sections on System and Package Level Metrics, Class Level Metrics and Method Level Metrics.

 
 
 

© 2020 Virtual Machinery   All Rights Reserved.