Wednesday, Nov 11, 2009

Code formatting in C++ Part Three

In this article I am going to present my recommendation for a C++ code formatting style (although it applies to most free-formatted languages, especially those that are C/C++ like).

I have covered the background to most of my choices in some detail (some would say too much - but I invoke Blaise Pascal here) in the first two articles of this series, the rather consistently named:

Code formatting in C++ Part One
Code formatting in C++ Part Two
Since the style I am about to present is a little unusual in places, and arbitrary in others, I encourage you to take a look at the previous articles if you have not already done so. Also, where arbitrary looking numbers are used, follow the spirit of the rule rather than the letter (or, in this case, number).

Page width

The proposals below refer often to page width, and by this I mean the number of characters that you would normally expect to be visible while reading and writing code in an editor. For example, it used to be common to keep within 80 characters (or fewer) due to text mode screen sizes. These days windows can easily be sized to much greater character widths, but I would still recommend adopting a page width of between 80 and 100 characters. It is not a hard limit, although it is more important in some areas than others. Personally I still try to stick to 80.

Proposal 1: Formatting variable declaration blocks

const char*    txt = "hello";
int            i = 7;
std::string    txt2 = "world";
std::vector<std::string>            v;
std::map<std::string, std::string>  m;

Variable declarations should be grouped together where possible (without violating the principle of locality - i.e., keeping them close to first use) in "islands" of no more than 16 lines at a time. If there are more than 16 variable declarations in a group then use single lines of whitespace to break them up. Try to keep variables of similar length in the same block.

Within each block, align the variable names as much as possible. Where there is a large variation in type name length, sub-group longer names and shorter names together and align variable names in sub-group blocks instead (note how the vector and map, above, are separated out this way).

This proposal applies both in function body code and within class declarations (and at global scope, if you must).
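As a minimal sketch of the same rule inside a class declaration, the struct below repeats the two sub-groups from the example above as data members. The name `Settings` and its member names are hypothetical, chosen only for illustration, and the default member initialisers assume C++11 or later.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical type, used only to show Proposal 1 applied to data members.
struct Settings
{
	// Short type names: member names share one column.
	const char*    title = "hello";
	int            retries = 7;
	std::string    owner = "world";

	// Long type names: sub-grouped, with their own column for the names.
	std::vector<std::string>            names;
	std::map<std::string, std::string>  aliases;
};
```

A default-constructed `Settings` carries the initial values shown, and both containers start out empty.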

Proposal 2: Formatting function signatures

Function signatures come in two forms, and we make a distinction here. The first form is the prototype, usually found in a header file, if at all. The second form is part of the definition and is followed by the function body (if applying this to a language without the separate prototype stage, the first form does not exist, of course). We shall start with the second form:

Function definition signatures

////////////////////////////////////////////////////////////////////////////////
void ClassName::MethodName
(
	char*          txt,
	int            i,
	std::string    txt2,
	std::vector<std::string>            v,
	std::map<std::string, std::string>  m
)
const
{
	// ... method body
}

The example here is for a method of a class, but the formatting would be the same for a free function.

The return value and function or method name (along with any modifier prefixes - e.g. static, or namespace prefixes) appear on their own line.
Next are the parentheses - both of which appear on their own line - indented to the same level as the preceding line.
Within the parentheses, each on their own lines, are the arguments - formatted according to Proposal 1. Any post-fix modifiers (just const, in this case) appear on their own line, followed by the function or method body.

This is almost certainly the most controversial proposal and I will take up my additional rationale in the next article.

If a comment block does not already precede the signature, use a line of forward slashes for about a page width (e.g. I run them up to the 80th column).

An additional point worth mentioning here is that this style lends itself well to being used with the Doxygen "inline comment" method of documenting function and method arguments.
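To illustrate the Doxygen point, here is a sketch of a definition signature with each argument documented on its own line. It assumes Doxygen's trailing `/**< ... */` comment form for parameters; the function `JoinParts` and its parameter names are hypothetical. A comment block already precedes the signature, so the line of slashes is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

/// Joins the given parts into one string (hypothetical example function).
std::string JoinParts
(
	const std::vector<std::string>&  parts,     /**< [in] strings to join */
	const std::string&               separator  /**< [in] text placed between parts */
)
{
	std::string result;
	for( std::size_t i = 0; i < parts.size(); ++i )
	{
		// The separator goes between parts, never before the first one.
		if( i != 0 )
			result += separator;
		result += parts[i];
	}
	return result;
}
```

Because every argument already has a line to itself, the trailing comment sits beside the argument it describes and does not force any extra line breaks.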

Function prototype signatures

void MethodName
	(	char*          txt,
		int            i,
		std::string    txt2,
		std::vector<std::string>            v,
		std::map<std::string, std::string>  m ) const;

If a separate prototype is required there are some differences to the formatting. This might seem a little odd but I'll provide the rationale in the following article.

First, the parentheses appear in-line with the arguments block, rather than on their own lines. Furthermore the whole block itself is indented with respect to the function name. Finally, any post-fix modifiers (which may include the pure virtual marker here) appear on the same line as the closing parenthesis.

Note that no line of comment characters precedes the signature. Ideally functions and methods would be fully documented at the implementation site and the documentation extracted from comments using a tool such as Doxygen. There are reasons to consider keeping the prototypes clear of too many comments, but obviously you can put them here if you are sure that is best for you.
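The two signature forms can be seen side by side in the sketch below: prototypes inside a class declaration, followed by the matching definitions. The class `WordList` and its members are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class: the prototypes use the in-line parenthesis form.
class WordList
{
public:
	void Add
		(	const std::string&  word );

	int CountMatches
		(	const std::string&  needle ) const;

private:
	std::vector<std::string>  m_words;
};

////////////////////////////////////////////////////////////////////////////////
void WordList::Add
(
	const std::string&  word
)
{
	m_words.push_back( word );
}

////////////////////////////////////////////////////////////////////////////////
int WordList::CountMatches
(
	const std::string&  needle
)
const
{
	// Count the stored words that are equal to the needle.
	int count = 0;
	for( std::size_t i = 0; i < m_words.size(); ++i )
	{
		if( m_words[i] == needle )
			++count;
	}
	return count;
}
```

Note how the `const` modifier shares the closing parenthesis line in the prototype but gets its own line in the definition.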

Proposal 3: Function calls

If a function call fits within a normal page width then write it on one line. Longer calls should be split across lines according to one of the following two examples:

LongMethodCall1( "some text",
                 aString,
                 anInt,
                 anotherArgument );

string returnVal = ReallyLongMethodNameCall
		( "some text",
		  aString,
		  anInt,
		  anotherArgument );

In both cases the arguments are aligned, one line each, with parentheses on the first and last lines. The first argument should share the line with the function or method name itself, unless that would push the argument list across such that any of the arguments end up beyond the page width.
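A compilable sketch of both call layouts follows. The functions `JoinArguments`, `JoinArgumentsWithAMuchLongerName` and `DemonstrateCalls` are hypothetical, and `std::to_string` assumes C++11 or later. The calls here are short enough that they would normally stay on one line; they are split only to show the shape.

```cpp
#include <string>

////////////////////////////////////////////////////////////////////////////////
std::string JoinArguments
(
	const char*         text,
	const std::string&  aString,
	int                 anInt,
	const std::string&  anotherArgument
)
{
	return std::string( text ) + " " + aString + " "
	     + std::to_string( anInt ) + " " + anotherArgument;
}

////////////////////////////////////////////////////////////////////////////////
std::string JoinArgumentsWithAMuchLongerName
(
	const char*         text,
	const std::string&  aString,
	int                 anInt,
	const std::string&  anotherArgument
)
{
	return JoinArguments( text, aString, anInt, anotherArgument );
}

////////////////////////////////////////////////////////////////////////////////
std::string DemonstrateCalls
(
	const std::string&  aString,
	int                 anInt,
	const std::string&  anotherArgument
)
{
	// First layout: the first argument shares the line with the name.
	std::string first = JoinArguments( "some text",
	                                   aString,
	                                   anInt,
	                                   anotherArgument );

	// Second layout: the argument block drops to the next line, indented.
	std::string second = JoinArgumentsWithAMuchLongerName
			( "some text",
			  aString,
			  anInt,
			  anotherArgument );

	return first + "|" + second;
}
```

The second layout is the one to reach for when the function name plus the left-hand side would push the aligned arguments past the page width.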

General Principles

The proposals above are deliberately narrow in focus, concentrating on those areas that are often left out of standards, or not sufficiently described, and where using an ad-hoc approach is often less than satisfactory. However there are some simple emergent themes that can be carried through to other areas of code:

  • Types and identifier names are separated into columns through alignment
  • Code is kept within a page width where possible. This is especially significant for code that you need to refer to at a glance, such as function prototypes - and is often where it is most overlooked!
  • For "structural" code, such as function signatures, consistency is especially important, even in places where it seems unnecessary (e.g. splitting short function signatures across multiple lines - even empty constructors). For implementation code the choice of when to split can be guided by the page width.

I have made some recommendations in these proposals that are not specifically backed by the discussion in the previous articles. I will attempt to cover these in the next, and final, article in this series.


Reader Comments (9)

Thanks for posting your thoughts. It's always interesting and useful to see a style presented along with rationale.

My concern with the style you've presented is that it is a high-maintenance style that is brittle in the face of refactoring. I consider this kind of heavyweight alignment something of a code style smell. It can rapidly fall out of sync when you start renaming things, inadvertently introducing gratuitous inconsistencies in the alignment, which defeats the whole purpose of the alignment proposed.

Naturally there is a trade-off in any style, some things are improved while other things are not. The loss of intrinsic sustainability is an issue, and I don't think that readability makes up for it — indeed, I don't see readability as an improved quality as it seems to break with concision and easy chunking in a number of places.

November 11, 2009 | Unregistered Commenter Kevlin Henney

Thanks for your comments, Kevlin. You raise some very good questions, of course, and we have discussed some of these before.
I'm going to pick up most of these points in my next article, but I'm especially interested in your final comment, that you "don't see readability as an improved quality". Would you care to elaborate on that further - especially the "break with concision and easy chunking"?

November 11, 2009 | Registered Commenter Phil Nash

I was very interested in reading posts #2 and #3 because I ended up coding in both styles as far back as 7 years ago. The spaces inside the parentheses make it much easier to read whatever is inside. I took it further though by putting spaces between any adjacent parentheses. This shows up most of the time when you are calling a method inside another method call, e.g. foo.bar( new Integer( 5 ) ); instead of foo.bar(new Integer(5)). It was that same readability that led me to using the 'parameter per line' style. I'd ask the commenter above how exactly he breaks a single line of code that is 200 characters long. I would hope he didn't keep it on one line. As soon as you decide to break it into smaller segments you run smack into the same issue.

Good stuff, keep it up. Looking forward to the next post on the subject.

December 17, 2009 | Unregistered Commenter Kelly French

Thanks Kelly. I hope to get some time for my final article sometime soon. As for, "how exactly he breaks a single line of code that is 200 characters long" - knowing Kevlin I suspect his response would be that he would never have written a 200 char long line in the first place (or, if someone else had written it he'd have refactored it until it didn't). He'd probably even throw in references to the Encapsulate Context Pattern - right, Kevlin? :-)
He'd be right too. This is one of the themes I want to bring out in my final post - that the need for a layout style to accommodate many parameters could be seen as a code smell in itself. This is also something I'm incubating a little longer as I'm fighting with my current code base to see if I can get it to the point - at least in some places - where the need to split the lines is less necessary. Exceeding the page width is not the only factor to consider, of course - which is another matter I will be taking up.

December 30, 2009 | Registered Commenter Phil Nash

I definitely agree with proposal #3. That is how I have evolved my own function calls and it seems so much clearer and neater. It just seems the natural way to organise these things.

April 8, 2010 | Unregistered Commenter Andy

Take a look at your function call examples - in one, the name of variables lines up with the opening quote of the string, while in the other, they line up with the first character inside the string. Which one is right?

December 19, 2012 | Unregistered Commenter Xiong Chiamiov

@Xiong - sorry your post got stuck in my moderation queue - unknown to me.

Anyway, it looks like the first one is a reformatting glitch. The quote should be in line with the first chars.

January 2, 2013 | Registered Commenter Phil Nash

Nice Article Series! I'm responsible for coding standards and guidelines at my company - and always have a hard time justifying this style - having some rationale from the world of speedreading (aka how the eye-brain interaction can match patterns and take information in) is great :-)

Are you planning on writing the mentioned Part 4 ?

April 3, 2014 | Unregistered Commenter Falco

@Falco - glad to know someone is still reading this!
The problem with part 4 is that in the process of trying to pin it down my thinking on it changed.
I should still do it at some point, though - but no promises.

April 3, 2014 | Registered Commenter Phil Nash
