This is currently a work in progress. I will be adding to this from time to time.

Introduction

Think about the last book you read. Before reaching you, this book was written by the author, edited by one or more editors, typeset, printed, bound and sent out for the world at large to consume. Not only was it put together to perform a purpose (entertain you, instruct you in fixing your car, etc.) but it was looked over for structure, syntactical and grammatical correctness and adherence to a certain set of guidelines that all books by that particular publisher follow to a greater or lesser degree.

Like a book, code is meant to perform a purpose. Most code, however, does not go through a rigorous editing process to ensure readability, especially code written by people just picking up the art of programming, whether to become more useful with a computer, to change careers or what have you. Sadly the professionals in a lot of cases are not much better off than the amateurs. In my experience most code written by software vendors does not go through as many code reviews as it should. Heck, once is a lot in some cases. It seems that the ability to write clean and textually meaningful code is a talent that is either innate or learned through a lot of trial and error by those who have maintained enough bad code in their lifetimes to say to themselves "I’m not going to subject the people that end up maintaining my code to the same mess."

The reasons for caring about your code, beyond whether it performs the function it should, relate to maintainability, not only for you but for anybody who comes after you. If you have ever had to pick up somebody else’s codebase, especially from somebody who is no longer around to answer questions, you know that debugging and maintaining code becomes much more difficult when it is poorly structured and hard to read. Likewise if you need help from a more experienced programmer, one of the nicest things you can do is to hand over a readable piece of code. Not only will you make the life of the person helping you easier but by taking the time to write readable code, you are showing that you care about your work. Most people who write software for a living care very deeply about their craft and are more likely to help those who show a high level of regard for their own work.

There will be no attempt to dictate stylistic differences such as

if (rowIndex > MAX_ROWS) {
	...
}

vs.

if (rowIndex > MAX_ROWS)
{
	...
}

vs.

if (rowIndex > MAX_ROWS)
	{
	...
	}

As long as a certain idiom is used consistently throughout a code base it has little effect on its readability. Popular style differences such as the "right" way to indent braces in C/C++ change over time. In contrast to that I hope to present a general set of guidelines that will transcend stylistic differences and languages. Hopefully this will serve as a guide to newcomers, and people who have practiced the craft of programming for a while will be able to look at this and pick up a couple of tips as well.

Be Consistent

By far the most important thing you can do to create maintainable code for yourself and others is to be consistent. This covers a lot of ground. Naming conventions for variables, use of case, indentation levels and so on should be chosen thoughtfully. In addition, if you are maintaining a piece of code, you and those following you are best served by either maintaining the style used by those before you or, if no obvious style is present, starting to use one and, as time permits, modifying the code base to use the same style throughout. Nothing more clearly indicates that multiple parties have worked on a code base than a mishmash of indentation and bracketing styles throughout the code.

In addition to looking better, consistency will aid greatly in the maintainability of the code base. If member variables for classes all begin with m_ then when you add members to existing classes or new classes, be sure to follow the same convention. Likewise, do not begin the names of local variables in a procedure with m_ as this will lead those coming after you to assume that this is a member variable for some instance of a class or object. Doing such things makes debugging or updating a program much harder than it should be.

Combine this with mixed naming conventions and mixed cases (languages such as Visual Basic do not differentiate between Variable, VARIABLE and VaRiABle, while other languages such as C++, Python and JavaScript do) and it is likely that a five minute debugging job has turned into something resembling an all day project. Modifying the code to perform some other function or the same function in a slightly different manner (no large project ever has requirements that remain constant for long) can become a Herculean task.
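To make the point about mixed case concrete, here is a minimal Python sketch. The function and variable names (count_rows, row_count, Row_Count) are hypothetical and exist only for illustration. It shows how a second spelling of a name quietly creates a second variable in a case-sensitive language instead of updating the first.

```python
# Hypothetical names, for illustration only.

def count_rows(rows):
    row_count = 0
    for _ in rows:
        # Bug: the different capitalization creates a brand new
        # variable, so row_count itself is never updated.
        Row_Count = row_count + 1
    return row_count


def count_rows_consistent(rows):
    row_count = 0
    for _ in rows:
        row_count = row_count + 1
    return row_count
```

Both functions run without any error, which is exactly what makes the first one so time-consuming to debug.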

Finish What You Start

Generally speaking a task started in one place should end in that place. Look at the following HTML example


<html>
<body>
<script>
function WriteWholeNumbers()
{
	document.write("<tr><td>One</td><td>1</td></tr>\n");
	document.write("<tr><td>Two</td><td>2</td></tr>\n");
	document.write("<tr><td>Three</td><td>3</td></tr>\n");
	document.write("</table>\n");
}
</script>

<table border="1">
	<tr>
		<td>Name</td>
		<td>Number</td>
	</tr>
<script>WriteWholeNumbers();</script>
</body>
</html>

Notice that the table tag starts in the main body of the page and that the end tag for it is written by the function WriteWholeNumbers. Now what happens when we want to add a function that writes irrational numbers? We either have to call the new function from WriteWholeNumbers or move the ending table tag somewhere else. If the HTML had originally looked like this


<html>
<body>
<script>
function WriteWholeNumbers()
{
	document.write("<tr><td>One</td><td>1</td></tr>\n");
	document.write("<tr><td>Two</td><td>2</td></tr>\n");
	document.write("<tr><td>Three</td><td>3</td></tr>\n");
}
</script>

<table border="1">
	<tr>
		<td>Name</td>
		<td>Number</td>
	</tr>
<script>WriteWholeNumbers();</script>
</table>
</body>
</html>

Then the function could easily be added and called like so


<html>
<body>
<script>
function WriteWholeNumbers()
{
	document.write("<tr><td>One</td><td>1</td></tr>\n");
	document.write("<tr><td>Two</td><td>2</td></tr>\n");
	document.write("<tr><td>Three</td><td>3</td></tr>\n");
}

function WriteIrrationalNumbers()
{
	document.write("<tr><td>Pi</td><td>3.141592653...</td></tr>\n");
	document.write("<tr><td>e</td><td>2.7182818284...</td></tr>\n");
}
</script>

<table border="1">
	<tr>
		<td>Name</td>
		<td>Number</td>
	</tr>
<script>
	WriteWholeNumbers();
	WriteIrrationalNumbers();
</script>
</table>
</body>
</html>

Adding different numbers to this page is much easier because the <table> tag starts and ends in the same logical place. This is a trivial example, but as the functionality in a page such as this grows more complex, the problems that arise because of badly placed HTML tags quickly grow, causing maintenance headaches that could easily have been avoided.

When coding a function it is imperative that the function that allocates resources release them. This makes debugging much easier and makes the code much more maintainable. It eliminates a lot of potential bugs. For example, consider the following C program


#include <stdio.h>

FILE *fptr;
int GetValue()
{
	int fileValue;
	fptr = fopen("c:\\count.txt", "r+");
	fscanf(fptr,"%d",&fileValue);
	return fileValue;
}

void IncrementValue(int currValue)
{
	rewind(fptr);
	fprintf(fptr,"%d",++currValue);
	fclose(fptr);
}

int main(int argc, char* argv[])
{
	int currentValue;
	currentValue = GetValue();
	IncrementValue(currentValue);

	return 0;
}

This program works just fine. It opens a file called count.txt in the root directory of the C: drive, increments the value in the file and then closes the file. This is basically how a web counter may work. However, because the file pointer is global and the file is opened in one function and closed in another, adding features to this program can become increasingly difficult. What if we only want to increment the value up to a certain point? Well, we can do this


#include <stdio.h>

FILE *fptr;
int GetValue()
{
	int fileValue;
	fptr = fopen("c:\\count.txt", "r+");
	fscanf(fptr,"%d",&fileValue);
	return fileValue;
}

void IncrementValue(int currValue)
{
	rewind(fptr);
	fprintf(fptr,"%d",++currValue);
	fclose(fptr);
}

int main(int argc, char* argv[])
{
	int currentValue;
	currentValue = GetValue();
	if (currentValue < 10)
	{
		IncrementValue(currentValue);
	}

	return 0;
}

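Now we have a problem. Once the value in the file reaches 10, the file is no longer closed. In a program this small the operating system cleans up when the process exits, but in a long-running program that repeats this step there will eventually be complaints from the operating system about too many open files. One quick fix for this is the following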


#include <stdio.h>

FILE *fptr;
int GetValue()
{
	int fileValue;
	fptr = fopen("c:\\count.txt", "r+");
	fscanf(fptr,"%d",&fileValue);
	return fileValue;
}

void IncrementValue(int currValue)
{
	rewind(fptr);
	fprintf(fptr,"%d",++currValue);
	fclose(fptr);
}

int main(int argc, char* argv[])
{
	int currentValue;
	currentValue = GetValue();
	if (currentValue < 10)
	{
		IncrementValue(currentValue);
	}
	else
	{
		fclose(fptr);
	}

	return 0;
}

Hopefully you can see how this would quickly spiral out of control. A much better way to handle this would be to write the program as follows


#include <stdio.h>

int GetValue(FILE *fptr)
{
	int fileValue;
	fscanf(fptr,"%d",&fileValue);
	return fileValue;
}

void IncrementValue(FILE *fptr, int currValue)
{
	rewind(fptr);
	fprintf(fptr,"%d",++currValue);
}

int main(int argc, char* argv[])
{
	int currentValue;
	FILE *fptr;
	fptr = fopen("c:\\count.txt", "r+");
	if (fptr == NULL)
	{
		/* Nothing was opened, so there is nothing to close. */
		return 1;
	}
	currentValue = GetValue(fptr);
	if (currentValue < 10)
	{
		IncrementValue(fptr, currentValue);
	}
	fclose(fptr);
	return 0;
}

By having the main function open and close the file there should be nothing to worry about regarding a resource leak. We’ve also eliminated a global variable.
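The same idea can be sketched in Python, where a with block ties opening and closing a file to one place in the code. This is only an illustration under stated assumptions: the function names (get_value, write_value, bump_counter) are hypothetical, and the counter file is a temporary file created for the demonstration rather than the c:\count.txt used in the C example.

```python
import os
import tempfile


def get_value(counter_file):
    """Read the integer stored in an already-open file (0 if empty)."""
    counter_file.seek(0)
    return int(counter_file.read().strip() or 0)


def write_value(counter_file, value):
    """Overwrite the contents of an already-open file with value."""
    counter_file.seek(0)
    counter_file.write(str(value))
    counter_file.truncate()


def bump_counter(path, limit=10):
    """Open, update and close the counter file in one place."""
    # The with block closes the file on every path out of it,
    # whether or not the limit has been reached.
    with open(path, "r+") as counter_file:
        current_value = get_value(counter_file)
        if current_value < limit:
            current_value += 1
            write_value(counter_file, current_value)
    return current_value


# Demonstration with a throwaway file that starts at 9.
handle, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(handle, "w") as demo_file:
    demo_file.write("9")

first = bump_counter(demo_path)   # the value is below the limit, so it is incremented
second = bump_counter(demo_path)  # the limit has been reached, nothing is written
os.remove(demo_path)
```

Neither helper opens or closes anything, so adding a new rule about when to increment never changes who is responsible for the file.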

Avoid Global Variables

Whenever possible, avoid global variables. As shown above, they can easily lead to functions, modules, classes, etc. being tightly coupled. Tightly coupled code is harder to reuse, harder to maintain and harder to extend. Whenever possible variables should be passed into a function or declared locally.

This becomes especially important in environments like VBScript in ASP. As long as the variable has been declared somewhere in either the page you are using or pages that are included, the variable name can be referenced. This ability to declare and use variables anywhere is sometimes a great convenience. Most of the time it is a source of incredibly hard to troubleshoot bugs that can only be found by stepping through each line of code in the debugger. Even then it can be difficult if the variable was created and initialized when a process first starts running but the process doesn’t crash until it is 90% done because a reference set in the beginning and used throughout the process hasn’t been properly updated near the end. In this case stepping through the locally failing code wouldn’t be enough. You would need to start at the beginning of the process when the pages first load. Depending on the length of time it takes to run the process, you may be better off updating your resume, resigning your position and letting the next sucker in line take a crack at finding the problem.
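As a small illustration of the coupling described above, here is a Python sketch with hypothetical names (running_total, add_sale_global, add_sale). One function depends on module-level state; the other receives everything it needs as arguments.

```python
# Hypothetical example: a running total kept two different ways.

running_total = 0  # module-level state, visible to every function


def add_sale_global(amount):
    """Depends on, and silently changes, the module-level total."""
    global running_total
    running_total += amount


def add_sale(total, amount):
    """Takes everything it needs as arguments and returns the result."""
    return total + amount


# What the global version leaves behind depends on everything that
# ran before it; the second version depends only on its inputs.
add_sale_global(5)
add_sale_global(5)
explicit_total = add_sale(add_sale(0, 5), 5)
```

To understand add_sale you only have to read add_sale. To understand add_sale_global you have to find every other piece of code that touches running_total.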

Use Meaningful Variable Names

Whenever you are naming a variable, resist the temptation to reduce typing by naming things x, dbptr, strptr, etc. It’s been a long time since programmers were restricted to 8 characters or so in a variable name, and chances are that if you’ve recently started programming, you did so in a language like Visual Basic, Python or Java that has never had this restriction. Using meaningful variable names will make your code more readable, easier to maintain and easier for the people after you to pick up.

In addition to cutting down on the number of comments necessary, good variable names will make your code more readable and intuitive. I don’t think a piece of C will ever read like Shakespeare or even ever be self documenting (contrary to what my brother-in-law claims), but meaningful variable names combined with some sense of structure will yield a very readable piece of code. Which is more intuitive to you?


if (fcnptr == NULL)
{
	return FCN_NO_INIT;
}

or


if (functionPointer == NULL)
{
	return ERROR_FUNCTION_NOT_INITIALIZED;
}

You can probably deduce from the first snippet what is going on, but chances are you wouldn’t be 100% sure. You would probably hope for a short comment saying //if the function is not initialized, return an error whereas in the second snippet the code itself is telling you that simply because the programmer was willing to type a few extra characters.

The one exception to this rule is the use of single letter variables as counters or as indices into an array. c, i, j, m, n, x and y have been used consistently for various purposes in programming for decades now. If it is explicitly clear you are indexing into a structure or counting iterations in a small block of code (say 20 lines or less, at most a screen’s worth, although screen sizes vary), then you can slide by using these names.
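Here is a short Python sketch of both halves of this guideline. All names are hypothetical. The first two functions do identical work and differ only in naming; the third shows the kind of small loop where single letter indices are acceptable.

```python
# Hypothetical names throughout; f and readings_above_threshold
# do exactly the same work.

def f(l, t):
    r = []
    for x in l:
        if x > t:
            r.append(x)
    return r


def readings_above_threshold(readings, threshold):
    hot_readings = []
    for reading in readings:
        if reading > threshold:
            hot_readings.append(reading)
    return hot_readings


def column_totals(grid):
    # Short index names are fine here: the loop is tiny and
    # i and j are plainly row and column positions.
    totals = [0] * len(grid[0])
    for i in range(len(grid)):
        for j in range(len(grid[i])):
            totals[j] += grid[i][j]
    return totals
```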

Declare Variables as Late As Possible

Most modern languages allow you to declare a variable whenever you want. Generally speaking it is best to declare a variable as late as possible and only within the scope it will be used in.

There are two reasons for this. One is that it makes the code generally more readable. Occasionally a function will go on longer than it should and it becomes harder to troubleshoot if you have to keep hitting Page Up six times to go back and see what a variable was declared as and what its purpose in life is. Keeping variables defined in scope also lets you get away with the one exception to using meaningful variable names. Since the beginning of time, or at least since people have been programming and studying mathematics, certain variable names have become well known either as loop counters or array indices. c,i,j,m,n,x and y are all used consistently in various idioms to represent indices into various structures. c is often used as a simple counter and if used locally this is made explicit, such as in


//I had to stay after school and write on the blackboard.
for(int c = 0; c < 500; ++c)
{
	cout << "I will not pull Sally's hair during recess.";
}

The other reason to declare variables as late as possible is to help performance. In some languages declaring a variable causes the environment to create an object to hold it; a C++ variable of a class type, for example, has its constructor run at the point of declaration. Creating objects entails overhead, and any overhead you can avoid will make your program more efficient. For example, look at this code from a PowerPoint macro


   Dim cbrWiz       As CommandBar
   On Error Resume Next

   ' Determine whether command bar already exists.
   Set cbrWiz = CommandBars(TOOLBAR_NAME)

   ' If command bar does not exist, create it.
   If cbrWiz Is Nothing Then
      Err.Clear
      Set cbrWiz = CommandBars.Add(TOOLBAR_NAME)

      ' Make command bar visible.
      cbrWiz.Visible = True

      ' Add button control.
      Dim ctlInsert    As CommandBarButton
      Set ctlInsert = cbrWiz.Controls.Add
      With ctlInsert
         .Style = msoButtonCaption
         .Caption = "Name Object"
         .Tag = "Name Object"
         ' Specify procedure that will run when button is clicked.
         .OnAction = "ShowForm"
      End With

   End If

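Notice that ctlInsert is not declared until just before it is used. If the command bar is already there, then so is the button, and there is no reason to introduce ctlInsert at all in that case. In VBA itself the saving is small, since a Dim of an object type only reserves a reference and no CommandBarButton is created until the Set line runs, but the habit pays off in languages where the declaration itself constructs the object.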
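The same pattern can be sketched in Python, where creating an object always runs its constructor. The Toolbar and Button classes, the creation_log list and the ensure_toolbar function are all hypothetical stand-ins for the PowerPoint objects above, not real Office APIs.

```python
# Hypothetical classes and names, for illustration only.
creation_log = []


class Toolbar:
    def __init__(self, name):
        creation_log.append("Toolbar")
        self.name = name
        self.buttons = []


class Button:
    def __init__(self, caption):
        creation_log.append("Button")
        self.caption = caption


def ensure_toolbar(toolbars, name):
    """Return the named toolbar, creating it (and its button) if needed."""
    toolbar = toolbars.get(name)
    if toolbar is None:
        toolbar = Toolbar(name)
        # The button is created here, inside the only branch that
        # uses it, rather than at the top of the function.
        insert_button = Button("Name Object")
        toolbar.buttons.append(insert_button)
        toolbars[name] = toolbar
    return toolbar


toolbars = {}
ensure_toolbar(toolbars, "Wizard")  # creates a Toolbar and a Button
ensure_toolbar(toolbars, "Wizard")  # finds the toolbar, creates nothing
```

After the two calls, creation_log records one Toolbar and one Button; the second call paid for neither.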

Initialize Everything

Whenever you declare a variable you should, as soon as possible, initialize it. In most dynamic scripting languages this is not a problem since you simply declare the variable the first time you use it. There is no need to explicitly allocate the variable. For example in Python if you want to open a file, you do the following:


#open output file for list of applications
outputFile = open(r"c:\python\applist.txt","w")

Nowhere does outputFile need to be declared before using. Thus you can declare it right when you want to use the variable. No fuss, no muss.

To do a similar thing in C would require:


FILE *fileptr;
fileptr = fopen("c:\\count.txt", "r+");

As you see the variable needs to be explicitly declared before using it. But it has been initialized as soon as possible.

Even in languages that usually initialize values for you it is good to get into this habit. Why? Because no language initializes everything to a valid state. At some point you’ll get tripped up by an uninitialized variable. In certain languages like C++, which have copy constructors or allow us to initialize a variable on the same line it is declared, if we combine this guideline with declaring as late as possible, we only try to allocate variables when we have something to initialize them to. This almost guarantees that we will not be allocating objects or grabbing resources we don’t need.
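Even Python, which needs no declarations, can trip you up this way. The following sketch uses hypothetical function names (describe_unsafe, describe) to show a variable that is only assigned on one branch, and the fix of giving it a valid value up front.

```python
def describe_unsafe(count):
    if count > 0:
        label = "some"
    # When count <= 0, label was never assigned, so the next line
    # raises UnboundLocalError.
    return label


def describe(count):
    label = "none"  # initialized to a valid state up front
    if count > 0:
        label = "some"
    return label


try:
    describe_unsafe(0)
    unsafe_failed = False
except UnboundLocalError:
    unsafe_failed = True
```

The unsafe version works for every positive count, so the problem can sit unnoticed until the first time a zero comes through.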

Use Comments

Use of comments is very important, even if nobody else is going to see your code. Unless you are completely sure you will never look at the code you are typing again, you should at the bare minimum include some comments telling you what the function or functions you are working on are doing. Ideally each function will have a block comment describing the purpose of the function, its inputs and its outputs. Depending on the environment you are in, these sorts of comments can actually be used to generate documentation. For example, Javadoc and Microsoft’s .NET IDE can both make use of comments to generate API documentation.
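Python offers something similar through docstrings, which tools such as the standard pydoc module and the built-in help() function can read. Below is a minimal sketch of a function-level block comment written as a docstring. The function increment_counter is hypothetical, and the layout of the docstring is just one reasonable choice, not a required format.

```python
import inspect


def increment_counter(current_value, limit):
    """Return the next counter value without exceeding limit.

    Inputs:
        current_value: the integer counter as last read.
        limit: the largest value the counter may reach.

    Output:
        current_value + 1 if current_value is below limit,
        otherwise current_value unchanged.
    """
    if current_value < limit:
        return current_value + 1
    return current_value


# Documentation tools read the same text that inspect.getdoc returns.
summary_line = inspect.getdoc(increment_counter).splitlines()[0]
```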

When commenting code, keep in mind that you should be telling the person reading it the intent of the code. There is no need to go into painstaking detail about every line of code. In fact, in many cases this does nothing but make the code harder to read. For example


//increment x
++x;

This may seem like a horribly simple and contrived example, until you see it somewhere. The following snippet of code from a PowerPoint macro is a good example of what comments should convey


   Dim cbrWiz       As CommandBar
   Dim ctlInsert    As CommandBarButton
   On Error Resume Next
   ' Determine whether command bar already exists.
   Set cbrWiz = CommandBars(TOOLBAR_NAME)
   ' If command bar does not exist, create it.
   If cbrWiz Is Nothing Then
      Err.Clear
      Set cbrWiz = CommandBars.Add(TOOLBAR_NAME)
      ' Make command bar visible.
      cbrWiz.Visible = True
      ' Add button control.
      Set ctlInsert = cbrWiz.Controls.Add
      With ctlInsert
         .Style = msoButtonCaption
         .Caption = "Name Object"
         .Tag = "Name Object"
         ' Specify procedure that will run when button is clicked.
         .OnAction = "ShowForm"
      End With
   End If

Use Indentation

In addition to the comments in the above sample, notice the use of indentation to set off logical code blocks. Anything that is contained in an if statement, loop or any other control block should be indented. In fact, if you have ever looked at or worked with Python, you know that indentation is actually used to delimit blocks of code, so no beginning and ending tokens (if...end if, {...}, Begin...End, etc.) are necessary in the language. Here is the aforementioned piece of code without any indentation


Dim cbrWiz       As CommandBar
Dim ctlInsert    As CommandBarButton
On Error Resume Next
' Determine whether command bar already exists.
Set cbrWiz = CommandBars(TOOLBAR_NAME)
' If command bar does not exist, create it.
If cbrWiz Is Nothing Then
Err.Clear
Set cbrWiz = CommandBars.Add(TOOLBAR_NAME)
' Make command bar visible.
cbrWiz.Visible = True
' Add button control.
Set ctlInsert = cbrWiz.Controls.Add
With ctlInsert
.Style = msoButtonCaption
.Caption = "Name Object"
.Tag = "Name Object"
' Specify procedure that will run when button is clicked.
.OnAction = "ShowForm"
End With
End If

As you can see, this is very hard to read and it is difficult to ascertain which instructions belong with which control structure.
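Since Python was mentioned as a language where indentation carries meaning, here is a small sketch of what that looks like. The two hypothetical functions differ only in how far one line is indented, which changes the block that line belongs to and therefore the result.

```python
def total_even_only(numbers):
    total = 0
    for number in numbers:
        if number % 2 == 0:
            total += number   # belongs to the if: only evens are added
    return total


def total_all(numbers):
    total = 0
    for number in numbers:
        if number % 2 == 0:
            pass
        total += number       # belongs to the for: every number is added
    return total
```

In a language with explicit block tokens the compiler ignores the indentation, but the human reader still relies on it in exactly this way.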

Use Whitespace

One editing feature that is more subtle than indentation and comments but almost as important for creating clear code is the use of whitespace. This refers not only to indentation but also to the use of blank lines to separate logically distinct pieces of code. Let’s take the PowerPoint code for adding the command bar and button and add some whitespace to make it a little more clear which pieces of code logically belong together


   Dim cbrWiz       As CommandBar
   Dim ctlInsert    As CommandBarButton
   On Error Resume Next

   ' Determine whether command bar already exists.
   Set cbrWiz = CommandBars(TOOLBAR_NAME)

   ' If command bar does not exist, create it.
   If cbrWiz Is Nothing Then
      Err.Clear
      Set cbrWiz = CommandBars.Add(TOOLBAR_NAME)

      ' Make command bar visible.
      cbrWiz.Visible = True

      ' Add button control.
      Set ctlInsert = cbrWiz.Controls.Add
      With ctlInsert
         .Style = msoButtonCaption
         .Caption = "Name Object"
         .Tag = "Name Object"
         ' Specify procedure that will run when button is clicked.
         .OnAction = "ShowForm"
      End With

   End If

As you can see in the above example, control structures (If, With) are set off from the rest of the code block by whitespace. So is nearly every comment. Generally speaking a comment can be applied to a whole logical section of code, which reduces the number of comments necessary. By using good variable names and judiciously using whitespace, comments can be used sparingly to describe the logical purpose of a block of code.
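The same layout carries over to other languages. Here is a brief Python sketch, with a hypothetical function name, in which blank lines split the body into sections and each section gets a single comment describing its purpose.

```python
def average_word_length(text):
    # Split the text into words; an empty text has no average.
    words = text.split()
    if not words:
        return 0.0

    # Add up the length of every word.
    total_letters = 0
    for word in words:
        total_letters += len(word)

    # Average over the number of words.
    return total_letters / len(words)
```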