When you write a program without any modularity, the difficulty seems to go up with the square of the number of statements. One way to understand why is to note that any two statements might "interact". There are many kinds of interaction. For example, one statement might set the value of a variable that is used by another. Such interactions are usually ordered, i.e. what happens depends on which statement comes first. Counting potential interactions is therefore the same as counting the ordered pairs of statements. This number is N(N-1), where N is the total number of statements.
In other words, without any modularity the potential difficulty of understanding potential interactions between statements goes up with the square of the number of statements.
With modularity, potential interactions are limited to pairs of statements within one module. For example, consider a program that can be written in a hundred lines without modularity, or in three modules of 30, 40, and 45 lines (115 lines in all).
Counting one statement per line, the number of potential interactions between statements in the non-modular version is 100 × 99 = 9900. In the modular version, the largest number of potential interactions for you to worry about at one time is in the largest module, i.e. 45 × 44 = 1980 interactions.
The advantage to the modular version is clear.
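The arithmetic behind these figures can be checked with a short Python sketch. The function name `ordered_pairs` and the variable names are illustrative assumptions, not part of the original tip; the sizes are the 100-line program and the 30-, 40-, and 45-line modules from the example, counted at one statement per line.

```python
def ordered_pairs(n: int) -> int:
    """Number of ordered pairs of distinct statements among n statements."""
    return n * (n - 1)

# Non-modular version: every statement may interact with every other.
monolithic = ordered_pairs(100)

# Modular version: interactions are confined to each module.
module_sizes = [30, 40, 45]
per_module = [ordered_pairs(size) for size in module_sizes]
worst_case = max(per_module)

print(monolithic)   # 9900
print(per_module)   # [870, 1560, 1980]
print(worst_case)   # 1980
```

Even the sum over all three modules (4410) is less than half the non-modular count, although the modular version has more lines in total.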
The advantage to modularity can be had with any kind of module: package, class, subprogram, separately compiled files, etc. For the advantage to be real there must be no interactions between statements in different modules. This means, among other things, that your variables cannot be global.
In practice, some global variables are almost always wanted. Deciding which ones and how you will ensure their correct use is part of the same plan that drives your creation of modules. Creating and evaluating that plan will be the subject of other tips. This tip is concerned with how we restrict access to variables. You need it because without restricted access, you don't get the advantage of modularity discussed above.
I propose to give you a way of looking at variable accessibility that can be used with an older procedural language as well as with a more modern object-oriented language. You can use it when you have language support to enforce your decisions and you can use it when the only enforcement of your decisions is a convention. Conventions are more readily followed when they can be used in all your programming.
My way of categorizing a variable's accessibility isn't as flexible as the scope rules of many languages, but it can be remembered at 2 am and it can be used to communicate with programmers who are writing in very different languages than you are. Besides, many language nuances just enable you to do things you shouldn't.
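Here are my four categories: local variables, shifty variables (the parameters of a procedure or function), regional variables, and global variables.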
I think we can agree that the easiest category of variable to deal with is the local variable. This is mainly because the context in which you must think about a local variable is relatively short.
Beginning programmers instinctively think of shifty variables as the most difficult to deal with -- even when this category is not introduced with such a loaded name. I have come to believe beginning programmers are almost right on this one -- hence the name "shifty".
However, shiftiness is valuable -- without it, to print the number 2 you would have to write something like
print_value := 2
print
instead of
print(2)
The variable print_value would be global and would exist only to tell the procedure print what to print. Clearly, the shifty variable which is the parameter to the print procedure is quite useful!
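The same contrast can be sketched in Python. This is only an illustration: the function names `print_without_parameter` and `print_with_parameter` are hypothetical, and `print_value` plays the role of the global from the example above.

```python
# Style 1: a global exists only to carry the value to the procedure.
print_value = None

def print_without_parameter():
    # Reads the global; every caller must remember to set it first.
    print(print_value)

print_value = 2
print_without_parameter()

# Style 2: the value arrives through a parameter (a shifty variable).
def print_with_parameter(value):
    print(value)

print_with_parameter(2)
```

Both calls print 2, but only the second shows at the call site what is being printed.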
Moreover, shifty variables can be better for program clarity than global ones. Look at the difference between these two procedure invocations
proc(A,B,C,D,E,F,G,H,I)
and
proc
Assume that the code in the first version of proc has no access to global variables and that the code in the second has access to all global variables. That means that when you read the first procedure invocation, you know exactly which variables proc may diddle with, but when you read the second procedure invocation you haven't a clue. The first procedure invocation can give the programmer a big advantage, but not as big as the use of regional variables can.
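A small Python sketch of the two styles follows. All names here (`total`, `count`, `record_hidden`, `record_explicit`) are hypothetical. Python has no out-parameters, so in this sketch the explicit version exports its results as return values rather than through the parameters themselves.

```python
# Hypothetical globals, for illustration only.
total = 0
count = 0

def record_hidden():
    # Nothing at the call site reveals that total and count change.
    global total, count
    total += 10
    count += 1

def record_explicit(total, count, amount):
    # Everything read or produced is visible at the call site.
    return total + amount, count + 1

record_hidden()                    # which variables changed? Only the body tells you.
t, c = record_explicit(0, 0, 10)   # inputs and outputs are all named here.

print(total, count)  # 10 1
print(t, c)          # 10 1
```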
Regional variables are known to a specific, hopefully small, number of procedures. These procedures may all exist in one separately compiled file, in one package, in one class, etc. The programmer makes use of the modularity features in the underlying programming language (or simply makes use of comments and conventions) to enforce a rule that only specifically approved procedures access regional variables in any way at all.
Suppose the variables C, D, E, F, G, H, and I are regional and that proc is one of the procedures which can access them. When proc is invoked, these regional variables are known always to be part of its environment. The variables A and B, however, must be specified in the code that invokes proc to say how proc should do its job.
Under these conditions, neither of the two previous procedure invocations conveys a sense of what's going on. What does convey that sense is this procedure invocation,
proc(A,B)
along with documentation of how proc and its brethren affect their regional variables C, D, E, F, G, H, and I. Regional variables are not passed as parameters -- they are accessed directly by the procedures in their region. Regional variables do not tie all statements together with potential interactions the way global variables do -- only the procedures in their region are tied together.
The invocation of proc just shown says "do your thing using A and B".
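Here is a minimal Python sketch of regional variables enforced only by convention. The names `_calls`, `_total`, and `report`, and what proc does with its arguments, are assumptions made for illustration. The region is taken to be a single source file, and the leading underscore is the usual Python signal that a name is private to its module.

```python
# Regional variables (hypothetical): by convention, only proc() and
# report() below may read or write them.
_calls = 0
_total = 0

def proc(a, b):
    """Do your thing using a and b; updates the regional variables."""
    global _calls, _total
    _calls += 1
    _total += a * b

def report():
    """Return the current state of the region as (calls, total)."""
    return _calls, _total

proc(2, 3)
proc(4, 5)
print(report())  # (2, 26)
```

Note that Python's `global` keyword means "module-level", so in this sketch it reaches exactly the region and nothing wider, provided other files respect the underscore convention.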
These days a common way of implementing regional variables is to put them in an object with the procedures that can play with them. For example, if O is an object containing proc's regional variables, the procedure invocation just shown becomes
O.proc(A,B)
which says to O "adjust yourself with proc using A and B".
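The object-oriented form might look like the following Python sketch. The class name `Region`, its attributes, and the `report` method are hypothetical; the point is only that each object carries its own copy of the regional variables and that proc receives nothing but A and B.

```python
class Region:
    """Hypothetical object holding proc's regional variables."""

    def __init__(self):
        # Regional variables: by convention touched only by this class's methods.
        self._calls = 0
        self._total = 0

    def proc(self, a, b):
        """Adjust this object using a and b."""
        self._calls += 1
        self._total += a * b

    def report(self):
        """Return the regional state as (calls, total)."""
        return self._calls, self._total

O = Region()
O.proc(2, 3)

other = Region()
other.proc(10, 10)

print(O.report())      # (1, 6)
print(other.report())  # (1, 100)
```

The two objects do not interfere with each other, which is one thing the object form adds over a single file-level region.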
Whether object oriented or not, the use of regional variables helps the programmers compartmentalize their thinking. Without such compartmentalizing, it is difficult to keep intellectual control over a complicated system.
A research article in psychology that was written forty years ago has become a part of the education of a software engineer. The author makes a believable claim that we humans can juggle up to (about) seven things at a time. More than that and we start losing it. (Have you seen anybody juggle 12 things? Juggling, by the way, is not one of the author's examples -- perhaps, like me, he couldn't juggle at all.)
When we must deal with more than seven things, we must put them in compartments. Then we can deal at some times with the compartments and at other times with the contents of individual compartments.
My categories of variable accessibility give you four categories of variable to juggle at any one time. Of these categories, only the global variables are the same in all parts of your code. The other variables change with the context. Because they change, you don't have to think about all your variables at once.
Most of the time you should be able to keep the variables in any one category to fewer than seven. And, when you cannot, I have another way of categorizing variables to help you.
I'll end this tip with another view of the four categories of variable accessibility. This view emphasizes how to use variables in each category:
A shifty variable in a procedure proc refers to data that must be imported to guide one specific execution of proc or must be exported by proc back to the environment that called upon proc.
A shifty variable in a function should only be used to provide data that guides one specific execution of the function.
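The distinction between the last two rules can be sketched in Python. The names `mean` and `normalize` are hypothetical examples. Here the function only imports data through its parameter and hands back a result, while the procedure uses its parameter both to import the data and to export the outcome by modifying it in place.

```python
def mean(values):
    # Function: the parameter only imports data; the result is the return value.
    return sum(values) / len(values)

def normalize(values):
    # Procedure: the parameter imports the data and also exports the
    # result, because the caller's list is modified in place.
    m = mean(values)
    for i in range(len(values)):
        values[i] -= m

data = [1.0, 2.0, 3.0]
print(mean(data))   # 2.0
normalize(data)
print(data)         # [-1.0, 0.0, 1.0]
```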
These categories and the magical number are not the only things to consider when determining accessibility to variables, but they provide a framework in which you can think and talk about your decisions.
Copyright and Permissions
This tip is distributed to individuals free of charge from the Software Build and Fix web site. All other distribution (including but not limited to internal distribution within an organization and mirroring of any kind) is forbidden without written consent of the copyright holder.