Three (4) Criteria for Programming Languages


How do you choose a programming language?

Suppose for a moment that you did not have to use a "compatible" language but were free to choose.  What criteria would you apply?

Would you not try to limit the time and effort spent while increasing the quality of the programs?  A long time ago, Edsger Dijkstra already wrote about this choice in his famous article "Go To Statement Considered Harmful".

Personally, I spend much more time reading my programs than writing them.  I read them while reflecting on the program's structures, during debugging, and later for inspiration or to make changes.

I need to memorise the language well enough that I do not have to look up what individual constructs might mean, and every action should be unambiguously defined.

I should get as much information as possible from the program code itself, which is what the computer also sees, not from documentation.  A program's documentation may be helpful, but it may not correspond to the actual code.

Keeping these in mind, I arrived at three criteria (there are now four):

1. Code readability:

The language should use few characters from the top row of the keyboard (!@#$%^&*(){}[]~|\/)

The more exotic characters a language needs, the worse it is.  The code starts looking agitated instead of quiet.  Bad sinners are JavaScript, C and C++, and all their derivatives, such as PHP.  Even old-style Fortran is quieter to look at than any of these.

2. Language memorisation:

You should not need the programmer's reference manual lying open next to you.

If you need to look up anything, it means you could not remember it.  This of course does not apply to such things as the list of built-in functions, but it strongly applies to declaration and command syntax.  The worst sinner I have worked with was PL/I.  There was no way to remember all the possible constructs and their special cases.

3. Program documentation:

The more comments you need to write, the worse the language is.

Obviously, assembly languages do need a comment for nearly every line, but high-level languages should not.  They should allow enough data and program structuring that almost no comments are needed.  Occasionally a trick needs to be explained of course.

4. Complete Specification:

Every construct should be completely specified.

The language definition should specify what happens in all boundary conditions.  The best example I can give is this:

if a<>0 and b/a > c then …

Clearly, if the value of a happens to be zero, the division b/a should not be attempted.  This is no problem if the language specifies that the second operand of an "and" will not be evaluated if the first is already false.  But some languages leave it to the compiler writer to decide what happens.  Many obscure bugs occur because of underspecification of the programming language.
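As an illustration of a language that does specify this, here is a minimal sketch in Python, which is used here only as an example and is not the Transcript-like syntax of this article.  Python's definition guarantees that the right operand of `and` is evaluated only when the left one is true.  The function name `ratio_exceeds` is hypothetical.

```python
def ratio_exceeds(a, b, c):
    # Python guarantees left-to-right evaluation of `and` and skips the
    # right operand when the left one is false, so b / a is never
    # attempted when a == 0.
    return a != 0 and b / a > c


print(ratio_exceeds(0, 10, 1))  # False, and no ZeroDivisionError is raised
print(ratio_exceeds(2, 10, 1))  # True, because 10 / 2 > 1
```

In a language that leaves the evaluation order to the compiler writer, the first call could fail with a division by zero.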


Here are some examples of what I would like.  Revolution Transcript does not possess all of them, but I have used Transcript-like syntax to illustrate the points.

Good assignment statement:

Any language that did not have explicit variables and an assignment statement has had to introduce them at some point.  Witness CSS counting variables.

The assignment is central to programming.  I know of only three languages that got it right:  COBOL, the HP 98xx desktop calculator's language, and Transcript.

A computer must calculate all parts of an expression before it can assign the result to a variable.  Therefore the form

put <expression> into <variable>

is to be preferred to the perverse use of the mathematical equation symbolism

<variable> = <expression>

in which not only the order is wrong, but the "=" sign does not mean equality.

Good precedence of operators:

Most languages, even C, perform multiplication before addition.  The expression

a * b + c

will first multiply a and b and then add c to that result.  But many do not tell you what happens in this expression:

a<b and b<c or control(b)=desiredvalue

The "and" should be done before the "or".  Most languages will insist that you use parentheses to indicate what you want, forcing something like:

((a<b) and (b<c)) or (control(b)=desiredvalue)

which is not only harder to read but error prone because of the multiple parentheses.
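For comparison, here is a small sketch of a language whose definition does fix this order.  Python is used purely as an illustration; it specifies that `and` binds more tightly than `or`.  The three function names and their boolean parameters are hypothetical stand-ins for the three comparisons above.

```python
from itertools import product


def unparenthesised(a_lt_b, b_lt_c, control_ok):
    # Written without parentheses, as in the expression above.
    return a_lt_b and b_lt_c or control_ok


def and_first(a_lt_b, b_lt_c, control_ok):
    return (a_lt_b and b_lt_c) or control_ok


def or_first(a_lt_b, b_lt_c, control_ok):
    return a_lt_b and (b_lt_c or control_ok)


cases = list(product([False, True], repeat=3))

# The unparenthesised form agrees with "and before or" in all eight cases.
print(all(unparenthesised(*c) == and_first(*c) for c in cases))  # True

# The two possible groupings really do differ for some inputs.
print([c for c in cases if and_first(*c) != or_first(*c)])
# [(False, False, True), (False, True, True)]
```

The second line of output shows why the precedence has to be defined by the language: with the wrong grouping, two of the eight input combinations give a different answer.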

In addition, the second operand of an "and" should not be evaluated at all if the first operand is false, and likewise the second operand of an "or" if the first is true.  This is important in expressions like:

a<>0 and c>(b/a)

because otherwise there could be a division by 0!  It is a nuisance to have to write:

if a<>0 then
  if c>(b/a) then
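The "or" half of this rule can be made visible by recording which operands actually get evaluated.  The sketch below again uses Python only as an illustration; `probe` and `calls` are hypothetical names introduced for the demonstration.

```python
calls = []


def probe(name, value):
    # Record that this operand was evaluated, then hand back its value.
    calls.append(name)
    return value


# The first operand is already true, so the second one is never evaluated.
result = probe("first", True) or probe("second", False)

print(result)  # True
print(calls)   # ['first']
```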

Good definition of loop variables:

A loop should always leave its loop variable in a well-defined state.  Code is much cleaner when it can rely on the value of i after a loop such as this:

repeat with i=1 to somelimit
  if ... then exit repeat
end repeat

put sometable[i] into ...
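To show what is at stake, here is a hedged sketch in Python, used only as an illustration.  After a Python `for` loop the variable keeps its last value, and it is not bound at all if the loop body never ran, so a well-defined final state has to be supplied by hand.  The function name `first_negative_index` and the "one past the end" convention are assumptions made for this example.

```python
def first_negative_index(table):
    # Search for the first negative entry and leave i in a defined state
    # whether or not the search succeeds.
    for i, value in enumerate(table):
        if value < 0:
            break  # i now identifies the match
    else:
        # Runs when the loop finished without a break, including the case
        # of an empty table, where i would otherwise not be bound at all.
        i = len(table)
    return i


print(first_negative_index([3, -1, 4]))  # 1
print(first_negative_index([1, 2]))      # 2, meaning "no match"
print(first_negative_index([]))          # 0, meaning "no match"
```

A language that defines the final value of the loop variable makes the `else` branch unnecessary.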

Checked begin-end identifiers:

on SomeProcedure



end SomeProcedure

I want to specify where I think big blocks begin and end, and know that the compiler has checked this too.  This applies not only to procedures and functions, but also to large constructs:  if, repeat, switch, indeed even to a simple sequence.

Checked named loops:

I want to be able to name a loop with a meaningful identifier.  Such an identifier can then serve three purposes at once:  (1) a comment, (2) a check on where the construct ends, especially if it is big, (3) a target for a multi-level exit statement (see below).

FindTheFirstMatch: repeat with i=1 to SomeLimit


end FindTheFirstMatch

Multi-level exits by name:

FindTheFirstMatch: repeat with i=1 to SomeLimit

  CheckCorrectEnding: while ch is not in EndDelimiters

    if ... then -- there is no match at all!
      exit FindTheFirstMatch
    end if

  end CheckCorrectEnding

end FindTheFirstMatch
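Languages without named exits force a workaround.  As a rough sketch, Python (used here only for illustration) has no labelled loops, so a common substitute is to wrap the nested loops in a function and let `return` leave both at once.  The function `find_first_match` and its parameters are hypothetical, and unlike the example above it exits when a match is found rather than when none is possible.

```python
def find_first_match(rows, wanted):
    # The function boundary plays the role of the named outer loop:
    # `return` leaves both loops at once, like "exit FindTheFirstMatch".
    for i, row in enumerate(rows):
        for j, cell in enumerate(row):
            if cell == wanted:
                return (i, j)
    return None  # the loops ran to completion without a match


print(find_first_match([[1, 2], [3, 4]], 3))  # (1, 0)
print(find_first_match([[1, 2], [3, 4]], 9))  # None
```

The workaround costs an extra function and loses the checked name at the end of the loop, which is exactly what a named exit would provide directly.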

Nestable comment blocks, allowed anywhere:

put (Alpha-Beta)*(1-SubPercent /*main percentage not good enough*/ )/Gamma into ExCapital

If I cannot put a comment in the middle of an expression then I would have to write something like this instead:

put (Alpha-Beta)*(1-SubPercent)/Gamma into ExCapital -- use SubPercent because the main percentage is not good enough here!

and that's obviously more verbose.  It might also get overlooked.

/* this removed until we know which units to apply

put Alpha into OldAlpha
repeat with i=1 to somecount

  /* the following is an exception, due to the fact that ...
  ... ... ... */

end repeat
put Xyz into Alpha

this removed until we know which units to apply */

Most languages today do not allow you to nest comments and they do not check comments properly either.

General rule:

A general rule could be:  if you need a certain type of comment often, then maybe it should become part of the syntax.  Assertions are of this type.
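As a small illustration of a comment that became syntax, here is a sketch using Python's `assert` statement.  Python is only an example language here, and the function `average` is hypothetical.

```python
def average(values):
    # Instead of the comment "values must not be empty", state the
    # condition in a form that is checked when the program runs.
    assert len(values) > 0, "values must not be empty"
    return sum(values) / len(values)


print(average([2, 4]))  # 3.0
```

Calling `average([])` raises an AssertionError carrying the message, so the former comment can no longer drift away from the code unnoticed.  Note that Python skips assertions when run with the -O option.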


next planned revision: 2010-11