On control structures and expressions

Matthew Tuck matty@box.net.au
Tue, 01 Dec 1998 20:13:25 +1030


Ursula Dreier wrote:

> Variables - a long time ago I read an article from one of the PL gurus
> (on Algol, I think) containing the sentence "variables considered
> harmful". I think this is true, but unfortunately we can hardly do
> without them. But trying to avoid the need to use temporary variables
> should certainly be an issue. Make heavy use of expressions instead.

I agree.  I consider that you should make a program as functional as
possible, and as imperative as necessary.  I find at work I often have
to make temporary variables due to language deficiencies.

> Let me explain using one of the ugliest examples I can think of:
> 
> int GetWindowText (HWND hWnd, LPTSTR lpString, int nMaxCount);
> 
> a) This thing returns true (actually the length of the string) almost
> all the time, rendering the precious return value useless for all
> practical purposes.

I'm an imperative/functional paradigm integrationist, so I feel that
functions shouldn't really have side effects.  This should really be an
out parameter.  But here I probably diverge from you.  You can get into
serious problems with order of evaluation of parameters and stuff.

> b) It requires that you allocate storage for that text, having to
> guess how long it may grow. You have to think of a name for the
> storage. Maybe you are lazy and call it sTemp or sScr or ptr or just
> s. Maybe you just carelessly allocate or reuse a pointer without
> providing the storage, and be rewarded with an exception if you are
> lucky.

Yeah, well, I gather that's Win32.  If it were amazing there wouldn't be
MFC and OWL.  =)

> d) Now that temporary variable is laying around. Is the value being
> reused later on? You can't be sure unless you examine the whole scope.

Flow analysis should warn on unused variables.

> If this thing occurred in the "then"-part of an if, what happens on
> "else"? More uncertainty. Maybe you also reuse the tempvar for another
> purpose later on, what about (lack of) initialisation then?

Again, if the compiler sees this sort of thing it should emit a warning.

> e) What happens if GetWindowText does return false?
> Because you were in a hurry (or lazy, like me, or both), you forgot to
> code the error handling for this case. No one reminds you of that if
> you don't put in a comment with "TODO" to be searched later on.

Don't allow return values to be ignored, for a start.  If you force the
result to be assigned to a variable, that at least requires a
declaration, which the compiler will complain about if it goes unused.
But the programmer can always shove it off into a scratch variable if
they really want to.
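
As a rough sketch of the "don't allow return values to be ignored" idea
in C++: the [[nodiscard]] attribute (C++17) asks the compiler to warn
when a result is dropped.  The function names below are hypothetical
and only stand in for a status-returning call like the one quoted.

```cpp
#include <string>

// Hypothetical status-returning call; the name and signature are
// illustrative only.  [[nodiscard]] (C++17) asks the compiler to warn
// when a caller drops the result.
[[nodiscard]] bool copyTitle(const std::string& source, std::string& dest) {
    if (source.empty()) {
        return false;   // nothing to copy: report failure
    }
    dest = source;
    return true;
}

// The escape hatch: an explicit cast to void discards the result on
// purpose and silences the warning, much like a scratch variable would.
void copyTitleIgnoringFailure(const std::string& source, std::string& dest) {
    static_cast<void>(copyTitle(source, dest));
}
```

The attribute only produces a warning, so how strictly it is enforced
depends on the compiler settings in use.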

> A better approach would be to have a function...
> 
> string GetWindowText (windowHandle hWnd)
> 
> ...and to use the return value in an EXPRESSION.
> If the function can't return a default or special value on error, an
> exception should be raised and the compiler should provide information
> about that fact to you and warn you, if you choose (much like the way
> it is done in Java).

Yes, Java exceptions are fine, except maybe for wanting resume, retry,
etc. capability.  Exceptions are a nice clean system that should be
liberally used where exceptional conditions exist.  It's difficult to
dictate an exception mechanism for an OS to use, though, since there are
so many used by languages.  I don't believe we should ever be writing to
the OS API anyway; that's just a symptom of our libraries being too
small.  That's why we're seeing the Java-as-OS movement.

> A lesson to be learned from this example is: Return values should be
> used for the result, not for some state information or error code.

Certainly.  Error codes as a return value are frequently ignored, while
exceptions are hard to ignore if you have exception lists, like in Java,
where you're forced to either handle the exception or declare it.
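
A minimal sketch of the value-returning style, assuming a made-up
window table in place of the real window system (WindowHandle,
WindowTable and describe are hypothetical names, not Win32).  Note that
C++ has no Java-style exception lists, so the compiler will not remind
the caller to handle the failure.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for the window system: maps hypothetical handles to titles.
using WindowHandle = int;
using WindowTable = std::map<WindowHandle, std::string>;

// Returns the title as a value; failure is an exception, not a status
// code, so the return value is free to carry the actual result.
std::string getWindowText(const WindowTable& windows, WindowHandle handle) {
    auto it = windows.find(handle);
    if (it == windows.end()) {
        throw std::invalid_argument("no such window");
    }
    return it->second;
}

// The result is used directly in an expression; no named buffer and no
// guess at a maximum length are needed.
std::string describe(const WindowTable& windows, WindowHandle handle) {
    return "Title: " + getWindowText(windows, handle);
}
```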

> There are programming languages where every construct yields a value,
> even loops.

What are you referring to here?  Functional languages?

> The usual semicolon at the end of a "statement" is actually an infix
> operator with a low binding power, which evaluates its left argument,
> then discards it and evaluates and yields its right argument.

I'm not one for the idea.  The expressive power may be there, but
assembly is pretty expressive and regular (these days) too.  Statements
are state changers and functions return values, and never shall they
meet.  =)

> This allows for constructions like these:
> 
> 5+(doSomething(); b)

It's very difficult to read.  This example is degenerate, since the
procedure call can be moved out of the expression with no trouble, but I
see what it means in general: in "a+(b;c)" the operands must be
evaluated strictly in that order.  I don't feel, however, that the
situation occurs often enough to justify it when you can do the same
with variables.  This gives flashbacks of horrid C code.  =)
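
For comparison, the closest thing in C and C++ is the built-in comma
operator, which evaluates its left operand, discards it, and yields the
right one.  A small sketch, with callCount as a hypothetical stand-in
for whatever state doSomething() changes:

```cpp
// Hypothetical counter standing in for whatever doSomething() changes.
int callCount = 0;

void doSomething() {
    ++callCount;
}

int sequenced(int b) {
    // The built-in comma operator evaluates doSomething() first,
    // discards its (void) result, then yields b.
    return 5 + (doSomething(), b);
}

int hoisted(int b) {
    // The same computation with the call moved out of the expression.
    doSomething();
    return 5 + b;
}
```

Both functions compute the same value; the second is the hoisted form
described above.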

> if a then b else c fi := if d then e else f fi;

The use of a conditional expression is good, but I'm not convinced it
should look that much like what a conditional statement would.  There
are no side effects, so I don't have a problem with the above statement.
I think you can do this in C++.
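
A short sketch of how the quoted assignment might look in C++, using
the ?: operator.  The function and parameter names are made up to
mirror the letters in the example.  This relies on both arms of the
left-hand conditional being lvalues of the same type; it does not carry
over to C, where ?: never yields an lvalue.

```cpp
// Assigns to b or c depending on a; the value stored is e or f
// depending on d.
void conditionalAssign(bool a, int& b, int& c, bool d, int e, int f) {
    // Both arms of the left-hand conditional are lvalues of type int,
    // so the whole expression is an lvalue and can be assigned to.
    (a ? b : c) = (d ? e : f);
}
```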

> Advantages:
> + concise, short
> + avoids need for temp vars

I do certainly feel screen real estate is important, at least until we
all have 30" monitors.

> Drawbacks
> - reduced locality (tempting to write very long and complicated
> expressions)

Actually, I'd say very short and complicated expressions.  =)

> - compiler has to check for compatible types of intermediate results
> at places where control flow converges (end of if clause, end of loop
> clause). Especially complicated when else is missing.

I don't see this as much of a problem unless you're doing some sort of
dynamic typing.

> I got a question regarding back-translation of a parse tree to some
> user's preferred notation: How about layout and comments? Especially
> layout is specific to a certain notation and will be lost during
> transformation to a parse tree.

Layout can comfortably be removed.  How the code is laid out again is up
to the deparser; smarter ones would use user preferences for layout.
Things like blank lines would probably need to be handled, but
expression spacing I think can be ignored.

Comments are not free format but are rather attached to language
elements or treated as statements themselves.  They can be deparsed or
not, depending on preferences.

> Some other (mixed) ideas:
> 
>  How about having one general loop construct like this:
> 
> do
>  while <condition>; // Omit this if you don't want a pre-tested loop
>  // (do something)
>  while <condition>; // insert wherever you want to test in the middle of
> a loop
>  // (do something)
>  while <condition>; // Omit this if you don't want a post-tested loop
> od    // or call it endLoop, whatever you like more

I've thought of this sort of thing, but don't think you're gaining
anything here over infinite loop-exit other than prohibiting exiting in
nested blocks and limiting things to one exit point.  Provided you leave
the more general loop in, though, it doesn't reduce flexibility.

The gotcha here is this is the opposite of most languages' pre-tested
loops, which are until.  I'm not convinced this really matters though. 
In using a language which allows pre-tested while/until and post-tested
while/until I've found pre-tested until is really common.  And
post-tested while probably is not uncommon either.

In fact, I think until might be more common than while, but I'm not in
favour of the double negative of until not, so wouldn't advocate
removing while.
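
To make the comparison concrete, here is a sketch of the infinite
loop-exit form in C++, with one test at the top and one in the middle.
The function and its sentinel convention (a negative value ends the
input) are assumptions chosen just for illustration.  Each break reads
as an "exit when", i.e. the until sense rather than the while sense.

```cpp
#include <cstddef>
#include <vector>

// Sums values up to (not including) the first negative one.
int sumUntilNegative(const std::vector<int>& values) {
    int total = 0;
    std::size_t i = 0;
    for (;;) {                           // infinite loop
        if (i == values.size()) break;   // pre-test: exit when input is used up
        int v = values[i++];             // first half of the body
        if (v < 0) break;                // mid-test: exit when sentinel is seen
        total += v;                      // second half of the body
    }
    return total;
}
```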

> Some duplicate code could be avoided if it occurs in places like this:
> 
> if a
>  then b;
>  x;  // duplicate
> else if c
>  then d;
>  x;  // duplicate
> else e;
> fi
> 
> In that language of my own design there was a construction like this:
> 
> if (a)
>  then b;
> else if (c)
>  then d;
> elfi x;
> else e;
> fi
> 
> The semantics is as follows: When encountering an "else" during
> execution of the preceding (then-)clause (b, in this case), jump to
> the next matching fi OR elfi, whatever is nearer.

I've had this sort of idea before, if I'm interpreting you correctly.
Basically I call them suffixes.  Here you have an if-match suffix.  You
could also have suffixes that examine a return value once there is one,
suffixes for failure to run through a pre-tested loop even once,
if-match prefixes, etc.  I don't believe the syntactic overhead warrants
these rare special cases though.

I don't believe this situation is as common as mid-tested loops, though,
whose benefit I have essentially argued for in the same way.
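
For reference, one way the if-match suffix could be emulated in C++
today, as I read the quoted semantics.  The function name and the trace
string are hypothetical and exist only to make the control flow
visible.  It costs exactly the kind of temporary flag that the proposed
construct would avoid.

```cpp
#include <string>

// Records which clauses ran, so the control flow is visible to a caller.
std::string run(bool a, bool c) {
    std::string trace;
    bool matched = true;        // the temporary that "elfi" would avoid
    if (a) {
        trace += "b";
    } else if (c) {
        trace += "d";
    } else {
        matched = false;        // fell through to the final else
        trace += "e";
    }
    if (matched) {
        trace += "x";           // shared suffix, written once
    }
    return trace;
}
```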

-- 
     Matthew Tuck - Software Developer & All-Round Nice Guy
                              ***
       Check out the Ultra programming language project!
              http://www.box.net.au/~matty/ultra/