Declaring arguments to a function

Jeremy Dunn jeremydunn@ibm.net
Sat, 26 Jun 1999 12:38:00 -0700



Hans-Dieter.Dreier@materna.de wrote:


> Let me state some observations from everyday programming:
> 
> 1. Most functions have a fixed small number of arguments.
> Few functions have a fixed large (> 5) number of arguments.
> Very few functions (at least in C) have a variable number of arguments.
> Most of them are I/O, and there is an alternative: use of chained infix operators.


I am not certain that I understand what you mean by "chained infix
operators". Could you give me an example?
 

> 2. The more arguments a function has, the greater the opportunity to make a mistake in the parameter list: If the function has 2 arguments but you supply only one, you will see the difference immediately when looking at the source code (provided that the expression supplied is not a big ugly monster). If the function has 6 arguments but you supply 5, you won't notice the difference unless you count. Therefore it is less error-prone for a function to have a small number of arguments. I prefer 0-2 arguments.
> 
> 3. If you have a list of tuples, the form (A a, B b, C c, ...) is easier to read and write than the form (A B C, a b c) if the list is long. The second form invites mismatches.

This could be so, but if the argument number rarely creeps above 6 then
there is not much difference. My feeling was that by splitting the
separate aspects of the declaration into their constituent parts, the
user could focus on any one aspect by looking at the particular line
that applies to it. The conventional way requires the user to read
across and down, scanning each argument declaration in turn and reading
everything until what is looked for is found. If I write

[Name-JeremyDunn : BirthYear-1957 : HomeTown-Bellingham,
Name-HarryHoudini : BirthYear-1957 : HomeTown-NewYork]

I could also write this as

[Name(JeremyDunn,HarryHoudini),
 BirthYear(1957),
 HomeTown(Bellingham,NewYork)]

There are some advantages to doing it this way, as one can concentrate
on the category of information quickly. This particular case is more
readable than the first method and displays the information in a more
meaningful manner. The lack of an index on 1957 would have the default
meaning of applying to all the Names, and the HomeTowns don't need to be
indexed because they equal the Names in number and correspond directly
to them. Indexing need only occur when the position of an item in a
given list does not correspond to the same index in the Names list.
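
To make the defaulting rule concrete, here is a rough sketch in Python
(a stand-in language only; the helper name expand_columns is made up
for illustration and is not part of the proposal). It expands the
column form back into one record per Name. A column holding a single
value applies to every record, and a column with as many values as
there are Names is matched up by position.

```python
def expand_columns(columns):
    """Expand {category: [values]} into a list of per-record dicts.

    Hypothetical helper: a one-value column applies to every record,
    a full-length column is matched up by position.
    """
    count = max(len(values) for values in columns.values())
    records = []
    for i in range(count):
        record = {}
        for category, values in columns.items():
            if len(values) == 1:
                record[category] = values[0]   # unindexed: applies to all
            elif len(values) == count:
                record[category] = values[i]   # corresponds by position
            else:
                # Anything else would need the explicit indexes described above.
                raise ValueError(f"{category} needs explicit indexes")
        records.append(record)
    return records


people = expand_columns({
    "Name": ["JeremyDunn", "HarryHoudini"],
    "BirthYear": [1957],
    "HomeTown": ["Bellingham", "NewYork"],
})
```

Explicit indexing of individual items is left out of this sketch.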


> >
> >Declaring argument names is done by giving the name of the argument you
> >are going to use in your program and then following it by the index of
> >the argument position that is going to use that name.
> 
> Counting is additional overhead and yet another possibility to make mistakes.
> Why not use the argument name, something like
> 
> Foo (MySecondArg = 23, MyFirstArg = 4)

We could, but that would defeat the purpose of doing it in the manner
described because we would just be repeating our argument names over
again. Indices take up less room. Counting IS additional overhead, but
if, as you say, functions don't have more than 5 or 6 arguments I don't
think this represents much of an extra overhead.

> > For instance, in a
> >function of two arguments we could write either (Base,Power) without
> >indexes if we have a name for every argument or we could write it as
> >(Base(0),Power(1)) where we declare the index.
> 
> Can't this be mistaken for a function call (same syntax)?
> Of course, the same applies for my example: A syntax will have to be chosen that does not occur in normal expressions.


Yes, another syntax would be better.


> > Suppose the 1st argument
> >is named Base but we have several related arguments following it of the
> >same type in our program? We could write (Base(0),Power(1-)) where the
> >hyphen means that all arguments following index 0 are related somehow.
> >The compiler would automatically number these extra argument names as
> >Power1, Power2 and so on, and that is how we would call them in our
> >program.
> 
> Nice, but how often do you need such a feature?


Not often. You know, I have a spare tire in my car that I have never had
to use but I think I'll keep it there anyhow.


> >
> >Defining the argument types is done in a list like (int, float, string)
> >and so on for each argument. If all the arguments are of the same type
> >then you could write (int) to have the type default to int for all of
> >them. Suppose the first three arguments are type int and everything
> >following those are type string? You would write that as
> >(int(-2),string(3-)) where the statement means that arguments at list
> >indexes from 0 to 2 are ints and the indexes from 3 onwards are of type
> >string. Nonconsecutive indexes of the same type could be stated like
> >(int(0,2,4),string(1,3)). (Simple, eh?)
> 
> It gives me the odd feeling that you are calling functions.


Yes, better syntax again. Maybe curly brackets {}?
 

> >Default values are equivalent to what one does with the "optional"
> >statement in VB. In this case one declares the optional value and
> >follows it with a bracketed list of the index values of the arguments
> >it applies to. The index values for the arguments are numbered
> >starting at 0. So if I wanted to declare that argument 3 defaults to
> >24 then I would write the statement (24(3)). If I wanted arguments 0
> >thru 3 to default to "A" and argument 5 onward to default to 2 then I
> >would write ("A"(-3),2(5-)). I could also have written the start and
> >end indexes explicitly as in ("A"(0-3),2(5-)). If the start index is
> >missing it is assumed to be 0, and if the finish index is missing the
> >range continues through all arguments after the start index.
> >Naturally, one can choose nonconsecutive indexes as in a statement
> >like ("A"(0,2,7)). One may not declare more than one optional value
> >for a given index.
> 
> All very nice, but again, how often does one use it?


But again, the spare tire.
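
For what it is worth, the index notation is mechanical to expand. A
rough Python sketch follows (Python is only a stand-in; the function
names expand_indexes and build_defaults are invented for illustration).
It assumes a missing start index means 0 and a missing finish index
means the last argument slot, and it rejects two defaults for the same
index.

```python
def expand_indexes(spec, arg_count):
    """Expand an index spec such as "-3", "5-", "1-3" or "0,2,7".

    Assumed reading: a missing start means 0 and a missing finish
    means the last argument slot (arg_count - 1).
    """
    indexes = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, finish = part.split("-", 1)
            first = int(start) if start else 0
            last = int(finish) if finish else arg_count - 1
            indexes.extend(range(first, last + 1))
        else:
            indexes.append(int(part))
    return indexes


def build_defaults(pairs, arg_count):
    """Turn [(value, spec), ...] into {index: default value}."""
    defaults = {}
    for value, spec in pairs:
        for index in expand_indexes(spec, arg_count):
            if index in defaults:
                raise ValueError(f"index {index} has two default values")
            defaults[index] = value
    return defaults


# ("A"(-3),2(5-)) for a function with 8 argument slots
defaults = build_defaults([("A", "-3"), (2, "5-")], 8)
```

Index 4 is named by neither range, so it ends up with no default.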

 
> >Now we have three special commands in our language that are very useful
> >in regards to all of this. These three functions have no arguments but
> >return information that is very useful. The 1st function
> >NumberOfArguments() basically counts the number of commas in the
> >argument part of the function and adds one to give the total number of
> >argument spaces that are defined by your statement. So if you wrote
> >
> >FUNCTION(,,,,)
> >
> >with nothing actually input into the statement then NumberOfArguments()
> >would say there are 5 argument slots.
> >
> >The 2nd function is called EmptyArguments(). This function looks at the
> >previous function statement and gives a list of the argument indices
> >that have nothing in them. In our example the list (0,1,2,3,4) would be
> >returned.
> 
> You would write FUNCTION (EmptyArguments ()), right?


No, the EmptyArguments() function is used INSIDE a function to gather
information about what has actually been supplied to the function. You
could write something like

FUNCTION(a,b,c,d){
   Z = EmptyArguments()
   If Member(3,Z) Then <do steps>
}

If the user now uses this function and writes FUNCTION(a,b,c,) where d
is missing then EmptyArguments() will return a list (3) with the index
of the argument that is there but empty. The function will then check
whether 3 is a member of the list Z, and if it is, a series of steps
will be performed. EmptyArguments() enables the programmer to easily
determine what is not there and respond to it.
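
The same idea can be approximated in Python, as a sketch only. Python
has no empty argument slots, so this assumes a made-up marker object
EMPTY that the caller passes where the slot would be left blank. The
helper names are likewise hypothetical.

```python
EMPTY = object()   # hypothetical marker for a slot that was left blank


def empty_arguments(args):
    """Indexes of the slots that were supplied but left empty."""
    return [i for i, arg in enumerate(args) if arg is EMPTY]


def full_arguments(args):
    """Indexes of the slots that actually hold a value."""
    return [i for i, arg in enumerate(args) if arg is not EMPTY]


def function(*args):
    z = empty_arguments(args)
    if 3 in z:
        return "d is missing"      # stands in for <do steps>
    return "all present"


# FUNCTION(a,b,c,) becomes:
result = function(1, 2, 3, EMPTY)
```

Here the marker has to be typed out at every call, which is exactly the
extra effort the proposed syntax is meant to avoid.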


> >The 3rd function is the complement to the previous and is called
> >FullArguments(). This function returns a list of all indices that
> >actually had some characters typed into them.
> >This way of doing things gives us full argument control and enables us
> >to do some things programmatically that cannot be done in other
> >languages.
> 
> That is not quite true. You can always simulate this with standard means like arrays and structures, with modest extra effort. No additional rules required!


Perhaps so, but how modest is this extra effort? Why have the extra
effort in the first place if you can do the task in a more direct
manner?


> >How about an example? Let us write a function called Pwr() that returns
> >the power of a number. Let the statement Pwr(s,t) be equivalent to the
> >statement s^t. Let the function allow us to input up to 5 powers in
> >all, so that the statement Pwr(s,t,u,v,w,x) would be equivalent to
> >(((((s^t)^u)^v)^w)^x). If the 1st argument "s" is empty then we wish
> >the base to default to 2.718..., the base of natural logarithms; thus
> >Pwr(,t) is equivalent to exp(t). If there is only one argument then
> >the function takes the square of whatever you put into it, i.e.
> >Pwr(x) is the same as x^2. If there are two arguments and the 2nd
> >argument is empty then the power is assumed to be 3, i.e. Pwr(x,) is
> >the same as x^3. If there are three or more arguments and any of the
> >power arguments are empty then they are assumed to be 2. So the
> >statement Pwr(s,,,) would be the same as (((s^2)^2)^2). We would
> >write our function declaration like this:
> >
> >double Pwr(<7,
> >           (Base(0),Power(1-)),
> >           (double),
> >           (2.718(0),2(1-)),
> >          )
> >{
> >  <program statements>
> >}
> >Now using our special three functions we can access all the argument
> >information we need to write a program that does all of the above.
> >There is no way to write a function with ALL the features described
> >without something like what I have described. I think my way is more
> >intuitive; we do not have to deal with Paramarrays and such.
> 
> I don't think it is more intuitive: It requires you to learn additional rules. It is not self-explanatory (at least not to me). Sorry, but "2(1-)" somehow looks like a syntax error (and in another context, it would be one).


Right, not intuitive, just different. Additional rules? No more so than
normal languages; look in your standard textbook at how much space is
devoted to explaining optional arguments, variable numbers of arguments,
passing arrays, etc. I don't think this involves EXTRA rules, just
different rules.
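
For comparison, here is roughly what the Pwr() rules quoted above come
to when simulated in Python. This is a sketch under assumptions, not
the proposed syntax: empty slots are represented by a made-up marker
object EMPTY, and the "<7" limit is read as "at most 6 arguments".

```python
import math

EMPTY = object()   # hypothetical marker for an empty argument slot


def pwr(*args):
    """Sketch of Pwr(): args[0] is the base, the rest are powers."""
    if not 1 <= len(args) < 7:
        raise TypeError("Pwr takes between 1 and 6 arguments")

    # An empty base defaults to e, so Pwr(,t) behaves like exp(t).
    base = math.e if args[0] is EMPTY else args[0]
    powers = list(args[1:])

    if not powers:
        powers = [2]                      # Pwr(x) is x^2
    elif len(powers) == 1:
        if powers[0] is EMPTY:
            powers = [3]                  # Pwr(x,) is x^3
    else:
        # Three or more arguments: empty powers default to 2.
        powers = [2 if p is EMPTY else p for p in powers]

    result = base
    for p in powers:
        result = result ** p              # (((s^t)^u)^v)...
    return result
```

A call such as pwr(2, EMPTY, EMPTY, EMPTY) plays the role of
Pwr(2,,,).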



> >So, does anyone like any of this? Detest it? I await your insightful
> >comments.
> 
> I'd suggest: Let's have a fixed number of arguments and the ability to pass manifest (literal) lists to support variability, like this:
> 
> int: Foo (int: arg1, string: arg2 = "", list(int): arg3 = {1,2}) {...}
> 
> (arg2 defaults to "", arg3 defaults to the list {1,2}).
> 
> With a call like this:
> 
> Foo (3, "Hello", {1,2,3});
> 
> To support "missing" values in a generally usable way, I'd introduce a special "skip" value (to be inserted for void operands) which can also be explicitly written as "?", like in Algol68. If in certain syntactical positions an operand would be missing, "skip" would be assumed. "skip" could be used for any data type, like NULL for pointers of any type in C++. Thus, you could check for missing values like this:
> 
> if xyz == ? then ...
> 
> Given this, we could even do without the default parameter mechanism.


This is good, this is exactly the kind of control I would like to have
in some straightforward fashion.


> IMHO it is important to keep the structure of the language simple. The best way to do this is to have a small set of orthogonal constructions which work the same everywhere. Furthermore, they should be distinguishable from each other without having to look at the context. Ideally, the same token would always stand for the same functionality, regardless of where it would be used. Example:
> 
> For a range, I'd write 1..2 instead of 1-2, because the meaning of "1-2" depends on the context (range 1-2 or numerical expression). Keeping track of the context adds a level of uncertainty to the language novice.
> 
> For the same reason, I'd use () only for expression precedence and function call. Most programmers are used to it, it is common practice and it works well. Using () for other purposes as well doesn't add to clarity. Best (bad) example IMO is the use of <> for template arguments in C++. I'd *never* do it this way.
> 


Your first paragraph here gives me deja vu, Dieter. I recall making the
argument for completely consistent syntax, but everyone balked when I
suggested applying it to ALL operators, including infix ones like
+-*/. The reason? They just don't like it. It reminds me of the American
attitude about the metric system: they don't want to get rid of those
silly feet and inches. Anything short of what I suggested automatically
depends on context! How can you write pow(a,b) and in the next line
write a+b rather than +(a,b) and not use context to tell what is going
on? That is one reason I don't like most languages: most of them have
"special" layouts for certain functions but not others, and each time
they do this the user has to remember yet another exception to what
could be a completely consistent grammar. I have often found it
interesting that, out of the hundreds of languages that have been
written, LISP is probably the only one that is grammatically
consistent. There are only 3 grammars possible:

x FUNC y    infix operators (inherently limited, requires precedence rules)
FUNC(x,y)   works fine, no precedence
(FUNC,x,y)  works fine, no precedence
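
As a small illustration of the third form, here is a Python sketch
(Python is only a stand-in; the FUNCTIONS table and the evaluate name
are invented for this example). It evaluates nested (FUNC,x,y) tuples
and looks every function up the same way, so the evaluator needs no
precedence rules because the grouping is always explicit.

```python
import operator

# Hypothetical operator table; every function is looked up the same way.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "pow": pow,
}


def evaluate(expr):
    """Evaluate a (FUNC, x, y) tuple; anything else is a plain value."""
    if not isinstance(expr, tuple):
        return expr
    name, *operands = expr
    values = [evaluate(operand) for operand in operands]
    return FUNCTIONS[name](*values)


# 1 + 2 * 3 has to be written with its grouping made explicit:
result = evaluate(("+", 1, ("*", 2, 3)))
```

Here "+" and "pow" are treated identically, which is the consistency
being argued for.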

If we want the minimum of context rules we are forced to pick one of the
last two forms and carry it out without exception.
Oh well, I am wasting my time. Human nature is against me, people want
feet and inches and the Julian calendar. Enough ranting for today!