From Hans-Dieter.Dreier@materna.de Mon Jul 12 16:33:10 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Mon, 12 Jul 1999 17:33:10 +0200
Subject: Declaring arguments to a function
In-Reply-To: <37752C18.1879352D@ibm.net>
Message-ID:

>Hans-Dieter.Dreier@materna.de wrote:
>
>> Let me state some observations from everyday programming:
>>
>> 1. Most functions have a fixed small number of arguments.
>> Few functions have a fixed large (> 5) number of arguments.
>> Very few functions (at least in C) have a variable number of arguments.
>> Most of them are I/O, and there is an alternative: use of chained infix
>> operators.
>
>I am not certain that I understand what you mean by "chained infix
>operators", could you give me
>an example?

cout << "The result is " << i << "." << endl

... or, to chain strings:

str = "The result is " + NumToStr (i) + "." + CrLf

... or, to build a list:

list = a, b, c

(Here the "," operator forms a new list from two simple values or appends a value to a list).

>> 3. If you have a list of tuples, the form (A a, B b, C c, ...) is easier to
>> read and write than the form (A B C, a b c) if the list is long. The second
>> form invites mismatches.
>
>This could be so, but if the argument number rarely creeps above 6 then
>there is not much difference. My feeling was that by
>splitting the separate aspects of the declaration into their constituent
>parts, the user could focus on any one of those parts by looking at
>the particular line that applies to that aspect. The conventional way
>requires the user to read across and
>down scanning each argument declaration in turn, having to read
>everything until what is looked for is found.
>If I write
>
>[Name-JeremyDunn : BirthYear-1957 : HomeTown-Bellingham,
>Name-HarryHoudini : BirthYear-1957 : HomeTown-NewYork]
>
>I could also write this as
>
>[Name(JeremyDunn,HarryHoudini),
> BirthYear(1957),
> HomeTown(Bellingham,NewYork)]
>
>There are some advantages to doing it this way, as one can concentrate
>on the category of information quickly. This particular case is more
>readable than the first method and displays the information in a more
>meaningful manner. The lack of an index on 1957 would have the default
>meaning of applying to all the Names, and the HomeTowns don't need to be
>indexed because they equal the number of Names and correspond directly
>to them. Indexing need only occur when the position of the item in the
>given list does not correspond with the same index in the Names list.

If I forget to specify a value, I'd prefer to get a warning, even if that means that I have to write a little more. Maybe it is simply a matter of personal preferences. How about allowing both forms, alternatively?

>> > Suppose the 1st argument
>> >is named Base but we have several related arguments following it of the
>> >same type in our program? We could write (Base(0),Power(1-)) where the
>> >hyphen means that all arguments following index 0 are related somehow.
>> >The compiler would automatically number these extra argument names as
>> >Power1, Power2 and so on, and that is how we would call them in our
>> >program.
>>
>> Nice, but how often do you need such a feature?
>
>Not often. You know, I have a spare tire in my car that I have never had
>to use but I think I'll keep it
>there anyhow.

Nice analogy. Actually, I think spare tires are superfluous (if you keep to populated areas), so we'd save a lot of gas if we'd remove them. How about including a backup assembler with the compiler, in case the compiler breaks down? (Just kidding).

>> >Now we have three special commands in our language that are very useful
>> >in regards to all of this.
>> >These three functions have no arguments but
>> >return information that is very useful. The 1st function
>> >NumberOfArguments() basically counts the number of commas in the
>> >argument part of the function and adds one to give the total number of
>> >argument spaces that are defined by your statement. So if you wrote
>> >
>> >FUNCTION(,,,,)
>> >
>> >with nothing actually input into the statement then NumberOfArguments()
>> >would say there are 5 argument slots.
>> >
>> >The 2nd function is called EmptyArguments(). This function looks at the
>> >previous function statement and gives a list of the argument indices
>> >that have nothing in them. In our example the list (0,1,2,3,4) would be
>> >returned.
>>
>> You would write FUNCTION (EmptyArguments ()), right?
>
>No, the EmptyArguments() function is used INSIDE a function to gather
>information about what has actually been
>supplied to the function. You could write something like
>
>FUNCTION(a,b,c,d){
>  Z = EmptyArguments()
>  If Member(3,Z) Then
>}
>
>If the user now uses this function and writes FUNCTION(a,b,c,) where d
>is missing then EmptyArguments() will return a list (3) with the index
>of the argument that is there but empty. The function will then compare
>3 to the list Z and if 3 is in the
>list then a series of steps will be performed. EmptyArguments() enables
>the programmer to easily determine what is not there and respond to it.

I see. I thought you wanted to supply default values from "the previous function statement" which I misunderstood as "the previous function call".

Why not check the argument directly, like this:

FUNCTION(a,b,c,d) {
  If IsEmpty(d) Then
}

Isn't that shorter? And you don't need to count arguments. Actually, names were invented for that purpose.

>> >The 3rd function is the complement to the previous and is called
>> >FullArguments(). This function returns a list of all indices that
>> >actually had some characters typed into them.
>> >This way of doing things gives us full argument control and enables us
>> >to do some things programmatically that cannot be done in other
>> >languages.
>>
>> That is not quite true. You can always simulate this with standard means
>> like arrays and structures, with modest extra effort. No additional rules
>> required!
>
>Perhaps so, but how modest is this extra effort? Why have the extra
>effort in the first place if you can do the task
>in a more direct manner?

Well, see example above. Isn't that more "direct" (i.e. shorter and clearer) than your approach?

>> >How about an example? Let us write a function called Pwr() that returns
>> >the power of a number. Let the statement Pwr(s,t) be equivalent to the
>> >statement s^t. Let the function allow us to input up to 5 extra powers
>> >so that the statement Pwr(s,t,u,v,w,x) would be equivalent to
>> >(((((s^t)^u)^v)^w)^x). If the 1st argument "s" is empty then we wish
>> >the
>> >base to default to 2.718... the base of natural logarithms thus Pwr(,t) is
>> >equivalent to exp(t). If there is only one argument then the function takes
>> >the square of whatever you put into it i.e. Pwr(x) is the same as x^2.
>> >If there are two arguments and the 2nd argument is empty then the power
>> >is assumed to be 3 i.e. Pwr(x,) is the same as x^3. If there are three
>> >or
>> >more arguments and any of the power arguments are empty then they are
>> >assumed to be 2. So the statement Pwr(s,,,) would be the same as
>> >(((s^2)^2)^2). We would write our function declaration like this:
>> >
>> >double Pwr(<7,
>> >          (Base(0),Power(1-)),
>> >          (double),
>> >          (2.718(0),2(1-)),
>> >          )
>> >{
>> >
>> >}
>> >Now using our special three functions we can access all the argument
>> >information we need to write a program that does all of the above.
>> >There
>> >is no way to write a function with ALL the features described without
>> >something like what I have described.
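
As a rough illustration of the Pwr() rules quoted above, here is a minimal Python sketch. It is not the proposed declaration syntax: the names `pwr` and `EMPTY` are hypothetical, and an explicit sentinel object stands in for an argument slot that was left empty, which is an assumption made only so the rules can be run in an existing language.

```python
import math

# Hypothetical marker for "this argument slot is present but empty".
EMPTY = object()


def pwr(*args):
    """Sketch of the quoted Pwr() rules; accepts 1 to 6 argument slots."""
    if not 1 <= len(args) <= 6:
        raise TypeError("pwr takes between 1 and 6 argument slots")

    base, powers = args[0], list(args[1:])

    # An empty first slot defaults the base to e, so Pwr(,t) == exp(t).
    if base is EMPTY:
        base = math.e

    if not powers:
        powers = [2]                      # Pwr(x)  == x^2
    elif len(powers) == 1 and powers[0] is EMPTY:
        powers = [3]                      # Pwr(x,) == x^3
    else:
        # Any other empty power slot defaults to 2.
        powers = [2 if p is EMPTY else p for p in powers]

    # Apply the powers left to right: (((s^t)^u)^v)...
    result = base
    for p in powers:
        result = result ** p
    return result
```

With this sketch, `pwr(2, 3)` gives 8, `pwr(5)` gives 25, `pwr(2, EMPTY)` gives 8, and `pwr(3, EMPTY, EMPTY, EMPTY)` gives ((3^2)^2)^2 = 6561.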
>> >I think my way is more intuitive,
>> >we do not have to deal with ParamArrays and such.
>>
>> I don't think it is more intuitive: It requires you to learn additional
>> rules. It is not self-explanatory (at least not to me). Sorry, but "2(1-)"
>> somehow looks like a syntax error (and in another context, it would be one).
>
>Right, not intuitive, just different. Additional rules? No more so than
>normal languages, look in your standard textbook at
>how much space is devoted to explaining optional arguments, variable
>number of arguments, passing arrays etc. I don't think
>this involves EXTRA rules, just different rules.

I would not allow variable numbers of arguments: use lists instead. I would not allow optional arguments: Either have a "skip" value (similar to C-NULL pointers, but potentially usable for any type), or - better still, because it requires no extra rule - somehow derive all potentially-optional values from a (predefined) singleton class that serves the purpose of supplying a "skip" value which can be tested for.

>> IMHO it is important to keep the structure of the language simple. The best
>> way to do this is to have a small set of orthogonal constructions which work
>> the same everywhere. Furthermore, they should be distinguishable from each
>> other without having to look at the context. Ideally, the same token would
>> always stand for the same functionality, regardless of where it would be used.
>> Example:
>>
>> For a range, I'd write 1..2 instead of 1-2, because the meaning of "1-2"
>> depends on the context (range 1-2 or numerical expression). Keeping track of
>> the context adds a level of uncertainty to the language novice.
>>
>> For the same reason, I'd use () only for expression precedence and function
>> call. Most programmers are used to it, it is common practice and it works
>> well. Using () for other purposes as well doesn't add to clarity. Best (bad)
>> example IMO is the use of <> for template arguments in C++.
>> I'd *never* do it this way.
>
>Your first paragraph here gives me deja vu, Dieter. I recall making the
>argument for completely consistent syntax but everyone
>balked when I suggested applying it to ALL operators including unary
>ones like +-*/. The reason? They just don't like it. It reminds me of
>the American attitude about the metric system, they don't want to get
>rid of those silly feet and inches. Anything short of what I suggested
>automatically depends on context! How can you write pow(a,b) and in the
>next line write a+b rather than +(a,b) and not use context to tell what
>is going on? That is one reason I don't like most languages, most of them
>have "special" layouts for certain functions but not others, and each
>time they do this requires the user to remember yet another exception to
>what could be a completely consistent grammar. I have often found it
>interesting that out of the hundreds of languages that have been written
>LISP is probably the only language that is grammatically
>consistent. There are only 3 grammars possible:
>
>x FUNC y       unary operators (inherently limited, requires precedence
>rules)
>FUNC(x,y)      works fine-no precedence
>(FUNC,x,y)     works fine-no precedence

Why do you call +-*/ "unary" operators? They require two operands, therefore they should be named "binary" or "infix" operators. What do you call operators that take just one argument, like "-" (to take the negative value of a number)?

>If we want the minimum of context rules we are forced to pick one of the
>last two forms and carry it out without exception.
>Oh well, I am wasting my time. Human nature is against me, people want
>feet and inches and the Julian calendar. Enough ranting for today!

Maybe the truth lies in the middle. Certainly, LISP has a very simple syntax, which is good, but it pays dearly by requiring a pretty printer just to be readable.
Taken to extremes, we could do with just two symbols, 0 and 1, and encode everything in binary, or we could go the APL way and invent a special character set which has a special symbol for each keyword (very short but, again, hard to read). IMO it is not by chance that most "main stream" programming languages nowadays look so similar to each other. It's the same as with bicycles or pencils: A mature design is hardly improvable unless done in a completely new way (say, programming by data glove).

Regarding the syntactic handling of special (i.e. operator) symbols like ordinary names: You are right, in principle, BUT if you also allow the "traditional" form, the syntax tends to get too permissive, allowing too many typos to form legal (but nonsensical) expressions. For this reason, it may also be harder for humans to read. And if you don't have traditional infix operators, you end up with LISP-like syntax.

IMO questions of program representation are not so important if the *structure* and the *concepts* of a language are well designed. Remember, one of the goals of Ultra is to allow for *different* front-ends: LISP-like or APL-like or C-like: no problem as long as there is a transformation to internal representation, which must be chosen to be as flexible as possible (AST). And of course, there must be runtime support for the services needed by the language, but that is exchangeable and extendable, if done right.

So, let's not talk about syntax. Let's talk about language structure and concepts and the foundations on which they build: memory management and execution mechanism, and not worry about representation.

--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From jeremydunn@ibm.net Tue Jul 13 22:59:48 1999
From: jeremydunn@ibm.net (Jeremy Dunn)
Date: Tue, 13 Jul 1999 14:59:48 -0700
Subject: Declaring arguments to a function
References:
Message-ID: <378BB6D4.9197EFF8@ibm.net>

Hans-Dieter.Dreier@materna.de wrote:

> >I am not certain that I understand what you mean by "chained infix
> >operators", could you give me
> >an example?
>
> cout << "The result is " << i << "." << endl
>
> ... or, to chain strings:
>
> str = "The result is " + NumToStr (i) + "." + CrLf
>
> ... or, to build a list:
>
> list = a, b, c
>
> (Here the "," operator forms a new list from two simple values or appends a value to a list).

Thanks, I know what you mean now.

> >[Name-JeremyDunn : BirthYear-1957 : HomeTown-Bellingham,
> >Name-HarryHoudini : BirthYear-1957 : HomeTown-NewYork]
> >
> >I could also write this as
> >
> >[Name(JeremyDunn,HarryHoudini),
> > BirthYear(1957),
> > HomeTown(Bellingham,NewYork)]
> >
> >There are some advantages to doing it this way, as one can concentrate
> >on the category of information quickly. This particular case is more
> >readable than the first method and displays the information in a more
> >meaningful manner. The lack of an index on 1957 would have the default
> >meaning of applying to all the Names, and the HomeTowns don't need to be
> >indexed because they equal the number of Names and correspond directly
> >to them. Indexing need only occur when the position of the item in the
> >given list does not correspond with the same index in the Names list.
>
> If I forget to specify a value, I'd prefer to get a warning, even if that means that I have to write a little more. Maybe it is simply a matter of personal preferences.
> How about allowing both forms, alternatively?

Sure, I'm all for freedom of choice. Let the given situation dictate which form is more useful for readability.

> >> You would write FUNCTION (EmptyArguments ()), right?
> >
> >No, the EmptyArguments() function is used INSIDE a function to gather
> >information about what has actually been
> >supplied to the function. You could write something like
> >
> >FUNCTION(a,b,c,d){
> >  Z = EmptyArguments()
> >  If Member(3,Z) Then
> >}
> >
> >If the user now uses this function and writes FUNCTION(a,b,c,) where d
> >is missing then EmptyArguments() will return a list (3) with the index
> >of the argument that is there but empty. The function will then compare
> >3 to the list Z and if 3 is in the
> >list then a series of steps will be performed. EmptyArguments() enables
> >the programmer to easily determine what is not there and respond to it.
>
> I see. I thought you wanted to supply default values from "the previous function statement" which I misunderstood as "the previous function call".
>
> Why not check the argument directly, like this:
>
> FUNCTION(a,b,c,d) {
>   If IsEmpty(d) Then
> }
>
> Isn't that shorter? And you don't need to count arguments. Actually, names were invented for that purpose.

Sure, that would be fine. My main concern was with being able to have empty arguments without getting syntax errors. VB will let you write a statement like that but will not allow you to have an expression like FUNCTION(a,b,,,). Your arguments can be optional and not there but they cannot be empty. Using a ParamArray I have been able to use a little trick with the first argument to write something like FUNCTION(,a,b) but it just doesn't like that anywhere else.

> I would not allow variable numbers of arguments: use lists instead.
> I would not allow optional arguments: Either have a "skip" value (similar to C-NULL pointers, but potentially usable for any type), or - better still, because it requires no extra rule - somehow derive all potentially-optional values from a (predefined) singleton class that serves the purpose of supplying a "skip" value which can be tested for.

I don't know that I want the overhead of having to format my input in a list before I can input it into a function to get a result. For some functions this would be the logical way to do it, but for others I think it would be clunky. Using AutoLISP in AutoCAD requires one to input variable numbers of arguments in this manner and I often found that it required more verbiage than should have been necessary and detracted from the function having the most simple form possible. Suppose the addition function required you to supply your numbers as a list instead of being able to write a+b+c+ or +(a,b,c). I don't think I like this as a general rule.

If I understand you correctly, if we have a function of the form FUNCTION(a,b,c,d,e) that can have up to 5 arguments and the last three arguments would normally be optional then you would require me to write FUNCTION(a,b,,,) if I don't need the last three? I would not be able to write FUNCTION(a,b)?

> >Your first paragraph here gives me deja vu, Dieter. I recall making the
> >argument for completely consistent syntax but everyone
> >balked when I suggested applying it to ALL operators including unary
> >ones like +-*/. The reason? They just don't like it. It reminds me of
> >the American attitude about the metric system, they don't want to get
> >rid of those silly feet and inches. Anything short of what I suggested
> >automatically depends on context! How can you write pow(a,b) and in the
> >next line write a+b rather than +(a,b) and not use context to tell what
> >is going on?
> >That is one reason I don't like most languages, most of them
> >have "special" layouts for certain functions but not others, and each
> >time they do this requires the user to remember yet another exception to
> >what could be a completely consistent grammar. I have often found it
> >interesting that out of the hundreds of languages that have been written
> >LISP is probably the only language that is grammatically
> >consistent. There are only 3 grammars possible:
> >
> >x FUNC y       unary operators (inherently limited, requires precedence
> >rules)
> >FUNC(x,y)      works fine-no precedence
> >(FUNC,x,y)     works fine-no precedence
>
> Why do you call +-*/ "unary" operators? They require two operands, therefore they should be named "binary" or "infix" operators. What do you call operators that take just one argument, like "-" (to take the negative value of a number)?

You're right, bad terminology on my part. Even with my bad terminology the "-" operator when used to indicate negation is really not a unary operator because -X is simply a shorthand for 0-X which is a binary operator.

> >If we want the minimum of context rules we are forced to pick one of the
> >last two forms and carry it out without exception.
> >Oh well, I am wasting my time. Human nature is against me, people want
> >feet and inches and the Julian calendar. Enough ranting for today!
>
> Maybe the truth lies in the middle. Certainly, LISP has a very simple syntax, which is good, but it pays dearly by requiring a pretty printer just to be readable. Taken to extremes, we could do with just two symbols, 0 and 1, and encode everything in binary, or we could go the APL way and invent a special character set which has a special symbol for each keyword (very short but, again, hard to read). IMO it is not by chance that most "main stream" programming languages nowadays look so similar to each other.
> It's the same as with bicycles or pencils: A mature design is hardly improvable unless done in a completely new way (say, programming by data glove).

I'll take the readability of LISP over PERL any day. You're right, it is not by accident that most mainstream languages look similar to each other. Most TV programs look similar to each other also, but I don't think that is necessarily an indicator of good writing. C++ for instance uses << in a streaming context and elsewhere as a binary left shift operator, a very bad human factors design in my opinion. Probably most language designers come from a C background and kludge their syntax on top of it so that they can minimize their effort of relearning. I learned a GIS system created by programmers, it had the most godawful bad human interface a person could create, it was almost as though they deliberately designed it to be as bad as possible. I am not certain that programmers are necessarily the most qualified to design syntax.

Has anyone got any information other than personal opinion that shows a tabulation of different types of syntaxes and how difficult or readable a particular syntax is? I often hear "that is harder to read" (including me!) without any human factors experimental results to back up such a statement.

> Regarding the syntactic handling of special (i.e. operator) symbols like ordinary names: You are right, in principle, BUT if you also allow the "traditional" form, the syntax tends to get too permissive, allowing too many typos to form legal (but nonsensical) expressions. For this reason, it may also be harder for humans to read. And if you don't have traditional infix operators, you end up with LISP-like syntax.

Right, or Mathematica syntax which I also love. The nice thing about consistent syntax is that once you know it for one function you know it for ALL of them.
Consider VB: first you learn to write A + B, then you learn to write Abs(X), then you learn the special forms for a Select Case statement (remember those colons!). That isn't the end, the FOR function has a special syntax, CONST statements have a special form. It is like this with all the C-like languages out there and that is why they are all such a pain in the ass to dive into.

> IMO questions of program representation are not so important if the *structure* and the *concepts* of a language are well designed. Remember, one of the goals of Ultra is to allow for *different* front-ends: LISP-like or APL-like or C-like: no problem as long as there is a transformation to internal representation, which must be chosen to be as flexible as possible (AST). And of course, there must be runtime support for the services needed by the language, but that is exchangeable and extendable, if done right.
>
> So, let's not talk about syntax. Let's talk about language structure and concepts and the foundations on which they build: memory management and execution mechanism, and not worry about representation.

This is probably true, the structure and concepts of the language are a higher priority. However, syntax discussion IS important because you are not going to be able to create this general front end without input as to what kind of syntax forms people find important to their needs. To say let's not talk about syntax and not worry about representation is to me a little like saying let's concentrate on the carburetor and the piston and not concern ourselves as to what the car looks like. If you give the user a 454 V8 and he wants to put a Pinto body on it he might be frustrated. Syntax discussion provokes thoughts about ways of doing things that one might normally have not thought of, I know this discussion has been very helpful to me.

Yours,

Jeremy Dunn

From Hans-Dieter.Dreier@materna.de Wed Jul 14 14:49:41 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Wed, 14 Jul 1999 15:49:41 +0200
Subject: Declaring arguments to a function
In-Reply-To: <378BB6D4.9197EFF8@ibm.net>
Message-ID:

>> Why not check the argument directly, like this:
>>
>> FUNCTION(a,b,c,d) {
>>   If IsEmpty(d) Then
>> }
>>
>> Isn't that shorter? And you don't need to count arguments. Actually, names
>> were invented for that purpose.
>
>Sure, that would be fine. My main concern was with being able to have
>empty arguments without getting syntax errors. VB will let you write a
>statement like that but will not allow you to have an expression like
>FUNCTION(a,b,,,). Your arguments can be optional and not there but they
>cannot be empty. Using a ParamArray I have been able to use a little
>trick with the first argument to write something like FUNCTION(,a,b) but
>it just doesn't like that anywhere else.

If we define that missing operands default to "skip" (which can be explicitly written as "?"), it would be legal to write FUNCTION (a,b,,,) which would be the same as FUNCTION (a,b,?,?,?). It would also be legal to write

If d = ? Then ...

or even

If d = Then ...

which is nice, orthogonal and invites a lot of errors due to missing operands.

>> I would not allow variable numbers of arguments: use lists instead. I would
>> not allow optional arguments: Either have a "skip" value (similar to C-NULL
>> pointers, but potentially usable for any type), or - better still, because it
>> requires no extra rule - somehow derive all potentially-optional values from a
>> (predefined) singleton class that serves the purpose of supplying a "skip"
>> value which can be tested for.
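
The "skip" idea discussed above can be sketched in Python. This is only an illustration under stated assumptions: `Skip`, `SKIP`, `is_empty`, `function` and `strict_function` are hypothetical names, and Python default parameters stand in for "missing operands default to skip". The liberal form accepts omitted trailing arguments; the strict form demands that every slot be written out, using `SKIP` where the "?" would go.

```python
class Skip:
    """Hypothetical singleton class supplying the one 'skip' value."""
    _instance = None

    def __new__(cls):
        # Always hand back the same object, so there is only one skip value.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "?"


SKIP = Skip()


def is_empty(value):
    """Test for the skip value, in the spirit of IsEmpty(d)."""
    return value is SKIP


def function(a, b, c=SKIP, d=SKIP, e=SKIP):
    """Liberal form: omitted trailing arguments silently become SKIP."""
    slots = zip("abcde", (a, b, c, d, e))
    return [name for name, value in slots if is_empty(value)]


def strict_function(a, b, c, d, e):
    """Strict form: every slot must be supplied, with SKIP written explicitly."""
    slots = zip("abcde", (a, b, c, d, e))
    return [name for name, value in slots if is_empty(value)]
```

Here `function(1, 2)` reports the empty slots c, d and e, while `strict_function(1, 2)` is rejected with a `TypeError` and has to be written as `strict_function(1, 2, SKIP, SKIP, SKIP)`.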
>>
>
>I don't know that I want the overhead of having to format my input in a
>list before I can input it into a function to get a result.

Not that much overhead: Just add extra parentheses to form a list

printf ("%s%d", ("What's the meaning of it all?", 42))

>For some
>functions this would be the logical way to do it, but for others I think
>it would be clunky. Using AutoLISP in AutoCAD requires one to input
>variable numbers of arguments in this manner and I often found that it
>required more verbiage than should have been necessary and detracted
>from the function having the most simple form possible. Suppose the
>addition function required you to supply your numbers as a list instead
>of being able to write a+b+c+ or +(a,b,c). I don't think I like this as
>a general rule. If I understand you correctly, if we have a function of
>the form FUNCTION(a,b,c,d,e) that can have up to 5 arguments and the
>last three arguments would normally be optional then you would require
>me to write FUNCTION(a,b,,,) if I don't need the last three? I would not
>be able to write FUNCTION(a,b)?

Normally yes, but one could define that missing parameters default to "skip" (and risk more programming errors due to accidentally omitting a parameter).

It's the old story: Either have a liberal syntax at the expense of inviting errors, or have a strict one, requiring more writing but being less error prone.

>I'll take the readability of LISP over PERL any day. You're right, it is
>not by accident that most mainstream languages look similar to each
>other. Most TV programs look similar to each other also, but I don't
>think that is necessarily an indicator of good writing.

Now you're mixing style and content.

>C++ for instance
>uses << in a streaming context and elsewhere as a binary left shift
>operator, a very bad human factors design in my opinion.

I agree - IMHO C (and C++ to an even greater extent) are no shining examples of well designed syntax.
See multiple purpose use of "break", "static", "<", ">".

>[...] Has
>anyone got any information other than personal opinion that shows a
>tabulation of different types of syntaxes and how difficult or readable
>a particular syntax is?

Not AFAIK. Readability certainly is hard to quantify.

>I often hear "that is harder to read" (including me!) without any human
>factors experimental results to back up such a statement.

True. People tend to like best what they are used to, we have to accept this, it's another point in favour of sticking to standards.

>> IMO questions of program representation are not so important if the
>> *structure* and the *concepts* of a language are well designed. Remember, one
>> of the goals of Ultra is to allow for *different* front-ends: LISP-like or
>> APL-like or C-like: no problem as long as there is a transformation to
>> internal representation, which must be chosen to be as flexible as possible
>> (AST). And of course, there must be runtime support for the services needed by
>> the language, but that is exchangeable and extendable, if done right.
>>
>> So, let's not talk about syntax. Let's talk about language structure and
>> concepts and the foundations on which they build: memory management and
>> execution mechanism, and not worry about representation.
>
>This is probably true, the structure and concepts of the language are a
>higher priority. However, syntax discussion IS important because you are
>not going to be able to create this general front end without input as
>to what kind of syntax forms people find important to their needs. To
>say let's not talk about syntax and not worry about representation is to
>me a little like saying let's concentrate on the carburetor and the
>piston and not concern ourselves as to what the car looks like. If you
>give the user a 454 V8 and he wants to put a Pinto body on it he might
>be frustrated.
>Syntax discussion provokes thoughts about ways of doing
>things that one might normally have not thought of, I know this
>discussion has been very helpful to me.

I was assuming that there is a (suitable) translation between representation and language. Given the language, it should be rather easy to find one. Keeping the language structure simple, orthogonal yet powerful - that's the interesting part. Of course it is easier to find a nice syntax for a simple language than for a complicated one.

>Yours,
>
>Jeremy Dunn

--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Wed Jul 14 18:04:33 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Wed, 14 Jul 1999 19:04:33 +0200
Subject: NULLs
Message-ID:

NULLs
=====

Often there is a situation where a special value is needed that expresses the fact that there is "no" value.
Examples include NULL pointers in C, "skip" values in Algol68, "NANs" (Not-A-Number) in arithmetic, NULL in SQL.

They share the following common characteristics:

1. There is only one such value (no multiple instances).
2. It can show up wherever a value of the underlying types is expected (pointer, any value, numbers), but ...
3. ... it is a type of its own, therefore ...
4. ... only a restricted set (if any) of the operations possible for the underlying type(s) is allowed for it.

NULLs are very convenient but they also introduce some problems:

1. They do not fit nicely into a static type system.
2. They require (runtime) checks wherever they can appear in order to ensure that no invalid operations are performed on them.

The solutions C++ uses look somewhat unsatisfactory to me:
One is only applicable to reference types (which may be either pointers, X*, allowing NULLs, or references, X&, disallowing them),
the other (union and casts) depends on correct handling by the programmer and may cause crashes that the compiler cannot detect.

To me, problem #1 seems to call for something like a "union" of types,
where an object may be of any of a number of otherwise unrelated types and the compiler sees to it that no invalid operations can happen.
More specifically, only operations are allowed that are legal for the most derived type that is a common ancestor of all the possible types of the value in question.

Problem #2 is IMO best avoided by forbidding this situation. Something like this:

A, B: aUnion    // object named aUnion may be an A or a B

... to be used like this:

if aUnion.classOf == A    // classOf is allowed for any object
then    // here compiler knows aUnion is an A
else    // here compiler knows it is something else, in this case it must be a B

IOW, you cannot use aUnion as an A unless you make sure (by a conditional) it really *is* an A.

Another possibility would be to make sure that an exception is raised if an invalid operation takes place,
but performing a type check each time (to make sure the exception is raised) is likely to be too slow.

Any ideas or comments?
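
For readers who think in C++, the checked-union idea above can be approximated with std::variant from C++17, which postdates this post. This is only a sketch under assumptions: the classes A and B and their member functions foo and bar are hypothetical stand-ins, and unlike the proposal, C++ performs the check at run time instead of having the compiler track it from statement to statement.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical stand-ins for the unrelated classes A and B in the post.
struct A { std::string foo() const { return "A::foo"; } };
struct B { std::string bar() const { return "B::bar"; } };

// "A, B: aUnion" -- a value that is either an A or a B.
using AorB = std::variant<A, B>;

// std::get_if yields a null pointer unless the variant currently holds
// the requested alternative, so A's operations are reachable only inside
// the branch where the check succeeded.
std::string describe(const AorB& aUnion) {
    if (const A* a = std::get_if<A>(&aUnion)) {
        return a->foo();                        // here aUnion is known to be an A
    }
    assert(std::holds_alternative<B>(aUnion));  // only B is left
    return std::get<B>(aUnion).bar();
}
```

Calling describe(AorB{A{}}) takes the first branch and describe(AorB{B{}}) the second. The else branch still relies on the programmer's reasoning, which is exactly the gap the proposed compiler check is meant to close.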
--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Fri Jul 16 16:56:48 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 17:56:48 +0200
Subject: NULLs
Message-ID:

>> >I think that it'd be possible to offer a higher-level construct for
>> >this than checking to see if a pointer equaled NULL, though. Maybe a
>> >defined keyword? Would this fit with the static type system? Of
>> >course, it would probably be implemented by checking to see if the
>> >reference was set to NULL but we don't need to know that.
>>
>> Could you show an example?
>
>void Foo(Bar* pBar)
>{
>    if(valid(pBar))
>    {
>        // pBar points to a valid Bar object.
>    }
>}
>
>valid() could be a keyword much like sizeof() in C++. The compiler would
>require you to check to make sure that a pointer was valid before
>allowing you to use it unless of course you're going to support some
>sort of precondition clauses like Eiffel. If a precondition was that
>the pointer had to be valid, then you wouldn't need to check it in
>your function body.

I already thought it would look like that. It can be done this way, of course; but I think it's not general enough - it covers just the case of NULL pointers and nothing else. IMO a construct should be as generally applicable as possible.

>I'm actually starting to like your idea of union types. They might
>come in handy for functions that could possibly work on two disparate
>types. Should it be possible to define a union of types and use that
>union as the type?

Yes.
It must be possible to use the union of types everywhere a type is allowed. Example:

proc ((number | null) a) string OptionalNumberToString
{
    if classOf a == null
    then ""
    else NumberToString (a)
}

This could be the definition of a function which accepts a number or a null and returns a string. Of course, you would define a type

class optionalNumber : number | null

and use that instead, for convenience. I chose the "|" symbol to denote a union in order to avoid confusion with the list-building symbol. ":" means "is derived from" (like in C++).

>For example:
>
>union MyUnion : Foo, Bar, Baz;
>
>void doSomething(MyUnion* xxx)
>{
>    // now you have to do some sort of switch on the type of xxx
>}
>
>Should you be required to declare all pointers as a union between
>the type you want and the NULL type or should that be implied?

You would define a parameterized class from the predefined parameterized class "ref" that does not allow NULL pointers:

class pointer : ref | null;

pointer ptr1;
ref ptr2;

ptr1 allows NULLs, ptr2 doesn't.

BTW: I used C++ notation for the parameterized ("template") class here because I assume you are familiar with C++, but I would rather not use "<>" for a production syntax. My syntax could look like this:

class: pointer a = ref a | null
pointer int: ptr1
ref int: ptr2

On the left of the colon is the class of what you define, then comes the name, then an equal sign and the value. If something has only one parameter, you may omit the parentheses. So you could also write:

pointer (int): ptr1

I only mentioned this because you might see examples of mine using similar syntax. For now, every syntax is OK for discussion purposes as long as we understand the meaning - I think we should not get involved in a discussion about syntax here.

>By the way, did you know that all of your equal ("=") signs are being
>translated into "=3D" for some strange reason?
I suspect it must be the combination of your mail/news reader and mine, because I never saw this before. I also notice that my line feeds seem to get a preceding equal sign.

>Bye,
>Jason.
>
>--
>Is your email secure? http://www.pop3now.com
>(c) 1998,1999 Cave Creations Corp. All rights reserved.

Do you mind if I cross-post our recent conversation to the Ultra mailing list?
If we post to that list, we could let the others participate. They certainly have interesting things to say.

--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Fri Jul 16 16:55:04 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 17:55:04 +0200
Subject: NULLs
Message-ID:

jason@injektilo.org wrote:

>Hi. I've been lurking for a while. Thought I'd add my $0.02 for a
>change.
>
>> NULLs are very convenient but they also introduce some problems:
>>
>> 1. They do not fit nicely into a static type system.
>
>Is this because you can assign NULL to any variable regardless of its
>type? Is this really that bad?

No. The problem comes later: when you use a variable that is NULL. In C++, a frequent error situation is "this == NULL" due to a method call in a situation like this:

MyClass* myPtr;
...
myPtr = NULL;
...
myPtr->Foo ();    // bang !

>> A, B: aUnion    // object named aUnion may be an A or a B
>>
>> ....
>> to be used like this:
>>
>> if aUnion.classOf == A    // classOf is allowed for any object
>> then    // here compiler knows aUnion is an A
>> else    // here compiler knows it is something else, in this case it must be a B
>>
>> IOW, you cannot use aUnion as an A unless you make sure (by a conditional) it
>> really *is* an A.
>
>How is this different than checking to see if the variable in
>question is set to NULL?

No difference. The important fact is that the compiler *checks* to make sure the required condition holds, and won't let you write things like

aUnion.Foo ()

in places where aUnion might be an object (such as NULL) that does not support Foo (). BTW, if the compiler would not only keep track of types but also keep track of values, the same mechanism would rule out stuff like

... if x < 0 then y = sqrt (x)

because the compiler would know that x < 0 in this place, which violates the precondition of sqrt that demands the argument to be >= 0,
*unless* you made sure that x >= 0 (in which case the compiler should emit a warning about unreachable code).

>You might actually be on to something. Some programmers claim that
>one should always check every pointer passed or returned to them as
>if they could possibly be NULL. This is unarguably a good programming
>practice. What if a language required that you check each pointer for
>NULL before you could use it in a function or after a function call?
>If the result of a function could never be NULL, then you could
>return a reference in which case the compiler wouldn't require you to
>check to see if it's NULL.

You get my point. The compiler would have to keep track of possible types (and maybe values) for each identifier and intermediate result to avoid forcing the programmer to write unnecessary checks, however.

Keeping track of possible types is rather easy because the type set can be easily handled.
Keeping track of possible values is hard; so it seems sensible to me to try to map as much as possible of the checking sketched above to type checking.

>I think that it'd be possible to offer a higher-level construct for
>this than checking to see if a pointer equaled NULL, though. Maybe a
>defined keyword? Would this fit with the static type system? Of
>course, it would probably be implemented by checking to see if the
>reference was set to NULL but we don't need to know that.

Could you show an example?

>Jason Diamond.

--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Fri Jul 16 16:54:32 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 17:54:32 +0200
Subject: NULLs
Message-ID:

(Forwarding a message from jason@injektilo.org):

Hi. I've been lurking for a while. Thought I'd add my $0.02 for a
change.

> NULLs are very convenient but they also introduce some problems:
>
> 1. They do not fit nicely into a static type system.

Is this because you can assign NULL to any variable regardless of its
type? Is this really that bad?

> A, B: aUnion    // object named aUnion may be an A or a B
>
> ....
> to be used like this:
>
> if aUnion.classOf == A    // classOf is allowed for any object
> then    // here compiler knows aUnion is an A
> else    // here compiler knows it is something else, in this case it must be a B
>
> IOW, you cannot use aUnion as an A unless you make sure (by a conditional) it
> really *is* an A.

How is this different than checking to see if the variable in
question is set to NULL?

You might actually be on to something. Some programmers claim that
one should always check every pointer passed or returned to them as
if they could possibly be NULL. This is unarguably a good programming
practice. What if a language required that you check each pointer for
NULL before you could use it in a function or after a function call?
If the result of a function could never be NULL, then you could
return a reference in which case the compiler wouldn't require you to
check to see if it's NULL.

I think that it'd be possible to offer a higher-level construct for
this than checking to see if a pointer equaled NULL, though. Maybe a
defined keyword? Would this fit with the static type system? Of
course, it would probably be implemented by checking to see if the
reference was set to NULL but we don't need to know that.

Jason Diamond.

From Hans-Dieter.Dreier@materna.de Fri Jul 16 16:58:35 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 17:58:35 +0200
Subject: NULLs
Message-ID:

(Forwarding a message from jason@injektilo.org)

Hi.
Hans-Dieter.Dreier@Materna.DE wrote:

> You would define a parameterized class from the predefined parameterized class "ref" that does not allow NULL pointers:
>
> class pointer : ref | null;
>
> pointer ptr1;
> ref ptr2;
>
> ptr1 allows NULLs, ptr2 doesn't.

This could be a problem. I'm assuming that you're declaring that pointer derives from ref, correct? If that's the case, then any operation that accepts a ref as a parameter will expect that parameter not to be null. If you pass in a pointer (which should be ok because pointer is a subtype of ref) it could possibly be null and your operation won't even bother to check for this because it thinks it only has a ref.

I think that you really want the pointer class to be a supertype of ref. This would prevent you passing a pointer to an operation that required a ref. Sather has this concept although I've only read about it so couldn't tell you how well it works in practice.

> I only mentioned this because you might see examples of mine using similar syntax. For now, every syntax is OK for discussion purposes as long as we understand the meaning - I think we should not get involved in a discussion about syntax here.

Agreed and understood. Syntax is important--just like the GUI for any application--but it's the semantics that really count.

> Do you mind if I cross-post our recent conversation to the Ultra mailing list?
> If we post to that list, we could let the others participate. They certainly have interesting things to say.

Go for it. When I initially replied I thought I was replying to the list but the web-based mail reader I was using from work apparently can't reply to all.

Jason.
--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Fri Jul 16 16:56:23 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 17:56:23 +0200
Subject: NULLs
Message-ID:

(Forwarding a message from jason@injektilo.org)

> Keeping track of possible types is rather easy because the type set can be
> easily handled. Keeping track of possible values is hard; so it seems sensible
> to me to try to map as much as possible of the checking sketched above
> to type checking.

I agree. If you're going to support preconditions, you might as well
let those do the checking for you. I think that it usually won't be
too obvious what the value of a variable is. Unless, of course, it
was a constant. Either way, it's a feature that could be optionally
added to the compiler and not affect the language in any way.

> >I think that it'd be possible to offer a higher-level construct for
> >this than checking to see if a pointer equaled NULL, though. Maybe a
> >defined keyword? Would this fit with the static type system? Of
> >course, it would probably be implemented by checking to see if the
> >reference was set to NULL but we don't need to know that.
>
> Could you show an example?

void Foo(Bar* pBar)
{
    if(valid(pBar))
    {
        // pBar points to a valid Bar object.
    }
}

valid() could be a keyword much like sizeof() in C++.
The compiler would
require you to check to make sure that a pointer was valid before
allowing you to use it unless of course you're going to support some
sort of precondition clauses like Eiffel. If a precondition was that
the pointer had to be valid, then you wouldn't need to check it in
your function body.

I'm actually starting to like your idea of union types. They might
come in handy for functions that could possibly work on two disparate
types. Should it be possible to define a union of types and use that
union as the type?

For example:

union MyUnion : Foo, Bar, Baz;

void doSomething(MyUnion* xxx)
{
    // now you have to do some sort of switch on the type of xxx
}

Should you be required to declare all pointers as a union between
the type you want and the NULL type or should that be implied?

By the way, did you know that all of your equal ("=") signs are being
translated into "=3D" for some strange reason?

Bye,
Jason.

From Hans-Dieter.Dreier@materna.de Fri Jul 16 17:42:48 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 18:42:48 +0200
Subject: NULLs
In-Reply-To: <378F5312.F1ABA0A0@injektilo.org>
Message-ID:

>Hi.
>
>Hans-Dieter.Dreier@Materna.DE wrote:
>
>> You would define a parameterized class from the predefined parameterized
>> class "ref" that does not allow NULL pointers:
>>
>> class pointer : ref | null;
>>
>> pointer ptr1;
>> ref ptr2;
>>
>> ptr1 allows NULLs, ptr2 doesn't.
>
>This could be a problem.
>I'm assuming that you're declaring that pointer
>derives from ref, correct?

No. It *may* be a ref (genuine ref, not derived from ref), but it may also be a NULL. Or contain a null? I'm not sure. Thinking it over, it seems to be a "has-a" relationship rather than "is-a".
Anyway, as long as no check has been made, the compiler will allow only operations of the most derived common ancestor of both ref and null because that is the best guess it can make.

If the right hand side of the definition contains a "|" (meaning "...or may be a..." in this case), there is no direct inheritance involved. Maybe it would have been better to pick another syntax altogether.

>If that's the case, then any operation that accepts
>a ref as a parameter will expect that parameter not to be null.

Correct.

>If you pass in
>a pointer (which
>should be ok because pointer is a subtype of ref) it could possibly be null
>and your operation won't even bother to check for this because it thinks it
>only has a ref.
>
>I think that you really want the pointer class to be a supertype of ref. This
>would prevent you passing a pointer to an operation that required a ref.
>Sather has this concept although I've only read about it so couldn't tell you
>how well it works in
>practice.

If pointer were a supertype of ref, you could pass a ref to a function that expects a pointer.
Inside the function, the compiler would treat the ref as a pointer because that's how it was declared. You could set it to null, for example. Keyword: Contravariance. The problem would have to be caught by the runtime (overriding the "set-to-null" functionality of pointer to produce an exception), which is a Bad Thing for a statically type-safe system.

>Jason.
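
The aliasing hazard described above can be made concrete in C++. This is a minimal sketch, assuming a hypothetical non-null wrapper Ref and a nullable alias Pointer (neither is taken from the thread's actual design): converting a Ref to a Pointer by value is harmless, whereas a function that expects a mutable Pointer must not be allowed to operate directly on a Ref.

```cpp
#include <cassert>

// Hypothetical model of the two classes from the thread:
// Pointer may be null, Ref never is.
using Pointer = int*;

class Ref {
public:
    explicit Ref(int& target) : target_(&target) {}
    int& get() const { return *target_; }
    // Read-only widening: handing out a copy as a Pointer is harmless.
    Pointer asPointer() const { return target_; }
private:
    int* target_;   // invariant: never null
};

// A function written against Pointer may legitimately null its argument.
void release(Pointer& p) { p = nullptr; }

// Nulling a widened copy cannot reach the Ref it came from.
int readAfterRelease(int& storage) {
    Ref r(storage);
    Pointer copy = r.asPointer();
    assert(copy != nullptr);   // a freshly widened Ref is never null
    release(copy);             // only the copy is set to null
    return r.get();            // still safe: r's invariant is intact
}
```

If Ref were declared as a subtype of Pointer, a call like release(r) would compile and break the never-null invariant. In this sketch it is rejected at compile time because a Ref does not convert to Pointer&.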
--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Fri Jul 16 17:48:41 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Fri, 16 Jul 1999 18:48:41 +0200
Subject: NULLs
Message-ID:

sami@pefletti.saunalahti.fi wrote:

>Hans-Dieter.Dreier@materna.de wrote:
>
>> NULLs
>> =====
>>
>> Often there is a situation where a special value is needed that expresses
>> the fact that there is "no" value.
>>
>> NULLs are very convenient but they also introduce some problems:
>>
>> 1. They do not fit nicely into a static type system.
>> 2. They require (runtime) checks wherever they can appear in order to
>> ensure that no invalid operations are performed on them.
>>
>> Any ideas or comments?
>
>in modern functional languages there is a powerful mechanism called algebraic
>datatypes. they can be used to express many kinds of datastructures. one
>common datatype
>is an optional datatype that can include a value or not. for example in
>Haskell:
>
>-- defines algebraic datatype with two constructors: Nothing, which includes no
>-- value, and Just, which includes a value of type a
>data Maybe a = Nothing | Just a
>
>-- then you can define functions for both cases using pattern matching
>maybe_square :: Maybe Int -> Maybe Int
>maybe_square Nothing = Nothing
>maybe_square (Just a) = Just (a*a)

I'm not so familiar with functional languages, although I think I get your point.
I'll try to translate your example to "imperative"; please tell me when I'm wrong:

"Maybe a" is a parameterised type (template in C++ speak),
"a" would be the template parameter.
"Just" is needed to allow pattern matching to distinguish between "a" and "Nothing". Otherwise "Nothing" could legally be substituted for "a".

"Just" is a constant, so to speak, where "a" is a variable. How does the compiler know that? Capitalization? If so, why is "maybe_square" lowercase?

"Maybe Int" in the definition of maybe_square makes sure that "a" actually must be an Int.
Can it also be derived from an Int? (if "derived" makes any sense here)

But why do you write "Just a" in the last line instead of "Just Int"?
And why the parentheses around "(Just a)"?

The comment at the beginning says "...algebraic datatype...".
Does "algebraic" mean "numerical"?
How do we know it is algebraic - it looks rather generic to me, could be any type (say, a string)?

Sorry for the delayed answer; I accidentally tried to post this answer first to your mail account, but it bounced.
If you post follow-ups to the Ultra mailing list, this would allow others to participate.

--
Regards,
Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From mrsmkl@saunalahti.fi Fri Jul 16 21:27:45 1999
From: mrsmkl@saunalahti.fi (Sami Mäkelä)
Date: Fri, 16 Jul 1999 23:27:45 +0300
Subject: NULLs
References:
Message-ID: <378F95C1.1F0A58AC@saunalahti.fi>

Hans-Dieter.Dreier@materna.de wrote:

> I'm not so familiar with functional languages, although I think I get your point. I'll try to translate your example to "imperative"; please tell me when I'm wrong:
>
> "Maybe a" is a parameterised type (template in C++ speak),
> "a" would be the template parameter.
> "Just" is needed to allow pattern matching to distinguish between "a" and "Nothing". Otherwise "Nothing" could legally be substituted for "a".
>
> "Just" is a constant, so to speak, where "a" is a variable. How does the compiler know that? Capitalization? If so, why is "maybe_square" lowercase?

Just and Nothing are really constructors for type Maybe. The compiler deals with them
much like functions. It is true that in some languages they must be
capitalized. maybe_square is a function; functions are handled in
functional languages the same way as all other values.

>"Maybe Int" in the definition of maybe_square makes sure that "a" actually must be an Int.
>Can it also be derived from an Int? (if "derived" makes any sense here)

Haskell doesn't have derivation in the same sense as OO languages, but you could change the type to

>maybe_square :: Num a => Maybe a -> Maybe a

this means that type a must be an instance of type class Num.

>But why do you write "Just a" in the last line instead of "Just Int"?
>And why the parentheses around "(Just a)"?

in "Just a", 'a' is a variable that is bound if the function argument matches
this case. parentheses are needed so that the compiler knows that "Just a"
is a single pattern

> The comment at the beginning says "...algebraic datatype...".
>Does "algebraic" mean "numerical"?
>How do we know it is algebraic - it looks rather generic to me, could be any type (say, a string)?

no it doesn't mean numerical; it can be any type.
another name for algebraic datatypes is (disjoint) sum types

From jason@injektilo.org Sat Jul 17 03:15:42 1999
From: jason@injektilo.org (jason)
Date: Fri, 16 Jul 1999 19:15:42 -0700
Subject: NULLs
References:
Message-ID: <378FE74E.D62A6B0B@injektilo.org>

Hans-Dieter.Dreier@materna.de wrote:

> >> You would define a parameterized class from the predefined parameterized
> >> class "ref" that does not allow NULL pointers:
> >>
> >> class pointer : ref | null;
> >>
> >> pointer ptr1;
> >> ref ptr2;
> >>
> >> ptr1 allows NULLs, ptr2 doesn't.
> >
> >This could be a problem. I'm assuming that you're declaring that pointer
> >derives from ref, correct?
>
> No. It *may* be a ref (genuine ref, not derived from ref), but it may also be a NULL. Or contain a null? I'm not sure. Thinking it over, it seems to be a "has-a" relationship rather than "is-a".
> Anyway, as long as no check has been made, the compiler will allow only operations of the most derived common ancestor of both ref and null because that is the best guess it can make.

So in this case it wouldn't allow any operations until the check was performed. Unless of course, NULL was a derivative of Object (or Any or whatever your base class for all classes in the language might be named).

> If the right hand side of the definition contains a "|" (meaning "...or may be a..." in this case), there is no direct inheritance involved. Maybe it would have been better to pick another syntax altogether.

If it can lead to confusion, and it obviously did in this case, it should be changed. I do realize these are simply examples and not any definitive syntax. How about:

class pointer :? ref | null

It looks funny but you definitely won't be confusing it with inheritance.

> >I think that you really want the pointer class to be a supertype of ref. This
> >would prevent you passing a pointer to an operation that required a ref.
> >Sather has this concept although I've only read about it so couldn't tell you
> >how well it works in
> >practice.
>
> If pointer were a supertype of ref, you could pass a ref to a function that expects a pointer.
> Inside the function, the compiler would treat the ref as a pointer because that's how it was declared. You could set it to null, for example. Keyword: Contravariance. The problem would have to be caught by the runtime (overriding the "set-to-null" functionality of pointer to produce an exception), which is a Bad Thing for a statically type-safe system.

Good point. Now that I understand your idea more, it makes a lot of sense. It seems to fit nicely into a static type system as long as the compiler requires you to test the type.

Jason.

From jeremydunn@ibm.net Sat Jul 17 20:43:24 1999
From: jeremydunn@ibm.net (Jeremy Dunn)
Date: Sat, 17 Jul 1999 12:43:24 -0700
Subject: If's And Loops
Message-ID: <3790DCDC.5C1DCC38@ibm.net>

Previous discussion on the portrayal of looping constructs and the many
forms they take in various languages has made me think more about what a
loop is, and especially how its form is not that much different from
If/Then structures (which also have a multiplicity of forms). So let us
try another way of representing things to see if there is any conceptual
advantage. Let us write

If A Then B as If(:=A->B)

and let us write

If A Then B Else C as If(:=A->B->C)

The := indicates that the expression which follows it is the test
expression. The -> symbol indicates that the next expression (or group
of statements) is evaluated if "A" is true else the next consequent C
will be evaluated. We could group multiple expressions to be executed
between curly brackets {} as in

If(:=A->{a,b,c}->{x,y,z})

where a,b,c,x,y,z are expressions to be evaluated.
Naturally we might rearrange this vertically for better
appearance as in

If(:=A
 ->{a,b,c}
 ->{x,y,z}
 )

The expression is evaluated left to right and every test condition may
have no more than two consequents, a true consequent and a false one. A
statement like If A then B ElseIf C Then D Else E would be in the form

If(:=A->B->:=C->D->E), or one could indent vertically to show structure

If(:=A
 ->B
 ->:=C
 ->D
 ->E
)

Each ElseIf statement is merely the False consequent of the previous
statement, the final Else statement is merely the False consequent of
the last ElseIf statement. This format I believe would obviate the need
for SWITCH, CHOOSE, IIF and all the other multiple IF-type commands. We
can also indicate a loop structure using this format. The two basic loop
formats would be shown as

If(B<-=:A) and
If(=:A->B)

In this case we reverse the := to =: to indicate that the test condition
will be repeatedly evaluated until it is True. The arrow shows the
direction to the False consequent B that is evaluated. In the first case
the statement is processed left to right and the body is evaluated first
and then the test statement. The second statement's test condition is
evaluated first then the body. Using interior parentheses we can
construct any nested structure of If's, Loops that we desire. To run a
loop 5 times we could write

If(=: (x = 5) ->B)

x is initialized to 0, it is not equal to 5 so it performs B,
increments by 1 (by default) and goes back to test again. If we want to
increment by something other than 1 then inside B we need to have a
statement

x+= N where N is the amount we wish to increment.

I would suggest modifying C++'s operators in the following way:

X+= means X = X+1
X-= means X = X-1
X*= means X = X*2
X/= means X = X/2
X^= means X = X^2

So to decrement a counter we can just write X-= .
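
As a reading aid, the three forms above can be paraphrased in C++. This is one interpretation, not the poster's definition: the function names choose, testFirst and bodyFirst are made up for the illustration, the body B is reduced to counting how often it runs, and the implicit counter step is written out as ++x.

```cpp
#include <cassert>

// If(:=A ->B ->:=C ->D ->E): each ElseIf is the false consequent of the
// preceding test, so the chain is just nested two-way choices.
int choose(bool a, bool c, int b, int d, int e) {
    return a ? b : (c ? d : e);
}

// If(=:A->B): test first, then run body B for as long as A is false.
// Here A is "x == limit" and the default step of 1 is explicit.
int testFirst(int limit) {
    assert(limit >= 0);   // otherwise x would never reach limit
    int x = 0;
    int bodyRuns = 0;
    while (!(x == limit)) {
        ++bodyRuns;       // stands in for body B
        ++x;
    }
    return bodyRuns;
}

// If(B<-=:A): run body B first, then test; repeat until A is true.
int bodyFirst(int limit) {
    assert(limit >= 1);   // the body always runs once before the first test
    int x = 0;
    int bodyRuns = 0;
    do {
        ++bodyRuns;       // stands in for body B
        ++x;
    } while (!(x == limit));
    return bodyRuns;
}
```

Under this reading testFirst(5) runs the body five times, matching the "run a loop 5 times" example, and the two loop forms differ only in whether the body can run zero times.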
It is a little unusual to see IF beginning a loop statement but I think
you can see the close relationship in abstract structure between the
two, the loop is only an if statement that repeats its test condition,
or at least it appears to me that that is one way of looking at it.

Any ideas, comments, problems?

From Hans-Dieter.Dreier@materna.de Mon Jul 19 10:22:27 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Mon, 19 Jul 1999 11:22:27 +0200
Subject: If's And Loops
In-Reply-To: <3790DCDC.5C1DCC38@ibm.net>
Message-ID:

Jeremy Dunn wrote ...

>Previous discussion on the portrayal of looping constructs and the many
>forms they take in various languages has made me think more about what a
>loop is, and especially how its form is not that much different from
>If/Then structures (which also have a multiplicity of forms). So let us
>try another way of representing things to see if there is any conceptual
>advantage. Let us write
>
>If A Then B as If(:=A->B)
>
>and let us write
>
>If A Then B Else C as If(:=A->B->C)

Everyone used to Pascal would mistake := for an assignment operator.

>The := indicates that the expression which follows it is the test
>expression.

What other than a test can possibly follow an "if"?

>The -> symbol indicates that the next expression (or group
>of statements) is evaluated if "A" is true else the next consequent C
>will be evaluated. We could group multiple expressions to be executed
>between curly brackets {} as in
>
>If(:=A->{a,b,c}->{x,y,z})

C programmers would mistake -> for a pointer access. The token "->" also has a different meaning in functional languages.
If you use -> to denote both "then" and "else", the token will not be self-explanatory; you'd have to look at the context instead to determine what it means, which is a possible source of errors.
>where a,b,c,x,y,z are expressions to be evaluated. Naturally we might
>rearrange this vertically for better
>appearance as in
>
>If(:=A
>   ->{a,b,c}
>   ->{x,y,z}
>  )
>
>The expression is evaluated left to right and every test condition may
>have no more than two consequents, a true consequent and a false one. A
>statement like If A then B ElseIf C Then D Else E would be in the form
>
>If(:=A->B->:=C->D->E), or one could indent vertically to show structure
>
>If(:=A
>   ->B
>   ->:=C
>   ->D
>   ->E
>)
>
>Each ElseIf statement is merely the False consequent of the previous
>statement, the final Else statement is merely the False consequent of
>the last ElseIf statement. This format I believe would obviate the need
>for SWITCH, CHOOSE, IIF and all the other multiple IF-type commands.

Switch statements can always be replaced by if .. then .. elseif ... cascades. Switches were introduced to save typing, especially to avoid redundant typing of the condition. Also, I guess, to make optimisation easier.

To be frank, I can't see any significant improvement in the above notation compared to the standard.

> We
>can also indicate a loop structure using this format. The two basic loop
>formats would be shown as
>
>If(B<-=:A) and
>If(=:A->B)
>
>In this case we reverse the := to =: to indicate that the test condition
>will be repeatedly evaluated until it is True. The arrow shows the
>direction to the False consequent B that is evaluated. In the first case
>the statement is processed left to right and the body is evaluated first
>and then the test statement. The second statement's test condition is
>evaluated first then the body. Using interior parentheses we can
>construct any nested structure of If's, Loops that we desire. To run a
>loop 5 times we could write
>
>If(=: (x = 5) ->B)
>
>x is initialized to 0, it is not equal to 5 so it performs B,
>increments by 1 (by default) and goes back to test again.

So this is a for-loop?
How do you do a while-loop that runs until a variable x is 5?

> If we want to
>increment by something other than 1 then inside B we need to have a
>statement
>
>x+= N where N is the amount we wish to increment.
>
>I would suggest modifying C++'s operators in the following way:
>
>X+= means X = X+1
>X-= means X = X-1
>X*= means X = X*2
>X/= means X = X/2
>X^= means X = X^2
>
>So to decrement a counter we can just write X-= .
>
>It is a little unusual to see IF beginning a loop statement but I think
>you can see the close relationship in abstract structure between the
>two, the loop is only an if statement that repeats its test condition,
>or at least it appears to me that that is one way of looking at it.
>
>Any ideas, comments, problems?
>

Comment: Rather unusual notation. As long as there is no real *functional* improvement (hardly possible for control flow, since this is an old issue for which time-proven solutions exist), IMHO the disadvantage of not being self-explanatory to mainstream programmers is so big that I would rather stick to the standards.

Possible problem: Wherever you have operators composed of special (i.e. non-alnum) characters, you may get a lexer problem (like < and << in C++).
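To make the lexer concern concrete, here is a small Python sketch of a "longest match first" tokenizer over the proposed symbols. The operator table and the `tokenize` helper are assumptions made up for this illustration; they are not part of any existing lexer for the proposed language.

```python
# Hypothetical operator table; longer operators are tried before shorter ones.
OPERATORS = sorted(
    [":=", "=:", "->", "<-", "<<", "<", "-", "=", "(", ")"],
    key=len,
    reverse=True,
)


def tokenize(text):
    """Split text into identifiers/numbers and the longest matching operators."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
            continue
        if ch.isalnum():
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            tokens.append(text[i:j])
            i = j
            continue
        for op in OPERATORS:
            if text.startswith(op, i):
                tokens.append(op)
                i += len(op)
                break
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens


loop_tokens = tokenize("If(B<-=:A)")   # the body-first loop form
glued = tokenize("x<-1")               # "<-" wins over "<" followed by "-"
spaced = tokenize("x< -1")             # whitespace forces "<" and "-" apart
```

With this rule `x<-1` and `x< -1` produce different token streams, so a comparison against a negative number silently turns into an arrow unless the programmer inserts a space. That is the same kind of trap as `<` versus `<<`.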
--
Regards,

Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Mon Jul 19 10:34:28 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Mon, 19 Jul 1999 11:34:28 +0200
Subject: NULLs
In-Reply-To: <378FE74E.D62A6B0B@injektilo.org>

>Hans-Dieter.Dreier@materna.de wrote:
>
>> Anyway, as long as no check has been made, the compiler will allow only
>> operations of the most derived common ancestor of both ref and null
>> because that is the best guess it can make.
>
>So in this case it wouldn't allow any operations until the check was
>performed. Unless of course, NULL was a derivative of Object (or Any or
>whatever your base class for all classes in the language might be named).

Any operations that are defined for that pointer class are allowed without prior checking. E.g. you could assign the pointer to another pointer (as long as the types being pointed to were compatible, which would be checked by a precondition of the assignment operator).

>> If the right hand side of the definition contains a "|" (meaning "...or may
>> be a..." in this case), there is no direct inheritance involved. Maybe it
>> would have been better to pick another syntax altogether.
>
>If it can lead to confusion, and it obviously did in this case, it should be
>changed. I do realize these are simply examples and not any definitive syntax.
>
>How about:
>
>class pointer :? ref | null
>

That's OK with me. As you said above, it's no definitive syntax.

>It looks funny but you definitely won't be confusing it with inheritance.
>
>[...]
>
>Jason.
>
>
>

--
Regards,

Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From Hans-Dieter.Dreier@materna.de Mon Jul 19 11:18:35 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Mon, 19 Jul 1999 12:18:35 +0200
Subject: NULLs
In-Reply-To: <378F95C1.1F0A58AC@saunalahti.fi>

sami@pefletti.saunalahti.fi wrote:

>Hans-Dieter.Dreier@materna.de wrote:
>
>>But why do you write "Just a" in the last line instead of "Just Int"?
>>And why the parentheses around "(Just a)"?
>
>in "Just a", 'a' is variable that is bound if the function argument matches
>this case.

Is it also allowed to write "Just Int"? Would there be any difference (other than being independent of the name "a" in the definition of "Maybe a")?

> parentheses are needed so that the compiler knows that "Just a" is
>a single
>pattern

What would happen if they were omitted?

>> The comment at the beginning says "...algebraic datatype...".
>>Does "algebraic" mean "numerical"?
>>How do we know it is algebraic - it looks rather generic to me, could be any
>>type (say, a string)?
>
>no it doesn't mean numerical; it can be any type. another name for algebraic
>datatypes is (disjoint) sum types

I must confess that I don't know what that means either.
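As a rough illustration of what "(disjoint) sum type" means here: a value of the type is exactly one of several alternatives, and code that consumes it first checks which alternative it has and then binds whatever that alternative carries. The sketch below is a Python analogue, not the definition under discussion, which is not reproduced in this message. The classes `Nothing` and `Just` and the body of `maybe_square` (squaring the wrapped value) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Nothing:
    """The alternative that carries no value."""


@dataclass(frozen=True)
class Just(Generic[T]):
    """The alternative that wraps exactly one value of any type T."""
    value: T


# "Maybe T" is the sum of the two alternatives: a Nothing or a Just, never both.
Maybe = Union[Nothing, Just[T]]


def maybe_square(m: Maybe[int]) -> Maybe[int]:
    """Square the wrapped number if there is one, otherwise pass Nothing on."""
    if isinstance(m, Just):
        a = m.value          # 'a' is just a local name bound to the wrapped value
        return Just(a * a)
    return Nothing()


squared = maybe_square(Just(3))
missing = maybe_square(Nothing())
```

The name `a` plays the same role as the variable in the pattern "Just a": it is bound only when the argument turns out to be the `Just` alternative. The wrapped type is generic, so `Just("text")` is just as valid a value as `Just(3)`.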
--
Regards,

Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)

From mrsmkl@saunalahti.fi Mon Jul 19 17:29:09 1999
From: mrsmkl@saunalahti.fi (Sami Mäkelä)
Date: Mon, 19 Jul 1999 19:29:09 +0300
Subject: NULLs
Message-ID: <37935255.CC03C913@saunalahti.fi>

Hans-Dieter.Dreier@materna.de wrote:

> sami@pefletti.saunalahti.fi wrote:
>
> >Hans-Dieter.Dreier@materna.de wrote:
> >
> >>But why do you write "Just a" in the last line instead of "Just Int"?
> >>And why the parentheses around "(Just a)"?
> >
> >in "Just a", 'a' is variable that is bound if the function argument matches
> >this case.
>
> Is it also allowed to write "Just Int"? Would there be any difference (other than being independent of the name "a" in the definition of "Maybe a")?
>

it could be any variable name ... for example 'int'. it can't be 'Int'
because compiler assumes that capitalized identifiers are for constructors

> > parentheses are needed so that the compiler knows that "Just a" is
> >a single
> >pattern
>
> What would happen if they were omitted?
>

compiler would output an error like: Equations give different arities
for "maybe_square"

From Hans-Dieter.Dreier@materna.de Tue Jul 20 13:40:39 1999
From: Hans-Dieter.Dreier@materna.de (Hans-Dieter.Dreier@materna.de)
Date: Tue, 20 Jul 1999 14:40:39 +0200
Subject: NULLs

>How about:
>
>class pointer :? ref | null
>

That's OK with me. As you said above, it's no definitive syntax.

>It looks funny but you definitely won't be confusing it with inheritance.
It just occurred to me that this is simply an enumeration of classes (if we assume that classes (definitions) are objects and types (usages) are values), so it seems natural to use whatever syntax we choose for enumerations.

--
Regards,

Hans-Dieter Dreier
(Hans-Dieter.Dreier@materna.de)