[berkana@mac.com: TUNES glossary]

Francois-Rene Rideau Francois-Rene Rideau <fare@tunes.org>
Fri Nov 9 10:44:02 2001


> Who should I contact if I'd like to get some terms defined
> in the Tunes glossary?
>
I suppose it's tunes@tunes.org, although I'm the main or sole author
of all the current articles.

> I find the explanations in the Tunes glossary very helpful, since they tell
> about the implications and other useful knowledge that a straight dictionary
> definition would not.
>
Our glossary is a double-edged sword: since it is laden with intent,
rather than attempting some kind of "neutrality" (a word that only really
means "adherence to the opinion of a large majority", in such a context),
it will repel some people as much as attract others. One the one hand, it's
good, because it makes for some kind of homogeneous community were people
share some common views. On the other hand, we, the community it has
gathered, haven't been successful at achieving much. I suppose that's what
you get, when you gather people around dreams instead of work.

> I'd like to see some explanations for the difference between "dynamically
> typed languages" and "strongly typed/strictly typed" etc. What's it all
> about? what are the implications, benefits, drawbacks, examples, etc. ?
> What's it good for?
>
Ok. Here is what I intend to put in the Glossary about typing.


There are three orthogonal caracteristics:
dynamic vs static typing;
strong vs weak typing;
declared vs inferred typing.

= Types =
Typing in general means that the choice of possible data structures
manipulated by a program can be conceptually divided into groups
of objects with some homogeneity (e.g. integers, floating-point numbers,
lists, arrays, hash-tables, etc.), which groups are called "types".
Depending on the type of an object, different interfaces will be available
to interact with it (numbers are subject to arithmetic operations; lists
can be deconstructed into head and tail; functions can be applied; etc.),
and different algorithms will be used to implement common interfaces
(adding two integers is not same as adding two floating-point numbers).

= Static vs Dynamic Typing =
When the types are known in advance and decisions about them are made at
compile-time, the typing is said to be static; when the types can be known
only when code is executed at run-time, the typing is said to be dynamic.
[Some people from the Scheme community talk about "latent" vs "manifest"
types rather than static vs dynamic types, but this vocabulary is not
widely recognized.]

= Strong vs Weak Typing =
When the types detected or declared are strictly enforced by the language's
semantics, the language is strongly typed; when the semantics of the
language allows for inconsistencies between the compile-time type and
the run-time type, the language is weakly typed. Sometimes, languages are
weakly typed at one level, but strongly typed at another one.
Weak typing means that the invariants (program properties)
expressible in the type system are weaker.

= Type Declaration vs Type Inference =
Some typed language such as Pascal or Java require that types be explicitly
declared for all functions and variables in a program. Other typed languages,
such as ML or Haskell, will infer the type of an expression depending
on its structure and context, without the need for programmer declaration,
except maybe in difficult or ambiguous cases.

= Discussion =
Strong type systems allow to catch a lot of common programming mistakes
(typos, thinkos, lack of propagation of program modifications)
that usually require a lot of debugging time to catch.
Static type systems particularly allow to catch them quickly,
as early as compile-time, without having to even worry about them.
Type inference allows for such bug-catching to have no associated
development costs, as long as the typesystem is powerful enough
to express the considered program without the need for abuse or contorsions.

Some argue that you can always determine static or dynamic information
about a program, and call it is type. For them, the distinction is between
trivial static typing and rich static typing, between trivial dynamic typing
and rich dynamic typing. Trivial here means that the compiler (resp. the
runtime system) cannot use static (resp. dynamic) information about an
expressions' type so as to avoid checks and make early decisions about
program behaviour, whereas rich means that it can.

Often, otherwise strongly typed languages have escapes allowing for
weak typing (pointer/array dereference in C or C++, typecasting in
C, C++ or Java; magic in OCAML).
Such languages can usually still be neatly divided into weakly typed
and strongly typed. Indeed, in the case of C and C++, the weak typing
features are essential to any reasonably sized project,
and their implicit rather than explicit character prevents any easy syntactic
distinction between programs that abuse or don't abuse the type-system;
hence C and C++ are definitely weakly-typed languages.
In the case of OCAML, such abuse is not an essential feature to normal
user programs, and made explicit through the use of the keyword "magic",
that can be easily detected or avoided
- so OCAML is definitely a strongly-typed language.
In the case of Java, the type-system can be abused only by explicit typecasts,
but the feature is often considered essential to develop container classes
and other generic programs; Java is weakly statically typed,
although the underlying JVM is strongly dynamically typed.

On a different note, otherwise statically typed languages
sometimes have escapes that allow for dynamic typing.
For instance the MOP in CLOS, or RTTI in C++, Java or Mercury, allow programs
to inspect and modify objects whose type is not statically known.
This enables the development of some generic software infrastructure,
but at the same time forces the implementation to keep at runtime
information that might otherwise have been thrown away at compile-time
(as it is in OCAML for instance),
with all the associated space and time overhead.
Dynamically typed languages have this overhead, anyway.

Inferred Static Strong Types, such as exist in ML, Haskell, Clean or Mercury,
are those that bring the most information on programs out of the least
programmer efforts, allowing to express the stronger and richer invariants.
Because they do not usually require any type declaration,
they remove the burden that people used to dynamic programming languages
usually feel about rich typing.
However, because of language design and implementation issues
(notably the simplicity of the typesystem and
the guarantee that the process of typing will terminate promptly),
most languages with strong static inferred types also restrict the
expressiveness of their typesystem in such ways that advanced users
of dynamically typed language will find them impossible to use
(notably those who like dynamic interactive systems and/or LISP macros).
A notable exception, Cayenne, is an extension Haskell with dependent types,
that allows to express arbitrary computations within the type system at
compile-time; unhappily, it is more designed as a proof of concept
than as a really usable programming language.

Compilers for languages without a rich static type system can nevertheless
infer types from programs that follow common programming styles,
and use these types to produce more efficient code,
or emit warnings that help catch bugs.
A case in point is CMUCL (a Common LISP compiler).
Common LISP, a strongly dynamically typed language, allows the programmer
to declare static types for expressions, so as to help the compiler.
CMUCL takes advantage of these declarations as well as of information it
can infer from the language's semantics so as to construct precise types
for expressions and produce efficient code that avoids runtime type dispatch.
The situation of CMU Lisp is rather peculiar, since, depending on its
the programmer-accessible safety settings, the declarations can be used
either to tighten the type-safety of the language, or on the contrary to
make it weakly-typed and provide an escape for low-level optimizations.
Because this escape requires explicit and easily toggled safety settings,
and is not essential to normal programs, CMU Lisp, like Common LISP
in general, is still a strongly dynamically typed language.

[... there is a lot more to discuss, but that will do for now ...]


Yours freely,

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
[  TUNES project for a Free Reflective Computing System  | http://tunes.org  ]
Invoking relativism is only a hypocritical way to dismiss reason, with only
sheer force being left. It's abdication of reason, denial of everything that
makes the dignity of man.