Equivalence - View of gbeta - a Language with Virtual Attributes, Block Structure, and Propagat

the interaction specication is a pattern of any kind yields greater exibility.

Note that the development of mixins has almost closed a circle. The starting point was a special usage of technically ordinary classes in CLOS and similar languages; mixins were then separated out as a new concept, dierent from a class; later, the interactions between a mixin and its actual superclass were made explicit by means of inheritance interfaces; and nally, in gbeta, inheritance interfaces were generalized to ordinary patterns.

The situation is then again similar to the starting point, because mixins in gbetamay only be specied together with a pattern on which they depend (or, if that pattern is missing then the mixin must not depend on the actual superclass at all), and this specication of a mixin together with its dependency looks just like an ordinary declaration of a pattern, enhancing a given superpattern with a new increment using inheritance. Actually, the same declaration may be viewed as a declaration of a new pattern, or as a declaration of a mixin together with a specication of its dependenciesthe dierence lies in the usage.

This is consistent with the rest of the language where, e.g., patterns unify methods and classes, leaving it open for the programmer to decide whether to view a given pattern as a method or as a class. There is a subtle richness in having an entire spectrum between method and class, or between mixin and pattern, instead of just having the end points of the spectrum.

However, in order to explain the structure and construction of patterns, and in order to dene the specialization (isa) relation, the only reasonable approach is to consider the mixins individually. That is the way it was designed, and that is the reason why these sections on mixins precede the sections about inheritance and merging.

3.4 Equivalence

This section is about qualities, not in the sense of good quality or bad quality, but in the sense of immanent dierences, such as the dierence between tomatoes and oranges. An immanent dierence is an unexplained dierence; two things are considered just dierent without having a common deconstruction in terms of which the dierence is accounted for. Conversely, e.g., a paper bag containing two tomatoes and a paper bag containing three tomatoes are dierent in terms of an analysis which describes them as composite entities built out of the same kinds of building blocks. That is an explained dierence, a completely modeled dierence, not a dierence in qualities.

The notion of qualities is a mental device which is needed because models are not faithful. As usual, instead of copying the humongous complexity of the real world and hence recreating any dierence in behavior or properties by means of copying the mechanisms, we use qualities as a mental device to obtain use-ful models with much less complex structure. Tomatoes and oranges are just dierent, axiomatically dierent, and then we may enrich our knowledge about tomatoes and about oranges independently, by adding typical properties to the

concepts. At no point are those concepts brought into a common, commensu-rable state, where all the dierences are completely explained within the model.

To emphasize the point: qualities are concerned with dierences which are taken for granted, as opposed to dierences which are modeled; it is a property of a model and not a property of the real world.

Our claim in this section is that support for qualities is obviously needed in a programming language, and that we may choose between dierent degrees of support for immanent dierence. Moreover, these dierent degrees of support correspond to a technical topic which is usually called type equivalence.

In the programming language universe, the absence of qualities corresponds

to machine code, where the bit is the only kind of matter and everything else is just collections of bits and manipulation of bits. All dierences are structural.

Similarly, untyped lamba calculus is also a world of pure structure. Such a smooth and homogeneous world may be considered clean and ideal, but for modeling purposes it does not suce, since the recreation of dierences by mech-anism is too heavy-handed; some things are just dierent and we should be able to postulate that. However, support for immanent (postulated) dierences must generally be provided on top of the native, smooth, quality-less universe of raw machine code execution. This is done by imposing a certain discipline on the usage of the raw bits.

The rst step in the direction of support for qualities could be the separation of dierently sized chunks of memory, e.g. distinguishing between an 8-bit^byte, a 16-bit^wordand a 32-bit^dword. The idea is that there should be a discipline on the usage of memory which ensures that a given bit is treated consistently as being a member of just one of those types of chunks of memory. Assembler language typically helps enforcing such a policy.

Memory chunks of the same size may be considered dierent, as when a language denes the types ^int and ^floatboth 32-bit entities, but intended to be manipulated by means of dierent procedures (integer vs. oating point operations). This is an example of support for built-in qualities. Let us call built-in qualities primitive types.

At this level we might also introduce a concept of record typescomposite

units of storage which are dened inductively as containing entities of primitive types, or of other record types. Using C syntax we can illustrate a couple of record types:

struct point { int x,y; };

struct position { int longitude, latitude; }; ^Ex.

3-1

The question about type equivalence arises as soon as there is any mechanism available through which programmers may dene similar types. Respecting the built-in qualities and the user-dened composite structure, it is only natural to consider^pointand^positionequivalent. Whether a usage of a given entity occurs based on the^pointor on the ^positiondeclaration, the two contained

wordentities will be used according to the discipline we decided to enforce at this level. This kind of type equivalence might be designated pure structural

3.4. EQUIVALENCE 49 equivalence. Note that it is not too practical, because the equivalence may be

ambiguous. In the above example we might identify^x and^longitude, but we might just as well identify ^x and ^latitude. Both choices would ensure the required discipline on the underlying usage of memory.

To describe the next level of support for qualities, we must introduce the concept of a record path. With a given record type, an entity of that type is known to contain entities which may be accessed by name, according to the list of declared names in the record type denition. If such a name is used to access a part of the entity which is again of record type, a similar step may be taken from there. Thus, a list of names corresponds to a process of repeated subentity selection. Let us call such a list of names a record path.

Now, the next level of support for qualities takes record paths into consid-eration, by considering those types equivalent which support the same record paths leading to entities of the same primitive type. Note that recursion, along with pointers with a special NULL value or disjoint sum types (tagged unions), would introduce innite sets of record paths, making the type equivalence check more complicated. With the requirement that each record must dene unique names (i.e. no name can be declared twice in the same record), this may ensure unambiguous mappings between equivalent types. In such a type system, the following two record types would be equivalent:

struct mypoint { int x,y; };

struct yourpoint { int y,x; }; ^Ex.

3-2

An example of a mapping which would ensure unambiguous access to the un-derlying memory would be to sort the declared names alphabetically and store the entities in that order (^x before^y in both ^mypoint and ^yourpoint). Note that even though we have referred to stored values only, and referred to them as if the representation should be trivial (like mapping high-level language names directly to osets into contiguous areas of memory), the notion of type equiva-lence is independent of the representation. The important point is whether or not two given type denitions specify the same type.

This type equivalence criterion would normally be designated structural equivalence, even though it depends on both structure and naming. This kind of equivalence automatically equips composite entities with dierent qualities, when their internal structures are dened using dierent sets of names; but when the same set of names is chosen for both, then the types may be consid-ered equivalent, depending on the types of the part entities.

It seems that the indirect derivation of qualities from the set of names inside a denition is of an accidental nature. In particular, when systems grow very large and complex, it is inconvenient to have to make a global search in order to ensure that some new type which is being dened will not by accident be confused with an existing type. The need to inspect the entire system to avoid accidental clashes (which will silently allow unforeseen actions) is not normally considered desirable in software engineering.

The other side of the coin is that types may be constructed in dierent

places and still be the same, by using similar denitions. This is a typical correctness/convenience trade-o.

An example of a language where the underlying record type system is of exactly this kind is Objective CAML [94], which is an object-oriented extension of CAML, which is again a functional language in the ML family.

Note that the specialization (isa) relation in Objective CAML further inten-sies the problem of accidentally confusing entities which should be immanently dierent. For example, consider a^doorclass for which the^openmethod would send signals to some hardware in order to actually open a physical door; and a

windowclass which is used to control a rectangular area of a bitmapped com-puter screen, and where^openmeans `initialize the internal data structures and show the window on the screen'. A type which just lists an^openmethod might very well be a supertype of both^doorand ^window,² and an invocation of one

openwhere the other was expected could be a disasterimagine that somebody in the sta of a nuclear power plant wants to open the window showing the cur-rent state of a particular nuclear reactor, and that action in fact opens the door to that reactor and lets radioactive material ow out:::

The moral of this is that even a theoretically very well-founded and strict type system, like the one in Objective CAML, may actually exhibit correctness faults because it does not have a suciently thorough support for distinguish-ing dierent qualities. It is our opinion that it is naïve to assume that models will be complete and faithful, and structural equivalence seems to build on the assumption that there cannot be signicant real-world dierences between phe-nomena whose descriptions are formally identicalthe structure of the model is everything there is to know.

Structural type equivalence takes the record paths into consideration, but ignores the rst name, the name of the record type itself. A more strict version of structural equivalence could be dened by simply requiring that the declared names of types should also be the same, in addition to the record paths etc. as before. However, this would still build on a derived notion of qualities which would require global checks in order to avoid accidental confusion of similarly declared types. A dierence in degree, only.

Nevertheless, this kind of type equivalence, known as name equivalence, is

quite widespread. That is because it often coincides with the next kind of type equivalence, as explained below. In fact, the presentation of name equivalence in [4] assumes this coincidence; it is unsound unless all names used in any two type expressions are dened in the same (at, global) environment. It says, on p. 356:

Name equivalence views each type name as a distinct type, so two type expressions are name equivalent if and only if they are identical.

The combination of this denition and the associated examples (in Pascal) clearly indicate that, e.g., two variables declared to have type `^"^cell' are con-sidered to have the same type under name equivalence, with no mention of the

2Assuming that all the involved^openmethods take no arguments and deliver no results, it^isa supertype in Objective CAML

3.4. EQUIVALENCE 51 lookup process that determines the meaning of the identier `^cell'. If `^cell' is looked up to mean `^integer' for one of the declarations and `^string[30]' for the other then the type equivalence obviously confuses two dierent types, and hence the existence of more than one scope for type names must be implicitly excluded from consideration. This is natural for Pascal but less natural for a, supposedly, generally applicable denition of name equivalence.

Another presentation is given in [44] on the pages 333334. Type equiv-alence, which is here called type compatibility, is presented by comparing a notion of structural type equivalence with a so-called `strict denition of type equivalence' associated with Ada, Pascal, and Modula-2. The strict denition considers two types equivalent iff they are dened in (syntactically) the same declaration, as tested in an implementation by comparing pointers to describing structures generated during a traversal of the program syntax tree.

These presentations, along with the actual semantics of Pascal, support the view that name equivalence is generally considered the same as the next kind of equivalence, based on program positions. However, since name equivalence does not seem to be very well-dened, and since the denition in [4] is unsound in any language which does not have one global name-space for types, the denition given here was created as a generalization which would make sense in a non-at type name space, and which actually depends on names.

The next approach to type equivalence, locational equivalence, takes a rad- ical step away from the smooth, homogeneous world of bits and lambdas. This kind of type equivalence is incompatible with the basic lambda calculus exe-cution model, because it presupposes that the expression of the program, the source code, remains unchanged during the entire program execution. This de-viation from the lambda calculus makes it harder to formalize the semantics using a direct translation of programs to lambda expressions, and this seems to have alienated many mathematically oriented researchers to the concept. It is as if they consider locational equivalence unacceptable, just because they cannot readily use the traditional approaches to formalization of the semantics. For example, it is stated by Abadi and Cardelli in [1, p. 27] that:

Structural subtyping [::: ] has desirable properties, such as support-ing type matchsupport-ing [ ::: ] A disadvantage [of structural subtyping]

is the possibility of accidental matching of unrelated types. In con-trast, subtyping based on type names is hard to dene precisely, and does not support structural subtyping.

After this, they exclusively use structural notions of types, with or without involving subtypes, in the rest of the book.

In lambda calculus, the ability to transform the program itself (-reduction) is the only computational tool. The simplicity of the semantic modelonly the program text itself is needed for a small-step operational semanticsis tradi-tionally viewed as desirable, especially when the emphasis is on proving formal properties of executions. However, another basic model has proved useful in practical programming, possibly because it matches the way people think bet-ter than -reduction does. In this model, programs are constants, and the

execution happens in terms of a run-time system whose actions are directed by the program. Object-oriented languages generally fall in this category. Pre-sumably, the human mind is capable of building an understanding of a static program in a similar way as it may get to know a physical landscape; there are locations, each location has its own, characteristic properties, and the dis-tance between dierent parts of the program may have been designed to reect degrees of relatednesswhen being in a particular location, the most relevant parts of the program are nearby.

Taking it seriously that the program is immutable,³ it becomes meaningful to refer to locations in a program. This is the basic mechanism behind locational equivalence. With this kind of equivalence, two types are considered the same

iff they are constructed at the same location in the program, e.g., by the same declaration. This criterion has the nice property that any system, however large, will let a programmer dene a new type and rest assured that it will be destinct from all other types in the program. In other words, this equivalence criterion takes the clean approach of providing syntactically distinct type introductions with distinct qualities.

The other side of the coin is that two similar types will be distinct, even if they should be considered the same; a separate mechanism may then be provided to allow explicit identication of two given types. Note that this does not introduce the need for global checks in order to secure against unexpected semantic eects. The failure to view two types as being equivalent may give compile-time errors, but it will not cause run-time confusion. The former is a much less serious problem since it is obvious, whereas the latter is silent at compile-time and probably subtle at run-time.

Revisiting the relation between name equivalence and locational equivalence, the two happen to coincide in the case where the type names are all dened at top-level. Since the top-level names are required to be unique, two types have the same name iff they are dened in the same location (otherwise they would have to have dierent names). With a partitioned global name space, as with Java packages and sub-packages, and with C++ name spaces, name equivalence is again eectively changed into location equivalence by considering the path of names of (sub)packages or name spaces as a part of the name of the type.

Turning to^Beta, the notion of name equivalence or locational equivalence does not suce for a characterization of the equivalence of patterns; location equivalence is the correct starting point, but the run-time environment must be taken into consideration, too. Note that location equivalence is dened on basis of the construction of an entity, in this case a mixin, not on the occurrence of a declaration. This makes a dierence because ^Beta and gbeta support the specication of patterns in many places outside of pattern declarations;

for example, a descriptor (see Fig. 2.1 on page 23) may be used in a pattern declaration, or directly as a statement. When used as a statement it constructs an anonymous pattern which is then by coercion instantiated and the resulting

3It may actually be extended, but not changed in ways which invalidate existing location specications

3.4. EQUIVALENCE 53 object executed (see Sect. 2.3.4).

Even though it is not the tradition, patterns in^Bet^amay be characterized in exactly the same way as patterns ingbeta, namely as lists of mixins, where each

In document View of gbeta - a Language with Virtual Attributes, Block Structure, and Propagating, Dynamic Inheritance (Sider 55-62)