Pattern Merging - View of gbeta - a Language with Virtual Attributes, Block Structure, and Prop

As clearly stated in [13, 4], subtyping is unsound when we allow objects to be extended.

The demonstration of this problem by means of an example right after the above quote relies on changing the meaning of a name application by introducing a new declaration of the same name but with another type. This is an example of unifying two attribute declarations in such a way that the type checker will never know about both of them at the same time. Now, with static name bind-ing and explicit unication that problem is trivially solved, since an attribute declaration can only override another when the two are recognized by the static analysis as being one, unied attribute, and that means that the necessary type conformance checks between the two declarations can be made during static analysis. However, even though such a result is nice, it should be viewed as one of many examples of improved analyzability and understandability associated with a language where programmers can reason about programs in terms of properties of declarations, without having to worry about what declarations to look at. In other words, it is a benet of static name binding.

3.7 Pattern Merging

This chapter shares material with our paper Propagating Class and Method Combination, which was accepted for publication and presen-tation at the ECOOP'99 conference.

The merging operation, `^&', is based on a linearization, as mentioned already in Sect. 3.2. This section species precisely how the merging works, giving a formal denition of the linearization based on operations on total order relations which are again just a formal view on lists.

3.7.1 Linearization

Until now the class combination mechanism `^&' used ingbetahas only been pre-sented by example. This section motivates the mechanism, species precisely what the mechanism is, and proves some properties about it. The concrete algo-rithm is shown in Fig. 3.6. This algoalgo-rithm uses the standard^memberfunction which determines whether or not a value (rst argument) is a member of the given list (second argument). As is evident, merging may fail. This corresponds to the situation where the compiler rejects agbetaprogram because it contains a bad merge. It occurs when the contributing patterns give conicting directions as to the order of two or more elements, e.g., where one contributor requires the order [:::a:::b:::] and another requires [:::b:::a:::] for two mixinsaandb. Where behavior is combined, it certainly makes sense for the programmer to decide exactly in what order contributing behaviors are composed, but for the case where the combination deals exclusively with state it would be nice to have

fun merge ([]: int list) (ys: int list) = ys

| merge (xxs as x::xs) [] = xxs

| merge (xxs as x::xs) (yys as y::ys) = if x=y then x::(merge xs ys)

else if not (member x ys) then x::(merge xs yys) else if not (member y xs) then y::(merge xxs ys) else raise Inconsistent;

fun member x [] = false

| member x (y::ys) =

if x=y then true else member x ys;

Figure 3.6: The algorithm used in merging

a symmetric, unordered mechanism. That is future work. The details about behavior combination is given in Sect. 3.8.

A small oddity: In object-oriented languages it is a tradition to write the most specic part last, like inaSuper(# .. #), not(# .. #)aSuper, and hence the class combination operator `^&' was designed to make the last argument the most specic. In this section it is more convenient to reverse the order, so the programming language syntax ^A&B corresponds to the mathematical notation BA below. Among other things, it matches better with the standard list notation which also puts the most accessible (head) element at the front (left) end of the notation.

The class combination mechanism is a graph linearization, i.e., a procedure which from an oriented graph constructs a list which determines a topologi-cal sorting of the nodes. Obviously a cyclic graph does not allow this, hence the potential for rejection of a merge. Since there are many possibilities for typical graphs, some systematic choices must be made in order to arrive at a well-dened result. Existing linearizations [29, 30, 7] are described in terms of such systematic choices of what node to take next, and this makes it hard to understand their outcome and to reason about their properties.

Luckily, the `C3' linearization [7], which is the one used in gbeta, can be characterized in a much more declarative way, and it can even be generalized in a way that makes it a proper operation on a suitable setM:

8x;y²M: xy²M

The name C3 reects three consistencies exhibited by this linearization, namely consistency with the local precedence order,⁵ consistency with the extended precedence graph,⁶and monotonicity.⁷ The other known linearizations

(includ-5The programmer-chosen ordering of direct superclasses, or, in^gbeta, the ordering of the operands of the merging operator

6The extended precedence graph additionally orders classes according to the local prece-dence order from the most general common subclass

7Avoidance of the phenomenon that an inherited feature is looked up in a class that none of the direct superclasses would have chosen.

3.7. PATTERN MERGING 65

The C3 merging principle:

The merge of two orders

and

is the union of

and

together with all non-contradictory edges from

to

Figure 3.7: The intuitive principle behind merging

ing the ones used in LOOPS, CLOS, and Dylan) do not have all three consisten-cies. Moreover, the C3 looks even better with the simple characterization given below. The remaining problem with linearization is that no linearization can handle all inheritance hierarchies, some will be rejected as inconsistent. There is simply no way the lists [1;2] and [2;1] could be merged into a new list of distinct values which would preserve the order of 1 and 2 for both of those lists.

To reach a declarative characterization we must make a shift in mindset and terminology from the `list' and `graph' based thinking. If we regard the edges in a given acyclic oriented graph as a relation and take the reexive and transitive closure of that, we get a partial order. Similarly, a list determines a total order.

Hence, a linearization is a construction of a total order by adding elements to a partial order. C3 actually constructs a total order from a number of given total orders, namely the linearizations of the superclass hierarchies. The C3 principle for two orders is as shown in Fig. 3.7. To elaborate on this principle, it says that the merge is everythingAsays and everythingB says and then, by default, elements fromAare smaller than elements fromB. This is formalized straightforwardly in section 3.7.1. First we have to establish a few facts.

Total preorders

We need to consider total preorders:

Denition 2

A total preorder is a relation which is reexive, transitive, and total. A total order is a total preorder which is also anti-symmetric.

It is easy to prove that:

Lemma 1

Assume is a total preorder. The relation dened by ab ^, ab^{^}bais an equivalence relation, and the relationon equivalence classes dened by [a][b] iab is well-dened and a total order.⁸

Given an equivalence relationand a total order on the equivalence classes

, then the relation dened by ab^,[a][b] is a total preorder.

In other words, a total preorder corresponds to a list of equivalence classes of elements, rather than a list of individual elements.

This is the desired generalization: to construct a list of groups of mixins, each group consisting of mixins considered equally specic.

[a]denotes the equivalence class containing^a.

In such a setting, clashing names are not always disambiguated. This might at rst seem to be a step backwards; it is in fact an improvement. When the ordinary C3 linearization (Fig. 3.6) would succeed, the generalization delivers the same result (all groups have size one). When the hierarchy would be rejected, the resulting non-trivial groups would in many cases work well enough. For example, as long as a name is only declared by one of the mixins in a given group, there will be no clashes on that name. In fact, a number of inheritance hierarchies would be better described by making certain mixins equally specic, since the commitment to one order causes unnecessary restrictions on future usage. However, the lack of ordering does not blend well with combination of behavior.

Merging

We need a couple of tools before C3 merging can be formalized:

Denition 3

^When ^R is a relation, its domain dom(R), its inversion R, its one-step transitive closureR⁺, and its transitive closureR, are dened by:

dom(R)=⁴^fy^j(⁹z:(y;z)²R) ^_ (⁹x:(x;y)²R)^g R=⁴^f(y;x)^j(x;y)²R^g

R⁺=⁴R^[^f(x;z)^j9y:(x;y);(y;z)²R^g R=⁴ ^[

i²!Ri

whereR⁰=⁴R,⁸i²!: Ri⁺¹=⁴ R⁺_i .

The following lemma is immediate from the denitions:

Lemma 2

Let R and S be relations. Then R is transitive. The domain is additive: dom(R^[S) = dom(R)^[dom(S). The domain is preserved by transi-tive closure and inversion: dom(R) = dom(R) = dom(R). Reexivity is pre-served by transitive closure, inversion, and union: if⁸x²dom(R): xRx then

8x ² dom(S): xSx, S ² ^fR;R^g, and if ⁸x ²dom(T): xTx, T ² ^fR;S^g then⁸x²dom(R^[S): xR^[Sx.

The formalization of the C3 merging principle is:

Denition 4 (C3 Merging)

Let R¹ and R² be relations. The C3 merge of R¹ andR² is

R¹R² =⁴ R^[(dom(R¹)dom(R²)ⁿR) whereR= (⁴ R¹^[R²).

Intuitively, the merge combines the two given relationsR¹andR²into (R¹^[R²) which is then repaired to be a transitive relationR by taking the transitive

3.7. PATTERN MERGING 67 closure. Ris complemented with everything from dom(R¹)dom(R²) which does not contradict R. In other words, R¹ elements are smaller than R² elements, unless something is known to the contrary.

As an example of a general merging, ^fa b;a c;b c^g^fc b^g is

fab;ac;bc;cb^g, or as lists: [a;b;c][c;b] = [a;^fb;c^g]. The elements

fb;c^gfor which there are conicting ordering requirementsa cycle, as dened beloware collected into a group and thus made equally specic.

Now we can state the closure property that makes total preorders interesting:

Proposition 1

AssumeR¹andR²are total preorders. ThenR¹R² is a total preorder.

Proof:

(See appendix B) ²

The ordinary C3 merge (Fig. 3.6) fails precisely when the generalized C3 merge (Def. 4) produces a total preorder which is not a total order. A total preorder is a total order if and only if there are no cycles, so we need to consider them:

Denition 5

^Let^R be a relation. A sequence of distinct elements d¹:::dn ²

dom(R), n2, is a cycle in Riff

(⁸i²1:::n 1:(di;di⁺¹)²R)^{^}(dn;d¹)²R R is acyclic iff there are no cycles in R.

Lemma 3

LetR be a acyclic relation. ThenR is acyclic.

Proof:

(See appendix B) ²

We can now state and prove the non-pre equivalent of proposition 1:

Proposition 2

AssumeR¹ andR²are total orders andR¹^[R²does not have cycles. Then R¹R² is a total order.

Proof:

(See appendix B) ²

We have seen that the C3 merging principle can be formalized in a rather ob-vious manner and proved that the formalization has the nice stability property of proposition 1 and the incomplete stability property of proposition 2. It seems to be worthwhile to try to develop the strict linearization of various languages into the more wholesome total preorder model, going from class precedence lists to class group precedence lists. This has not yet happened for gbeta, and as mentioned the main problem is the fact that behaviour can hardly be combined in a symmetric waya non-deterministic choice of ordering of the behavior com-position or generally having concurrent execution of equally specic behaviors would surely be a nightmare to debug.

It is not hard to see that the algorithmic version of C3 actually implements the formalization presented herethe algorithm each time selects the least re-maining element according to our formalization of C3. Of course, the algorithm just terminates with an error message if the result would not be a total order.

As an aside it is interesting to note that the gbetaimplementation actually used a quite dierent algorithm for merging during a period of more than a year.

Only after the above formalization was created did it become clear that the C3 algoritm (which was simpler and therefore attractive) solved the exact same problem, because both algorithms clearly implement the formal characterization of the linearization. Algorithms are generally harder to compare and reason about than declarative formalizations like Def. 4.

In document View of gbeta - a Language with Virtual Attributes, Block Structure, and Propagating, Dynamic Inheritance (Sider 71-76)