Modeling data - An outline of the theory - View of Sharing of Computations

8.1 An outline of the theory

8.1.3 Modeling data

We have to settle on the form of the conﬁgurations.It seems clear that a conﬁguration should be a sequence of predicate symbols, together with some information about which values the arguments to those predicates can assume.One usually represents this information as a substitution, as e.g. in [Llo84], but below we shall argue for choosing another representa-tion.

A key point in the reasoning in section 8.1.1 is that a folding followed by an unfolding (of the same predicate symbol) should cancel each other, i.e. be equivalent to the identity. Let us see if this can be achieved by means of the substitution model.Consider a clause p(X) ←q(X) and let the initial conﬁguration be (q(X),ε) where ε is the identity substitution.

We want to do a folding step; to this end we have to rename the clause yielding say p(X1) ←q(X1).As a result of the folding we get the conﬁgu-ration (p(X1),{X1 → X}).We want to do an unfolding step; to this end we have to rename the clause yielding say p(X2) ←q(X2).As a result of the unfolding we get the conﬁguration (q(X2),{X1 → X, X2 → X}).The substitution on the right hand side is “equivalent” to the identity substi-tution, in the sense that X2 can assume any value, but some superﬂuous items have crept in.Of course it will be possible to repair on that, but this almost inevitably gets rather messy (as witnessed by various papers in the literature!), since substitutions are hard to reason about from an algebraic point of view (even though e.g. [Søn89] and [Pal89] show that certain sets of substitutions carry some structure), in particular one has to be very careful about renaming.

We therefore (as a ﬁrst attempt!) shall prefer to represent data as sets of ground values.It is useful to suppose the existence of a

univer-sal data domain D.We will impose no requirements on the structure of this set; in our examples, however, we shall assume the elements of D to be PROLOG ground terms, i.e. terms built inductively from some set of functors (constants just being zero-arity functors).For instance, the conﬁguration (p(X),ε) will be represented by the predicate symbol p together with the set of ground values {d|d ∈ D} (that is, all values are allowed).The conﬁguration (p(X),{ X → a }) will be represented by p together with the singleton set of ground values {a}.And the conﬁg-uration (p(X,Y), { X → Y }) is represented by p together with the set {(d, d)|d ∈ D} (that is, only pairs where the components are equal are allowed).It turns out that this way of representation is not quite enough, but let us stay with it for a moment.

Next consider how to represent clauses.As a ﬁrst example, we shall look at the clause

p(X,X) ←q(X)

The idea will be to represent this clause as a mapping φ from ground values into sets of ground values: as the call to pfails if the ﬁrst argument does not equal the second argument, φ(d₁, d₂) = ∅ if d₁ = d₂.Otherwise, we have φ(d, d) = {d}.

From what is said it should be clear that we want to ﬁnd a functionφ⁻¹ such that φ⁻¹+φ is the identity.And this is quite easy: choose φ⁻¹(d) = {(d, d)}.

As a second example, consider a clause where a new variable occurs on the right hand side:

p(X) ←q(X,Y)

This clause is represented as a mapping ψ, where ψ(d) = {(d, d)|d ∈ D}. In this case, however, it is impossible to ﬁnd a mapping ψ⁻¹ such that ψ⁻¹+ψ yields the identity.On the other hand, one can deﬁne ψ⁻¹ on certain subsets Z such that ψ(ψ⁻¹(Z)) = Z – these subsets are those with the property that if they contain an element (d₁, d₂) then they con-tain any other element of form (d₁, d₂).It is worth noticing that this phenomenon explains why one has to be careful when folding against a clause containing variables not occurring in the head; these variables must be “uninstantiated”.This is a problem to which some incorrect solutions have been proposed in the literature (and yet proved correct!), for a survey see [GS91].

We want to ensure that all mappings to be deﬁned by program clauses are reversible.Of course, there are many technically equivalent ways to proceed, but the one we shall adopt is to represent data as a family of sets of ground values, to be called an information family.As an example of this, consider again the rule p(X) ←q(X,Y).Here the left hand side will be represented as a D-indexed family Q where the d’th element is the singleton set {d}, whereas the right hand side will be represented as a D-indexed family Q where the d’th element is the set {(d, d)|d ∈ D}. The mapping ψ deﬁned by the program clause now simply for each d ∈ D maps thed’th element in Qinto the d’th element inQ, and is thus clearly reversible.

One will notice that the representation of a substitution is no longer unambiguous.For instance, p(X) equipped with the identity substitution may be represented by the D-indexed family Q where for each d ∈ D it holds that Q(d) is the singleton set {d}; but may also be represented by the singleton family Q consisting of the set {d|d ∈ D}.The diﬀerence is that the latter representation (as we saw above) is needed to cope with variables occurring on the right hand side only; but on the other hand cannot be “instantiated”.

This leads to our next point: when is the a conﬁguration B an in-stance of another conﬁguration B? The “predicate part” of B and B must be equal, so we only focus upon information families – assume B contains K-indexed information family Q and assume B contains K -indexed information family Q.The answer turns out to be that there must exists a mapping s from K into P(K) such that for all k ∈ K, Q(k) = _k∈s(k)Q(k).As an example of this take K = K = D, Q(a) = {a}, Q(d) = ∅ for d = a, Q(d) = {d} for all d ∈ D.As Q corresponds to the substitution {X →a} and Q corresponds to the iden-tity substitution, it seems clear that Q is to be considered an instance of Q.And so it is according to our deﬁnition, as we can chooses as follows:

s(a) = {a}, s(d) = ∅ otherwise.We say that B = Is(B).

Conﬁgurations are not unfolded arbitrarily far, but only until either the sequence of predicate symbols is empty or the data part is failure.

That a K-indexed information family Q is failure naturally means that Q(k) = ∅ for all k ∈ K.

We are now ﬁnished with our sketch of how conﬁgurations look.For a full account of conﬁgurations and operations on these, see section 8.3.1.

Of course, one might ask about the precise relationship between our model

and the “standard” model.Such an “equivalence result” would undoubt-ful be rather cumbersome to state and to prove, so we hope the reader is convinced that our model is a faithful representation of what is going on when a logic program is executed.Below we shall try to support this claim by working on a relatively (!) large example – this material might be skipped.

Our model in action

We shall use example 8.0.1 as our starting point. The level 0 rule f([ ],[ ]) ←✷ is represented as a transition from B1 = ([f ], Q1) to B₁ = ([], Q₁), where Q₁ and Q₁ are D × D-indexed families with Q₁(d₁, d₂) = {(d₁, d₂)}, Q₁([ ],[ ]) = {()}, Q₁(d₁, d₂) = ∅ if d₁ = [ ] or d₂ = [ ].

The level 0 rule f([N|U],[s(N)|V]) ←f(U,V) is represented as a tran-sition from B₂ = ([f ], Q₂) to B₂ = ([f ], Q₂) where the D × D-indexed information families Q₂ and Q₂ are given by Q₂(d₁, d₂) = {(d₁, d₂)}, Q₂([d_n|d_u],[s(d_n)|d_v]) = {(d_u, d_v)} and Q₂(d₁, d₂) = ∅ if (d₁, d₂) is not of the form above.

In the standard framework, the query f([0],Z) is solved yielding an answer substitution where Z is bound to [s(0) ].Now consider how this works in our framework.There the query f([0],Z) is represented as the conﬁguration B = ([f ], Q) where the D-indexed information family Q is given by Q(d) = {([ 0 ], d)}.Now consider the mapping s from D to P(D × D) given by s(d) = Q(d).Then for all d it will trivially hold that

Q(d) =

(d₁,d₂)∈s(d)

Q₂(d₁, d₂)

The existence of this s shows that B is an instance of B₂, in our notation written B = Is(B₂).Then there will be a level 1 unfolding step from B to Is(B₂), to be denoted B. B = ([f ], Q) where Q is a D-indexed family given by

Q(d) =

(d₁,d₂)∈s(d)

Q₂(d₁, d₂)

That is, Q([s(0)|d_v]) = {([ ], d_v)} and Q(d) = ∅ otherwise.

Next consider the mapping s from D to P(D × D) given by s(d) = Q(d).Then for all d it will trivially hold that

Q(d) =

(d₁,d₂)∈s(d)

Q₁(d₁, d₂)

This means that B = Is(B₁).Then there will be a level 1 unfolding step from B to Is(B₁), to be denoted B. B = ([], Q) where Q is a D-indexed family given by

Q(d) =

(d₁,d₂)∈s(d)

Q₁(d₁, d₂)

That is,Q([s(0)|d_v]) = Q₁([ ], d_v) andQ(d) = ∅otherwise, i.e.Q([s(0) ]) = {()} and Q(d) = ∅ otherwise.As Q(d) = ∅ iﬀ d = [s(0) ], this corre-sponds to Z being bound to [s(0) ] in the standard model.

Of course, also B =Is(B₂).So there also is a level 1 unfolding from B to Is(B₂) = B, where B = ([f ], Q).However, it is easily seen that Q(d) = ∅ for all d – hence B is a failure conﬁguration.Thus the transition from B to B represents a failure branch.

In our examples we will, for ease of exposition, often switch back and forth between the standard model and our model when it is the control aspect which has our primary interest.

In document View of Sharing of Computations (Sider 191-195)