• Ingen resultater fundet

RefocusinginReductionSemantics BRICS

N/A
N/A
Info
Hent
Protected

Academic year: 2022

Del "RefocusinginReductionSemantics BRICS"

Copied!
50
0
0

Indlæser.... (se fuldtekst nu)

Hele teksten

(1)

BRICS

Basic Research in Computer Science

Refocusing in Reduction Semantics

Olivier Danvy Lasse R. Nielsen

BRICS Report Series RS-04-26

ISSN 0909-0878 November 2004

B R ICS R S -04-26 Dan v y & Nielsen : Ref o cu sin g in Red u ction S eman tics

(2)

Copyright c 2004, Olivier Danvy & Lasse R. Nielsen.

BRICS, Department of Computer Science University of Aarhus. All rights reserved.

Reproduction of all or part of this work is permitted for educational or research use on condition that this copyright notice is included in any copy.

See back inner page for a list of recent BRICS Report Series publications.

Copies may be obtained by contacting:

BRICS

Department of Computer Science University of Aarhus

Ny Munkegade, building 540 DK–8000 Aarhus C

Denmark

Telephone: +45 8942 3360 Telefax: +45 8942 3255 Internet: BRICS@brics.dk

BRICS publications are in general accessible through the World Wide Web and anonymous FTP through these URLs:

http://www.brics.dk ftp://ftp.brics.dk

This document in subdirectory RS/04/26/

(3)

Refocusing in Reduction Semantics

Olivier Danvy and Lasse R. Nielsen

BRICS

Department of Computer Science University of Aarhus

§

November 2004

Abstract

The evaluation function of a reduction semantics (i.e., a small-step operational semantics with an explicit representation of the reduction context) is canonically defined as the transitive closure of (1) decomposing a term into a reduction con- text and a redex, (2) contracting this redex, and (3) plugging the contractum in the context. Directly implementing this evaluation function therefore yields an interpreter with a worst-case overhead, for each step, that is linear in the size of the input term.

We present sufficient conditions over the constituents of a reduction semantics to circumvent this overhead, by replacing the composition of (3) plugging and (1) decomposing by a single “refocus” function mapping a contractum and a context into a new context and a new redex, if any. We also show how to construct such a refocus function, we prove the correctness of this construction, and we analyze the complexity of the resulting refocus function.

The refocused evaluation function of a reduction semantics implements the transitive closure of the refocus function, i.e., a “pre-abstract machine.” Fusing the refocus function with the trampoline function computing the transitive closure gives a state-transition function, i.e., an abstract machine. This abstract machine clearly separates between the transitions implementing the congruence rules of the reduction semantics and the transitions implementing its reduction rules. The construction of a refocus function therefore shows how to mechanically obtain an abstract machine out of a reduction semantics, which was done previously on a case-by-case basis.

We illustrate refocusing by mechanically constructing Felleisen et al.’s CK ma- chine from a call-by-value reduction semantics of the lambda-calculus, and by con- structing a substitution-based version of Krivine’s machine from a call-by-name reduction semantics of the lambda-calculus. We also mechanically construct three one-pass CPS transformers from three quadratic context-based CPS transformers for the lambda-calculus.

Keywords

Reduction semantics, refocusing, pre-abstract machine, abstract machine, defunc- tionalization, transformation into continuation-passing style (CPS).

A preliminary version of this article was presented at RULE 2001 [8].

Current affiliation: ACURE A/S, Bytoften 1, DK-8240 Risskov, Denmark

Basic Research in Computer Science (www.brics.dk), funded by the Danish National Research Foundation.

§IT-parken, Aabogade 34, DK-8200 Aarhus N, Denmark E-mail:{danvy,lrn}@brics.dk

(4)

Contents

1 Introduction 1

1.1 Related work . . . 1

1.2 Interpreters for reduction semantics . . . 1

1.3 A canonical example: theλ-calculus . . . . 1

1.4 This work . . . 2

2 Refocusing in a reduction semantics 4 2.1 The language . . . 4

2.2 Reduction semantics . . . 4

2.2.1 Values . . . 4

2.2.2 Reduction contexts . . . 4

2.2.3 Decompositions and potential redexes . . . 5

2.2.4 Unique decomposition . . . 5

2.2.5 Reduction rules . . . 6

2.3 Requirements . . . 6

2.3.1 No redundant constructs . . . 6

2.3.2 Distinct value constructors and potential-redex constructors . . . 7

2.3.3 Left-to-right evaluation of sub-terms . . . 7

2.4 Consequences of the requirements . . . 7

2.5 Constructing a refocus function . . . 9

2.6 Correctness . . . 11

2.7 Complexity . . . 12

2.8 Summary . . . 13

3 The λ-calculus under call by value 14 3.1 A call-by-value interpreter . . . 14

3.1.1 The original specification . . . 14

3.1.2 The refocused specification . . . 15

3.2 Sabry and Felleisen’s CPS transformer . . . 15

3.2.1 The original specification . . . 15

3.2.2 The refocused specification . . . 16

3.3 A simpler CPS transformer . . . 17

3.3.1 The original specification . . . 17

3.3.2 The refocused specification . . . 17

3.4 Refocusing efficiently . . . 18

3.4.1 The call-by-value interpreter . . . 18

3.4.2 Sabry and Felleisen’s CPS transformer . . . 19

3.4.3 The simpler CPS transformer . . . 19

3.5 Analysis of the refocused call-by-value interpreter . . . 19

3.5.1 A pre-abstract machine . . . 19

3.5.2 From the pre-abstract machine to an abstract machine . . . 19

3.5.3 An eval/apply abstract machine . . . 20

3.5.4 A push/enter abstract machine . . . 21

3.5.5 From the eval/apply abstract machine to a big-step semantics . . 21

3.6 Analysis of the refocused CPS transformers . . . 21

3.7 Summary . . . 22

(5)

4 The λ-calculus under call by name 23

4.1 A call-by-name interpreter . . . 23

4.1.1 The original specification . . . 23

4.1.2 The refocused specification . . . 24

4.2 A simple CPS transformer . . . 24

4.2.1 The original specification . . . 24

4.2.2 The refocused specification . . . 25

4.3 Refocusing efficiently . . . 26

4.3.1 The call-by-name interpreter . . . 26

4.3.2 The simple CPS transformer . . . 26

4.4 Analysis of the refocused call-by-name interpreter . . . 26

4.4.1 A pre-abstract machine . . . 26

4.4.2 From the pre-abstract machine to an abstract machine . . . 27

4.4.3 An eval/apply abstract machine . . . 27

4.4.4 A push/enter abstract machine . . . 28

4.4.5 From the eval/apply abstract machine to a big-step semantics . . 28

4.5 Analysis of the refocused CPS transformer . . . 28

4.6 Summary . . . 29

5 Conclusion 29 A Arithmetic expressions with precedence 30 A.1 A reduction semantics . . . 30

A.2 An alternative representation of contexts . . . 31

A.3 Implementation . . . 32

A.4 Refocusing . . . 36

A.5 Summary . . . 41

List of Figures

1 Canonical vs. refocused evaluation . . . 3

2 Elementary contexts, values, and potential redexes . . . 8

3 Abstract syntax of arithmetic expressions . . . 33

4 Reduction contexts for arithmetic expressions . . . 33

5 Iterative plug functions for arithmetic expressions . . . 34

6 Redexes and the result of decomposition for arithmetic expressions . . . 34

7 Iterative decomposition of an arithmetic expression . . . 35

8 Canonical evaluation of an arithmetic expression . . . 36

9 Refocusing over an arithmetic expression (signature) . . . 39

10 Refocusing over an arithmetic expression (definition) . . . 40

11 Refocused evaluation of an arithmetic expression . . . 41

(6)

1 Introduction

A reduction semantics (a.k.a. syntactic theory) is a small-step operational semantics with an explicit representation of the reduction context [10–12]. We consider the prob- lem of implementing the evaluation function of a reduction semantics in the form of an interpreter. Our emphasis, however, is not on automating this process, as in Xiao and Ariola’s SL project [28, 29]. Instead, we observe that the direct implementation of an evaluation function entails an overhead and we show how to circumvent this overhead.

1.1 Related work

Probably because every interesting programming language corresponds to an inter- esting model of computation, the meta-language of formal semantics has consistently inspired a range of programming languages—witness denotational semantics and func- tional programming [25, 27], Moggi’s computational monads and monad-based func- tional programming [26], operational semantics and logic programming with definite clause grammars [17], algebraic semantics and OBJ [15], etc. These similarities between language and meta-language make it possible to write ‘executable specifications.’ For reduction semantics, however, executable specifications appear not to scale up in prac- tice [24, 28].

We identify a source of inefficiency in the direct implementation of a reduction semantics as an interpreter, and we show how to bypass it. The result can be used to mechanically construct an abstract machine out of a reduction semantics, an issue that is usually dealt with on a case-by-case basis [11, Chapter 7].

1.2 Interpreters for reduction semantics

A reduction semantics is a small-step operational semantics where evaluation is defined as the transitive closure of single reductions, each performed by (1) decomposing a term into a reduction context and a potential redex, (2) contracting the redex, if possible,1 and (3) plugging the contractum into the context. Most reduction semantics are de- veloped for deterministic programming languages and satisfy a unique-decomposition property.

The interpreter for a reduction semantics implements its evaluation function. It naturally consists of a decompose-contract-plug loop. Decomposition is usually im- plemented with a depth-first search in the abstract-syntax tree. The decompose step therefore introduces an overhead that is proportional to the size of the input term.

Likewise, plugging takes time linear in the size of the context.

1.3 A canonical example: the λ -calculus

Here is a call-by-value reduction semantics of theλ-calculus:

Terms t ::= x | λx.t | t t t Λ

Values v ::= x | λx.t v ∈Value Λ

Variables x∈Var

Reduction contexts C ::= [ ] | C[[ ]t] | C[v[ ]] C∈RedCont

Potential redexes r ::= v v r∈PotRedex Λ

1Not all potential redexes are actual reducible expressions—some are stuck terms [20].

(7)

Among potential redexes, only β-redexes are actual reducible terms. Therefore, the only reduction rule is the following one:

C[(λx.t)v]→C[t{v/x}]

Plugging the hole of a reduction context with a term is defined by induction on the context, and thus it operates in time proportional to the size of the context:

([ ])[t] = t (C[[ ]t0])[t] = C[t t0]

(C[v[ ]])[t] = C[vt]

Likewise, decomposition operates in time proportional, in the worst case, to the size of the term to decompose.

Let us illustrate this overhead with Church numerals: the Church numeral for the numbernisλs.λz. s(s(s(. . .| {z (s}

n

z). . .))) and is noteddne.

Example 1 (Church numeral) We consider the term dne(λx.x)v where v is any value. This term reduces in two steps to(λx.x)((λx.x)((λx.x)(. . .| {z ((λx.x)}

n

v). . .))).From then on, each decomposition into a context and a redex (where the redex is always (λx.x)v) takes time proportional to the number of remaining applications. The total time taken by decomposition during evaluation is thus proportional ton2. The evaluation function of a reduction semantics is inefficient because it constructs intermediate terms (see the left side of Figure 1). We observe, though, that in the course of evaluation, each decompose step always follows a plug step (even for the first step, since ([ ])[t] =t). One could therefore deforest intermediate terms by fusing the plug step and the decompose step into one ‘refocus’ step (see the right side of Figure 1).

Rather than deforesting plug and decompose on a case-by-case basis, we seek general conditions over the constituents of a reduction semantics to construct such an efficient refocus step.

1.4 This work

We state conditions under which consecutive plug-and-decompose operations can be re- placed by a more efficient ‘refocus’ operation where the intermediate term is deforested away. These conditions pertain to the order in which a redex occurs in the depth-first traversal of the decompose operation. We show that these conditions hold if the re- duction semantics is given in the “standard” way, i.e., by a context-free grammar of values and reduction contexts, and if it satisfies a unique-decomposition property. We also show how to mechanically construct such a refocus function.

Overview: The rest of this article is structured as follows. We first list sufficient conditions to obtain the refocus function of a reduction semantics (Section 2). The proof that these conditions are sufficient is constructive. It indicates how to mechanically rephrase the evaluation function of a reduction semantics to circumvent the plug-and- decompose overhead. These conditions are general enough to make refocusing useful

(8)

t0

decompose

%%K

KK KK KK KK K

reduce

t0

refocus

C0[r0] contract

C0[r0] contract

C0[t00]

yyssssssplugssss C0[t00]

refocus

t1

decompose

%%K

KK KK KK KK K

reduce

C1[r1] contract

C1[r1] contract

C1[t10]

yyssssssplugssss C1[t10]

refocus

t2

decompose

%%K

KK KK KK KK K

reduce

C2[r2] contract

C2[r2] contract

C2[t20]

yyssssssplugssss C2[t20]

refocus

t3

decompose

$$I

II II II II II II

... ... ...

Figure 1: Canonical vs. refocused evaluation

in practice—for example, they are fulfilled in all the deterministic reduction semantics described in Felleisen and Flatt’s lecture notes [11] and all the examples documented in the SL project [28, 29].

We then illustrate refocusing with two reduction semantics for the λ-calculus, one for call by value and one for call by name (Sections 3 and 4). In the call-by-value case, the refocused evaluation function implements Felleisen et al.’s CK machine, and in the call-by-name case, the refocused evaluation function implements a substitution- based version of Krivine’s machine. We also consider three context-based, quadratic CPS transformers (two for call by value and one for call by name) and we make them operate in one pass.

(9)

2 Refocusing in a reduction semantics

In this section we show how to construct from scratch arefocusfunction for a reduction semantics, as an alternative to plugging and decomposing for defining an evaluation function. We prove that this function computes the correct result and we give an exact measure of its computational complexity. Finally we show that refocusing is always at least as efficient as plugging and then decomposing.

2.1 The language

A programming language is defined by its syntax and its semantics. Here, the semantics is specified by a reduction semantics. Without loss of generality we assume that the abstract syntax is specified by a context-free grammar (or BNF) with only productions of the form

t::=c(t1, . . . ,tn)

where c ranges over a set of constructor names and t,t1, . . . ,tn are non-terminals corresponding to the syntactic categories of the language. Each constructor may occur only once in a grammar.

2.2 Reduction semantics

A reduction semantics is a specification of a small-step operational semantics, i.e., it specifies a reduction relation between terms. Reduction semantics usually come with context-free grammars for values and reduction contexts, a notion of decomposition and potential redex, often a unique-decomposition lemma, and finally reduction rules of the form C[r]→ t, where the term r denotes a redex. Let us review each of these concepts in turn.

2.2.1 Values

The grammar of values extends the BNF of the language with one new non-terminal, v, per non-terminalt. The set of values ranged over by a v is included in the set of terms ranged over by the matchingt. The productions for values are all of the form

v::=c(x1, . . . , xn)

wheret::=c(t1, . . . ,tn) is a production of the language grammar and eachxi is either ti orvi.

If a reduction semantics of the language contains only values, then it is irrelevant whether we usevi or ti in place of xi. From here on we will consistently use a ti for such a syntactic category.

2.2.2 Reduction contexts

Reduction contexts (often called ‘evaluation contexts’) are also specified by context-free grammars, and their meaning is given by an associatedplug function. For example, the reduction contexts of theλ-calculus under call by value and the associated plug function were shown in Section 1.3.

All the reduction contexts of a reduction semantics can be written as the composi- tion ofelementary reduction contexts. Composition is a binary function on reduction

(10)

contexts,◦, that together with the plug function satisfies (C1◦C2)[t] =C1[C2[t]], and is thus associative.2 Elementary contexts are those that cannot themselves be written as compositions of other non-empty contexts.

For example, for the λ-calculus under call by value (see Section 3), the elemen- tary contexts are app([ ],t) and app(v,[ ]). The construction of reduction contexts corresponds to composing an elementary context to the right of another context, e.g., C[app([ ],t)] = C◦app([ ],t). In many articles about reduction semantics, though, compound contexts are constructed by composing an elementary context on the left rather than on the right [11, 13]. For theλ-calculus under call by value, the grammar of reduction contexts then reads as follows.

C ::= [ ]|app(C,t)|app(v,C) This notation induces the same elementary contexts.

2.2.3 Decompositions and potential redexes

If a term t0 is the result of plugging another termt into a reduction context C, i.e., ifC[t] =t0, then we say thatCand t form adecomposition (into a reduction context and a term) oft0, or alternatively thatt0 can be decomposed intoC andt. In general, a term can be decomposed in many different ways.

If t0 = C[t] then the decomposition is trivial if either C is the empty context or t is a value. We assume that values are normal forms: they cannot have a non-value sub-term in a position where it can be evaluated, so all their decompositions must be trivial.3 Non-value terms that can only be decomposed trivially are calledpotential redexes. For the λ-calculus under call by value, the potential redexes are exactly the terms of the formapp(v, v).

If t0 = C[t] then the decomposition is complete if t is a potential redex. The reduction rules of a reduction semantics contain only complete decompositions. If a term has a decomposition,C1[t1], that is not complete, then eithert1is a value or it can be further decomposed in a non-trivial way asC2[t2]. Then (C1◦C2)[t2] is again a non- trivial decomposition of the original term. The context of a complete decomposition is maximal in the sense that further decompositions would be trivial.

2.2.4 Unique decomposition

Unique decomposition is a property of reduction semantics that guarantees the exis- tence and uniqueness of complete decompositions for arbitrary terms.

Definition 1 (Unique decomposition) A reduction semantics is said to satisfy the unique-decomposition propertyif any non-value term t can be uniquely decomposed as t=C[r]where r is a potential redex.

Unique decomposition is so fundamental to reduction semantics for deterministic languages that it is almost always the first property to be established. Its proof is

2We note that reduction contexts, together with composition and the empty context, form a monoid, which explains why they can indifferently be represented “outside in” or “inside out” [7, 11, 13]. Rep- resenting contexts inside out leads one to a stack implementation: extending a context is implemented by pushing, and decomposing the innermost context is implemented by popping and dispatching.

3This does not preclude, e.g., weak head normal forms or lazy pairs.

(11)

often technically simple, but because of its many small cases, it tends to be tedious and error-prone. This state of affairs motivated Xiao, Sabry, and Ariola to develop automated support for proving unique-decomposition properties [29].4

2.2.5 Reduction rules

The reduction rules of a reduction semantics are of the form C[r] t where r is a potential redex andt depends only onCand on the sub-terms ofr. Oftent is written asC[t0] for somet0 depending onr.

The potential redexes that occur in a reduction rule are the actual reducible terms of the reduction semantics. The remaining potential redexes are stuck, to use Plotkin’s terminology [20]. Stuck terms encompass type-incorrect terms (e.g., the application of a literal) and undefined terms (e.g., the application of an identifier).

The only reduction rule of theλ-calculus under call by value is C[app(lam(x,t),v)]→C[t{v/x}]

and thusapp(lam(x,t),v) is a reducible term. The remaining potential redexes (of the formapp(x,v)) are stuck terms.

2.3 Requirements

We state three requirements that are sufficient for constructing arefocus function. We consider reduction semantics whose structure is as described in Section 2.2 and that satisfy unique decomposition.

2.3.1 No redundant constructs

A reduction semantics specifies a reduction relation. Any part of a specification that does not affect the specified relation is redundant and can be ignored. Specifically:

If the language has a grammar productiont::=c(t1, . . . ,ti, . . . ,tn) and the syn- tactic category ranged over byti contains only values, then an elementary reduc- tion context of the form c(x1, . . . , xi−1,[ ], xi+1, . . . , xn) is redundant. Since only values can be plugged in the hole, and values can only be trivially de- composed, then no reduction context constructed by composing this elementary context can ever be part of a complete decomposition.

If a syntactic category ranged over bytcontains no values, i.e., the corresponding values ranged over byvis the empty set, then any occurrence ofvin a rule makes that rule redundant. (Syntactic categories with no values are not useless; they can occur meaningfully in, e.g., languages with control effects.)

If the syntactic category ranged over by t is empty, then that entire syntactic category and all other productions containingt are redundant and can safely be ignored.

4Xiao, Sabry, and Ariola also point out that proving unique decomposition reduces to proving equivalence and unambiguity of context-free grammars—two undecidable properties. In their system SL, they reduce the problem to regular tree languages, for which equivalence and unambiguity are decidable. Similarly, the grammars we consider here are regular tree grammars.

(12)

The above properties are all decidable. Deciding whether the set of terms ranged over by a non-terminal is empty is decidable for context-free grammars. The specific structure of the productions of values guarantees that values must be a subset of terms, and that it is decidable whethert and v generate the same set of terms, i.e., whether tonly ranges over values.

2.3.2 Distinct value constructors and potential-redex constructors

By definition, potential redexes are distinct from values, and values are defined by a context-free grammar of the form shown in Section 2.2.1. If an instance of a syntactic construct can become a value by evaluating its sub-terms, then another instance must not also be able to become a redex, and vice-versa.

The only way a syntactic construct can become both a value and a potential redex is if the value is given by a production of the formv::=c(. . . ,vi, . . .) and there is no elementary context of the form c(. . . ,[ ], . . .) with a hole in the ith position. Then c(. . . ,ti, . . .) is a potential redex whenti isnot a value.

There are several ways to verify the property that no syntactic construct can become both a value and a potential redex. We use the equivalent requirement that the poten- tial redexes can be specified by a context-free grammar with productions of the same type as for values. In other words, the productions are of the formr::=c(x1, . . . , xn) wherexi is either vi orti.

2.3.3 Left-to-right evaluation of sub-terms

Without loss of generality, we assume that sub-terms of a syntactic construct are or- dered so that evaluating them proceeds from left to right. In other words, the unique decomposition into reduction context and potential redex will always have the hole of the context occurring in the left-most non-value sub-term, in the ordering of sub-terms chosen for the abstract syntax.

Finally, we assume that in an elementary context, if the syntactic categoryvioccurs to the left of the hole, then the corresponding categoryti actually contains non-value terms.

2.4 Consequences of the requirements

The requirements of Section 2.3 guarantee that a reduction semantics contains elemen- tary reduction contexts, values, and potential redexes of a specific form. This form is described in Figure 2.

An example where the m in Figure 2 is strictly smaller than n is a conditional expression such asif(t,t,t). The only elementary reduction context isif([ ],t,t), since the other two sub-terms should not be evaluated until the conditional expression has itself been reduced. Another example is theλ-calculus under call by name where only the function part of an application is evaluated.

Lemma 1 A reduction semantics satisfying the requirements stated in Section 2.3 con- tains productions corresponding to Figure 2.

Proof (sketch): Take the productiont::=c(t1, . . . ,tn).

If there is a value production v::=c(t1, . . . , tn) or a potential-redex production r::=c(t1, . . . ,tn), thenm= 0 and cbecomes a value or a potential redex.

(13)

For a syntactic construct with nsub-terms, t::=c(t1, . . . ,tn), there is a number 0≤m≤nsuch that the elementary reduction contexts for care exactly:

c([ ], t2, . . . ,tn) c(v1,[ ],t3, . . . ,tn) ...

c(v1, . . . ,vm−1,[ ],tm+1, . . . , tn)

If terms of the formc(v1, . . . ,vm,tm+1, . . . , tn) exist (i.e., the syntactic categories ranged over byt1, . . .tmall contain values) then they are either all values or all potential redexes. In other words, either

v::=c(v1, . . . , vm,tm+1, . . . ,tn) or

r::=c(v1, . . . ,vm, tm+1, . . . ,tn) is a production, but not both.

We then say that c needs to evaluate m sub-terms. If c(v1, . . . ,vm, tm+1, . . . ,tn) is a value we say thatcbecomes a value. If it is a potential redex we say thatcbecomes a potential redex.

Figure 2: Elementary contexts, values, and potential redexes

If there is no such production then, since t is not redundant, there must exist a termc(t1, . . . ,tn) that has a non-trivial decomposition and thus there must exist an elementary reduction contextc(t1, . . . ,ti−1,[ ],ti+1, . . . ,tn). We assume that evalu- ation proceeds from left to right, so the hole must be in the left-most term, i.e., the elementary context isc([ ],t2, . . . ,tn).

Assume that there exists an elementary reduction context of the form c(v1, . . . ,vi−1,[ ], ti+1, . . . ,tn).

Three cases occur:

1. Iftionly ranges over non-values thenm=iandcbecomes neither a value nor a potential redex.

2. Ifti ranges over values and there exists a production v::=c(v1, . . . ,vi,ti+1, . . . , tn) or

r::=c(v1, . . . ,vi,ti+1, . . . ,tn)

thenm=iandcbecomes either a value or a potential redex (since it cannot be both).

3. Ifti ranges over values, but terms of the formc(v1, . . . ,vi,ti+1, . . . ,tn) are not values or potential redexes, then they must have non-trivial decompositions, so

(14)

there is a corresponding elementary reduction context. Under left-to-right evalu- ation, the context must bec(v1, . . . ,vi,[ ],ti+2, . . . , tn).

In other words, either (1)c(v1, . . . ,vi,ti+1, . . . , tn) does not exist, or it is a value or a potential redex, and thenm=i, or (2) there exists an elementary reduction context c(v1, . . . ,vi,[ ],ti+2, . . . , tn), and thenm > i.

An induction argument shows thatm is the smallest isuch that (1) holds. There exists such a smallesti: a term of the formc(v1, . . . , vn) either does not exist or it has only trivial decompositions, i.e., is either a value or a potential redex, and thusm≤n

as required.

We have shown that a reduction semantics satisfying the requirements contains the productions of Figure 2. We do not show that there are no further productions.

However, we show in the next section that one can compute the unique decomposition of any term using only the productions of Figure 2, and thus any further productions would either be redundant or violate the unique-decomposition property.

2.5 Constructing a refocus function

We construct a function, refocus, that is extensionally equivalent to the composition of theplug function and the decompose function. We choose to make it use a stack of elementary contexts to represent reduction contexts. Such a stack allows an efficient implementation of composing an elementary reduction context and plugging a term in a context. We use [ ] for the empty context/stack andC◦c(v1, . . . ,vi−1,[ ],ti+1, . . . ,tn) for composing/pushing an elementary context.

The refocusing function is defined with two mutually recursive functions. The first, refocus, takes a term and a stack of elementary reduction contexts, and is defined by cases on the term; it has the same type asdecompose. The other function,refocusaux, takes a stack of elementary contexts and a value as arguments, and is defined by cases on the top-most elementary reduction context on the stack.

For eacht::=c(t1, . . . , tn) in the language grammar, there exists one corresponding rule inrefocus. Ifcneeds to evaluatemsub-terms then there aremrules inrefocusaux corresponding to the elementary contextsc([ ], t2, . . . ,tn) throughc(v1, . . . , vm−1,[ ], tm+1, . . . ,tn). Ifrefocusandrefocusaux are implemented in a typed setting (e.g., ML), there is one refocus and one refocusaux for each syntactic category of the language.

The rules ofrefocus mentioned above go in the refocus functions corresponding to the syntactic category ranged over byt. The rules ofrefocusaux go in the auxiliary refocus functions corresponding to the syntactic category of the hole, which is the same as the syntactic category of the value argument.

1. Ifcneeds to evaluate zero sub-terms, then we know thatc(t1, . . . ,tn) is a value or a potential redex.

(a) Ifcbecomes a value, then

refocus(c(t1, . . . ,tn),C) =refocusaux(C,c(t1, . . . , tn)) (b) If insteadcbecomes a potential redex, then

refocus(c(t1, . . . ,tn),C) = (C,c(t1, . . . , en)) since we have found the decomposition.

(15)

2. If c needs to evaluate more than zero sub-terms then the first term must be reduced first, and we simply refocus on it:

refocus(c(t1, . . . ,tn),C) =refocus(t1,C◦c([ ],t2, . . . , tn))

Likewise, therefocusaux function is defined by cases on the reduction context on top of the stack. Ifc(v1, . . . , vi−1,[ ],ti+1, . . . ,tn) is the top-most elementary context on the stack, then the corresponding case is one of the following.

1. Ifcneeds to evaluate exactlyisub-terms, thenc(v1, . . . ,vi,ti+1, . . . ,tn), if such a term exists, is either a value or a potential redex.

(a) Ifcbecomes a value then the case is

refocusaux(Cc(v1, . . . ,vi−1,[ ],ti+1, . . . ,tn),vi)

=refocusaux(C,c(v1, . . . ,vi−1, vi, ti+1, . . . ,tn))

(b) If c becomes a potential redex then we have found a decomposition into reduction context and potential redex, and the case is

refocusaux(Cc(v1, . . . ,vi−1,[ ],ti+1, . . . ,tn),vi)

= (C, c(v1, . . . ,vi−1,vi,ti+1, . . . , tn))

(c) If ti ranges over only non-values then there is no rule. In a typed setting there is no refocusaux corresponding to the constructors of that syntactic category.

2. If c needs to evaluate more thani sub-terms, then we continue searching for a potential redex in the next sub-term, and the rule is:

refocusaux(Cc(v1, . . . , vi−1,[ ],ti+1, . . . , tn),vi)

=refocus(ti+1,C◦c(v1, . . . ,vi,[ ],tn)) 3. Finally there is one rule for the empty context.

refocusaux([ ],v) =v

This base case accounts for the situation where the entire term is a value, so the decomposition has to return a value rather than a context and a potential redex.

In Appendix A, we illustrate the construction of arefocus function for a reduction semantics of arithmetic expressions with precedence.

In the following section we show that therefocus function generated by the above rules computes a correct decomposition of a term into a reduction context and a po- tential redex, and that it does so at least as efficiently as the combination ofplug and decompose.

(16)

2.6 Correctness

By construction,refocus can only return either a value or a pair of a reduction context and a potential redex, since the remaining cases all perform a tail call. Bothrefocus and refocusaux are passed a decomposition of a term, i.e., a reduction context and another term, and if they make a recursive call, it is on another decomposition of the same term. Therefore, if refocus(t,C) returns a decomposition, then it is a complete decomposition ofC[t].

It thus suffices to prove that refocus terminates on any argument. Let us show that ifC[t] =C0[r], wherer is a potential redex, thenrefocus(t,C) performsT(C0) + Taux(r)− T(C) tail-recursive calls, whereT andTaux are defined as follows:

T :RedCont →N T([ ]) = 0

T(Cc(v1, . . . , vk−1,[ ],tk+1, . . . ,ten)) = 1 +T(C) +Pk+1

i=1(1 +Taux(vi)) Taux :Val+PotRedex →N

Taux(c(v1, . . . ,vm,tm+1, . . . ,tn)) = 1 +Pm

i=1(1 +Taux(vi))

Given a reduction context, T computes the number of edges traversed to go from the root to the hole in a depth-first left-to-right traversal, counting the edges to values both on the way down and on the way up. This traversal is illustrated by the following picture:

c

~~~~~~~~

@@

@@

@@

@@

~~

v

>>

v

OO

c



==

==

==

==



v

??

c

<<

<<

<<

<< t

c

==

==

==

==

t t

v

@@

[ ] t

As forTaux, it computes the number of edges followed by a full traversal of a composite value.

All fully traversed sub-terms must be values. With this definition, the result of T(C) is at most twice the number of nodes visited on such a traversal.

We use theT function to measure how far therefocus function has progressed in its traversal of its argument.

Lemma 2 (Termination) If C[t] =C0[r]then refocus(t,C)terminates afterT(C0)+

Taux(r)− T(C) tail-recursive calls.

Proof:Let us useT to define aprogress measure on all the left-hand sides and right- hand sides of the definition ofrefocus andrefocusaux:

(17)

Progress(refocus(t,C))) =T(C)

Progress(refocusaux(C,v)) =T(C) +Taux(v) Progress(C,r) =T(C) +Taux(r)

Progress(v) =Taux(v)

With this definition, all choices of arguments to refocus and refocusaux give a right- hand side with a progress value one greater than the left-hand side. We omit the details and give only one case as example.

Ifcneeds to evaluatemsub-terms to become a value and 0< k < m, thenrefocusaux contains the following case:

refocusaux(Cc(v1, . . . ,vk−1,[ ],tk+1,tk+2, . . . ,tn),vk)

=refocus(tk+1,C◦c(v1, . . . ,vk−1,vk,[ ],tk+2, . . . ,tn))

Taking the progress measure on both the left-hand side and the right-hand side gives the expected one-greater value on the right-hand side:

Progress(refocusaux(Cc(v1, . . . ,vk−1,[ ], tk+1, . . . ,tn),vi))

=T(Cc(v1, . . . ,vk−1,[ ],tk+1, . . . , tn)) +Taux(vk)

= 1 +T(C) +Pk−1

i=1(1 +Taux(vi)) +Taux(vk) and

Progress(refocus(tk+1,C◦c(v1, . . . ,vk,[ ],tk+2, . . . , tn)))

=T(Cc(v1, . . . ,vk,[ ],tk+2, . . . ,tn))

= 1 +T(C) +Pk

i=1(1 +Taux(vi))

= 1 +T(C) +Pk−1

i=1(1 +Taux(vi)) +Taux(vk) + 1

Since each recursive call increases the progress measure by one, and there is only a finite possible number of different such calls (each call is to one of two functions on a decomposition of the same term, and a term has only finitely many decompositions), the recursion must eventually end. When this happens, the cumulative increase in the progress measure is also the number of recursive calls performed to reach it. This number is exactly

Progress((C0,r))−Progress(refocus(t,C)) =T(C0) +Taux(r)− T(C).

Lemma 2 tells us both thatrefocus is a total function and how many calls it takes to find the result. In Section 2.7, we use it to show thatrefocus is always at least as efficient as usingplug anddecompose.

2.7 Complexity

In order to show that refocus is more efficient than the composition of plug and decompose, we must first decide on a relevant complexity measure, and we must find the complexity ofplug anddecompose.

For measuring the time complexity ofrefocus we use the number of recursive calls, as in Section 2.6. Each call performs a bounded number of basic operations on the syntax (either constructing a term or deconstructing a term while naming the sub- terms—the latter is performed implicitly since the function is defined by cases). If each

(18)

such operation takes constant time, the time taken to compute the result ofrefocus is proportional to the number of recursive calls (i.e., bounded by a constant factor times the number of calls).

Let us informally argue that any implementation of a functiondecompose :Term Val+ (RedCont×PotRedex) finding the unique complete decomposition of terms (in a syntactic category satisfying our requirements) must perform at least (T(C) + Taux(r))/2 basic operations on the syntax to compute decompose(C[r]) = (C,r). If it uses fewer operations, it cannot inspect all the values of the reduction context and of the potential redex. If we changed such a value to a non-value, then the hole of the reduction context should be in the new non-value term, but the function cannot see this without inspecting the value and thus it must generate an erroneous result, contradicting the assumption that it computes the unique complete decomposition.

Theplug function is implemented in time proportional to the depth of the hole in the reduction context. Since the time complexity of decompose is always greater, we ignoreplug in our comparison.

In summary, the complexity of computingdecompose(C[t]) =C0[r] is at least pro- portional toT(C0)+Taux(r), whereas the complexity of computingrefocus(t,C) =C0[r]

is proportional toT(C0) +Taux(r)− T(C). For reduction semantics where rewrites are often local, the difference betweenT(C) and T(C0) is often significantly smaller than the value ofT(C0) itself. For example, in the Church-numeral example of Section 1.3 (Example 1), the difference betweenT(C) andT(C0) is constant whereasT(C0) alone is proportional to the size of the input term.

In some cases,refocus does no better thandecompose, e.g., when the context argu- ment torefocus is always empty (e.g., for tail-recursive or continuation-passing style λ-terms, and for trampoline-like programs with control operators where the context is discarded in each step). Likewise, if the size of the context is bounded, the saving is at most a constant factor.

2.8 Summary

We have presented a few mild requirements to construct a refocus function for a re- duction semantics. We have also proven the correctness of this construction and char- acterized the complexity of the constructed function. Refocusing is at least as efficient as first plugging and then decomposing, and it is often significantly more efficient.

(19)

3 The λ -calculus under call by value

Let us go back to the λ-calculus as initially considered in Section 1.3. We first re- state the grammars of the language and of the reduction semantics in the format used in Section 2. These grammars can be expressed directly as ML data types and the corresponding plug and decompose functions as ML functions (see Appendix A).

Terms t ::= var(x) | lam(x,t) | app(t,t) t∈Term

Values v ::= var(x) | lam(x,t) v∈Val

Variables x∈Var

Reduction contexts C ::= [ ] | C[app([ ],t)] | C[app(v,[ ])] C∈RedCont

Potential redexes r ::= app(v, v) r∈PotRedex

We make plugging and decomposition explicit with two functions:

plug :Term×RedContTerm

decompose :Term Val+ (RedCont×PotRedex)

The refocusing function refocus is defined as the composition of decompose and plug. Its type is thus:

refocus :Term×RedCont Val+ (RedCont×PotRedex)

We now consider a call-by-value interpreter and then two call-by-value CPS trans- formers for theλ-calculus (Sections 3.1, 3.2, and 3.3). Each usesdecompose andplug, which we fuse into a more efficientrefocus function (Section 3.4).

3.1 A call-by-value interpreter

We start from a traditional call-by-value reduction semantics for theλ-calculus [10–12].

The corresponding interpreter is defined as the transitive closure of (1) decomposing a term into a reduction context and a potential redex, (2) contracting this redex if possible, and (3) plugging the contractum into the context. We restate this interpreter so that the decomposition function is always applied to the result of the plug function.

3.1.1 The original specification

The following interpreter implements the evaluation function of the reduction seman- tics. It takes a term as argument and repeatedly decomposes, contracts, and plugs until either a value or a stuck term is reached, if any:

evaluate : Term Val+ (RedCont×PotRedex) evaluate(t) = iterate(decompose(t))

iterate : Val+ (RedCont×PotRedex)Val+ (RedCont×PotRedex) iterate(v) = v

iterate(C,app(lam(x,t), v)) = evaluate(plug(t{v/x},C)) iterate(C,app(var(x), v)) = (C,app(var(x), v))

The evaluation of a term t is evaluate(t). (NB: The last clause of iterate is used for stuck terms.)

(20)

The interpreter contains two calls to evaluate: one in the second clause ofiterate and one in the initial call. In the second clause ofiterate, the argument ofevaluate is the result ofplug, and the same holds in the initial call, sincet=plug(t,[ ]). In that sense,decompose is always called with the result ofplug.

Let us re-express the interpreter with a refocusing function that combines the effects ofdecompose andplug.

3.1.2 The refocused specification

We change the interpreter to use refocus instead of decompose and plug. Instead of decompose(t) above, we write decompose(plug(t, [ ])), i.e., refocus(t,[ ]); and in- stead ofevaluate(plug(t{v/x},C)), we inline the (old) definition ofevaluateand write iterate(decompose(plug(t{v/x},C))), i.e.,iterate(refocus(t{v/x},C)):

evaluate : Term Val+ (RedCont×PotRedex) evaluate(t) = iterate(refocus(t,[ ]))

iterate : Val+ (RedCont×PotRedex)Val+ (RedCont×PotRedex) iterate(v) = v

iterate(C,app(lam(x,t), v)) = iterate(refocus(t{v/x},C)) iterate(C,app(var(x), v)) = (C,app(var(x), v))

The evaluation of a termt isevaluate(t).

Proposition 1 The original interpreter and the refocused interpreter implement the same evaluation function.

Proof: Immediate.

We are now free to use any implementation ofrefocusthat is extensionally equivalent todecompose◦plug.

3.2 Sabry and Felleisen’s CPS transformer

In their work on reasoning about programs in continuation-passing style (CPS), Sabry and Felleisen designed a new CPS transformation [23, Definition 5]. This CPS transfor- mation integrates a notion of generalized reduction and thus yields very compact CPS programs [6]. It is also unusual in the sense that it builds on the notion of a reduction semantics. Therefore, it is defined as the transitive closure of decomposing, perform- ing an elementary CPS transformation, and plugging, which makes it a candidate for refocusing.

3.2.1 The original specification

Definition 2 (Sabry and Felleisen, 1993)The following CPS transformer uses three mutually recursive functions: Ck (and the auxiliary function Ck0) to transform terms, Φto transform values, andKk to transform reduction contexts. The functions Ck,Ck0, andKk are parameterized over a variable kthat represents the current continuation.

Ck : Term Term

Ck(t) = Ck0(decompose(t))

(21)

Ck0 : Val+ (RedCont×PotRedex) Term

Ck0(v) = app(var(k),Φ(v))

Ck0(C,app(var(x), v)) = app(app(var(x),Kk(C)), Φ(v)) Ck0(C, app(lam(x,t), v)) = app(lam(x,Ck(plug(t, C))),Φ(v))

Φ : Val Val Φ(var(x)) = var(x)

Φ(lam(x,t)) = lam(k,lam(x,Ck(t))) for a fresh k

Kk : RedCont Term Kk([ ]) = var(k)

Kk(C[app(var(x),[ ])]) = app(var(x),Kk(C)) Kk(C[app(lam(x,t),[ ])]) = lam(x,Ck(plug(t,C)))

Kk(C[app([ ], t)]) = lam(ui,Ck(plug(app(var(ui),t), C))) for a fresh ui

The CPS transformation of a term t islam(k, Ck(t)), for a freshk.

The CPS transformer contains five calls toCk (counting the initial one). In three cases, the argument ofCk is the result ofplug, and in two cases, the argument ist. As in Section 3.1.1, and sincet=plug(t,[ ]), we can say thatdecompose, in the definition ofCk, is always called with the result ofplug.

If they are implemented literally, decomposition and plugging entail a time factor for each transformation step that is linear in the size of the input term, in the worst case. Overall, the worst-case time complexity of this CPS transformation is quadratic in the size of the input term.

Let us re-express the CPS transformer with a refocusing function that combines the effects ofdecompose andplug.

3.2.2 The refocused specification

We change the CPS transformer to userefocus instead ofdecompose andplug. Ck : Term Term

Ck(t) = Ck0(refocus(t,[ ])) Ck0 : Val+ (RedCont×PotRedex)Term

Ck0(v) = app(var(k), Φ(v))

Ck0(C,app(var(x), v)) = app(app(var(x),Kk(C)),Φ(v)) Ck0(C,app(lam(x,t), v)) = app(lam(x,Ck0(refocus(t,C))),Φ(v))

Φ : Val Val Φ(var(x)) = var(x)

Φ(lam(x,t)) = lam(k, lam(x,Ck(t))) for a freshk

Kk : RedContTerm Kk([ ]) = var(k)

Kk(C[app(var(x),[ ])]) = app(var(x),Kk(C)) Kk(C[app(lam(x,t),[ ])]) = lam(x,Ck0(refocus(t,C)))

Kk(C[app([ ],t)]) = lam(ui,Ck0(refocus(app(var(ui),t),C))) for a freshui

(22)

The CPS transformation of a termtislam(k,Ck(t)), for a freshk.

Proposition 2 The original CPS transformer and the refocused CPS transformer im- plement the same transformation function.

Proof: Immediate.

We are now free to use any implementation ofrefocusthat is extensionally equivalent todecompose◦plug.

3.3 A simpler CPS transformer

We present a simplified, less compacting, version of Sabry and Felleisen’s context-based CPS transformer, one that produces CPS terms `a la Plotkin (λx.λk. ...) rather than

`a la Fischer (λk.λx. ...) [6]. We then restate it so that the decomposition function is always applied to the result of the plug function.

3.3.1 The original specification

The following CPS transformation repeatedly decomposes a source term into a context and the application of one value to another value, CPS transforms the application, and plugs a fresh variable in the context. This process continues until the source term is a value.

Definition 3 (Context-based CPS transformation) Ck :Term Term

Ck(t) = Ck0(decompose(t)) Ck0 :Val+ (RedCont×PotRedex) Term

Ck0(v) = app(var(k),Φ(v))

Ck0(C,app(v, v0)) = app(app(Φ(v),Φ(v0)),Kk(C)) Φ:Val Val

Φ(var(x)) = var(x)

Φ(lam(x,t)) = lam(x,lam(k, Ck(t))) for a freshk

Kk:RedCont Val

Kk(C) = lam(ui,Ck(plug(var(ui),C))) for a freshui

The CPS transformation of a term t islam(k, Ck(t)), for a freshk. Again, and for the same reasons as in Section 3.2.1, the worst-case time complexity of this CPS transformation is quadratic in the size of the input term.

3.3.2 The refocused specification

We change the CPS transformer to use refocus instead of decompose and plug. In- stead ofdecompose(t) above, we writedecompose(plug(t, [ ])), i.e., refocus(t,[ ]); and instead ofCk(plug(var(ui),C)), we writeCk0(decompose(plug(var(ui),C))), i.e., we write Ck0(refocus(var(ui),C)):

(23)

Ck:Term Term

Ck(t) = C0k(refocus(t,[ ])) Ck0 :Val+ (RedCont×PotRedex)Term

Ck0(v) = app(var(k),Φ(v))

Ck0(C,app(v, v0)) = app(app(Φ(v),Φ(v0)),Kk(C)) Φ:Val Val

Φ(var(x)) = var(x)

Φ(lam(x,t)) = lam(x,lam(k,Ck(t))) for a freshk

Kk :RedCont Val

Kk(C) = lam(ui,Ck0(refocus(var(ui),C))) for a freshui

The CPS transformation of a termtislam(k,Ck(t)), for a freshk.

Proposition 3 The original CPS transformer and the refocused CPS transformer im- plement the same transformation function.

Proof: Immediate.

We are now free to use any implementation ofrefocusthat is extensionally equivalent todecompose◦plug.

3.4 Refocusing efficiently

Applying the construction of Section 2.5 yields the following functions:

refocus : Term×RedCont Val+ (RedCont×PotRedex) refocus(v,C) = refocusaux(C, v)

refocus(app(t,t0),C) = refocus(t,C[app([ ],t0)])

refocusaux : RedCont×Val Val+ (RedCont×PotRedex) refocusaux([ ], v) = v

refocusaux(C[app([ ],t0)], v) = refocus(t0,C[app(v,[ ])]) refocusaux(C[app(v,[ ])], v0) = (C,app(v, v0))

By construction, refocus implements a depth-first search for the first application of one value to another, by recursively descending into a term and accumulating the corresponding context. As forrefocusaux, it dispatches on the accumulated context.

The following definition ofdecompose holds as a corollary:

decompose:Term Val+ (RedCont×PotRedex) decompose(t) = refocus(t,[ ])

3.4.1 The call-by-value interpreter

Proposition 4 The new refocused interpreter and the previous one implement the same evaluation function.

Proof: A corollary of the correctness of the refocus construction.

Referencer

RELATEREDE DOKUMENTER

The labour supply function we want to estimate in equation (6) differs from our initial labour supply function in equation (1) because we account for endogeneity – of marginal

The resulting function is the probability of travel choice behavior as a function of observable explanatory variables (i.e., traveler characteristics S n , alternative

To investigate the optimal number of data points for training, we will test the accuracy and training time as a function of the number of training data points.. The parameter

Figure 6: Evolution of the total GWP as a function of the powertrain efficiency a) GWP as a function of the powertrain efficiency, b) linear fit between the GWP and the

• A checksum function is used to create a MESSAGE DIGEST (a ONE-WAY HASH FUNCTION) that is sent with the message and encrypted along with it..

Inlining the components of the combined stack-inspection and exception monad in the monadic evaluator of Section 8 and uncurrying the eval function and the function space in the

When writing a thesis, it is important for the author and researcher of the thesis to state the specific purpose; this will function as a guide to find the appropriate research

(To make the interpreter compositional, one can follow the tradition of denotational semantics and use an environment mapping an identifier to a function that either evaluates the