Context handling - Possible optimizations


4.1.3 Context handling

JavaScript has several context conditions that a program must respect in order to be correct.

4.1.3.1 Return statements

In JavaScript, all return statements must be inside functions [ecm11, p. 93].

Therefore the following JavaScript program is not correct:

function double(x){
    return x*2;
}
return;

This can be checked with a tree search for return statements, starting from the root node. When the search reaches a node that represents a function, it does not descend into the subtree rooted at that node, because every return statement in that subtree is legal. The JavaScript program is correct with respect to return statements if the search finds no return statement. We have implemented the tree search as a visitor.
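The search described above can be sketched as follows. This is only an illustration, not the project compiler's visitor: the node shape ({ type, children }) and the type names "Program", "Function" and "Return" are assumptions made for the example.

```javascript
// Hypothetical AST shape (an assumption): { type: string, children: Node[] }.
// Returns true if a return statement is found outside every function.
function hasReturnOutsideFunction(node) {
  // Do not descend into functions: returns inside them are legal.
  if (node.type === "Function") return false;
  if (node.type === "Return") return true;
  return (node.children || []).some(hasReturnOutsideFunction);
}

// Hand-built AST for: function double(x){ return x*2; } return;
const program = {
  type: "Program",
  children: [
    { type: "Function", children: [{ type: "Return", children: [] }] },
    { type: "Return", children: [] },
  ],
};

console.log(hasReturnOutsideFunction(program)); // true, so the program is rejected
```

Cutting the search off at function nodes is what keeps the check simple: no extra state has to be passed down the tree.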

4.1.3.2 Continues and breaks

The context conditions for continue and break statements are as follows [ecm11, pp. 92-93].

Unlabeled. An unlabeled continue or break statement must be inside a loop or switch, and there must be no function scope between the statement and that loop or switch.

Figure 4.1

This means that the AST in figure 4.1 is illegal: although the break statement is in the subtree of a for node, there is a FunctionNode between the break and the loop. The AST in figure 4.2, however, is legal.

Figure 4.2

Labeled. A labeled continue or break statement must be inside a loop or a block (code enclosed in curly brackets) that has the same label as the continue or break statement, or the statement itself must carry that label. As before, there must be no function scope between the loop or block and the break or continue.

Algorithm. Informally, the idea behind the algorithm for checking these conditions is that each node asks all its subtrees whether they comply. The node itself complies if all its subtrees comply. So that the subtrees can check for compliance, the node tells them which labels are declared by and above the node, and whether they are inside a loop or switch.

A switch or loop structure therefore passes on the labels it knows from its parent and additionally tells its subtrees that they are inside a switch or loop. If the structure has a label, it also reports that label to the subtrees. Functions hide all enclosing switches and loops, and therefore tell their subtrees that they are not in a switch or loop and that no labels are declared.

When an unlabeled continue or break is asked, it answers yes if it has been told that it is inside a loop or switch, and no otherwise. When a labeled continue or break is asked, it answers yes if it has been told that its label is declared, and no otherwise.

The algorithm is implemented as a visitor.
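A minimal sketch of this top-down scheme is given below. It is not the project's visitor: the node shape, the type names ("Loop", "Switch", "Block", "Function", "Break", "Continue") and the fields label and target are assumptions made for the example. The sketch also omits the case where the break or continue statement itself carries the label.

```javascript
// Hypothetical node shape (an assumption):
//   { type, label?, target?, children? }
// "label" is a label declared on a loop, switch or block;
// "target" is the label a break/continue jumps to (absent if unlabeled).

function withLabel(labels, label) {
  return label ? [...labels, label] : labels;
}

// ctx carries what the parent "tells" its subtrees.
function complies(node, ctx = { inLoopOrSwitch: false, labels: [] }) {
  switch (node.type) {
    case "Break":
    case "Continue":
      return node.target ? ctx.labels.includes(node.target) : ctx.inLoopOrSwitch;
    case "Function":
      // A function hides all enclosing loops, switches and labels.
      ctx = { inLoopOrSwitch: false, labels: [] };
      break;
    case "Loop":
    case "Switch":
      ctx = { inLoopOrSwitch: true, labels: withLabel(ctx.labels, node.label) };
      break;
    case "Block":
      ctx = { inLoopOrSwitch: ctx.inLoopOrSwitch, labels: withLabel(ctx.labels, node.label) };
      break;
  }
  // A node complies if all of its subtrees comply.
  return (node.children || []).every((child) => complies(child, ctx));
}

// A function between the loop and the break: illegal.
const functionInsideLoop = {
  type: "Loop",
  children: [{ type: "Function", children: [{ type: "Break" }] }],
};
// The loop is inside the function: legal.
const loopInsideFunction = {
  type: "Function",
  children: [{ type: "Loop", children: [{ type: "Break" }] }],
};
// Labeled break out of a labeled block: legal.
const labeledBlock = {
  type: "Block",
  label: "outer",
  children: [{ type: "Loop", children: [{ type: "Break", target: "outer" }] }],
};

console.log(complies(functionInsideLoop)); // false
console.log(complies(loopInsideFunction)); // true
console.log(complies(labeledBlock));       // true
```

Passing the context down as an immutable value means that a subtree can never affect what its siblings are told, which mirrors the informal description above.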

Alternative strategy. As an alternative to this approach, we could have ignored the condition and simply translated the code to C. In C, a jump to an undeclared label produces a compile error, so the condition would still be enforced. The advantage of our approach, however, is that we do not rely on the error handling of the C compiler, which makes it more explicit how the conditions are derived from the ECMAScript Language Specification. It also means that we can provide more meaningful error messages than the C compiler, which, of course, knows nothing about JavaScript.

4.1.3.3 Getters and setters

Even though getters and setters are not supported by the compiler, we also check the context condition that an object literal may define each getter and each setter only once. This is implemented as a visitor.
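The check for a single object literal might look roughly like the sketch below. The property representation ({ kind, name } with kind being "get", "set" or "init") is an assumption made for the example and is not taken from the project compiler.

```javascript
// Hypothetical representation of an object literal's properties
// (an assumption): [{ kind: "get" | "set" | "init", name: string }, ...]
function hasDuplicateAccessor(properties) {
  const seen = new Set();
  for (const { kind, name } of properties) {
    if (kind !== "get" && kind !== "set") continue; // only accessors are checked here
    const key = kind + ":" + name;
    if (seen.has(key)) return true; // same getter or setter defined twice
    seen.add(key);
  }
  return false;
}

// { get x(){...}, set x(v){...} } defines each accessor once.
console.log(hasDuplicateAccessor([
  { kind: "get", name: "x" },
  { kind: "set", name: "x" },
])); // false

// { get x(){...}, get x(){...} } defines the getter twice.
console.log(hasDuplicateAccessor([
  { kind: "get", name: "x" },
  { kind: "get", name: "x" },
])); // true
```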

4.1.3.4 Annotations

In the project compiler, context handling is also used to annotate the Abstract Syntax Tree with information that is useful in the later phases of the compilation.

4.1.3.5 Declarations and assignments

In JavaScript, there is a difference between function declarations and function expressions.

function f(){
    console.log(g(5)); //g is bound to the function returning x*5, so 25 is written
    console.log(x); //x is bound to undefined, so undefined is written
    function g(x){
        return x*5;
    }
    var x = 5;
    console.log(x); //x is bound to 5, so 5 is written
}
f();

The code shows that a function declaration binds the function value to its name at the beginning of the enclosing function, whereas a variable declaration binds the variable to undefined at the beginning and to the specified value only when the definition is reached.

To make the translation to C easy, we annotate each function node with the set of variables and the set of functions that are declared in it.

We also annotate the root node with all the variables that are assigned to but never declared, since they belong to the global scope, as the following example shows:

function f(){
    a = 5;
}
function g(){
    console.log(a);
}
if(x){
    f();
}
g();

If x is true, f is run, with the consequence that 5 is assigned to a in the global scope. When g is run, it can then print a. If x is false, on the other hand, a has neither a declaration nor an assignment, and g produces a runtime error. This example shows that we also need to know which variables can belong to the global scope.

This annotation is implemented as a visitor.
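The two annotations can be sketched as two passes over the tree. This is a simplified illustration under assumed names: the node types "Program", "Function", "VarDecl" and "Assign", and the annotation fields declaredVariables and declaredFunctions, are hypothetical, and function parameters are ignored.

```javascript
// Hypothetical node shape (an assumption): { type, name?, children? }.

// Pass 1: annotate every scope (Program or Function) with the
// variables and functions declared directly inside it.
function annotateDeclarations(node, scope = null) {
  if (node.type === "Program" || node.type === "Function") {
    if (scope && node.name) scope.declaredFunctions.add(node.name);
    node.declaredVariables = new Set();
    node.declaredFunctions = new Set();
    scope = node;
  } else if (node.type === "VarDecl") {
    scope.declaredVariables.add(node.name);
  }
  for (const child of node.children || []) annotateDeclarations(child, scope);
}

// Pass 2: collect names that are assigned to but declared in no enclosing scope.
function collectImplicitGlobals(node, scopes = [], found = new Set()) {
  if (node.type === "Program" || node.type === "Function") scopes = [...scopes, node];
  if (node.type === "Assign") {
    const declared = scopes.some(
      (s) => s.declaredVariables.has(node.name) || s.declaredFunctions.has(node.name)
    );
    if (!declared) found.add(node.name);
  }
  for (const child of node.children || []) collectImplicitGlobals(child, scopes, found);
  return found;
}

// Hand-built AST for the f/g example above (calls and the if are omitted).
const root = {
  type: "Program",
  children: [
    { type: "Function", name: "f", children: [{ type: "Assign", name: "a" }] },
    { type: "Function", name: "g", children: [] },
  ],
};

annotateDeclarations(root);
root.implicitGlobals = collectImplicitGlobals(root);

console.log([...root.declaredFunctions]); // [ 'f', 'g' ]
console.log([...root.implicitGlobals]);   // [ 'a' ]
```

Two passes are used because declarations take effect for the whole function body, so every declaration in a scope has to be known before any assignment in it can be classified.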

4.1.3.6 Alternative strategy

An alternative to solving this problem with annotations would be to translate the AST to an intermediate language that is very similar to JavaScript but has the restriction that all variable and function declarations must appear at the beginning of each function body.

We could then, for instance, take the AST representation equivalent of

function foo(){
    console.log(f(a));
    function f(x){
        return x*2;
    }
    var a = 5;
}

and turn it into an intermediate representation that would be equivalent to

function foo(){
    var a = undefined;
    var f = function f(x){
        return x*2;
    };
    console.log(f(a));
    a = 5;
}

This solution is completely equivalent to what we do, but one could argue that it shows the meaning of what is done more explicitly than adding annotations does.

Our solution, however, has the advantage that we do not need to build a new tree, which would be more difficult than simply adding annotations, and that we do not have to keep a second tree in memory.

4.1.3.7 Function IDs

Because C has no nested functions, all function definitions must be placed in the same outer scope. This is a problem if several functions in different JavaScript scopes share the same name. To solve it, we give every function a unique ID by visiting the tree and annotating each function node with a number.
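A numbering pass of this kind can be sketched in a few lines. The node shape and the annotation field name id are assumptions made for the example; the source does not state how the IDs are used to form C identifiers.

```javascript
// Hypothetical node shape (an assumption): { type, name?, children? }.
// Annotates every function node with a unique number, in pre-order,
// and returns how many functions were numbered.
function assignFunctionIds(node, counter = { next: 0 }) {
  if (node.type === "Function") node.id = counter.next++;
  for (const child of node.children || []) assignFunctionIds(child, counter);
  return counter.next;
}

// Two different functions named "helper" in different scopes.
const tree = {
  type: "Program",
  children: [
    { type: "Function", name: "outerA", children: [{ type: "Function", name: "helper", children: [] }] },
    { type: "Function", name: "outerB", children: [{ type: "Function", name: "helper", children: [] }] },
  ],
};

console.log(assignFunctionIds(tree));     // 4
console.log(tree.children[0].children[0].id); // 1
console.log(tree.children[1].children[0].id); // 3
```

The two helper functions end up with different IDs, so they can be told apart once all definitions share a single scope.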

In document Compiling Dynamic Languages (pages 39-44)