
7.3 Consequences on the metrics’ comparison


As shown in the previous section, optimizing for size is not as straightforward as optimizing for speed with the transformations considered in this thesis. While all of these transformations improve the speed of the program, they do not all reduce its size. To build an optimization process that aims at reducing the size of the program, the insights gained in the previous sections are used to design two mechanisms: a mechanism that detects non-improving transformation calls, and a weighting of the metrics that guides the choice of the next transformation.

7.3.1 Detection of non-improving transformations

As some transformations do not always improve the size of the program, a mechanism for detecting non-improving transformation calls has been added to the phase-ordering algorithm.

The basic idea is to compute the size of the program before a transformation call and compare it with the size after the call. If the size has increased, the transformation is blacklisted and the old version of the program is kept for the remaining optimization process.
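To make this concrete, the following Java sketch illustrates such a check. It is only an illustration: the Program and Transformation interfaces, the method names and the blacklist are assumed for the example and do not correspond to the actual classes of the Phase Manager.

    import java.util.HashSet;
    import java.util.Set;

    // Minimal interfaces assumed for the sketch; the thesis classes differ.
    interface Program { int size(); }
    interface Transformation { String name(); Program apply(Program p); }

    class NonImprovingDetector {
        private final Set<String> blacklist = new HashSet<>();

        // Applies t only if it does not increase the program size.
        Program applyIfImproving(Program before, Transformation t) {
            if (blacklist.contains(t.name())) {
                return before;                  // already known to be non-improving
            }
            Program after = t.apply(before);    // candidate version of the program
            if (after.size() > before.size()) {
                blacklist.add(t.name());        // blacklist the transformation
                return before;                  // keep the old version for the rest of the process
            }
            return after;
        }
    }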

However, some specific cases must be considered: a transformation may increase the size of the program at first, but enable other transformations whose application makes the size decrease overall. Consequently, using the information gathered in Section 6.2.2 about the interactions between the transformations, some transformations are treated specifically:

1. Constant Folding: when Constant Folding makes the size of the program increase, the Phase Manager examines the metric DCE. Dead Code Elimination is often enabled by Constant Folding, so there is a chance that applying Dead Code Elimination after Constant Folding will be beneficial. If the metric DCE increases, Dead Code Elimination has been enabled by Constant Folding, and in that case Dead Code Elimination is applied to the program. If the size after Dead Code Elimination is smaller than the original size (before Constant Folding), the optimization process continues normally; otherwise Constant Folding is blacklisted and the original program (before Constant Folding) is used when resuming the process. This case is sketched after the list.

2. Code Motion: Code Motion was shown in the previous section to degrade the size of the program. However, there are cases where it can be beneficial in combination with Constant Folding and Dead Code Elimination: the pre-header condition can often be simplified and the pre-header itself can be removed. So when Code Motion is applied and makes the size increase, Constant Folding is considered in a similar way as Dead Code Elimination above. If this still does not improve the size, another call of Constant Folding (because Precalculation, called after the first Constant Folding, may have removed the pre-header and enabled Constant Folding again) is made, combined with Dead Code Elimination. If this still degrades the size of the program, Code Motion is blacklisted and the original program is used in the rest of the optimization process.
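The Constant Folding case (item 1) can be sketched as follows, reusing the Program interface from the sketch above. The names constantFolding, deadCodeElimination, dceMetric and blacklist are assumptions for the example, not the actual methods of the Phase Manager.

    // Illustrative sketch of the Constant Folding special case (item 1 above).
    abstract class ConstantFoldingCase {
        abstract Program constantFolding(Program p);
        abstract Program deadCodeElimination(Program p);
        abstract double dceMetric(Program p);      // metric estimating the benefit of DCE
        abstract void blacklist(String transformation);

        Program handleConstantFolding(Program original) {
            double dceBefore = dceMetric(original);
            Program folded = constantFolding(original);
            if (folded.size() <= original.size()) {
                return folded;                     // the size did not grow: accept the result
            }
            if (dceMetric(folded) > dceBefore) {   // Constant Folding enabled Dead Code Elimination
                Program cleaned = deadCodeElimination(folded);
                if (cleaned.size() < original.size()) {
                    return cleaned;                // the CF + DCE sequence is beneficial overall
                }
            }
            blacklist("ConstantFolding");          // neither variant improved the size
            return original;                       // resume from the program before Constant Folding
        }
    }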

Calling other transformations without being sure that they will be beneficial can increase the compilation time, so it is worth refining the order in which the transformations are considered. The next part introduces coefficients used to weight the different metrics in order to reduce the compilation time.

7.3.2 Use of weights in the phase-ordering mechanism

This second part deals with the mechanism that weights the metrics’ values according to their probability of improving the program. This can help the phase-ordering mechanism make a better choice of transformation depending on the goal specified by the user.

The previous section considered transformations that do not improve the size of the program. A good way to reduce the optimization time is to apply first the transformations that are unlikely to degrade the size of the program, and leave the problematic transformations for the end. In order to rank the transformations by their probability of improving the size while still taking the associated metric into account, a weighting scheme has been designed: whenever a metric is calculated, it is multiplied by a coefficient that represents the weight of the transformation.

These weights have been defined according to the results of Section 7.2.2: the more often a transformation improved the size in the previous experiments, the higher the weight of the metric associated with that transformation.

A rough estimate of potentially good weights, derived from Figure 7.4, is:

• Copy Propagation: 1.3

• Elimination with Signs: 1

• Common Subexpressions Elimination: 0.3

• Dead Code Elimination: 2

• Code Motion: 0.1

• Constant Folding: 0.7

These values have been chosen as an example; of course, a more thorough choice would be needed for optimal performance.
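As an illustration, the following sketch shows how such weights could be used when choosing the next transformation, using the example values above. The class and method names are assumptions for the example; the metric values are assumed to have already been computed by the Phase Manager.

    import java.util.Map;

    // Illustrative sketch: pick the transformation with the highest weighted metric value.
    class WeightedMetricChooser {
        // Example weights from the list above; a real deployment would tune them.
        private static final Map<String, Double> WEIGHTS = Map.of(
                "CopyPropagation", 1.3,
                "EliminationWithSigns", 1.0,
                "CommonSubexpressionsElimination", 0.3,
                "DeadCodeElimination", 2.0,
                "CodeMotion", 0.1,
                "ConstantFolding", 0.7);

        // Returns the name of the transformation with the highest weighted metric value.
        String chooseNext(Map<String, Double> metricValues) {
            String best = null;
            double bestValue = Double.NEGATIVE_INFINITY;
            for (Map.Entry<String, Double> entry : metricValues.entrySet()) {
                double weighted = entry.getValue() * WEIGHTS.getOrDefault(entry.getKey(), 1.0);
                if (weighted > bestValue) {
                    bestValue = weighted;
                    best = entry.getKey();
                }
            }
            return best;
        }
    }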

7.3.3 Evaluation

This last part deals with the evaluation of the phase-ordering mechanism aiming at reducing the size of a program. Figure 7.6 shows the size of the optimized benchmark programs in three cases:

• using the benchmark suite and choosing the best regular expression according to the size of the program and then the optimization time.

• the classic metric-based phase-ordering mechanism evaluated in Section 6.1, with reuse of the results from the metrics’ calculation.

• the metric-based phase-ordering using the two mechanisms from the previous sections aiming at improving the size of the optimized program.

The sizes of the optimized benchmark programs are almost always the same, except for two benchmark programs.

Benchmark     Best regular expression   Metric-based without weights   Metric-based with weights
BenchPrg1         331                        365                            331
BenchPrg2        1781                       1781                           1781
BenchPrg3         723                        698                            698
BenchPrg4         431                        431                            431
BenchPrg5         536                        536                            536
BenchPrg6         438                        438                            438
BenchPrg7         582                        582                            582
BenchPrg8         411                        411                            411
BenchPrg9          87                         87                             87
BenchPrg10       3731                       3731                           3731
BenchPrg11        192                        192                            192
BenchPrg12         89                         89                             89
BenchPrg13        111                        111                            111
BenchPrg14         62                         62                             62
BenchPrg15        568                        568                            568
BenchPrg16        185                        185                            185
BenchPrg17        402                        402                            402

Figure 7.6: Size of the optimized benchmark programs

Figure 7.7: Optimization time for metric-based phase-ordering

In one of them (BenchPrg1), the best regular expression computes a program with a smaller size than the classic metric-based algorithm, while in the other (BenchPrg3), the classic metric-based algorithm performs better than the best regular expression in terms of size. In both cases, the new mechanism presented in this section produces the program with the smallest size.

Finally, introducing the detection of non-improving transformation calls slightly increased the optimization time compared to the metric-based choice without any weights, as illustrated in Figure 7.7.

The corresponding table of data can be found in Appendix A.5. This overhead is compensated by the fact that the mechanism using this detection is likely to produce better results for the size of the program, as for BenchPrg1.

Thus, this new version of the phase-ordering algorithm, using weights and a mechanism for detecting non-improving transformation calls, is an interesting improvement and proved efficient on these benchmark programs. However, it requires some knowledge about the interactions between the transformations in order to spot the sequences that are efficient even though the individual transformation calls are not.

Design and implementation

This chapter deals with the design and implementation of the Phase Manager in Java, together with the auxiliary modules used for the manager, the benchmarks and the metrics. After a discussion of the choice of implementation language, the second section describes the main classes and methods involved in the optimization part of the compiler framework, while the last section addresses the implementation issues that arose during this thesis.
