• Ingen resultater fundet

Interior Point Method Performance

One of the key differences between this implementation and the Matlab imple-mentation [8] is that the interior point method has been included as an

inte-2A wrapper is essentially a function which calls another function orwrapsit. The purpose of wrapping is to either provide a more general function call or to call the particular function in a specific way.

3Overloading is when you have multiple functions with the same name, and the function with the correct combination of formal parameters will be executed.

6.5 Interior Point Method Performance 95

grated part of the module, so no other programs are needed to provide the initial solution for warm starting the simplex method.

Another benefit of this integration is that the interior point method can be used instead of simplex, when this is more convenient. Furthermore, it opens up for a direct comparison of the adaptive quantile regression performance using simplex and interior point method. A similar comparison has been made by Møller [10], but the algorithms were implemented in different languages, so the timings were not directly comparable.

In this section, the performance of interior point versus simplex of the C im-plementation will be shown but first, for a reference, the interior point method implementation will be compared to quantile regression in R.

6.5.1 Quantile Regression in R

Large efforts have been made to implement efficient methods for quantile re-gression in R [5], so it is natural to compare the current implementation with the methods offered by R. Unfortunately, due to time constraints, it was not possible to test the Frisch-Newton (also known as ”fn”) method in R, since it caused an error with the test set4.

The standard Barrodale and Roberts ”br” simplex based algorithm did however work perfectly on the test set. The test set used was very simple, 20000 random numbers asy and natural splinesxwith six columns - one was intercept or 1.0.

Below is the R session where the quantile regression was tested.

> x<-read.table("xdata_spl.dat")

> x<-as.matrix(x)

> y<-read.table("yrand.dat")

> y<-as.vector(y)

> system.time(rq(y[1:20000,] ~ x[1:20000,],0.5))

[1] 2.06666667 0.08333333 2.13333333 0.00000000 0.00000000

The last line is a vector with the timing results, where the elements represent:

user cpu, system cpu, elapsed, subproc1 and subproc2. The elapsed time (2.133s) is the best measure for performance. The test was run four times and the fastest have been kept.

4The error might be related to the type of data objects used, but due to lack of time, this could not be investigated further.

A driver program was written to do the same test with the C implementation of the interior point method. The best result of four can be seen below.

$ time ./perf_quantreg xdata_spl.dat yrand.dat 20000 6 Seconds IPM 0.586183

real 0m0.697s user 0m0.676s sys 0m0.020s

The UNIX tooltime has been used as a reference. The output of the program states how many seconds have been used on the actual interior point method, where reading data from the files have not been included.

These tests show that on this particular problem size, the Barrodale and Roberts implementation in R takes 2.1330.586 ≈3.6 times longer to finish than the C imple-mentation of interior point method. It would have been very interesting to see if the same would be true for the R implementation of interior point method. It would most likely perform better on this problem size, since the interior point method is generally better at large problems than simplex based algorithms.

6.5.2 Simplex Versus Interior Point Method

During development, there has been little doubt that the simplex method is much faster than the interior point method, when the methods are used on an adaptive data set where the simplex method can use its previous solution, while interior point method must start from scratch every time. A simple test has been run to see exactly how much difference can be observed.

It can be seen in Table6.2that the simplex implementation clearly outperforms the interior point method with a factor of six. This lead in performance can however easily be overcome by the fact that the interior point method does not need continual updates, so less frequent updates are possible. As soon as the predictor is only updated once a day, the interior point method implementation takes the lead by being four times faster than the simplex method. The tests except for ”Simplex 2” were carried out with four bins with 500 points in each and nine spline knots.

The test ”Simplex 2” in Table6.2 has been been adjusted to the settings used by Møller [10] where the performance of the Matlab implementation is used.

6.5 Interior Point Method Performance 97

Method Time between updates Time regression Time total

(hours) ms/point ms/point

IPM

1 119.2 119.3

24 (daily) 4.956 5.048

168 (weekly) 0.740 0.830

672 (every 4 weeks) 0.193 0.283

Simplex 1 19.28 19.40

Simplex 2 1 14.14 14.37

Table 6.2: Performance timings of interior point method and simplex on a data set with one explanatory variable, nine spline knots and a total of 2000 points in the bins. The four quantiles 20%,40%,60% and 80% are being calculated for each update.The timings in the last column include the time spent on forming splines and putting data into the data structure. The test ”Simplex 2” was carried out with other settings.

The adaptive algorithm with 5 bins, 1200 points in each bin, spline knots at bin boundaries and two quantiles (25% and 75%) are calculated in 0.14sper point.

The C implementation of the same algorithm does the same in 0.01437s. The data sets used are not the same, but this should not have a very big influence on the computing time. More importantly however is the fact that the tests have been carried out on two different machines. Since the article is very new, it is unlikely that the performance difference is greater than the factor 10 seen in the timings.

To validate this result, the test should be run again with the same data set and settings in both implementations and on the same processor.

Chapter 7

Tests and Validation

7.1 Introduction

The purpose of this chapter is to show the results of the validation tests described in Section 4.4. Since the module is going to be implemented in a commercial application, it is important to show that the functionality is working properly.

In the first section of this chapter the simple functional tests will be described.

After this, the quantile reliability will be assessed, first in terms of how well it describes the data stored within the program. Afterwards the prediction capabilities of the program will be studied.

All the tests except for some of the simple functional tests will be carried out on the data sets obtained from ENFOR A/S with power predictions, measured production and weather forecasts. The main data set will be described in Section 7.3.2.