Hypothesis Tests - Jan Kloppenborg Møller

Having described the model above, we naturally want to perform some kind of test in order to compare models. To do this we partition the model as

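$$Q_{y_i}(\tau; x) = x_i^T\beta(\tau) = x_{1i}^T\beta_1(\tau) + x_{2i}^T\beta_2(\tau) \qquad (2.85)$$

where $\beta_1 \in \mathbb{R}^K$ and $\beta_2 \in \mathbb{R}^p$. In this setting we want to test $H_0: \beta_2 = 0$ against the alternative. Now set $\hat{S} = \min_{\beta} S$ and $\tilde{S} = \min_{\beta_1} S$ (the latter with $\beta_2 = 0$).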

The simplest case is when the $r_i$ are iid from the asymmetric Laplace density, given by

$$f(u) = \tau(1-\tau)\,e^{-\rho_\tau(u)} \qquad (2.86)$$

The maximum likelihood estimate under the assumption that the $r_i$ come from this distribution yields the estimates described in (2.10). If the $r_i$ follow this distribution, we can form the log-likelihood ratio statistic $L_n = 2(\tilde{S}(\tau) - \hat{S}(\tau))$, which will be $\chi^2$ distributed under the assumption that the $r_i$ follow the asymmetric Laplace density described above. It seems quite implausible, however, that the residuals should follow this distribution, so for a test to be useful it should be more general.
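As a minimal sketch of how $L_n = 2(\tilde{S}(\tau) - \hat{S}(\tau))$ could be computed, consider a toy nested pair: the restricted model has a common intercept only, and the full model adds a group effect, so that each group gets its own $\tau$-quantile. The sketch assumes that $\rho_\tau$ is the usual check function $\rho_\tau(u) = u(\tau - 1\{u < 0\})$ and that $S$ is the sum of check losses of the residuals; the function names are hypothetical and the two-group setup is chosen only because it needs no linear programming.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0}) (assumed form)."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def min_check_sum(y, tau):
    """Minimise sum_i rho_tau(y_i - b) over a single constant b.

    The objective is convex and piecewise linear in b, so a minimum is
    attained at one of the observations; a direct search is enough here.
    """
    return min(sum(check_loss(yi - b, tau) for yi in y) for b in y)


def lr_statistic(y_group0, y_group1, tau):
    """L_n = 2 * (S_tilde - S_hat) for the two-group toy model."""
    # Restricted fit (beta_2 = 0): one common tau-quantile for all observations.
    s_tilde = min_check_sum(list(y_group0) + list(y_group1), tau)
    # Full fit: intercept plus group effect, i.e. a separate quantile per group.
    s_hat = min_check_sum(y_group0, tau) + min_check_sum(y_group1, tau)
    return 2.0 * (s_tilde - s_hat)


print(lr_statistic([1, 2, 3], [11, 12, 13], 0.5))  # well-separated groups
print(lr_statistic([1, 2, 3], [1, 2, 3], 0.5))     # identical groups
```

Since the restricted model is nested in the full one, $\tilde{S} \geq \hat{S}$ and the statistic is never negative; it is zero when the group effect does not improve the fit at all.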

If the $r_i$ are iid but follow some other distribution (with $f(F^{-1}(\tau)) > 0$), then the likelihood ratio statistic is

$$L_n(\tau) = \frac{2\,(\tilde{S}(\tau) - \hat{S}(\tau))}{\tau(1-\tau)\,s(\tau)} \qquad (2.87)$$

with $s(\tau) = 1/f(F^{-1}(\tau))$. $L_n$ converges weakly to a non-central $\chi^2_q$ distribution with non-centrality parameter $\eta(\tau)$, which depends on the inverse of $D_0$, the estimates and $\omega^2(\tau)$. Further, under the null hypothesis $\sup_{\tau \in T} L_n(\tau)$ converges to the central $\chi^2_q$ distribution, where $T = [\epsilon, 1-\epsilon]$ for some $\epsilon \in (0, \tfrac{1}{2})$. Already here we see the difficulties in the test procedure: the test statistic involves the p.d.f. of the residuals, so to calculate it we have to estimate $f(F^{-1}(\tau))$, and we have to take the supremum over an interval of $\tau$'s.
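To illustrate what estimating $s(\tau) = 1/f(F^{-1}(\tau))$ involves, the sketch below uses a difference quotient of the empirical quantile function (an approach often attributed to Siddiqui). This is one common choice and not necessarily the estimator used in [11]; the bandwidth `h` and all function names are assumptions made for the illustration.

```python
import math


def empirical_quantile(sorted_y, p):
    """Left-continuous empirical quantile: smallest y with F_hat(y) >= p."""
    n = len(sorted_y)
    idx = min(max(math.ceil(n * p) - 1, 0), n - 1)
    return sorted_y[idx]


def sparsity_estimate(residuals, tau, h):
    """Difference-quotient estimate of s(tau) = 1 / f(F^{-1}(tau)).

    h is a user-chosen bandwidth (an assumption of this sketch);
    tau - h and tau + h must stay inside (0, 1).
    """
    if not (0.0 < tau - h and tau + h < 1.0):
        raise ValueError("need 0 < tau - h and tau + h < 1")
    r = sorted(residuals)
    upper = empirical_quantile(r, tau + h)
    lower = empirical_quantile(r, tau - h)
    return (upper - lower) / (2.0 * h)


def scaled_lr_statistic(s_tilde, s_hat, tau, s_tau):
    """L_n(tau) = 2 (S_tilde - S_hat) / (tau (1 - tau) s(tau)), as in (2.87)."""
    return 2.0 * (s_tilde - s_hat) / (tau * (1.0 - tau) * s_tau)


# Toy check: an evenly spaced grid on [0, 1] mimics a uniform sample,
# for which f = 1 on the support and hence s(tau) = 1.
grid = [i / 1000 for i in range(1001)]
s_half = sparsity_estimate(grid, 0.5, 0.1)
print(s_half)
print(scaled_lr_statistic(15.0, 2.0, 0.5, s_half))
```

The estimate is sensitive to the bandwidth: a small `h` gives a noisy quotient, a large `h` smooths over the local shape of the density. Taking the supremum over $T$ would additionally require repeating both fits and the sparsity estimate on a grid of $\tau$ values.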

[11] give tests in more general situations; the most general setting is that

$$y_i = x_i^T\beta + \sigma_i u_i \qquad (2.88)$$

with $\sigma_i = x_i^T\gamma$ and $u_i$ iid from a distribution $F$. Even though this might seem a general condition, we still have to assume (or prove) independence of some sort.
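To see how (2.88) relates to the linear quantile model in (2.85), one can work out the conditional quantile it implies. The step below is a sketch under the additional assumptions that $\sigma_i > 0$ and that $u_i$ is independent of $x_i$; $F^{-1}$ denotes the quantile function of $F$.

```latex
Q_{y_i}(\tau; x_i)
  = x_i^T\beta + \sigma_i F^{-1}(\tau)
  = x_i^T\beta + x_i^T\gamma\, F^{-1}(\tau)
  = x_i^T\bigl(\beta + \gamma F^{-1}(\tau)\bigr)
```

Under these assumptions the conditional quantiles are linear in $x_i$ with $\beta(\tau) = \beta + \gamma F^{-1}(\tau)$, so $\gamma$ controls how the coefficients change across $\tau$.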

The test statistics also become very complicated: besides the p.d.f. of $r_i$ we have to estimate $\gamma$, or at least matrices depending on $\gamma$. The estimates of these parameters are themselves quite complicated. So we see that as soon as we leave the assumption that the $r_i$ are iid from the asymmetric Laplace density, things get quite complicated. [11] go through these kinds of tests.

As we will discuss in Chapter 4, we cannot assume that our residuals are independent. Even though the structure in equation (2.88) is quite general, it still requires independence of the $u_i$'s and therefore that the $\sigma_i u_i$ are uncorrelated. In the context of wind power forecasting we will a priori assume correlation between the errors; further, it is not clear how to determine whether we can assume (2.88).

Hypothesis tests will be considered briefly in Chapter 4, for the data set used in the presentation. That chapter will also discuss other ways to measure the performance of quantile regression models.

Chapter 3

About splines

The purpose of this chapter is to describe B-spline basis functions, which will be used in later chapters. First a formal definition of splines is given, together with the motivation for using the special class of splines called B-splines. Then some fundamental properties of B-spline basis functions are given.

Section 3.4 concentrates on cubic B-spline basis functions and how to impose special boundary conditions on them. This section and the Practical Summaries therein give a constructive guide to splines with special boundary conditions.

At the end of the chapter we consider the hat matrix, which takes the space of observations to the space of predictions and thereby gives a basis for comparison between different smoothers or kernels.

3.1 Introduction

The following definition of splines is taken from [8]. It is given to motivate the construction of the B-spline basis functions and the discussion of splines more generally.

Definition 3.1 A spline function $s$ of degree $m$ ($s(x) \in S^m(t_1, \ldots, t_n)$) is a polynomial of degree at most $m$ on each of the intervals defined by the knot sequence $\{t_j\}_{j=1}^{n}$ and the intervals $(-\infty, t_1)$, $(t_n, \infty)$, with the first $m-1$ derivatives varying continuously over the knots.

Before going on with the construction of B-splines, we briefly discuss splines in general and what is so appealing about B-splines. In this context we are going to use the basis functions of $S^m(t_1, \ldots, t_n)$ to estimate or approximate a function $f$ w.r.t. some loss function, e.g. the loss function discussed in Chapter 2. Hence we have $\hat{f} \in S^m(t_1, \ldots, t_n)$, so we can build a prior assumption on how many times $f$ is differentiable into $\hat{f}$ by choosing the right $m$. By controlling the location of the knot sequence we can to some extent control how much local variation $\hat{f}$ can handle.

These arguments may not be very convincing, especially since we would probably normally assume that $f \in C^\infty$, and we can of course neither choose $m = \infty$ nor search for functions in $C^\infty$.

Traditionally $m = 3$ is chosen. One explanation for this is probably, as noted in [9] p. 22, that we are not able to see (from a graph) whether a function is $C^2$ or $C^\infty$. So by choosing $m = 3$, the spline appears to be a $C^\infty$ function.

The number of basis functions is $n + m + 1$; to see this simply count degrees of freedom, which is done explicitly below. This shows that there is a trade-off between the number of basis functions and differentiability. Further, when choosing B-splines the number of intervals where the basis functions have support is proportional to $m$. Other properties of the B-splines will be stated in the next sections of this chapter.

The arguments above are somewhat aesthetic; a more formal argument for choosing $m = 3$ is to look at the minimization problem (3.2) below (see [9] p. 27). $N$ is used instead of $n$ to emphasize that we now look at every observation, while $n$ is used for a knot sequence, which does not necessarily have anything to do with the location of the observations. It should also be emphasized that the minimization is to be done w.r.t. functions.

arg min
