## Modelling and forecasting of offshore wind power fluctuations with Markov-Switching models

### 02433 - Hidden Markov Models

### Pierre-Julien Trombe, Martin Wæver Pedersen, Henrik Madsen

### Course week 10

MWP, compiled June 7, 2011

## Offshore vs Onshore Wind power

There is a large difference in the magnitude of the fluctuations in on and offshore wind power production. Since there is no smoothing effect in offshore environments fluctuations can become larger.

Fluctuations at large offshore wind farms have a significant impact on the control and management strategies of their power output. Critical issue:

keep a balance between power demand and power generation.

The presence of different dynamics in the power fluctutaions indicates the need for regime switching models which can be formulated as HMMs.

## Classical regime-switching approach

In the classical regime-switching approach switches occur between three predefined intervals of the power range: low, medium and high power. This approach can be refined by combining HMMs and dynamical models.

020406080100

Power [% of Pn]

Aug 05 Aug 07 Aug 09 Aug 11 Aug 13 Low power?

Medium power?

High power?

## Offshore wind power data

Case study: Horns Rev I offshore windfarm, installed capacity 160 MW.

Power [% of Pn]

Frequency

0 20 40 60 80 100

02000400060008000 - Time-series of wind power averaged over 10 minutes intervals.

- 49536 data points (16 Feb 2005 to 26 Jan 2006).

- 7885 (16%) missing data.

- Bounded on [0,100] % of Nominal power.

I 3023 (6.1%) data points lie at max production.

I 870 (1.8%) data points lie a zero production.

The power production fluctuates differently when at max (100% of Pn) or min (0% of Pn) capacity compared to when at mid capacity. This is indicated by the histogram above, where Pn is the nominal production capacity (installed capacity).

## Heteroscedasticity of data

The variance of the data is not homogeneous. This can be accounted for in Auto Regressive Conditional Heteroscedastic (ARCH) models or Generalized ARCH (GARCH) models.

0.0 0.1 0.2 0.3 0.4

Squared residuals

Sept 05 Oct 05 Nov 05 Dec 05

## Model specification

To model the stochastic behavior of a given wind power time seriesxt, a MS(s)-AR(r)-GARCH(p,q) model is proposed as follows:

xt =θ^{(C}_{0}^{t}^{)}+

r

X

i=1

θ_{i}^{(C}^{t}^{)}xt−i+t

t ∼N(0, σ^{2}t)
σt^{2}=a^{(C}_{0}^{t}^{)}+

q

X

j=1

a_{j}^{(C}^{t}^{)}^{2}t−j+

p

X

i=1

b^{(C}_{i} ^{t}^{)}σt^{2}−i

withCt a first order Markov chain, which governs the switches between different parameter values in the dynamical models (AR or GARCH) in each regime. As alwaysΓis the probability transition matrix forCt.

## Estimation scheme

Numerical maximizaton of the likelihood is computationally infeasible in general.

A Bayesian estimation approach with Markov Chain Monte Carlo (MCMC) is simpler.

MCMC algorithm: loop until convergence over I Sample the state sequence (Data Augmentation).

I Sample the transition probabilities (Dirichlet distribution).

I Sample the state dependent AR and GARCH parameters (Giddy-Gibbs sampler).

I Compute the Bayes estimates (posterior distributions).

## Data Augmentation (1/2)

For a giventin{1, . . . ,T}and for all statesi in the finite space{1, . . . ,m}:

Pr(Ct =i|C6=t,X^{(T)}=x^{(T)},Θ) = Pr(C^{(T)},X^{(T)}=x^{(T)},Θ)
Pr(C6=t,X^{(T)}=x^{(T)},Θ)

= Pr(X^{(T)}=x^{(T)}|C^{(T)},Θ)Pr(C^{(T)},Θ)
Pr(X^{(T)}=x^{(T)}|C6=t,Θ)Pr(C6=t,Θ)

= Pr(X^{(T)}=x^{(T)}|C6=t,Θ)Pr(Ct =i|C6=t,Θ)
Pr(X^{(T)}=x^{(T)}|C6=t,Θ) ,
whereC6=t is the states at all times except timet. We therefore get

Pr(Ct =i|C6=t,X^{(T)}=x^{(T)},Θ)∝Pr(X^{(T)}=x^{(T)}|C^{(T)},Θ)Pr(Ct =i|C6=t,Θ).

So,Ct can be sampled using the Gibbs sampler.

## Data augmentation (2/2)

Pr(X^{(T)}=x^{(T)}|C^{(T)},Θ) is, for a given sample of the state sequenceC^{(T)}, the
likelihood function

Pr(X^{(T)}=x^{(T)}|C^{(T)},Θ) =

T

Y

t=1

Pr(Xt =xt|X^{(t−1)}=x^{(t−1)},C^{(T)}Θ)

=

T

Y

t=1

1
p2πσ^{2}t

exp −(xt−θ_{0}^{(C}^{t}^{)}+Pr

i=1θ_{i}^{(C}^{t}^{)}xt−i)^{2}
2σ^{2}_{t}

! ,

where theMarkov Propertyleads to

Pr(Ct =i|C6=t,Θ) = Pr(Ct =i|Ct−1=j,Ct+1=k)

= γjiγik

Pm i=1γjiγik

.

## Simulation of synthetic data

Simulating data from a MS(2)-AR(2)-GARCH(1,1) process: Top is the log-variance and bottom is the simulatedCt, i.e. the switching sequence.

0 500 1000 1500 2000

−5051015

0 500 1000 1500 2000

1.01.41.8

state

## Results, synthetic data

True values Prior bounds Means Std. Dev.

θ^{(1)}_{0} 0.1 [-0.3 0.5] 0.080 0.050

θ^{(1)}_{1} 0.3 [-0.2 0.7] 0.227 0.057

θ^{(1)}_{2} -0.1 [-0.5 0.3] -0.074 0.057

a^{(1)}_{0} 0.05 [0 0.2] 0.067 0.012

b^{(1)}_{1} 0.65 [0.2 1] 0.547 0.044

a^{(1)}_{1} 0.2 [0 0.5] 0.307 0.047

θ^{(2)}_{0} 0.5 [0.1 0.9] 0.478 0.052

θ^{(2)}_{1} 0.7 [0.2 1.3] 0.731 0.060

θ^{(2)}_{2} 0.25 [-0.3 0.7] 0.183 0.056

a^{(2)}_{0} 0.25 [0 0.35] 0.319 0.026

b^{(2)}_{1} 0.8 [0 0.7] 0.670 0.020

a^{(2)}_{1} 0.1 [0 0.5] 0.233 0.033

γ_{11} 0.98 [0 1] 0.977 0.005

γ 0.95 [0 1] 0.958 0.009

## Posterior distributions of the AR coefficients

**intercept**

state 1

−0.05 0.05 0.15 0.25

0100200300400500

**lag 1**

0.1 0.2 0.3 0.4 0.5

02004006008001000

**lag 2**

−0.3 −0.1 0.1 0.2

02004006008001000

state 2

0.3 0.4 0.5 0.6 0.7

020040060080010001200

0.5 0.6 0.7 0.8 0.9 1.0

02004006008001000

0.0 0.1 0.2 0.3 0.4

02004006008001000

## Results: Regime characterization on wind power data

Regime 1 has little process noise, regime 2 is the most common with moderate process noise, regime 3 accounts for spikes in the data.

200 400 600 800 1000 1200 1400

0 0.5 1

Wind power

200 400 600 800 1000 1200 1400

0 0.5 1

Pr. regime 1

200 400 600 800 1000 1200 1400

0 0.5 1

Pr. regime 2

0 0.5 1

Pr. regime 3

Inference on the regimes

## Results: Point Forecast

The performance of the point forecast is evaluated as the root mean square
error (R) of the conditional expectation of the one-step-ahead observation
µt+1= E(Xt+1|X^{(t)}=x^{(t)}), i.e.

R= v u u t

1 T−1

T−1

X

t=1

(Xt+1−µt+1)^{2},

whereRhas the unit “percentage of nominal capacity” i.e. [% of Pn].

No. Model R [% of Pn] R[% of Pn]

in sample out-of-sample

1 Persistence (naive) 4.62 4.49

2 AR(3)-GARCH(1,1) 4.65 4.48

3 MS(3)-AR(3)-GARCH(1,1) 4.49 4.29

Clearly the model allowing switches between different dynamical models (no. 3) has a lower out-of-sampleR when compared to the two other modelling approaches.

## Exercise

a) Download and plot the wind data set (winddata.txt) from the course website.

b) Fit AR models with varying number of lags to the data using R’s built in arimafunction.

c) Fit AR models with varying number of lags to the data by writing your own code. Ensure that your optimal likelihood value is similar to the one obtained with thearimafunction.

d) Fit a two state MSAR model to the data by extending your code for fitting AR models. Assumeγ11=γ22= 0.95. Compare the AIC for this model with the AIC for the simple AR model.