Modelling and forecasting of offshore wind power fluctuations with Markov-Switching models
02433 - Hidden Markov Models
Pierre-Julien Trombe, Martin Wæver Pedersen, Henrik Madsen
Course week 10
MWP, compiled June 7, 2011
Offshore vs Onshore Wind power
There is a large difference in the magnitude of the fluctuations in on and offshore wind power production. Since there is no smoothing effect in offshore environments fluctuations can become larger.
Fluctuations at large offshore wind farms have a significant impact on the control and management strategies of their power output. Critical issue:
keep a balance between power demand and power generation.
The presence of different dynamics in the power fluctutaions indicates the need for regime switching models which can be formulated as HMMs.
Classical regime-switching approach
In the classical regime-switching approach switches occur between three predefined intervals of the power range: low, medium and high power. This approach can be refined by combining HMMs and dynamical models.
020406080100
Power [% of Pn]
Aug 05 Aug 07 Aug 09 Aug 11 Aug 13 Low power?
Medium power?
High power?
Offshore wind power data
Case study: Horns Rev I offshore windfarm, installed capacity 160 MW.
Power [% of Pn]
Frequency
0 20 40 60 80 100
02000400060008000 - Time-series of wind power averaged over 10 minutes intervals.
- 49536 data points (16 Feb 2005 to 26 Jan 2006).
- 7885 (16%) missing data.
- Bounded on [0,100] % of Nominal power.
I 3023 (6.1%) data points lie at max production.
I 870 (1.8%) data points lie a zero production.
The power production fluctuates differently when at max (100% of Pn) or min (0% of Pn) capacity compared to when at mid capacity. This is indicated by the histogram above, where Pn is the nominal production capacity (installed capacity).
Heteroscedasticity of data
The variance of the data is not homogeneous. This can be accounted for in Auto Regressive Conditional Heteroscedastic (ARCH) models or Generalized ARCH (GARCH) models.
0.0 0.1 0.2 0.3 0.4
Squared residuals
Sept 05 Oct 05 Nov 05 Dec 05
Model specification
To model the stochastic behavior of a given wind power time seriesxt, a MS(s)-AR(r)-GARCH(p,q) model is proposed as follows:
xt =θ(C0t)+
r
X
i=1
θi(Ct)xt−i+t
t ∼N(0, σ2t) σt2=a(C0t)+
q
X
j=1
aj(Ct)2t−j+
p
X
i=1
b(Ci t)σt2−i
withCt a first order Markov chain, which governs the switches between different parameter values in the dynamical models (AR or GARCH) in each regime. As alwaysΓis the probability transition matrix forCt.
Estimation scheme
Numerical maximizaton of the likelihood is computationally infeasible in general.
A Bayesian estimation approach with Markov Chain Monte Carlo (MCMC) is simpler.
MCMC algorithm: loop until convergence over I Sample the state sequence (Data Augmentation).
I Sample the transition probabilities (Dirichlet distribution).
I Sample the state dependent AR and GARCH parameters (Giddy-Gibbs sampler).
I Compute the Bayes estimates (posterior distributions).
Data Augmentation (1/2)
For a giventin{1, . . . ,T}and for all statesi in the finite space{1, . . . ,m}:
Pr(Ct =i|C6=t,X(T)=x(T),Θ) = Pr(C(T),X(T)=x(T),Θ) Pr(C6=t,X(T)=x(T),Θ)
= Pr(X(T)=x(T)|C(T),Θ)Pr(C(T),Θ) Pr(X(T)=x(T)|C6=t,Θ)Pr(C6=t,Θ)
= Pr(X(T)=x(T)|C6=t,Θ)Pr(Ct =i|C6=t,Θ) Pr(X(T)=x(T)|C6=t,Θ) , whereC6=t is the states at all times except timet. We therefore get
Pr(Ct =i|C6=t,X(T)=x(T),Θ)∝Pr(X(T)=x(T)|C(T),Θ)Pr(Ct =i|C6=t,Θ).
So,Ct can be sampled using the Gibbs sampler.
Data augmentation (2/2)
Pr(X(T)=x(T)|C(T),Θ) is, for a given sample of the state sequenceC(T), the likelihood function
Pr(X(T)=x(T)|C(T),Θ) =
T
Y
t=1
Pr(Xt =xt|X(t−1)=x(t−1),C(T)Θ)
=
T
Y
t=1
1 p2πσ2t
exp −(xt−θ0(Ct)+Pr
i=1θi(Ct)xt−i)2 2σ2t
! ,
where theMarkov Propertyleads to
Pr(Ct =i|C6=t,Θ) = Pr(Ct =i|Ct−1=j,Ct+1=k)
= γjiγik
Pm i=1γjiγik
.
Simulation of synthetic data
Simulating data from a MS(2)-AR(2)-GARCH(1,1) process: Top is the log-variance and bottom is the simulatedCt, i.e. the switching sequence.
0 500 1000 1500 2000
−5051015
0 500 1000 1500 2000
1.01.41.8
state
Results, synthetic data
True values Prior bounds Means Std. Dev.
θ(1)0 0.1 [-0.3 0.5] 0.080 0.050
θ(1)1 0.3 [-0.2 0.7] 0.227 0.057
θ(1)2 -0.1 [-0.5 0.3] -0.074 0.057
a(1)0 0.05 [0 0.2] 0.067 0.012
b(1)1 0.65 [0.2 1] 0.547 0.044
a(1)1 0.2 [0 0.5] 0.307 0.047
θ(2)0 0.5 [0.1 0.9] 0.478 0.052
θ(2)1 0.7 [0.2 1.3] 0.731 0.060
θ(2)2 0.25 [-0.3 0.7] 0.183 0.056
a(2)0 0.25 [0 0.35] 0.319 0.026
b(2)1 0.8 [0 0.7] 0.670 0.020
a(2)1 0.1 [0 0.5] 0.233 0.033
γ11 0.98 [0 1] 0.977 0.005
γ 0.95 [0 1] 0.958 0.009
Posterior distributions of the AR coefficients
intercept
state 1
−0.05 0.05 0.15 0.25
0100200300400500
lag 1
0.1 0.2 0.3 0.4 0.5
02004006008001000
lag 2
−0.3 −0.1 0.1 0.2
02004006008001000
state 2
0.3 0.4 0.5 0.6 0.7
020040060080010001200
0.5 0.6 0.7 0.8 0.9 1.0
02004006008001000
0.0 0.1 0.2 0.3 0.4
02004006008001000
Results: Regime characterization on wind power data
Regime 1 has little process noise, regime 2 is the most common with moderate process noise, regime 3 accounts for spikes in the data.
200 400 600 800 1000 1200 1400
0 0.5 1
Wind power
200 400 600 800 1000 1200 1400
0 0.5 1
Pr. regime 1
200 400 600 800 1000 1200 1400
0 0.5 1
Pr. regime 2
0 0.5 1
Pr. regime 3
Inference on the regimes
Results: Point Forecast
The performance of the point forecast is evaluated as the root mean square error (R) of the conditional expectation of the one-step-ahead observation µt+1= E(Xt+1|X(t)=x(t)), i.e.
R= v u u t
1 T−1
T−1
X
t=1
(Xt+1−µt+1)2,
whereRhas the unit “percentage of nominal capacity” i.e. [% of Pn].
No. Model R [% of Pn] R[% of Pn]
in sample out-of-sample
1 Persistence (naive) 4.62 4.49
2 AR(3)-GARCH(1,1) 4.65 4.48
3 MS(3)-AR(3)-GARCH(1,1) 4.49 4.29
Clearly the model allowing switches between different dynamical models (no. 3) has a lower out-of-sampleR when compared to the two other modelling approaches.
Exercise
a) Download and plot the wind data set (winddata.txt) from the course website.
b) Fit AR models with varying number of lags to the data using R’s built in arimafunction.
c) Fit AR models with varying number of lags to the data by writing your own code. Ensure that your optimal likelihood value is similar to the one obtained with thearimafunction.
d) Fit a two state MSAR model to the data by extending your code for fitting AR models. Assumeγ11=γ22= 0.95. Compare the AIC for this model with the AIC for the simple AR model.