For many financial and economic time series, valuable information about the variance is
contained in past observations, implying that forecasts at time t should be conditioned on the information set Ψ_{t-1}. For many purposes, conditional forecasts
are therefore preferable to unconditional forecasts (Enders 2010). In the following, the
influential ARCH model and its generalized equivalent are presented.

**3.3.1 The ARCH model**

Engle (1982) introduced the autoregressive conditional heteroskedasticity (ARCH) model, by
which the dynamic dependency in the variance can be exploited^{19}.

*“For real processes one might expect better forecast intervals if additional information from the past were allowed to affect the forecast variance…”*

Engle (1982)

From a forecasting perspective, the conditional mean and variance of the return r_{t} at time t are given as

(3.14) $\mu_t = E(r_t \mid \Psi_{t-1})$

and

(3.15) $\sigma_t^2 = \operatorname{var}(r_t \mid \Psi_{t-1}) = E\big((r_t - \mu_t)^2 \mid \Psi_{t-1}\big)$

meaning that the conditional expected return as well as the expected squared deviation from
the mean are functions of the information set Ψ_{t-1}.

18 The issue of integration relates to a number of interesting topics in econometrics outside the scope of this thesis.

19 In Engle (1982) the class of models is introduced and applied to a time series of inflation demonstrating vast superiority over the unconditional measures.

By letting the conditional variance be parameterized by information in Ψ_{t-1}, the model becomes conditionally heteroskedastic. Engle defined the properties of the ARCH model using the following notation:

(3.16) $Y_t \mid \Psi_{t-1} \sim N(x_t\beta,\, h_t)$

(3.17) $h_t = \alpha_0 + \alpha_1\varepsilon_{t-1}^2 + \dots + \alpha_p\varepsilon_{t-p}^2$

(3.18) $\varepsilon_t = Y_t - x_t\beta$

where Ψ_{t-1} denotes the information set available at time t-1 and β is a vector of unknown
parameters that, together with x_{t}, determines the mean of Y_{t}, so that ε_{t} expresses the shock or innovation at time t
(Engle 1982). In its simplest multiplicative form, Engle proposed a specification for ε_{t} such that

(3.19) $\varepsilon_t = \upsilon_t \sqrt{h_t}$

where υ_{t} is a zero-mean white noise process with variance σ_{υ}^{2} = 1. Due to the effect of υ_{t} it follows that
E(ε_{t}) = 0 and that ε_{t} to ε_{t-p} are serially uncorrelated while dependent in their second moment.

For the ARCH(1) process this implies that

(3.20) $E\big((r_t - \mu_t)^2 \mid \Psi_{t-1}\big) = E(\varepsilon_t^2 \mid \varepsilon_{t-1}, \varepsilon_{t-2}, \dots) = \alpha_0 + \alpha_1\varepsilon_{t-1}^2$

whereby a large realized shock in t-1 will be reflected in the conditional variance at t (Enders
2010). Whereas returns can be negative as well as positive, only positive values make
sense for variances. This requires α_{0} > 0 and α_{1},…, α_{p} ≥ 0. The
stationarity of the process is assured by further restricting the parameters so that 0 ≤ ∑^{p}_{i=1} α_{i} < 1.

In effect, since α_{1},…, α_{p} cannot be negative, the minimum value of α_{0} + α_{1}ε^{2}_{t-1} + … + α_{p}ε^{2}_{t-p} is α_{0}, so the conditional variance can never be negative.

If α_{i} = 0, the corresponding term drops out of the equation, which for the ARCH(1) model with α_{1} = 0 means that no ARCH effects
are present in the data.
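As a minimal sketch of the properties above, the following Python snippet simulates an ARCH(1) process using ε_{t} = υ_{t}√h_{t} and h_{t} = α_{0} + α_{1}ε^{2}_{t-1}. The parameter values (α_{0} = 0.7, α_{1} = 0.3), the Gaussian choice for υ_{t}, the seed and the helper names `simulate_arch1` and `autocorr` are assumptions made for illustration and are not taken from Engle (1982). Under these assumptions the shocks should show almost no first-order autocorrelation while their squares are clearly autocorrelated, and the sample variance should lie close to α_{0}/(1 − α_{1}), the standard unconditional variance of a stationary ARCH(1) process.

```python
import math
import random


def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate eps_t = v_t * sqrt(h_t) with h_t = alpha0 + alpha1 * eps_{t-1}^2."""
    if alpha0 <= 0 or not 0 <= alpha1 < 1:
        raise ValueError("need alpha0 > 0 and 0 <= alpha1 < 1")
    rng = random.Random(seed)
    eps_prev = 0.0  # assumed starting value for the first lagged shock
    eps, h = [], []
    for _ in range(n):
        h_t = alpha0 + alpha1 * eps_prev ** 2          # conditional variance
        eps_t = rng.gauss(0.0, 1.0) * math.sqrt(h_t)   # shock with unit-variance noise
        h.append(h_t)
        eps.append(eps_t)
        eps_prev = eps_t
    return eps, h


def autocorr(x, lag=1):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


eps, h = simulate_arch1(alpha0=0.7, alpha1=0.3, n=20000, seed=42)
sq = [e * e for e in eps]

print("autocorrelation of shocks:        ", round(autocorr(eps), 3))
print("autocorrelation of squared shocks:", round(autocorr(sq), 3))
print("sample variance of shocks:        ", round(sum(sq) / len(sq), 3))
print("smallest conditional variance:    ", round(min(h), 3))
```

The last printed value also illustrates the lower bound discussed above: with non-negative α_{1}, the conditional variance never falls below α_{0}.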

The ARCH models can be estimated using varying lag lengths, denoted p, so that the conditional
variance h_{t} at time t depends on the squared shocks weighted by α_{1} through α_{p} in equation 3.17. Inserting the ARCH(p) specification into the expression for ε_{t} in equation 3.19 gives

(3.21) $\varepsilon_t = \upsilon_t \sqrt{\alpha_0 + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2}$

**3.3.1.1 Weaknesses of the ARCH model**

Despite the Nobel Prize awarded to Engle^{20}, the ARCH model has some weaknesses of
importance for practical application to financial data.

First, as the ARCH model is purely descriptive, it provides no guidance as to the causes of the
behavior of the data. Nwogugu (2006) notes that, as such, the ARCH class models are naïve as
they assume that volatility can be explained solely through mechanical descriptive analysis,
ignoring other sources of volatility such as liquidity, psychology or legal issues^{21}.

Second, the ARCH models assume symmetry in reactions to positive and negative shocks. This follows from the structure of the model, which reacts to the square of the previous period's realizations and thereby considers only the magnitude of the residuals, not their sign. As described in Sections 2.1.3 and 2.1.5, this may be of importance as a difference is expected ex ante.

Third, as the ARCH model is a short memory specification, a large number of lags, and hence parameters, may be needed in practice, which gives rise to high data requirements. Fourth, there is a risk that volatility is overpredicted because the model responds slowly to large isolated shocks (Tsay 2005).
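The symmetry noted as the second weakness can be seen directly from the variance equation. The sketch below evaluates the ARCH(1) conditional variance for a positive and a negative shock of equal size; the function name `arch1_variance` and the parameter values are hypothetical and chosen only for illustration.

```python
def arch1_variance(alpha0, alpha1, shock):
    """Conditional variance h_t implied by last period's shock in an ARCH(1) model."""
    return alpha0 + alpha1 * shock ** 2


# Illustrative parameters (assumed): alpha0 = 0.2, alpha1 = 0.5
h_after_gain = arch1_variance(0.2, 0.5, shock=2.0)
h_after_loss = arch1_variance(0.2, 0.5, shock=-2.0)

# Because the shock enters squared, its sign has no effect on the variance.
print(h_after_gain, h_after_loss)
```

Since only the squared shock enters, any model that is to treat good and bad news differently needs an additional term that depends on the sign of the shock.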

**3.3.2 The Generalized ARCH model**

By letting the conditional variance process mimic an ARMA process, Bollerslev (1986) introduced the generalized ARCH (GARCH) model. This model has been shown to accommodate financial time series well, especially volatility clustering and excess kurtosis.

The error process follows the multiplicative definition in equation 3.19, where υ_{t} still represents a white
noise process with variance σ_{υ}^{2} = 1. The heteroskedastic variance process h_{t} in the generalized
ARCH is, however, revised to encompass a moving average term. The conditional variance thus
depends on lagged squared residuals as well as lagged estimates of variance so that

(3.22) $h_t = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{i=1}^{p} \beta_i h_{t-i}$

20 Robert F. Engle received the 2003 Nobel Prize in Economic Sciences “for methods of analyzing economic time series with time-varying volatility (ARCH)” (Web: nobelprize.org 2010)

21 While this is true, the incorporation of independent variables in the model specifications may provide a valid remedy to such objections.

In response to the criticism of Engle's model, the advantages of the generalized version are
clear. While the generalized volatility model contains q + p + 1 parameters, rather than the q + 1
of a pure ARCH specification with the same number of lagged squared residuals, the p moving average terms allow for a longer memory in a more
parsimonious representation. Indeed, a simple version such as the well-known GARCH(1,1)
can be shown to mimic the behavior of an ARCH(∞) model, whereby all previous residuals
contribute to the parameterization of the volatility at time t (Anderson et al 2009). This makes
the model easier to identify and estimate and often leads to a less restricted specification^{22} (Enders
2010).
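The ARCH(∞) interpretation can be made explicit by repeated substitution. The derivation below is a sketch that assumes a GARCH(1,1) model with 0 ≤ β_{1} < 1; it is a standard textbook argument rather than a result quoted from the sources cited above.

```latex
\begin{align*}
h_t &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 h_{t-1} \\
    &= \alpha_0 + \alpha_1 \varepsilon_{t-1}^2
       + \beta_1 \left( \alpha_0 + \alpha_1 \varepsilon_{t-2}^2 + \beta_1 h_{t-2} \right) \\
    &= \alpha_0 (1 + \beta_1)
       + \alpha_1 \left( \varepsilon_{t-1}^2 + \beta_1 \varepsilon_{t-2}^2 \right)
       + \beta_1^2 h_{t-2} \\
    &\;\;\vdots \\
    &= \frac{\alpha_0}{1 - \beta_1}
       + \alpha_1 \sum_{j=1}^{\infty} \beta_1^{\,j-1} \varepsilon_{t-j}^2
\end{align*}
```

Each past squared residual thus enters with a geometrically declining weight α_{1}β_{1}^{j-1}, which is how two parameters can stand in for an infinite number of ARCH coefficients.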

To ensure the existence and stability of the unconditional variance, the restriction 0 ≤ ∑α_{i} + ∑β_{i} < 1 must be
fulfilled. If ∑α_{i} + ∑β_{i} ≥ 1, the process is not covariance stationary and our models may fail to
correctly assess future periods' data^{23} (Bollerslev et al 1992). In general, ∑α_{i} + ∑β_{i} expresses the
persistence of the model, that is, how long a shock to the conditional variance remains in the
data. In the GARCH(1,1) model, larger values of α_{1} lead to greater volatility
in the forecasted variance, as it reacts more strongly to recent shocks, while high values of β_{1} indicate higher persistence.
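The role of persistence can be sketched numerically. The snippet below iterates multi-step variance forecasts for a GARCH(1,1) model using the standard recursion E(h_{t+k}) = α_{0} + (α_{1} + β_{1})E(h_{t+k-1}) and compares them with the unconditional variance α_{0}/(1 − α_{1} − β_{1}). All function names and parameter values are hypothetical choices for illustration, not estimates from any data set.

```python
import math


def unconditional_variance(alpha0, alpha1, beta1):
    """Long-run variance of a covariance-stationary GARCH(1,1) process."""
    persistence = alpha1 + beta1
    if not 0 <= persistence < 1:
        raise ValueError("unconditional variance requires 0 <= alpha1 + beta1 < 1")
    return alpha0 / (1.0 - persistence)


def variance_forecasts(alpha0, alpha1, beta1, h_next, horizon):
    """Forecasts of h for 1..horizon steps ahead, given the one-step forecast h_next."""
    persistence = alpha1 + beta1
    forecasts = [h_next]
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + persistence * forecasts[-1])
    return forecasts


def half_life(alpha1, beta1):
    """Number of periods until half of a variance shock has died out."""
    return math.log(0.5) / math.log(alpha1 + beta1)


# Stationary case (assumed values): persistence 0.95, long-run variance 1.0
long_run = unconditional_variance(0.05, 0.10, 0.85)
path = variance_forecasts(0.05, 0.10, 0.85, h_next=4.0, horizon=200)

# Non-stationary case (assumed values): persistence 1.02, forecasts keep growing
explosive = variance_forecasts(0.05, 0.17, 0.85, h_next=4.0, horizon=200)

print("long-run variance:        ", long_run)
print("200-step forecast:        ", path[-1])
print("half-life in periods:     ", half_life(0.10, 0.85))
print("200-step forecast, a+b>1: ", explosive[-1])
```

In the stationary case the forecasts decay geometrically towards the long-run variance, whereas with α_{1} + β_{1} above one they move further away at every step.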

That a low number of parameters is usually sufficient for explaining the second moment of
financial time series has been empirically backed by a vast academic literature. Bollerslev et
al. (1992) review this literature and find that in most research the simple GARCH(1,1), GARCH(1,2)
and GARCH(2,1) were the most widely adopted specifications. This is also the case with large
samples over long time spans. French et al (1987) successfully applied a GARCH(2,1) model
using daily S&P returns to calculate monthly standard deviations over the period from 1928
to 1984. For many purposes, the simple GARCH(1,1) has proven sufficient for modeling
volatility in stock returns, interest rates and foreign exchange rates (Bollerslev et al 1992,
Anderson et al 2009). Hansen and Lunde (2001) tested 330 different volatility model
specifications^{24} on daily currency returns and concluded that no specification could be
shown to significantly outperform the simple GARCH(1,1)^{25}.

22 Because it can be specified using fewer parameters.

23 That is, the conditional variance forecasts will not converge to the unconditional variance.

24 Formally 55 different specifications but tested using different error distributions and mean equations.

25 When the ARCH(1) model was used as benchmark it was clearly dominated by a wide range of models.

In accommodating the stylized facts described in Section 2, the GARCH-type models are effective because they allow for the volatility clusters frequently observed. This is the case because a high value of ε^{2}_{t-i} or h_{t-i} will result in a high h_{t}, whereby the pattern of high volatility following high volatility and low following low is generated. Of importance is also that the GARCH models have been shown to capture the leptokurtic characteristics previously described. This is the case for most GARCH-type models given that β_{i} > 0 (Tsay 2005)^{26}.
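Both features can be checked in a small simulation. The following sketch generates a GARCH(1,1) series with Gaussian υ_{t} and compares its sample kurtosis with that of plain Gaussian noise, for which the kurtosis is 3. The parameter values (α_{0} = 0.1, α_{1} = 0.2, β_{1} = 0.7), the seeds and the helper names are assumptions made for illustration only.

```python
import math
import random


def simulate_garch11(alpha0, alpha1, beta1, n, seed=0):
    """Simulate eps_t = v_t * sqrt(h_t), h_t = alpha0 + alpha1*eps_{t-1}^2 + beta1*h_{t-1}."""
    rng = random.Random(seed)
    h_t = alpha0 / (1.0 - alpha1 - beta1)  # start at the long-run variance
    eps = []
    for _ in range(n):
        eps_t = rng.gauss(0.0, 1.0) * math.sqrt(h_t)
        eps.append(eps_t)
        h_t = alpha0 + alpha1 * eps_t ** 2 + beta1 * h_t  # variance for the next period
    return eps


def sample_kurtosis(x):
    """Fourth central moment divided by the squared variance (3 for a normal sample)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2


def autocorr(x, lag=1):
    """Sample autocorrelation of the sequence x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return cov / var


garch_eps = simulate_garch11(0.1, 0.2, 0.7, n=50000, seed=1)
noise_rng = random.Random(2)
noise = [noise_rng.gauss(0.0, 1.0) for _ in range(50000)]

garch_kurt = sample_kurtosis(garch_eps)
noise_kurt = sample_kurtosis(noise)
cluster = autocorr([e * e for e in garch_eps])

print("kurtosis, GARCH(1,1) shocks:   ", round(garch_kurt, 2))
print("kurtosis, Gaussian noise:      ", round(noise_kurt, 2))
print("autocorr. of squared shocks:   ", round(cluster, 2))
```

Although every υ_{t} is drawn from a normal distribution, the time-varying variance mixes calm and turbulent periods, so the simulated shocks have heavier tails than the Gaussian benchmark, and the positive autocorrelation of the squared shocks reflects the clustering.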

**3.3.2.1 Weaknesses of the GARCH models**

The GARCH models are criticized for imposing restrictions on the parameters that are subsequently violated during estimation. In particular, the restrictions that α_{i} ≥ 0 and β_{i} ≥ 0 are
often violated in practice, leading to disqualification of the specification (Nelson 1991).

Like the ARCH model, the GARCH model is criticized for not giving any insight into the sources of variance. The modeling process is still mechanical and purely descriptive or, in other words, the models are statistical rather than economic. Also, like the ARCH model, the pure GARCH model is unable to capture asymmetric effects.