
Copenhagen Business School Master Thesis 15 Sept 2021

matching method has been essential in seeking a statistically significant effect between the issuance of a corporate green bond and the post-issue performance. Without such a method, it would not have been possible to compare the results between the treated (green) and the control (vanilla) bonds. Furthermore, the differences-in-differences statistical framework offers the chance to capture the causal effect of the independent variables on the dependent variables, excluding the change that the dependent variables would have undergone anyway. Moreover, the multivariate linear regression served as a reliable instrument to estimate such causal effects, once its assumptions were ascertained to hold. Once again, using a combination of literature review and quantitative analysis, the final paragraphs answer whether green bond issuers achieved better results than vanilla bond issuers in the post-issuance period in terms of environmental performance and ownership structure.

in Y of the pair. As the estimator is the difference between the two groups in the change over time within each group, such an estimator is named differences-in-differences.

For instance, this study examines the effect of issuing a green bond on the environmental performance of bond issuers, using a differences-in-differences estimator to compare the change in environmental performance for the green issuers with the change in environmental performance for the vanilla issuers.

Looking closely at the differences-in-differences estimator, it is calculated by means of a simple algebraic equation.

$$\beta_1^{\text{diffs-in-diffs}} = (\bar{Y}_{\text{treatment,after}} - \bar{Y}_{\text{treatment,before}}) - (\bar{Y}_{\text{control,after}} - \bar{Y}_{\text{control,before}}) = \Delta\bar{Y}_{\text{treatment}} - \Delta\bar{Y}_{\text{control}} \tag{5.1}$$

Let $\bar{Y}_{\text{treatment,before}}$ be the sample average of Y for those in the treatment group before the experiment, and let $\bar{Y}_{\text{treatment,after}}$ be the sample average for the treatment group after the experiment. Then, let $\bar{Y}_{\text{control,after}}$ and $\bar{Y}_{\text{control,before}}$ be the post- and pre-treatment sample averages for the control group, respectively. The differences-in-differences estimator is the average change in Y for those in the treatment group, minus the average change in Y for those in the control group.
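As a minimal sketch of equation (5.1), the estimator can be computed directly from the four group means; the issuer values below are invented purely for illustration:

```python
# Hypothetical illustration of equation (5.1): the diffs-in-diffs estimator
# computed from the four group means. All values are made up for the sketch.

def diff_in_diffs(treat_before, treat_after, ctrl_before, ctrl_after):
    """Average change in Y for the treatment group, minus that of the control group."""
    mean = lambda xs: sum(xs) / len(xs)
    delta_treatment = mean(treat_after) - mean(treat_before)
    delta_control = mean(ctrl_after) - mean(ctrl_before)
    return delta_treatment - delta_control

# e.g. CO2 emission scores for green (treatment) and vanilla (control) issuers
beta1 = diff_in_diffs(
    treat_before=[50.0, 60.0, 70.0],   # green issuers, pre-issuance
    treat_after=[40.0, 50.0, 54.0],    # green issuers, post-issuance
    ctrl_before=[55.0, 65.0, 75.0],    # vanilla issuers, pre-issuance
    ctrl_after=[53.0, 63.0, 73.0],     # vanilla issuers, post-issuance
)
# ΔȲ_treatment = 48 - 60 = -12, ΔȲ_control = 63 - 65 = -2, so β1 = -10
```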

The differences-in-differences estimator can be written in regression notation. Let $\Delta Y_i$ be the post-experimental value of Y for the $i$th individual minus the pre-experimental value.

The differences-in-differences estimator is the OLS estimator of $\beta_1$ in the regression,

$$\Delta Y_i = \beta_0 + \beta_1 X_i + u_i \tag{5.2}$$
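Equation (5.2) can be illustrated with a hand-rolled OLS fit on toy data: with a single binary regressor, the OLS slope reduces to the difference in the group means of the changes, matching the estimator of equation (5.1). All numbers below are hypothetical:

```python
# Sketch of equation (5.2): regressing the individual changes ΔY_i on the
# treatment dummy X_i reproduces the diffs-in-diffs estimator. Toy data only.

delta_y = [-10.0, -10.0, -16.0,   # changes for the treated (green) issuers
           -2.0, -2.0, -2.0]      # changes for the control (vanilla) issuers
x = [1, 1, 1, 0, 0, 0]            # X_i = 1 if issuer i issued a green bond

n = len(x)
x_bar = sum(x) / n
y_bar = sum(delta_y) / n
# OLS slope with a single regressor: cov(X, ΔY) / var(X)
beta1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, delta_y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
beta0 = y_bar - beta1 * x_bar
# beta1 = mean(ΔY | X=1) - mean(ΔY | X=0) = -12 - (-2) = -10
# beta0 = mean change of the control group = -2
```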

Multiple linear regression

To accurately estimate the relationship between the dependent and independent variables, an ordinary least squares (OLS) multivariate model is applied. The OLS method can be used in determining the differences-in-differences estimator, as the lack of data renders time-series or panel data regression impossible (Stock & Watson, 2012).

The dataset used to estimate the causal effect is an unbalanced panel dataset, as the availability of data over time is poor for a large fraction of the energy firms. Retrieving data on financial fundamentals in a quantity that would allow for a panel data regression was not possible through any available database, if such data even exist.

The multiple regression model is

Yi01X1i2X2i+...+βkXki+ui, i= 1, ..., n (5.3) where Yi is ith observation on the dependent variable; X1i, X2i, ..., Xki are the ith obser-vation on each of the k regressors; and ui is the error term. β1 is the expected change in Yi resulting from changing X1i by one unit, holding constant X2i, ..., Xki. The intercept β0 is the expected value of Y when all the X’s equal 0.

OLS and Differences-in-differences assumptions

A multiple regression model must respect the OLS assumptions. The first three assumptions are the same as those for the univariate regression model, whereas the fourth is specific to the multivariate regression model.

1. The conditional distribution of $u_i$, given $X_{1i}, X_{2i}, \dots, X_{ki}$, has a mean of zero.

2. $(X_{1i}, X_{2i}, \dots, X_{ki}, Y_i)$, $i = 1, \dots, n$, are independently and identically distributed.

3. There is a linear relationship between the dependent and the independent variables.

4. There is no perfect multicollinearity.

Validation of OLS assumption

Tests to validate the assumptions of the multiple linear regression are conducted prior to the analysis.

The first condition tested is homoskedasticity, related to assumption number one: the variance of the error terms $u_i$ observed around the regression line must be constant.

The second assumption of the multiple linear regression states that, for given values of the regressors $X_{ki}$, the error terms $u_i$ of the $Y_i$ are independently and identically distributed; in this study they are additionally assumed to be normally distributed. It is possible to examine such a distribution by means of a "Normal Q-Q" plot, and Figure 5.1 shows that the normality assumption is invalid. It is also possible to test for normality by means of the Shapiro-Wilk hypothesis test, see Table 5.2. The logarithm


Studentized Breusch-Pagan test

BP 39.288

p-value 0.7805

Table (5.1) The table above reports a studentized Breusch-Pagan test. The test uses the following null and alternative hypotheses: the null hypothesis is homoskedasticity, whereas the alternative hypothesis is heteroskedasticity. Since the p-value is not less than 0.05, we fail to reject the null hypothesis: there is not sufficient evidence to say that heteroskedasticity is present in the regression model.
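The studentized Breusch-Pagan statistic of Table 5.1 can be sketched as the n·R² of an auxiliary regression of the squared OLS residuals on the regressors; the data below are simulated and have no relation to the thesis sample:

```python
# Hedged sketch of the studentized (Koenker) Breusch-Pagan statistic on
# simulated data: under homoskedasticity, n * R^2 of the auxiliary regression
# of the squared residuals on the regressors is approximately chi-squared.
import numpy as np

def breusch_pagan_lm(y, X):
    """n * R^2 of the auxiliary regression of squared OLS residuals on X."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ beta) ** 2                      # squared OLS residuals
    gamma, *_ = np.linalg.lstsq(X, e2, rcond=None)
    r2 = 1.0 - np.sum((e2 - X @ gamma) ** 2) / np.sum((e2 - e2.mean()) ** 2)
    return n * r2   # compare against chi-squared with k - 1 degrees of freedom

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])

y_homo = 1.0 + 2.0 * x + rng.normal(size=n)                    # constant variance
y_het = 1.0 + 2.0 * x + (0.1 + 2.0 * x) * rng.normal(size=n)   # variance grows with x

lm_homo = breusch_pagan_lm(y_homo, X)   # small statistic: fail to reject
lm_het = breusch_pagan_lm(y_het, X)     # large statistic: reject homoskedasticity
```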

Shapiro-Wilk normality test

              Pre          Post
W             0.763        0.848
p-value       2.671e-08    7.132e-06

Table (5.2) The table above describes a Shapiro-Wilk hypothesis test. The test uses the following null and alternative hypotheses: the null hypothesis is that the data follow a normal distribution, whereas the alternative hypothesis implies the opposite. Since the p-value is less than 0.05 both before and after the data manipulation, we reject the null hypothesis: the data are not normally distributed in either case.

transformation of the CO2 emission dependent variable is performed. Moreover, we search for outliers in the sample, as shown in Figure 5.2. In Table 5.2, the result of those manipulations is clear: the p-value is higher after both the logarithm transformation and the omission of the outliers, although normality is still rejected.
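The Shapiro-Wilk check and the logarithm remedy can be reproduced on synthetic right-skewed data, assuming SciPy is available; the numbers bear no relation to the thesis sample:

```python
# Sketch of the normality check and the log-transform remedy on synthetic
# data: raw lognormal "emissions" fail Shapiro-Wilk, the log of the data
# is normal by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
co2 = rng.lognormal(mean=10.0, sigma=1.5, size=80)   # right-skewed, like raw CO2 levels

w_raw, p_raw = stats.shapiro(co2)            # p << 0.05: reject normality
w_log, p_log = stats.shapiro(np.log(co2))    # log-transformed data look far more normal
```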

Furthermore, the third assumption requires linearity between the dependent and the independent variables. The plot in Figure 5.3 presents the OLS residuals against the fitted values. The linearity assumption holds in this case.

Finally, the level of multicollinearity, namely assumption number four, is evaluated through the Variance Inflation Factor (VIF). The tolerance is calculated as $tolerance = 1 - R^2$, and $VIF = 1/tolerance$. A VIF value equal to 10 or higher indicates a multicollinearity problem. In the instance of the CO2 regression model, $R^2 = 0.989$, hence $tolerance = 0.011$, and consequently $VIF \approx 90$, well above the threshold.
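The tolerance/VIF arithmetic above can be sketched in a few lines (note that 1/0.011 is, more precisely, about 90.9):

```python
# Minimal sketch of the tolerance/VIF arithmetic described in the text.

def vif(r_squared):
    """Variance inflation factor from the R^2 of regressing one regressor on the others."""
    tolerance = 1.0 - r_squared
    return 1.0 / tolerance

# The CO2 model case: R^2 = 0.989 gives tolerance = 0.011 and VIF ≈ 90.9,
# far above the usual threshold of 10
v = vif(0.989)
```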

Differences-in-differences assumptions


Figure (5.1) Normal Q-Q plots: (a) raw data; (b) after the logarithm transformation of the dependent variable. In order for normality to hold, the standardized residuals should fall on the diagonal line. The remedies for a violation of this assumption are the removal of outliers and a logarithm transformation of the data. Panel (b) shows that, after the transformation, the Normal Q-Q plot is less severely tailed.

Figure (5.2) Boxplots (a) before and (b) after trimming the outliers. In panel (a), there are data points outside the quartiles' borders; after trimming the outliers, the boxplot in panel (b) shows almost no data points outside the quartiles.


Figure (5.3) Residuals against fitted values for the CO2 emissions model; as the plot shows, the linearity assumption is valid in this instance.

Since we are using the differences-in-differences modelling, two further assumptions must be satisfied to perform the multivariate linear regression (Strumpf et al., 2017).

1. The control group must be an adequate proxy for the counterfactual outcome.

2. The treatment must be exogenous, i.e., the treatment should not be driven by a pre-treatment cause, nor by any unmeasured time-varying common cause of the treatment or the outcome.

To ensure the validity of the statistical framework, a plausible argument is presented. The treatment and the control group are in fact exchangeable, or highly comparable, as they present substantive statistical similarities. Indeed, the control group is created by means of a matching method, which is a valid practice when using a differences-in-differences estimator, since matching is a way of controlling for pre-treatment differences between the two groups. In other words, matching minimizes the pre-experiment difference between the control and treated groups, a fundamental practice especially in a quasi-experiment setup (Stuart & Rubin, 2011). Section 5.3 explains the details of the matching method of this study. However, Stuart and Rubin define five key steps to follow when performing matching methods for causal inference:

1. Choose the matching covariates

2. Define what constitutes "similar", i.e., choose a distance measure.

3. Perform the matching through an algorithm.

4. Repeat steps two and three until the optimal matched data sample is obtained.


5. Analyse and model the data.
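The five steps above can be sketched with a naive nearest-neighbour matcher; the issuer names, covariates, and Euclidean distance measure are illustrative assumptions, not the actual method of Section 5.3:

```python
# Hedged sketch of steps 1-5: match each treated (green) issuer to the
# control (vanilla) issuer with the closest covariates. Issuer names,
# covariate values, and the distance measure are invented for the example.

def match_nearest(treated, controls):
    """Pair each treated unit with its nearest control (with replacement)."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5
    pairs = {}
    for name, cov in treated.items():
        pairs[name] = min(controls, key=lambda c: dist(cov, controls[c]))
    return pairs

# steps 1-2: choose covariates (here, standardized size and leverage) and a
# Euclidean distance measure
green = {"G1": (0.2, 1.1), "G2": (-0.5, 0.3)}
vanilla = {"V1": (0.1, 1.0), "V2": (-0.6, 0.4), "V3": (2.0, -1.0)}

# step 3: run the matching algorithm -> {"G1": "V1", "G2": "V2"}
pairs = match_nearest(green, vanilla)
# steps 4-5 would iterate on the covariates/distance until the match is
# satisfactory, then run the diffs-in-diffs regression on the matched sample
```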