
Regression Analysis

In document Finance and Strategic Management (pages 102-110)


7 CONGLOMERATE ANALYSIS

7.2.4 Regression Analysis

Troels Hedegaard & Morten N. Nielsen

7.2.3.1.7 Oil price vs. world growth

As expected, although somewhat weak, a negative correlation between world growth and oil price is identified. Historically, a low level of oil prices has ‘fueled’ the economy, increasing world economic growth (Bowler, 2015 January 19).

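The regression model takes the general form:

$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k + \epsilon$$

Where:

$Y$ is the dependent variable, share price

$\beta_0$ is the intercept

$\beta_1$ to $\beta_k$ are the estimated coefficients of the independent variables $X_1$ to $X_k$

$\epsilon$ is the stochastic error term, with $E(\epsilon) = 0$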

(Bowerman et al., 2012; Koed & Jørgensen, 2008; Bøye, 2009).

7.2.4.2 Coefficient of determination

As opposed to correlation, the regression analysis examines the relationship between the dependent variable and each independent variable within the context of the other selected variables. Thus, the regression is better able to isolate the effects of the included variables from each other, which in turn provides a better foundation for interpretation. While the final model is not expected to be able to predict the future stock price, the criterion of success lies in the accuracy of the model. Hence, the focus is on the so-called ‘least squares method’, in which the sum of squared deviations between actual and predicted values is minimized (Investopedia, Least Squares Method; Boomer, Least Squares Criteria):

$$\min \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Where:

$y_i$ is the observed value

$\hat{y}_i$ is the predicted value of $y_i$
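As a minimal sketch of the least squares criterion, the Python snippet below fits a straight line to a single predictor and checks that moving the coefficients away from their estimates increases the sum of squared errors. The thesis model has several predictors; the single-predictor case, the function names and the data are assumptions chosen purely for illustration.

```python
def fit_line(x, y):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted values."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical observations, not data from the thesis.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(x, y)
best = sum_squared_errors(x, y, b0, b1)

# Any other line yields a larger sum of squared errors.
perturbed = sum_squared_errors(x, y, b0 + 0.1, b1 - 0.05)
print(b0, b1, best, perturbed)
```

With several predictors the same criterion applies, but the coefficients are obtained by solving a system of equations rather than from this closed form.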

To assess the fit of the model, several measures of variation must be calculated, resulting in the coefficient of determination (Weiss, 2011):

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$

$$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

Where:

$\bar{y}$ is the mean of the observed values

Since the total sum of squares (SST) can be divided by n - 1 to get the sample variance, it is indeed a measure of total variation. The total variation can be separated into two types: the variation explained by the regression model and the remaining unexplained variation. SSR denotes the regression sum of squares (i.e. explained variation), while SSE reflects the rest. From these measures, the overall fit of the model can be established, known as the coefficient of determination, or $R^2$ (Weiss, 2011; Spiegel & Stephens, 2008):

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

Thus, r-squared quantifies the part of the total variation, in percentage terms, that is explained by the model.
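The decomposition can be checked numerically. The sketch below, written under the assumption of a single predictor and hypothetical data, computes SST, SSR and SSE for a least squares fit and shows that both expressions for the coefficient of determination agree. The function name and the data are illustrative only.

```python
def variation_measures(y, y_hat):
    """Return (SST, SSR, SSE) for observed values y and predictions y_hat."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return sst, ssr, sse


# Hypothetical observations, not data from the thesis.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Least squares fit with one predictor (closed-form solution).
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

sst, ssr, sse = variation_measures(y, y_hat)
r_squared = ssr / sst
print(sst, ssr, sse, r_squared, 1 - sse / sst)
```

The identity SST = SSR + SSE holds for a least squares fit that includes an intercept; for predictions from any other source the two expressions for r-squared need not coincide.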

However, the measure becomes artificially inflated when additional variables are added to the model, and therefore provides a misleading foundation from which to assess the quality of the model. To mitigate this, we instead calculate adjusted r-squared, which corrects for this unintended effect (Bowerman et al., 2012):

$$R_a^2 = 1 - (1 - R^2) \left( \frac{n - 1}{n - k - 1} \right)$$

Where:

$R_a^2$ is adjusted r-squared

$n$ is the number of observations

$k$ is the number of independent variables

7.2.4.3 P-values and significance level

The regression analysis serves to estimate the size of the coefficients of the variables. For each coefficient, a null hypothesis is tested, which in this case is that the coefficient is zero (i.e. no relationship exists between the independent variable and the dependent variable). The thesis applies the p-values of the regression output to test the null hypothesis.

These values reflect the probability of observing a relationship at least as strong as the one found if the null hypothesis ($\beta_i = 0$) were true, i.e. if the relationship had arisen purely by chance. In other words, the lower the p-value of a predictor, the stronger the evidence that it contributes to the model. We assume a significance level of five percent in this analysis, which indicates the level of uncertainty that can be accepted, and is therefore the determining factor of whether or not variables are included in the final model (Frost, 2013):

$$H_0: \beta_i = 0$$

$$H_a: \beta_i \neq 0$$

Significance level = 5%
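To make the decision rule concrete, the following sketch tests the slope of a single-predictor regression against zero. It is not the procedure used in the thesis, which reads p-values from the regression output. As an assumption made to stay within the Python standard library, the two-sided p-value uses a normal approximation to the t-distribution, which understates p-values in small samples. The function name and all data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def slope_test(x, y, alpha=0.05):
    """Return (slope, t_stat, p_value, reject_h0) for H0: slope = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = sqrt(sse / (n - 2) / s_xx)  # standard error of the slope
    t_stat = slope / std_err
    # Normal approximation (assumption): exact p-values need the t-distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return slope, t_stat, p_value, p_value < alpha


# Hypothetical data with a clear linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
strong = slope_test(x, [2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Hypothetical data with no linear relationship (the slope is exactly zero).
none = slope_test(x, [3.0, 1.0, 2.0, 2.0, 1.0, 3.0])

print(strong)
print(none)
```

In the first case the null hypothesis is rejected at the five percent level; in the second it cannot be rejected, so such a variable would be a candidate for exclusion.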

7.2.4.4 Assumptions

Statistical tests require certain underlying assumptions to be met for the results to be considered reliable. Failure to test these assumptions could result in over- or underestimation of significance and consequently type one or type two errors. In fact, with each added independent variable, the risk of unreliable results increases, as each succeeding variable could claim some of the error variance left over by the unreliable variable. The following key assumptions will be tested upon reaching the final regression model (Osborne & Waters, 2002; Duke University, 2016):

7.2.4.4.1 Homoscedasticity

This assumption entails constant variance of the errors. If the assumption is not met, the standard deviation of the errors becomes difficult to estimate. Consequently, the resulting confidence intervals could become either too wide or too narrow, thereby distorting the perceived quality of the conclusions. It might also give disproportionate weight to the observations with the largest error variance.

The assumption can be tested by first plotting the residuals and the predicted values, and secondly plotting residuals versus time. The assumption is met if no systematic increases in residuals are found in either plot.
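As a rough numerical companion to the visual check, and not the method used in the thesis, the sketch below orders the residuals by predicted value and compares the mean squared residual of the upper half with that of the lower half. A ratio far from one hints at a systematic change in the spread. The function name and the residuals are hypothetical.

```python
def spread_ratio(predicted, residuals):
    """Mean squared residual of the upper half divided by that of the lower half,
    with observations ordered by predicted value."""
    pairs = sorted(zip(predicted, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]

    def mean_square(values):
        return sum(v * v for v in values) / len(values)

    return mean_square(high) / mean_square(low)


# Hypothetical predicted values and residuals, for illustration only.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
even_spread = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
fanning_out = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]

ratio_even = spread_ratio(predicted, even_spread)
ratio_fan = spread_ratio(predicted, fanning_out)
print(ratio_even, ratio_fan)
```

Passing observation numbers instead of predicted values gives the corresponding check for residuals versus time.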

7.2.4.4.2 Linearity and additivity

The quality of the final model also depends upon the assumption of linearity and additivity of the relationship between the dependent and the independent variables. This implies that the predicted values are a linear function of each independent variable when the other variables are held fixed. Furthermore, the effects of the independent variables on the dependent variable must be additive.

In order to test for non-linearity, a plot of observed values versus predicted values is constructed. If the points are symmetrically distributed around a diagonal line, with a constant variance, the assumption is considered upheld.

7.2.4.4.3 Statistical Independence

This assumption, that the errors are uncorrelated with each other, is particularly important when working with time series data, which is the case in this analysis. Autocorrelation among the residuals is very serious, as it indicates a model specification of very poor quality.

Testing for autocorrelation is done by plotting residuals versus observation number and looking for patterns.
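The thesis relies on a visual check. As a numerical complement, which is a substitute named here and not the authors' method, the Durbin-Watson statistic summarizes first-order autocorrelation in the residuals: values near 2 suggest none, values well below 2 suggest positive autocorrelation, and values well above 2 suggest negative autocorrelation. The residuals below are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Hypothetical residuals in time order, for illustration only.
drifting = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]      # long runs of the same sign
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # sign flips every period

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(dw_drifting, dw_alternating)
```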

7.2.4.4.4 Normality

If the assumption of normality of the error distribution is not met, it could pose a substantial problem for determining if the coefficients are significantly different from zero and for estimating confidence intervals.

Testing for normality requires one to construct a plot of the fractiles of the error distribution versus the fractiles of a normal distribution. The condition is considered met if the points fall close to the diagonal line. While a bow-shaped pattern indicates excessive skewness, an S-shaped pattern would indicate excessive kurtosis.
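The points of such a fractile plot can be computed as sketched below. The plotting positions (i + 0.5) / n are one common convention and an assumption here, as are the function name and the residuals; the thesis does not state which convention its plots use.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical, observed) fractile pairs for a normal fractile plot."""
    n = len(residuals)
    mu = mean(residuals)
    sigma = stdev(residuals)
    # Standardize and sort the residuals to obtain the observed fractiles.
    observed = sorted((r - mu) / sigma for r in residuals)
    # Matching fractiles of the standard normal distribution.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


# Hypothetical residuals, for illustration only.
points = qq_points([0.3, -1.2, 0.8, -0.1, 0.2])
for theoretical, observed in points:
    print(round(theoretical, 3), round(observed, 3))
```

Plotting the pairs against the line where both coordinates are equal gives the diagonal comparison described above.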

7.2.4.5 Regression

The initial model upon which the regression analysis will be conducted contains all variables:

$$y_{stock} = \beta_0 + \beta_1 X_{growth} + \beta_2 X_{BDI} + \beta_3 X_{rate} + \beta_4 X_{costs} + \beta_5 X_{volume} + \beta_6 X_{oilPrice} + \beta_7 X_{oilProd} + \beta_8 X_{roicML} + \beta_9 X_{roicMOG} + \epsilon$$

Where:

$X_{growth}$: World Growth measured by GDP

$X_{BDI}$: Baltic Dry Index

$X_{rate}$: Rate per Forty-Foot Equivalent (FFE)

$X_{costs}$: Unit Costs

$X_{volume}$: Transportation Volume of ML

$X_{oilPrice}$: Average Oil Price for the Period

$X_{oilProd}$: Oil Production for the Period by MOG

The regression process progressed as follows. The full outputs of each regression can be found in appendices 14 to 16.

1. World Growth had the highest p-value with 0.86 and was therefore excluded

2. Of the remaining eight variables ‘Transported Volumes’ contributed least to the quality of the model and was therefore excluded

3. In the third model, ‘Rate per FFE’ had a p-value of 0.70 and was therefore excluded

Final model:

$$y_{stock} = 24{,}683 - 2.5 X_{BDI} - 15.4 X_{costs} + 24 X_{oilPrice} + 4.54 X_{oilProd} + 1{,}833 X_{roicML} + 74 X_{roicMOG} + \epsilon$$
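The stepwise exclusion described above resembles backward elimination, which can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: `fit_pvalues` stands for any routine that refits the model on the given variables and returns their p-values, and the toy version below simply looks up fixed, hypothetical p-values for generic variables `x1` to `x4`. In a real regression the p-values change after every refit.

```python
def backward_eliminate(variables, fit_pvalues, alpha=0.05):
    """Repeatedly drop the variable with the highest p-value until every
    remaining variable is significant at level alpha."""
    remaining = list(variables)
    dropped = []
    while remaining:
        pvalues = fit_pvalues(remaining)
        worst = max(remaining, key=lambda name: pvalues[name])
        if pvalues[worst] < alpha:
            break  # all remaining variables are significant
        remaining.remove(worst)
        dropped.append(worst)
    return remaining, dropped


# Hypothetical p-values for generic variables; not results from the thesis.
HYPOTHETICAL_PVALUES = {"x1": 0.86, "x2": 0.01, "x3": 0.70, "x4": 0.03}


def toy_fit(names):
    """Stand-in for refitting the regression (assumption: fixed p-values)."""
    return {name: HYPOTHETICAL_PVALUES[name] for name in names}


kept, dropped = backward_eliminate(["x1", "x2", "x3", "x4"], toy_fit)
print(kept, dropped)
```

Removing one variable at a time matters because the significance of the remaining variables can shift once a correlated variable leaves the model.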

The final model has an adjusted r-squared of 0.68, which by any standards would be considered high for a model attempting to map the variations in something as complex as the stock price of a company. Although the adjusted coefficient of determination is considerably higher than expected and all remaining variables have p-values below the chosen significance level, it is important to gauge the accuracy of the coefficients.

One way to do this is to view each estimated coefficient within the context of the confidence interval. In this case, the confidence interval chosen is 95%:

Figure 37 - Regression Output

Source: Own creation based on applied variables

The ‘Accuracy’ column indicates the size of the coefficient compared to the width of the confidence interval. Hence, a higher accuracy measure indicates that the width of the interval is smaller in comparison to the size of the coefficient:

$$Accuracy = \frac{\sqrt{X^2}}{Upper - Lower}$$

Where $X$ is the estimated coefficient, and $Upper$ and $Lower$ are the bounds of its 95% confidence interval.
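A minimal sketch of this measure is shown below. The square root of the squared coefficient is simply its absolute value, so negative coefficients are treated the same as positive ones. The coefficient and interval bounds in the example are hypothetical and are not taken from Figure 37.

```python
from math import sqrt


def accuracy(coefficient, lower, upper):
    """Size of the coefficient relative to the width of its confidence interval."""
    return sqrt(coefficient ** 2) / (upper - lower)


# Hypothetical coefficient with a narrow interval that excludes zero.
narrow = accuracy(-2.0, -3.0, -1.0)

# Hypothetical coefficient with a wide interval that includes zero.
wide = accuracy(2.0, -2.0, 6.0)

print(narrow, wide)
```

For an interval that is symmetric around the coefficient, a value above 0.5 means the half-width is smaller than the coefficient, i.e. the interval does not contain zero.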

The results suggest that multiple variables are somewhat indicative of stock price movements of Maersk. This is unsurprising, considering the complexity of the Group as a whole, and the wide exposure of each BU. In fact, there are probably thousands of omitted variables that could each account for a small part of the variation. Looking closely at each variable, however, there are several unexpected results.

7.2.4.5.1 Testing Assumptions

The output from the testing methods can be found in appendices 17 to 21.

The assumption of homoscedasticity is considered met. Plotting residuals versus predicted values and residuals versus observation number showed no disconcerting patterns. The second assumption of linearity and additivity produced equally satisfying results. The points in the plot of predicted values versus observed values were distributed somewhat symmetrically around the diagonal line, with a satisfactorily consistent variance. The third assumption of statistical independence is likewise considered met, as no patterns were found in the plot of residuals versus observation number. However, when testing for normality the results were inconclusive. No indication of normality was found, which unfortunately reduces the quality of the model. The lack of normality in the data is likely a by-product of the small sample size available.


7.2.4.6 Regression output

In the following sections, the results of the regression analysis found most relevant to the problem statement of the thesis will be analyzed.

7.2.4.6.1 Baltic Dry Index

The BDI, which is used as an indicator of the global freight rate level, seems to have a negative relationship with the price of the stock. The intuition behind this variable says that as freight rates go up, ML would benefit. However, as the correlation analysis also indicated, rising freight rates in the current market conditions often arise from increasing costs, rather than an opportunity to capture more value.

7.2.4.6.2 Unit Costs

Unit Costs exhibit a negative relationship with stock price in the final model. This is consistent with the findings in the correlation analysis; a conclusion which is further strengthened by the degree of accuracy and low p-value of the variable. The contrast between unit costs and the freight rate indicator BDI perfectly illustrates the tendency in the industry to capture value by decreasing costs, as the current market limits the option to increase freight rates. Thus, the success of the cost leader strategy for ML is crucial in order to sustain profitability and deliver value to shareholders.

7.2.4.6.3 Oil Price

The positive sign of the oil price variable in the model is not in line with the results from the correlation analysis. The changed outcome could be due to the fact that the negative effects of increasing oil prices are captured by the unit cost and freight rate variables, leaving mainly the positive effect it has for MOG on the stock price. Combined with the above-average accuracy of the variable, it seems that the correlation results have been misleading.

Figure 38 - Share Price vs. Oil Price Development

Source: Own creation based on data from Yahoo Finance

The graph illustrates a somewhat clear correlation between the oil price and the share price, which further strengthens the findings in the regression analysis. However, a lag can be observed from early 2009, where the share price development is similar to that of the oil price, but with a few months’ delay.

7.2.4.6.4 Oil Production

Similarly, the oil production variable might have captured the effects of other related variables in the correlation analysis, in which almost zero correlation with the stock price was found. In the regression model, however, a positive sign indicates that Maersk has indeed been somewhat able to adjust production volume in line with oil prices.

7.2.4.6.5 Return on Invested Capital (ROIC)

The coefficients of the profitability measures of each BU are positive, which is to be expected. Normally, it is not possible to compare the coefficients of two independent variables in a regression analysis directly; however, since both ROIC variables measure the same thing, it can be argued that the value derived from ML has a larger impact on the share price.

7.3 CONGLOMERATE PERFORMANCE

This section serves to apply and combine the theoretical knowledge and quantitative findings established above, to enable an understanding of the underlying conglomerate performance. The findings will be discussed in the context of Maersk’s ability to enjoy the advantages of being a conglomerate. Attention is directed towards Maersk’s process towards becoming a premium conglomerate, its diversification efforts, economies of scale and scope, and its internal capital market.
