8. Discussion of Sample, Data Collection and Variables

8.3 Data sampling

The following sub-sections will explain the stepwise process and considerations behind constructing our sample. Firstly, we gather a list of bankrupt firms. Then the group is matched with non-bankrupt firms that share similar characteristics in terms of location, industry and size. Particular emphasis is placed on the selection process, as the underlying data will be used for estimating the coefficients of the discriminant model.

Page 58 of 124

Bankrupt sample

The bankrupt sample was initially intended to be constructed using the Orbis BvD database and its Boolean search tool, which allows searches on multiple criteria. However, after reviewing the extracted data, we deemed the bankruptcy identification criterion to be of poor quality and inconsistent across years, owing to two critical sampling limitations. Firstly, the search tool was unable to recognise that companies with a bankruptcy status in the database, which are naturally no longer publicly listed, may have been listed before filing for bankruptcy. Applying the public-listing criterion therefore limited the sample to companies that had filed for bankruptcy recently and did not cover the entire study period (2012-2018). Secondly, Orbis did not provide a complete picture of all bankruptcies, as bankrupt companies were removed from the database on a rolling basis. Hence, potentially relevant bankruptcies would not have been retrieved.

To avoid these limitations, we use Standard and Poor’s Compustat to compile a list of bankrupt companies. Compustat does not offer a Boolean search tool as detailed as that of Orbis, but its bankruptcy identification of listed companies is more accurate and has been used in previous empirical corporate finance papers (Corbae & D'Erasmo, 2017).

Search criteria

Our sample is extracted by applying the following search criteria.

Criterion 1: US publicly listed companies

The criterion of including only publicly listed companies was chosen to ensure that sample firms have established corporate governance structures. Companies listed on securities exchanges, such as the New York Stock Exchange and NASDAQ, are subject to particular listing guidelines and standards, including the disclosure of corporate governance practices.

Criterion 2: Chapter 11 bankruptcy

Like other databases, Compustat deletes companies that have gone bankrupt but retains a historical log that records the date and reason for deletion. For a deleted firm to be included in the sample, it must have the deletion code 02 (i.e. Chapter 11 bankruptcy). We only include companies that have filed for bankruptcy under Chapter 11 of the US Bankruptcy Code, in line with common practice in previous bankruptcy prediction literature (Betker, 1995; Brockmann et al., 2004; Balcaen & Ooghe, 2006). As discussed in Section 2, by selecting only firms which have filed for bankruptcy under Chapter 11, homogeneity in the bankruptcy status is achieved, and the predictive power of the variables is unaffected by sample heterogeneity. We do not consider other reasons for deletion, such as 01 (mergers and acquisitions), 03 (liquidation associated with Chapter 7) or 04 (reverse acquisition).

Criterion 3: Selected industry exclusion

Finally, we exclude Finance and Insurance as well as Real Estate and Rental and Leasing companies from the sample due to their unique characteristics. For instance, for financial services firms, high leverage is a ‘normal’ part of their business operations and does not necessarily indicate financial distress, as is often the case for non-financial companies. Industry groups are defined according to the North American Industry Classification System (NAICS) (see Appendix 3). The NAICS is used extensively in the United States, Mexico and Canada and has superseded the Standard Industrial Classification (SIC); it is therefore deemed relevant for classification purposes.
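The three search criteria can be read as a single filter over a list of deleted firms. The following is a minimal sketch under stated assumptions and not the extraction procedure actually run in Compustat: the record layout, the field names and the firms are hypothetical. The two-digit NAICS prefixes 52 (Finance and Insurance) and 53 (Real Estate and Rental and Leasing) identify the excluded sectors.

```python
from dataclasses import dataclass


# Hypothetical record layout: the field names are illustrative,
# not Compustat's own variable names.
@dataclass(frozen=True)
class DeletedFirm:
    name: str
    deletion_code: str   # "02" denotes Chapter 11 bankruptcy in the text
    deletion_year: int
    naics: str           # NAICS code stored as a string
    listed_in_us: bool


CHAPTER_11 = "02"
STUDY_PERIOD = range(2012, 2019)          # 2012 to 2018 inclusive
EXCLUDED_NAICS_SECTORS = {"52", "53"}     # two-digit NAICS sector prefixes


def passes_search_criteria(firm: DeletedFirm) -> bool:
    """Apply Criteria 1 to 3 plus the study-period restriction."""
    return (
        firm.listed_in_us
        and firm.deletion_code == CHAPTER_11
        and firm.deletion_year in STUDY_PERIOD
        and firm.naics[:2] not in EXCLUDED_NAICS_SECTORS
    )


# Fictional firms, for illustration only.
firms = [
    DeletedFirm("Alpha Retail", "02", 2014, "448110", True),
    DeletedFirm("Beta Bank", "02", 2015, "522110", True),       # sector 52
    DeletedFirm("Gamma Mining", "01", 2016, "212111", True),    # merger, not bankruptcy
    DeletedFirm("Delta Energy", "02", 2010, "211120", True),    # outside study period
    DeletedFirm("Epsilon Realty", "02", 2017, "531110", True),  # sector 53
]

gross_sample = [f.name for f in firms if passes_search_criteria(f)]
print(gross_sample)  # ['Alpha Retail']
```

Keeping the NAICS code as a string makes the sector test a simple prefix comparison and avoids integer division on codes of varying length.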

Resulting gross sample of bankrupt firms

After applying these criteria, we generate a gross sample of 864 firms that went bankrupt in the period 2012 to 2018. We then refine this gross sample in several steps. Firstly, three duplicate firms were discarded from the sample.

Secondly, due to the nature of our research question, we introduce a size criterion, as the financial ratios of small firms have been found to be less robust and therefore less appropriate for statistical modelling (Balcaen & Ooghe, 2006). Hence, firms with assets below USD 50m in the last available reporting year are discarded. This reduces the sample to 159 firms. From these, we randomly select 60 firms.
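The screening steps (duplicate removal, size threshold, random draw) can be sketched as below. This is an illustration under assumptions, not the procedure actually used: the input format, the function name `screen_and_draw` and the fixed seed are hypothetical, and the toy data are generated purely for demonstration. The USD 50m threshold and the target of 60 firms are taken from the text.

```python
import random

MIN_ASSETS_USD_M = 50.0    # size threshold stated in the text
TARGET_SAMPLE_SIZE = 60    # number of firms drawn in the text


def screen_and_draw(firms, seed=0):
    """Deduplicate, apply the size screen and draw a random sample.

    firms: list of (identifier, total_assets_usd_m) tuples.
    seed:  assumed here only so that the draw is reproducible.
    """
    # Step 1: drop duplicate identifiers, keeping the first occurrence.
    seen = set()
    unique = []
    for firm_id, assets in firms:
        if firm_id not in seen:
            seen.add(firm_id)
            unique.append((firm_id, assets))

    # Step 2: discard firms with assets below the threshold.
    large = [(i, a) for i, a in unique if a >= MIN_ASSETS_USD_M]

    # Step 3: draw without replacement.
    rng = random.Random(seed)
    k = min(TARGET_SAMPLE_SIZE, len(large))
    return rng.sample(large, k)


# Toy data: 200 fictional firms with assets of 1 to 200 (USD m),
# plus three duplicated records.
firms = [(f"F{i:03d}", float(i)) for i in range(1, 201)]
firms += firms[100:103]

drawn = screen_and_draw(firms)
print(len(drawn))  # 60
```

Fixing the seed is a design choice for replicability; the text does not state how the random selection was carried out.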

The number of observations is in line with previous studies and follows Altman’s (1968) methodology. We cross-reference and verify the accuracy of each of the 60 observations with Factiva to ensure that bankruptcy has occurred in said year and find no discrepancies.

Datapoint collection

Financial data for our sample of 60 firms are collected from Bloomberg through their company tickers. Unlike Compustat, Bloomberg does not delete historical data on bankrupt firms and is therefore deemed a superior financial database for this research purpose (Almamy et al., 2016).

Corporate governance variables are collected from the SEC's EDGAR database by reviewing individual company 10-K or DEF 14A filings. To ensure that our data set is complete, we exclude companies for which not all data points for the estimation year (one year prior to bankruptcy) could be retrieved. Our final bankrupt sample contains 51 companies that filed for Chapter 11 bankruptcy in the period 2012 to 2018.

Non-bankrupt sample

The non-bankrupt sample is constructed using the ‘matching technique’ described in Section 8. This method has been widely used in previous research, such as Altman (1968) and, more recently, Almamy et al. (2016) and Chan et al. (2016). As the name suggests, the sample is composed by selecting companies that match the initial bankrupt sample on three parameters: (i) geographic location; (ii) industry; and (iii) asset size. By controlling for these factors, we limit exogenous influences on our model. Additionally, the sample is drawn from the same time period as the bankrupt sample, from 2012 to 2018. We set the maximum asset size to USD 5,000m, as this is the largest observation in the non-bankrupt data set. To construct the non-bankrupt sample group, we use Orbis’ Boolean search function to apply the criteria described above. The sample selection process is shown in Appendix 4.
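One way to make the matching idea concrete is a greedy one-to-one rule: for each bankrupt firm, pick the unused candidate from the same country and industry sector whose total assets are closest. This is a sketch of one possible matching rule only; the text does not specify how closeness in asset size was measured, and the dictionary keys and firm identifiers below are hypothetical.

```python
def match_controls(bankrupt, candidates):
    """Greedy one-to-one matching on country, sector and asset size.

    Both arguments are lists of dicts with the (assumed) keys
    'id', 'country', 'sector' and 'assets'.
    Returns a dict mapping each bankrupt id to a matched candidate id,
    or to None when no candidate shares its country and sector.
    """
    available = list(candidates)
    matches = {}
    for firm in bankrupt:
        pool = [
            c for c in available
            if c["country"] == firm["country"] and c["sector"] == firm["sector"]
        ]
        if not pool:
            matches[firm["id"]] = None
            continue
        # Closest total assets among the remaining candidates.
        best = min(pool, key=lambda c: abs(c["assets"] - firm["assets"]))
        matches[firm["id"]] = best["id"]
        available.remove(best)  # each control is used at most once
    return matches


# Fictional firms, assets in USD m.
bankrupt = [
    {"id": "B1", "country": "US", "sector": "44", "assets": 120.0},
    {"id": "B2", "country": "US", "sector": "44", "assets": 130.0},
    {"id": "B3", "country": "US", "sector": "21", "assets": 900.0},
]
candidates = [
    {"id": "N1", "country": "US", "sector": "44", "assets": 125.0},
    {"id": "N2", "country": "US", "sector": "44", "assets": 400.0},
    {"id": "N3", "country": "US", "sector": "31", "assets": 880.0},
]

print(match_controls(bankrupt, candidates))
# {'B1': 'N1', 'B2': 'N2', 'B3': None}
```

A greedy rule depends on the order in which bankrupt firms are processed, as B2 illustrates: it receives the more distant N2 because N1 has already been assigned.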

Resulting gross sample of non-bankrupt firms

The search yields a sample of 2,563 firms. Following the approach set out for the bankrupt sample, we collect the financials of these firms from Bloomberg using individual company tickers. To select financially sound firms, we screen on operating income (EBIT) and remove companies with three consecutive years of negative EBIT, as they do not display the characteristics of a financially healthy firm. Additionally, we remove companies with total assets below USD 10m to match the bankrupt sample, which ultimately reduces the sample to 731 firms. Next, we use the same technique as employed in previous studies and select 51 non-bankrupt firms to match the bankrupt sample by industry. Finally, the financial data of the non-bankrupt sample are aligned with the bankrupt sample based on the year of bankruptcy. For example, if six companies in the bankrupt sample went bankrupt in 2014, we take the financial data of six non-bankrupt firms from the same time frame. In this way, we ensure that time effects are taken into account.
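The financial-soundness screen can be expressed as a small check on each firm's EBIT history. The sketch below is illustrative: the function names and the year-to-EBIT dictionary format are assumptions, while the rule of three consecutive loss years and the USD 10m asset floor are taken from the text.

```python
MIN_ASSETS_USD_M = 10.0  # asset floor stated in the text for this screen


def has_three_consecutive_losses(ebit_by_year):
    """Return True if EBIT is negative in any three consecutive fiscal years.

    ebit_by_year: dict mapping fiscal year (int) to EBIT.
    A gap in the years breaks a run of losses.
    """
    run = 0
    previous_year = None
    for year in sorted(ebit_by_year):
        consecutive = previous_year is not None and year == previous_year + 1
        if ebit_by_year[year] < 0:
            # Extend the run only if the prior year was also a loss year.
            run = run + 1 if (consecutive and run > 0) else 1
        else:
            run = 0
        if run >= 3:
            return True
        previous_year = year
    return False


def is_financially_sound(ebit_by_year, total_assets_usd_m):
    """Combine the asset floor with the EBIT screen."""
    return (
        total_assets_usd_m >= MIN_ASSETS_USD_M
        and not has_three_consecutive_losses(ebit_by_year)
    )


# Fictional EBIT histories.
print(is_financially_sound({2012: -1.0, 2013: -2.0, 2014: -3.0}, 80.0))            # False
print(is_financially_sound({2012: -1.0, 2013: 5.0, 2014: -2.0, 2015: -3.0}, 80.0))  # True
print(is_financially_sound({2012: 4.0, 2013: 5.0, 2014: 6.0}, 8.0))                 # False
```

Treating a missing year as a break in the run is a conservative assumption; a stricter screen could instead discard firms with incomplete EBIT histories.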

As with the bankrupt sample, corporate governance data are collected from EDGAR (SEC) by reviewing individual company 10-K and DEF 14A filings. This ensures consistency of the data across the two groups.

Estimation sample and secondary (hold-out) sample

The total sample of 102 companies is divided into two groups: (i) the estimation sample and (ii) the secondary sample. The estimation sample comprises 30 bankrupt companies and 30 non-bankrupt companies and is used to construct the prediction model. The sample size is in line with the empirical approach of Altman (1968).

The secondary (or hold-out) sample contains the remaining 21 bankrupt companies and 21 non-bankrupt companies and will be used to validate the prediction model. As these companies were not part of the estimation sample, the validation results are not subject to the upward bias that arises when a model is tested on the data used to estimate it.
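The division into estimation and hold-out samples can be sketched as a seeded random split within each group. How firms were actually allocated is not stated in the text, so the random allocation, the function name `split_sample` and the identifiers are assumptions; only the group sizes (51 per group, 30 of each for estimation) come from the text.

```python
import random


def split_sample(bankrupt_ids, control_ids, n_estimation=30, seed=0):
    """Split each group into an estimation part and a hold-out part.

    Allocation is random within each group (an assumption); the first
    n_estimation shuffled firms form the estimation sample.
    """
    rng = random.Random(seed)

    def split(ids):
        shuffled = list(ids)
        rng.shuffle(shuffled)
        return shuffled[:n_estimation], shuffled[n_estimation:]

    est_b, hold_b = split(bankrupt_ids)
    est_c, hold_c = split(control_ids)
    return {"estimation": est_b + est_c, "holdout": hold_b + hold_c}


# Placeholder identifiers for the 51 bankrupt and 51 non-bankrupt firms.
bankrupt_ids = [f"B{i}" for i in range(51)]
control_ids = [f"N{i}" for i in range(51)]

parts = split_sample(bankrupt_ids, control_ids)
print(len(parts["estimation"]), len(parts["holdout"]))  # 60 42
```

Splitting within each group keeps both samples balanced between bankrupt and non-bankrupt firms, which mirrors the 30/30 and 21/21 composition described above.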

The estimation sample of 60 US firms (30 bankrupt and 30 non-bankrupt) and our secondary sample of 42 US firms (21 bankrupt and 21 non-bankrupt) form the basis for the empirical analysis.