
A multivariate Gaussian distribution normally gives a good approximation of the shape point cloud. The region of the space that the point cloud occupies is sometimes called the Allowable Shape Domain [61].

In some cases, the hypothesis of the ellipsoid model breaks down. An example with artificial worm shapes can be found in [61]. In that case, alternative approaches to modelling the point cloud in shape space must be sought. Examples from the literature are the non-linear polynomial point distribution model [228], the non-linear kernel PCA [212], and maximal autocorrelation and maximal noise fractions decomposition [90, 174, 175, 176, 178, 180, 238]. In addition, non-linear Point Distribution Models are treated extensively in [36].

The ellipsoid approximation was found to be adequate in the current project. The evaluation was done by examining the distribution of the PCA parameters; further details can be found in Appendix A.

4.2.2 Selecting the Number of Parameters

When the shape space has been parameterised, using for example the ellipsoid model from the PCA, the number of important parameters necessary to navigate the shape space must be determined.

Obviously, the more parameters, the better the fit of the model; the fewer parameters, the simpler the model. Somewhere in between lies the optimal number of parameters. To determine this number, there must be a criterion for optimality. A large number of criteria exist, ranging from significance tests to graphical procedures. A thorough discussion and testing of the different criteria can be found in [149].

A popular criterion, used very often in shape analysis, is the proportion of the trace of the covariance matrix that is explained by the principal components in the model. In many applications, the number of components to retain is chosen so that they explain 95% of the trace of the covariance matrix; hence, the corresponding eigenvectors explain 95% of the variation seen in the training data. Jackson strongly advises not to use this method, except for initial explorative data analysis [149]. Suppose that for a model with 20 parameters, the last 15 parameters each explain nearly the same percentage of the trace, and further suppose that the five most important principal components only explain 50% of the trace. Should one keep adding components until the magic number is reached? If so, why should, for example, component number 17 be excluded while component number 7 is retained, when they explain nearly the same amount of the trace?
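
To make the criterion concrete, the following is a minimal sketch in Python/NumPy (the function name and the `shapes` array are illustrative, not taken from the thesis). Given the training shapes stored one per row, it returns the number of components needed to reach a given fraction of the trace:

    import numpy as np

    def components_for_trace_fraction(shapes, fraction=0.95):
        """Smallest number of principal components whose eigenvalues sum
        to the given fraction of the trace of the covariance matrix of
        `shapes` (one shape vector per row)."""
        cov = np.cov(shapes, rowvar=False)
        eigvals = np.linalg.eigvalsh(cov)[::-1]          # descending order
        explained = np.cumsum(eigvals) / eigvals.sum()   # cumulative fraction of trace
        return int(np.searchsorted(explained, fraction)) + 1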

An alternative test is the graphical test called the scree test, where the eigenvalues of the covariance matrix are plotted against the number of components. A typical scree plot is shown in Figure 4.2. The name scree plot is due to Cattell [51].

The scree is the rubble at the bottom of a cliff. The idea is to locate the point where the scree starts and to choose the number of components at that point. This point is sometimes called an elbow [149]. In Figure 4.2 it is not obvious where that point is; it is probably around mode (component) number 10.

Figure 4.2: A typical scree plot (percent of total variation versus mode number), showing curves for both the real data and the same data randomised. The plot is taken from Appendix A.

To avoid the graphical inspection, and the inherent operator influence, of this approach, a group of procedures called Parallel Analysis (PA) emerged. In the method of Horn, the eigenvalues are calculated from the same, but randomly scrambled, data set and the two scree plots are compared [135]. The number of components is chosen to be where the two lines cross, as seen in Figure 4.2.
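
A sketch of Horn-style parallel analysis follows, under the assumption that scrambling means permuting each variable (column) independently, so that correlations between variables are destroyed while the marginal distributions are kept; averaging over several scrambling rounds is a common refinement and not necessarily Horn's original procedure:

    import numpy as np

    def parallel_analysis(shapes, n_rounds=50, seed=0):
        """Retain the components whose eigenvalue for the real data
        exceeds the average eigenvalue of independently column-permuted
        (scrambled) versions of the same data."""
        rng = np.random.default_rng(seed)
        real = np.linalg.eigvalsh(np.cov(shapes, rowvar=False))[::-1]
        scrambled = np.zeros_like(real)
        for _ in range(n_rounds):
            shuffled = np.column_stack([rng.permutation(c) for c in shapes.T])
            scrambled += np.linalg.eigvalsh(np.cov(shuffled, rowvar=False))[::-1]
        scrambled /= n_rounds
        below = np.nonzero(real <= scrambled)[0]   # first crossing of the two scree curves
        return int(below[0]) if below.size else real.size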

This method has successfully been applied to the ear canal data, as explained in Appendix A. Parallel analysis has also been used to truncate the model parameters of an AAM, where the number of components selected was far smaller than with the proportion-of-trace method [233]. If the first few roots are so widely separated that plotting is difficult without losing information about the scree point, the logarithm of the eigenvalues can be plotted instead. This is called a LEV (log-eigenvalue) plot [149] and has also been used in parallel analysis.

A simpler method is to retain only the components whose eigenvalues exceed the average of all the eigenvalues. When the PCA is performed on correlation matrices, the average root is equal to one, which makes this test very simple.
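
Under the same illustrative setup as above, this average-eigenvalue rule is essentially a one-liner (the random `shapes` array is only a placeholder for a real training set):

    import numpy as np

    shapes = np.random.default_rng(0).normal(size=(40, 20))   # placeholder training set
    eigvals = np.linalg.eigvalsh(np.cov(shapes, rowvar=False))[::-1]
    n_keep = int(np.sum(eigvals > eigvals.mean()))   # for a correlation-matrix PCA the mean root is 1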

Moreover, Larsen and Hilger have demonstrated the use of the Bayesian information criterion (BIC) and Akaike’s “An information criterion” (AIC) in the selection of model complexity [177, 178].

For a given model, the log-likelihood of the data is estimated and penalised using either BIC or AIC. BIC arises from a Bayesian approach to model selection, whereas AIC provides an estimate of a test error curve with a minimum at the optimal trade-off between model complexity and performance.
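
In their standard forms, AIC = -2 log L + 2k and BIC = -2 log L + k log n, where L is the maximised likelihood of the data under the model, k is the number of free model parameters, and n is the number of observations; the model minimising the chosen criterion is selected.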

The log-likelihood increases with increasing model complexity, i.e. larger models reconstruct the training data better. In general, BIC penalises the log-likelihood harder with increasing model complexity, thus giving preference to simpler models in the selection. The optimal balancing of model complexity and performance depends on whether or not the family of models applied includes the true underlying model.

Furthermore, BIC is regarded as an approximation to the Minimum Description Length despite being derived in an independent manner [123].

As demonstrated in [149], the results of the different methods vary enormously. The choice of method should therefore be based on the application and be followed by some kind of sanity check.

4.2.3 Multivariate Statistical Analysis

Morphometrics, the multivariate statistics of object shape, has advanced greatly over the last decade, as described by Bookstein [34]. Bookstein demonstrates how it is possible to examine group differences of shapes by their outlines [33], and an overview of, and a complete framework for, testing landmark-based shape group differences can be found in [34]. In addition, the use of thin plate splines to decompose shape variation is described in [32].

The methods from morphometry can be used to analyse the information contained in statistical shape models. For example, it has proven possible to discriminate gender using logistic regression on 2D shape models of human face silhouettes [244] and by regression analysis of the shape-space parameters from a full 3D face model [143].
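
As a schematic illustration of how such a discrimination could be set up (the use of scikit-learn, the placeholder data and the variable names are this sketch's assumptions, not the methods of [244] or [143]):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # b: shape-space (PCA) parameters, one row per subject; y: 0/1 gender labels.
    # Placeholder data; in practice b would come from the fitted shape model.
    rng = np.random.default_rng(0)
    b = rng.normal(size=(60, 10))
    y = rng.integers(0, 2, size=60)

    clf = LogisticRegression()
    scores = cross_val_score(clf, b, y, cv=5)   # cross-validated classification rate
    print("mean accuracy:", scores.mean())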

Another example is the analysis of growth. Growth analysis has been performed on human mandibles, using a shape model built from the 3D surfaces extracted from CT scans [6, 126], and on human faces captured with a 3D surface scanner [144, 146].

Discriminating between normal and abnormal subjects using shape models is an area that has received much attention in recent years. Examples are the analysis and discrimination of 3D face models of individuals with Noonan [121] and Smith-Magenis [122] syndromes. The characterisation of the shape of neurological structures has also proven to be significant in the analysis of some illnesses [108, 109, 236]. An example is the analysis of the shape of the Corpus Callosum [84].

In this project, it proved possible to perform gender discrimination based on the shape and the size of the ear canals. See Appendix A for details.

4.2.4 Shape Fitting and Recognition

One of the primary abilities of the ASM is the possibility of using it to find and recognise previously unseen shape examples. Using a shape model in the search for 2D structures in images is widespread; see for example [43, 44, 61]. For 2D image search, the ASM is often substituted with the more powerful AAM [58, 60, 63, 64, 65, 68, 86, 231].

Many improvements to the search scheme have been suggested, including multi-scale approaches [70, 71] and ASM with optimal features [110], where in each ASM iteration the optimal landmark displacements are found by locating the optimal features using a non-linear kNN classifier. Furthermore, the ASM search can be made more robust against outliers using M-estimators [211].

Fitting a 3D surface shape model to a new example has been done using a combination of the Iterative Closest Point (ICP) algorithm and active shape model searching in [143, 145].
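
A toy sketch of such a combined loop is given below, assuming a linear point distribution model (mean shape, modes, eigenvalues) and a target point cloud. This is a strong simplification for illustration only, not the implementation of [143, 145]:

    import numpy as np
    from scipy.spatial import cKDTree

    def fit_asm_icp(mean, modes, eigvals, target, n_iter=20):
        """Alternate between an ICP-style rigid alignment to closest
        target points and an ASM-style projection onto the shape
        subspace with clamped parameters.
        mean: (3n,), modes: (3n, t), eigvals: (t,), target: (m, 3)."""
        tree = cKDTree(target)
        b = np.zeros(modes.shape[1])
        for _ in range(n_iter):
            pts = (mean + modes @ b).reshape(-1, 3)
            # 1. ICP step: match each model point to its closest target point
            _, idx = tree.query(pts)
            matched = target[idx]
            # 2. Rigid (Kabsch) alignment of the matched points into the model frame
            mu_p, mu_m = pts.mean(0), matched.mean(0)
            U, _, Vt = np.linalg.svd((matched - mu_m).T @ (pts - mu_p))
            if np.linalg.det(U @ Vt) < 0:      # avoid reflections
                U[:, -1] *= -1
            R = U @ Vt
            aligned = (matched - mu_m) @ R + mu_p
            # 3. ASM step: project onto shape space and clamp to +/- 3 std. dev.
            b = modes.T @ (aligned.ravel() - mean)
            b = np.clip(b, -3 * np.sqrt(eigvals), 3 * np.sqrt(eigvals))
        return b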

When an ASM has been fitted to a new example shape, it can be used to map features from an atlas to the new example. In this project, a combined ICP/ASM approach resembling the method by Hutton [143, 145] has been used to place faceplates on ear canals, as seen in Appendix D. Furthermore, the ASM is used to propagate landmarks to the new ear. These landmarks are, among others, used to calculate paths through the ear canal, as demonstrated in Section 6.3.

The shape parameters that describe the new shape can be calculated from the fitted ASM. These parameters can then be used in multivariate classification as explained in Section 4.2.3.
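
In the standard linear formulation this is simply a projection onto the retained modes, b = Pᵀ(x − x̄), where x is the fitted shape vector, x̄ is the mean shape, and P is the matrix of retained eigenvectors.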