Making Deformable Template Models Operational

Rune Fisker

Lyngby 2000

IMM-PHD-2000-77

IMM

ISSN 0909-3192

Copyright © 2000 by Rune Fisker
Printed by IMM, Technical University of Denmark
Preface

This thesis has been prepared at the Section for Image Analysis, Department of Mathematical Modelling, Technical University of Denmark. It is a partial fulfilment of the requirements for the degree of Ph.D. in engineering.

The thesis is within the field of image processing and computer vision. The subject of the thesis is deformable template models, with special focus on the common difficulties related to this type of model.

Reading this thesis requires a basic knowledge of image processing, statistics and optimization.
Acknowledgement

I'm grateful to a large number of people for help, fruitful discussions and great company. First of all I would like to thank my supervisors Jens Michael Carstensen and Knut Conradsen for their supervision.

I would also like to thank previous and present members of the Section for Image Analysis, especially Lars Pedersen, Klaus B. Hilger, Rasmus Larsen, Per R. Andresen, Claus Gramkow, Bjarne Ersbøll, Nette Schultz, Allan A. Nielsen and Johan Doré. I also had excellent discussions with Mikkel B. Stegmann and Henrik Aanæs.

I'm also in debt to Anil K. Jain and the rest of the people at the PRIP lab for letting me stay with them at Michigan State University, where I had an inspiring visit. I would also like to thank the Ingeniør Valdemar Selmer Trane og hustru Elisa Tranes Foundation for the financial support which made this visit possible.

I would also like to thank the external opponents at the defense of this thesis, Dr. Tim Cootes and Professor Milan Sonka, for attending and for interesting discussions on the presented topics.

Many thanks to the secretaries Helle Welling and Mette Eltzholtz for their help, and to the librarian Finn K. Christensen, who never complained about obtaining just another article or book for me.

Finally I'm very grateful to my lovely wife Helle and daughter Cecilie, who always supported me and never complained when deadlines made me a less common sight.

Lyngby, November 1, 2000

Rune Fisker
Abstract

Deformable template models are a very popular and powerful tool within the field of image processing and computer vision. This thesis treats this type of model extensively, with special focus on handling their common difficulties, i.e. model parameter selection, initialization and optimization. A proper handling of these common difficulties is essential for making the models operational for a non-expert user, which is a requirement for intensifying and commercializing the use of deformable template models.

The thesis is organized as a collection of the most important articles published during the Ph.D. project. To put these articles into the general context of deformable template models, and to pass on an overview of the deformable template model literature, the thesis starts with a compact survey of the deformable template model literature with special focus on representation, model parameter estimation, initialization, optimization and performance measures. The original articles, aligned a bit in notation and corrected for discovered spelling errors and other typos, are enclosed in the appendices.

Compared to the literature, one contribution is a general scheme for estimation of the model parameters, which applies a combination of a maximum likelihood and a minimum distance criterion. Another contribution is a very fast search-based initialization algorithm using a filter interpretation of the likelihood model. These two methods can be applied to most deformable template models, making a non-expert user able to use the model.

A comparative study of a number of optimization algorithms is also reported. In addition, a general polygon-based model, an ellipse model and a textile model are proposed, and a number of applications have been solved. Finally, the Grenander model and the Active Appearance Model have been explored and some extensions are presented.
Publications

This section lists the publications produced during the Ph.D. period. For the enclosed articles the corresponding appendix and reference are given. If an enclosed article is cited in the survey, the reference is shown in italic, e.g. [68].
Publications related to this thesis

R. Fisker, N. Schultz, N. Duta and J. M. Carstensen, The Grenander Deformable Template Model: A General Scheme, submitted (Appendix G and [70]).

M. B. Stegmann, R. Fisker and B. Ersbøll, Extending and applying Active Appearance Models for automated, high precision segmentation in different image modalities, submitted (Appendix D and [165]).

R. Fisker, J. M. Carstensen, M. F. Hansen, F. Bødker and S. Mørup, Estimation of Nanoparticle Size Distributions by Image Analysis, to appear in Journal of Nanoparticle Research (Appendix A and [67]).

M. B. Stegmann, R. Fisker, B. Ersbøll, H. H. Thodberg and L. Hyldstrup, Active Appearance Models: Theory and Cases, Proc. 9th Danish Conf. on Pattern Recognition and Image Analysis,

R. Fisker, N. Schultz, N. Duta and J. M. Carstensen, A General Scheme for Training and Optimization of the Grenander Deformable Template Model, Proc. Conf. on Computer Vision and Pattern Recognition, 698-705, Hilton Head Island, USA, 2000.

R. Fisker and J. M. Carstensen, A General Polygon-based Deformable Model for Object Recognition, Proc. 8th Danish Conf. on Pattern Recognition and Image Analysis, 28-35, Copenhagen, Denmark, 1999 (Appendix C and [66]).

R. Fisker, J. M. Carstensen and K. Madsen, Initialization and Optimization of Deformable Models, Proc. 11th Scandinavian Conf. on Image Analysis, 295-302, Greenland, 1999 (Appendix F and [68]).

J. M. Carstensen, R. Fisker, N. Schultz and T. C. Dorge, Structural Inference Using Deformable Models, Machine Vision and Advanced Image Processing in Remote Sensing, 61-71, Springer, 1999.

R. Fisker and J. M. Carstensen, On Parameter Estimation in Deformable Models, Proc. 14th International Conf. on Pattern Recognition, 763-766, Brisbane, Australia, 1998 (Appendix E and [65]).

R. Fisker and J. M. Carstensen, Automated Visual Inspection of Textile, Proc. 10th Scandinavian Conf. on Image Analysis, 173-179, Lappeenranta, Finland, 1997 (Appendix B and [64]).

R. Fisker and J. M. Carstensen, Automatisk visuel kvalitetsanalyse af tekstil (in Danish; "Automatic visual quality analysis of textile"), Proc. 5th Danish Conf. on Pattern Recognition and Image Analysis, 28-35, Copenhagen, Denmark, 1996.
Publications not related to this thesis

H. Aanæs, R. Fisker, K. Åström and J. M. Carstensen, Robust Factorization, submitted.

H. Aanæs, R. Fisker and J. M. Carstensen, Robust Structure and Motion, Proc. 9th Danish Conf. on Pattern Recognition and Image Analysis, 1-9, Aalborg, Denmark, 2000.

R. Fisker, H. F. Poulsen, J. Schou, J. M. Carstensen and S. Garbe, Use of Image Processing Tools for Texture Analysis of High Energy X-ray Synchrotron Data, Journal of Applied Crystallography, 31, 647-653, 1998.

Technical reports

M. B. Stegmann, R. Fisker and B. Ersbøll, On Properties of Active Shape Models, Technical Report IMM-REP-2000-12, Department of Mathematical Modelling, Technical University of Denmark, 2000.
Notation

The notation should be quite consistent throughout the thesis. Bold lowercase letters denote vectors, x = (x_0, ..., x_{n-1}), and bold uppercase letters indicate matrices, X.

λ        regularization constant.
Ω_x      parameter space of x.
P(v)     prior model/distribution.
P(v|y)   posterior model/distribution.
P(y|v)   likelihood or observation model/distribution.
Σ        covariance matrix.
θ        model parameters.
v        template parameters.
v^t      training sample of template parameters.
v̂        MAP estimate of the template parameters.
U(v,y)   energy term of the posterior distribution.
y        an image.
z        normalizing constant.
X'       transpose of the matrix X.
Contents

1 Introduction
  1.1 Bayesian Framework
  1.2 The Prior and Likelihood Model
  1.3 Common Difficulties
  1.4 Popular Deformable Template Models
  1.5 Relationship To Other Methods
    1.5.1 Rigid Template Matching
    1.5.2 Hough Transform
    1.5.3 Registration
    1.5.4 Markov Random Fields

2 Representation
  2.1 Curves
    2.1.1 Labeled Points and Vertices
    2.1.2 B-splines
    2.1.3 Level Sets
    2.1.4 Other Analytic Curves
  2.2 Orthogonal Basis
    2.2.1 Fourier Descriptors
    2.2.2 Principal Component Analysis (PCA)
    2.2.3 Wavelet Descriptors
  2.3 Image Templates
  2.4 Conclusion and Discussion

3 Model Parameter Estimation
  3.1 Minimax
  3.2 Maximum Likelihood
  3.3 Minimum Distance
    3.3.1 Distance Functions and Performance Measures
    3.3.2 The "Best" Distance Function
    3.3.3 The "Best" Performance Measures
  3.4 Other Techniques
  3.5 Minimal Manual Interaction
  3.6 Conclusion and Discussion

4 Initialization
  4.1 Feature Based Initialization
    4.1.1 Initialization From Moments
    4.1.2 Texture and Color
    4.1.3 Generalized Hough Transform
  4.2 Model Based Initialization
    4.2.1 Search Based Initialization
    4.2.2 Very Fast Static Search
  4.3 Multi-hypothesis Initialization
  4.4 Conclusion and Discussion

5 Optimization
  5.1 Deterministic Optimization Methods
    5.1.1 Gradient Based Methods
    5.1.2 Dynamic Programming
    5.1.3 Difference Decomposition
    5.1.4 Heuristics
  5.2 Stochastic Optimization Methods
    5.2.1 Genetic Algorithms
    5.2.2 Simulated Annealing
    5.2.3 Stochastic Diffusion
  5.3 Comparative Study
  5.4 Conclusion and Discussion

6 Conclusion and Discussion

A Estimation of Nanoparticle Size Distributions
  A.1 Introduction
  A.2 The Elliptic Deformable Template Model
  A.3 Initialization and Optimization
  A.4 Validation
  A.5 Experimental Results
  A.6 Discussion

B Automated Visual Inspection of Textile
  B.1 Introduction
  B.2 Detection of Defects in the Vertical Threads
  B.3 Detection of Defects in the Horizontal Threads
    B.3.1 Defect Analysis
  B.4 Experiments
  B.5 Conclusion

C A General Polygon-based Deformable Model
  C.1 Introduction
  C.2 Deformable Models
  C.3 The Polygon-based Model
    C.3.1 Prior Model
    C.3.2 The Likelihood Function
    C.3.3 Initialization and Optimization
    C.3.4 Validation
  C.4 Experimental Results
  C.5 Discussions and Future Work

D Extending and Applying Active Appearance Models
  D.1 Introduction
  D.2 Active Appearance Models
    D.2.1 Shape & Landmarks
    D.2.2 Shape Formulation
    D.2.3 Texture Formulation
    D.2.4 Combined Model Formulation
    D.2.5 Optimization
  D.3 Extensions
    D.3.1 Enhanced Shape Representation
    D.3.2 Neighborhood AAMs
    D.3.3 Border AAMs
    D.3.4 Fine-tuning the Model Fit
    D.3.5 Applying Robust Statistics
    D.3.6 Initialization
  D.4 Experimental Results
  D.5 Future Work
  D.6 Implementation
  D.7 Conclusion
  D.8 Acknowledgements

E On Parameter Estimation in Deformable Models
  E.1 Introduction
  E.2 Deformable Models
    E.2.1 The Prior Model
    E.2.2 The Observation Model
  E.3 Supervised Model Parameter Estimation
    E.3.1 The Prior Model
    E.3.2 The Posterior Model
  E.5 Experimental Results
    E.5.1 Estimation of Model Parameters
  E.6 Conclusion

F Initialization and Optimization of Deformable Models
  F.1 Introduction
  F.2 Deformable Models
  F.3 Initialization
  F.4 Optimization
  F.5 Experimental Results
    F.5.1 Initialization
    F.5.2 Optimization
  F.6 Conclusion and Discussion

G The Grenander Model: A General Scheme
  G.1 Introduction
  G.2 The Grenander Model
    G.2.1 Prior Model
    G.2.2 Likelihood Model with Global Mean
  G.3 Model Parameter Estimation Based on a Training Set
    G.3.1 Prior Mean Shape
    G.3.2 Prior Weight Parameters
    G.3.3 Likelihood Parameters
  G.4 Initialization
  G.6 Model Parameter Estimation Based on One Example
  G.7 Adaptive Local Mean Model
  G.8 Experimental Results
  G.9 Conclusion and Discussion

Bibliography

Ph.D. theses from IMM

Index
Chapter 1

Introduction

The family of deformable template models has been presented under many different names, of which the best known probably are Snakes, Active Contour models, Active Shape models, deformable models and deformable templates. A general definition of deformable template models could be:

A deformable template model can be characterized as a model, which under an implicit or explicit optimization criterion deforms the shape to match a known type of object in an image.

In most cases the core task of the deformable template model is to perform segmentation of a known type of object, but the models can also be applied to more general tasks, e.g. image retrieval [16, 102, 104] and tracking [18, 99, 145, 146, 171].

Even though deformable template models have been a very active research field for more than a decade, a recent review of statistical pattern recognition [104] pinpoints deformable models as one of the frontiers of pattern recognition.
1.1 Bayesian Framework

The design and use of deformable template models fits neatly into the Bayesian framework for image analysis. We refer to Besag [15] or Mumford [137] for a general introduction to the Bayesian framework for image analysis. In a Bayesian setting, the deformable template model framework can be summarized as:

1. Choose a representation of the object, where the template parameters, v ∈ Ω_v, directly or under some mapping define the object.

2. Construct a prior probability distribution P(v|θ), where θ corresponds to the model parameters, which determine the properties of the distribution.

3. Formulate a likelihood or observation model, P(y|v,θ), which defines the probability of an image, y ∈ Ω_y, given any particular realization of v.

4. Combine the prior distribution and the observation model to the posterior distribution, P(v|y,θ), by Bayes' theorem:

   P(v|y,θ) = P(v|θ) P(y|v,θ) / P(y|θ)        (1.1)
            ∝ P(v|θ) P(y|v,θ)                 (1.2)

5. Select or estimate the model parameters, θ.

6. Make inference, e.g. by estimation of the maximum a posteriori (MAP):

   v̂ = arg max_v P(v|y,θ)                     (1.3)

Note that, for simplicity, θ is omitted in the distribution functions in most of this thesis, as well as in the literature, i.e. P(v|y) = P(v|y,θ).
The posterior distribution can be interpreted as an (explicit) optimization criterion, and under the weak assumption that the posterior distribution, P(v|y), is Gibbs distributed, the posterior distribution can be expanded to:

   P(v|y) = (1/z) exp{-U(v,y)}                (1.4)

where U(v,y): Ω_v → R is the energy function and z is a normalizing constant, which ensures a proper statistical distribution, i.e. ∫_{Ω_v} P(v|y) dv = 1. Due to the dimensionality of Ω_v it is usually infeasible to determine the value of z. In practice this is of little importance, since z is a constant, which makes the optimization in (1.3) invariant to the actual value of z.
It is very common to formulate the optimization criterion as an energy function instead of a posterior distribution. Under the assumption that P(v|y) is Gibbs distributed and z is constant, maximizing the probability is equivalent to minimizing the energy, since:

   v̂ = arg max_v P(v|y,θ)
     = arg max_v { -U(v,y,θ) - log(z) }
     = arg min_v U(v,y,θ)                     (1.5)

This shows that the probabilistic formulation is equivalent to the energy formulation when applied to MAP estimation.
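As a small numerical illustration of this equivalence (a hypothetical one-dimensional energy, not a model from the thesis), the maximizer of the unnormalized Gibbs posterior coincides with the minimizer of the energy, so z never has to be computed:

```python
import numpy as np

# Hypothetical 1-D quadratic energy U(v, y); any energy function would do.
def energy(v, y):
    return (v - y) ** 2 / 2.0

y = 3.0
v_grid = np.linspace(-10.0, 10.0, 2001)
u = energy(v_grid, y)

# Unnormalized Gibbs posterior: P(v|y) proportional to exp(-U(v, y)).
p_unnorm = np.exp(-u)

v_map_energy = v_grid[np.argmin(u)]        # arg min of the energy
v_map_prob = v_grid[np.argmax(p_unnorm)]   # arg max of the posterior
assert v_map_energy == v_map_prob          # same template; z is irrelevant
```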
In general we prefer a Bayesian formulation, because it gives a natural separation of model and image contributions in P(v) and P(y|v), and it provides the opportunity to simulate, and thus visualize, the appropriateness of the prior model. Further, it provides several ways to make inference in P(v|y).
1.2 The Prior and Likelihood Model

The basic principle of the prior model, P(v), is to introduce prior knowledge about the object; the argument for doing so is simply that it is so hard to make progress without it [18]. One interpretation of the prior model is to see it as a regularizer [147, 148], which regularizes the shape of the template. In fact, the prior model in the Snake [108] (equal to the internal energy) is a generalization of a Tikhonov [173] regularizer [108].
From a statistical point of view the prior model imposes a distribution on the template parameters. This distribution introduces correlation between parameters and governs the overall deformation. Fisker et al. [65] demonstrate that in the case of the prior energy being a quadratic term, the prior model usually corresponds to a multi-dimensional Gaussian. As demonstrated later, this result has important applications within model parameter estimation as well as model simulation and validation. Using this result, the prior model of the snake corresponds to a multi-dimensional Gaussian with the mean equal to zero and the inverse covariance equal to a pentadiagonal banded matrix.
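The banded structure can be verified directly. Below is a minimal sketch (my own construction, not code from the thesis) that builds the inverse covariance implied by a discretized snake internal energy α Σ|v_{i+1} - v_i|² + β Σ|v_{i-1} - 2v_i + v_{i+1}|² on a closed contour:

```python
import numpy as np

def snake_precision(n, alpha=1.0, beta=1.0):
    """Inverse covariance implied by the discrete snake internal energy
    U(v) = alpha*sum|v_i - v_{i-1}|^2 + beta*sum|v_{i-1} - 2v_i + v_{i+1}|^2
    on a closed contour with n points, written as the quadratic form v'Av."""
    eye = np.eye(n)
    d1 = np.roll(eye, -1, axis=1) - eye                                # first difference
    d2 = np.roll(eye, -1, axis=1) - 2 * eye + np.roll(eye, 1, axis=1)  # second difference
    return alpha * d1.T @ d1 + beta * d2.T @ d2

A = snake_precision(8)
# Pentadiagonal up to the wrap-around of the closed contour:
# entries vanish beyond the second off-diagonal.
assert abs(A[0, 3]) < 1e-12 and abs(A[0, 1]) > 0
```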
The likelihood or observation model, P(y|v), is the data driven term. The model defines the interaction between a realization of the template v and an observed image y. Basically it defines how well the actual template matches the object in the image. Most likelihood models are based on image intensity [23, 41, 35, 85, 86, 167, 155, 189] and/or gradient (edge) information [33, 34, 72, 101, 102, 108, 118, 163, 189], but in principle all kinds of information can be combined, e.g. texture, colour or motion.
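As a generic illustration of a gradient-based likelihood (my own sketch, not any specific model from the survey), one can define the energy as the negative mean gradient magnitude sampled at the template points, so that templates lying on strong edges score better:

```python
import numpy as np

def edge_energy(image, points):
    """Generic gradient-based likelihood energy: the negative mean gradient
    magnitude sampled at the template points (lower energy = better match)."""
    gy, gx = np.gradient(image.astype(float))   # gradients along rows, cols
    mag = np.hypot(gx, gy)
    rows = np.clip(np.round(points[:, 1]).astype(int), 0, image.shape[0] - 1)
    cols = np.clip(np.round(points[:, 0]).astype(int), 0, image.shape[1] - 1)
    return -mag[rows, cols].mean()

# A template lying on a step edge scores lower (better) than one off the edge.
img = np.zeros((32, 32)); img[:, 16:] = 1.0
on_edge = np.stack([np.full(10, 16.0), np.linspace(5, 25, 10)], axis=1)
off_edge = np.stack([np.full(10, 5.0), np.linspace(5, 25, 10)], axis=1)
assert edge_energy(img, on_edge) < edge_energy(img, off_edge)
```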
1.3 Common Difficulties

In general, the current approaches to deformable template models suffer from a number of common difficulties. Jain et al. [101] summarize the difficulties as:

- The user needs to assign weights to the different components of the energy function, which determine the success of a deformable template model (Model Parameter Selection).

- The algorithms need a good initialization to give meaningful results; otherwise they get stuck at spurious local minima and thus lead to incorrect results (Initialization).

- The large number of parameters makes the numerical solutions difficult (Optimization).

In short, the common difficulties can be summarized as Model Parameter Selection, Initialization and Optimization. These difficulties are the main reason why most models can only be applied by an expert. This is not satisfactory, since a good model should be applicable by a non-expert user. The handling of, and solutions to, these difficulties are covered intensively in the following chapters.
1.4 Popular Deformable Template Models

Jain et al. [101, 102] divide deformable models into two groups: free form and parametric. Free form deformable templates have no explicit global structure, because their prior model contains only local continuity and smoothness constraints. The best known examples of free form models are Snakes [108] and Balloons [33, 34]. In parametric deformable models, prior knowledge of the global structure is included using a parameterized template. Examples of parametric deformable models are Grenander's model [85], the Active Shape model [41], the Active Appearance model [35] and Blake's Active Contour [18]. The cited models all have the important property of being general in the sense that they can be applied to an object with an arbitrary shape. A more detailed overview of the properties of the most popular general models is given in table 1.1. The first reference in the table is the "key reference", and Eucl. is an abbreviation for Euclidean parameters.

The largest group of deformable models is formulated and tuned for a specific object. Probably the best-known specific models are the eye model and the mouth model proposed by Yuille et al. [189]. For some problems it is necessary to incorporate specific assumptions into the model to be able to solve the problem, but many problems could have been solved directly by one of the general models.

It is outside the scope of this thesis to give a general description of all the models in the literature. For a general survey we refer to the first part of the book on Active Contours by Blake and Isard [18] and the survey papers by McInerney and Terzopoulos [132] and by Jain et al. [101].
1.5 Relationship To Other Methods

Deformable template models have close relations to a number of other computer vision and image processing methods, of which the most important are covered in the following sections.

1.5.1 Rigid Template Matching

Early research in template matching concentrated mainly on simple rigid shape matching, where a prototype template was transformed by simple transformations like translation, rotation and scaling before it was matched to the image. The matching is typically performed by a simple correlation-based method using a filtering approach, see e.g. [2, 162]. This type of matching is still very popular in machine vision applications.

The rigidness of this approach makes it insufficient for objects with larger shape variations, which basically led to the development of deformable template models. For many models the likelihood model, P(y|v), can be interpreted as a correlation-based match of a deformed template.
Table 1.1: Properties of the most popular general deformable template models.

Model | Object Representation | Template Parameters v | Model Parameters | Weight Selection | Initialization
--- | --- | --- | --- | --- | ---
Active Appearance Model [35,40,61] | Mean shape/bitmap, linear combination of eigenvectors & Eucl. | Scalars on eigenvectors & Eucl. | Eigenvectors & mean shape/bitmap | No weights | Search
Active Shape Model [38,40,41] | Mean shape, linear combination of eigenvectors & Eucl. | Scalars on eigenvectors & Eucl. | Eigenvectors & mean shape | No weights | No comments
Balloon [33,34] | Points/FEM | Points/FEM | Weights | Manual | Manual
Blake's Active Contour [18] | B-spline & deformation space | Curve shape-vector | Mean shape, shape matrix & weights | No comments | Moments of binary image
Grenander's model [85,86,114,155] | Points | Points | Mean shape, gray level stat. & weights | Manual | Heuristic
Jain's model [101,102,190] | Deformation of bitmap & Eucl. | Deformation parameter & Eucl. | Bitmap & weight | Manual | Search
Snake [108] | Points | Points | Weights | Manual | Manual
Staib & Duncan's Model [163] | Elliptic Fourier descriptors | Elliptic Fourier descriptors | Fourier means & weight | Manual | Mode of prior
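The correlation-based matching mentioned above can be sketched as follows; this is a naive normalized cross-correlation over translations only, with a made-up pattern for illustration:

```python
import numpy as np

def match_template_ncc(image, template):
    """Rigid template matching by normalized cross-correlation (translation
    only); returns the (row, col) of the best-matching window."""
    ih, iw = image.shape
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-12)
    best, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            ws = w.std()
            if ws < 1e-12:          # skip constant windows
                continue
            score = np.mean((w - w.mean()) / ws * t)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

tmpl = np.arange(16, dtype=float).reshape(4, 4)   # arbitrary 4x4 pattern
img = np.zeros((20, 20)); img[8:12, 5:9] = tmpl   # pattern placed at (8, 5)
assert match_template_ncc(img, tmpl) == (8, 5)
```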
1.5.2 Hough Transform

The Hough transform was first proposed by Hough [95]. Since then, numerous papers have improved the Hough transform, of which the (ρ, θ)-representation of lines [58] and the generalization to arbitrary shapes at any scale and any rotation [11] are some of the most influential. As demonstrated by Stockman and Agrawala [166], the Hough transform is simply an efficient implementation of template matching. For a more extensive review we refer to the survey papers [98, 122].
In a deformable model setting the Hough transform can be interpreted as a deformable model with a uniform prior, where the initialization is performed by an intensive but efficient search and the optimization is omitted.
1.5.3 Registration

Image registration requires finding an optimal transform between an image pair, the source and target image [123]. For a review of image registration, we refer to one of the survey papers [7, 21, 123, 129, 177, 183].

Matching a parametric deformable model to an object in an image can in general be interpreted as a registration of a prototype template to the object. Still, there is a difference in the actual objective. In registration the requirement is an exact correspondence between homologous points, whereas deformable models are focused on segmentation rather than necessarily finding the exact correspondence. Another difference is the actual matching. Traditionally, registration is performed between similar feature types, e.g. pairs of images with the same or different modality, or pairs of landmark sets, whereas most deformable models match an object model, with a non-image representation, to an image.

Still, there exists a number of methods which can be classified as both a registration method and a deformable template model, e.g. elastic models like [9, 10, 49] and the viscous fluid model [20, 29, 30].
1.5.4 Markov Random Fields

Markov Random Fields (MRFs) provide a convenient and consistent way of modeling spatial context and stochastic interaction among entities such as image pixels and other spatially correlated features. For an elaborate treatment of MRFs we refer to [125].

From a general point of view, MRFs and deformable template models apply a very similar optimization-based framework, and both fit neatly into the Bayesian setting. Li [125] also argues that many deformable template models can be interpreted as a high level MRF having irregular sites with discrete labels, where each site indexes an image feature such as a point or a line segment.

Despite these similarities, the literature is quite clear on what is classified as a MRF and a deformable template model, respectively. The main difference is what is modeled. Deformable template models are based on an object model, whereas MRFs model the full image or the interaction between objects.
Chapter 2

Representation

Choosing the object representation is one of the most important choices in the design of a deformable template model. The most popular representations are summarized below. We have chosen to categorize the representations in three main groups: Curves, Orthogonal Basis and Image Templates.
2.1 Curves

2.1.1 Labeled Points and Vertices

One of the first and most popular representations is a labeled set of points with connectivity information. This representation is equivalent to a vertices-and-edges representation. In the simple case of one single open or closed contour, neighbor points are assumed to be connected. In this case the point representation corresponds to a polygon or a linear spline. Numerous authors have used the point representation, e.g. [22, 33, 57, 85, 108, 118].
2.1.2 B-splines

Another popular representation is B-splines, which enable a continuous curve description, see e.g. [18] for an introduction. Compared to a point representation, B-splines also have the advantage of a low-dimensional parameter space, since a similar shape is obtained with a few control points instead of a large number of points. Another advantage is the "built-in" smoothness. Realization of deformable models using B-splines was developed by Cipolla and Blake [31], Menet et al. [133] and Hinton et al. [92]. Other examples of the use of a B-spline representation are given in [18, 99, 113].
2.1.3 Level Sets

Another powerful representation is based on the elegant level sets [142]; see [159] for an introduction. Compared to the other representations, level sets have the advantage of allowing automatic merging and splitting of the initial contour. In the context of deformable models, level sets were simultaneously proposed by Caselles et al. [25] and Malladi et al. [130]. Since then level sets have received much attention, and a number of models have been based on this approach, see e.g. [26, 111, 124, 144, 143, 186, 187].
2.1.4 Other Analytic Curves

Besides the mentioned curves, numerous other sets of analytic curves have been applied to represent the object, e.g. sets of parabolic curves and circles. Examples of representations based on other sets of curves are [67, 119, 189].
2.2 Orthogonal Basis

2.2.1 Fourier Descriptors

Fourier descriptors represent the object in an orthogonal basis, where the usual basis functions are the sinusoids, i.e. trigonometric functions. On a sinusoidal basis the representation corresponds to the elliptic Fourier descriptors [78, 116, 127, 163], where the Fourier descriptors are localized in frequency. In the context of deformable template models, Fourier descriptors were introduced by Scott [158] and Staib and Duncan [163].
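To make the frequency localization concrete, the following sketch uses the simpler complex-coordinate variant of Fourier descriptors rather than the elliptic form used in the cited papers; truncating to the lowest frequencies gives a compact shape representation:

```python
import numpy as np

def fourier_descriptors(contour_xy, n_keep=8):
    """Represent a closed contour by its lowest-frequency Fourier
    coefficients of the complex coordinates z_k = x_k + i*y_k."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    coeffs = np.fft.fft(z) / len(z)
    freq = np.fft.fftfreq(len(z)) * len(z)      # signed frequency indices
    keep = np.where(np.abs(freq) <= n_keep // 2, coeffs, 0.0)
    return keep

def reconstruct(coeffs):
    """Invert the (truncated) descriptor back to contour coordinates."""
    return np.fft.ifft(coeffs * len(coeffs))

# A circle is represented exactly by just two low-frequency coefficients.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
z_rec = reconstruct(fourier_descriptors(circle, n_keep=2))
assert np.max(np.abs(z_rec - (circle[:, 0] + 1j * circle[:, 1]))) < 1e-9
```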
2.2.2 Principal Component Analysis (PCA)

Another representation which applies an orthogonal basis is the Point Distribution Model proposed by Cootes et al. [36, 41]. In practice, the object is represented by the mean shape of a training set and a linear combination of the most important eigenmodes of the shape variation from this mean. This representation has close connections to general shape statistics, see e.g. [56]. The Point Distribution Model plays an important role in the popular Active Shape Model [41] and has been extended with texture in the novel Active Appearance Model [35]. Numerous other models like [60, 126, 139] have been based on this representation.
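The construction can be sketched as follows; this is a bare-bones Point Distribution Model assuming the training shapes are already aligned, and the helper names are my own:

```python
import numpy as np

def train_pdm(shapes, n_modes=2):
    """Point Distribution Model sketch: mean shape plus the principal modes
    of variation of the training shapes (each shape flattened to a 2n-vector;
    alignment of the training set is assumed to be done already)."""
    X = shapes.reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_modes], s[:n_modes] / np.sqrt(len(shapes) - 1)

def synthesize(mean, modes, b):
    """New shape from mode weights b: x = mean + sum_i b_i * mode_i."""
    return (mean + b @ modes).reshape(-1, 2)

rng = np.random.default_rng(0)
base = rng.normal(size=(10, 2))                           # 10 landmark points
shapes = base + rng.normal(scale=0.05, size=(30, 10, 2))  # 30 training shapes
mean, modes, stds = train_pdm(shapes, n_modes=2)
new_shape = synthesize(mean, modes, stds)                 # one-std deformation
assert modes.shape == (2, 20) and new_shape.shape == (10, 2)
```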
2.2.3 Wavelet Descriptors

A third orthogonal basis is the wavelet transform [47, 48]; see e.g. [131, 168] for a general introduction to wavelets. Wavelets are localized both in space and in frequency (scale), since they are defined as dilated and translated versions of the basis or mother wavelet. Wavelet descriptors are less popular than the Fourier descriptors and the Point Distribution Model, but there are still examples like [139, 188, 190], where [139, 188] apply the Daubechies wavelet [47, 48] and [190] applies the B-spline wavelet [175, 176], respectively. Note the comparison of shape models based on the point distribution model, Fourier descriptors and wavelets in [139]. The previous examples all apply wavelets for the shape representation, but wavelets have also been used to make a compact texture representation [184] in Active Appearance Models [35].

Note that the orthogonal basis representations usually apply a reduced or truncated parameter space, where only the most important modes and descriptors are used.
2.3 Image Templates

Another popular representation is a prototype image template. The prototype image is then deformed under a similarity measure to match a new object in an image. Most of these models can also be classified as registration methods (see the discussion in section 1.5.3). Typically the template is of the same type as the input image, but edge templates have also been used, see [102]. There is a rich collection of examples of this representation, where some of the best known are [5, 9, 10, 29, 30, 100, 156].
2.4 Conclusion and Discussion

The central question is now: which representation is the best? Unfortunately there is no easy answer to this question, since it depends on the rest of the model and the actual problem. Still, there is a number of properties which, from a general point of view, are desirable for the ideal model:

- General.

- Low redundancy and dimensionality. A low dimensional representation with little redundancy improves the computational efficiency and makes the optimization easier and more robust.

- Linear parameterization. Restriction to linear parameterization has certain advantages in simplifying fitting algorithms and avoiding problems with local minima [18].

- Orthogonal basis. In general, an orthogonal basis is desirable because it makes the parameters distinct. This makes the determination of the coefficients easier and avoids redundancy [163].

- Shape regularization. Commonly, regularization of the shape is obtained by the prior model and the representation. In many cases the regularization of the shape is what makes the deformable model successful, and it is an advantage if the representation can be used as a regularizer.

- Automatic splitting and merging. Primarily in the context of free form deformable models, automatic splitting and merging is a desirable property.

These properties favor some models, but still there is no superior representation.
Chapter 3

Model Parameter Estimation

Model parameters play an important role in determining the properties of the posterior distribution. Commonly the model parameters are selected before the model is applied, and they are kept constant. Examples of model parameters are the mean and variance in a Gaussian distribution and the weight parameters in the snake, which determine the relative influence of the different terms in the energy function.

Most deformable models contain model parameters, e.g. weight parameters. With a few exceptions these parameters are selected manually by an expert (c.f. Table 1.1). From a general point of view this is not acceptable, since the goal is a framework which can be applied by a non-expert user. Methods applied to automatic model parameter selection in deformable template models are described below.
3.1 Minimax

Gennert and Yuille [77] propose a general unsupervised technique for determination of model parameters for multiple objective function optimization, based on the minimax principle. Recall v̂ = arg max_v P(v|y) = arg min_v U(v,y). The principle is to determine the θ that maximizes U(v̂,y,θ) when U(v,y,θ) has been minimized over v. More formally, the minimax criterion is defined as [77]:

   θ̂ = arg max_θ min_v U(v,y,θ)               (3.1)
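A tiny grid-search illustration of the minimax criterion on a hypothetical two-term energy (my own construction, not an example from [77]): for each θ the energy is first minimized over v, and θ̂ is then the maximizer of that minimized energy.

```python
import numpy as np

# Hypothetical energy: a data term weighted by theta and a prior term
# weighted by (1 - theta), for a scalar template v.
def energy(v, y, theta):
    return theta * (v - y) ** 2 + (1 - theta) * v ** 2

y = 2.0
thetas = np.linspace(0.01, 0.99, 99)
v_grid = np.linspace(-5.0, 5.0, 5001)

# Inner minimization over v for each theta, then the outer maximization
# over theta: the minimax criterion (3.1).
inner = np.array([energy(v_grid, y, th).min() for th in thetas])
theta_hat = thetas[np.argmax(inner)]

# Analytically min_v U = theta*(1-theta)*y^2, which peaks at theta = 0.5.
assert abs(theta_hat - 0.5) < 0.02
```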
The minimax principle is a very conservative estimate, since the optimal model parameters correspond to the worst case energy when the deformable template model has been optimized. This may at first seem far from optimal, but if one recalls that U(v,y,θ) has been minimized over v, it will be seen that the maximization over θ does make sense, while avoiding the problems of θ being excessively low or high [77]. The minimax principle has been applied in [117, 118] to estimate the regularization parameter in their g-snake. Note that the minimax principle is based on the assumption that min_v U(v,y,θ) is convex with respect to θ, which is often not the case [65].
3.2 Maximum Likelihood
Assume a ground truth of n samples, v_0^t, …, v_{n-1}^t, and their corresponding images, y_0^t, …, y_{n-1}^t, has been supplied. Given this training set the model parameters can, in theory, be estimated by the Maximum Likelihood (ML) estimate:

  θ̂ = max_θ L(θ; v_0^t, …, v_{n-1}^t, y_0^t, …, y_{n-1}^t)    (3.2)

where L(θ; v_0^t, …, v_{n-1}^t, y_0^t, …, y_{n-1}^t) is the likelihood function. If
v_0^t, …, v_{n-1}^t and y_0^t, …, y_{n-1}^t are assumed to be stochastically independent the Maximum Likelihood estimate becomes:

  θ̂ = max_θ ∏_{i=0}^{n-1} P(v_i^t | y_i^t, θ)    (3.3)
In [65] and [96, 109, 70] the ML estimator has been used to estimate a subset of the model parameters corresponding to the prior model and all the model parameters, respectively.
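For a Gaussian prior on a scalar template parameter, the maximization in (3.3) has the familiar closed form, which the following sketch checks numerically; the Gaussian model and the synthetic training set are illustrative assumptions, not the thesis's models.

```python
import numpy as np

def ml_estimate(v_train):
    """ML estimate of Gaussian prior parameters (mean, variance) from a
    training set of scalar template parameters; this maximizes the
    product over training samples when the model is Gaussian in v."""
    mu = np.mean(v_train)
    var = np.var(v_train)  # ML uses 1/n, not 1/(n-1)
    return mu, var

def log_likelihood(v_train, mu, var):
    """Sum of log P(v_i | theta); maximizing this equals maximizing
    the product of likelihoods."""
    return np.sum(-0.5 * np.log(2 * np.pi * var)
                  - (v_train - mu) ** 2 / (2 * var))

rng = np.random.default_rng(0)
v_train = rng.normal(1.5, 0.5, size=200)  # synthetic "ground truth"
mu_hat, var_hat = ml_estimate(v_train)
```

Perturbing either parameter away from the closed-form estimate can only decrease the log-likelihood.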
Unfortunately ∏_{i=0}^{n-1} P(v_i | y_i, θ) is not guaranteed to be a convex function with respect to all elements in θ. In fact it is very rare that the likelihood model, P(y | v, θ_l), is convex with respect to the model parameters, θ_l, which relate to the likelihood model, whereas the prior model, P(v | θ_p), usually can be rewritten to become convex with respect to the prior model parameters, θ_p. In fact most prior models are born non-convex, but if the corresponding energy function only contains a quadratic term, Fisker and Carstensen [65] show that the prior model usually corresponds to a multivariate Gaussian, which is convex with respect to θ_p. The importance of this result can be demonstrated by a univariate Gaussian, where the energy corresponds to U(x) = (x − μ)² / (2σ²), which is non-convex w.r.t. σ². Since U(x) corresponds to a Gaussian the normalizing constant, z, is known and the energy can be rewritten to U(x) = (x − μ)² / (2σ²) + log(√(2πσ²)), which is convex w.r.t. σ².
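The effect of the normalizing constant can be checked numerically. With μ fixed at 0 and a single sample x (a deliberately minimal setting), the unnormalized energy decreases monotonically in σ², so its minimizer diverges, while the full energy has a finite minimum at σ² = x²:

```python
import numpy as np

# Energies for a univariate Gaussian as a function of s = sigma^2,
# for a fixed sample x and mu = 0.
def U_unnormalized(x, s):
    return x ** 2 / (2 * s)

def U_full(x, s):
    # Includes the log of the normalizing constant, log sqrt(2*pi*s).
    return x ** 2 / (2 * s) + 0.5 * np.log(2 * np.pi * s)

x = 2.0
s_grid = np.linspace(0.1, 20.0, 2000)
# Without the normalizer the energy keeps decreasing as s grows; with
# it, the minimum sits at s = x^2 (here 4).
s_hat = s_grid[np.argmin(U_full(x, s_grid))]
```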
In general the convexity of ∏_{i=0}^{n-1} P(v_i | y_i, θ) w.r.t. θ_j is closely related to whether the parameter, θ_j, corresponds to a part of the posterior distribution which in isolation corresponds to a proper distribution. If θ_j corresponds to a proper distribution, ∏_{i=0}^{n-1} P(v_i | y_i, θ) is usually convex, whereas it usually is non-convex if θ_j corresponds to an improper distribution.
Note that the ML criterion is most sensible for parametric deformable template models, since this type of model favors a known global structure, which can be interpreted as the mean of the prior.
3.3 Minimum Distance
It is infeasible to perform a ML estimation of the model parameters, θ_i, which correspond to a non-convex likelihood function. In this case we propose to use a minimum distance criterion (same concept but an improved criterion compared to [65]):

  θ̂ = min_θ D(V^t, V̂(θ))    (3.4)

where D(V^t, V̂(θ)) is the distance between V^t = v_0^t, …, v_{n-1}^t and V̂(θ) = v̂_0(θ), …, v̂_{n-1}(θ) for v̂_i(θ) = max_v P(v | y_i^t, θ) being the MAP estimate obtained by running the full initialization and optimization scheme with the actual value of θ. In fact this is a very reasonable criterion, since it corresponds to the overall goal of performing the best segmentation, i.e. the segmentation with the smallest possible distance to the ground truth.
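In practice (3.4) reduces to an outer search over θ with a full segmentation run inside the loop. The sketch below uses a scalar toy "segmentation" (a hypothetical shrinkage estimator, invented for the example) just to make the loop concrete:

```python
import numpy as np

def min_distance_select(segment, thetas, v_truth, images):
    """Minimum distance parameter selection: run the full segmentation
    for each candidate theta and keep the theta whose results lie
    closest to the ground truth."""
    best_theta, best_dist = None, np.inf
    for theta in thetas:
        v_hat = [segment(y, theta) for y in images]
        dist = np.mean([abs(vh - vt) for vh, vt in zip(v_hat, v_truth)])
        if dist < best_dist:
            best_dist, best_theta = dist, theta
    return best_theta, best_dist

# Toy "segmentation": an estimate shrunk towards 0, with theta acting
# as a regularization weight (purely illustrative).
def segment(y, theta):
    return y / (1.0 + theta)

images = [1.0, 2.0, 3.0]    # stand-ins for observed data
v_truth = [0.9, 1.8, 2.7]   # ground truth parameters
thetas = np.linspace(0.0, 1.0, 101)
theta_hat, d_hat = min_distance_select(segment, thetas, v_truth, images)
```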
3.3.1 Distance Functions and Performance Measures
To apply the minimum distance criterion the distance has to be defined. Unfortunately there is no common standard for performance evaluation of deformable template models, hence there is no common function to evaluate the distance between the ground truth and the segmentation. This is very unfortunate, since it makes it impossible to compare the performance of different models.
The most popular distance measures used with deformable models are summarized below (θ is omitted for simplicity). Note that D(V^t, V̂) and D(v^t, v̂) correspond to the distance between two training sets and the distance between two samples, respectively. Most of the distance measures assume two finite point sets, U^t = (u_0^t, …, u_{p-1}^t) and Û = (û_0, …, û_{q-1}), can be obtained given v^t and v̂, respectively. Usually these points lie on the boundary.
Template parametererrors
Asimpledistance measureusedin [65]isthesumof weighted
squareerrors betweenthetemplateparameters:
D
sq (V
t
;
^
V)= 1
nl n 1
X
j=0 (v
t
j
^ v
j )
0
C(v t
j
^ v
j
) (3.5)
wherelisthenumberofparametersinvandC isasymmetric
weight matrix. C isusuallythe identitymatrix.
Point to point errors
Given two nite point sets, U t
and
^
U, with unknown corre-
spondenceandthesamenumberofelementstheaveragepoint
to pointerrorwith linearreparameterizationreported in[163 ]
isdened as:
D
ptpt (v
t
;
^
v)= min
0s0<k 1
k k 1
X
i=0 ku
t
i
^ u
i+s
0
k (3.6)
wherek =p=q and s
0
is theoset.
Another point to point error with known correspondence used in [43] is the root mean square (RMS) point to point error defined as:

  D_ptpt(V^t, V̂) = √( 1/(nk) ∑_{j=0}^{n-1} ∑_{i=0}^{k-1} ‖u_{j,i}^t − û_{j,i}‖² )    (3.7)
A third point to point error measure is defined by:

  D_ptpt(v^t, v̂) = 1/k ∑_{i=0}^{k-1} min_{u_j^t ∈ U^t} ‖u_j^t − û_i‖    (3.8)

In [60] the mean and standard deviation for this measure are reported for the full test set, V. Note that this criterion makes no assumption about point correspondence or identical size of the two point sets.
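The point to point measures (3.7) and (3.8) are straightforward to compute; a minimal sketch for 2-D point sets (the unit-square example is invented for illustration):

```python
import numpy as np

def rms_point_to_point(U_t, U_hat):
    """RMS point to point error for one pair of corresponding point
    sets of shape (k, 2)."""
    return np.sqrt(np.mean(np.sum((U_t - U_hat) ** 2, axis=1)))

def mean_nearest_point_error(U_t, U_hat):
    """Correspondence-free point to point error: for each estimated
    point take the distance to the nearest ground truth point."""
    d = np.linalg.norm(U_t[:, None, :] - U_hat[None, :, :], axis=2)
    return np.mean(d.min(axis=0))

square = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
shifted = square + np.array([0.1, 0.0])  # estimate offset by 0.1
```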
Undirected Hausdorff distance
Given two finite point sets, U^t and Û, the directed Hausdorff distance is defined as [97]:

  h(v^t, v̂) = max_{u_i^t ∈ U^t} min_{û_j ∈ Û} ‖u_i^t − û_j‖    (3.9)

The undirected Hausdorff distance is then defined as [97]:

  D_H(v^t, v̂) = max( h(u^t, û), h(û, u^t) )    (3.10)

The average and standard deviation of the undirected Hausdorff distance for the full set, V^t, is reported in [60].
Undirected partial Hausdor distance
Giventwonitepointsets,u t
andu,^ thepartialdirectedHaus-
dor distance isdened as[124]:
h
K (u
t
;u)^ = K th
u t
i 2U
t min
^ u
j 2
^
U ku
t
i
^ u
j
k (3.11)
where K is a quantile of the maximum distance. The undi-
rected distance is then dened using (3.10). This measure is
reported in [124 ] for two dierent samples and two dierent
quantiles,K.
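The directed, undirected and partial Hausdorff distances (3.9)-(3.11) differ only in whether the maximum or a quantile of the nearest-point distances is taken, which the sketch below exploits (the point sets are invented for illustration):

```python
import numpy as np

def directed_hausdorff(A, B, quantile=1.0):
    """Directed (partial) Hausdorff distance: for each point in A take
    the distance to the nearest point in B, then take the maximum
    (quantile=1.0) or the K-th quantile of these nearest distances."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return np.quantile(nearest, quantile)

def undirected_hausdorff(A, B, quantile=1.0):
    """Undirected (partial) Hausdorff distance."""
    return max(directed_hausdorff(A, B, quantile),
               directed_hausdorff(B, A, quantile))

A = np.array([[0., 0.], [1., 0.], [2., 0.]])
B = np.array([[0., 0.], [1., 0.], [2., 3.]])
```

A single outlier point dominates the full Hausdorff distance but is discounted by the partial (quantile) variant.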
Normal displacement.
Given two finite point sets, u^t and û, with correspondence and connectivity information the squared normal displacement is defined as [18]:

  D_n(v^t, v̂) = 1/k ∑_{i=0}^{k-1} ( (u_i^t − û_i) · n(û_i) )²    (3.12)

where n(û_i) is the unit normal at û_i. This distance measure is extensively used in [18].
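A minimal sketch of (3.12), assuming the unit normals of the estimated boundary are supplied; note how the tangential part of the error is ignored:

```python
import numpy as np

def normal_displacement(U_t, U_hat, normals):
    """Squared normal displacement: project each point error onto the
    unit normal of the estimated boundary and average the squares."""
    proj = np.sum((U_t - U_hat) * normals, axis=1)
    return np.mean(proj ** 2)

# Points on a horizontal line; ground truth is offset diagonally, but
# only the vertical component counts through the (0, 1) normals.
U_hat = np.array([[0., 0.], [1., 0.], [2., 0.]])
U_t = U_hat + np.array([0.3, 0.2])     # tangential 0.3, normal 0.2
normals = np.tile([0., 1.], (3, 1))
```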
Point to associated boundary.
Given a finite point set, û, and a piecewise continuous template curve, r(s, v^t), of the ground truth the RMS point to associated boundary error reported in [43] is defined by:

  D_pt-b(V^t, V̂) = √( 1/(nk) ∑_{j=0}^{n-1} ∑_{i=0}^{k-1} min_{s_i ∈ Γ_i} ‖û_{j,i} − r(s_i, v_j^t)‖² )    (3.13)

where Γ_i is the part of the boundary of r(s, v) which is associated with u_i.
Area error
The relative area error is defined as:

  D_A(v^t, v̂) = A(v̂) / A(v^t)    (3.14)

where A(v) is the area. In [60] the mean and standard deviation of the relative area is reported.
Labeling error
The relative labeling error is defined as:

  D_L(v^t, v̂) = N_incor(v^t, v̂) / N_Pixels(v^t)    (3.15)

where N_incor is the number of incorrectly labeled pixels and N_Pixels is the total number of pixels. The mean and standard deviation of this measure is also given in [60].
Texture error
Given a texture model, y(r, c, v̂), the RMS texture error reported in [43] is defined by:

  D_t(V^t, V̂) = √( 1/(n|Ω_y|) ∑_{j=0}^{n-1} ∑_{(r,c) ∈ Ω_y} ( y_j^t(r, c) − y(r, c, v̂_j) )² )    (3.16)
Figure 3.1: Examples of distance measures (point to point, point to associated border and normal displacement).
where |Ω_y| is the number of members in Ω_y and Ω_y is the set of image coordinates, (r, c), which correspond to the model. Alternative texture metrics can be found in [152].
Some of the distance measures are illustrated in Figure 3.1. Note that special cases exist where a number of the measures above are identical.
3.3.2 The "Best" Distance Function
It is impossible to select the overall "best" distance function, since it depends on the actual application and model. Still some distance functions seem more attractive than others.
From a general point of view the template parameter error is less attractive, since it is very sensitive to the parameterization and to different types of template parameters.
Points on the boundary of a deformable model rarely correspond to fixed landmarks, and the points tend to move along the boundary in most applications. If the model does not have a point based representation, points need to be created, which is very sensitive to the parameterization. Overall this makes the point to point based measures seem less attractive.
To avoid problems with points moving along the boundary, information about the boundary needs to be included. The only measures which satisfy this are the normal displacement and the point to associated boundary. These criteria basically approximate:

  D(v̂, v^t) = 1/L ∫_0^1 ‖r(s, v̂) − r(g(s), v^t)‖² ds    (3.17)

where g(s) is a reparameterization function defined such that g(s) = s gives the original inherited parameterization, mapping points on the curve r(s, v̂) to points on r(g(s), v^t). Unfortunately the true g(s) is impossible to construct. Note that simple numerical approximations of (3.17) are only possible if the error is not squared.
The normal displacement relies on two point sets with established correspondences and on the calculation of the normal. This makes the normal displacement seem less attractive than the point to associated boundary. The relative labeling error and relative area error are not very informative due to their low specificity, and the texture error only applies to texture based models, which makes these criteria seem less interesting in the general context.
With respect to minimum distance estimation the sum of squared point to associated boundary errors seems like the best overall distance function. The squared error is selected since this is the common fitting criterion. In the case of large deviations between the shapes a symmetric or undirected point to associated boundary error could be considered, e.g. by summing the directed errors.
3.3.3 The "Best" Performance Measures
The best distance function for minimum distance estimation is not necessarily the best performance measure. Still the arguments for the point to associated boundary error are valid, but the sum of squared errors seems harder to interpret for the user than the error itself. In fact it seems more relevant to show the full histogram of the point to associated boundary errors, or to illustrate the average errors at the different positions of the object, than to obtain a single number. If a single user friendly number is needed, the mean or a proper quantile of the histogram could be used. However this is a fundamental problem of model fitting and residual analysis, where there is no unique solution.
3.4 Other Techniques
Larsen et al. [121] propose a two step procedure to select the elasticity parameters of a snake based on upper and lower bounds of the parameters. However this method seems limited to snakes.
Amini et al. [4] discuss the cross validation method proposed by Wahba et al. [46, 181]. This is a popular method for estimation of the regularization parameter in the context of regression and splines, see e.g. [160]. The optimal parameter is found by minimizing the average discrepancy between the data samples and the predicted values using cross validation. For an elaborate treatment we refer to [180]. However the methods are mainly focused on equations of the form 1/n ∑_{i=1}^n (y_i − f(x_i))² + λ ∫_a^b (f^(m)(x))² dx, which makes them less interesting in the context of deformable template models. Another problem is that only a limited group of deformable template models can make a reasonable prediction of a removed template parameter, v_i.
3.5 Minimal Manual Interaction
The manual task of creating training sets is very cumbersome, and several ways of easing it have been proposed. One approach addresses the problem of few training samples by generating artificial samples from a FEM model. However this is basically not different from imposing an artificial covariance structure as done in e.g. the snake. The only difference is how the covariance structure is created.
A large number of semi-automatic methods have been proposed to reduce the needed manual interaction. Most of these methods assume a set of densely sampled object outlines is supplied. Note that it is much easier to create object outlines than actually placing e.g. landmarks. The output from the algorithms is usually a set of subsampled and registered object outlines. Examples of these types of methods are [6, 59, 62, 157, 182].
An alternative strategy is proposed by Fisker et al. [65, 70] where the problem is solved by applying the Expectation-Maximization (EM) algorithm [51]. In [65] no user interaction is needed, whereas one example shape is required in [70].
3.6 Conclusion and Discussion
In this chapter two quite general methods for estimation of the model parameters in deformable models are presented. Both methods can be applied by a non-expert user.
The first method is based on the minimax criterion and has the important feature of not requiring any user interaction. However it still seems very counterintuitive to choose the parameter corresponding to the worst case energy. Another problem is that the minimax criterion relies on an assumption of convexity, which is not always fulfilled.
The second method for parameter estimation is based on a combination of a ML estimate and a minimum distance criterion. The ML criterion should be applied if possible, since it is much easier to optimize than the minimum distance criterion. As a rule of thumb the ML criterion can be applied to estimate a parameter when the corresponding part of the posterior in isolation corresponds to a proper distribution. This criterion has the drawback of requiring a training set, which can be a cumbersome task to create manually. However a number of methods exist which reduce or remove the need for manual interaction.
We also review a large number of distance and performance measures. Based on the previous discussion we suggest that the most reasonable overall distance measure is the point to associated border. But it should be stressed that the choice of distance and performance measures is highly dependent on the application and the model.
Note that experimental results for parameter estimation can be found in [65, 70].
Chapter 4
Initialization
To make inference about an object in an image y, estimation of the maximum a posteriori (MAP) is performed:

  v̂ = max_v P(v|y)    (4.1)

The MAP estimation is with few exceptions carried out using an optimization procedure, which presumes an initial configuration of the template parameters. This basically separates the MAP estimation into two steps: initialization and optimization. This chapter is dedicated to initialization and the following to optimization.
The requirement for the initial configuration is that it should be close enough to the object in the image to achieve successful refinement during the optimization. Obviously the initialization is crucial for the success of the segmentation, since the optimization fails if the initial configuration is not close enough to the object. A general initialization scheme is also essential for a fully automated framework, which can be applied to a new problem by a naive user. Still initialization has received very little attention in the literature and many authors do not even comment on how they initialize.
In the following we classify the different initialization strategies into three classes: manual, feature based and model based initialization. Even though manual initialization is very popular, we will not treat this topic further.
4.1 Feature Based Initialization
Feature based initializationextracts some features from the image.
Thesefeaturesarethenusedforinitializationofthetemplate. Most
featurebasedmethodsareadhoc methodstunedto aspecic prob-
lem and are of no general interest. Unfortunately this is how most
automatic schemes work. Stillthere exists a few interestingfeature
based strategies,whicharemore orless general.
4.1.1 Initialization From Moments
Blakeet al. [18 ] proposeto performtheinitializationfrom invariant
moments of a binary blob. The binary blob is obtained by thresh-
olding the image or a preprocessed version. A prototype shape is
thenalignedfromthesemomentswithrespecttoEuclideanoraÆne
similarities. Unfortunately binary blobs are not obtainable in the
general case, butinitializationfrom moments still seems like a fast
and eÆcient methodfora rangeof problems.
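A sketch of the idea: position, area and orientation of a binary blob can be read off its first and central second moments, which then fix the pose of a prototype shape (the synthetic mask below is invented for the example):

```python
import numpy as np

def blob_pose(mask):
    """Estimate position, area and orientation of a binary blob from
    its first and central second moments."""
    r, c = np.nonzero(mask)
    area = r.size
    cr, cc = r.mean(), c.mean()
    # Central second moments (mu20 over columns, mu02 over rows).
    mu20 = np.mean((c - cc) ** 2)
    mu02 = np.mean((r - cr) ** 2)
    mu11 = np.mean((r - cr) * (c - cc))
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    return (cr, cc), area, theta

mask = np.zeros((40, 40), dtype=bool)
mask[18:22, 5:35] = True      # a horizontal bar
(cr, cc), area, theta = blob_pose(mask)
```

For the horizontal bar the centroid lands in its middle and the estimated orientation is zero.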
4.1.2 Texture and Color.
Zhong and Jain [190, 191] propose to use local texture and color features for initialization. From a prototype image of the object local texture and color are extracted. To locate a new object, regions with similar color and texture patterns are identified and used for initialization. To speed up the processing, texture and color features are extracted directly from the compressed domain of the discrete cosine transform.
4.1.3 Generalized Hough Transform.
Lai and Chin [118] and Garrido and Blanca [72, 73] propose to perform the initialization using the generalized Hough transform [11] and a modified version of the generalized Hough transform, respectively. A prototype shape is supplied as the arbitrary shape to detect. Recall that the generalized Hough transform is invariant with respect to Euclidean transformations. Even though the generalized Hough transform is a powerful method, it is not a general initialization method since it relies on gradient information.
4.2 Model Based Initialization.
In model based initialization the full posterior model, P(v|y), or a part of it is applied. Staib et al. [163] use the mode of the prior distribution, P(v), for initialization, i.e. the configuration of the template parameters which has the highest prior probability, usually the mean shape. Unfortunately this method is only applicable for a few models, since most prior models are invariant to Euclidean transformations.
4.2.1 Search Based Initialization.
The only truly general initialization approach, which we have knowledge about, is the search strategy [61, 102, 164, 66, 67, 68, 70, 165], where a sparse search is performed in the parameter space Ω_v. The static search strategy used by Jain et al. [102] and Fisker et al. [66, 67, 68, 70] can be summarized as:
1. Create relevant search configurations v_0, …, v_{k-1} and set i = 0.
2. Calculate P(y|v_i) for the center of the template corresponding to a grid of positions within the relevant region of interest.
3. Calculate P(v_i|y) = P(v_i) P(y|v_i).
4. i = i + 1. Go to 2 if i < k.
5. Extract the initial configurations from the calculated P(v|y).
Note that the search configurations include rotation, scaling and shape changes but not translation. We refer to [70] for a discussion on how to create the relevant search configurations, select the relevant region of interest and extract the initial configurations from P(v|y).
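The static search amounts to an exhaustive scoring loop over configurations and grid positions; the toy posterior below is invented for the example and stands in for P(v)P(y|v):

```python
import numpy as np

def static_search(posterior, configs, positions, n_best=1):
    """Static search initialization: score every search configuration
    at every grid position with the (unnormalized) posterior and
    return the best scoring (configuration, position) pairs."""
    scores = []
    for v in configs:
        for pos in positions:
            scores.append((posterior(v, pos), v, pos))
    scores.sort(key=lambda t: t[0], reverse=True)
    return scores[:n_best]

# Toy posterior peaked at configuration 2 and position (5, 7).
def posterior(v, pos):
    return np.exp(-(v - 2) ** 2) * np.exp(-((pos[0] - 5) ** 2
                                            + (pos[1] - 7) ** 2))

configs = [0, 1, 2, 3]
positions = [(r, c) for r in range(10) for c in range(10)]
best = static_search(posterior, configs, positions, n_best=1)
```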
Cootes et al. [61] and Stegmann et al. [165] use an optimization based variant of the search strategy, which can be summarized as:
1. Create relevant search configurations v_0, …, v_{k-1} and set i = 0.
2. Perform limited MAP estimation, v̂_i = max_{v_i} P(v_i|y), for the center of the template corresponding to a grid of positions within the relevant region of interest. The limited MAP estimation is defined as using less than q iterations in the iterative optimization scheme.
3. i = i + 1. Go to 2 if i < k.
4. Extract the initial configurations from the calculated P(v̂|y).
Note that the static search is a special case (q = 0) of the optimization based search.
4.2.2 Very Fast Static Search.
The disadvantage of the search strategy is the computational cost, which mainly originates from the evaluations of P(v|y). In the case of a static search Fisker et al. [68] propose a very fast search algorithm, which exploits the fact that many likelihood models can be interpreted as a simple correlation of a filter, f_l(v, θ), and the image, y. Recall that in the static search a static template, v_i, is shifted around the image and P(y|v) is calculated for each position in the grid, usually corresponding to each pixel. Using the filter interpretation all these calculations can be performed by one convolution of the filter f_l(v_i, θ) and y. Performing the convolution in Fourier space reduces the computation time further and makes the computation time invariant to the size of the filter/template [70]. Fisker et al. [68, 70] report significant reductions in computation time using this approach. We refer to [68, 70] for details on the fast search strategy and experimental results.
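The filter interpretation can be sketched with an FFT based correlation: one transform pair scores a static template at every pixel position simultaneously (circular boundary handling and the planted-template test image are simplifying assumptions of this sketch):

```python
import numpy as np

def correlate_fft(image, filt):
    """Score a small filter against the image at every position via the
    FFT; one pass replaces per-position evaluation of the likelihood in
    a static search (circular boundary handling)."""
    H, W = image.shape
    padded = np.zeros((H, W))
    padded[:filt.shape[0], :filt.shape[1]] = filt
    # Correlation = IFFT( FFT(image) * conj(FFT(filter)) ).
    return np.real(np.fft.ifft2(np.fft.fft2(image)
                                * np.conj(np.fft.fft2(padded))))

rng = np.random.default_rng(1)
image = rng.normal(0, 0.1, (64, 64))
filt = np.ones((5, 5))
image[20:25, 30:35] += 1.0           # plant the template in the image
scores = correlate_fft(image, filt)
peak = np.unravel_index(np.argmax(scores), scores.shape)
```

The correlation peak recovers the planted position, and the cost is independent of the filter size.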
4.3 Multi-hypothesis Initialization.
For hard initialization problems false configurations might be extracted in the initialization (or a number of true configurations are missed). Examples of hard initialization problems are many objects in the same image with large changes in scale and orientation, overlapping objects and varying background.
The simple solution is to perform a denser search. However this can be very computationally expensive. An alternative strategy is to work with multiple object hypotheses, i.e. extract all reasonable initial configurations, accepting that some of them are false hypotheses. These initial configurations are then fed to the optimization algorithm. Finally a validation is performed to remove the false objects. Examples of multi-hypothesis initialization can be found in [61, 66, 70, 165].
4.4 Conclusion and Discussion.
In this chapter a number of initialization strategies have been reviewed. The only strategy which generalizes to any deformable template (with an explicit optimization criterion) is the search strategy. Two types of search strategies are presented: static search and optimization based search. The choice of search depends on the actual problem, model, needed search configurations etc. But as a rule of thumb the static search should be applied to quite simple initialization problems, since it is faster, whereas the optimization based search should be applied in the harder cases.
An alternative to the search based initialization is the feature based initialization. Feature based strategies become attractive when the computational cost of a search strategy becomes too high, e.g. for objects with large variation in scale and orientation where a huge number of search configurations are needed for a successful initialization.
Given one object in an image the global optimum of P(y|v) corresponds to this object, if the model is properly designed. In this case the search based initialization guarantees successful initialization, since it is only a question of how dense the search needs to be to obtain an initial configuration which achieves successful refinement during the optimization. In general a search based approach is theoretically more appealing, since a unified model is used for initialization and optimization, whereas the feature based approach applies one model for initialization and another model for the optimization/fitting. It is hard to see theoretical arguments for applying two different models. It could be argued that if some feature is relevant for initialization it should be relevant for the optimization as well and vice versa. Basically there is only one argument for using feature based initialization and that is speed.
Chapter 5
Optimization
The final optimization is typically a medium-to-high-dimensional optimization problem. In most cases the problem can be categorized as continuous, unconstrained optimization of a nonlinear, non-convex function. The problem is continuous because the template parameters are real, i.e. Ω_v ⊆ R^n where n is the number of template parameters, and unconstrained because most authors do not work with hard constraints.
The following sections contain a compact survey of the most popular optimization methods used with deformable template models. Recall v̂ = max_v P(v|y) = min_v U(v, y). In practice the MAP estimation is always performed as min_v U(v, y), hence we will only deal with this formulation in the following.
5.1 Deterministic Optimization Methods
5.1.1 Gradient Based Methods
In the optimization literature the gradient based methods receive considerable attention. Gradient based methods rest on the computation of the partial derivatives of the posterior energy, ∂U(v, y)/∂v. These derivatives are then more or less cleverly applied in the search for the optimum. In the simple case of steepest descent the optimization is performed by iteratively updating the parameters in the direction of the negative gradient, i.e. dv_t ∝ −∇U(v_t, y). The gradient based approaches are very popular with deformable template models, see e.g. [33, 72, 101, 102, 108, 163, 189].
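A minimal steepest descent sketch for min_v U(v, y), using a toy quadratic energy with a known minimum (the energy and step size are illustrative assumptions):

```python
import numpy as np

def steepest_descent(grad, v0, step=0.1, n_iter=100):
    """Minimize U by iteratively stepping along the negative gradient."""
    v = np.asarray(v0, dtype=float)
    for _ in range(n_iter):
        v = v - step * grad(v)
    return v

# Toy posterior energy U(v) = ||v - v*||^2 with minimum at v* = (1, -2),
# so grad U(v) = 2 (v - v*).
target = np.array([1.0, -2.0])
grad_U = lambda v: 2 * (v - target)
v_hat = steepest_descent(grad_U, [0.0, 0.0], step=0.1, n_iter=200)
```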
5.1.2 Dynamic Programming
Assume the posterior can be separated into dependent subproblems:

  U(v, y) = U_0(v_0, …, v_l, y) + U_1(v_1, …, v_{l+1}, y) + … + U_{k-l-1}(v_{k-1-l}, …, v_{k-1}, y)    (5.1)

where l is a (small) integer. Then the optimization can be performed very efficiently by dynamic programming. Dynamic programming works by solving the separated energy terms U_i(v_i, …, v_{i+l}, y) recursively given v_{i+1}, …, v_{i+l} are fixed. Dynamic programming provides the machinery for enforcing hard constraints and guarantees the optimal solution within the resolution of the discretization. For a more elaborate treatment of dynamic programming see e.g. [13, 44]. In the context of deformable template models dynamic programming was introduced by Amini et al. [4]. Applying dynamic programming for optimization of deformable template models has been investigated and further developed in [28, 45, 74, 110, 113, 170].
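For the chain-structured case l = 1 with discretized template parameters, the recursion can be sketched as a Viterbi-style sweep; the unary/pairwise decomposition below is a toy instance of (5.1), not a snake energy from the literature:

```python
import numpy as np

def dp_chain(unary, pairwise, states):
    """Exact minimization of sum_i unary[i](v_i) + sum_i pairwise(v_i,
    v_{i+1}) over a chain of discrete states by dynamic programming."""
    k, n = len(unary), len(states)
    cost = np.array([unary[0](s) for s in states], dtype=float)
    back = np.zeros((k, n), dtype=int)
    for i in range(1, k):
        new_cost = np.empty(n)
        for b, sb in enumerate(states):
            trans = [cost[a] + pairwise(sa, sb)
                     for a, sa in enumerate(states)]
            back[i, b] = int(np.argmin(trans))
            new_cost[b] = trans[back[i, b]] + unary[i](sb)
        cost = new_cost
    # Backtrack the optimal configuration.
    path = [int(np.argmin(cost))]
    for i in range(k - 1, 0, -1):
        path.append(back[i, path[-1]])
    path.reverse()
    return [states[j] for j in path], float(cost.min())

# Toy problem: data pulls each v_i towards d_i, smoothness penalizes jumps.
data = [0.0, 0.0, 5.0, 0.0]
unary = [lambda s, d=d: (s - d) ** 2 for d in data]
pairwise = lambda a, b: 2.0 * (a - b) ** 2
states = list(range(6))       # v_i discretized to {0, ..., 5}
v_opt, e_opt = dp_chain(unary, pairwise, states)
```

The sweep costs O(k n²) instead of the O(n^k) of exhaustive search, and the result is globally optimal within the discretization.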
5.1.3 Dierence Decomposition
In difference decomposition, proposed by Gleicher [81] and in a similar approach proposed in parallel by Cootes et al. [35, 40], the template parameters, v, are updated based on the difference, D, between the image synthesized by the model and the actual image. The method is only applicable for deformable template models with a full pixel model. By displacing the parameters in a training set with ground truth, the relationship between displacements, dv, and difference images, D, is learned. The relationship is approximated by a linear regression, dv = R D, where the matrix R is obtained from the training. The main difference between the approach in [81] and in [35, 40] is how R is determined. Difference decomposition has also been applied in [100, 156].
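The training step can be sketched with synthetic data: if a hypothetical linear imaging model generates difference images from displacements, least squares recovers an update matrix R such that dv = R D (all matrices below are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear imaging model: a parameter displacement dv
# produces a difference image D = A dv.
A = rng.normal(size=(50, 3))           # pixels x parameters
dv_train = rng.normal(size=(200, 3))   # known training displacements
D_train = dv_train @ A.T               # corresponding difference images

# Learn the update matrix R by linear regression dv = R D
# (least squares over the training set).
X, *_ = np.linalg.lstsq(D_train, dv_train, rcond=None)
R = X.T                                # so that dv = R @ D

# At run time a difference image maps straight to a parameter update.
dv_true = np.array([0.5, -1.0, 0.25])
D = A @ dv_true
dv_pred = R @ D
```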
5.1.4 Heuristics
Beside the above mentioned methods numerous heuristics have been applied. The heuristics are usually tuned to a specific model, but can be very efficient. Examples can be found in [41, 60, 118].
5.2 Stochastic Optimization Methods
5.2.1 Genetic Algorithms
Genetic algorithms [82, 93] employ mechanisms similar to evolution in nature. Genetic algorithms maintain a population of templates. This population evolves towards the optimal solution by producing new generations of templates, where the survival of the individual template is based on the corresponding posterior probability. A new generation is born by mutating the existing population in a probabilistic fashion, e.g. by random crossover. We refer to e.g. [82, 135] for a more rigorous treatment of genetic algorithms. Examples of applying genetic algorithms to deformable template models can be found in [12, 50, 91, 128].
5.2.2 Simulated Annealing
Simulated annealing [27, 112] can be compared to a physical procedure, called annealing, in which a physical substance is melted and then slowly cooled in search of a low energy configuration. The system is controlled by a decreasing temperature, T_t, and new configurations are generated by a random search method, such as the Metropolis algorithm [134] or the Gibbs sampler [75]. For high T_t a new configuration with a higher energy, U(v, y), might be accepted, whereas at low T_t only configurations corresponding to a decrease in U(v, y) are likely to be accepted. The expectation is that large early fluctuations will allow rapid escape from shallow minima, whereas the later behaviour will act as a greedy algorithm descending into the global minimum. Popular temperature schemes are: T_t = κ T_{t-1} [112] and T_t = c / log(1 + t) [75], where κ and c are constants. Analytic conditions have been derived for the convergence towards the global optimum for a number of Markov Random Field models, see [75, 88]. Examples of the use of simulated annealing in the context of deformable models can be found in [23, 85, 155, 167, 68, 70].
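A Metropolis-style sketch with the geometric temperature schedule; the energy below is a toy quadratic with a small ripple, invented for the example, not a deformable template energy:

```python
import numpy as np

def simulated_annealing(U, v0, T0=2.0, kappa=0.995, n_iter=4000, seed=0):
    """Minimize U by Metropolis-style simulated annealing with a
    geometric temperature schedule T_t = kappa * T_{t-1}."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=float)
    e, T = U(v), T0
    best_v, best_e = v.copy(), e
    for _ in range(n_iter):
        cand = v + rng.normal(0, 0.5, size=v.shape)  # random proposal
        e_cand = U(cand)
        # Accept uphill moves with probability exp(-dE / T).
        if e_cand < e or rng.random() < np.exp(-(e_cand - e) / T):
            v, e = cand, e_cand
            if e < best_e:
                best_v, best_e = v.copy(), e
        T *= kappa
    return best_v, best_e

def U(v):
    v = np.asarray(v, dtype=float)
    return float(np.sum((v - 2.0) ** 2 + 0.5 * np.sin(3.0 * v)))

best_v, best_e = simulated_annealing(U, [4.0, -3.0])
```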
5.2.3 Stochastic Diusion
Stochastic diffusion for optimization [3, 76] is based on the stochastic differential equation:

  dv_t = −∇U(v_t, y) dt + √(2 T_t) dw_t    (5.2)

where w_t is the standard Brownian motion with independent identically distributed components and the temperature, T_t, controls the magnitude of the random fluctuations. (5.2) can be interpreted as a steepest descent algorithm with random fluctuations in the gradient. The annealing approach ensures large fluctuations for high temperatures, whereas the later behaviour will be a steepest descent. Given special assumptions about U(v, y), global convergence has been derived, see [76, 79]. Stochastic diffusion has also been applied to the optimization of deformable template models.
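A discretized sketch of (5.2): an Euler step of the drift plus annealed Brownian increments (a Langevin-type update). The logarithmic cooling schedule and the toy quadratic energy are illustrative assumptions:

```python
import numpy as np

def stochastic_diffusion(grad_U, v0, T0=1.0, n_iter=5000, dt=0.01, seed=0):
    """Discretized stochastic diffusion: steepest descent plus annealed
    Brownian noise (Langevin-type updates)."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=float)
    for t in range(n_iter):
        T = T0 / np.log(2.0 + t)              # logarithmic cooling
        v = (v - grad_U(v) * dt
             + np.sqrt(2 * T * dt) * rng.normal(size=v.shape))
    return v

# Toy energy U(v) = ||v - v*||^2, so grad U(v) = 2 (v - v*).
target = np.array([1.0, -1.0])
grad_U = lambda v: 2 * (v - target)
v_hat = stochastic_diffusion(grad_U, [5.0, 5.0])
```

At the final temperatures the iterate fluctuates in a small neighbourhood of the minimum rather than converging exactly, which is the expected behaviour of the annealed diffusion.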
5.3 Comparative Study
In a comparative study by Fisker et al. [68] the performance of three gradient based methods, Simulated Annealing and Pattern Search [94] is compared for a simple ellipse model. With respect to the final energy the study shows that all methods perform almost equally well within a reasonable number of function evaluations, but if a large number of function evaluations is allowed, simulated annealing obtains very good energies. When the speed of convergence is compared, the gradient based methods are fastest during the later phase of the optimization. When the initial configuration is further away from the final solution, the relative performance of Pattern Search and Simulated Annealing increases. Basically all these results are in very good agreement with the expected behaviour.
5.4 Conclusion and Discussion
The task of optimization is not only a question of obtaining the minimal energy configuration, but also a question of convergence speed. If the objective is to obtain the minimal energy, a method with good global convergence properties should be selected, i.e. dynamic programming, simulated annealing or stochastic diffusion. If concerned with the speed of convergence, a deterministic method should be applied, since the stochastic methods in general suffer from slow convergence. Dynamic programming satisfies both objectives; however, it requires the separation of the energy into subproblems. In general the choice of optimization is a classical tradeoff between accuracy and speed, and it is also highly dependent on the actual model. Note that the computational cost and robustness of the optimization can be improved by a multi-resolution strategy [42, 102].
Another important aspect for the choice of optimization algorithm is the general quality of the initialization. If the initialization is close to the final solution, a simple deterministic method will usually suffice. In general it is our experience that a good initialization often is more important for the performance than the actual choice of optimization algorithm. This conclusion is also supported by the empirical results in [68].
Another important, but often forgotten, aspect for obtaining successful optimization is the quality of the optimization criterion. A properly designed optimization criterion should be smooth with few local optima. The prior plays an important role in obtaining a smooth and convex optimization criterion, since most prior models consist of a quadratic term and hence are born smooth and convex. Jain et al. [102] utilize this by gradually reducing the smoothness of the optimization criterion by down-weighting the influence of the prior during the optimization process.