Making Deformable Template Models Operational

Rune Fisker

Lyngby 2000

IMM-PHD-2000-77

IMM

ISSN 0909-3192

Copyright © 2000 by Rune Fisker
Printed by IMM, Technical University of Denmark
Preface

This thesis has been prepared at the Section for Image Analysis, Department of Mathematical Modelling, Technical University of Denmark. It is a partial fulfilment of the requirements for the degree of Ph.D. in engineering.

The thesis is within the field of image processing and computer vision. The subject of the thesis is deformable template models, with special focus on the common difficulties related to this type of model.

Reading this thesis requires a basic knowledge of image processing, statistics and optimization.
Acknowledgement

I'm grateful to a large number of people for help, fruitful discussions and great company. First of all I would like to thank my supervisors Jens Michael Carstensen and Knut Conradsen for their supervision.

I would also like to thank previous and present members of the Section for Image Analysis, especially Lars Pedersen, Klaus B. Hilger, Rasmus Larsen, Per R. Andresen, Claus Gramkow, Bjarne Ersbøll, Nette Schultz, Allan A. Nielsen and Johan Doré. I also had excellent discussions with Mikkel B. Stegmann and Henrik Aanæs.

I'm also in debt to Anil K. Jain and the rest of the people at the PRIP lab for letting me stay with them at Michigan State University, where I had an inspiring visit. I would also like to thank the Ingeniør Valdemar Selmer Trane og hustru Elisa Tranes Foundation for the financial support which made this visit possible.

I would also like to thank the external opponents at the defense of this thesis, Dr. Tim Cootes and Professor Milan Sonka, for attending and for interesting discussions on the presented topics.

Many thanks to the secretaries Helle Welling and Mette Eltzholtz for their help, and to the librarian Finn K. Christensen, who never complained about obtaining just another article or book for me.

Finally I'm very grateful to my lovely wife Helle and daughter Cecilie, who always supported me and never complained when deadlines made me a less common sight.

Lyngby, November 1, 2000

Rune Fisker
Abstract

Deformable template models are a very popular and powerful tool within the field of image processing and computer vision. This thesis treats this type of model extensively, with special focus on handling their common difficulties, i.e. model parameter selection, initialization and optimization. A proper handling of these common difficulties is essential for making the models operational for a non-expert user, which is a requirement for intensifying and commercializing the use of deformable template models.

The thesis is organized as a collection of the most important articles published during the Ph.D. project. To put these articles into the general context of deformable template models, and to pass on an overview of the deformable template model literature, the thesis starts with a compact survey of the deformable template model literature with special focus on representation, model parameter estimation, initialization, optimization and performance measures. The original articles, aligned a bit in notation and corrected for discovered spelling errors and other typos, are enclosed in the appendices.

Compared to the literature, one contribution is a general scheme for estimation of the model parameters, which applies a combination of a maximum likelihood and a minimum distance criterion. Another contribution is a very fast search-based initialization algorithm using a filter interpretation of the likelihood model. These two methods can be applied to most deformable template models, making a non-expert user able to use the model.

A comparative study of a number of optimization algorithms is also reported. In addition, a general polygon-based model, an ellipse model and a textile model are proposed, and a number of applications have been solved. Finally, the Grenander model and the Active Appearance Model have been explored and some extensions are presented.
Publications

This section lists the publications produced during the Ph.D. period. For the enclosed articles the corresponding appendix and reference are given. If an enclosed article is cited in the survey, the reference is shown in italic, e.g. [68].
Publications related to this thesis

R. Fisker, N. Schultz, N. Duta and J. M. Carstensen, The Grenander Deformable Template Model: A General Scheme, submitted (Appendix G and [70]).

M. B. Stegmann, R. Fisker and B. Ersbøll, Extending and applying Active Appearance Models for automated, high precision segmentation in different image modalities, submitted (Appendix D and [165]).

R. Fisker, J. M. Carstensen, M. F. Hansen, F. Bødker and S. Mørup, Estimation of Nanoparticle Size Distributions by Image Analysis, to appear in Journal of Nanoparticle Research (Appendix A and [67]).

M. B. Stegmann, R. Fisker, B. Ersbøll, H. H. Thodberg and L. Hyldstrup, Active Appearance Models: Theory and Cases, Proc. 9th Danish Conf. on Pattern Recognition and Image Analysis,

R. Fisker, N. Schultz, N. Duta and J. M. Carstensen, A General Scheme for Training and Optimization of the Grenander Deformable Template Model, Proc. Conf. on Computer Vision and Pattern Recognition, 698-705, Hilton Head Island, USA, 2000.

R. Fisker and J. M. Carstensen, A General Polygon-based Deformable Model for Object Recognition, Proc. 8th Danish Conf. on Pattern Recognition and Image Analysis, 28-35, Copenhagen, Denmark, 1999 (Appendix C and [66]).

R. Fisker, J. M. Carstensen and K. Madsen, Initialization and Optimization of Deformable Models, Proc. 11th Scandinavian Conf. on Image Analysis, 295-302, Greenland, 1999 (Appendix F and [68]).

J. M. Carstensen, R. Fisker, N. Schultz and T. C. Dorge, Structural Inference Using Deformable Models, Machine Vision and Advanced Image Processing in Remote Sensing, 61-71, Springer, 1999.

R. Fisker and J. M. Carstensen, On Parameter Estimation in Deformable Models, Proc. 14th International Conf. on Pattern Recognition, 763-766, Brisbane, Australia, 1998 (Appendix E and [65]).

R. Fisker and J. M. Carstensen, Automated Visual Inspection of Textile, Proc. 10th Scandinavian Conf. on Image Analysis, 173-179, Lappeenranta, Finland, 1997 (Appendix B and [64]).

R. Fisker and J. M. Carstensen, Automatisk visuel kvalitetsanalyse af tekstil (in Danish; "Automatic visual quality analysis of textile"), Proc. 5th Danish Conf. on Pattern Recognition and Image Analysis, 28-35, Copenhagen, Denmark, 1996.
Publications not related to this thesis

H. Aanæs, R. Fisker, K. Åström and J. M. Carstensen, Robust Factorization, submitted.

H. Aanæs, R. Fisker and J. M. Carstensen, Robust Structure and Motion, Proc. 9th Danish Conf. on Pattern Recognition and Image Analysis, 1-9, Aalborg, Denmark, 2000.

R. Fisker, H. F. Poulsen, J. Schou, J. M. Carstensen and S. Garbe, Use of Image Processing Tools for Texture Analysis of High Energy X-ray Synchrotron Data, Journal of Applied Crystallography, 31, 647-653, 1998.

Technical reports

M. B. Stegmann, R. Fisker and B. Ersbøll, On Properties of Active Shape Models, Technical Report IMM-REP-2000-12, Department of Mathematical Modelling, Technical University of Denmark, 2000.
Notation

The notation should be quite consistent throughout the thesis. Bold lowercase letters denote vectors, x = (x_0, ..., x_{n-1}), and bold uppercase letters indicate matrices, X.

λ        regularization constant.
Ω_x      parameter space of x.
P(v)     prior model/distribution.
P(v|y)   posterior model/distribution.
P(y|v)   likelihood or observation model/distribution.
Σ        covariance matrix.
θ        model parameters.
v        template parameters.
v^t      training sample of template parameters.
v̂        MAP estimate of the template parameters.
U(v,y)   energy term of the posterior distribution.
y        an image.
z        normalizing constant.
X'       transpose of the matrix X.
Contents

1 Introduction
  1.1 Bayesian Framework
  1.2 The Prior and Likelihood Model
  1.3 Common Difficulties
  1.4 Popular Deformable Template Models
  1.5 Relationship To Other Methods
    1.5.1 Rigid Template Matching
    1.5.2 Hough Transform
    1.5.3 Registration
    1.5.4 Markov Random Fields

2 Representation
  2.1 Curves
    2.1.1 Labeled Points and Vertices
    2.1.2 B-splines
    2.1.3 Level Sets
    2.1.4 Other Analytic Curves
  2.2 Orthogonal Basis
    2.2.1 Fourier Descriptors
    2.2.2 Principal Component Analysis (PCA)
    2.2.3 Wavelet Descriptors
  2.3 Image Templates
  2.4 Conclusion and Discussion

3 Model Parameter Estimation
  3.1 Minimax
  3.2 Maximum Likelihood
  3.3 Minimum Distance
    3.3.1 Distance Functions and Performance Measures
    3.3.2 The "Best" Distance Function
    3.3.3 The "Best" Performance Measures
  3.4 Other Techniques
  3.5 Minimal Manual Interaction
  3.6 Conclusion and Discussion

4 Initialization
  4.1 Feature Based Initialization
    4.1.1 Initialization From Moments
    4.1.2 Texture and Color
    4.1.3 Generalized Hough Transform
  4.2 Model Based Initialization
    4.2.1 Search Based Initialization
    4.2.2 Very Fast Static Search
  4.3 Multi-hypothesis Initialization
  4.4 Conclusion and Discussion

5 Optimization
  5.1 Deterministic Optimization Methods
    5.1.1 Gradient Based Methods
    5.1.2 Dynamic Programming
    5.1.3 Difference Decomposition
    5.1.4 Heuristics
  5.2 Stochastic Optimization Methods
    5.2.1 Genetic Algorithms
    5.2.2 Simulated Annealing
    5.2.3 Stochastic Diffusion
  5.3 Comparative Study
  5.4 Conclusion and Discussion

6 Conclusion and Discussion

A Estimation of Nanoparticle Size Distributions
  A.1 Introduction
  A.2 The Elliptic Deformable Template Model
  A.3 Initialization and Optimization
  A.4 Validation
  A.5 Experimental Results
  A.6 Discussion

B Automated Visual Inspection of Textile
  B.1 Introduction
  B.2 Detection of Defects in the Vertical Threads
  B.3 Detection of Defects in the Horizontal Threads
    B.3.1 Defect Analysis
  B.4 Experiments
  B.5 Conclusion

C A General Polygon-based Deformable Model
  C.1 Introduction
  C.2 Deformable Models
  C.3 The Polygon-based Model
    C.3.1 Prior Model
    C.3.2 The Likelihood Function
    C.3.3 Initialization and Optimization
    C.3.4 Validation
  C.4 Experimental Results
  C.5 Discussions and Future Work

D Extending and Applying Active Appearance Models
  D.1 Introduction
  D.2 Active Appearance Models
    D.2.1 Shape & Landmarks
    D.2.2 Shape Formulation
    D.2.3 Texture Formulation
    D.2.4 Combined Model Formulation
    D.2.5 Optimization
  D.3 Extensions
    D.3.1 Enhanced Shape Representation
    D.3.2 Neighborhood AAMs
    D.3.3 Border AAMs
    D.3.4 Fine-tuning the Model Fit
    D.3.5 Applying Robust Statistics
    D.3.6 Initialization
  D.4 Experimental Results
  D.5 Future Work
  D.6 Implementation
  D.7 Conclusion
  D.8 Acknowledgements

E On Parameter Estimation in Deformable Models
  E.1 Introduction
  E.2 Deformable Models
    E.2.1 The Prior Model
    E.2.2 The Observation Model
  E.3 Supervised Model Parameter Estimation
    E.3.1 The Prior Model
    E.3.2 The Posterior Model
  E.5 Experimental Results
    E.5.1 Estimation of Model Parameters
  E.6 Conclusion

F Initialization and Optimization of Deformable Models
  F.1 Introduction
  F.2 Deformable Models
  F.3 Initialization
  F.4 Optimization
  F.5 Experimental Results
    F.5.1 Initialization
    F.5.2 Optimization
  F.6 Conclusion and Discussion

G The Grenander Model: A General Scheme
  G.1 Introduction
  G.2 The Grenander Model
    G.2.1 Prior Model
    G.2.2 Likelihood Model with Global Mean
  G.3 Model Parameter Estimation Based on a Training Set
    G.3.1 Prior Mean Shape
    G.3.2 Prior Weight Parameters
    G.3.3 Likelihood Parameters
  G.4 Initialization
  G.6 Model Parameter Estimation Based on One Example
  G.7 Adaptive Local Mean Model
  G.8 Experimental Results
  G.9 Conclusion and Discussion

Bibliography

Ph.D. theses from IMM

Index
Chapter 1

Introduction

The family of deformable template models has been presented under many different names, of which the best known probably are Snakes, Active Contour models, Active Shape models, deformable models and deformable templates. A general definition of deformable template models could be:

A deformable template model can be characterized as a model, which under an implicit or explicit optimization criterion deforms the shape to match a known type of object in an image.

In most cases the core task of the deformable template model is to perform segmentation of a known type of object, but the models can also be applied to more general tasks, e.g. image retrieval [16, 102, 104] and tracking [18, 99, 145, 146, 171].

Even though deformable template models have been a very active research field for more than a decade, a recent review of statistical pattern recognition [104] pinpoints deformable models as one of the frontiers of pattern recognition.
1.1 Bayesian Framework

The design and use of deformable template models fits neatly into the Bayesian framework for image analysis. We refer to Besag [15] or Mumford [137] for a general introduction to the Bayesian framework for image analysis. In a Bayesian setting, the deformable template model framework can be summarized as:

1. Choose a representation of the object, where the template parameters, v ∈ Ω_v, directly or under some mapping define the object.

2. Construct a prior probability distribution P(v|θ), where θ corresponds to the model parameters, which determine the properties of the distribution.

3. Formulate a likelihood or observation model, P(y|v,θ), which defines the probability of an image, y ∈ Ω_y, given any particular realization of v.

4. Combine the prior distribution and the observation model to the posterior distribution, P(v|y,θ), by Bayes' theorem:

   P(v|y,θ) = P(v|θ) P(y|v,θ) / P(y|θ)        (1.1)
            ∝ P(v|θ) P(y|v,θ)                 (1.2)

5. Select or estimate the model parameters, θ.

6. Make inference, e.g. by estimation of the maximum a posteriori (MAP):

   v̂ = arg max_v P(v|y,θ)                     (1.3)

Note that, for simplicity, θ is omitted in the distribution functions in most of this thesis, as well as in the literature, i.e. P(v|y) = P(v|y,θ).
The posterior distribution can be interpreted as an (explicit) optimization criterion, and under the weak assumption that the posterior distribution, P(v|y), is Gibbs distributed, the posterior distribution can be expanded to:

   P(v|y) = (1/z) exp{-U(v,y)}                (1.4)

where U(v,y): Ω_v → R is the energy function and z is a normalizing constant, which ensures a proper statistical distribution, i.e. ∫_{Ω_v} P(v|y) dv = 1. Due to the dimensionality of Ω_v it is usually infeasible to determine the value of z. In practice this is of little importance, since z is a constant, which makes the optimization in (1.3) invariant to the actual value of z.
It is very common to formulate the optimization criterion as an energy function instead of a posterior distribution. Under the assumption that P(v|y) is Gibbs distributed and z is constant, maximizing the probability is equivalent to minimizing the energy, since:

   v̂ = arg max_v P(v|y,θ)
     = arg max_v { -U(v,y,θ) - log(z) }
     = arg min_v U(v,y,θ)                     (1.5)

This shows that the probabilistic formulation is equivalent to the energy formulation when applied to MAP estimation.
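As a small numerical illustration of this equivalence (a hypothetical one-dimensional energy, not a model from the thesis), the maximizer of the unnormalized Gibbs posterior coincides with the minimizer of the energy, so z never has to be computed:

```python
import numpy as np

# Hypothetical 1-D quadratic energy U(v, y); any energy function would do.
def energy(v, y):
    return (v - y) ** 2 / 2.0

y = 3.0
v_grid = np.linspace(-10.0, 10.0, 2001)
u = energy(v_grid, y)

# Unnormalized Gibbs posterior: P(v|y) proportional to exp(-U(v, y)).
p_unnorm = np.exp(-u)

v_map_energy = v_grid[np.argmin(u)]        # arg min of the energy
v_map_prob = v_grid[np.argmax(p_unnorm)]   # arg max of the posterior
assert v_map_energy == v_map_prob          # same template; z is irrelevant
```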
In general we prefer a Bayesian formulation, because it gives a natural separation of model and image contributions in P(v) and P(y|v), and it provides the opportunity to simulate, and thus visualize, the appropriateness of the prior model. Further, it provides several ways to make inference in P(v|y).
1.2 The Prior and Likelihood Model

The basic principle of the prior model, P(v), is to introduce prior knowledge about the object; the argument for doing so is simply that it is so hard to make progress without it [18]. One interpretation of the prior model is to see it as a regularizer [147, 148], which regularizes the shape of the template. In fact, the prior model in the Snake [108] (equal to the internal energy) is a generalization of a Tikhonov [173] regularizer [108].
From a statistical point of view the prior model imposes a distribution on the template parameters. This distribution introduces correlation between parameters and governs the overall deformation. Fisker et al. [65] demonstrate that in the case of the prior energy being a quadratic term, the prior model usually corresponds to a multi-dimensional Gaussian. As demonstrated later, this result has important applications within model parameter estimation as well as model simulation and validation. Using this result, the prior model of the snake corresponds to a multi-dimensional Gaussian with the mean equal to zero and the inverse covariance equal to a pentadiagonal banded matrix.
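The banded structure can be verified directly. Below is a minimal sketch (my own construction, not code from the thesis) that builds the inverse covariance implied by a discretized snake internal energy α Σ|v_{i+1} - v_i|² + β Σ|v_{i-1} - 2v_i + v_{i+1}|² on a closed contour:

```python
import numpy as np

def snake_precision(n, alpha=1.0, beta=1.0):
    """Inverse covariance implied by the discrete snake internal energy
    U(v) = alpha*sum|v_i - v_{i-1}|^2 + beta*sum|v_{i-1} - 2v_i + v_{i+1}|^2
    on a closed contour with n points, written as the quadratic form v'Av."""
    eye = np.eye(n)
    d1 = np.roll(eye, -1, axis=1) - eye                                # first difference
    d2 = np.roll(eye, -1, axis=1) - 2 * eye + np.roll(eye, 1, axis=1)  # second difference
    return alpha * d1.T @ d1 + beta * d2.T @ d2

A = snake_precision(8)
# Pentadiagonal up to the wrap-around of the closed contour:
# entries vanish beyond the second off-diagonal.
assert abs(A[0, 3]) < 1e-12 and abs(A[0, 1]) > 0
```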
The likelihood or observation model, P(y|v), is the data driven term. The model defines the interaction between a realization of the template v and an observed image y. Basically it defines how well the actual template matches the object in the image. Most likelihood models are based on image intensity [23, 41, 35, 85, 86, 167, 155, 189] and/or gradient (edge) information [33, 34, 72, 101, 102, 108, 118, 163, 189], but in principle all kinds of information can be combined, e.g. texture, colour or motion.
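As a generic illustration of a gradient-based likelihood (my own sketch, not any specific model from the survey), one can define the energy as the negative mean gradient magnitude sampled at the template points, so that templates lying on strong edges score better:

```python
import numpy as np

def edge_energy(image, points):
    """Generic gradient-based likelihood energy: the negative mean gradient
    magnitude sampled at the template points (lower energy = better match)."""
    gy, gx = np.gradient(image.astype(float))   # gradients along rows, cols
    mag = np.hypot(gx, gy)
    rows = np.clip(np.round(points[:, 1]).astype(int), 0, image.shape[0] - 1)
    cols = np.clip(np.round(points[:, 0]).astype(int), 0, image.shape[1] - 1)
    return -mag[rows, cols].mean()

# A template lying on a step edge scores lower (better) than one off the edge.
img = np.zeros((32, 32)); img[:, 16:] = 1.0
on_edge = np.stack([np.full(10, 16.0), np.linspace(5, 25, 10)], axis=1)
off_edge = np.stack([np.full(10, 5.0), np.linspace(5, 25, 10)], axis=1)
assert edge_energy(img, on_edge) < edge_energy(img, off_edge)
```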
1.3 Common Difficulties

In general, the current approaches to deformable template models suffer from a number of common difficulties. Jain et al. [101] summarize the difficulties as:

- The user needs to assign weights to the different components of the energy function, which determine the success of a deformable template model (Model Parameter Selection).

- The algorithms need a good initialization to give meaningful results; otherwise they get stuck at spurious local minima and thus lead to incorrect results (Initialization).

- The large number of parameters makes the numerical solutions difficult (Optimization).

In short, the common difficulties can be summarized as Model Parameter Selection, Initialization and Optimization. These difficulties are the main reason why most models can only be applied by an expert. This is not satisfactory, since a good model should be applicable by a non-expert user. The handling of, and solutions to, these difficulties are covered intensively in the following chapters.
1.4 Popular Deformable Template Models

Jain et al. [101, 102] divide deformable models into two groups: free form and parametric. Free form deformable templates have no explicit global structure, because their prior model contains only local continuity and smoothness constraints. The best known examples of free form models are Snakes [108] and Balloons [33, 34]. In parametric deformable models, prior knowledge of the global structure is included using a parameterized template. Examples of parametric deformable models are Grenander's model [85], the Active Shape model [41], the Active Appearance model [35] and Blake's Active Contour [18]. The cited models all have the important property of being general in the sense that they can be applied to an object with an arbitrary shape. A more detailed overview of the properties of the most popular general models is given in table 1.1. The first reference in the table is the "key reference", and Eucl. is an abbreviation for Euclidean parameters.

The largest group of deformable models is formulated and tuned for a specific object. Probably the best-known specific models are the eye model and the mouth model proposed by Yuille et al. [189]. For some problems it is necessary to incorporate specific assumptions into the model to be able to solve the problem, but many problems could have been solved directly by one of the general models.

It is outside the scope of this thesis to give a general description of all the models in the literature. For a general survey we refer to the first part of the book on Active Contours by Blake and Isard [18] and the survey papers by McInerney and Terzopoulos [132] and by Jain et al. [101].
1.5 Relationship To Other Methods

Deformable template models have close relations to a number of other computer vision and image processing methods, of which the most important are covered in the following sections.

1.5.1 Rigid Template Matching

Early research in template matching concentrated mainly on simple rigid shape matching, where a prototype template was transformed by simple transformations like translation, rotation and scaling before it was matched to the image. The matching is typically performed by a simple correlation-based method using a filtering approach, see e.g. [2, 162]. This type of matching is still very popular in machine vision applications.

The rigidness of this approach makes it insufficient for objects with larger shape variations, which basically led to the development of deformable template models. For many models the likelihood model, P(y|v), can be interpreted as a correlation-based match of a deformed template.
Table 1.1: Properties of the most popular general deformable template models.

Model | Object Representation | Template Parameters v | Model Parameters | Weight Selection | Initialization
--- | --- | --- | --- | --- | ---
Active Appearance Model [35,40,61] | Mean shape/bitmap, linear combination of eigenvectors & Eucl. | Scalars on eigenvectors & Eucl. | Eigenvectors & mean shape/bitmap | No weights | Search
Active Shape Model [38,40,41] | Mean shape, linear combination of eigenvectors & Eucl. | Scalars on eigenvectors & Eucl. | Eigenvectors & mean shape | No weights | No comments
Balloon [33,34] | Points/FEM | Points/FEM | Weights | Manual | Manual
Blake's Active Contour [18] | B-spline & deformation space | Curve shape-vector | Mean shape, shape matrix & weights | No comments | Moments of binary image
Grenander's model [85,86,114,155] | Points | Points | Mean shape, gray level stat. & weights | Manual | Heuristic
Jain's model [101,102,190] | Deformation of bitmap & Eucl. | Deformation parameter & Eucl. | Bitmap & weight | Manual | Search
Snake [108] | Points | Points | Weights | Manual | Manual
Staib & Duncan's Model [163] | Elliptic Fourier descriptors | Elliptic Fourier descriptors | Fourier means & weight | Manual | Mode of prior
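The correlation-based matching mentioned above can be sketched as follows; this is a naive normalized cross-correlation over translations only, with a made-up pattern for illustration:

```python
import numpy as np

def match_template_ncc(image, template):
    """Rigid template matching by normalized cross-correlation (translation
    only); returns the (row, col) of the best-matching window."""
    ih, iw = image.shape
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-12)
    best, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            w = image[r:r + th, c:c + tw]
            ws = w.std()
            if ws < 1e-12:          # skip constant windows
                continue
            score = np.mean((w - w.mean()) / ws * t)
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos

tmpl = np.arange(16, dtype=float).reshape(4, 4)   # arbitrary 4x4 pattern
img = np.zeros((20, 20)); img[8:12, 5:9] = tmpl   # pattern placed at (8, 5)
assert match_template_ncc(img, tmpl) == (8, 5)
```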
1.5.2 Hough Transform

The Hough transform was first proposed by Hough [95]. Since then, numerous papers have improved the Hough transform, of which the (ρ, θ)-representation of lines [58] and the generalization to arbitrary shapes at any scale and any rotation [11] are some of the most influential. As demonstrated by Stockman and Agrawala [166], the Hough transform is simply an efficient implementation of template matching. For a more extensive review we refer to the survey papers [98, 122].
In a deformable model setting the Hough transform can be interpreted as a deformable model with a uniform prior, where the initialization is performed by an intensive but efficient search and the optimization is omitted.
1.5.3 Registration

Image registration requires finding an optimal transform between an image pair, the source and target image [123]. For a review of image registration, we refer to one of the survey papers [7, 21, 123, 129, 177, 183].

Matching a parametric deformable model to an object in an image can in general be interpreted as a registration of a prototype template to the object. Still, there is a difference in the actual objective. In registration the requirement is an exact correspondence between homologous points, whereas deformable models are focused on segmentation rather than necessarily finding the exact correspondence. Another difference is the actual matching. Traditionally, registration is performed between similar feature types, e.g. pairs of images with the same or different modality, or pairs of landmark sets, whereas most deformable models match an object model, with a non-image representation, to an image.

Still, there exists a number of methods which can be classified as both a registration method and a deformable template model, e.g. elastic models like [9, 10, 49] and the viscous fluid model [20, 29, 30].
1.5.4 Markov Random Fields

Markov Random Fields (MRFs) provide a convenient and consistent way of modeling spatial context and stochastic interaction among entities such as image pixels and other spatially correlated features. For an elaborate treatment of MRFs we refer to [125].

From a general point of view, MRFs and deformable template models apply a very similar optimization-based framework, and both fit neatly into the Bayesian setting. Li [125] also argues that many deformable template models can be interpreted as a high level MRF having irregular sites with discrete labels, where each site indexes an image feature such as a point or a line segment.

Despite these similarities, the literature is quite clear on what is classified as a MRF and a deformable template model, respectively. The main difference is what is modeled. Deformable template models are based on an object model, whereas MRFs model the full image or the interaction between objects.
Chapter 2

Representation

Choosing the object representation is one of the most important choices in the design of a deformable template model. The most popular representations are summarized below. We have chosen to categorize the representations in three main groups: Curves, Orthogonal Basis and Image Templates.
2.1 Curves

2.1.1 Labeled Points and Vertices

One of the first and most popular representations is a labeled set of points with connectivity information. This representation is equivalent to a vertices-and-edges representation. In the simple case of one single open or closed contour, neighbor points are assumed to be connected. In this case the point representation corresponds to a polygon or a linear spline. Numerous authors have used the point representation, e.g. [22, 33, 57, 85, 108, 118].
2.1.2 B-splines

Another popular representation is B-splines, which enable a continuous curve description, see e.g. [18] for an introduction. Compared to a point representation, B-splines also have the advantage of a low-dimensional parameter space, since a similar shape is obtained with a few control points instead of a large number of points. Another advantage is the "built-in" smoothness. Realization of deformable models using B-splines was developed by Cipolla and Blake [31], Menet et al. [133] and Hinton et al. [92]. Other examples of the use of a B-spline representation are given in [18, 99, 113].
2.1.3 Level Sets

Another powerful representation is based on the elegant level sets [142]; see [159] for an introduction. Compared to the other representations, level sets have the advantage of allowing automatic merging and splitting of the initial contour. In the context of deformable models, level sets were simultaneously proposed by Caselles et al. [25] and Malladi et al. [130]. Since then level sets have received much attention, and a number of models have been based on this approach, see e.g. [26, 111, 124, 144, 143, 186, 187].
2.1.4 Other Analytic Curves

Besides the mentioned curves, numerous other sets of analytic curves have been applied to represent the object, e.g. sets of parabolic curves and circles. Examples of representations based on other sets of curves are [67, 119, 189].
2.2 Orthogonal Basis

2.2.1 Fourier Descriptors

Fourier descriptors represent the object in an orthogonal basis, where the usual basis functions are the sinusoids, i.e. trigonometric functions. On a sinusoidal basis the representation corresponds to the elliptic Fourier descriptors [78, 116, 127, 163], where the Fourier descriptors are localized in frequency. In the context of deformable template models, Fourier descriptors were introduced by Scott [158] and Staib and Duncan [163].
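To make the frequency localization concrete, the following sketch uses the simpler complex-coordinate variant of Fourier descriptors rather than the elliptic form used in the cited papers; truncating to the lowest frequencies gives a compact shape representation:

```python
import numpy as np

def fourier_descriptors(contour_xy, n_keep=8):
    """Represent a closed contour by its lowest-frequency Fourier
    coefficients of the complex coordinates z_k = x_k + i*y_k."""
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    coeffs = np.fft.fft(z) / len(z)
    freq = np.fft.fftfreq(len(z)) * len(z)      # signed frequency indices
    keep = np.where(np.abs(freq) <= n_keep // 2, coeffs, 0.0)
    return keep

def reconstruct(coeffs):
    """Invert the (truncated) descriptor back to contour coordinates."""
    return np.fft.ifft(coeffs * len(coeffs))

# A circle is represented exactly by just two low-frequency coefficients.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
z_rec = reconstruct(fourier_descriptors(circle, n_keep=2))
assert np.max(np.abs(z_rec - (circle[:, 0] + 1j * circle[:, 1]))) < 1e-9
```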
2.2.2 Principal Component Analysis (PCA)

Another representation which applies an orthogonal basis is the Point Distribution Model proposed by Cootes et al. [36, 41]. In practice, the object is represented by the mean shape of a training set and a linear combination of the most important eigenmodes of the shape variation from this mean. This representation has close connections to general shape statistics, see e.g. [56]. The Point Distribution Model plays an important role in the popular Active Shape Model [41] and has been extended with texture in the novel Active Appearance Model [35]. Numerous other models like [60, 126, 139] have been based on this representation.
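The construction can be sketched as follows; this is a bare-bones Point Distribution Model assuming the training shapes are already aligned, and the helper names are my own:

```python
import numpy as np

def train_pdm(shapes, n_modes=2):
    """Point Distribution Model sketch: mean shape plus the principal modes
    of variation of the training shapes (each shape flattened to a 2n-vector;
    alignment of the training set is assumed to be done already)."""
    X = shapes.reshape(len(shapes), -1)
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_modes], s[:n_modes] / np.sqrt(len(shapes) - 1)

def synthesize(mean, modes, b):
    """New shape from mode weights b: x = mean + sum_i b_i * mode_i."""
    return (mean + b @ modes).reshape(-1, 2)

rng = np.random.default_rng(0)
base = rng.normal(size=(10, 2))                           # 10 landmark points
shapes = base + rng.normal(scale=0.05, size=(30, 10, 2))  # 30 training shapes
mean, modes, stds = train_pdm(shapes, n_modes=2)
new_shape = synthesize(mean, modes, stds)                 # one-std deformation
assert modes.shape == (2, 20) and new_shape.shape == (10, 2)
```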
2.2.3 Wavelet Descriptors

A third orthogonal basis is the wavelet transform [47, 48]; see e.g. [131, 168] for a general introduction to wavelets. Wavelets are localized both in space and in frequency (scale), since they are defined as dilated and translated versions of the basis or mother wavelet. Wavelet descriptors are less popular than the Fourier descriptors and the Point Distribution Model, but there are still examples like [139, 188, 190], where [139, 188] apply the Daubechies wavelet [47, 48] and [190] applies the B-spline wavelet [175, 176], respectively. Note the comparison of shape models based on the point distribution model, Fourier descriptors and wavelets in [139]. The previous examples all apply wavelets for the shape representation, but wavelets have also been used to make a compact texture representation [184] in Active Appearance Models [35].

Note that the orthogonal basis representations usually apply a reduced or truncated parameter space, where only the most important modes and descriptors are used.
2.3 Image Templates

Another popular representation is a prototype image template. The prototype image is then deformed under a similarity measure to match a new object in an image. Most of these models can also be classified as registration methods (see the discussion in section 1.5.3). Typically the template is of the same type as the input image, but edge templates have also been used, see [102]. There is a rich collection of examples of this representation, where some of the best known are [5, 9, 10, 29, 30, 100, 156].
2.4 Conclusion and Discussion

The central question is now: which representation is the best? Unfortunately there is no easy answer to this question, since it depends on the rest of the model and the actual problem. Still, there is a number of properties which, from a general point of view, are desirable for the ideal model:

- General.

- Low redundancy and dimensionality. A low dimensional representation with little redundancy improves the computational efficiency and makes the optimization easier and more robust.

- Linear parameterization. Restriction to linear parameterization has certain advantages in simplifying fitting algorithms and avoiding problems with local minima [18].

- Orthogonal basis. In general, an orthogonal basis is desirable because it makes the parameters distinct. This makes the determination of the coefficients easier and avoids redundancy [163].

- Shape regularization. Commonly, regularization of the shape is obtained by the prior model and the representation. In many cases the regularization of the shape is what makes the deformable model successful, and it is an advantage if the representation can be used as a regularizer.

- Automatic splitting and merging. Primarily in the context of free form deformable models, automatic splitting and merging is a desirable property.

These properties favor some models, but still there is no superior representation.
Chapter 3

Model Parameter Estimation

Model parameters play an important role in determining the properties of the posterior distribution. Commonly the model parameters are selected before the model is applied, and they are kept constant. Examples of model parameters are the mean and variance in a Gaussian distribution and the weight parameters in the snake, which determine the relative influence of the different terms in the energy function.

Most deformable models contain model parameters, e.g. weight parameters. With a few exceptions these parameters are selected manually by an expert (c.f. Table 1.1). From a general point of view this is not acceptable, since the goal is a framework which can be applied by a non-expert user. Methods applied to automatic model parameter selection in deformable template models are described below.
3.1 Minimax

Gennert and Yuille [77] propose a general unsupervised technique for determination of model parameters for multiple objective function optimization, based on the minimax principle. Recall v̂ = arg max_v P(v|y) = arg min_v U(v,y). The principle is to determine the θ that maximizes U(v̂,y,θ) when U(v,y,θ) has been minimized over v. More formally, the minimax criterion is defined as [77]:

   θ̂ = arg max_θ min_v U(v,y,θ)               (3.1)
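A tiny grid-search illustration of the minimax criterion on a hypothetical two-term energy (my own construction, not an example from [77]): for each θ the energy is first minimized over v, and θ̂ is then the maximizer of that minimized energy.

```python
import numpy as np

# Hypothetical energy: a data term weighted by theta and a prior term
# weighted by (1 - theta), for a scalar template v.
def energy(v, y, theta):
    return theta * (v - y) ** 2 + (1 - theta) * v ** 2

y = 2.0
thetas = np.linspace(0.01, 0.99, 99)
v_grid = np.linspace(-5.0, 5.0, 5001)

# Inner minimization over v for each theta, then the outer maximization
# over theta: the minimax criterion (3.1).
inner = np.array([energy(v_grid, y, th).min() for th in thetas])
theta_hat = thetas[np.argmax(inner)]

# Analytically min_v U = theta*(1-theta)*y^2, which peaks at theta = 0.5.
assert abs(theta_hat - 0.5) < 0.02
```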
The minimax principle is a very conservative estimate, since the optimal model parameters correspond to the worst case energy when the deformable template model has been optimized. This may at first seem far from optimal, but if one recalls that U(v,y,θ) has been minimized over v, it will be seen that the maximization over θ does make sense, while avoiding the problems of θ being excessively low or high [77]. The minimax principle has been applied in [117, 118] to estimate the regularization parameter in their g-snake. Note that the minimax principle is based on the assumption that min_v U(v,y,θ) is convex with respect to θ, which is often not the case [65].
3.2 Maximum Likelihood
Assume a ground truth of n samples, v_0^t, …, v_{n-1}^t, and their corresponding images, y_0^t, …, y_{n-1}^t, has been supplied. Given this training set the model parameters can, in theory, be estimated by the Maximum Likelihood (ML) estimate:

  θ̂ = max_θ L(θ; v_0^t, …, v_{n-1}^t, y_0^t, …, y_{n-1}^t)    (3.2)

where L(θ; v_0^t, …, v_{n-1}^t, y_0^t, …, y_{n-1}^t) is the likelihood function. If
v_0^t, …, v_{n-1}^t and y_0^t, …, y_{n-1}^t are assumed to be stochastically independent the Maximum Likelihood estimate becomes:

  θ̂ = max_θ ∏_{i=0}^{n-1} P(v_i^t | y_i^t, θ)    (3.3)
In [65] and [96, 109, 70] the ML estimator has been used to estimate a subset of the model parameters corresponding to the prior model and all the model parameters, respectively.
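For a Gaussian prior on a scalar template parameter, the maximization in (3.3) has the familiar closed form, which the following sketch checks numerically; the Gaussian model and the synthetic training set are illustrative assumptions, not the thesis's models.

```python
import numpy as np

def ml_estimate(v_train):
    """ML estimate of Gaussian prior parameters (mean, variance) from a
    training set of scalar template parameters; this maximizes the
    product over training samples when the model is Gaussian in v."""
    mu = np.mean(v_train)
    var = np.var(v_train)  # ML uses 1/n, not 1/(n-1)
    return mu, var

def log_likelihood(v_train, mu, var):
    """Sum of log P(v_i | theta); maximizing this equals maximizing
    the product of likelihoods."""
    return np.sum(-0.5 * np.log(2 * np.pi * var)
                  - (v_train - mu) ** 2 / (2 * var))

rng = np.random.default_rng(0)
v_train = rng.normal(1.5, 0.5, size=200)  # synthetic "ground truth"
mu_hat, var_hat = ml_estimate(v_train)
```

Perturbing either parameter away from the closed-form estimate can only decrease the log-likelihood.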
Unfortunately ∏_{i=0}^{n-1} P(v_i | y_i, θ) is not guaranteed to be a convex function with respect to all elements in θ. In fact it is very rare that the likelihood model, P(y | v, θ_l), is convex with respect to the model parameters, θ_l, which relate to the likelihood model, whereas the prior model, P(v | θ_p), usually can be rewritten to become convex with respect to the prior model parameters, θ_p. In fact most prior models are born non-convex, but if the corresponding energy function only contains a quadratic term, Fisker and Carstensen [65] show that the prior model usually corresponds to a multivariate Gaussian, which is convex with respect to θ_p. The importance of this result can be demonstrated by a univariate Gaussian, where the energy corresponds to U(x) = (x − μ)² / (2σ²), which is non-convex w.r.t. σ². Since U(x) corresponds to a Gaussian the normalizing constant, z, is known and the energy can be rewritten to U(x) = (x − μ)² / (2σ²) + log(√(2πσ²)), which is convex w.r.t. σ².
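The effect of the normalizing constant can be checked numerically. With μ fixed at 0 and a single sample x (a deliberately minimal setting), the unnormalized energy decreases monotonically in σ², so its minimizer diverges, while the full energy has a finite minimum at σ² = x²:

```python
import numpy as np

# Energies for a univariate Gaussian as a function of s = sigma^2,
# for a fixed sample x and mu = 0.
def U_unnormalized(x, s):
    return x ** 2 / (2 * s)

def U_full(x, s):
    # Includes the log of the normalizing constant, log sqrt(2*pi*s).
    return x ** 2 / (2 * s) + 0.5 * np.log(2 * np.pi * s)

x = 2.0
s_grid = np.linspace(0.1, 20.0, 2000)
# Without the normalizer the energy keeps decreasing as s grows; with
# it, the minimum sits at s = x^2 (here 4).
s_hat = s_grid[np.argmin(U_full(x, s_grid))]
```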
In general the convexity of ∏_{i=0}^{n-1} P(v_i | y_i, θ) w.r.t. θ_j is closely related to whether the parameter, θ_j, corresponds to a part of the posterior distribution which in isolation corresponds to a proper distribution. If θ_j corresponds to a proper distribution, ∏_{i=0}^{n-1} P(v_i | y_i, θ) is usually convex, whereas it usually is non-convex if θ_j corresponds to an improper distribution.
Note that the ML criterion is most sensible for parametric deformable template models, since this type of model favors a known global structure, which can be interpreted as the mean of the prior.
3.3 Minimum Distance
It is infeasible to perform a ML estimation of the model parameters, θ_i, which correspond to a non-convex likelihood function. In this case we propose to use a minimum distance criterion (same concept but an improved criterion compared to [65]):

  θ̂ = min_θ D(V^t, V̂(θ))    (3.4)

where D(V^t, V̂(θ)) is the distance between V^t = v_0^t, …, v_{n-1}^t and V̂(θ) = v̂_0(θ), …, v̂_{n-1}(θ) for v̂_i(θ) = max_v P(v | y_i^t, θ) being the MAP estimate obtained by running the full initialization and optimization scheme with the actual value of θ. In fact this is a very reasonable criterion, since it corresponds to the overall goal of performing the best segmentation, i.e. the segmentation with the smallest possible distance to the ground truth.
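In practice (3.4) reduces to an outer search over θ with a full segmentation run inside the loop. The sketch below uses a scalar toy "segmentation" (a hypothetical shrinkage estimator, invented for the example) just to make the loop concrete:

```python
import numpy as np

def min_distance_select(segment, thetas, v_truth, images):
    """Minimum distance parameter selection: run the full segmentation
    for each candidate theta and keep the theta whose results lie
    closest to the ground truth."""
    best_theta, best_dist = None, np.inf
    for theta in thetas:
        v_hat = [segment(y, theta) for y in images]
        dist = np.mean([abs(vh - vt) for vh, vt in zip(v_hat, v_truth)])
        if dist < best_dist:
            best_dist, best_theta = dist, theta
    return best_theta, best_dist

# Toy "segmentation": an estimate shrunk towards 0, with theta acting
# as a regularization weight (purely illustrative).
def segment(y, theta):
    return y / (1.0 + theta)

images = [1.0, 2.0, 3.0]    # stand-ins for observed data
v_truth = [0.9, 1.8, 2.7]   # ground truth parameters
thetas = np.linspace(0.0, 1.0, 101)
theta_hat, d_hat = min_distance_select(segment, thetas, v_truth, images)
```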
3.3.1 Distance Functions and Performance Measures
To apply the minimum distance criterion the distance has to be defined. Unfortunately there is no common standard for performance evaluation of deformable template models, hence there is no common function to evaluate the distance between the ground truth and the segmentation. This is very unfortunate, since it makes it impossible to compare the performance of different models.
The most popular distance measures used with deformable models are summarized below (θ is omitted for simplicity). Note that D(V^t, V̂) and D(v^t, v̂) correspond to the distance between two training sets and the distance between two samples, respectively. Most of the distance measures assume two finite point sets, U^t = (u_0^t, …, u_{p-1}^t) and Û = (û_0, …, û_{q-1}), can be obtained given v^t and v̂, respectively. Usually these points lie on the boundary.
Template parametererrors
Asimpledistance measureusedin [65]isthesumof weighted
squareerrors betweenthetemplateparameters:
D
sq (V
t
;
^
V)= 1
nl n 1
X
j=0 (v
t
j
^ v
j )
0
C(v t
j
^ v
j
) (3.5)
wherelisthenumberofparametersinvandC isasymmetric
weight matrix. C isusuallythe identitymatrix.
Point to point errors
Given two nite point sets, U t
and
^
U, with unknown corre-
spondenceandthesamenumberofelementstheaveragepoint
to pointerrorwith linearreparameterizationreported in[163 ]
isdened as:
D
ptpt (v
t
;
^
v)= min
0s0<k 1
k k 1
X
i=0 ku
t
i
^ u
i+s
0
k (3.6)
wherek =p=q and s
0
is theoset.
Another point to point error with known correspondence used in [43] is the root mean square (RMS) point to point error defined as:

  D_ptpt(V^t, V̂) = √( 1/(nk) ∑_{j=0}^{n-1} ∑_{i=0}^{k-1} ‖u_{j,i}^t − û_{j,i}‖² )    (3.7)
A third point to point error measure is defined by:

  D_ptpt(v^t, v̂) = 1/k ∑_{i=0}^{k-1} min_{u_j^t ∈ U^t} ‖u_j^t − û_i‖    (3.8)

In [60] the mean and standard deviation for this measure are reported for the full test set, V. Note that this criterion makes no assumption about point correspondence or identical size of the two point sets.
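The point to point measures (3.7) and (3.8) are straightforward to compute; a minimal sketch for 2-D point sets (the unit-square example is invented for illustration):

```python
import numpy as np

def rms_point_to_point(U_t, U_hat):
    """RMS point to point error for one pair of corresponding point
    sets of shape (k, 2)."""
    return np.sqrt(np.mean(np.sum((U_t - U_hat) ** 2, axis=1)))

def mean_nearest_point_error(U_t, U_hat):
    """Correspondence-free point to point error: for each estimated
    point take the distance to the nearest ground truth point."""
    d = np.linalg.norm(U_t[:, None, :] - U_hat[None, :, :], axis=2)
    return np.mean(d.min(axis=0))

square = np.array([[0., 0.], [1., 0.], [1., 1.], [0., 1.]])
shifted = square + np.array([0.1, 0.0])  # estimate offset by 0.1
```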
Undirected Hausdorff distance
Given two finite point sets, U^t and Û, the directed Hausdorff distance is defined as [97]:

  h(v^t, v̂) = max_{u_i^t ∈ U^t} min_{û_j ∈ Û} ‖u_i^t − û_j‖    (3.9)

The undirected Hausdorff distance is then defined as [97]:

  D_H(v^t, v̂) = max( h(u^t, û), h(û, u^t) )    (3.10)

The average and standard deviation of the undirected Hausdorff distance for the full set, V^t, is reported in [60].
Undirected partial Hausdor distance
Giventwonitepointsets,u t
andu,^ thepartialdirectedHaus-
dor distance isdened as[124]:
h
K (u
t
;u)^ = K th
u t
i 2U
t min
^ u
j 2
^
U ku
t
i
^ u
j
k (3.11)
where K is a quantile of the maximum distance. The undi-
rected distance is then dened using (3.10). This measure is
reported in [124 ] for two dierent samples and two dierent
quantiles,K.
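The directed, undirected and partial Hausdorff distances (3.9)-(3.11) differ only in whether the maximum or a quantile of the nearest-point distances is taken, which the sketch below exploits (the point sets are invented for illustration):

```python
import numpy as np

def directed_hausdorff(A, B, quantile=1.0):
    """Directed (partial) Hausdorff distance: for each point in A take
    the distance to the nearest point in B, then take the maximum
    (quantile=1.0) or the K-th quantile of these nearest distances."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    nearest = d.min(axis=1)
    return np.quantile(nearest, quantile)

def undirected_hausdorff(A, B, quantile=1.0):
    """Undirected (partial) Hausdorff distance."""
    return max(directed_hausdorff(A, B, quantile),
               directed_hausdorff(B, A, quantile))

A = np.array([[0., 0.], [1., 0.], [2., 0.]])
B = np.array([[0., 0.], [1., 0.], [2., 3.]])
```

A single outlier point dominates the full Hausdorff distance but is discounted by the partial (quantile) variant.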
Normal displacement.
Given two finite point sets, u^t and û, with correspondence and connectivity information the squared normal displacement is defined as [18]:

  D_n(v^t, v̂) = 1/k ∑_{i=0}^{k-1} ( (u_i^t − û_i) · n(û_i) )²    (3.12)

where n(û_i) is the unit normal at û_i. This distance measure is extensively used in [18].
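A minimal sketch of (3.12), assuming the unit normals of the estimated boundary are supplied; note how the tangential part of the error is ignored:

```python
import numpy as np

def normal_displacement(U_t, U_hat, normals):
    """Squared normal displacement: project each point error onto the
    unit normal of the estimated boundary and average the squares."""
    proj = np.sum((U_t - U_hat) * normals, axis=1)
    return np.mean(proj ** 2)

# Points on a horizontal line; ground truth is offset diagonally, but
# only the vertical component counts through the (0, 1) normals.
U_hat = np.array([[0., 0.], [1., 0.], [2., 0.]])
U_t = U_hat + np.array([0.3, 0.2])     # tangential 0.3, normal 0.2
normals = np.tile([0., 1.], (3, 1))
```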
Point to associated boundary.
Given a finite point set, û, and a piecewise continuous template curve, r(s, v^t), of the ground truth the RMS point to associated boundary error reported in [43] is defined by:

  D_pt-b(V^t, V̂) = √( 1/(nk) ∑_{j=0}^{n-1} ∑_{i=0}^{k-1} min_{s_i ∈ Γ_i} ‖û_{j,i} − r(s_i, v_j^t)‖² )    (3.13)

where Γ_i is the part of the boundary of r(s, v) which is associated with u_i.
Area error
The relative area error is defined as:

  D_A(v^t, v̂) = A(v̂) / A(v^t)    (3.14)

where A(v) is the area. In [60] the mean and standard deviation of the relative area is reported.
Labeling error
The relative labeling error is defined as:

  D_L(v^t, v̂) = N_incor(v^t, v̂) / N_Pixels(v^t)    (3.15)

where N_incor is the number of incorrectly labeled pixels and N_Pixels is the total number of pixels. The mean and standard deviation of this measure is also given in [60].
Texture error
Given a texture model, y(r, c, v̂), the RMS texture error reported in [43] is defined by:

  D_t(V^t, V̂) = √( 1/(n|Ω_y|) ∑_{j=0}^{n-1} ∑_{(r,c) ∈ Ω_y} ( y_j^t(r, c) − y(r, c, v̂_j) )² )    (3.16)
Figure 3.1: Examples of distance measures (point to point, point to associated border and normal displacement).
where |Ω_y| is the number of members in Ω_y and Ω_y is the set of image coordinates, (r, c), which correspond to the model. Alternative texture metrics can be found in [152].
Some of the distance measures are illustrated in Figure 3.1. Note that special cases exist where a number of the measures above are identical.
3.3.2 The "Best" Distance Function
It is impossible to select the overall "best" distance function, since it depends on the actual application and model. Still some distance functions seem more attractive than others.
From a general point of view the template parameter error is less attractive, since it is very sensitive to the parameterization and to different types of template parameters.
Points on the boundary of a deformable model rarely correspond to fixed landmarks, and the points tend to move along the boundary in most applications. If the model does not have a point based representation, points need to be created, which is very sensitive to the parameterization. Overall this makes the point to point based measures seem less attractive.
To avoid problems with points moving along the boundary, information about the boundary needs to be included. The only measures which satisfy this are the normal displacement and the point to associated boundary. These criteria basically approximate:

  D(v̂, v^t) = 1/L ∫_0^1 ‖r(s, v̂) − r(g(s), v^t)‖² ds    (3.17)

where g(s) is a reparameterization function defined such that g(s) = s gives the original inherited parameterization, mapping points on the curve r(s, v̂) to points on r(g(s), v^t). Unfortunately the true g(s) is impossible to construct. Note that simple numerical approximations of (3.17) are only possible if the error is not squared.
The normal displacement relies on two point sets with established correspondences and on the calculation of the normal. This makes the normal displacement seem less attractive than the point to associated boundary. The relative labeling error and relative area error are not very informative due to their low specificity, and the texture error only applies to texture based models, which makes these criteria seem less interesting in the general context.
With respect to minimum distance estimation the sum of squared point to associated boundary errors seems like the best overall distance function. The squared error is selected since this is the common fitting criterion. In the case of large deviations between the shapes a symmetric or undirected point to associated boundary error could be considered, e.g. by summing the directed errors.
3.3.3 The "Best" Performance Measures
The best distance function for minimum distance estimation is not necessarily the best performance measure. Still the arguments for the point to associated boundary error are valid, but the sum of squared errors seems harder to interpret for the user than the error itself. In fact it seems more relevant to show the full histogram of the point to associated boundary errors, or to illustrate the average errors at the different positions of the object, than to obtain a single number. If a single user friendly number is needed, the mean or a proper quantile of the histogram could be used. However this is a fundamental problem of model fitting and residual analysis, where there is no unique solution.
3.4 Other Techniques
Larsen et al. [121] propose a two step procedure to select the elasticity parameters of a snake based on upper and lower bounds of the parameters. However this method seems limited to snakes.
Amini et al. [4] discuss the cross validation method proposed by Wahba et al. [46, 181]. This is a popular method for estimation of the regularization parameter in the context of regression and splines, see e.g. [160]. The optimal parameter is found by minimizing the average discrepancy between the data samples and the predicted values using cross validation. For an elaborate treatment we refer to [180]. However the methods are mainly focused on equations of the form 1/n ∑_{i=1}^n (y_i − f(x_i))² + λ ∫_a^b (f^(m)(x))² dx, which makes them less interesting in the context of deformable template models. Another problem is that only a limited group of deformable template models can make a reasonable prediction of a removed template parameter, v_i.
3.5 Minimal Manual Interaction
The manual task of creating training sets is very cumbersome, and several ways of easing it have been proposed. One approach addresses the problem of few training samples by generating artificial samples from a FEM model. However this is basically not different from imposing an artificial covariance structure as done in e.g. the snake. The only difference is how the covariance structure is created.
A large number of semi-automatic methods have been proposed to reduce the needed manual interaction. Most of these methods assume a set of densely sampled object outlines is supplied. Note that it is much easier to create object outlines than actually placing e.g. landmarks. The output from the algorithms is usually a set of subsampled and registered object outlines. Examples of these types of methods are [6, 59, 62, 157, 182].
An alternative strategy is proposed by Fisker et al. [65, 70] where the problem is solved by applying the Expectation-Maximization (EM) algorithm [51]. In [65] no user interaction is needed, whereas one example shape is required in [70].
3.6 Conclusion and Discussion
In this chapter two quite general methods for estimation of the model parameters in deformable models are presented. Both methods can be applied by a non-expert user.
The first method is based on the minimax criterion and has the important feature of not requiring any user interaction. However it still seems very counterintuitive to choose the parameter corresponding to the worst case energy. Another problem is that the minimax criterion relies on an assumption of convexity, which is not always fulfilled.
The second method for parameter estimation is based on a combination of a ML estimate and a minimum distance criterion. The ML criterion should be applied if possible, since it is much easier to optimize than the minimum distance criterion. As a rule of thumb the ML criterion can be applied to estimate a parameter when the corresponding part of the posterior in isolation corresponds to a proper distribution. This criterion has the drawback of requiring a training set, which can be a cumbersome task to create manually. However a number of methods exist which reduce or remove the need for manual interaction.
We also review a large number of distance and performance measures. Based on the previous discussion we suggest that the most reasonable overall distance measure is the point to associated border. But it should be stressed that the choice of distance and performance measures is highly dependent on the application and the model.
Note that experimental results for parameter estimation can be found in [65, 70].
Chapter 4
Initialization
To make inference about an object in an image y, estimation of the maximum a posteriori (MAP) is performed:

  v̂ = max_v P(v|y)    (4.1)

The MAP estimation is with few exceptions carried out using an optimization procedure, which presumes an initial configuration of the template parameters. This basically separates the MAP estimation into two steps: initialization and optimization. This chapter is dedicated to initialization and the following to optimization.
The requirement for the initial configuration is that it should be close enough to the object in the image to achieve successful refinement during the optimization. Obviously the initialization is crucial for the success of the segmentation, since the optimization fails if the initial configuration is not close enough to the object. A general initialization scheme is also essential for a fully automated framework, which can be applied to a new problem by a naive user. Still initialization has received very little attention in the literature and many authors do not even comment on how they initialize.
In the following we classify the different initialization strategies into three classes: manual, feature based and model based initialization. Even though manual initialization is very popular, we will not treat this topic further.
4.1 Feature Based Initialization
Feature based initializationextracts some features from the image.
Thesefeaturesarethenusedforinitializationofthetemplate. Most
featurebasedmethodsareadhoc methodstunedto aspecic prob-
lem and are of no general interest. Unfortunately this is how most
automatic schemes work. Stillthere exists a few interestingfeature
based strategies,whicharemore orless general.
4.1.1 Initialization From Moments
Blakeet al. [18 ] proposeto performtheinitializationfrom invariant
moments of a binary blob. The binary blob is obtained by thresh-
olding the image or a preprocessed version. A prototype shape is
thenalignedfromthesemomentswithrespecttoEuclideanoraÆne
similarities. Unfortunately binary blobs are not obtainable in the
general case, butinitializationfrom moments still seems like a fast
and eÆcient methodfora rangeof problems.
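A sketch of the idea: position, area and orientation of a binary blob can be read off its first and central second moments, which then fix the pose of a prototype shape (the synthetic mask below is invented for the example):

```python
import numpy as np

def blob_pose(mask):
    """Estimate position, area and orientation of a binary blob from
    its first and central second moments."""
    r, c = np.nonzero(mask)
    area = r.size
    cr, cc = r.mean(), c.mean()
    # Central second moments (mu20 over columns, mu02 over rows).
    mu20 = np.mean((c - cc) ** 2)
    mu02 = np.mean((r - cr) ** 2)
    mu11 = np.mean((r - cr) * (c - cc))
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    return (cr, cc), area, theta

mask = np.zeros((40, 40), dtype=bool)
mask[18:22, 5:35] = True      # a horizontal bar
(cr, cc), area, theta = blob_pose(mask)
```

For the horizontal bar the centroid lands in its middle and the estimated orientation is zero.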
4.1.2 Texture and Color.
Zhong and Jain [190, 191] propose to use local texture and color features for initialization. From a prototype image of the object local texture and color are extracted. To locate a new object, regions with similar color and texture patterns are identified and used for initialization. To speed up the processing, texture and color features are extracted directly from the compressed domain of the discrete cosine transform.
4.1.3 Generalized Hough Transform.
Lai and Chin [118] and Garrido and Blanca [72, 73] propose to perform the initialization using the generalized Hough transform [11] and a modified version of the generalized Hough transform, respectively. A prototype shape is supplied as the arbitrary shape to detect. Recall that the generalized Hough transform is invariant with respect to Euclidean transformations. Even though the generalized Hough transform is a powerful method, it is not a general initialization method since it relies on gradient information.
4.2 Model Based Initialization.
In model based initialization the full posterior model, P(v|y), or a part of it is applied. Staib et al. [163] use the mode of the prior distribution, P(v), for initialization, i.e. the configuration of the template parameters which has the highest prior probability, usually the mean shape. Unfortunately this method is only applicable for a few models, since most prior models are invariant to Euclidean transformations.
4.2.1 Search Based Initialization.
The only truly general initialization approach, which we have knowledge about, is the search strategy [61, 102, 164, 66, 67, 68, 70, 165], where a sparse search is performed in the parameter space Ω_v. The static search strategy used by Jain et al. [102] and Fisker et al. [66, 67, 68, 70] can be summarized as:
1. Create relevant search configurations v_0, …, v_{k-1} and set i = 0.
2. Calculate P(y|v_i) for the center of the template corresponding to a grid of positions within the relevant region of interest.
3. Calculate P(v_i|y) = P(v_i) P(y|v_i).
4. i = i + 1. Go to 2 if i < k.
5. Extract the initial configurations from the calculated P(v|y).
Note that the search configurations include rotation, scaling and shape changes but not translation. We refer to [70] for a discussion on how to create the relevant search configurations, select the relevant region of interest and extract the initial configurations from P(v|y).
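The static search amounts to an exhaustive scoring loop over configurations and grid positions; the toy posterior below is invented for the example and stands in for P(v)P(y|v):

```python
import numpy as np

def static_search(posterior, configs, positions, n_best=1):
    """Static search initialization: score every search configuration
    at every grid position with the (unnormalized) posterior and
    return the best scoring (configuration, position) pairs."""
    scores = []
    for v in configs:
        for pos in positions:
            scores.append((posterior(v, pos), v, pos))
    scores.sort(key=lambda t: t[0], reverse=True)
    return scores[:n_best]

# Toy posterior peaked at configuration 2 and position (5, 7).
def posterior(v, pos):
    return np.exp(-(v - 2) ** 2) * np.exp(-((pos[0] - 5) ** 2
                                            + (pos[1] - 7) ** 2))

configs = [0, 1, 2, 3]
positions = [(r, c) for r in range(10) for c in range(10)]
best = static_search(posterior, configs, positions, n_best=1)
```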
Cootes et al. [61] and Stegmann et al. [165] use an optimization based variant of the search strategy, which can be summarized as:
1. Create relevant search configurations v_0, …, v_{k-1} and set i = 0.
2. Perform limited MAP estimation, v̂_i = max_{v_i} P(v_i|y), for the center of the template corresponding to a grid of positions within the relevant region of interest. The limited MAP estimation is defined as using less than q iterations in the iterative optimization scheme.
3. i = i + 1. Go to 2 if i < k.
4. Extract the initial configurations from the calculated P(v̂|y).
Note that the static search is a special case (q = 0) of the optimization based search.
4.2.2 Very Fast Static Search.
The disadvantage of the search strategy is the computational cost, which mainly originates from the evaluations of P(v|y). In the case of a static search Fisker et al. [68] propose a very fast search algorithm, which exploits the fact that many likelihood models can be interpreted as a simple correlation of a filter, f_l(v, θ), and the image, y. Recall that in the static search a static template, v_i, is shifted around the image and P(y|v) is calculated for each position in the grid, usually corresponding to each pixel. Using the filter interpretation all these calculations can be performed by one convolution of the filter f_l(v_i, θ) and y. Performing the convolution in Fourier space reduces the computation time further and makes the computation time invariant to the size of the filter/template [70]. Fisker et al. [68, 70] report significant reductions in computation time using this approach. We refer to [68, 70] for details on the fast search strategy and experimental results.
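The filter interpretation can be sketched with an FFT based correlation: one transform pair scores a static template at every pixel position simultaneously (circular boundary handling and the planted-template test image are simplifying assumptions of this sketch):

```python
import numpy as np

def correlate_fft(image, filt):
    """Score a small filter against the image at every position via the
    FFT; one pass replaces per-position evaluation of the likelihood in
    a static search (circular boundary handling)."""
    H, W = image.shape
    padded = np.zeros((H, W))
    padded[:filt.shape[0], :filt.shape[1]] = filt
    # Correlation = IFFT( FFT(image) * conj(FFT(filter)) ).
    return np.real(np.fft.ifft2(np.fft.fft2(image)
                                * np.conj(np.fft.fft2(padded))))

rng = np.random.default_rng(1)
image = rng.normal(0, 0.1, (64, 64))
filt = np.ones((5, 5))
image[20:25, 30:35] += 1.0           # plant the template in the image
scores = correlate_fft(image, filt)
peak = np.unravel_index(np.argmax(scores), scores.shape)
```

The correlation peak recovers the planted position, and the cost is independent of the filter size.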
4.3 Multi-hypothesis Initialization.
For hard initialization problems false configurations might be extracted in the initialization (or a number of true configurations are missed). Examples of hard initialization problems are many objects in the same image with large changes in scale and orientation, overlapping objects and varying background.
The simple solution is to perform a denser search. However this can be very computationally expensive. An alternative strategy is to work with multiple object hypotheses, i.e. extract all reasonable initial configurations, accepting that some of them are false hypotheses. These initial configurations are then fed to the optimization algorithm. Finally a validation is performed to remove the false objects. Examples of multi-hypothesis initialization can be found in [61, 66, 70, 165].
4.4 Conclusion and Discussion.
In this chapter a number of initialization strategies have been reviewed. The only strategy which generalizes to any deformable template (with an explicit optimization criterion) is the search strategy. Two types of search strategies are presented: static search and optimization based search. The choice of search depends on the actual problem, model, needed search configurations etc. But as a rule of thumb the static search should be applied to quite simple initialization problems, since it is faster, whereas the optimization based search should be applied in the harder cases.
An alternative to the search based initialization is the feature based initialization. Feature based strategies become attractive when the computational cost of a search strategy becomes too high, e.g. for objects with large variation in scale and orientation where a huge number of search configurations are needed for a successful initialization.
Given one object in an image the global optimum of P(y|v) corresponds to this object, if the model is properly designed. In this case the search based initialization guarantees successful initialization, since it is only a question of how dense the search needs to be to obtain an initial configuration which achieves successful refinement during the optimization. In general a search based approach is theoretically more appealing, since a unified model is used for initialization and optimization, whereas the feature based approach applies one model for initialization and another model for the optimization/fitting. It is hard to see theoretical arguments for applying two different models. It could be argued that if some feature is relevant for initialization it should be relevant for the optimization as well and vice versa. Basically there is only one argument for using feature based initialization and that is speed.
Chapter 5
Optimization
The final optimization is typically a medium-to-high-dimensional optimization problem. In most cases the problem can be categorized as continuous, unconstrained optimization of a nonlinear, non-convex function. The problem is continuous because the template parameters are real, i.e. Ω_v ⊆ R^n where n is the number of template parameters, and unconstrained because most authors do not work with hard constraints.
The following sections contain a compact survey of the most popular optimization methods used with deformable template models. Recall v̂ = max_v P(v|y) = min_v U(v, y). In practice the MAP estimation is always performed as min_v U(v, y), hence we will only deal with this formulation in the following.
5.1 Deterministic Optimization Methods
5.1.1 Gradient Based Methods
In the optimization literature the gradient based methods receive considerable attention. Gradient based methods rest on the computation of the partial derivatives of the posterior energy, ∂U(v, y)/∂v. These derivatives are then more or less cleverly applied in the search for the optimum. In the simple case of steepest descent the optimization is performed by iteratively updating the parameters in the direction of the negative gradient, i.e. dv_t ∝ −∇U(v_t, y). The gradient based approaches are very popular with deformable template models, see e.g. [33, 72, 101, 102, 108, 163, 189].
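A minimal steepest descent sketch for min_v U(v, y), using a toy quadratic energy with a known minimum (the energy and step size are illustrative assumptions):

```python
import numpy as np

def steepest_descent(grad, v0, step=0.1, n_iter=100):
    """Minimize U by iteratively stepping along the negative gradient."""
    v = np.asarray(v0, dtype=float)
    for _ in range(n_iter):
        v = v - step * grad(v)
    return v

# Toy posterior energy U(v) = ||v - v*||^2 with minimum at v* = (1, -2),
# so grad U(v) = 2 (v - v*).
target = np.array([1.0, -2.0])
grad_U = lambda v: 2 * (v - target)
v_hat = steepest_descent(grad_U, [0.0, 0.0], step=0.1, n_iter=200)
```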
5.1.2 Dynamic Programming
Assume the posterior can be separated into dependent subproblems:

  U(v, y) = U_0(v_0, …, v_l, y) + U_1(v_1, …, v_{l+1}, y) + … + U_{k-l-1}(v_{k-1-l}, …, v_{k-1}, y)    (5.1)

where l is a (small) integer. Then the optimization can be performed very efficiently by dynamic programming. Dynamic programming works by solving the separated energy terms U_i(v_i, …, v_{i+l}, y) recursively given v_{i+1}, …, v_{i+l} are fixed. Dynamic programming provides the machinery for enforcing hard constraints and guarantees the optimal solution within the resolution of the discretization. For a more elaborate treatment of dynamic programming see e.g. [13, 44]. In the context of deformable template models dynamic programming was introduced by Amini et al. [4]. Applying dynamic programming for optimization of deformable template models has been investigated and further developed in [28, 45, 74, 110, 113, 170].
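For the chain-structured case l = 1 with discretized template parameters, the recursion can be sketched as a Viterbi-style sweep; the unary/pairwise decomposition below is a toy instance of (5.1), not a snake energy from the literature:

```python
import numpy as np

def dp_chain(unary, pairwise, states):
    """Exact minimization of sum_i unary[i](v_i) + sum_i pairwise(v_i,
    v_{i+1}) over a chain of discrete states by dynamic programming."""
    k, n = len(unary), len(states)
    cost = np.array([unary[0](s) for s in states], dtype=float)
    back = np.zeros((k, n), dtype=int)
    for i in range(1, k):
        new_cost = np.empty(n)
        for b, sb in enumerate(states):
            trans = [cost[a] + pairwise(sa, sb)
                     for a, sa in enumerate(states)]
            back[i, b] = int(np.argmin(trans))
            new_cost[b] = trans[back[i, b]] + unary[i](sb)
        cost = new_cost
    # Backtrack the optimal configuration.
    path = [int(np.argmin(cost))]
    for i in range(k - 1, 0, -1):
        path.append(back[i, path[-1]])
    path.reverse()
    return [states[j] for j in path], float(cost.min())

# Toy problem: data pulls each v_i towards d_i, smoothness penalizes jumps.
data = [0.0, 0.0, 5.0, 0.0]
unary = [lambda s, d=d: (s - d) ** 2 for d in data]
pairwise = lambda a, b: 2.0 * (a - b) ** 2
states = list(range(6))       # v_i discretized to {0, ..., 5}
v_opt, e_opt = dp_chain(unary, pairwise, states)
```

The sweep costs O(k n²) instead of the O(n^k) of exhaustive search, and the result is globally optimal within the discretization.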
5.1.3 Dierence Decomposition
In difference decomposition, proposed by Gleicher [81] and in a similar approach proposed in parallel by Cootes et al. [35, 40], the template parameters, v, are updated based on the difference, D, between the image synthesized by the model and the actual image. The method is only applicable for deformable template models with a full pixel model. By displacing the parameters in a training set with ground truth, the relationship between displacements, dv, and difference images, D, is learned. The relationship is approximated by a linear regression, dv = R D, where the matrix R is obtained from the training. The main difference between the approach in [81] and in [35, 40] is how R is determined. Difference decomposition has also been applied in [100, 156].
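The training step can be sketched with synthetic data: if a hypothetical linear imaging model generates difference images from displacements, least squares recovers an update matrix R such that dv = R D (all matrices below are invented for the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical linear imaging model: a parameter displacement dv
# produces a difference image D = A dv.
A = rng.normal(size=(50, 3))           # pixels x parameters
dv_train = rng.normal(size=(200, 3))   # known training displacements
D_train = dv_train @ A.T               # corresponding difference images

# Learn the update matrix R by linear regression dv = R D
# (least squares over the training set).
X, *_ = np.linalg.lstsq(D_train, dv_train, rcond=None)
R = X.T                                # so that dv = R @ D

# At run time a difference image maps straight to a parameter update.
dv_true = np.array([0.5, -1.0, 0.25])
D = A @ dv_true
dv_pred = R @ D
```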
5.1.4 Heuristics
Beside the above mentioned methods numerous heuristics have been applied. The heuristics are usually tuned to a specific model, but can be very efficient. Examples can be found in [41, 60, 118].
5.2 Stochastic Optimization Methods
5.2.1 Genetic Algorithms
Genetic algorithms [82, 93] employ mechanisms similar to evolution in nature. Genetic algorithms maintain a population of templates. This population evolves towards the optimal solution by producing new generations of templates, where the survival of the individual template is based on the corresponding posterior probability. A new generation is born by mutating the existing population in a probabilistic fashion, e.g. by random crossover. We refer to e.g. [82, 135] for a more rigorous treatment of genetic algorithms. Examples of applying genetic algorithms to deformable template models can be found in [12, 50, 91, 128].
5.2.2 Simulated Annealing
Simulated annealing [27, 112] can be compared to a physical procedure, called annealing, in which a physical substance is melted and then slowly cooled in search of a low energy configuration. The system is controlled by a decreasing temperature, T_t, and new configurations are generated by a random search method, such as the Metropolis algorithm [134] or the Gibbs sampler [75]. For high T_t a new configuration with a higher energy, U(v, y), might be accepted, whereas at low T_t only configurations corresponding to a decrease in U(v, y) are likely to be accepted. The expectation is that large early fluctuations will allow rapid escape from shallow minima, whereas the later behaviour will act as a greedy algorithm descending into the global minimum. Popular temperature schemes are: T_t = κ T_{t-1} [112] and T_t = c / log(1 + t) [75], where κ and c are constants. Analytic conditions have been derived for the convergence towards the global optimum for a number of Markov Random Field models, see [75, 88]. Examples of the use of simulated annealing in the context of deformable models can be found in [23, 85, 155, 167, 68, 70].
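A Metropolis-style sketch with the geometric temperature schedule; the energy below is a toy quadratic with a small ripple, invented for the example, not a deformable template energy:

```python
import numpy as np

def simulated_annealing(U, v0, T0=2.0, kappa=0.995, n_iter=4000, seed=0):
    """Minimize U by Metropolis-style simulated annealing with a
    geometric temperature schedule T_t = kappa * T_{t-1}."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=float)
    e, T = U(v), T0
    best_v, best_e = v.copy(), e
    for _ in range(n_iter):
        cand = v + rng.normal(0, 0.5, size=v.shape)  # random proposal
        e_cand = U(cand)
        # Accept uphill moves with probability exp(-dE / T).
        if e_cand < e or rng.random() < np.exp(-(e_cand - e) / T):
            v, e = cand, e_cand
            if e < best_e:
                best_v, best_e = v.copy(), e
        T *= kappa
    return best_v, best_e

def U(v):
    v = np.asarray(v, dtype=float)
    return float(np.sum((v - 2.0) ** 2 + 0.5 * np.sin(3.0 * v)))

best_v, best_e = simulated_annealing(U, [4.0, -3.0])
```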
5.2.3 Stochastic Diusion
Stochastic diffusion for optimization [3, 76] is based on the stochastic differential equation:

  dv_t = −∇U(v_t, y) dt + √(2 T_t) dw_t    (5.2)

where w_t is the standard Brownian motion with independent identically distributed components and the temperature, T_t, controls the magnitude of the random fluctuations. (5.2) can be interpreted as a steepest descent algorithm with random fluctuations in the gradient. The annealing approach ensures large fluctuations for high temperatures, whereas the later behaviour will be a steepest descent. Given special assumptions about U(v, y), global convergence has been derived, see [76, 79]. Stochastic diffusion has also been applied to the optimization of deformable template models.
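A discretized sketch of (5.2): an Euler step of the drift plus annealed Brownian increments (a Langevin-type update). The logarithmic cooling schedule and the toy quadratic energy are illustrative assumptions:

```python
import numpy as np

def stochastic_diffusion(grad_U, v0, T0=1.0, n_iter=5000, dt=0.01, seed=0):
    """Discretized stochastic diffusion: steepest descent plus annealed
    Brownian noise (Langevin-type updates)."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=float)
    for t in range(n_iter):
        T = T0 / np.log(2.0 + t)              # logarithmic cooling
        v = (v - grad_U(v) * dt
             + np.sqrt(2 * T * dt) * rng.normal(size=v.shape))
    return v

# Toy energy U(v) = ||v - v*||^2, so grad U(v) = 2 (v - v*).
target = np.array([1.0, -1.0])
grad_U = lambda v: 2 * (v - target)
v_hat = stochastic_diffusion(grad_U, [5.0, 5.0])
```

At the final temperatures the iterate fluctuates in a small neighbourhood of the minimum rather than converging exactly, which is the expected behaviour of the annealed diffusion.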
5.3 Comparative Study
In a comparative study by Fisker et al. [68] the performance of three gradient based methods, Simulated Annealing and Pattern Search [94] is compared for a simple ellipse model. With respect to the final energy the study shows that all methods perform almost equally well within a reasonable number of function evaluations, but if a large number of function evaluations is allowed, simulated annealing obtains very good energies. When the speed of convergence is compared, the gradient based methods are fastest during the later phase of the optimization. When the initial configuration is further away from the final solution, the relative performance of Pattern Search and Simulated Annealing increases. Basically all these results are in very good agreement with the expected behaviour.
5.4 Conclusion and Discussion
The task of optimization is not only a question of obtaining the minimal energy configuration, but also a question of convergence speed. If the objective is to obtain the minimal energy, a method with good global convergence properties should be selected, i.e. dynamic programming, simulated annealing or stochastic diffusion. If concerned with the speed of convergence, a deterministic method should be applied, since the stochastic methods in general suffer from slow convergence. Dynamic programming satisfies both objectives; however, it requires the separation of the energy into subproblems. In general the choice of optimization is a classical tradeoff between accuracy and speed, and it is also highly dependent on the actual model. Note that the computational cost and robustness of the optimization can be improved by a multi-resolution strategy [42, 102].
Another important aspect for the choice of optimization algorithm is the general quality of the initialization. If the initialization is close to the final solution, a simple deterministic method will usually suffice. In general it is our experience that a good initialization often is more important for the performance than the actual choice of optimization algorithm. This conclusion is also supported by the empirical results in [68].
Another important, but often forgotten, aspect for obtaining successful optimization is the quality of the optimization criterion. A properly designed optimization criterion should be smooth with few local optima. The prior plays an important role in obtaining a smooth and convex optimization criterion, since most prior models consist of a quadratic term and hence are born smooth and convex. Jain et al. [102] utilize this by gradually reducing the smoothness of the optimization criterion by down-weighting the influence of the prior during the optimization process.