Using
Spae Mapping
Pernille Brok
LYNGBY 2004
EKSAMENSPROJEKT
NR. 54
IMM
Prefae
ThisMasterThesisissubmitted atIMM,DTU,withthesupervisionofKaj
Madsen, Professor, dr.tehn., andHans BruunNielsen,Ass.Professor.
Iwould like to thank Kaj Madsenand HansBruun Nielsen for their enthu-
siasm,and foralwayshaving timefor aquestion or a disussion.
AlsoIwouldliketothankPh.D.StudentFrankPedersenforbeinginvolved
intheprojet andsupplying manygood ideas.
Lyngby, August2,2004
Pernille Brok
Abstrat
Thesubjetofthis masterthesis isnon-linear optimization usingtheSpae
Mappingmethodwithan interpolating surrogatemodel.
The Spae Mapping method is useful in optimization problems, where the
ne model we wish to optimize is very omputationally expensive. The
interpolating surrogate is based on a heap oarse model and serves as a
replaement for the expensive model in order to minimize the number of
funtion evaluations.
An important part of the Spae Mapping algorithm is the Parameter Ex-
tration, whih involvesminimization of the residual between thesurrogate
andthenemodel,whihweaimtoalign. TheParameterExtration prob-
lem does not always have a unique solution, and dierent formulations are
presentedinorderto ensure this uniqueness.
Thethesisprovidesapresentationofthemathematialtheoryfollowedbythe
SpaeMappingalgorithm. Wethenmakeanumberoftheoretialand pra-
tialinvestigationsonerningdierent formulations oftheresidual dening
theParameter Extration problem.
The step length in forward dierene approximations is analyzed, and the
optimalsteplengthsuitedfortheonsideredproblemsisfoundtobeapprox-
imately
10 − 5
. Wemakeananalysisofthe solutionsto underdeterminedand overdetermined problems, hereby an analysis of the Marquardt equationsand of leastsquares problemswith and withoutweighting fators. We look
at theeetof addinga regularization termto theresidual vetor andnd,
thatthisresidualformulationorrespondstoaspeialaseoftheMarquardt
equations withthe damping parameter
1 + µ
.ThepresentedSpaeMapping algorithmis testedinthevariousversions on
threetestproblems,andtheresultsareompared. Theonvergene isfaster
than withlassial optimization algorithms. Itis not possible to make gen-
eralonlusionsontheperformaneofthedierentalgorithmversionsbased
onthe inluded testproblems.
Key words: Spae Mapping, non-linear optimization, interpolating sur-
overdetermined problems.
Resumé
Detteeksamensprojektomhandlerikke-lineæroptimeringmedbrugafSpae
Mapping-metoden medinterpolerendesurrogater.
Spae Mapping-metoden eranvendelig ioptimeringsproblemer ved optime-
ring af en n model, som er meget dyr beregningsmæssigt. Det interpo-
lerende surrogat er baseret på en billig grov model og erstatter den ne
modelioptimeringsproessen,hvorved vimindskerantalletaftidskrævende
funktionsevalueringer.
Et vigtigt delproblem i forbindelse med Spae Mapping-algoritmen er Pa-
rameter-Ekstraktion, som involverer minimering af residuet mellem surro-
gatet ogden ne model, somvi ønskerat mathe. Parameter-Ekstraktions-
problemet har ikke altid en entydig løsning, og vi præsenterer forskellige
formuleringer meddet formålat sikre en entydig løsning.
Projektet præsenterer den matematiske teori efterfulgt af Spae Mapping-
algoritmen. Herefter laves en række teoretiske og praktiske undersøgelser
vedrørendedeforskelligeformuleringerafresiduet,somdenererParameter-
Ekstraktions-problemet.
Skridtlængden i dierenstilnærmelser analyseres, og den optimale skridt-
længde, som er velegnet til de her betragtede problemer, bestemmes til
omkring
10 − 5
. Vi analyserer løsninger til underbestemte og overbestemte problemer,herunderMarquardts ligningerog mindste-kvadratersproblemermedog uden vægtfaktorer. Vibetragter eekten afat medtage etregulari-
seringsled i residuet, og nder at en sådan residue-formulering svarer til et
speialtilfælde afMarquardts ligninger meddæmpningsparameteren
1 + µ
.Deforskelligeversioner afden beskrevneSpaeMapping-algoritme afprøves
på tre testproblemer, og resultaterne sammenlignes. Konvergensen er hur-
tigere end for klassiske optimeringsmetoder benyttet direkte på den ne
model. Det er ikke muligt, at foretage generelle konklusioner om algorit-
mens præstationer påbasisaf deherinkluderede testproblemer.
Nøgleord : Spae Mapping, ikke-lineær optimering, interpolerende surro-
gater, mindste-kvadraters problemer, vægtfaktorer, underbestemte og over-
Contents
1 Introdution 1
1.1 Introdution to theSpaeMappingMethod . . . 1
1.1.1 Dierent SpaeMapping Tehniques . . . 2
1.2 ProblemFormulation . . . 2
1.3 MathematialIntrodution . . . 3
1.3.1 Overview ofThe SpaeMappingAlgorithm . . . 7
1.3.2 NewFormulationof the ResidualVetor . . . 7
1.4 Assumptions . . . 8
1.5 PreviousWorkand Implementation . . . 9
2 Implementation of The Spae Mapping Algorithm 11 2.1 TheSpaeMappingAlgorithm . . . 11
2.1.1 TheMainAlgorithm . . . 12
2.1.2 TheAlgorithmfor Surrogate Optimization. . . 16
2.1.3 TheAlgorithmfor Parameter Extration . . . 18
3 Theoretial and Pratial Investigations 21 3.1 FiniteDierene Approximation . . . 21
3.1.1 OptimalStepLength . . . 22
3.1.2 Resultsfromthe TLT2Problem . . . 26
3.2 TheMarquardtAlgorithm . . . 29
3.2.1 SingularValueDeomposition . . . 29
3.3 Regularization. . . 34
3.4 Variable NumberofMapping Parameters . . . 36
3.5 ThePenalty Fator . . . 39
3.7 The Normalization Fators . . . 43
4 Test Problems 45 4.1 Introdution . . . 45
4.1.1 TheTest Senarios . . . 46
4.1.2 VisualizationofThe Results. . . 48
4.2 The RosenbrokProblem . . . 50
4.2.1 Introdution. . . 50
4.2.2 LinearTransformation . . . 51
4.2.3 TheAugmented RosenbrokFuntion . . . 58
4.3 The TLT2 Problem . . . 63
4.3.1 Introdution. . . 63
4.3.2 TheResultsof theTest Runs . . . 64
4.4 The TLT7 Problem . . . 75
4.4.1 Introdution. . . 75
4.4.2 TheResultsof theTest Runs . . . 76
5 Future Work 81 5.1 Improvements oftheSMIS Implementation . . . 81
5.2 Suggestions for Further Investigations . . . 82
6 Conlusion 85 A Short User's Guide for The SMISFramework 89 A.1 The ProblemSetup-le. . . 89
A.2 Calling the SpaeMappingAlgorithm . . . 90
A.3 Plotting and ViewingData . . . 91
Symbols and Notation 93
Chapter 1
Introdution
1.1 Introdution to the Spae Mapping Method
TheSpaeMappingmethodisanoptimization methodusedforengineering
design problems. The tehnique is useful, when the model that we wishto
optimizeisomputationally expensive. Inthisasetheuseofalassialop-
timizationmethoddiretlyonthe nemodelwouldresultinalargenumber
offuntionevaluations,andisonsideredimpossibleinpratie. Thegoalis
to lowerthe number oftime-onsuming nemodelevaluations.
The Spae Mapping method relies on the existene of two funtions mod-
elling the same system: the ne model, whih is very time-onsuming to
evaluate, and a oarse model, whih is heap to evaluate. We wish to on-
strut a surrogate model based on the oarse model, and let the surrogate
serve as a replaement for the ne modelin the optimization proess. The
ne model is suessively evaluated in order to onstrut an interpolating
surrogate model, whihis thenused for optimization. Thesurrogate model
isat leastasaurate asthe oarse model. By aligningthesurrogatemodel
with thene model in more than one point we seek global aswell as loal
agreement of thetwomodels.
Theinterpolatingsurrogateisonstrutedasaomposedmappingonsisting
ofboth an inputand an outputmapping. Thismapping is theSpaeMap-
ping onneting the oarse model responses with the ne model responses.
Thedesign parameters aretransformed bytheinputmapping, andtheout-
put mapping orrets the surrogate to ensure exat agreement of the re-
sponses. Wealignbothfuntionvaluesandgradientsofthesurrogatemodel
with the ne model and hereby wish, that the surrogate provides a good
TheSpaeMapping-basedoptimizationalgorithmonsistoftwosub-problems:
•
The optimization ofthesurrogate model.•
Theupdateofthesurrogate,theso-alledParameterExtration,whihdetermines themapping parameters inorder to ensure the agreement
of thesurrogate andthene model.
1.1.1 Dierent Spae Mapping Tehniques
TheoriginalSpaeMappingformulation isdesribed in[4 ℄and [5℄andonly
involves aninput mapping
P : R n → R n
,where:P(x) = arg min
z ∈R n k c(z) − f(x) k 2 2
(1.1.1)We referto (1.1.1) astheoriginalSpaeMapping denition.
TheSpaeMappingmethods withinputmapping anbeapproahed indif-
ferent waysin orderto ensure theuniqueness oftheSpae Mapping. A full
overview andfurther disussionsof thevariousversionsareprovided byJa-
ob Søndergaard in[7℄. When onlyinput mappings areused, we annot be
surethattheSpaeMappingtehniqueprovidesthenemodeloptimizerasa
solution, unlessertaintheoretialonditions aremet. Theseonditions are
stated in[7℄,hapter4.1. Theexat math between thene modeland the
oarse modelresponseis thereforenot likely intheoriginalSpaeMapping,
eventhoughthemappedoarsemodelanprovide agoodapproximationto
the nemodelover alarge region ofthe parameter spae.
When introduing an additional mapping, an output mapping, to dene
the surrogatemodel, we an ensurethe mathing, and herebyoveromethe
residualmisalignment. Onthatground theSpaeMappingtehniqueswith
both inputandoutputmappingsareto bepreferred. Withthesetehniques
the uniqueness of the Parameter Extration is still not ensured, whih is a
problemthatan be solved inmanyways.
ThisreportwillonlyworkwiththeSpaeMappingmethodwithboth input
andoutputmappings, providingan interpolating surrogatethatgivesexat
alignment withthene modelintheexpansion point.
1.2 Problem Formulation
The main subjet of this thesis is the Spae Mapping method based on
an interpolating surrogate. We wishto investigate dierent theoretial and
theParameter Extrationproblems. Onthis basiswepresenta SpaeMap-
ping algorithm and test the implementation of the algorithm on dierent
test problems.
Inthereportweanalyze the followingsubjets:
•
Theapproximationerrorfromusingforwarddiereneapproximations asestimatesfor thederivatives, andhereby theoptimal steplength.•
Least squaresproblems.- TheMarquardt equations.
- Thesolution to theregularized problem.
- Thesolutionstounderdeterminedandoverdeterminedleastsquares
problemswith andwithout weights.
•
Dierent formulations ofthe residual inorder to ensure uniquenessof theParameter Extration.- Redution ofthenumber ofinputmapping parameters.
- The eet of using a regularization term in the Parameter Ex-
trationproblem.
- TheeetofusingweightingfatorsintheParameterExtration
problem.
- The eet of using normalization fators in the Parameter Ex-
trationproblem.
The mathematial theoryof the SpaeMapping method with interpolating
surrogate is introdued, and the Spae Mapping algorithm is presented in
pseudo-ode. We then onsider the theoretial and pratial investigations
of thesubjets above. Thevarious versions of the algorithm aretested nu-
merially on three problems. Finally suggestions for future investigations
areproposedbylisting some unresolved matters.
1.3 Mathematial Introdution
We areaimingat solving an optimization problemoftheform:
x ∗ = arg min
x ∈R n { H (f (x)) }
where
H : R m → R
is a suitable objetive funtion, andx ∗ ∈ R n
is theoptimal setofdesign parameters.
We assume, that two models of the same system are available: A ne but
expensivemodel,givenby
f : R n → R m
and aoarsebut heap modelgivenby
c : R n → R m
. The funtion vetors froma given parameter set are alsodenotedresponsevetors.
The surrogate model
s : R n → R m
is dened bya ompositemapping: Foreahof the
m
responses we dene theinput mappingP i : R n → R n
,whihperforms a linear transformation of the design parameters, and the output
mapping
O : R m → R m
,whih transforms theoarse modelresponse. Theaimis toalign the surrogate withthe ne modelfor all
m
responses.Theinputand outputmapping parameters for
i = 1, . . . m
are:A i ∈ R n × n , b i ∈ R n , α i ∈ R , β i ∈ R .
Thelineartransformation
P i
for thei
thresponsefuntionisnowdened as:P i (x) = A i x + b i
(1.3.1)andthe output mapping
O i
as:O i (y) = α i (y i − y ¯ i ) + β i
(1.3.2)where
y ¯
isa onstant vetor. Gathering theinputand outputmappingswehave:
P =
P
T1
.
.
.
P
Tm
, O =
O 1
.
.
.
O m
Theinterpolating surrogatemodel isnowdened bythe omposition:
s = O ◦ c ◦ P
(1.3.3)When inserting the expressions for the input and output mappings we get
the surrogatemodel forthe
i
th response given by:s i (x) = O i (c i (P i (x)))
= α i (c i (P i (x)) − c i (P i (¯ x))) + β i
= α i (c i (A i x + b i ) − c i (A i x ¯ + b i )) + β i
We wish to align the responses of the surrogate model with thene model
inall
m
sampling points. Inthek
th iterationx (k)
we musttherebyhave:s (k) (x (k) ) = f (x (k) )
(1.3.4)where
s (k)
denotes the surrogate usedinthek
th iteration. We furthermore wantthesurrogatemodeltoapproximatethenemodelatpreviousiterationpoints. An additional riterion for hoosing the mapping parameters is to
aim for agreement of the Jaobians ofthe ne model (denoted
J f
) and thesurrogate model (denoted
J s
) inthe urrent iterate. This leeds to the twoequations:
s (k) (x (j) ) = f (x (j) )
forj = 1, . . . , k − 1
(1.3.5a)J (k) s (x (k) ) = J f (x (k) )
(1.3.5b)Equations (1.3.4) and (1.3.5) ensure the alignment of the surrogate model
andthenemodelbothbothwrt.thefuntionresponsesandtheJaobians
in the urrent iterate as well as wrt. the funtion responses in all previous
iterates. The goal is to have both loal and global agreement of the mod-
els. The loal agreement isensured by (1.3.4) and (1.3.5b),and theglobal
agreement by(1.3.5a) .
The initial values of the mapping parameters an be hosen as follows:
We wish to start the iterations in the oarse model optimizer
z ∗
, so thatx (1) = z ∗
. Initeration0
we thereforewant thesurrogatemodeltobeidenti-alto theoarse model, whih isensured by hoosing theinputand output
mapping parameters as:
A (0) i = I b (0) i = 0 α (0) i = 1
β i (0) = α (0) i c i (P (0) i (x (0) ))
for
i = 1, . . . , m
(1.3.6)Inthis waythe
i
thresponseof the0
th surrogatebeomes:s (0) i (x) = α (0) i c i
P (0) i (x)
− c i
P (0) i (x (0) )
+ α (0) i c i (P (0) i (x (0) ))
= α (0) i c i
P (0) i (x)
= c i (x)
(1.3.7)Sinethe oarse modelis assumedto be heap to evaluate,theoptimizer is
foundbya standard optimization algorithm.
In the following iterations the mathing (1.3.4) is ensured by hoosing the
outputmapping parameters
α i
andβ i
andtheonstant¯ x
inan appropriate way. Byputtingx ¯ (k) = x (k)
wehave thei
thsurrogate inthek
th iteration:s (k) i (x) = α (k) i c i
P (k) i (x)
− c i
P (k) i (x (k) ) + β (k) i
Byinsertingthe iterate
x (k)
inthesurrogatefuntionandthenin(1.3.4)wendthevalue of
β i (k)
to be:α (k) i c i
P (k) i (x (k) )
− c i
P (k) i (x (k) )
+ β i (k) = f i (x (k) )
⇒ β i (k) = f i (x (k) )
The
i
th responseof the interpolatingsurrogate isnowgiven by:s (k) i (x) = α (k) i c i
P (k) i (x)
− c i
P (k) i (x (k) )
+ f i (x (k) )
(1.3.8)whih is valid for all
k > 0
.Beause of the hoie of
x ¯
the math (1.3.4) only depends on the outputparameter
β i
,andtheα i
'smust be hosen appropriately basedon(1.3.5) . Ineahiterationthe next setofdesignparametersx (k+1)
arefoundbymin-imizingthe surrogate(1.3.8) dened bythemapping parametersof thepre-
viousiteration:
x (k+1) = arg min
x ∈R n
n H
s (k) (x) o
(1.3.9)
It must be laried, that the new iterate
x (k+1)
is only aepted, if it pro-duesadereaseintheobjetive funtion ompared totheprevious iterate,
ie.if
H(f (x (k+1) )) < H (f(x (k) ))
. Ifthisisnotthease,thealignments(1.3.4)and (1.3.5b) must be made with respet to the previous (and so far best)
iterate. Theunaeptediterateisonlyusedintheglobalalignment equation
(1.3.5a),andmustnot satisfythegradient math. To handlesuhan uphill
stepregardingthenemodelobjetiveweuseatrust regionmethodforthe
surrogateoptimization.
When theiterate
x (k+1)
is now available, the(k + 1)
th set of mapping pa-rametersmustbefound. Theresponsealignment (1.3.4)isalready ensured.
In order to satisfy the additional mathing (1.3.5) we dene the residual
funtionfor the
i
th response:r (k+1) i (A i , b i , α i ) =
s (k+1) i (x (1) , A i , b i , α i ) − f i (x (1) )
.
.
.
s (k+1) i (x (k) , A i , b i , α i ) − f i (x (k) ) J (k+1) s,i (x (k+1) , A i , b i , α i ) − J f,i (x (k+1) )
(1.3.10)
where
J s,i
andJ f,i
arethegradientsoff i
ands i
wrt.x
,ie. thetransposeofthe
i
th rowsof the Jaobiansof thene resp. thesurrogate modelwrt. thex
vetor. We nd the next set of mapping parameters by minimizing theresidual:
n
A (k+1) i , b (k+1) i , α (k+1) i o
= arg min
A i ,b i ,α i k r (k+1) i (A i , b i , α i ) k
(1.3.11)insome norm for all
m
responses. Thisproess of updating theparametersisalledParameter Extration.
The iterations ontinue inthis way with optimization of theurrent surro-
gatefollowedbytheParameterExtration,untilthesolutionisfoundwithin
a satisfying auray. Appropriate stopping riteria ould be based on the
relative hange inthe solutionvetor or intheobjetive funtion.
1.3.1 Overview of The Spae Mapping Algorithm
Based on the previous setion we an now summarize the Spae Mapping
method with the interpolating surrogate. The algorithm for solving the
optimization problemisoutlined asfollows:
1. Given oarse modeland ne model.
2. Set
k = 0
,hooseinitial guess for the oarse optimizerx (0)
. Initializeinput and output mapping parameters
A (0) i = I
,b (0) i = 0
,α (0) i = 1
and
β i (0) = α (0) i c i (x (0) )
fori = 1, . . . , m
.3. Optimizethesurrogatemodel(1.3.8)tondthenextiterate
x (k+1)
bysolving (1.3.9) .
4. Compute
f (x (k+1) )
andJ f (x (k+1) )
. Chek stopping riteria and ter-minateifsatised.
5. Updatemappingparameters
A (k+1) i
,b (k+1) i
andα (k+1) i
fori = 1, . . . m
by (1.3.11)withtheresidual given by(1.3.10) .
6. Set
k = k + 1
and go to step3.1.3.2 New Formulation of the Residual Vetor
Inthepratial implementation oftheSpae Mappingalgorithm weusedif-
ferent versions of theresidual vetor for the Parameter Extration. This is
doneinordertoensurethattheproblemhasauniquesolution,andthatthis
solutionshouldsatisfytheloalandglobalagreementbetween thesurrogate
andthe nemodel, thatweaim for.
Onthatgroundweherebydene theresidual:
r (k+1) i (A i , b i , α i ) =
w 1 · a 1 ·
s (k+1) i (x (1) , A i , b i , α i ) − f i (x (1) )
.
.
.
w k · a k ·
s (k+1) i (x (k) , A i , b i , α i ) − f i (x (k) ) σ · d ·
J (k+1) s,i (x (k+1) , A i , b i , α i ) − J f,i (x (k+1) )
(1.3.12)
It isnoted, thatthedimension of
r i
is(k + n)
,when we have found(k + 1)
x
-iterates. The fatorsa 1 , . . . , a k
andd
are normalization fators used foravoiding salingproblems,inasetheresponsesarenot ofthesame orderof
magnitude.
The
w
-fatorsareweighting fatorsusedintherstk
elements oftheresid-ual.
Thefator
σ
isapenaltyfatoronlymultipliedonthelastn
elementsoftheresidualvetor. Theweightingfatorsandthepenaltyfatorhavethesame
eet, but we distinguish between them, beause the fators have dierent
purposes.
Theaim of the weighting fatorsisto give an individual priority to eahof
the iteration points intheresidual. Inthis waywean distinguishbetween
points far from the urrent iterate and points loser to the urrent iterate
andmake theglobal agreement moreor lessaurate inapartiular point.
The penalty fator is used to give the alignment of thegradients a ertain
priority ompared to the funtion value alignments in the previous points.
Ifweinrease
σ
,weanensure,thatthegradientsintheurrentpointmath.(1.3.12)isequivalent to(1.3.10) ,ifwe put allfators
a 1 , . . . , a k
,w 1 , . . . , w k
,d
andσ
equalto1
. Thetheoryofsetion1.3andthesummarizedalgorithmin 1.3.1 are hereby still valid, when we use the residual (1.3.12) instead of
(1.3.10). The new residual is equivalent to the residual (1.3.10) multiplied
bya diagonal matrix.
Throughoutthe restof thereport theresidual we useisgiven bythedeni-
tion(1.3.12) . Ifnothingelse ismentioned thefators
a 1 , . . . , a k
,w 1 , . . . , w k
,d
andσ
have the value1
. The rstk
elements of the residual are referredto asthe funtion value residual,whereas thelast
n
elements arealledthegradient residual.
1.4 Assumptions
Inoptimization problems therean oftenbe severaloptimizers, both global
andloal. TheSpaeMappingmethodisnotaglobal optimizationmethod,
and dependingon the problem, we annot be sure, that thefound solution
isthe global optimizer, or even thatthis optimizerisunique.
Anumberofonditionsmustbesatisedinordertondaminimizerbythe
SpaeMappingmethod. Theseonditionsaredisussedindetailsin[7℄,and
arenot thesubjetof this report.
We assumethefollowing:
•
The sets{ x ∗ } = arg min { H(f(x)) }
and{ z ∗ } = arg min { H(c(x)) }
are•
The oarse model optimizerz ∗
and the ne model optimizerx ∗
areunique.
•
Allvariables arereal.•
The oarse model funtion and the ne modelfuntion areboth on-tinous andat leastone timedierentiable.
•
Theevaluationtime for theoarse model isnegligible.1.5 Previous Work and Implementation
The Matlab programs made in onnetion with this thesis are working
in the existing SMIS (Spae Mapping Interpolating Surrogate) framework
implemented by Frank Pedersen. The framework is programmed in Mat-
lab,but also involves someFortransubroutines olleted intheF-pakage.
ThisinludedF-pakageontainsdierentalgorithmsforthesolutionofon-
strained and unonstrained non-linear optimization problems. A detailed
desriptionof theFortran subroutines isfound in[13℄.
The SMIS framework by Frank Pedersen inludes a number of algorithms
for solving optimization problems with the Spae Mapping Method. The
dierent versions of thealgorithms are plaed in their owndiretory orre-
spondingtothe partiularformulationofthealgorithm. Theproblemsused
to test the algorithms arealso plaed ineah of their own diretories. Fur-
thermore the framework ontains a number of diretories withbasi tools,
suh as forward dierene approximations, plot funtions et. A new tool-
box has been added, this is the immoptibox programmed by Hans Bruun
Nielsen [12℄. The framework an be augmented by addinga new algorithm
or a new test problem plaed in the proper new Matlab diretory in the
properMatlab searhpath.
All Matlab ode programmed during theworking period of this report is
available at IMM's homepage,see [15 ℄. Someof the program les aremod-
ifations or augmented versions of existing ode, and some les are made
fromsrath.
AppendixA providesashort user'sguidefor the SMISframework, yetonly
theimplementationsand test problems usedinthisreport areinluded.
Chapter 2
Implementation of The Spae
Mapping Algorithm
Inthishapterthe SpaeMappingalgorithmwill beoutlinedanddisussed.
The method an be parted into three algorithms: The main algorithm and
twosub-algorithms. Themain algorithmissummarized inthepreviousse-
tion, and is referred to as Algorithm 1. Eah iteration in this algorithm
onsistsoftwo optimization proedures:
Therstoptimization probleminvolvesndingthenextiteratebyminimiz-
ing the surrogate model dened by the urrent mapping parameters. This
sub-algorithm isalledAlgorithm 2.
Theseondoptimization proedure -the ParameterExtration - onsistsof
m
optimization problems eah giving a new set of mapping parameters for theorrespondingsurrogate modelresponse. Algorithm3 isusedineah ofthe
m
Parameter Extration problems.The three algorithms will be presented inpseudo-ode in thenext setions
followed byomments ontheinvolved parameters and proedures.
2.1 The Spae Mapping Algorithm
The algorithms follow the theoretial introdution in setion 1. Algorithm
1 builds on the Matlab implementation byFrank Pedersen, but has been
hanged to a ertain extent. The optimization problem in Algorithm 2 is
solved by alling a Fortran subroutine from the F-pakage. The algorithm
is idential to the original and has not been altered. The algorithm is not
disussedindetailsandispresentedheretogiveafulloverviewoftheSpae
anddierentformulationomparedtotheexistingframeworkbyFrankPed-
ersen.
Before presenting themainSpaeMapping algorithm we dene:
p
: Denes the norm used in the objetive funtionH : k·k
p. Possiblevaluesare
1
,2
or∞
,wherethe latter(minimaxoptimization) isused throughout this report.∆
: Trustregion radius usedinAlgorithm2.F
: The objetive funtion intheoptimization problem.Thestopping riteria aredened byanumber ofoptional parameters:
max f1
: Maximal number offuntion evaluations inAlgorithm1.max f2
: Maximal number offuntion evaluations inAlgorithm2.max f3
: Maximal number offuntion evaluations inAlgorithm3.ε F
: Usedinstopping riterion for theobjetive funtion.ε K
: Usedinstopping riterion for thegradient mathing.ε hx
: Usedinstopping riterion for thestep lengthforx
-iterates.ε hp
: Usedinstopping riterion for thestep lengthforp
-iterates.Thevaluesfor theparameters used inthe stopping riteria mustbe dened
inthe problemsetup-le,for more detailssee Appendix A.
2.1.1 The Main Algorithm
The surrogate
s
is given by (1.3.8) and thei
th residual funtionr i
by thegeneral formulation(1.3.12) .
Inthe algorithm thesupersript indexes
(k)
for iteration numbersare omit-
tedto simplify the pseudo-ode. The lower index 'new' orrespondsto the
upperindex
(k+1)
, for referenes to theformulation of thetheoryin setion
1.3. It is assumed, that the surrogate model and the residual funtion in
Algorithm 1: Main Algorithm for Spae Mapping Iterations
k = 0; stop = 0; x ∈ R n ; ∆ = 10 − 1 · k x k 2
A i = I; b i = 0; α i = 1; β i = α i c i (x)
for i=1,...,mwhile not
stop
Find
h new = arg min k h k 2 ≤ ∆ k s(x + h) k
p byAlgorithm2.Evaluate
x new = x + h new
,S new = k s(x new ) k
p anddS = S new − F
Chekstopping riteria
dS ≥ 0
andk h new k 2 < ε hx · ( k x k 2 + ε hx )
Evaluate
f new = f(x new )
andJ f,new = J f (x new )
andF new = k f new k
pdF = F new − F ; ρ = dF/dS
k = k + 1
Add
x new
andf new
to sorted internal datastrutureActive = | ∆ − k h new k ∞ | < 10 − 2 ∆
if
dF < 0
x = x new ; f = f new ; J f = J f,new ; F = F new
end
Chekstopping riteria
dF < ε F
andk ≥ max f1
if
ρ > 0.5 & Active
∆ = ∆ · 2
else if
ρ < 10 − 4
∆ = ∆/3
end
for i = 1:m
Find
{ A i,new , b i,new , α i,new } = arg min { 1/2 · r
Ti r i }
byAlgorithm3.Set
{ A i , b i , α i } = { A i,new , b i,new , α i,new }
end
end
Someremarksto Algorithm1 aregivenbelow:
Initialization
Theinitial guessfortheoarsemodeloptimizeris
x
,andtheinitializationof theparameters orrespondsto theformulation insetion 1.3. The elementsoftheorrespondingsurrogatemodelaregivenby(1.3.7)andareidentialto
theoarse model elements. Therst optimization beforeentering the main
loop herebygivesthe oarse modeloptimizer. The valueof theinitial trust
region isreommended to be
∆ = 10 − 1 · k x k 2
aording to [13 ℄, but an beOptimization of the Surrogate
Inthemainlooptheoptimizerofthe urrent surrogatefuntion(denedby
theurrentmapping parameters)isfound. Thestep
h new
andtheobjetivefuntion gain ompared to the previous iterate is alulated. The formula-
tion of the interpolating surrogate makes sure that
s (k) (x (k) ) = f (x (k) )
, sothat
S (k) = F (k)
.Stopping Criteria
Two stopping riteria areheked at thispoint:
Thehange inthe surrogatefuntion must be negative, ifnot thenew iter-
ateisatually a worse solutionthan the previousone, and we want to exit.
Thisstopping riterion is inludedasasafetyto avoidan inniteloop. The
optimization algorithm for the surrogate model uses a desent method, so
weareensuredadereaseof
dS
. IfdS = 0
weannotimprovethesurrogate,andwe exittheloop.
The seond stopping riterion is onerning the relative step length and is
denedbytheoptionalparameter
ε hx
. Theformulationmakessurethatthe riterion is also useful, whenk x k 2
is lose to zero. If the solution vetor istooloseto thepreviousone,wehavenotahievedmoreinformation toget
anew setofmapping parameters, and we arestukat theurrent iterate.
Cheking these two stopping riteria at this point of the algorithm makes
sure, thatunneessaryevaluations ofthe ne modelareavoided.
Gain Ratio
Weontinue themain loopwithevaluationsof
f
,J f
andtheobjetivefun-tion
F
inthenewiterate. Thegainratioρ
istheratiobetween thetruegainand the predited gain. It serves as a measure for how well the surrogate
modelapproximatesthenemodel,andisusedforupdatingthetrustregion
radius
∆
.Internal Data Struture
The iterate and the orresponding funtion vetor and Jaobian are added
F
: The objetive funtions of the iterates number1
tok
sorted in as-endingorder.
X
: Matrixwiththeiterates sorted aording toF
.dX
: Row vetorwiththe normsof the distanes fromthesorted iteratesin
X
to the bestiterate.By this sorting the rst element in
F
will be the best objetive funtionvalue so far, the rst olumn in
X
will be the best iterate so far, and therstelement in
dX
willbe0
. ThedataisusedintheParameterExtrationproblemandalsofordoumentationandplottingafterendedSpaeMapping
iterations.
The Ative Flag
Thisag is ative (
= 1
),if thenewsolution is lose to theboundary ofthetrust region, in the sense that the length of the step must be in the open
interval
k h new k ∞ ∈ ]0.99 · ∆ , 1.01 · ∆[
. In theory it is impossible to havek h new k ∞ > ∆
,butinpratieroundingerrorsan have aneet. Anativeagequalto
1
indiates,thatthe trustregiononstraintsareative,andtheag isusedlater for updatingthe trust region radius.
Update of the Iterate
If the objetive funtion has dereased, we wish to aept the new iterate,
anduse thisasan initial valueinthenext surrogateoptimization.
Stopping Criteria
Atthis point twoadditional stopping riteria areheked:
We use the hange in the objetive funtion to formulate thestopping ri-
terion:
dF < ε F
. The relative hange an be usedin aseF
is not lose tozerointhe optimizer.
As a nal safety towards an innite loop we exit, if the number of main
iterationshasexeededthe maximum value
max f1
.Update of the Trust Region
We usethe following updating strategy for the trust regionradius
∆
:Ifthegainratioislargerthan
0.5
,andthenewiterateislosetothebound-aryof the trust region, we inrease thetrust region radius bya fator
2
. Alargevalueof
ρ
indiates,thatthesurrogateservesasagoodapproximation to the ne model. Sine the ative ag is1
,we have taken the largest steppossible inthe latestiteration. We would thenlike to inrease
∆
,and takelongersteps.
If
ρ
is small, the surrogate is a poor approximation to the ne model, and we want to take smaller steps. A derease of the trust region is made, ifthe gainratio is smallerthan thevalue
10 − 4
. Thisondition is quitestrit,beause of the fat that the surrogate has not yet been updated. By the
update of the mapping parameters to follow, we hope that the surrogate
model isimproved. It is also important, that thetrust region doesnot get
too small, sine the optimization proedure of the surrogate model would
thengetrestrited.
Update of the Mapping Parameters
Themapping parameters areupdatedbyAlgorithm 3.
Eah iteration in the main loop involves one evaluation of the ne model
funtion vetor and one evaluation of the ne model Jaobian. The itera-
tionounter
k
isthenequal to thenumber off
- andJ f
-evaluations. Alsoa numberof oarse model evaluations (through the surrogate evaluation) areperformed. Sine these are onsidered heap ompared to the ne model
evaluations,weareonly interested inthenalamount ofnemodelevalua-
tions.
2.1.2 The Algorithm for Surrogate Optimization
Thealgorithm usedto solve the optimization ofthesurrogate isoutlinedin
pseudo-ode in the box below. Sine we are mainly onerned with Algo-
rithm1and 3 inthis report, Algorithm2is only disussedbriey. We note
that the iteration ounter is used independently of the iteration ounter of
Algorithm 2: Sub-algorithm for the Surrogate Optimization
Given global trust regionradius
∆
and initial parameter vetorx (0)
Linear inequalityonstraints:
A ˆ = [ I; − I]
b ˆ = [ − x (0) + ∆; x (0) + ∆]
Callto funtion minnonlin.
Callto Fortran subroutine (non-linear optimization inthenorm
p
).Initialloal trustregion radius
∆/4
stop if
j ≥ max f2
or if
h < ε hx · k x k 2
Initialization
The trust region radius
∆
is given byAlgorithm 1. To ensure that theop-timizerofthe surrogate funtionisinsidethetrust region,we introdue the
linearinequality onstraints given by:
x +
− x (0) + ∆
≥ 0
and− x +
x (0) + ∆
≥ 0
whihis equal to the onditions:
x ≥ x (0) − ∆
andx ≤ x (0) + ∆
See the user's guide inAppendix A for more information onhow to handle
theasewhere theoriginal minimization problemisonstrained.
Funtion Call to minnonlin and Fortran Subroutine
This funtion is a helping funtion that, depending on the problem type,
makesanotherfuntionalltotheproperFortransubroutine,see[13℄. Whih
optimization algorithm isuseddependson thenorm
p
,inwhihwe want tominimize the surrogate funtion, and of wether the objetive funtion is a
salar funtion or a vetor funtion. Beause of the trust region approah
theoptimization problemisalwaysonstrained.
Someof the Fortran subroutines use anoptimization method withtrust re-
gion, inthis asethe initial loaltrust regionis setto
∆/4
. Thisloaltrustregionhasnothingto dowiththe global trustregion
∆
,whih ensures,thatStopping Criteria
The Fortran subroutines require two optionalparameters used in thestop-
pingriteria of the algorithm: The maximum number of iterations allowed
(
max f 3
, and the parameterε hx
whih is used in regard to the step length.Thealgorithm stops,whenitsuggestsa steplength
h
,whenh < ε hx · k x k 2
.2.1.3 The Algorithm for Parameter Extration
The third algorithm, whih performs theParameter Extration for eah of
the
m
responsefuntions, isoutlinedbelow.Algorithm 3: Sub-algorithmfor Parameter Extration
Given initial parametervetor
p k = [A(:); b; α]
The objetive funtion is
r
and the lastn
rows ofr
are denotedg
.The options inmarquardtare given byopts.
j = 0; stop = 0; σ = 1 K = k g k ∞ ; K r = k r k ∞
while not stop
j = j + 1
Find
p new = arg min p { 1/2 · r(p)
Tr(p) }
bymarquardt withoptsr new = r(p new ); K new = k g new k ∞ ; K r,new = k r new k ∞
h = p new − p; Accept = K new < K
if
Accept
p = p new ; K = K new ; K r = K r,new
end
if
K new < ε K
stop = 1;
breakelse if
h < ε hp · ( k p k 2 + ε hp ) stop = 2;
breakelse if
j ≥ max f 3
stop = 3;
breakelse if
σ ≥ 10 3 stop = 4;
breakend
if
σ < 10 3 σ = σ · 10
end
end
This algorithm is onsidered independently of Algorithm 1 and 2, and any
referenes toiterations only onernthepresent Algorithm3.
Initialization
We have given initial sets of the mapping parameters
A
,b
andα
orre-sponding to an arbitrary response. The mapping parameters are arranged
inthevetor
p k
,wherek
denotes the mapping parameters deningthek
thsurrogate model. The residual funtion for the response is given by equa-
tion (1.3.12) . The options used in the marquardt-funtion are set in the
5
-element vetoropts where:opts(1) : Denes theinitial valueof the Marquardtparameter:
:
µ 0 =
opts(1)max { (J
T0 J 0 ) (i,i) }
.opts(2) : Parameter usedinstopping riteria for thegradient:
:
k F ′ (p) k ∞ ≤
opts(2).opts(3) : Parameter usedinstopping riteria for thesteplength:
:
k dx k 2 ≤
opts(3)(
opts(3)+ k p k 2 )
.opts(4) : Maximal numberof iterations
opts(5) : Lowerbound on
µ
:µ =
opts(5)max { (J
TJ) (i,i) }
.Thepenalty fator
σ
used inthe gradient residual isinitialized to1
.K
is ameasure of the violation of thegradient math, and
K r
of themathing ofthefullresidual.
Optimization of the Residual
The urrent residual is optimized by the Matlab-funtion marquardt im-
plemented by Hans Bruun Nielsen to nd the next parameter vetor. This
new solution isa resultof a number ofMarquardt steps, and theiterations
endedbeauseoneofthestopppingriteriasmentionedabovewasativated.
The Marquardt algorithm is disussed further in setion 3.2. The residual
funtion vetor is evaluated inthe new iterate, aswell asthenew values of
K
andK r
and thesteplength.The Aept Flag and Update of the Iterate
Sineweonsider the gradient math to be ofgreat importane,we usethe
fator
K
to deide,wetherthe newsetof parametersisbetter thanthepre-vious. The parameter vetor is aepted (
Accept = 1
), only ifthe gradientmath residual hasbeen dereasedompared to theprevious iteration.
Stopping Criteria
We hek four stopping riteria and exittheloop, ifeitherof them aresat-
ised. The gradient math is onsidered satised, if
K
is smaller than thevalue
ε K
. Seondlyifthestep between two onseutive iterates isrelativelysmall, the step length riterion is ativated. The third riterion stops the
loop,ifthe numberofiteration steps hasexeededthelimit
max f 3
. Finallywe will only ontinue the iterations, while
σ
is below or equal to10 3
. Thisriterionisequivalenttousing
max f3 = 4
,ifwehoosetheupdatingstrategybelow.
Update of the Penalty Fator
Forthepenaltyfator
σ
weuseasimpleupdatingstrategy:σ
isinreasedbyafator
10
for eahmain iteration. Inthis wayweforethegradient mathto beome ofgreaterimportane than the otherelements oftheresidual.
Chapter 3
Theoretial and Pratial
Investigations
3.1 Finite Dierene Approximation
To run the SMIS algorithm the user must supply two Matlab-les with
implementationsoftheoarseandnemodelandtheirJaobians. Forsome
problems (inluding theTLT2 and TLT7 problems), no exat gradients are
available, and theJaobiansareinsteadalulated byeg.dierene approx-
imations. In the test problems investigated in this report, the Jaobian is
alulated by forward dierene approximations by the Matlab-funtion
diffjaobian, see [15℄, implemented by Jaob Søndergaard. In the im-
plementation of diffjaobian the step length is saled aording to the
independent variable
x
,sothath = η(1 + | x | )
. Thisformulationisusefulin theasewhere| x |
isvery small,andwe geth ≃ η
. Thevalueofη
originallyhadthe xedvalue
η = √ ε M
,whereε M
isthe mahineauray, butη
anbe hanged bytheuser diretlyinthe m-le.
BytheinvestigationoftheTLT2problem,some interesting featuresregard-
ingthemodelfuntionswerediovered. Thene,oarse andsurrogatefun-
tions are smooth, but the gradients (partial derivatives) wrt. both
x
andp
alulatedfromdiffjaobianshowadierentpiture,whenthesteplength
issmall. Thismotivatedaninvestigationofthesteplengthindiffjaobian.
In the following setions the theory is simplied by looking at salar fun-
3.1.1 Optimal Step Length
TheJaobiansofboththeneandoarsemodelarealulatedbytheMat-
lab-funtiondiffjaobian. Inthefollowingtheinueneofthesteplength
usedinthe dierene approximation willbe analyzedbased on a salarex-
ample.
Given asalarfuntion
f : R → R
weompute aforward diereneapprox-imationto thegradient of
f
inthe pointx
by:D F (x, h) = f (x + h) − f (x)
h
(3.1.1)TheTaylorexpansion of
f
intheexpansion pointx
is given by:f(x + h) = f(x) + hf ′ (x) + h 2
2 f ′′ (x) + O(h 3 )
(3.1.2)Inserting(3.1.2) intheforward dierene approximation (3.1.1) ,we get:
D F (x, h) = f ′ (x) + h
2 f ′′ (x) + O(h 2 )
Thetrunation error
E T
isnow:E T = D F (x, h) − f ′ (x)
= f ′ (x) + h
2 f ′′ (x) + O(h 2 ) − f ′ (x)
= h
2 f ′′ (x) + O(h 2 )
(3.1.3)
and we seethat thetrunation error is
O(h)
forh → 0
. Ifwe alsotake theroundingerrorsintoaount,wegettheoatingpointnumbers
f ¯ (x + h)
andf(x) ¯
insteadoff(x + h)
andf (x)
:f ¯ (x + h) = f (x + h)(1 + δ 1 ), | δ 1 | ≤ Kε M
f ¯ (x) = f (x)(1 + δ 2 ), | δ 2 | ≤ Kε M
where theonstant
K ≥ 1
andε M
isthe mahine auray.Insertingthe above intheforward dierene approximation(3.1.1) we get:
D ¯ F (x, h) = f(x ¯ + h) − f ¯ (x) h
= f (x + h) − f (x)
h + δ 1 f (x + h) − δ 2 f (x) h
= D F (x, h) + E R
= f ′ (x) + E T + E R
(3.1.4)where the rounding eror is denoted
E R
. The worst possible rounding erroriswhen
δ 1 f (x + h)
andδ 2 f (x)
have oppositesigns, giving:| E R | ≤ Kε M ( | f (x + h) | + | f (x) | ) h
≃ 2K | f (x) | ε M
h
(3.1.5)Theabsolutetotalerrorisnowgiven bytheabsolutedierenebetween
D ¯ F
andthe realgradient:
| E | = | E T + E R | = | D ¯ F (x, h) − f ′ (x) |
≃ Ah + O(h 2 ) + B ε M
h
(3.1.6)Theonstants
A
andB
dependonthefuntionvaluesandseondderivatives intheneighbourhood ofx
,A ≃ 1 2 | f ′′ (x) |
andB ≃ 2K | f (x) |
.Usingtheabovewenowonsidertheasewhenapproximatingthegradients
wrt.theparameters in the
x
-vetor. The gradients appear intheJaobiansof the ne, oarse and surrogate models inthe Parameter Extration prob-
lem. We onsider an arbitrary response funtion with no index on
f
ands
. Eahof the rows of the gradient residual (thelastn
rows ofthe residualfrom(1.3.12) ) hastheform:
g i (x, p) = s ′ x i (x, p) − f x ′ i (x)
(3.1.7)x
is a olumn vetor holding the design parameters, and the vetorp
isholding the parameters
A
,b
andα
. To simplify the alulations we on-sider only the
i
th row of the gradient residual asa funtion of thevariablevetors
x
andp
(keeping all otherparameters thanx i
xed). Withoutlossofgeneralitywealso disregard thefators
σ
andd
.Theexatfuntion
g i (x, p)
isreplaedbytheapproximatedfuntionG i (x, p)
returnedbytheMatlab-funtion,wherethediereneapproximationwrt.
x i
andthe roundingerrorsgivesthe approximation to (3.1.7) :
G i (x, p) ≃ s(x + h x e i , p) − s(x, p) + Bε M
h x − f x ′ i (x)
=
s ′ x i (x, p) − f x ′ i (x) + h x
2 s ′′ x i x i (x, p) + O(h 2 x ) + B ε M
h x
= g i (x, p) + Ah x + O(h 2 x ) + B ε M
h x
(3.1.8)
where
e i
isa unitvetor inthei
th diretion andh x
isthestep length. Theonstant
A
depends on the seond derivative ofs
, andB
depends on thefuntion values of
s
intheintervalx ∈ [x , x + h]
.Byomparing (3.1.8) withtheexat funtion (3.1.7) ,we getthetotal error
foreah of thegradient residual rows:
E G (h x ) = E T + E R
≃ Ah x + O(h 2 x ) + B ε M
h x
(3.1.9)
wherethe dierentiation iswrt.
x i
for rownumberi
,and thelowerindexofx
hasbeen omittedfor simpliity.When
h x
is large the total error is dominated by the trunation error (therst two terms), whereas the rounding error (the last term) dominates for
small
h x
. To determine the optimal step lengthh x,opt
inorder to minimizethe total errorwe dierentiate (3.1.9)negligating the
O(h 2 x )
term andget:E G ′ (h x ) = A − B ε M
h 2 x
(3.1.10)Equalizing (3.1.10) with
0
we have the optimal step length minimizing theerrorfuntion:
E G ′ (h x ) = 0 ⇒ h x,opt = r
ε M
B
A
(3.1.11)InMatlab theunit round-o is
ε M ≃ 10 − 16
, sothe step length should bearound
10 − 8 q B
A
togivethesmallesterrors. Inpratiewedon'tdistinguish between thex
-variables andhooseone suitablestep length.Withtypial values of thederivatives of thesurrogatemodel of:
s x (x, p) ∼ 10 − 3
,s ′′ xx (x, p) ∼ 10 − 4
we get the approximate value of the optimal step lengthh x,opt ∼ 10 − 7
. Here we have used the values for the onstantsA = 1 2 10 − 4
andB = 2 · 10 − 3
. The result agrees with gure 3.1.1, wheretheerrorfuntion(3.1.9)forthesevaluesof
A
andB
isplottedasafuntionofthe step length
h x
.We now analyse the ase where the dierene approximation is made with
the
p
-parameters. We again onsider thei
th row of the gradient residual,andlookat the gradient withrespet to the
j
th parameterinthe vetorp
.Allother variables arekept xed.
We nowhave the