
Using

Space Mapping

Pernille Brock

LYNGBY 2004

MASTER'S THESIS

NO. 54

IMM

Preface

This Master's thesis is submitted at IMM, DTU, under the supervision of Kaj Madsen, Professor, dr.techn., and Hans Bruun Nielsen, Ass. Professor.

I would like to thank Kaj Madsen and Hans Bruun Nielsen for their enthusiasm, and for always having time for a question or a discussion.

I would also like to thank Ph.D. student Frank Pedersen for being involved in the project and supplying many good ideas.

Lyngby, August 2, 2004

Pernille Brock

Abstract

The subject of this master's thesis is non-linear optimization using the Space Mapping method with an interpolating surrogate model.

The Space Mapping method is useful in optimization problems where the fine model we wish to optimize is computationally very expensive. The interpolating surrogate is based on a cheap coarse model and serves as a replacement for the expensive model in order to minimize the number of function evaluations.

An important part of the Space Mapping algorithm is the Parameter Extraction, which involves minimization of the residual between the surrogate and the fine model, which we aim to align. The Parameter Extraction problem does not always have a unique solution, and different formulations are presented in order to ensure this uniqueness.

The thesis provides a presentation of the mathematical theory followed by the Space Mapping algorithm. We then make a number of theoretical and practical investigations concerning different formulations of the residual defining the Parameter Extraction problem.

The step length in forward difference approximations is analyzed, and the optimal step length suited for the considered problems is found to be approximately $10^{-5}$. We analyze the solutions to underdetermined and overdetermined problems, including an analysis of the Marquardt equations and of least squares problems with and without weighting factors. We look at the effect of adding a regularization term to the residual vector and find that this residual formulation corresponds to a special case of the Marquardt equations with the damping parameter $1 + \mu$.

The presented Space Mapping algorithm is tested in its various versions on three test problems, and the results are compared. The convergence is faster than with classical optimization algorithms. It is not possible to draw general conclusions on the performance of the different algorithm versions based on the included test problems.

Key words: Space Mapping, non-linear optimization, interpolating surrogates, least squares problems, weighting factors, underdetermined and overdetermined problems.

Resumé

This thesis deals with non-linear optimization using the Space Mapping method with interpolating surrogates.

The Space Mapping method is applicable in optimization problems where a fine model, which is computationally very expensive, is to be optimized. The interpolating surrogate is based on a cheap coarse model and replaces the fine model in the optimization process, whereby we reduce the number of time-consuming function evaluations.

An important sub-problem in connection with the Space Mapping algorithm is the Parameter Extraction, which involves minimization of the residual between the surrogate and the fine model, which we wish to match. The Parameter Extraction problem does not always have a unique solution, and we present different formulations with the purpose of ensuring a unique solution.

The thesis presents the mathematical theory followed by the Space Mapping algorithm. After this, a number of theoretical and practical investigations are made concerning the different formulations of the residual that defines the Parameter Extraction problem.

The step length in difference approximations is analyzed, and the optimal step length suited for the problems considered here is determined to be about $10^{-5}$. We analyze solutions to underdetermined and overdetermined problems, including the Marquardt equations and least squares problems with and without weighting factors. We consider the effect of including a regularization term in the residual, and find that such a residual formulation corresponds to a special case of the Marquardt equations with the damping parameter $1 + \mu$.

The different versions of the described Space Mapping algorithm are tested on three test problems, and the results are compared. The convergence is faster than for classical optimization methods applied directly to the fine model. It is not possible to draw general conclusions about the performance of the algorithm on the basis of the test problems included here.

Key words: Space Mapping, non-linear optimization, interpolating surrogates, least squares problems, weighting factors, underdetermined and overdetermined problems.

Contents

1 Introduction
  1.1 Introduction to the Space Mapping Method
    1.1.1 Different Space Mapping Techniques
  1.2 Problem Formulation
  1.3 Mathematical Introduction
    1.3.1 Overview of The Space Mapping Algorithm
    1.3.2 New Formulation of the Residual Vector
  1.4 Assumptions
  1.5 Previous Work and Implementation

2 Implementation of The Space Mapping Algorithm
  2.1 The Space Mapping Algorithm
    2.1.1 The Main Algorithm
    2.1.2 The Algorithm for Surrogate Optimization
    2.1.3 The Algorithm for Parameter Extraction

3 Theoretical and Practical Investigations
  3.1 Finite Difference Approximation
    3.1.1 Optimal Step Length
    3.1.2 Results from the TLT2 Problem
  3.2 The Marquardt Algorithm
    3.2.1 Singular Value Decomposition
  3.3 Regularization
  3.4 Variable Number of Mapping Parameters
  3.5 The Penalty Factor
  3.7 The Normalization Factors

4 Test Problems
  4.1 Introduction
    4.1.1 The Test Scenarios
    4.1.2 Visualization of The Results
  4.2 The Rosenbrock Problem
    4.2.1 Introduction
    4.2.2 Linear Transformation
    4.2.3 The Augmented Rosenbrock Function
  4.3 The TLT2 Problem
    4.3.2 The Results of the Test Runs
  4.4 The TLT7 Problem
    4.4.2 The Results of the Test Runs

5 Future Work
  5.1 Improvements of the SMIS Implementation
  5.2 Suggestions for Further Investigations

6 Conclusion

A Short User's Guide for The SMIS Framework
  A.1 The Problem Setup-file
  A.2 Calling the Space Mapping Algorithm
  A.3 Plotting and Viewing Data

Symbols and Notation

Chapter 1

Introduction

1.1 Introduction to the Space Mapping Method

The Space Mapping method is an optimization method used for engineering design problems. The technique is useful when the model that we wish to optimize is computationally expensive. In this case the use of a classical optimization method directly on the fine model would result in a large number of function evaluations, and is considered impossible in practice. The goal is to lower the number of time-consuming fine model evaluations.

The Space Mapping method relies on the existence of two functions modelling the same system: the fine model, which is very time-consuming to evaluate, and a coarse model, which is cheap to evaluate. We wish to construct a surrogate model based on the coarse model, and let the surrogate serve as a replacement for the fine model in the optimization process. The fine model is successively evaluated in order to construct an interpolating surrogate model, which is then used for optimization. The surrogate model is at least as accurate as the coarse model. By aligning the surrogate model with the fine model in more than one point we seek global as well as local agreement of the two models.

The interpolating surrogate is constructed as a composed mapping consisting of both an input and an output mapping. This mapping is the Space Mapping connecting the coarse model responses with the fine model responses. The design parameters are transformed by the input mapping, and the output mapping corrects the surrogate to ensure exact agreement of the responses. We align both function values and gradients of the surrogate model with the fine model and hereby hope that the surrogate provides a good [...]

The Space Mapping-based optimization algorithm consists of two sub-problems:

• The optimization of the surrogate model.

• The update of the surrogate, the so-called Parameter Extraction, which determines the mapping parameters in order to ensure the agreement of the surrogate and the fine model.

1.1.1 Different Space Mapping Techniques

The original Space Mapping formulation is described in [4] and [5] and only involves an input mapping $P : \mathbb{R}^n \to \mathbb{R}^n$, where:

\[ P(x) = \arg\min_{z \in \mathbb{R}^n} \| c(z) - f(x) \|_2^2 \tag{1.1.1} \]

We refer to (1.1.1) as the original Space Mapping definition.

The Space Mapping methods with input mapping can be approached in different ways in order to ensure the uniqueness of the Space Mapping. A full overview and further discussions of the various versions are provided by Jacob Søndergaard in [7]. When only input mappings are used, we cannot be sure that the Space Mapping technique provides the fine model optimizer as a solution, unless certain theoretical conditions are met. These conditions are stated in [7], chapter 4.1. An exact match between the fine model and the coarse model response is therefore not likely in the original Space Mapping, even though the mapped coarse model can provide a good approximation to the fine model over a large region of the parameter space.

By introducing an additional mapping, an output mapping, to define the surrogate model, we can ensure the matching and thereby overcome the residual misalignment. For that reason the Space Mapping techniques with both input and output mappings are to be preferred. With these techniques the uniqueness of the Parameter Extraction is still not ensured, which is a problem that can be solved in many ways.

This report only works with the Space Mapping method with both input and output mappings, providing an interpolating surrogate that gives exact alignment with the fine model in the expansion point.

1.2 Problem Formulation

The main subject of this thesis is the Space Mapping method based on an interpolating surrogate. We wish to investigate different theoretical and [...] the Parameter Extraction problems. On this basis we present a Space Mapping algorithm and test the implementation of the algorithm on different test problems.

In the report we analyze the following subjects:

• The approximation error from using forward difference approximations as estimates for the derivatives, and hereby the optimal step length.

• Least squares problems.
  - The Marquardt equations.
  - The solution to the regularized problem.
  - The solutions to underdetermined and overdetermined least squares problems with and without weights.

• Different formulations of the residual in order to ensure uniqueness of the Parameter Extraction.
  - Reduction of the number of input mapping parameters.
  - The effect of using a regularization term in the Parameter Extraction problem.
  - The effect of using weighting factors in the Parameter Extraction problem.
  - The effect of using normalization factors in the Parameter Extraction problem.

The mathematical theory of the Space Mapping method with interpolating surrogate is introduced, and the Space Mapping algorithm is presented in pseudo-code. We then consider the theoretical and practical investigations of the subjects above. The various versions of the algorithm are tested numerically on three problems. Finally, suggestions for future investigations are proposed by listing some unresolved matters.

1.3 Mathematical Introduction

We aim at solving an optimization problem of the form:

\[ x^* = \arg\min_{x \in \mathbb{R}^n} \{ H(f(x)) \} \]

where $H : \mathbb{R}^m \to \mathbb{R}$ is a suitable objective function, and $x^* \in \mathbb{R}^n$ is the optimal set of design parameters.

We assume that two models of the same system are available: a fine but expensive model, given by $f : \mathbb{R}^n \to \mathbb{R}^m$, and a coarse but cheap model, given by $c : \mathbb{R}^n \to \mathbb{R}^m$. The function vectors obtained from a given parameter set are also called response vectors.

The surrogate model $s : \mathbb{R}^n \to \mathbb{R}^m$ is defined by a composite mapping: For each of the $m$ responses we define the input mapping $P_i : \mathbb{R}^n \to \mathbb{R}^n$, which performs a linear transformation of the design parameters, and the output mapping $O : \mathbb{R}^m \to \mathbb{R}^m$, which transforms the coarse model response. The aim is to align the surrogate with the fine model for all $m$ responses.

The input and output mapping parameters for $i = 1, \ldots, m$ are:

\[ A_i \in \mathbb{R}^{n \times n}, \quad b_i \in \mathbb{R}^n, \quad \alpha_i \in \mathbb{R}, \quad \beta_i \in \mathbb{R}. \]

The linear transformation $P_i$ for the $i$th response function is now defined as:

\[ P_i(x) = A_i x + b_i \tag{1.3.1} \]

and the output mapping $O_i$ as:

\[ O_i(y) = \alpha_i (y_i - \bar{y}_i) + \beta_i \tag{1.3.2} \]

where $\bar{y}$ is a constant vector. Gathering the input and output mappings we have:

\[ P = \begin{bmatrix} P_1^T \\ \vdots \\ P_m^T \end{bmatrix}, \qquad O = \begin{bmatrix} O_1 \\ \vdots \\ O_m \end{bmatrix} \]

The interpolating surrogate model is now defined by the composition:

\[ s = O \circ c \circ P \tag{1.3.3} \]

When inserting the expressions for the input and output mappings we get the surrogate model for the $i$th response given by:

\begin{align*}
s_i(x) &= O_i(c_i(P_i(x))) \\
       &= \alpha_i \left( c_i(P_i(x)) - c_i(P_i(\bar{x})) \right) + \beta_i \\
       &= \alpha_i \left( c_i(A_i x + b_i) - c_i(A_i \bar{x} + b_i) \right) + \beta_i
\end{align*}
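The composition above can be illustrated with a minimal Python sketch for a single response. It is not the SMIS implementation: the helper names (`matvec`, `input_mapping`, `surrogate_response`) and the toy coarse response `coarse` are assumptions made only for this illustration. The sketch shows that the surrogate reproduces $\beta_i$ at $\bar{x}$ regardless of $A_i$, $b_i$ and $\alpha_i$.

```python
def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def input_mapping(A, b, x):
    """Linear input mapping P_i(x) = A_i x + b_i, cf. (1.3.1)."""
    return [p + q for p, q in zip(matvec(A, x), b)]


def surrogate_response(c_i, A, b, alpha, beta, x, x_bar):
    """s_i(x) = alpha_i * (c_i(P_i(x)) - c_i(P_i(x_bar))) + beta_i."""
    shifted = c_i(input_mapping(A, b, x)) - c_i(input_mapping(A, b, x_bar))
    return alpha * shifted + beta


def coarse(z):
    """Hypothetical coarse response, chosen only for illustration."""
    return z[0] ** 2 + 3.0 * z[1]


# Arbitrary (assumed) mapping parameters for a problem with n = 2.
A = [[2.0, 0.0], [0.5, 1.0]]
b = [0.1, -0.2]
x_bar = [1.0, 2.0]

# At x = x_bar the bracket vanishes, so the surrogate returns beta.
value_at_x_bar = surrogate_response(coarse, A, b, 0.7, 5.0, x_bar, x_bar)
```

Because the bracket is zero at $\bar{x}$, choosing $\beta_i$ equal to the fine model response at that point is what makes the surrogate interpolate the fine model there.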

We wish to align the responses of the surrogate model with the fine model in all $m$ sampling points. In the $k$th iteration $x^{(k)}$ we must thereby have:

\[ s^{(k)}(x^{(k)}) = f(x^{(k)}) \tag{1.3.4} \]

where $s^{(k)}$ denotes the surrogate used in the $k$th iteration. We furthermore want the surrogate model to approximate the fine model at previous iteration points. An additional criterion for choosing the mapping parameters is to aim for agreement of the Jacobians of the fine model (denoted $J_f$) and the surrogate model (denoted $J_s$) in the current iterate. This leads to the two equations:

\[ s^{(k)}(x^{(j)}) = f(x^{(j)}) \quad \text{for } j = 1, \ldots, k-1 \tag{1.3.5a} \]

\[ J_s^{(k)}(x^{(k)}) = J_f(x^{(k)}) \tag{1.3.5b} \]

Equations (1.3.4) and (1.3.5) ensure the alignment of the surrogate model and the fine model both wrt. the function responses and the Jacobians in the current iterate as well as wrt. the function responses in all previous iterates. The goal is to have both local and global agreement of the models. The local agreement is ensured by (1.3.4) and (1.3.5b), and the global agreement by (1.3.5a).

The initial values of the mapping parameters can be chosen as follows: We wish to start the iterations in the coarse model optimizer $z^*$, so that $x^{(1)} = z^*$. In iteration $0$ we therefore want the surrogate model to be identical to the coarse model, which is ensured by choosing the input and output mapping parameters as:

\[ \left. \begin{aligned} A_i^{(0)} &= I, \quad b_i^{(0)} = 0, \quad \alpha_i^{(0)} = 1 \\ \beta_i^{(0)} &= \alpha_i^{(0)} c_i(P_i^{(0)}(x^{(0)})) \end{aligned} \right\} \quad \text{for } i = 1, \ldots, m \tag{1.3.6} \]

In this way the $i$th response of the $0$th surrogate becomes:

\begin{align*}
s_i^{(0)}(x) &= \alpha_i^{(0)} \left( c_i(P_i^{(0)}(x)) - c_i(P_i^{(0)}(x^{(0)})) \right) + \alpha_i^{(0)} c_i(P_i^{(0)}(x^{(0)})) \\
             &= \alpha_i^{(0)} c_i(P_i^{(0)}(x)) \\
             &= c_i(x) \tag{1.3.7}
\end{align*}

Since the coarse model is assumed to be cheap to evaluate, the optimizer is found by a standard optimization algorithm.

In the following iterations the matching (1.3.4) is ensured by choosing the output mapping parameters $\alpha_i$ and $\beta_i$ and the constant $\bar{x}$ in an appropriate way. By putting $\bar{x}^{(k)} = x^{(k)}$ we have the $i$th surrogate in the $k$th iteration:

\[ s_i^{(k)}(x) = \alpha_i^{(k)} \left( c_i(P_i^{(k)}(x)) - c_i(P_i^{(k)}(x^{(k)})) \right) + \beta_i^{(k)} \]

By inserting the iterate $x^{(k)}$ in the surrogate function and then in (1.3.4) we find the value of $\beta_i^{(k)}$ to be:

\begin{align*}
\alpha_i^{(k)} \left( c_i(P_i^{(k)}(x^{(k)})) - c_i(P_i^{(k)}(x^{(k)})) \right) + \beta_i^{(k)} &= f_i(x^{(k)}) \\
\Rightarrow \quad \beta_i^{(k)} &= f_i(x^{(k)})
\end{align*}

The $i$th response of the interpolating surrogate is now given by:

\[ s_i^{(k)}(x) = \alpha_i^{(k)} \left( c_i(P_i^{(k)}(x)) - c_i(P_i^{(k)}(x^{(k)})) \right) + f_i(x^{(k)}) \tag{1.3.8} \]

which is valid for all $k > 0$.

Because of the choice of $\bar{x}$ the match (1.3.4) only depends on the output parameter $\beta_i$, and the $\alpha_i$'s must be chosen appropriately based on (1.3.5). In each iteration the next set of design parameters $x^{(k+1)}$ is found by minimizing the surrogate (1.3.8) defined by the mapping parameters of the previous iteration:

\[ x^{(k+1)} = \arg\min_{x \in \mathbb{R}^n} \left\{ H(s^{(k)}(x)) \right\} \tag{1.3.9} \]

It must be clarified that the new iterate $x^{(k+1)}$ is only accepted if it produces a decrease in the objective function compared to the previous iterate, i.e. if $H(f(x^{(k+1)})) < H(f(x^{(k)}))$. If this is not the case, the alignments (1.3.4) and (1.3.5b) must be made with respect to the previous (and so far best) iterate. The unaccepted iterate is only used in the global alignment equation (1.3.5a), and is not required to satisfy the gradient match. To handle such an uphill step regarding the fine model objective we use a trust region method for the surrogate optimization.

When the iterate $x^{(k+1)}$ is available, the $(k+1)$th set of mapping parameters must be found. The response alignment (1.3.4) is already ensured. In order to satisfy the additional matching (1.3.5) we define the residual function for the $i$th response:

\[ r_i^{(k+1)}(A_i, b_i, \alpha_i) = \begin{bmatrix} s_i^{(k+1)}(x^{(1)}, A_i, b_i, \alpha_i) - f_i(x^{(1)}) \\ \vdots \\ s_i^{(k+1)}(x^{(k)}, A_i, b_i, \alpha_i) - f_i(x^{(k)}) \\ J_{s,i}^{(k+1)}(x^{(k+1)}, A_i, b_i, \alpha_i) - J_{f,i}(x^{(k+1)}) \end{bmatrix} \tag{1.3.10} \]

where $J_{s,i}$ and $J_{f,i}$ are the gradients of $s_i$ and $f_i$ wrt. $x$, i.e. the transposes of the $i$th rows of the Jacobians of the surrogate and the fine model, respectively, wrt. the $x$ vector. We find the next set of mapping parameters by minimizing the residual:

\[ \left\{ A_i^{(k+1)}, b_i^{(k+1)}, \alpha_i^{(k+1)} \right\} = \arg\min_{A_i, b_i, \alpha_i} \left\| r_i^{(k+1)}(A_i, b_i, \alpha_i) \right\| \tag{1.3.11} \]

in some norm for all $m$ responses. This process of updating the parameters is called Parameter Extraction.

The iterations continue in this way with optimization of the current surrogate followed by the Parameter Extraction, until the solution is found within a satisfactory accuracy. Appropriate stopping criteria could be based on the relative change in the solution vector or in the objective function.

1.3.1 Overview of The Space Mapping Algorithm

Based on the previous section we can now summarize the Space Mapping method with the interpolating surrogate. The algorithm for solving the optimization problem is outlined as follows:

1. Given coarse model and fine model.

2. Set $k = 0$, choose initial guess for the coarse optimizer $x^{(0)}$. Initialize input and output mapping parameters $A_i^{(0)} = I$, $b_i^{(0)} = 0$, $\alpha_i^{(0)} = 1$ and $\beta_i^{(0)} = \alpha_i^{(0)} c_i(x^{(0)})$ for $i = 1, \ldots, m$.

3. Optimize the surrogate model (1.3.8) to find the next iterate $x^{(k+1)}$ by solving (1.3.9).

4. Compute $f(x^{(k+1)})$ and $J_f(x^{(k+1)})$. Check stopping criteria and terminate if satisfied.

5. Update mapping parameters $A_i^{(k+1)}$, $b_i^{(k+1)}$ and $\alpha_i^{(k+1)}$ for $i = 1, \ldots, m$ by (1.3.11) with the residual given by (1.3.10).

6. Set $k = k + 1$ and go to step 3.

1.3.2 New Formulation of the Residual Vector

In the practical implementation of the Space Mapping algorithm we use different versions of the residual vector for the Parameter Extraction. This is done in order to ensure that the problem has a unique solution, and that this solution satisfies the local and global agreement between the surrogate and the fine model that we aim for.

For that reason we define the residual:

\[ r_i^{(k+1)}(A_i, b_i, \alpha_i) = \begin{bmatrix} w_1 \cdot a_1 \cdot \left( s_i^{(k+1)}(x^{(1)}, A_i, b_i, \alpha_i) - f_i(x^{(1)}) \right) \\ \vdots \\ w_k \cdot a_k \cdot \left( s_i^{(k+1)}(x^{(k)}, A_i, b_i, \alpha_i) - f_i(x^{(k)}) \right) \\ \sigma \cdot d \cdot \left( J_{s,i}^{(k+1)}(x^{(k+1)}, A_i, b_i, \alpha_i) - J_{f,i}(x^{(k+1)}) \right) \end{bmatrix} \tag{1.3.12} \]

Note that the dimension of $r_i$ is $(k + n)$ when we have found $(k + 1)$ $x$-iterates. The factors $a_1, \ldots, a_k$ and $d$ are normalization factors used for avoiding scaling problems in case the responses are not of the same order of magnitude.

The $w$-factors are weighting factors used in the first $k$ elements of the residual.

The factor $\sigma$ is a penalty factor that is only multiplied on the last $n$ elements of the residual vector. The weighting factors and the penalty factor have the same effect, but we distinguish between them because the factors have different purposes.

The aim of the weighting factors is to give an individual priority to each of the iteration points in the residual. In this way we can distinguish between points far from the current iterate and points closer to the current iterate and make the global agreement more or less accurate in a particular point.

The penalty factor is used to give the alignment of the gradients a certain priority compared to the function value alignments in the previous points. If we increase $\sigma$, we can ensure that the gradients in the current point match.

(1.3.12) is equivalent to (1.3.10) if we put all factors $a_1, \ldots, a_k$, $w_1, \ldots, w_k$, $d$ and $\sigma$ equal to $1$. The theory of section 1.3 and the summarized algorithm in 1.3.1 are hereby still valid when we use the residual (1.3.12) instead of (1.3.10). The new residual is equivalent to the residual (1.3.10) multiplied by a diagonal matrix.

Throughout the rest of the report the residual we use is given by the definition (1.3.12). If nothing else is mentioned, the factors $a_1, \ldots, a_k$, $w_1, \ldots, w_k$, $d$ and $\sigma$ have the value $1$. The first $k$ elements of the residual are referred to as the function value residual, whereas the last $n$ elements are called the gradient residual.
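As an illustration of how the residual (1.3.12) is assembled for one response, the following Python sketch stacks the weighted function value residual on top of the penalized gradient residual. The function name `residual` and the toy surrogate, fine model values and factor values are assumptions made for this example only; they are not taken from the SMIS code.

```python
def residual(s_i, grad_s_i, f_vals, grad_f_new, xs, x_new, w, a, sigma, d):
    """Assemble r_i as in (1.3.12) for one response.

    s_i(x)      : surrogate response for the candidate mapping parameters
    grad_s_i(x) : gradient of s_i wrt. x, as a list of n numbers
    f_vals[j]   : fine model response f_i at the previous iterate xs[j]
    grad_f_new  : gradient of f_i at the current iterate x_new
    """
    # First k elements: weighted, normalized function value residual.
    function_part = [w[j] * a[j] * (s_i(xs[j]) - f_vals[j]) for j in range(len(xs))]
    # Last n elements: penalized, normalized gradient residual.
    g = grad_s_i(x_new)
    gradient_part = [sigma * d * (gs - gf) for gs, gf in zip(g, grad_f_new)]
    return function_part + gradient_part


# Toy data (assumed): k = 3 previous iterates, n = 2 design parameters.
def s_toy(x):
    return x[0] + 2.0 * x[1]


def grad_s_toy(x):
    return [1.0, 2.0]


xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
f_vals = [0.5, 1.0, 1.0]
grad_f_new = [0.5, 2.5]
x_new = [1.0, 1.0]

# All factors equal to 1: the unweighted residual (1.3.10).
r_plain = residual(s_toy, grad_s_toy, f_vals, grad_f_new, xs, x_new,
                   [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], 1.0, 1.0)
# A larger penalty factor and one larger weight rescale individual rows only.
r_weighted = residual(s_toy, grad_s_toy, f_vals, grad_f_new, xs, x_new,
                      [1.0, 1.0, 2.0], [1.0, 1.0, 1.0], 10.0, 1.0)
```

The two results differ only by a row-wise scaling, which is the diagonal-matrix relation between (1.3.12) and (1.3.10) stated above.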

1.4 Assumptions

In optimization problems there can often be several optimizers, both global and local. The Space Mapping method is not a global optimization method, and depending on the problem, we cannot be sure that the solution found is the global optimizer, or even that this optimizer is unique.

A number of conditions must be satisfied in order to find a minimizer by the Space Mapping method. These conditions are discussed in detail in [7], and are not the subject of this report.

We assume the following:

• The sets $\{x^*\} = \arg\min \{H(f(x))\}$ and $\{z^*\} = \arg\min \{H(c(x))\}$ are [...]

• The coarse model optimizer $z^*$ and the fine model optimizer $x^*$ are unique.

• All variables are real.

• The coarse model function and the fine model function are both continuous and at least once differentiable.

• The evaluation time for the coarse model is negligible.

1.5 Previous Work and Implementation

The Matlab programs made in connection with this thesis work within the existing SMIS (Space Mapping Interpolating Surrogate) framework implemented by Frank Pedersen. The framework is programmed in Matlab, but also involves some Fortran subroutines collected in the F-package. The included F-package contains different algorithms for the solution of constrained and unconstrained non-linear optimization problems. A detailed description of the Fortran subroutines is found in [13].

The SMIS framework by Frank Pedersen includes a number of algorithms for solving optimization problems with the Space Mapping method. The different versions of the algorithms are each placed in their own directory corresponding to the particular formulation of the algorithm. The problems used to test the algorithms are also placed in directories of their own. Furthermore, the framework contains a number of directories with basic tools, such as forward difference approximations, plot functions etc. A new toolbox has been added: the immoptibox programmed by Hans Bruun Nielsen [12]. The framework can be augmented by adding a new algorithm or a new test problem placed in a new Matlab directory in the Matlab search path.

All Matlab code programmed during the working period of this report is available at IMM's homepage, see [15]. Some of the program files are modifications or augmented versions of existing code, and some files are made from scratch.

Appendix A provides a short user's guide for the SMIS framework, yet only the implementations and test problems used in this report are included.

Chapter 2

Implementation of The Space Mapping Algorithm

In this chapter the Space Mapping algorithm will be outlined and discussed. The method can be divided into three algorithms: the main algorithm and two sub-algorithms. The main algorithm is summarized in the previous section, and is referred to as Algorithm 1. Each iteration in this algorithm consists of two optimization procedures:

The first optimization problem involves finding the next iterate by minimizing the surrogate model defined by the current mapping parameters. This sub-algorithm is called Algorithm 2.

The second optimization procedure, the Parameter Extraction, consists of $m$ optimization problems, each giving a new set of mapping parameters for the corresponding surrogate model response. Algorithm 3 is used in each of the $m$ Parameter Extraction problems.

The three algorithms will be presented in pseudo-code in the next sections followed by comments on the involved parameters and procedures.

2.1 The Space Mapping Algorithm

The algorithms follow the theoretical introduction in section 1. Algorithm 1 builds on the Matlab implementation by Frank Pedersen, but has been changed to a certain extent. The optimization problem in Algorithm 2 is solved by calling a Fortran subroutine from the F-package. The algorithm is identical to the original and has not been altered. The algorithm is not discussed in detail and is presented here to give a full overview of the Space [...] and different formulation compared to the existing framework by Frank Pedersen.

Before presenting the main Space Mapping algorithm we define:

$p$: Defines the norm used in the objective function $H : \|\cdot\|_p$. Possible values are $1$, $2$ or $\infty$, where the latter (minimax optimization) is used throughout this report.

$\Delta$: Trust region radius used in Algorithm 2.

$F$: The objective function in the optimization problem.

The stopping criteria are defined by a number of optional parameters:

max_f1: Maximal number of function evaluations in Algorithm 1.

max_f2, max_f3: The corresponding limits for Algorithm 2 and Algorithm 3.

$\varepsilon_F$: Used in stopping criterion for the objective function.

$\varepsilon_K$: Used in stopping criterion for the gradient matching.

$\varepsilon_{hx}$: Used in stopping criterion for the step length for $x$-iterates.

$\varepsilon_{hp}$: Used in stopping criterion for the step length for $p$-iterates.

The values for the parameters used in the stopping criteria must be defined in the problem setup-file; for more details see Appendix A.

2.1.1 The Main Algorithm

The surrogate $s$ is given by (1.3.8) and the $i$th residual function $r_i$ by the general formulation (1.3.12).

In the algorithm the superscript indices $(k)$ for iteration numbers are omitted to simplify the pseudo-code. The subscript 'new' corresponds to the superscript $(k+1)$ in the formulation of the theory in section 1.3. It is assumed that the surrogate model and the residual function in [...]

Algorithm 1: Main Algorithm for Space Mapping Iterations

    k = 0; stop = 0; x ∈ R^n; ∆ = 10^-1 · ‖x‖_2
    A_i = I; b_i = 0; α_i = 1; β_i = α_i c_i(x)   for i = 1, ..., m
    while not stop
        Find h_new = arg min_{‖h‖_2 ≤ ∆} ‖s(x + h)‖_p by Algorithm 2.
        Evaluate x_new = x + h_new, S_new = ‖s(x_new)‖_p and dS = S_new − F
        Check stopping criteria dS ≥ 0 and ‖h_new‖_2 < ε_hx · (‖x‖_2 + ε_hx)
        Evaluate f_new = f(x_new) and J_f,new = J_f(x_new) and F_new = ‖f_new‖_p
        dF = F_new − F; ρ = dF/dS
        k = k + 1
        Add x_new and f_new to sorted internal data structure
        Active = | ∆ − ‖h_new‖_∞ | < 10^-2 ∆
        if dF < 0
            x = x_new; f = f_new; J_f = J_f,new; F = F_new
        end
        Check stopping criteria dF < ε_F and k ≥ max_f1
        if ρ > 0.5 & Active
            ∆ = ∆ · 2
        else if ρ < 10^-4
            ∆ = ∆/3
        end
        for i = 1:m
            Find {A_i,new, b_i,new, α_i,new} = arg min {1/2 · r_i^T r_i} by Algorithm 3.
            Set {A_i, b_i, α_i} = {A_i,new, b_i,new, α_i,new}
        end

Some remarks on Algorithm 1 are given below:

Initialization

The initial guess for the coarse model optimizer is $x$, and the initialization of the parameters corresponds to the formulation in section 1.3. The elements of the corresponding surrogate model are given by (1.3.7) and are identical to the coarse model elements. The first optimization before entering the main loop hereby gives the coarse model optimizer. The value of the initial trust region is recommended to be $\Delta = 10^{-1} \cdot \|x\|_2$ according to [13], but can be [...]

Optimization of the Surrogate

In the main loop the optimizer of the current surrogate function (defined by the current mapping parameters) is found. The step $h_{new}$ and the objective function gain compared to the previous iterate are calculated. The formulation of the interpolating surrogate makes sure that $s^{(k)}(x^{(k)}) = f(x^{(k)})$, so that $S^{(k)} = F^{(k)}$.

Stopping Criteria

Two stopping criteria are checked at this point:

The change in the surrogate function must be negative; if not, the new iterate is actually a worse solution than the previous one, and we want to exit. This stopping criterion is included as a safeguard against an infinite loop. The optimization algorithm for the surrogate model uses a descent method, so we are ensured that $dS \le 0$. If $dS = 0$ we cannot improve the surrogate, and we exit the loop.

The second stopping criterion concerns the relative step length and is defined by the optional parameter $\varepsilon_{hx}$. The formulation makes sure that the criterion is also useful when $\|x\|_2$ is close to zero. If the solution vector is too close to the previous one, we have not obtained enough new information to get a new set of mapping parameters, and we are stuck at the current iterate.

Checking these two stopping criteria at this point of the algorithm makes sure that unnecessary evaluations of the fine model are avoided.
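The two checks can be sketched in Python as follows. This is an illustrative reading of the pseudo-code, not the SMIS code: the function names are hypothetical, and treating the two criteria as alternatives (stop if either holds) is an assumption about how they are combined.

```python
import math


def norm2(v):
    """Euclidean norm of a vector given as a list of numbers."""
    return math.sqrt(sum(t * t for t in v))


def step_too_small(h_new, x, eps_hx):
    """Relative step length criterion ||h_new||_2 < eps_hx * (||x||_2 + eps_hx).

    The extra eps_hx inside the bracket keeps the threshold positive
    when ||x||_2 is close to zero.
    """
    return norm2(h_new) < eps_hx * (norm2(x) + eps_hx)


def stop_before_fine_evaluation(dS, h_new, x, eps_hx):
    """Assumed combination: stop if the surrogate did not decrease
    or if the step is negligible relative to the iterate."""
    return dS >= 0.0 or step_too_small(h_new, x, eps_hx)


# With x at the origin the threshold is eps_hx**2 rather than zero.
small_at_origin = step_too_small([1e-9, 0.0], [0.0, 0.0], 1e-4)
large_at_origin = step_too_small([1e-3, 0.0], [0.0, 0.0], 1e-4)
```

Both checks use only quantities that are available before the fine model is evaluated, which is why they are placed ahead of the evaluation of $f$ and $J_f$.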

Gain Ratio

We continue the main loop with evaluations of $f$, $J_f$ and the objective function $F$ in the new iterate. The gain ratio $\rho$ is the ratio between the true gain and the predicted gain. It serves as a measure of how well the surrogate model approximates the fine model, and is used for updating the trust region radius $\Delta$.

Internal Data Structure

The iterate and the corresponding function vector and Jacobian are added [...]

$F$: The objective function values of the iterates number $1$ to $k$ sorted in ascending order.

$X$: Matrix with the iterates sorted according to $F$.

$dX$: Row vector with the norms of the distances from the sorted iterates in $X$ to the best iterate.

By this sorting the first element in $F$ will be the best objective function value so far, the first column in $X$ will be the best iterate so far, and the first element in $dX$ will be $0$. The data is used in the Parameter Extraction problem and also for documentation and plotting after the Space Mapping iterations have ended.
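A minimal Python sketch of this sorted bookkeeping is given below. It makes two assumptions that the text does not settle: the iterates are stored as a list of vectors (rather than as columns of a matrix), and the distances in $dX$ are measured in the 2-norm. The function name `sort_iterates` is hypothetical.

```python
import math


def sort_iterates(iterates, objective_values):
    """Return (F, X, dX): objective values in ascending order, the iterates
    in the same order, and their 2-norm distances to the best iterate."""
    order = sorted(range(len(iterates)), key=lambda j: objective_values[j])
    F = [objective_values[j] for j in order]
    X = [iterates[j] for j in order]
    best = X[0]
    dX = [math.sqrt(sum((u - v) ** 2 for u, v in zip(x, best))) for x in X]
    return F, X, dX


# Toy data (assumed): three iterates with their fine model objective values.
F_sorted, X_sorted, dX_sorted = sort_iterates(
    [[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]],
    [2.0, 0.5, 1.0],
)
```

With this ordering the distances in `dX_sorted` could, for instance, be used to choose weighting factors that depend on how far each previous iterate is from the best one.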

The Active Flag

This flag is active ($= 1$) if the new solution is close to the boundary of the trust region, in the sense that the length of the step must be in the open interval $\|h_{new}\|_\infty \in \; ]0.99 \cdot \Delta, \, 1.01 \cdot \Delta[$. In theory it is impossible to have $\|h_{new}\|_\infty > \Delta$, but in practice rounding errors can have an effect. An active flag equal to $1$ indicates that the trust region constraints are active, and the flag is used later for updating the trust region radius.

Update of the Iterate

If the objective function has decreased, we accept the new iterate and use it as the initial value in the next surrogate optimization.

Stopping Criteria

At this point two additional stopping criteria are checked:

We use the change in the objective function to formulate the stopping criterion $dF < \varepsilon_F$. The relative change can be used in case $F$ is not close to zero in the optimizer.

As a final safeguard against an infinite loop we exit if the number of main iterations has reached the maximum value max_f1.

Update of the Trust Region

We use the following updating strategy for the trust region radius $\Delta$:

If the gain ratio is larger than $0.5$, and the new iterate is close to the boundary of the trust region, we increase the trust region radius by a factor $2$. A large value of $\rho$ indicates that the surrogate serves as a good approximation to the fine model. Since the active flag is $1$, we have taken the largest step possible in the latest iteration. We would then like to increase $\Delta$ and take longer steps.

If $\rho$ is small, the surrogate is a poor approximation to the fine model, and we want to take smaller steps. The trust region is decreased if the gain ratio is smaller than the value $10^{-4}$. This condition is quite strict, because the surrogate has not yet been updated. By the update of the mapping parameters to follow, we hope that the surrogate model is improved. It is also important that the trust region does not get too small, since the optimization procedure of the surrogate model would then become restricted.
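The updating strategy can be summarized in a small Python sketch. The thresholds $0.5$ and $10^{-4}$ and the growth factor $2$ are taken from the text. The reduction factor is not stated here, so the default `shrink_factor=0.5` is purely an assumption made for illustration, as is the function name.

```python
def update_trust_region(delta, rho, active, shrink_factor=0.5):
    """Return the new trust region radius.

    delta  : current radius
    rho    : gain ratio of the latest step
    active : active flag (1 if the step reached the boundary)
    shrink_factor : ASSUMED reduction factor, not given in the text
    """
    if rho > 0.5 and active == 1:
        # Good surrogate and a boundary step: allow longer steps.
        return 2.0 * delta
    if rho < 1e-4:
        # Very poor agreement: take smaller steps.
        return shrink_factor * delta
    # Otherwise the radius is left unchanged.
    return delta


print(update_trust_region(1.0, rho=0.8, active=1))
print(update_trust_region(1.0, rho=1e-5, active=0))
```

Note that a large gain ratio alone does not enlarge the region; the step must also have been limited by the trust region.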

Update of the Mapping Parameters

The mapping parameters are updated by Algorithm 3.

Each iteration in the main loop involves one evaluation of the fine model function vector and one evaluation of the fine model Jacobian. The iteration counter $k$ is then equal to the number of $f$- and $J_f$-evaluations. A number of coarse model evaluations (through the surrogate evaluation) are also performed. Since these are considered cheap compared to the fine model evaluations, we are only interested in the final number of fine model evaluations.

2.1.2 The Algorithm for Surrogate Optimization

The algorithm used to solve the optimization of the surrogate is outlined in pseudo-code in the box below. Since we are mainly concerned with Algorithm 1 and 3 in this report, Algorithm 2 is only discussed briefly. We note that the iteration counter is used independently of the iteration counter of …

Algorithm 2: Sub-algorithm for the Surrogate Optimization

Given global trust region radius $\Delta$ and initial parameter vector $x^{(0)}$
Linear inequality constraints: $\hat{A} = [I;\, -I]$, $\hat{b} = [-x^{(0)} + \Delta;\; x^{(0)} + \Delta]$
Call to function minnonlin.
Call to Fortran subroutine (non-linear optimization in the norm $p$).
Initial local trust region radius $\Delta/4$
stop if $j \geq \mathrm{max}_{f2}$ or if $h < \varepsilon_{hx} \cdot \|x\|_2$

Initialization

The trust region radius $\Delta$ is given by Algorithm 1. To ensure that the optimizer of the surrogate function is inside the trust region, we introduce the linear inequality constraints given by:

$$x + \left(-x^{(0)} + \Delta\right) \geq 0 \quad \text{and} \quad -x + \left(x^{(0)} + \Delta\right) \geq 0$$

which are equivalent to the conditions:

$$x \geq x^{(0)} - \Delta \quad \text{and} \quad x \leq x^{(0)} + \Delta$$

See the user's guide in Appendix A for more information on how to handle the case where the original minimization problem is constrained.
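As an illustration, the constraints $\hat{A} x + \hat{b} \geq 0$ with $\hat{A} = [I; -I]$ and $\hat{b} = [-x^{(0)} + \Delta;\; x^{(0)} + \Delta]$ can be assembled and checked as follows. This is a plain Python sketch with hypothetical function names; it is not the interface used by minnonlin or the Fortran subroutines.

```python
def trust_region_constraints(x0, delta):
    """Build A_hat = [I; -I] and b_hat = [-x0 + delta; x0 + delta]."""
    n = len(x0)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    a_hat = identity + [[-value for value in row] for row in identity]
    b_hat = [-xi + delta for xi in x0] + [xi + delta for xi in x0]
    return a_hat, b_hat


def is_feasible(a_hat, b_hat, x):
    """Check A_hat * x + b_hat >= 0 row by row."""
    return all(
        sum(a * xi for a, xi in zip(row, x)) + b >= 0.0
        for row, b in zip(a_hat, b_hat)
    )


a_hat, b_hat = trust_region_constraints([1.0, 2.0], delta=0.5)
print(is_feasible(a_hat, b_hat, [1.4, 1.6]))  # inside the box around x0
print(is_feasible(a_hat, b_hat, [1.6, 2.0]))  # first coordinate too large
```

The first $n$ rows express $x \geq x^{(0)} - \Delta$ and the last $n$ rows express $x \leq x^{(0)} + \Delta$, so the feasible set is a box of half-width $\Delta$ centred at $x^{(0)}$.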

Function Call to minnonlin and Fortran Subroutine

This function is a helper function that, depending on the problem type, makes another function call to the proper Fortran subroutine, see [13]. Which optimization algorithm is used depends on the norm $p$ in which we want to minimize the surrogate function, and on whether the objective function is a scalar function or a vector function. Because of the trust region approach the optimization problem is always constrained.

Some of the Fortran subroutines use an optimization method with a trust region; in this case the initial local trust region is set to $\Delta/4$. This local trust region has nothing to do with the global trust region $\Delta$, which ensures that …

Stopping Criteria

The Fortran subroutines require two optional parameters used in the stopping criteria of the algorithm: the maximum number of iterations allowed ($\mathrm{max}_{f2}$), and the parameter $\varepsilon_{hx}$, which is used with regard to the step length. The algorithm stops when it suggests a step length $h$ with $h < \varepsilon_{hx} \cdot \|x\|_2$.

2.1.3 The Algorithm for Parameter Extraction

The third algorithm, which performs the Parameter Extraction for each of the $m$ response functions, is outlined below.

Algorithm 3: Sub-algorithm for Parameter Extraction

Given initial parameter vector $p_k = [A(:);\, b;\, \alpha]$
The objective function is $r$ and the last $n$ rows of $r$ are denoted $g$.
The options in marquardt are given by opts.
$j = 0$; $stop = 0$; $\sigma = 1$; $K = \|g\|_\infty$; $K_r = \|r\|_\infty$
while not $stop$
  $j = j + 1$
  Find $p_{new} = \arg\min_p \{ \tfrac{1}{2} \cdot r(p)^T r(p) \}$ by marquardt with opts
  $r_{new} = r(p_{new})$; $K_{new} = \|g_{new}\|_\infty$; $K_{r,new} = \|r_{new}\|_\infty$
  $h = p_{new} - p$; $Accept = K_{new} < K$
  if $Accept$
    $p = p_{new}$; $K = K_{new}$; $K_r = K_{r,new}$
  end
  if $K_{new} < \varepsilon_K$
    $stop = 1$; break
  else if $h < \varepsilon_{hp} \cdot (\|p\|_2 + \varepsilon_{hp})$
    $stop = 2$; break
  else if $j \geq \mathrm{max}_{f3}$
    $stop = 3$; break
  else if $\sigma \geq 10^3$
    $stop = 4$; break
  end
  if $\sigma < 10^3$
    $\sigma = \sigma \cdot 10$
  end
end

This algorithm is considered independently of Algorithm 1 and 2, and any references to iterations only concern the present Algorithm 3.

Initialization

We have given initial sets of the mapping parameters $A$, $b$ and $\alpha$ corresponding to an arbitrary response. The mapping parameters are arranged in the vector $p_k$, where $k$ denotes the mapping parameters defining the $k$th surrogate model. The residual function for the response is given by equation (1.3.12). The options used in the marquardt function are set in the $5$-element vector opts, where:

opts(1): Defines the initial value of the Marquardt parameter: $\mu_0 = \texttt{opts(1)} \cdot \max \{ (J_0^T J_0)_{(i,i)} \}$.

opts(2): Parameter used in the stopping criterion for the gradient: $\|F'(p)\|_\infty \leq \texttt{opts(2)}$.

opts(3): Parameter used in the stopping criterion for the step length: $\|dx\|_2 \leq \texttt{opts(3)} \, (\texttt{opts(3)} + \|p\|_2)$.

opts(4): Maximal number of iterations.

opts(5): Lower bound on $\mu$: $\mu = \texttt{opts(5)} \cdot \max \{ (J^T J)_{(i,i)} \}$.
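To make the role of opts(1) and opts(3) concrete, the following Python sketch evaluates the initial Marquardt parameter and the step length test for small hand-made inputs. The function names are hypothetical and the code is not the marquardt implementation; it only restates the two formulas above.

```python
import math


def initial_damping(jacobian, opts1):
    """mu_0 = opts(1) * max_i (J^T J)_(i,i) for a Jacobian given as a list of rows."""
    n_cols = len(jacobian[0])
    # The i-th diagonal element of J^T J is the squared norm of column i of J.
    diagonal = [sum(row[i] ** 2 for row in jacobian) for i in range(n_cols)]
    return opts1 * max(diagonal)


def step_is_small(dx, p, opts3):
    """Step length test ||dx||_2 <= opts(3) * (opts(3) + ||p||_2)."""
    norm_dx = math.sqrt(sum(v * v for v in dx))
    norm_p = math.sqrt(sum(v * v for v in p))
    return norm_dx <= opts3 * (opts3 + norm_p)


J0 = [[1.0, 2.0], [3.0, 4.0]]
print(initial_damping(J0, opts1=1e-3))
print(step_is_small([1e-10, 0.0], [1.0, 1.0], opts3=1e-8))
```

Scaling $\mu_0$ by the largest diagonal element of $J_0^T J_0$ makes the choice of opts(1) independent of the scaling of the residual.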

The penalty factor $\sigma$ used in the gradient residual is initialized to $1$. $K$ is a measure of the violation of the gradient match, and $K_r$ of the matching of the full residual.

Optimization of the Residual

The current residual is minimized by the Matlab function marquardt, implemented by Hans Bruun Nielsen, to find the next parameter vector. This new solution is the result of a number of Marquardt steps, and the iterations ended because one of the stopping criteria mentioned above was activated. The Marquardt algorithm is discussed further in section 3.2. The residual function vector is evaluated at the new iterate, as well as the new values of $K$ and $K_r$ and the step length.

The Accept Flag and Update of the Iterate

Since we consider the gradient match to be of great importance, we use the factor $K$ to decide whether the new set of parameters is better than the previous one. The parameter vector is accepted ($Accept = 1$) only if the gradient match residual has decreased compared to the previous iteration.

Stopping Criteria

We check four stopping criteria and exit the loop if any of them is satisfied. The gradient match is considered satisfied if $K$ is smaller than the value $\varepsilon_K$. Secondly, if the step between two consecutive iterates is relatively small, the step length criterion is activated. The third criterion stops the loop if the number of iteration steps has exceeded the limit $\mathrm{max}_{f3}$. Finally, we will only continue the iterations while $\sigma$ is below or equal to $10^3$. This criterion is equivalent to using $\mathrm{max}_{f3} = 4$, if we choose the updating strategy below.
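The order in which the four criteria are tested in Algorithm 3 can be sketched as a small Python function returning the value of the stop flag. The function name and argument names are hypothetical. Algorithm 3 writes the step test with $h$ alone; measuring the step by its 2-norm is an assumption made here.

```python
import math


def pe_stop_code(k_new, h, p, j, sigma, eps_k, eps_hp, max_f3):
    """Return the stop flag of Algorithm 3 (0 means: continue iterating)."""
    step = math.sqrt(sum(v * v for v in h))      # ASSUMED: 2-norm of the step
    p_norm = math.sqrt(sum(v * v for v in p))
    if k_new < eps_k:
        return 1                                 # gradient match satisfied
    if step < eps_hp * (p_norm + eps_hp):
        return 2                                 # relatively small step
    if j >= max_f3:
        return 3                                 # iteration limit reached
    if sigma >= 1e3:
        return 4                                 # penalty factor at its limit
    return 0


print(pe_stop_code(k_new=1e-9, h=[1.0, 0.0], p=[1.0, 1.0], j=1,
                   sigma=1.0, eps_k=1e-6, eps_hp=1e-8, max_f3=10))
```

Because the tests are chained, only the first satisfied criterion is reported, so a run that meets the gradient match in its last allowed iteration still returns $stop = 1$.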

Update of the Penalty Factor

For the penalty factor $\sigma$ we use a simple updating strategy: $\sigma$ is increased by a factor $10$ for each main iteration. In this way we force the gradient match to become of greater importance than the other elements of the residual.
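The sequence of penalty factors produced by this strategy can be traced with a short Python sketch. It assumes that none of the other stopping criteria of Algorithm 3 are triggered, and the function name is hypothetical.

```python
def penalty_schedule(sigma=1.0, sigma_max=1e3, factor=10.0):
    """List the penalty factors used in successive main iterations of
    Algorithm 3, assuming only the sigma criterion stops the loop."""
    used = []
    while True:
        used.append(sigma)        # one Parameter Extraction solve with this sigma
        if sigma >= sigma_max:    # corresponds to stop = 4
            break
        sigma *= factor           # sigma = sigma * 10
    return used


schedule = penalty_schedule()
print(schedule, len(schedule))
```

Starting from $\sigma = 1$, the loop runs with $\sigma = 1, 10, 10^2, 10^3$ and then stops, which is why the $\sigma$ criterion acts like an iteration limit of four.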


Chapter 3

Theoretical and Practical Investigations

3.1 Finite Difference Approximation

To run the SMIS algorithm the user must supply two Matlab files with implementations of the coarse and fine model and their Jacobians. For some problems (including the TLT2 and TLT7 problems), no exact gradients are available, and the Jacobians are instead calculated by e.g. difference approximations. In the test problems investigated in this report, the Jacobian is calculated by forward difference approximations by the Matlab function diffjacobian, see [15], implemented by Jacob Søndergaard. In the implementation of diffjacobian the step length is scaled according to the independent variable $x$, so that $h = \eta(1 + |x|)$. This formulation is useful in the case where $|x|$ is very small, and we get $h \simeq \eta$. The value of $\eta$ originally had the fixed value $\eta = \sqrt{\varepsilon_M}$, where $\varepsilon_M$ is the machine accuracy, but $\eta$ can be changed by the user directly in the m-file.
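A forward difference Jacobian with the scaled step $h = \eta(1 + |x_j|)$ can be sketched in Python as below. This is not the diffjacobian code from [15]; the function name, the argument list and the use of Python are assumptions made for illustration only.

```python
import math
import sys


def forward_difference_jacobian(func, x, eta=None):
    """Approximate the Jacobian of func at x by forward differences,
    using the step h = eta * (1 + |x_j|) in coordinate j."""
    if eta is None:
        eta = math.sqrt(sys.float_info.epsilon)  # eta = sqrt(machine accuracy)
    f0 = func(x)
    jac = [[0.0] * len(x) for _ in f0]
    for j, xj in enumerate(x):
        h = eta * (1.0 + abs(xj))
        x_step = list(x)
        x_step[j] = xj + h
        f1 = func(x_step)
        for i in range(len(f0)):
            jac[i][j] = (f1[i] - f0[i]) / h
    return jac


def example(x):
    """Small test function with known Jacobian [[2*x0, 1], [0, cos(x1)]]."""
    return [x[0] ** 2 + x[1], math.sin(x[1])]


print(forward_difference_jacobian(example, [1.0, 0.0]))
```

Each column costs one extra function evaluation, so the full Jacobian requires $n + 1$ evaluations of the model.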

During the investigation of the TLT2 problem, some interesting features regarding the model functions were discovered. The fine, coarse and surrogate functions are smooth, but the gradients (partial derivatives) wrt. both $x$ and $p$ calculated from diffjacobian show a different picture when the step length is small. This motivated an investigation of the step length in diffjacobian. In the following sections the theory is simplified by looking at scalar functions.

3.1.1 Optimal Step Length

The Jacobians of both the fine and coarse model are calculated by the Matlab function diffjacobian. In the following the influence of the step length used in the difference approximation will be analyzed based on a scalar example.

Given a scalar function $f : \mathbb{R} \to \mathbb{R}$ we compute a forward difference approximation to the gradient of $f$ at the point $x$ by:

$$D_F(x, h) = \frac{f(x + h) - f(x)}{h} \tag{3.1.1}$$

The Taylor expansion of $f$ about the expansion point $x$ is given by:

$$f(x + h) = f(x) + h f'(x) + \frac{h^2}{2} f''(x) + O(h^3) \tag{3.1.2}$$

Inserting (3.1.2) in the forward difference approximation (3.1.1), we get:

$$D_F(x, h) = f'(x) + \frac{h}{2} f''(x) + O(h^2)$$

The truncation error $E_T$ is now:

$$\begin{aligned} E_T &= D_F(x, h) - f'(x) \\ &= f'(x) + \frac{h}{2} f''(x) + O(h^2) - f'(x) \\ &= \frac{h}{2} f''(x) + O(h^2) \end{aligned} \tag{3.1.3}$$

and we see that the truncation error is $O(h)$ for $h \to 0$. If we also take the rounding errors into account, we get the floating point numbers $\bar{f}(x + h)$ and $\bar{f}(x)$ instead of $f(x + h)$ and $f(x)$:

$$\bar{f}(x + h) = f(x + h)(1 + \delta_1), \quad |\delta_1| \leq K \varepsilon_M$$

$$\bar{f}(x) = f(x)(1 + \delta_2), \quad |\delta_2| \leq K \varepsilon_M$$

where the constant $K \geq 1$ and $\varepsilon_M$ is the machine accuracy.

Inserting the above in the forward difference approximation (3.1.1) we get:

$$\begin{aligned} \bar{D}_F(x, h) &= \frac{\bar{f}(x + h) - \bar{f}(x)}{h} \\ &= \frac{f(x + h) - f(x)}{h} + \frac{\delta_1 f(x + h) - \delta_2 f(x)}{h} \\ &= D_F(x, h) + E_R \\ &= f'(x) + E_T + E_R \end{aligned} \tag{3.1.4}$$

where the rounding error is denoted $E_R$. The worst possible rounding error occurs when $\delta_1 f(x + h)$ and $\delta_2 f(x)$ have opposite signs, giving:

$$|E_R| \leq \frac{K \varepsilon_M \left( |f(x + h)| + |f(x)| \right)}{h} \simeq \frac{2K |f(x)| \, \varepsilon_M}{h} \tag{3.1.5}$$

The absolute total error is now given by the absolute difference between $\bar{D}_F$ and the real gradient:

$$|E| = |E_T + E_R| = |\bar{D}_F(x, h) - f'(x)| \simeq A h + O(h^2) + B \frac{\varepsilon_M}{h} \tag{3.1.6}$$

The constants $A$ and $B$ depend on the function values and second derivatives in the neighbourhood of $x$, $A \simeq \frac{1}{2} |f''(x)|$ and $B \simeq 2K |f(x)|$.
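The behaviour predicted by (3.1.6) can be observed numerically. The following Python sketch uses $f(x) = e^x$ at $x = 1$ as an arbitrary test function, chosen here only because its exact derivative is known; it is not one of the test problems of this report.

```python
import math


def forward_difference(func, x, h):
    """Forward difference approximation D_F(x, h) from (3.1.1)."""
    return (func(x + h) - func(x)) / h


def total_error(h, x=1.0):
    """Absolute error of D_F for f = exp, whose exact derivative is exp(x)."""
    return abs(forward_difference(math.exp, x, h) - math.exp(x))


# Large h: truncation error dominates. Very small h: rounding error dominates.
for h in (1e-2, 1e-5, 1e-8, 1e-11, 1e-14):
    print(f"h = {h:.0e}   |E| = {total_error(h):.3e}")
```

The printed errors first decrease with $h$, as the term $Ah$ shrinks, and then increase again once the term $B \varepsilon_M / h$ takes over.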

Using the above we now consider the case of approximating the gradients wrt. the parameters in the $x$-vector. The gradients appear in the Jacobians of the fine, coarse and surrogate models in the Parameter Extraction problem. We consider an arbitrary response function with no index on $f$ and $s$. Each of the rows of the gradient residual (the last $n$ rows of the residual from (1.3.12)) has the form:

$$g_i(x, p) = s'_{x_i}(x, p) - f'_{x_i}(x) \tag{3.1.7}$$

$x$ is a column vector holding the design parameters, and the vector $p$ holds the parameters $A$, $b$ and $\alpha$. To simplify the calculations we consider only the $i$th row of the gradient residual as a function of the variable vectors $x$ and $p$ (keeping all other parameters than $x_i$ fixed). Without loss of generality we also disregard the factors $\sigma$ and $d$.

The exact function $g_i(x, p)$ is replaced by the approximated function $G_i(x, p)$ returned by the Matlab function, where the difference approximation wrt. $x_i$ and the rounding errors give the approximation to (3.1.7):

$$\begin{aligned} G_i(x, p) &\simeq \frac{s(x + h_x e_i, p) - s(x, p) + B \varepsilon_M}{h_x} - f'_{x_i}(x) \\ &= s'_{x_i}(x, p) - f'_{x_i}(x) + \frac{h_x}{2} s''_{x_i x_i}(x, p) + O(h_x^2) + B \frac{\varepsilon_M}{h_x} \\ &= g_i(x, p) + A h_x + O(h_x^2) + B \frac{\varepsilon_M}{h_x} \end{aligned} \tag{3.1.8}$$

where $e_i$ is a unit vector in the $i$th direction and $h_x$ is the step length. The constant $A$ depends on the second derivative of $s$, and $B$ depends on the function values of $s$ in the interval $[x, x + h]$.

By comparing (3.1.8) with the exact function (3.1.7), we get the total error for each of the gradient residual rows:

$$E_G(h_x) = E_T + E_R \simeq A h_x + O(h_x^2) + B \frac{\varepsilon_M}{h_x} \tag{3.1.9}$$

where the differentiation is wrt. $x_i$ for row number $i$, and the lower index of $x$ has been omitted for simplicity.

When $h_x$ is large the total error is dominated by the truncation error (the first two terms), whereas the rounding error (the last term) dominates for small $h_x$. To determine the optimal step length $h_{x,opt}$ that minimizes the total error we differentiate (3.1.9), neglecting the $O(h_x^2)$ term, and get:

$$E_G'(h_x) = A - B \frac{\varepsilon_M}{h_x^2} \tag{3.1.10}$$

Setting (3.1.10) equal to $0$ we obtain the optimal step length minimizing the error function:

$$E_G'(h_x) = 0 \;\Rightarrow\; h_{x,opt} = \sqrt{\varepsilon_M \frac{B}{A}} \tag{3.1.11}$$

In Matlab the unit round-off is $\varepsilon_M \simeq 10^{-16}$, so the step length should be around $10^{-8} \sqrt{B/A}$ to give the smallest errors. In practice we do not distinguish between the $x$-variables and choose one suitable step length.

With typical values of the derivatives of the surrogate model of $s_x(x, p) \sim 10^{-3}$ and $s''_{xx}(x, p) \sim 10^{-4}$, we get the approximate value of the optimal step length $h_{x,opt} \sim 10^{-7}$. Here we have used the values $A = \frac{1}{2} 10^{-4}$ and $B = 2 \cdot 10^{-3}$ for the constants. The result agrees with figure 3.1.1, where the error function (3.1.9) for these values of $A$ and $B$ is plotted as a function of the step length $h_x$.
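The arithmetic behind this estimate can be checked with a few lines of Python. The sketch evaluates (3.1.11) and the error model (3.1.9) without the $O(h_x^2)$ term for the constants quoted above; the function names are hypothetical.

```python
import math


def optimal_step(a, b, eps_m=1e-16):
    """Optimal step length h_opt = sqrt(eps_M * B / A) from (3.1.11)."""
    return math.sqrt(eps_m * b / a)


def model_error(h, a, b, eps_m=1e-16):
    """Error model A*h + B*eps_M/h, i.e. (3.1.9) without the O(h^2) term."""
    return a * h + b * eps_m / h


A = 0.5e-4   # A = 1/2 * 10^-4
B = 2e-3     # B = 2 * 10^-3
h_opt = optimal_step(A, B)
print(f"h_opt = {h_opt:.3e}")
print(f"E_G(h_opt) = {model_error(h_opt, A, B):.3e}")
```

With these constants $B/A = 40$, so $h_{x,opt} = \sqrt{4 \cdot 10^{-15}} \approx 6.3 \cdot 10^{-8}$, which is of the order $10^{-7}$ as stated.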

We now analyse the case where the difference approximation is made with respect to the $p$-parameters. We again consider the $i$th row of the gradient residual, and look at the gradient with respect to the $j$th parameter in the vector $p$. All other variables are kept fixed.

We now have the $(i, j)$-element of the Jacobian:

IMM YGBY2004ESAESRETR.54 ei

IMM

10 − 5

1 + µ

10 − 5

1 + µ

•

•

P : R n → R n

P(x) = arg min

z ∈R n k c(z) − f(x) k 2 2

•

•

•

x ∗ = arg min

x ∈R n { H (f (x)) }

H : R m → R

x ∗ ∈ R n

f : R n → R m

c : R n → R m

s : R n → R m

m

P i : R n → R n

O : R m → R m

m

i = 1, . . . m

A i ∈ R n × n , b i ∈ R n , α i ∈ R , β i ∈ R .

P i

i

P i (x) = A i x + b i

O i

O i (y) = α i (y i − y ¯ i ) + β i

y ¯

P =





 P

1

P

m





 , O =





 O 1

O m







s = O ◦ c ◦ P

i

s i (x) = O i (c i (P i (x)))

= α i (c i (P i (x)) − c i (P i (¯ x))) + β i

= α i (c i (A i x + b i ) − c i (A i x ¯ + b i )) + β i

m

k

x (k)

s (k) (x (k) ) = f (x (k) )

s (k)

k

J f

J s

s (k) (x (j) ) = f (x (j) )

j = 1, . . . , k − 1

J (k) s (x (k) ) = J f (x (k) )

z ∗

x (1) = z ∗

0

A (0) i = I b (0) i = 0 α (0) i = 1

β i (0) = α (0) i c i (P (0) i (x (0) ))



 

 

 

 

 

 

i = 1, . . . , m

i

IMM YGBY2004ESAESRETR.54 ei

10 ⁻ ⁵

10 ⁻ ⁵

P : R ⁿ → R ⁿ

z ∈R ⁿ k c(z) − f(x) k ² 2

x ^∗ = arg min

x ∈R ⁿ { H (f (x)) }

H : R ^m → R

x ^∗ ∈ R ⁿ

f : R ⁿ → R ^m

c : R ⁿ → R ^m

s : R ⁿ → R ^m

P i : R ⁿ → R ⁿ

O : R ^m → R ^m

A i ∈ R ⁿ ^× ⁿ , b i ∈ R ⁿ , α i ∈ R , β i ∈ R .

P _i (x) = A _i x + b _i

₁

_m

 O ₁

s _i (x) = O _i (c _i (P _i (x)))

= α _i (c _i (P _i (x)) − c _i (P _i (¯ x))) + β _i

x ^(k)

s ^(k) (x ^(k) ) = f (x ^(k) )

s ^(k)

s ^(k) (x ^(j) ) = f (x ^(j) )

J ^(k) _s (x ^(k) ) = J _f (x ^(k) )

z ^∗

x ⁽¹⁾ = z ^∗

A ⁽⁰⁾ _i = I b ⁽⁰⁾ _i = 0 α ⁽⁰⁾ _i = 1

β _i ⁽⁰⁾ = α ⁽⁰⁾ _i c i (P ⁽⁰⁾ _i (x ⁽⁰⁾ ))

s ⁽⁰⁾ _i (x) = α ⁽⁰⁾ _i c i

P ⁽⁰⁾ _i (x)

P ⁽⁰⁾ _i (x ⁽⁰⁾ )

+ α ⁽⁰⁾ _i c i (P ⁽⁰⁾ _i (x ⁽⁰⁾ ))

= α ⁽⁰⁾ _i c _i

P ⁽⁰⁾ _i (x)

= c _i (x)

α _i

β _i

x ¯ ^(k) = x ^(k)

s ^(k) _i (x) = α ^(k) _i c i

P ^(k) _i (x)

P ^(k) _i (x ^(k) ) + β ^(k) _i

x ^(k)

β _i ^(k)

α ^(k) _i c i

P ^(k) _i (x ^(k) )

P ^(k) _i (x ^(k) )

+ β _i ^(k) = f i (x ^(k) )

⇒ β _i ^(k) = f i (x ^(k) )

s ^(k) _i (x) = α ^(k) _i c i

P ^(k) _i (x)

P ^(k) _i (x ^(k) )

+ f i (x ^(k) )

x ^(k+1)

x ^(k+1) = arg min

x ∈R ⁿ

s ^(k) (x) o

x ^(k+1)

H(f (x ^(k+1) )) < H (f(x ^(k) ))

x ^(k+1)

r ^(k+1) _i (A i , b i , α i ) =

s ^(k+1) _i (x ⁽¹⁾ , A i , b i , α i ) − f i (x ⁽¹⁾ )

s ^(k+1) _i (x ^(k) , A _i , b _i , α i ) − f i (x ^(k) ) J ^(k+1) _s,i (x ^(k+1) , A _i , b _i , α _i ) − J _f,i (x ^(k+1) )