4.3 The TLT2 Problem
4.3.2 The Results of the Test Runs
Effect of the Regularization
Here we run the tests with a full mapping parameter vector corresponding to a full matrix A, i.e. the number of parameters is n_p = 7. The tolerances for the main problem and the subproblems are set to ε_1 = 10^-14 and ε_2 = 10^-4. The options used in the Marquardt algorithm are opts = [1e-8 1e-4 1e-4 200 1e-12]. The test runs produce the following iteration sequences.
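As a structural illustration, the damping and the two stopping criteria controlled by the opts vector can be sketched as a minimal Levenberg-Marquardt loop. The mapping of the opts entries to (tau, tolg, tolx, maxeval) is an assumption made here for illustration; the marquardt implementation actually used in the tests defines its own semantics.

```python
import numpy as np

def marquardt(r, J, x0, tau=1e-8, tolg=1e-4, tolx=1e-4, maxeval=200):
    """Minimal Levenberg-Marquardt for F(x) = 0.5 * ||r(x)||^2.

    The parameter roles (damping seed tau, gradient tolerance tolg,
    step tolerance tolx, evaluation cap maxeval) are assumed, not the
    documented semantics of the marquardt routine used in the thesis.
    """
    x = np.asarray(x0, dtype=float)
    rx, Jx = r(x), J(x)
    A, g = Jx.T @ Jx, Jx.T @ rx
    mu, nu = tau * np.max(np.diag(A)), 2.0
    for _ in range(maxeval):
        if np.linalg.norm(g, np.inf) <= tolg:      # stop: gradient small
            break
        h = np.linalg.solve(A + mu * np.eye(x.size), -g)
        if np.linalg.norm(h) <= tolx * (np.linalg.norm(x) + tolx):
            break                                  # stop: step small
        x_new = x + h
        r_new = r(x_new)
        # gain ratio: actual vs. predicted decrease of F
        gain = (rx @ rx - r_new @ r_new) / (h @ (mu * h - g))
        if gain > 0:                               # accept the step
            x, rx = x_new, r_new
            Jx = J(x)
            A, g = Jx.T @ Jx, Jx.T @ rx
            mu *= max(1.0 / 3.0, 1.0 - (2.0 * gain - 1.0) ** 3)
            nu = 2.0
        else:                                      # reject, raise damping
            mu *= nu
            nu *= 2.0
    return x
```

With loose tolerances such as tolg = tolx = 1e-4 the loop stops early, which mirrors the observation below that tightening the subproblem tolerances does not automatically help.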
[Figure 4.3.2: Performance (||x^(k) - x*||_2 and F(x^(k)) - F(x*) per iteration); with regularization]
[Figure 4.3.3: Performance; without regularization]
The desired accuracy ε_2 of the Parameter Extraction problems has a rather large value compared to ε_1, and correspondingly the options used in the stopping criteria for marquardt are of the same size. This is chosen because it gives the best results. If we alter the options to ε_2 = 10^-14 and the opts-vector to [1e-8 1e-14 1e-14 200 1e-12], the algorithm actually converges slower, as shown in figures 4.3.4 and 4.3.5.
This seems strange, since one would expect a higher accuracy of the mapping parameters, resulting in a better surrogate model and thereby a faster convergence. But this is not the case. It is not obvious why this behaviour […]
[Figure 4.3.4: Performance; with regularization]
[Figure 4.3.5: Performance; without regularization]
[…] gradient match to the desired accuracy, and the marquardt-algorithm keeps iterating, actually producing a worse set of mapping parameters.
In the case of no regularization there is a possibility of having an overdetermined problem from the 6th iteration. At this point of the iteration the iterate is already very close to the optimizer, and it is apparently no problem that the regularization term is not added.
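The iteration at which the extraction problem becomes overdetermined can be checked with a small count. The text reports iteration 6 for n_p = 7 (and, later in this section, iteration 4 for n_p = 5); both are consistent with a hypothetical count of k + 2 conditions at iteration k. That count is reverse-engineered here from the two reported cases and is not a formula stated in the text.

```python
def first_overdetermined_iteration(n_p, conds=lambda k: k + 2):
    """First iteration k at which the parameter extraction has more
    conditions than the n_p unknowns. The count conds(k) = k + 2 is an
    assumption reverse-engineered from the reported iterations."""
    k = 0
    while conds(k) <= n_p:
        k += 1
    return k
```

Under this assumption, first_overdetermined_iteration(7) gives 6 and first_overdetermined_iteration(5) gives 4, matching the two cases reported in this section.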
Effect of the Normalization Factors
The normalization factors have an important influence on both the iteration sequence and on the surrogate model in the optimizer. The next figures show the performance of the algorithm with three cases of normalization corresponding to section 4.1.1:

• Normalization of all residual elements
• Only normalization of the gradient residual
• No normalization

First we consider the case of no regularization.
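The three cases can be sketched structurally as follows. The concrete definitions of a_1, ..., a_k and d belong to section 4.1.1, so the simple max-based factors below are illustrative stand-ins only.

```python
import numpy as np

def scale_residual(r_vals, r_grad, case):
    """Scale the extraction residual according to one of the three
    normalization cases. r_vals holds the response-value residuals,
    r_grad the gradient residuals. The max-based factors are
    illustrative stand-ins for the a_1, ..., a_k and d of section 4.1.1."""
    r_vals = np.asarray(r_vals, dtype=float)
    r_grad = np.asarray(r_grad, dtype=float)
    a = np.maximum(np.abs(r_vals), 1e-12)     # stand-ins for a_1, ..., a_k
    d = max(np.abs(r_grad).max(), 1e-12)      # stand-in for d
    if case == 1:        # case 1: normalize all residual elements
        return np.concatenate([r_vals / a, r_grad / d])
    if case == 2:        # case 2: a_1, ..., a_k = 1, only d != 1
        return np.concatenate([r_vals, r_grad / d])
    return np.concatenate([r_vals, r_grad])   # case 3: no normalization
```

Case 2 leaves the value residuals untouched and only rescales the gradient block, which is the variant the results below single out as the best behaved.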
We see that the convergence is a little slower in case 2, where a_1, ..., a_k = 1 but d ≠ 1. With no normalization the solution is not as good compared to the estimate of x*. In figures 4.3.9-4.3.11 we see the approximation errors E_s (light grid) and E_l (dark grid) corresponding to the three cases of scaling. Here there is a big difference in the surrogate models: Case 1 with com[…]

[Figure 4.3.6: Performance; with normalization]
[Figure 4.3.7: Performance; only normalization of gradient residual]
[Figure 4.3.8: Performance; without normalization]
[Figure 4.3.9: Norm of approximation error for fine model Taylor appr. and surrogate model, expansion point x = [74.23, 79.27]; with normalization]
[Figure 4.3.10: Norm of approximation error, expansion point x = [74.23, 79.27]; only normalization of gradient residual]
[Figure 4.3.11: Norm of approximation error, expansion point x = [74.26, 79.24]; without normalization]
[…] use only the normalization factor d for the gradient residual, the model is well-behaved and provides a much better approximation to the fine model. The third case, with no scaling of the residual vector at all, provides a better result than the first case, but the second case is still to be preferred.
For comparison we view the iteration sequences in the case of the regularized residual.
[Figure 4.3.12: Performance; with normalization]
[Figure 4.3.13: Performance; only normalization of gradient residual]
[Figure 4.3.14: Performance; without normalization]
[Figure 4.3.15: Norm of approximation error, expansion point x = [74.23, 79.27]; with normalization]
[Figure 4.3.16: Norm of approximation error, expansion point x = [74.23, 79.27]; only normalization of gradient residual]
[Figure 4.3.17: Norm of approximation error, expansion point x = [74.21, 79.29]; without normalization]
As seen from figures 4.3.12-4.3.14, the normalization factors have no relevant influence on the performance in the case of regularization. The solution is again less accurate in the third case, without any normalization. Also, the surrogate model approximation errors in figures 4.3.15-4.3.17 are almost unaffected by the scaling.
If we instead consider the case where we put a_1, ..., a_k equal to 1, but still have the normalization factor d ≠ 1, we get the results in figure 4.3.18.

[Figure 4.3.18: Performance and norm of approximation error, expansion point x = [74.23, 79.27]; with normalization only of gradient residual]
We see that the partial normalization results in a significantly better surrogate approximation error, which is now better than the approximation error from using a linear Taylor model.
We conclude that, for this particular problem, the scaling of the residual elements is important for the quality of the surrogate approximation, but not for the speed of convergence.
Effect of the Weighting Factors
Here we test some effects of the weighting factors by looking at the case with no regularization of the residual. The weighting factors have no effect on the iteration sequence, as is seen from figures 4.3.19 and 4.3.20.
[Figure 4.3.19: Performance; with normalization and without weighting]
[Figure 4.3.20: Performance; with normalization and with weighting]
The number of unknown parameters is n_p = 7, which means that the weighting factors will possibly influence the results from iteration 6 and onwards. But the surrogate model approximation error corresponding to the weighted case is similar to figure 4.3.9, and serves as a poor approximation to the fine model.
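The general idea of weighting is to let recent match conditions count more than old ones. A sketch follows; the geometric down-weighting of older iterates used here is a hypothetical choice for illustration, not the weighting strategy defined in the thesis.

```python
import numpy as np

def weighted_conditions(residual_blocks, base=0.5):
    """Down-weight extraction conditions coming from older iterates.

    residual_blocks[j] is the residual block contributed by iterate j
    (oldest first). The geometric weights base**age are a hypothetical
    illustration, not the thesis' actual weighting factors."""
    m = len(residual_blocks)
    weighted = [base ** (m - 1 - j) * np.asarray(b, dtype=float)
                for j, b in enumerate(residual_blocks)]
    return np.concatenate(weighted)
```

The newest block keeps weight 1 while each older block is scaled down by a further factor of base, so old iterates contribute less to the least-squares fit.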
At last we consider the weighting strategy combined with only normalization of the gradient residual. The results from this scenario are shown in figure 4.3.21.
Now the surrogate is again well-behaved, which is caused by putting the normalization factors a_1, ..., a_k = 1. We conclude that the weighting factors in this test problem are practically without influence on the results, both regarding the performance of the Space Mapping algorithm and regarding the quality of the surrogate model approximation in the optimizer.
Referring to section 3.1, we can only expect the results to be within the accuracy of the maximal errors from not using the exact gradients. The tolerance used in these test results is ε_1 = 10^-14, which is probably too strict. We therefore conclude that the problem is not suited for investigating the […]
[Figure 4.3.21: Performance and norm of approximation error, expansion point x = [74.23, 79.27]; with normalization only of gradient residual and with weighting]
Effect of the Number of Mapping Parameters
We here bring the results of the test runs with the diagonal input mapping parameter matrix A. In this case we have n_p = 5 elements in the parameter vector p. There is a possibility of having an overdetermined system in iteration 4. The test runs are made with the tolerance parameters ε_1 = 10^-14, ε_2 = 10^-4 and opts = [1e-8 1e-4 1e-4 200 1e-12]. Figures 4.3.22 and 4.3.23 show the results with a diagonal input mapping matrix in the regularized and the unregularized case.
[Figure 4.3.22: Performance; with regularization]
[Figure 4.3.23: Performance; without regularization]
We see that the convergence is slower compared to figures 4.3.12 and 4.3.6, both with and without the regularization term added, when only considering the reduced parameter vector. The corresponding approximation errors are shown in figures 4.3.24 and 4.3.25.
[Figure 4.3.24: Norm of approximation error, expansion point x = [74.23, 79.27]; with regularization]
[Figure 4.3.25: Norm of approximation error, expansion point x = [74.23, 79.27]; without regularization]
Again the surrogate model approximation error is large for the unregularized case compared to the regularized. In figure 4.3.25 the surrogate approximation is not as good as the approximation with a linear Taylor model in most of the design parameter region.
Also in this case the tolerance options are of great importance. We run the algorithm with smaller tolerances: ε_1 = ε_2 = 10^-14 and the marquardt options [1e-8 1e-14 1e-14 200 1e-12]. As the results in figures 4.3.26 and 4.3.27 show, the convergence is now as fast as with the full parameter vector. This applies both for the case with regularization and the case without regularization.
[Figure 4.3.26: Performance; with regularization]
[Figure 4.3.27: Performance; without regularization]
The approximation errors E_s and E_l corresponding to the regularized and unregularized tests are seen in figures 4.3.28 and 4.3.29. The approximation error for the surrogate model is not better when using a smaller tolerance value.
[Figure 4.3.28: Norm of approximation error, expansion point x = [74.23, 79.27]; with regularization]
[Figure 4.3.29: Norm of approximation error, expansion point x = [74.23, 79.27]; without regularization]
We conclude that the optional tolerance values are of great importance in this problem. A smaller tolerance here results in faster convergence, but the surrogate approximation errors are not affected positively by the smaller tolerance. In order to ensure good surrogate approximations over a large region of the design parameter space, we must put the normalization factors a_1, ..., a_k equal to 1.

Optimal Mapping Parameters
Finally we look at the values of the mapping parameters in the optimal surrogate model. We have the initial mapping parameters A_i = I, b_i = 0 and α_i = 1 for all response functions i = 1, ..., 11 in iteration 0. It is interesting to see how different the optimal mapping parameters are from these values. We consider two test scenarios:
• With regularization and with normalization (figure 4.3.15)
• Without regularization and with normalization (figure 4.3.9)

In the first case a representative matrix A in the optimal surrogate model is:

    A = [ 1.09  0.10
          0.05  1.07 ]
The elements of b_i are of order of magnitude 10^-3, and all values of the output mapping parameters α vary in the interval [0.75, 1.45]. But for response function number 2 we have:

    A_2 = [ -0.06  -1.27
            -1.11   0.34 ],   b_2 = [ -0.024  0.018 ]^T
which is not close to the identity matrix. We conclude that most of the matrices A are not very different from the initial identity matrix, and most of the elements in b are close to 0. The surrogate model with these mapping parameters is well-behaved, as we have seen in figure 4.3.15.
But in the case of no regularization we find a much more varying picture. Some of the A_i's are close to the identity matrix, some are not, though all A_i elements have absolute values between 0 and 3.5. Again, response function number 2 is special:

    A_2 = [ -0.37  -0.31
             2.12   1.70 ],   b_2 = [ -93.91  -65.52 ]^T,   α_2 = -12.85
Some b-vectors have elements close to 0, but as the result above shows, there are also examples of extremely different b-values. The α_i's vary between -12.85 and 1.42. The surrogate model approximation error shown in figure 4.3.9 has extreme variation, which could be a consequence of the variations of the mapping parameters.
This behaviour of the mapping parameters is probably connected to the missing regularization term. In the regularized case we force the mapping parameters to be close to the previous set, and in this way we cannot end up with results very far from the initial values. We conclude that if we do not regularize with respect to the previous parameter set, we can end up with a solution very far from the previous one.
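Structurally, regularizing with respect to the previous parameter set amounts to appending a penalty block to the extraction residual, so the least-squares solution is pulled toward the previous mapping parameters. A sketch follows, where the weight lam is a hypothetical stand-in for the thesis' own regularization scaling:

```python
import numpy as np

def regularized_residual(r, p, p_prev, lam=1e-3):
    """Append the penalty block lam * (p - p_prev) to the extraction
    residual r. lam is a hypothetical weight; the thesis defines its
    own regularization scaling."""
    p = np.asarray(p, dtype=float)
    p_prev = np.asarray(p_prev, dtype=float)
    return np.concatenate([np.asarray(r, dtype=float), lam * (p - p_prev)])
```

Minimizing the extended residual trades data misfit against distance from p_prev, which is exactly the mechanism that keeps the mapping parameters from drifting far from their previous values.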
Finally we consider the mapping parameters corresponding to figure 4.3.10. Here the approximation error for the surrogate is not as large as before. The mapping parameters for response function 2 are now:

    A_2 = [ 4.08  4.52
           -1.06  2.73 ],   b_2 = [ -231.01  -72.83 ]^T,   α_2 = -0.38
Generally the mapping parameters in this case still vary much from response to response, but apparently the surrogate approximation is better.
Direct Optimization
For comparison of the Space Mapping algorithm with classical optimization, we optimize the fine model by the two algorithms direct and directd from the SMIS framework implemented by Frank Pedersen.
[Figure 4.3.30: Performance of direct optimization of the fine model (direct left, directd right)]
The direct-algorithm uses a Broyden updated approximation of the first order partial derivatives, while the other uses the Jacobian directly from the fine model function. It is obvious that the Space Mapping algorithm is much more efficient than the classical optimization algorithms.
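Broyden's rank-one update, which direct uses in place of exact derivatives, modifies the current Jacobian approximation B just enough to satisfy the secant condition B_new @ dx = df:

```python
import numpy as np

def broyden_update(B, dx, df):
    """Broyden's rank-one update: change B minimally (in Frobenius
    norm) so that the secant condition B_new @ dx = df holds, where
    dx is the latest step and df the observed change in the residuals."""
    dx = np.asarray(dx, dtype=float)
    df = np.asarray(df, dtype=float)
    return B + np.outer(df - B @ dx, dx) / (dx @ dx)
```

One update makes the approximation exact along the latest step direction, which is how direct avoids evaluating the fine-model Jacobian at every iterate.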
Summary of the Results
By running the dierent versions of the Spae Mapping algorithm on this
test problem,wehave seen, thattheresults arevery dierent. Itis diult
to onlude, why the results look as they do, and impossible to generalize
the behaviour to other problem types. But for this partiular problem the
results showthe following:
•
Theregularization seemsto have a positive eet onboth the onver-genespeed and the optimalsurrogate aproximation.•
The normalization fators have only little eet on the onvergene speed, but a big inuene on the quality of thesurrogateapproxima-tion, whenwe don't useregularization.
•
Theweightingfatorsdo not seemto have anotiable eet oneithertheonvergene or the surrogateapproximation.
•
The redution of the mapping parameters still provides goodonver-generesults,although the tolerane options have an eet.
•
Theoptimal mapping parameters areinuened bytheregularizationterm.