Optimization with (in)equality constraints
Evelien van der Hurk
DTU Management Engineering
Today’s Topics in Optimization
• One variable without constraints
• Two or more variables without constraints
• Lagrange: Equality Constraints
• Equality and Inequality Constraints
Let’s start with a puzzle
To save the world from destruction, agent 007 has
to reach an island 50 meters off-shore from a point 100 meters further
along a straight beach and then disarm a timing device. The agent can run
along the shore at 5 meters per second, swim at 2 meters per second, and
disarm a timing device in 30 seconds. Can 007 save the world if the device
is set to trigger destruction in 73 seconds?
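The lecture solves this analytically later; as a preview, a small numerical sketch in Python (the modelling, run x meters along the beach and then swim straight to the island, is our reading of the puzzle, and the helper name mission_time is ours):

```python
import math

def mission_time(x):
    """Total time: run x meters along the beach at 5 m/s,
    swim straight to the island at 2 m/s, then disarm (30 s).
    The island is 50 m off-shore from a point 100 m along the beach."""
    run = x / 5.0
    swim = math.sqrt((100.0 - x) ** 2 + 50.0 ** 2) / 2.0
    return run + swim + 30.0

# Brute-force the running distance over a fine grid (0.01 m steps).
best_x = min((x / 100.0 for x in range(0, 10001)), key=mission_time)
print(f"best x = {best_x:.2f} m, total time = {mission_time(best_x):.2f} s")
# -> roughly x = 78.2 m and about 72.9 s: 007 makes it, barely.
```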
One variable without constraints
Two important theorems: (1) Fermat
The Fermat Theorem
If x̂ is a point of local minimum (or maximum) of a function f, then the main linear part of the increment is equal to zero.

[Figure: the parabola y = x²]
Example: verify Fermat

[Figure: the parabola y = x²]

Verify the Fermat theorem for min f(x) = x² → x̂ = 0.
Let x̂ be a point of local minimum of f.
Let x be a point arbitrarily close to x̂, and let h be the increment: x = x̂ + h.
The increment function:
f(x) − f(x̂) = (x̂ + h)² − x̂² = 2x̂h + h²
where 2x̂h is called the main linear part and h² is the remainder. This remainder is negligible for small enough h, e.g. h = 1/100 = 0.01 gives h² = 1/10000 = 0.0001.
Now, Fermat:

The Fermat Theorem
If x̂ is a point of local minimum (or maximum) of a function f, then the main linear part (here: 2x̂h) of the increment is equal to zero.

That the main linear part 2x̂h = 0 for each h implies x̂ = 0.
Fermat Theorem: necessary conditions for one variable without constraints

Consider a problem f(x) → extr, i.e. a one-variable optimization problem without constraints.

The Fermat Theorem
Assume that the function f : (x̂ − ε, x̂ + ε) → R is differentiable at x̂. If x̂ is a local extremum of the problem f(x) → extr, then f′(x̂) = 0.

Proof for x̂ a minimum.
f(x̂ + h) − f(x̂) ≥ 0 for all h with |h| sufficiently small:
• If h > 0, (f(x̂ + h) − f(x̂))/h ≥ 0. Letting h → 0 gives f′(x̂) ≥ 0.
• If h < 0, (f(x̂ + h) − f(x̂))/h ≤ 0. Letting h → 0 gives f′(x̂) ≤ 0.
Therefore f′(x̂) = 0.

Remarks:
• The proof can be reformulated to link the sign of the derivative to the increase/decrease of f(x).
• The condition is necessary, but not sufficient. Example: f(x) = x³, where f′(0) = 0 but 0 is not an extremum.
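The x³ remark is easy to check symbolically; a minimal sketch using sympy:

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = x**3

# Fermat's necessary condition holds at x = 0 ...
print(sp.diff(f, x).subs(x, 0))   # 0

# ... yet 0 is not a local extremum: f changes sign through 0.
print(f.subs(x, -sp.Rational(1, 10)), f.subs(x, sp.Rational(1, 10)))  # -1/1000, 1/1000
```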
Two important theorems: (2) Weierstrass

The Weierstrass Theorem
If a function f : [a, b] → R on a closed interval is continuous, then the problem
f(x) → min, a ≤ x ≤ b
has a point of global minimum.

Often the feasible set is unbounded; then we can use coercivity (for minimization):
lim_{|x|→+∞} f(x) = +∞

Corollary
If a function f : R → R is continuous and coercive for minimization, then the problem
f(x) → min, x ∈ R
has a point of global minimum.
Weierstrass: example

Show that there exists a solution for the problem:
f(x) = x + 5x⁻¹ → min, x > 0

Solution: Observe that
lim_{x→+∞} f(x) = +∞ and lim_{x↓0} f(x) = +∞

Now, it is possible to choose M > 0 large enough that f(x) > f(1) if x < 1/M or x > M.
Then we can add to the given problem the constraints 1/M ≤ x ≤ M.
Conclusion: f(x) → min, 1/M ≤ x ≤ M has a point of global minimum by the Weierstrass theorem, and this point also solves the original problem, since every x outside [1/M, M] is beaten by x = 1.
Unconstrained Optimization - summary

Problem
minimize f(x) subject to: x ∈ R

The Two Theorems
• Weierstrass, existence of a global solution, when either:
  • the function f : [a, b] → R on a closed interval is continuous, or
  • the function f : R → R is coercive for minimization and continuous.
• Fermat: Assume that the function f : (x̂ − ε, x̂ + ε) → R is differentiable at x̂. If x̂ is a local extremum of the problem f(x) → extr, then f′(x̂) = 0.
The Four-Step Method

The Four-Step Method for optimization
1 Model the problem and establish existence of global solutions (e.g. Weierstrass)
2 Write down the equation(s) of the first order necessary conditions (e.g. Fermat)
3 Investigate these equations
4 Write down the conclusion

Wait, what about the second order conditions?
• The second order conditions also form sufficient conditions to establish a global maximum or minimum for a problem with one variable – but the story becomes more complicated for functions with multiple variables.
• The four-step method is universal. It works for all optimization problems that can be solved analytically.
The Four-Step Method: example

Using the four-step method, solve the problem:
f(x) = x + 5x⁻¹ → min, x > 0

Solution:
1 lim_{x→+∞} f(x) = +∞ and lim_{x↓0} f(x) = +∞, and so existence of a global solution x̂ follows from the Weierstrass theorem.
2 Fermat: f′(x) = 0 → 1 − 5x⁻² = 0
3 x = √5 (note that x = −√5 contradicts x > 0)
4 x̂ = √5
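Steps 2–3 can be reproduced symbolically; a small sympy sketch (declaring x positive encodes the constraint x > 0):

```python
import sympy as sp

x = sp.symbols('x', positive=True)   # the constraint x > 0
f = x + 5 / x

stationary = sp.solve(sp.Eq(sp.diff(f, x), 0), x)   # steps 2-3: Fermat
print(stationary)                                   # [sqrt(5)]
print(f.subs(x, stationary[0]))                     # minimal value 2*sqrt(5)
```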
The Four-Step Method: In-class assignment

Using the four-step method, solve the problem:
f(x) = x/(x² + 1) → max

The Four-Step Method for optimization
1 Model the problem and establish existence of global solutions (e.g. Weierstrass)
2 Write down the equation(s) of the first order necessary conditions (e.g. Fermat)
3 Investigate these equations
4 Write down the conclusion
The Four-Step Method: In-class assignment

Using the four-step method, solve the problem:
f(x) = x/(x² + 1) → max

Solution:
1 lim_{|x|→+∞} f(x) = 0. There exists a number x̄ with f(x̄) > 0 (e.g. x̄ = 2, f(x̄) = 2/5). Then we can take a number N large enough that f(x′) < f(x̄) for x′ > N and x′ < −N. Therefore (as we are maximizing), we can find the maximum of f(x) on the closed interval [−N, N], and thus by the Weierstrass theorem there exists a global maximum x̂.
2 Fermat: f′(x) = 0 → (x² + 1 − x(2x))/(x² + 1)² = 0
3 x² + 1 − 2x² = 0 → x² = 1 → x = ±1. Compare the values of f at the candidate solutions 1 and −1: f(−1) = −1/2 and f(1) = 1/2.
4 x̂ = 1
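A numerical cross-check of the conclusion, using scipy (the defaults of minimize_scalar suffice here):

```python
from scipy.optimize import minimize_scalar

f = lambda x: x / (x**2 + 1)

# Maximize f by minimizing -f; Brent's method needs no derivatives.
res = minimize_scalar(lambda x: -f(x))
print(res.x, f(res.x))   # ~1.0 and ~0.5, matching x_hat = 1, f(x_hat) = 1/2
```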
Two or more variables without constraints

Historical remark: it took no less than two centuries before the invention of the derivative by Newton and Leibniz for functions of one variable was extended to functions of several variables.

The conclusion? The definition for functions of n variables can be given in exactly the same way as for functions of one variable (provided we use the approximation definition and make use of vectors):

f′(x̂) = (∂f(x̂)/∂x1, ..., ∂f(x̂)/∂xn)
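In practice the partial derivatives can be approximated componentwise by finite differences; a small sketch (the helper name gradient is ours):

```python
import numpy as np

def gradient(f, x, h=1e-6):
    """Forward-difference approximation of (df/dx_1, ..., df/dx_n) at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (f(x + e) - f(x)) / h
    return g

f = lambda x: x[0]**2 + x[1]**2      # |x|^2, used in the assignment below
print(gradient(f, [1.0, -2.0]))      # approximately [2, -4]
```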
Unconstrained Optimization - two or more variables

Problem
f(x) → extr with x an n-variable vector

The Two Theorems
• Weierstrass, existence:
  • a continuous function f on a compact set in Rⁿ attains its global maxima and minima
  • a continuous function f : Rⁿ → R is coercive (for minimization) if lim_{|x|→+∞} f(x) = +∞. A continuous function f : Rⁿ → R that is coercive for minimization has a global minimum.
• Fermat: Assume that f : Uⁿ(x̂, ε) → R is differentiable at x̂. If x̂ is a local extremum of the problem f(x) → extr, then f′(x̂) = 0, i.e. ∂f(x̂)/∂xj = 0 for j = 1, ..., n.
The Four-Step Method: In-class assignment

Using the four-step method, solve the problem:
f(x) = |x|² = x1² + x2² → min, x ∈ R²

The Four-Step Method for optimization
1 Model the problem and establish existence of global solutions (e.g. Weierstrass)
2 Write down the equation(s) of the first order necessary conditions (e.g. Fermat)
3 Investigate these equations
4 Write down the conclusion
A puzzle: Ice cream on the beach

Two ice cream stands on the beach
There is a 200 meter stretch of beach. Customers are uniformly distributed over the beach. Because the sand is hot, they will always go to the nearest ice cream stand.
Question: Should the town council of a seaside resort tell the sellers of ice cream on its beach where to stand? Compare:
• The sellers choose their own locations
• The locations are prescribed to the sellers
Unconstrained Optimization - alternative approach

Second Order Necessary Conditions
If x* is a local minimizer of f(x) and ∇²f(x) exists and is continuous in an open neighbourhood of x*, then
∇f(x*) = 0
∇²f(x*) is positive semidefinite

Second Order Sufficient Conditions
Suppose that ∇²f(x) is continuous in an open neighbourhood of x*. If the following two conditions are satisfied, then x* is a local minimum of f(x):
∇f(x*) = 0
∇²f(x*) is positive definite
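Positive (semi)definiteness of the Hessian is conveniently checked through its eigenvalues; a small numpy sketch, here for f(x) = 2x1² + x2² (the objective of the example in the next section), whose Hessian is constant:

```python
import numpy as np

# Hessian of f(x) = 2*x1**2 + x2**2 (constant in x)
H = np.array([[4.0, 0.0],
              [0.0, 2.0]])

eig = np.linalg.eigvalsh(H)   # eigenvalues of a symmetric matrix
print(eig)                    # [2. 4.]
print(np.all(eig > 0))        # True -> positive definite
```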
Optimization with Equality Constraints

Equality Constrained Optimization

Problem
minimize f0(x)
subject to: fi(x) = 0 ∀i = 1, 2, . . . , m
x ∈ Rⁿ
Equality Constrained Optimization

Example
minimize 2x1² + x2²
subject to: x1 + x2 = 1

• Let us first consider the unconstrained case
• Differentiate with respect to x1 and x2:
∂f(x1, x2)/∂x1 = 4x1
∂f(x1, x2)/∂x2 = 2x2
Equality Constrained Optimization

• Let us penalize ourselves for not satisfying the constraint
• This gives
L(x1, x2, λ1) = 2x1² + x2² + λ1(1 − x1 − x2)
• This is known as the Lagrangian of the problem
• Try to adjust the value λ1 so we use just the right amount of resource:
λ1 = 0 → get solution x1 = x2 = 0, 1 − x1 − x2 = 1
λ1 = 1 → get solution x1 = 1/4, x2 = 1/2, 1 − x1 − x2 = 1/4
λ1 = 2 → get solution x1 = 1/2, x2 = 1, 1 − x1 − x2 = −1/2
λ1 = 4/3 → get solution x1 = 1/3, x2 = 2/3, 1 − x1 − x2 = 0
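The λ1 experiment can be replayed exactly: for this objective, minimizing the Lagrangian over x gives 4x1 = λ1 and 2x2 = λ1 in closed form. A sketch in exact arithmetic (the helper name minimize_lagrangian is ours):

```python
from fractions import Fraction

def minimize_lagrangian(lam):
    """Unconstrained minimizer of L = 2*x1**2 + x2**2 + lam*(1 - x1 - x2):
    stationarity gives 4*x1 = lam and 2*x2 = lam."""
    lam = Fraction(lam)
    x1, x2 = lam / 4, lam / 2
    return x1, x2, 1 - x1 - x2        # last entry: constraint residual

for lam in [0, 1, 2, Fraction(4, 3)]:
    print(lam, minimize_lagrangian(lam))
# lam = 4/3 makes the residual 0: the constraint holds exactly.
```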
Equality Constrained Optimization: The Lagrange multiplier rule

Lagrange multiplier rule
Given a problem f0(x) → extr, fi(x) = 0, 1 ≤ i ≤ m.
Assume that this problem is smooth at x̂ in the following sense: the function f0 : Uⁿ(x̂, ε) → R is differentiable at x̂ and the functions fi : Uⁿ(x̂, ε) → R, 1 ≤ i ≤ m, are continuously differentiable at x̂.
If x̂ is a point of local extremum of this problem, then it is a stationary point of the Lagrange function of the problem for a suitable nonzero selection of Lagrange multipliers λ ∈ (R^{m+1})′, that is:
Lx(x̂, λ) = 0 ⟺ Σ_{i=0}^m λi fi′(x̂) = 0 ⟺ Σ_{i=0}^m λi ∂fi(x̂)/∂xj = 0, 1 ≤ j ≤ n

To remember it, you can write the Lagrange function L(x, λ) = Σ_{i=0}^m λi fi(x) and apply Fermat to it as if x were unconstrained.
The Lagrange multiplier rule

A few remarks:
• The clever idea behind the Lagrange method is to reverse the order used in Fermat: differentiate first and eliminate after
• The secret of its success lies in the idea of reducing the main task, a nonlinear elimination problem, to an easier task, a linear elimination problem
• In many applications, we will not need the values of the Lagrange multipliers themselves
• The Lagrange multiplier rule is a necessary condition – using the four-step method you do not need additional sufficient conditions
The Lagrange multiplier rule - in class exercise

Solve, for a given constant a > 0, the problem:
f0(x) = x1x2 → max, f1(x) = x1² + x2² − a² = 0, xi > 0, i = 1, 2
Equality Constrained Optimization

Assume x* minimizes the following:
minimize f(x)
subject to: hi(x) = 0 ∀i = 1, 2, . . . , m
x ∈ Rⁿ

The following two cases are possible:
1 The vectors ∇h1(x*), ∇h2(x*), . . . , ∇hm(x*) are linearly dependent
2 There exists a vector λ* such that
∂L(x*, λ*)/∂x1 = ∂L(x*, λ*)/∂x2 = . . . = ∂L(x*, λ*)/∂xn = 0
∂L(x*, λ*)/∂λ1 = ∂L(x*, λ*)/∂λ2 = . . . = ∂L(x*, λ*)/∂λm = 0
Case 1: Example

Example
minimize x1 + x2 + x3²
subject to: x1 = 1
x1² + x2² = 1

• The minimum is achieved at x1 = 1, x2 = 0, x3 = 0
• The Lagrangian is:
L(x1, x2, x3, λ1, λ2) = x1 + x2 + x3² + λ1(1 − x1) + λ2(1 − x1² − x2²)
• Observe that:
∂L(1, 0, 0, λ1, λ2)/∂x2 = 1 − 2λ2 · 0 = 1 ≠ 0 for every choice of λ1, λ2, so no multipliers exist. We are in case 1: ∇h1(x*) = (1, 0, 0) and ∇h2(x*) = (2, 0, 0) are linearly dependent.
Case 2: Example

Example
minimize 2x1² + x2²
subject to: x1 + x2 = 1

• The Lagrangian is:
L(x1, x2, λ1) = 2x1² + x2² + λ1(1 − x1 − x2)
• Solve for the following:
∂L(x1*, x2*, λ1*)/∂x1 = 4x1* − λ1* = 0
∂L(x1*, x2*, λ1*)/∂x2 = 2x2* − λ1* = 0

Case 2: Example continued
• Solving this system of equations together with the constraint x1 + x2 = 1 yields x1* = 1/3, x2* = 2/3, λ1* = 4/3
• Is this a minimum or a maximum?
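The same system (two stationarity equations plus the constraint) can be handed to a symbolic solver; a small sympy sketch:

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam')
eqs = [sp.Eq(4*x1 - lam, 0),         # dL/dx1 = 0
       sp.Eq(2*x2 - lam, 0),         # dL/dx2 = 0
       sp.Eq(x1 + x2, 1)]            # the constraint
print(sp.solve(eqs, [x1, x2, lam]))  # {x1: 1/3, x2: 2/3, lam: 4/3}
```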
Graphically

[Figure: level curves of f(x1, x2) = 2x1² + x2² and the constraint line x1 + x2 = 1; at the optimum x1* = 1/3, x2* = 2/3 the level curve is tangent to the line and ∇f(x*) = λ*∇h(x*)]
Geometric Interpretation

• Consider the gradients of f and h at the optimal point
• They must point in the same direction, though they may have different lengths:
∇f(x*) = λ*∇h(x*)
• Along with feasibility of x*, this is the condition ∇L(x*, λ*) = 0
• From the example, at x1* = 1/3, x2* = 2/3, λ1* = 4/3:
∇f(x1*, x2*) = (4x1*, 2x2*) = (4/3, 4/3)
∇h1(x1*, x2*) = (1, 1)
Geometric Interpretation

• ∇f(x) points in the direction of steepest ascent
• −∇f(x) points in the direction of steepest descent
• In two dimensions:
  • ∇f(x°) is perpendicular to a level curve of f
  • ∇hi(x°) is perpendicular to the level curve hi(x°) = 0
Equality, Inequality Constrained Optimization

Inequality Constraints

General Problem
maximize f(x)
subject to: gi(x) ≤ 0 (µi) ∀i ∈ I
hj(x) = 0 (λj) ∀j ∈ J

• Given a feasible solution x°, the set of binding constraints is:
I = {i : gi(x°) = 0}
The Lagrangian

L(x, λ, µ) = f(x) + Σ_{i=1}^m µi (0 − gi(x)) + Σ_{j=1}^k λj (0 − hj(x))
Inequality Constrained Optimization

Assume x* maximizes the following:
maximize f(x)
subject to: gi(x) ≤ 0 (µi) ∀i ∈ I
hj(x) = 0 (λj) ∀j ∈ J

The following two cases are possible:
1 ∇h1(x*), . . . , ∇hk(x*), ∇g1(x*), . . . , ∇gm(x*) are linearly dependent
2 There exist vectors λ* and µ* such that
∇f(x*) − Σ_{j=1}^k λj ∇hj(x*) − Σ_{i=1}^m µi ∇gi(x*) = 0
Inequality Constrained Optimization

• These conditions are known as the Karush-Kuhn-Tucker (KKT) conditions
• We look for candidate solutions x* for which we can find λ* and µ*
• Solve these equations using complementary slackness:
  • At optimality some constraints will be binding and some will be slack
  • Slack constraints will have a corresponding µi of zero
  • Binding constraints can be treated using the Lagrangian
Constraint qualification

KKT constraint qualification
∇gi(x°) for i ∈ I are linearly independent

Slater constraint qualification
• gi(x) for i ∈ I are convex functions
• A non-boundary point exists: gi(x) < 0 for i ∈ I
Case 1 Example

The Problem
maximize x
subject to: y ≤ (1 − x)³
y ≥ 0

• Consider the global max: (x, y) = (1, 0)
• After reformulation (g1(x, y) = y − (1 − x)³ ≤ 0, g2(x, y) = −y ≤ 0), the gradients are
∇f(x, y) = (1, 0)
∇g1 = (3(x − 1)², 1)
∇g2 = (0, −1)

[Figure: the feasible region between y = 0 and y = (1 − x)³, with the optimum at (1, 0)]
Case 1 Example

At (1, 0) we get:
(1, 0) − µ1 (0, 1) − µ2 (0, −1)
whose first component equals 1 for every choice of µ1 and µ2.
• Hence no µ1 and µ2 exist such that:
∇f(x, y) − Σ_{i=1}^2 µi ∇gi(x, y) = 0
Case 2 Example

The Problem
maximize −(x − 2)² − 2(y − 1)²
subject to: x + 4y ≤ 3
x ≥ y

The Problem (Rearranged)
maximize −(x − 2)² − 2(y − 1)²
subject to: x + 4y ≤ 3
−x + y ≤ 0
Case 2 Example

• The Lagrangian is:
L(x, y, µ1, µ2) = −(x − 2)² − 2(y − 1)² + µ1(3 − x − 4y) + µ2(0 + x − y)
• This gives the following KKT conditions:
∂L/∂x = −2(x − 2) − µ1 + µ2 = 0
∂L/∂y = −4(y − 1) − 4µ1 − µ2 = 0
µ1(3 − x − 4y) = 0
µ2(x − y) = 0
µ1, µ2 ≥ 0
Case 2 Example

We have two complementarity conditions → check 4 cases
1 µ1 = µ2 = 0 → x = 2, y = 1 (infeasible: x + 4y = 6 > 3)
2 µ1 = 0, x − y = 0 → x = y = 4/3, µ2 = −4/3 (invalid: µ2 < 0)
3 3 − x − 4y = 0, µ2 = 0 → x = 5/3, y = 1/3, µ1 = 2/3
4 3 − x − 4y = 0, x − y = 0 → x = y = 3/5, µ1 = 22/25, µ2 = −48/25 (invalid: µ2 < 0)

The optimal solution is therefore x* = 5/3, y* = 1/3, with f(x*, y*) = −1.
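The four complementarity cases can be enumerated mechanically; a sympy sketch that solves each case and screens for feasibility and µ ≥ 0 (only case 3 survives):

```python
import sympy as sp

x, y, m1, m2 = sp.symbols('x y mu1 mu2')
stationarity = [sp.Eq(-2*(x - 2) - m1 + m2, 0),
                sp.Eq(-4*(y - 1) - 4*m1 - m2, 0)]

# Each case picks, per constraint, "multiplier = 0" or "constraint binding".
for extra in ([sp.Eq(m1, 0), sp.Eq(m2, 0)],
              [sp.Eq(m1, 0), sp.Eq(x - y, 0)],
              [sp.Eq(3 - x - 4*y, 0), sp.Eq(m2, 0)],
              [sp.Eq(3 - x - 4*y, 0), sp.Eq(x - y, 0)]):
    sol = sp.solve(stationarity + extra, [x, y, m1, m2], dict=True)[0]
    feasible = sol[x] + 4*sol[y] <= 3 and sol[x] >= sol[y]
    print(sol, "feasible:", feasible,
          "mu >= 0:", sol[m1] >= 0 and sol[m2] >= 0)
```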
Inequality Example

The Problem
minimize (x − 3)² + (y − 2)²
subject to: x² + y² ≤ 5
x + 2y ≤ 4
x, y ≥ 0

The Problem (Rearranged)
maximize −(x − 3)² − (y − 2)²
subject to: x² + y² − 5 ≤ 0
x + 2y − 4 ≤ 0
−x ≤ 0
−y ≤ 0
Inequality Example

• The gradients are:
∇f(x, y) = (6 − 2x, 4 − 2y)
∇g1(x, y) = (2x, 2y)
∇g2(x, y) = (1, 2)
∇g3(x, y) = (−1, 0)
∇g4(x, y) = (0, −1)
Inequality Example

• Consider the point (x, y) = (2, 1)
• It is feasible, with binding set I = {1, 2}
• This gives
(2, 2) − µ1 (4, 2) − µ2 (1, 2) = (0, 0)
• µ1 = 1/3, µ2 = 2/3 satisfy this
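With the binding set fixed, the multipliers solve a small linear system ∇f = µ1∇g1 + µ2∇g2; a numpy sketch:

```python
import numpy as np

# Columns: gradients of the binding constraints g1, g2 at (2, 1).
A = np.array([[4.0, 1.0],
              [2.0, 2.0]])
grad_f = np.array([2.0, 2.0])    # gradient of f at (2, 1)

mu = np.linalg.solve(A, grad_f)  # solve mu1*grad_g1 + mu2*grad_g2 = grad_f
print(mu)                        # [0.333... 0.666...] -> mu1 = 1/3, mu2 = 2/3
```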
Sufficient condition

General Problem
maximize f(x)
subject to: gi(x) ≤ 0 ∀i ∈ I

Theorem
If f(x) is concave and gi(x) for i ∈ I are convex functions, then a feasible KKT point is optimal.

• An equality constraint is equivalent to two inequality constraints:
hj(x) = 0 ⇔ hj(x) ≤ 0 and −hj(x) ≤ 0
• The corresponding two nonnegative multipliers may be combined into one free multiplier
Equality constraints

General Problem
maximize f(x)
subject to: gi(x) ≤ 0 ∀i ∈ I
hj(x) = 0 ∀j ∈ J

• Let x° be a feasible solution
• As before, I = {i : gi(x°) = 0}
• Assume constraint qualification holds

Equality constraints

KKT Necessary Optimality Conditions
If x° is a local maximum, there exist multipliers µi ≥ 0 ∀i ∈ I and λj ∀j ∈ J such that
∇f(x°) − Σ_{i∈I} µi ∇gi(x°) − Σ_{j∈J} λj ∇hj(x°) = 0

KKT Sufficient Optimality Conditions
If f(x) is concave, gi(x) ∀i ∈ I are convex functions and hj ∀j ∈ J are affine, then a feasible KKT point is optimal.
KKT Conditions - Summary

General Problem
maximize f(x)
subject to: gi(x) ≤ 0 ∀i ∈ I
hj(x) = 0 ∀j ∈ J

KKT conditions
∇f(x°) − Σ_i µi ∇gi(x°) − Σ_j λj ∇hj(x°) = 0
µi gi(x°) = 0 ∀i ∈ I
µi ≥ 0 ∀i ∈ I
x° feasible
Alternative Formulation

General Problem
maximize f(x)
subject to: g(x) ≤ 0
h(x) = 0

KKT Conditions
∇f(x°) − µ∇g(x°) − λ∇h(x°) = 0
µ g(x°) = 0
µ ≥ 0
x° feasible
Class Exercise 1

The Problem
maximize ln(x + 1) + y
subject to: 2x + y ≤ 3
x, y ≥ 0

Class Exercise 2

The Problem
minimize x² + y²
subject to: x² + y² ≤ 5
x + 2y = 4
x, y ≥ 0

Class Exercise 3

Write the KKT conditions for
maximize cᵀx
subject to: Ax ≤ b
x ≥ 0
Richard M. Lusby
DTU Management Engineering, Technical University of Denmark
Building 424, Room 208
2800 Kgs. Lyngby, Denmark
rmlu@dtu.dk
phone +45 4525 3084
http://www.man.dtu.dk