Convexity and Optimization

(1)


Richard Lusby

Department of Management Engineering, Technical University of Denmark

(2)

Today’s Material

Extrema

Convex Function

Convex Sets

Other Convexity Concepts

Unconstrained Optimization

(3)

Extrema

Problem

max f(x) s.t. x ∈ S, or equivalently max{f(x) : x ∈ S}

Global maximum x^∗:

f(x^∗) ≥ f(x) ∀x ∈ S

Local maximum x^o:

f(x^o) ≥ f(x) ∀x in a neighborhood around x^o

(4)

Weierstrass theorem:

Theorem

A continuous function attains its maximum and minimum on a nonempty closed and bounded set in Rⁿ.

(5)

Supremum and Infimum


Supremum

The supremum of a set S having a partial order is the least upper bound of S (if it exists) and is denoted sup S.

Infimum

The infimum of a set S having a partial order is the greatest lower bound of S (if it exists) and is denoted inf S.

If the extrema are not achieved:

- max → sup

- min → inf

Examples
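As a hedged illustration of replacing max by sup, take f(x) = x on the open interval S = (0, 1). This example is chosen here and is not taken from the slides. The supremum is 1, but no point of S attains it, so the maximum does not exist. The sketch samples points ever closer to 1; the helper name `sample_values` is hypothetical.

```python
def sample_values(k_max):
    """Return f(x) = x evaluated at x_k = 1 - 10**(-k), k = 1..k_max, all inside (0, 1)."""
    return [1.0 - 10.0 ** (-k) for k in range(1, k_max + 1)]

values = sample_values(8)
sup_candidate = 1.0  # least upper bound of f over (0, 1); not attained

# Every sampled value stays strictly below the supremum ...
all_below = all(v < sup_candidate for v in values)
# ... yet the gap to the supremum shrinks toward zero.
gap = sup_candidate - max(values)
print(all_below, gap)
```

On the closed interval [0, 1] the same function attains its maximum at x = 1, which is the situation covered by the Weierstrass theorem.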

(6)

Finding Optimal Solutions

Necessary and Sufficient Conditions

Every method for finding and characterizing optimal solutions is based on optimality conditions, either necessary or sufficient.

Necessary Condition

A condition C1(x) is necessary if C1(x^∗) is satisfied by every optimal solution x^∗ (and possibly some other solutions as well).

Sufficient Condition

A condition C2(x) is sufficient if C2(x^∗) ensures that x^∗ is optimal (but some optimal solutions may not satisfy C2(x^∗)).

Mathematically

{x|C2(x)} ⊆ {x|x optimal solution} ⊆ {x|C1(x)}

(7)

Finding Optimal Solutions

An example of a necessary condition when S is “well-behaved”: there is no improving feasible direction.

Feasible Direction

Consider x^o ∈ S. A vector s ∈ Rⁿ is called a feasible direction if there exists ε(s) > 0 such that

x^o + εs ∈ S ∀ε: 0 < ε ≤ ε(s)

We denote the cone of feasible directions from x^o in S as S(x^o).

Improving Direction

s ∈ Rⁿ is called an improving direction (for a minimization problem) if there exists ε(s) > 0 such that

f(x^o + εs) < f(x^o) ∀ε: 0 < ε ≤ ε(s)

(8)

Finding Optimal Solutions

Local Optima

If x^o is a local minimum, then there exists no s ∈ S(x^o) for which f(.) decreases along s, i.e. for which

f(x^o + ε₂s) < f(x^o + ε₁s) for 0 ≤ ε₁ < ε₂ ≤ ε(s)

Stated otherwise: a necessary condition for local optimality is

F(x^o) ∩ S(x^o) = ∅

where F(x^o) denotes the cone of improving directions at x^o.

(9)

Improving Feasible Directions

If for a given direction s it holds that

∇f(x^o)s < 0

then s is an improving direction.

A well-known necessary condition for local optimality of x^o for a differentiable function (when x^o is an interior point of S, so that every direction is feasible):

∇f(x^o) = 0

In other words, x^o is stationary with respect to f(.)
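The link between a negative directional derivative and an improving direction can be checked numerically. The sketch below is only an illustration: the objective f(x) = x₁² + 2x₂², the point x^o = (1, 1), and the direction s = (−1, 0) are assumptions made here, not part of the slides.

```python
def f(x):
    """Illustrative objective (an assumption, not from the slides): f(x) = x1^2 + 2*x2^2."""
    return x[0] ** 2 + 2.0 * x[1] ** 2

def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * x[0], 4.0 * x[1]]

def directional_derivative(x, s):
    """Inner product of the gradient at x with the direction s."""
    return sum(g * si for g, si in zip(grad_f(x), s))

x_o = [1.0, 1.0]
s = [-1.0, 0.0]

slope = directional_derivative(x_o, s)  # 2*1*(-1) + 4*1*0 = -2

# For small step sizes eps, f should decrease along s because slope < 0.
steps = [10.0 ** (-k) for k in range(1, 6)]
decreases = [f([xi + eps * si for xi, si in zip(x_o, s)]) < f(x_o) for eps in steps]
print(slope, all(decreases))
```

The converse does not hold in general: a direction with ∇f(x^o)s = 0 may or may not be improving, which is why second-order information is needed at stationary points.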

(10)

What if Stationarity is not Enough?

Suppose f is twice continuously differentiable.

Analyse the Hessian matrix for f at x^o:

∇²f(x^o) = [ ∂²f(x^o) / ∂x_i∂x_j ], i,j = 1, …, n

Sufficient Condition

If ∇f(x^o) = 0 and ∇²f(x^o) is positive definite:

x^T∇²f(x^o)x > 0 ∀x ∈ Rⁿ\{0}

then x^o is a local minimum.
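The reasoning behind the sufficient condition can be sketched with a second-order Taylor expansion around x^o. This is the standard argument, stated under the assumption already made above that f is twice continuously differentiable; the symbol λ_min for the smallest eigenvalue of the Hessian is introduced here.

```latex
f(x^o + s) = f(x^o) + \nabla f(x^o)\, s + \tfrac{1}{2}\, s^T \nabla^2 f(x^o)\, s + o(\|s\|^2),
\qquad
s^T \nabla^2 f(x^o)\, s \;\ge\; \lambda_{\min} \|s\|^2 \quad \text{with } \lambda_{\min} > 0 .
```

With ∇f(x^o) = 0 the linear term vanishes, and the quadratic term dominates the remainder for small s, so f(x^o + s) > f(x^o) for all sufficiently small s ≠ 0.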

(11)

What if Stationarity is not Enough?

continued

A necessary condition for local optimality is “stationarity + positive semidefiniteness of ∇²f(x^o)”.

Note that positive definiteness is not a necessary condition

- E.g. look at f(x) = x⁴ for x^o = 0

Similar statements hold for maximization problems

- Key concept here is negative definiteness

(12)

Definiteness of a Matrix

A number of criteria regarding the definiteness of a matrix exist.

A symmetric n×n matrix A is positive definite if and only if

x^TAx > 0 ∀x ∈ Rⁿ\{0}

Positive semidefinite is defined likewise with “≥” instead of “>”.

Negative (semi)definite is defined by reversing the inequality signs to “<” and “≤”, respectively.

Necessary conditions for positive definiteness:

- A is regular with det(A) > 0

- A⁻¹ is positive definite

(13)

Definiteness of a Matrix

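Necessary + sufficient conditions for positive definiteness:

Sylvester’s Criterion: all leading principal submatrices have positive determinants

$\det(a_{11}) > 0, \quad \det\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} > 0, \quad \det\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} > 0, \quad \dots$

All eigenvalues of A are positive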
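Sylvester's criterion is easy to try out on small matrices. The following is a minimal sketch in plain Python, assuming the input is symmetric; the helper names `det`, `leading_minors`, and `is_positive_definite` and the two test matrices are chosen here for illustration.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_minors(a):
    """Determinants of the k-by-k upper-left submatrices, k = 1..n."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]

def is_positive_definite(a):
    """Sylvester's criterion for a symmetric matrix: all leading principal minors > 0."""
    return all(d > 0 for d in leading_minors(a))

A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]   # illustrative symmetric matrix (an assumption)
B = [[1.0, 2.0],
     [2.0, 1.0]]         # symmetric but indefinite: det(B) = -3

print(leading_minors(A), is_positive_definite(A))
print(leading_minors(B), is_positive_definite(B))
```

Laplace expansion costs factorial time, so for larger matrices a Cholesky factorization or an eigenvalue routine from a numerical library would be the more usual choice.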

(14)

Necessary and Sufficient Conditions

Differentiable f(.)

Theorem

Suppose that f(.) is differentiable at a local minimum x^o. Then ∇f(x^o)s ≥ 0 for s ∈ S(x^o). If f(.) is twice differentiable at x^o and ∇f(x^o) = 0, then s^T∇²f(x^o)s ≥ 0 ∀s ∈ S(x^o).

Theorem

Suppose that S is convex and non-empty, f(.) differentiable, and that x^o ∈ S. Suppose furthermore that f(.) is convex.

- x^o is a local minimum if and only if x^o is a global minimum

- x^o is a local (and hence global) minimum if and only if

∇f(x^o)(x − x^o) ≥ 0 ∀x ∈ S

(15)

Convex Combination

Convex combination

The convex combinations of two points x1 and x2 form the line segment between them:

α1x1 + α2x2 for α1, α2 ≥ 0 and α1 + α2 = 1

(16)

Convex function

Convex Functions

A convex function lies below its chord:

f(α1x1 + α2x2) ≤ α1f(x1) + α2f(x2) for α1, α2 ≥ 0 and α1 + α2 = 1

A strictly convex function has no more than one minimum

Examples: y = x², y = x⁴, y = x

The sum of convex functions is also convex

A differentiable convex function lies above its tangent

A twice differentiable function is convex if its Hessian is positive semidefinite everywhere on its (convex) domain

- Strictly convex is not analogous: a positive definite Hessian implies strict convexity, but not conversely (e.g. y = x⁴ at x = 0)

A function f is concave iff −f is convex
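The chord inequality can be tested numerically on the slide's one-dimensional examples. The sketch below is a randomized check under assumptions made here (the interval [−3, 3], the tolerance, and the helper names are hypothetical). Such a check can refute convexity by finding a violating pair, but it can never prove it.

```python
import random

def violates_chord(f, x1, x2, alpha, tol=1e-9):
    """True if f(a*x1 + (1-a)*x2) exceeds a*f(x1) + (1-a)*f(x2) by more than tol."""
    lhs = f(alpha * x1 + (1.0 - alpha) * x2)
    rhs = alpha * f(x1) + (1.0 - alpha) * f(x2)
    return lhs > rhs + tol

def looks_convex(f, lo=-3.0, hi=3.0, trials=2000, seed=0):
    """Randomized check of the chord inequality on [lo, hi]."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x2, alpha = rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        if violates_chord(f, x1, x2, alpha):
            return False
    return True

print(looks_convex(lambda x: x ** 2))   # slide example y = x^2
print(looks_convex(lambda x: x ** 4))   # slide example y = x^4
print(looks_convex(lambda x: x))        # slide example y = x
print(looks_convex(lambda x: x ** 3))   # not convex on [-3, 3]
```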

(17)

Economic Order Quantity Model

The problem

The Economic Order Quantity Model is an inventory model that helps manufacturers, retailers, and wholesalers determine how they should optimally replenish their stock levels.

Costs

K = setup cost for ordering one batch

c = unit cost for producing/purchasing

h = holding cost per unit per unit of time in inventory

Assumptions

d = a known constant demand rate
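The cost function itself is not stated above. As a minimal sketch, assume the standard textbook EOQ setting (no shortages, instantaneous replenishment of Q units per order), in which the cost per unit time is T(Q) = dK/Q + dc + hQ/2. Under that assumption T is convex for Q > 0, since T″(Q) = 2dK/Q³ > 0, so the stationary point Q^∗ = √(2dK/h) is the global minimum. The numerical data below are made up for illustration.

```python
import math

def eoq_cost(q, d, K, c, h):
    """Cost per unit time under the assumed standard EOQ model: setup + purchase + holding."""
    return d * K / q + d * c + h * q / 2.0

def eoq_quantity(d, K, h):
    """Stationary point of eoq_cost in q: solves -d*K/q**2 + h/2 = 0."""
    return math.sqrt(2.0 * d * K / h)

# Hypothetical data, chosen only for illustration.
d, K, c, h = 100.0, 50.0, 2.0, 4.0

q_star = eoq_quantity(d, K, h)           # sqrt(2*100*50/4) = 50
best = eoq_cost(q_star, d, K, c, h)      # 100 + 200 + 100 = 400

# Convexity in q means no other order quantity on this grid does better.
grid = [1.0 + 0.5 * i for i in range(400)]
no_better = all(eoq_cost(q, d, K, c, h) >= best - 1e-9 for q in grid)
print(q_star, best, no_better)
```

Note that the unit cost c shifts the total cost but does not affect the optimal order quantity in this model.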

(18)

Convex sets

Definition

A convex set contains all convex combinations of its elements:

α₁x₁ + α₂x₂ ∈ S ∀x₁, x₂ ∈ S and ∀α₁, α₂ ≥ 0 with α₁ + α₂ = 1

Some examples: (1,2], {(x,y) : x² + y² < 4}, ∅

Level curve (2 dimensions):

{(x,y) : f(x,y) = β}

Level set:

{x : f(x) ≤ β}

(19)

Lower Level Set Example

[Figure: plot of 2x² + 4y²]

(20)

Lower Level Set Example

[Figure: contour plot of 2x² + 4y² over x, y ∈ [−4, 4], with level curves from 10 to 140]

(21)

Upper Level Set Example

[Figure: plot of 54x − 9x² + 78y − 13y²]

(22)

Upper Level Set Example

[Figure: contour plot of 54x − 9x² + 78y − 13y², with level curves from 20 to 180]

(23)

Example

[Figure: plot of Z=_exp(x^7xy2+y²)]

(24)

Example

[Figure: contour plot of Z=_exp(x^7xy2+y²) over [−2, 2] × [−2, 2]]

(25)

Convexity, Concavity, and Optima

Constrained optimization problems

Theorem

Suppose that S is convex and that f(x) is convex on S for the problem min_{x∈S} f(x). Then

- If x^∗ is locally minimal, then x^∗ is globally minimal

- The set X^∗ of global optimal solutions is convex

- If f is strictly convex, then x^∗ is unique

(26)

Examples

Problem 1

Minimize −x₂ ln x₁ + x₁/9 + x₂²

Subject to:

1.0 ≤ x₁ ≤ 5.0

0.6 ≤ x₂ ≤ 3.6
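Whether Problem 1 is a convex problem can be examined through the Hessian of the objective, read here as f(x₁, x₂) = −x₂ ln x₁ + x₁/9 + x₂². The sketch below applies Sylvester's criterion on a grid over the feasible box. The grid resolution and helper names are assumptions, and a grid check is evidence rather than a proof. Analytically the determinant of the Hessian is (2x₂ − 1)/x₁², which is positive whenever x₂ > 0.5.

```python
import math

def hessian(x1, x2):
    """Hessian entries (h11, h12, h22) of f(x1, x2) = -x2*ln(x1) + x1/9 + x2**2."""
    return x2 / x1 ** 2, -1.0 / x1, 2.0

def gradient(x1, x2):
    """Gradient of the same f."""
    return -x2 / x1 + 1.0 / 9.0, -math.log(x1) + 2.0 * x2

def positive_definite_2x2(h11, h12, h22):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return h11 > 0 and h11 * h22 - h12 * h12 > 0

# Grid over the feasible box 1.0 <= x1 <= 5.0, 0.6 <= x2 <= 3.6.
n = 40
grid = [(1.0 + 4.0 * i / n, 0.6 + 3.0 * j / n) for i in range(n + 1) for j in range(n + 1)]

convex_on_grid = all(positive_definite_2x2(*hessian(x1, x2)) for x1, x2 in grid)

# df/dx1 = 0 requires x1 = 9*x2 >= 5.4 > 5.0, so df/dx1 < 0 everywhere in the box.
dx1_negative = all(gradient(x1, x2)[0] < 0 for x1, x2 in grid)
print(convex_on_grid, dx1_negative)
```

Since ∂f/∂x₁ is negative throughout the box, the minimizer lies on the boundary x₁ = 5.0; setting ∂f/∂x₂ = 0 there gives x₂ = ln(5)/2 ≈ 0.805, which satisfies the bounds on x₂.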

(27)

What does the function look like?

[Figure: plot of −x₂ ln(x₁) + x₁/9 + x₂²]

(28)

Problem 2

Minimize Σ_{i=1}^{3} −i ln x_i

Subject to:

Σ_{i=1}^{3} x_i = 6

x_i ≤ 3.5, i = 1,2,3

x_i ≥ 1.5, i = 1,2,3
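Problem 2 has a convex objective (each −i ln x_i is convex) and a convex feasible set, so the condition ∇f(x^o)(x − x^o) ≥ 0 ∀x ∈ S characterizes a global minimum. The sketch below tests a candidate point against that condition on random feasible points. The candidate is derived here, not in the slides: ignoring the bounds, stationarity on the plane Σx_i = 6 gives x = (1, 2, 3), which violates x₁ ≥ 1.5; fixing x₁ = 1.5 and repeating the argument for x₂ + x₃ = 4.5 gives x₂ = 1.8 and x₃ = 2.7.

```python
import random

def grad(x):
    """Gradient of f(x) = -sum_{i=1..3} i*ln(x_i): component i is -i/x_i."""
    return [-(i + 1) / xi for i, xi in enumerate(x)]

def feasible(x, lo=1.5, hi=3.5):
    """Bounds check; the equality constraint is enforced by construction below."""
    return all(lo <= xi <= hi for xi in x)

# Candidate point with the bound x1 >= 1.5 active (derived in the lead-in, an assumption to verify).
x_star = [1.5, 1.8, 2.7]
g = grad(x_star)

rng = random.Random(0)
worst = float("inf")
samples = 0
while samples < 5000:
    x1, x2 = rng.uniform(1.5, 3.5), rng.uniform(1.5, 3.5)
    x = [x1, x2, 6.0 - x1 - x2]          # satisfies x1 + x2 + x3 = 6
    if not feasible(x):
        continue
    samples += 1
    # Smallest observed value of grad f(x*) . (x - x*) over feasible samples.
    worst = min(worst, sum(gi * (xi - si) for gi, xi, si in zip(g, x, x_star)))

print(worst >= -1e-9)
```

On the feasible set the inner product reduces to (4/9)(x₁ − 1.5), which is nonnegative because x₁ ≥ 1.5, so the sampled check agrees with the analytic one.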

(29)

Class exercises

Show that f(x) = ||x|| = √(Σ_i x_i²) is convex

Prove that any level set of a convex function is a convex set

(30)

Other Types of Convexity

The idea of pseudoconvexity of a function is to extend the class of functions for which stationarity is a sufficient condition for global optimality. Pseudoconvexity is defined for a differentiable function f on an open set X.

A differentiable function f is pseudoconvex if

∇f(x)·(x′ − x) ≥ 0 ⇒ f(x′) ≥ f(x) ∀x, x′ ∈ X

or alternatively

f(x′) < f(x) ⇒ ∇f(x)·(x′ − x) < 0 ∀x, x′ ∈ X

A function f is pseudoconcave iff −f is pseudoconvex

Note that if f is convex and differentiable, and X is open, then f is also pseudoconvex

(31)

Other Types of Convexity

A function is quasiconvex if all lower level sets are convex. That is, the following sets are convex:

S′ = {x : f(x) ≤ β}

A function is quasiconcave if all upper level sets are convex. That is, the following sets are convex:

S′ = {x : f(x) ≥ β}

Note that if f is convex and differentiable, and X is open, then f is also quasiconvex

Convexity properties

(32)

Exercises

Show the following

f(x) = x + x³ is pseudoconvex but not convex

f(x) = x³ is quasiconvex but not pseudoconvex

(33)

Graphically

[Figure: plot of x³ + x and x³]

(34)

Exercises


Convexity Questions

Can a function be both convex and concave?

Is a convex function of a convex function convex?

Is a convex combination of convex functions convex?

Is the intersection of convex sets convex?

(35)

Unconstrained problem

min f(x) s.t. x ∈ Rⁿ

Necessary optimality condition for x^o to be a local minimum

∇f(x^o) = 0 and H(x^o) is positive semidefinite

Sufficient optimality condition for x^o to be a local minimum

∇f(x^o) = 0 and H(x^o) is positive definite

Necessary and sufficient

(36)

Unconstrained example

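min f(x) = (x² − 1)³

f′(x) = 6x(x² − 1)² = 0 for x = 0, ±1

H(x) = 24x²(x² − 1) + 6(x² − 1)²

H(0) = 6 and H(±1) = 0

Therefore x = 0 is a local minimum (actually the global minimum)

x = ±1 are stationary inflection points (the one-dimensional analogue of saddle points): the second-order test is inconclusive there, and f′ does not change sign at either point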
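The derivative values in this example can be reproduced with a few lines of Python. This is only a sketch of the check, with the step size `eps` chosen arbitrarily; it evaluates f′ and H at the three stationary points and looks at the sign of f′ on either side of x = 0 and x = 1.

```python
def f(x):
    return (x * x - 1.0) ** 3

def df(x):
    """First derivative: 6x(x^2 - 1)^2."""
    return 6.0 * x * (x * x - 1.0) ** 2

def d2f(x):
    """Second derivative H(x) = 24x^2(x^2 - 1) + 6(x^2 - 1)^2."""
    return 24.0 * x * x * (x * x - 1.0) + 6.0 * (x * x - 1.0) ** 2

stationary = [-1.0, 0.0, 1.0]
first = [df(x) for x in stationary]      # all zero
second = [d2f(x) for x in stationary]    # 0, 6, 0

# Around x = 1 the derivative keeps its sign, so x = 1 is not a local minimum or maximum.
eps = 1e-3
sign_unchanged_at_1 = df(1.0 - eps) > 0 and df(1.0 + eps) > 0
# Around x = 0 the derivative changes from negative to positive: a local minimum.
sign_change_at_0 = df(-eps) < 0 and df(eps) > 0
print(first, second, sign_unchanged_at_1, sign_change_at_0)
```

The global claim follows from x² − 1 ≥ −1, which gives f(x) ≥ −1 = f(0) for all x.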

(37)

What does the function look like?

[Figure: plot of f(x) = (x² − 1)³]

(38)

Class Exercise

Problem

Suppose A is an m×n matrix and b is a given m-vector. Find min ||Ax − b||² over x ∈ Rⁿ.