Total Variation in Image Analysis (The Homo Erectus Stage?)
François Lauze
Department of Computer Science, University of Copenhagen
Hólar Summer School on Sparse Coding, August 2010
Outline

1 Motivation
  Origin and uses of Total Variation
  Denoising
  Tikhonov regularization
  1-D computation on step edges
2 Total Variation I
  First definition
  Rudin-Osher-Fatemi
  Inpainting/Denoising
3 Total Variation II
  Relaxing the derivative constraints
  Definition in action
  Using the new definition in denoising: Chambolle algorithm
  Image Simplification
4 Bibliography
5 The End
Motivation
Origin and uses of Total Variation
In mathematics: the Plateau problem of minimal surfaces, i.e. surfaces of minimal area with a given boundary.
In image analysis: denoising, image reconstruction, segmentation...
A ubiquitous prior for many image processing tasks.
Denoising
Determine an unknown image from a noisy observation.
Methods
All methods are based on some form of statistical inference:
  Fourier/wavelets
  Markov random fields
  variational and partial differential equation (PDE) methods
  ...
We focus on variational and PDE methods.
A simple corruption model

A digital image $u$ of size $N \times M$ pixels is corrupted by Gaussian white noise of variance $\sigma^2$.
Write the observed image as $u_0 = u + \eta$, with
$$\|u - u_0\|^2 = \sum_{ij} (u_{ij} - u_{0,ij})^2 = NM\sigma^2 \quad \text{(noise variance $\sigma^2$)},$$
$$\sum_{ij} u_{ij} = \sum_{ij} u_{0,ij} \quad \text{(zero-mean noise)}.$$
One could also add a blur degradation, $u_0 = Ku + \eta$ for instance, so as to have $\|Ku - u_0\|^2 = NM\sigma^2$.
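As a quick, hedged illustration (not from the original slides), a minimal NumPy sketch of this observation model; the helper name `corrupt` and its signature are my own:

```python
import numpy as np

def corrupt(u, sigma, K=None, rng=None):
    """Observation model u0 = K u + eta, eta Gaussian white noise of variance sigma^2."""
    rng = np.random.default_rng() if rng is None else rng
    Ku = u if K is None else K(u)          # K = Id for pure denoising
    return Ku + rng.normal(0.0, sigma, size=u.shape)

# sanity check: ||u - u0||^2 should be close to N*M*sigma^2 for K = Id
u = np.zeros((256, 256))
u0 = corrupt(u, sigma=10.0)
print(np.sum((u - u0) ** 2), u.size * 10.0**2)
```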
Recovery

The problem: find $u$ such that
$$\|u - u_0\|^2 = NM\sigma^2, \qquad \sum_{ij} u_{ij} = \sum_{ij} u_{0,ij} \tag{1}$$
is not well-posed: many solutions are possible.
In order to recover $u$, extra information is needed, e.g. in the form of a prior on $u$.
For images, smoothness priors are often used.
Let $Ru$ be a digital gradient of $u$. Then find the smoothest $u$ that satisfies constraints (1), "smoothest" meaning with smallest
$$T(u) = \|Ru\| = \sqrt{\sum_{ij} |Ru|^2_{ij}}.$$
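For concreteness, a minimal sketch of one possible digital gradient $R$ and the corresponding $T(u)$; the slides leave $R$ unspecified, so forward differences and the helper names are my assumptions:

```python
import numpy as np

def digital_gradient(u):
    """One choice of R: forward differences, zero at the trailing row/column."""
    gx = np.zeros_like(u, dtype=float)
    gy = np.zeros_like(u, dtype=float)
    gx[:-1, :] = u[1:, :] - u[:-1, :]
    gy[:, :-1] = u[:, 1:] - u[:, :-1]
    return gx, gy

def smoothness(u):
    """T(u) = ||Ru|| = sqrt(sum_ij |Ru|_ij^2)."""
    gx, gy = digital_gradient(u)
    return np.sqrt(np.sum(gx**2 + gy**2))
```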
Tikhonov regularization

It can be shown that this is equivalent to minimizing
$$E(u) = \|Ku - u_0\|^2 + \lambda \|Ru\|^2$$
for some $\lambda = \lambda(\sigma)$ (Wahba?).
Minimization of $E(u)$ can also be derived from a Maximum a Posteriori formulation:
$$\operatorname{arg\,max}_u\, p(u \mid u_0), \qquad p(u \mid u_0) = \frac{p(u_0 \mid u)\, p(u)}{p(u_0)}.$$
Rewriting in a continuous setting:
$$E(u) = \int_\Omega (Ku - u_0)^2\, dx + \lambda \int_\Omega |\nabla u|^2\, dx.$$
How to solve?

The solution satisfies the Euler-Lagrange equation for $E$:
$$K^*(Ku - u_0) - \lambda \Delta u = 0$$
($K^*$ is the adjoint of $K$).
A linear equation, easy to implement, and many fast solvers exist, but...
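As a hedged sketch of one such solver, for the special case $K = \mathrm{Id}$: the equation becomes $(I - \lambda\Delta)u = u_0$, which diagonalizes in the Fourier domain if one assumes periodic boundary conditions (my assumption, not the slides'):

```python
import numpy as np

def tikhonov_denoise(u0, lam):
    """Solve (I - lam * Laplacian) u = u0 via FFT (K = Id, periodic boundaries)."""
    N, M = u0.shape
    wx = 2.0 * np.pi * np.fft.fftfreq(N)
    wy = 2.0 * np.pi * np.fft.fftfreq(M)
    # Fourier symbol of the 5-point discrete Laplacian; it is <= 0 everywhere,
    # so the denominator below never vanishes
    lap = (2.0 * np.cos(wx)[:, None] - 2.0) + (2.0 * np.cos(wy)[None, :] - 2.0)
    return np.real(np.fft.ifft2(np.fft.fft2(u0) / (1.0 - lam * lap)))
```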
Tikhonov example

Denoising example, $K = \mathrm{Id}$.
[Figure: original image and results for $\lambda = 50$ and $\lambda = 500$]
Not good: images contain edges, but Tikhonov blurs them. Why?
The term $\int_\Omega (u - u_0)^2\, dx$: not guilty!
Then it must be $\int_\Omega |\nabla u|^2\, dx$. Do derivatives and step edges not go well together?
1-D computation on step edges
Set $\Omega = [-1, 1]$, let $a$ be a real number and $u$ the step-edge function
$$u(x) = \begin{cases} 0 & x \le 0 \\ a & x > 0 \end{cases}$$
Not differentiable at 0, but forget about that and try to compute
$$\int_{-1}^{1} |u'(x)|^2\, dx.$$
Around 0, "approximate" $u'(x)$ by
$$\frac{u(h) - u(-h)}{2h}, \qquad h > 0 \text{ small}.$$
With this finite difference approximation,
$$u'(x) \approx \frac{a}{2h}, \qquad x \in [-h, h],$$
then
$$\int_{-1}^{1} |u'(x)|^2\, dx = \int_{-1}^{-h} |u'(x)|^2\, dx + \int_{-h}^{h} |u'(x)|^2\, dx + \int_{h}^{1} |u'(x)|^2\, dx = 0 + 2h \left(\frac{a}{2h}\right)^2 + 0 = \frac{a^2}{2h} \to \infty \quad \text{as } h \to 0.$$
So a step-edge has "infinite energy": it cannot minimize Tikhonov.
What went "wrong": the square.
Replace the square in the previous computation by a power $p > 0$ and redo it:
$$\int_{-1}^{1} |u'(x)|^p\, dx = \int_{-1}^{-h} |u'(x)|^p\, dx + \int_{-h}^{h} |u'(x)|^p\, dx + \int_{h}^{1} |u'(x)|^p\, dx = 0 + 2h \left(\frac{|a|}{2h}\right)^p + 0 = |a|^p (2h)^{1-p}.$$
When $p \le 1$ this stays finite as $h \to 0$! Edges can survive here!
Quite ugly when $p < 1$ (but not uninteresting). When $p = 1$, this is the Total Variation of $u$.
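A two-line numeric illustration of the formula $|a|^p (2h)^{1-p}$ (the values of $a$, $p$ and $h$ below are arbitrary choices of mine):

```python
a = 5.0
for p in (2.0, 1.0, 0.5):
    print(f"p = {p}:", [abs(a)**p * (2*h)**(1 - p) for h in (1e-1, 1e-3, 1e-5)])
# p = 2 blows up as h -> 0, p = 1 stays at |a| = 5, p = 0.5 tends to 0
```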
Total Variation I
First definition
Let $u : \Omega \subset \mathbb{R}^n \to \mathbb{R}$. Define the total variation as
$$J(u) = \int_\Omega |\nabla u|\, dx, \qquad |\nabla u| = \sqrt{\sum_{i=1}^{n} u_{x_i}^2}.$$
When $J(u)$ is finite, one says that $u$ has bounded variation, and the space of functions of bounded variation on $\Omega$ is denoted $BV(\Omega)$.
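A discrete version of $J$ is then a one-liner on top of the digital_gradient helper sketched earlier (still an assumption about the discretization, not the slides' prescription):

```python
import numpy as np

def total_variation(u):
    """Discrete J(u) = sum_ij |(Ru)_ij|, reusing digital_gradient from the earlier sketch."""
    gx, gy = digital_gradient(u)
    return np.sum(np.sqrt(gx**2 + gy**2))
```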
Expected: when minimizing $J(u)$ with other constraints, edges are less penalized than with Tikhonov.
Indeed, edges are "naturally present" in bounded variation functions. In fact, functions of bounded variation can be decomposed into
1 smooth parts, where $\nabla u$ is well defined,
2 jump discontinuities (our edges),
3 something else (the Cantor part), which can be nasty...
The functions that do not possess this nasty part form a subspace of $BV(\Omega)$ called $SBV(\Omega)$, the Special functions of Bounded Variation (used for instance when studying the Mumford-Shah functional).
Rudin-Osher-Fatemi
ROF denoising

State the denoising problem as minimizing $J(u)$ under the constraints
$$\int_\Omega u\, dx = \int_\Omega u_0\, dx, \qquad \int_\Omega (u - u_0)^2\, dx = |\Omega| \sigma^2 \quad (|\Omega| = \text{area/volume of } \Omega).$$
Solve via Lagrange multipliers.
TV-denoising

Chambolle-Lions: there exists $\lambda$ such that the solution minimizes
$$E_{TV}(u) = \frac{1}{2} \int_\Omega (Ku - u_0)^2\, dx + \lambda \int_\Omega |\nabla u|\, dx.$$
Euler-Lagrange equation:
$$K^*(Ku - u_0) - \lambda\, \operatorname{div}\!\left(\frac{\nabla u}{|\nabla u|}\right) = 0.$$
The term $\operatorname{div}(\nabla u / |\nabla u|)$ is highly nonlinear, and problems arise especially when $|\nabla u| = 0$.
In fact $(\nabla u / |\nabla u|)(x)$ is the unit normal of the level line of $u$ at $x$, and $\operatorname{div}(\nabla u / |\nabla u|)$ is the (mean) curvature of the level line: not defined when the level line is singular or does not exist!
Acar-Vogel

Replace it by the regularized version
$$|\nabla u|_\beta = \sqrt{|\nabla u|^2 + \beta}, \qquad \beta > 0.$$
Acar and Vogel show that $J_\beta(u) = \int_\Omega |\nabla u|_\beta\, dx$ satisfies
$$\lim_{\beta \to 0} J_\beta(u) = J(u).$$
Replace the energy by
$$E_\beta(u) = \int_\Omega (Ku - u_0)^2\, dx + \lambda J_\beta(u).$$
Euler-Lagrange equation:
$$K^*(Ku - u_0) - \lambda\, \operatorname{div}\!\left(\frac{\nabla u}{|\nabla u|_\beta}\right) = 0.$$
The null-denominator problem disappears.
Example

Implementation by finite differences, a fixed-point strategy, and linearization.
[Figure: original image and denoised result with $\lambda = 1.5$, $\beta = 10^{-4}$]
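The slides' fixed-point implementation is not spelled out; as a simpler hedged sketch, plain explicit gradient descent on the $\beta$-regularized energy, reusing digital_gradient from earlier. The divergence helper and the crude step-size rule are my own choices; note that a small $\beta$ forces a tiny step, one reason faster schemes such as the Chambolle algorithm mentioned in the outline are attractive.

```python
import numpy as np

def divergence(px, py):
    """Discrete divergence, defined so that <digital_gradient(u), p> = -<u, divergence(p)>
    whenever p vanishes on the trailing row/column."""
    d = np.zeros_like(px)
    d[0, :] += px[0, :];  d[1:, :] += px[1:, :] - px[:-1, :]
    d[:, 0] += py[:, 0];  d[:, 1:] += py[:, 1:] - py[:, :-1]
    return d

def tv_denoise_beta(u0, lam, beta=1e-4, n_iter=2000):
    """Explicit gradient descent on E(u) = 0.5*||u - u0||^2 + lam * J_beta(u), for K = Id."""
    u = u0.astype(float)
    tau = 1.0 / (1.0 + 8.0 * lam / np.sqrt(beta))   # crude Lipschitz-based step size
    for _ in range(n_iter):
        gx, gy = digital_gradient(u)
        mag = np.sqrt(gx**2 + gy**2 + beta)
        u -= tau * ((u - u0) - lam * divergence(gx / mag, gy / mag))
    return u
```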
Inpainting/Denoising

Fill in $u$ on the subset $H \subset \Omega$ where data is missing; denoise the known data. Inpainting energy (Chan & Shen):
$$E_{ITV}(u) = \frac{1}{2} \int_{\Omega \setminus H} (u - u_0)^2\, dx + \lambda \int_\Omega |\nabla u|\, dx.$$
Euler-Lagrange equation:
$$(u - u_0)\chi - \lambda\, \operatorname{div}\!\left(\frac{\nabla u}{|\nabla u|}\right) = 0$$
($\chi(x) = 1$ if $x \notin H$, 0 otherwise).
Very similar to denoising; the same approximation/implementation can be used.
[Figure: degraded image and inpainted result]
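Since only the data term gets masked by $\chi$, the denoising sketch above turns into an inpainting sketch with a one-line change; again a hedged variant of mine, with the same $\beta$-regularization, where mask plays the role of $\chi$:

```python
import numpy as np

def tv_inpaint(u0, mask, lam, beta=1e-4, n_iter=3000):
    """TV inpainting sketch: mask = 1 on known pixels (x not in H), 0 inside the hole H.
    Reuses digital_gradient and divergence from the denoising sketch."""
    u = (u0 * mask).astype(float)      # values inside the hole are irrelevant
    tau = 1.0 / (1.0 + 8.0 * lam / np.sqrt(beta))
    for _ in range(n_iter):
        gx, gy = digital_gradient(u)
        mag = np.sqrt(gx**2 + gy**2 + beta)
        u -= tau * (mask * (u - u0) - lam * divergence(gx / mag, gy / mag))
    return u
```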
Segmentation

Inpainting-driven segmentation (Lauze & Nielsen 2008, IJCV).
[Figure: detection and segmentation of aortic calcification]
Total Variation II
Relaxing the derivative constraints
With the definition of total variation as
$$J(u) = \int_\Omega |\nabla u|\, dx,$$
$u$ must have (weak) derivatives.
But we just saw that the computation is possible for the step-edge $u(x) = 0$ for $x \le 0$, $u(x) = a$ for $x > 0$:
$$\int_{-1}^{1} |u'(x)|\, dx = |a|.$$
Can we avoid the use of derivatives of $u$?
Assume first that $\nabla u$ exists. Then
$$|\nabla u| = \nabla u \cdot \frac{\nabla u}{|\nabla u|}$$
(except when $\nabla u = 0$), and $\nabla u / |\nabla u|$ is the normal to the level lines of $u$; it has norm 1 everywhere.
Let $V$ be the set of vector fields $v(x)$ on $\Omega$ with $|v(x)| \le 1$. I claim that
$$J(u) = \sup_{v \in V} \int_\Omega \nabla u(x) \cdot v(x)\, dx$$
(a consequence of the Cauchy-Schwarz inequality).
Restrict to the set $W$ of such $v$'s that are differentiable and vanish on $\partial\Omega$, the boundary of $\Omega$. Then
$$J(u) = \sup_{v \in W} \int_\Omega \nabla u(x) \cdot v(x)\, dx.$$
But then I can use the divergence theorem: for $H \subset D \subset \mathbb{R}^n$, $f : D \to \mathbb{R}$ a differentiable function, $g = (g_1, \dots, g_n) : D \to \mathbb{R}^n$ a differentiable vector field and $\operatorname{div} g = \sum_{i=1}^n (g_i)_{x_i}$,
$$\int_H \nabla f \cdot g\, dx = -\int_H f \operatorname{div} g\, dx + \int_{\partial H} f\, g \cdot n(s)\, ds,$$
with $n(s)$ the exterior normal field to $\partial H$.
Apply it to $J(u)$ above; since $v$ vanishes on $\partial\Omega$, the boundary term drops and
$$J(u) = \sup_{v \in W} -\int_\Omega u(x) \operatorname{div} v(x)\, dx.$$
The gradient of $u$ has disappeared! This is the classical definition of total variation.
Note that when $\nabla u(x) \ne 0$, the optimal $v(x) = (\nabla u / |\nabla u|)(x)$ and $\operatorname{div} v(x)$ is the mean curvature of the level set of $u$ at $x$. Geometry is there!
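The integration-by-parts step has an exact discrete analogue worth checking numerically: with the digital_gradient and divergence helpers from the earlier sketches, $\langle \nabla u, v\rangle = -\langle u, \operatorname{div} v\rangle$ holds for fields $v$ that vanish on the trailing row/column, mirroring the condition $v = 0$ on $\partial\Omega$ (a hedged sketch of mine, tied to that particular discretization):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.normal(size=(32, 32))
px = rng.normal(size=(32, 32)); py = rng.normal(size=(32, 32))
px[-1, :] = 0.0; py[:, -1] = 0.0           # "v vanishes on the boundary"
gx, gy = digital_gradient(u)
print(np.sum(gx * px + gy * py))            # <grad u, v>
print(-np.sum(u * divergence(px, py)))      # -<u, div v>: identical up to rounding
```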
Definition in action
Step-edge

Let $u$ be the step-edge function defined on the previous slides. We compute $J(u)$ with the new definition.
Here $W = \{\varphi : [-1, 1] \to \mathbb{R} \text{ differentiable},\ \varphi(-1) = \varphi(1) = 0,\ |\varphi(x)| \le 1\}$ and, in 1-D,
$$J(u) = \sup_{\varphi \in W} \int_{-1}^{1} u(x)\, \varphi'(x)\, dx$$
(the sign is immaterial here, since $\varphi \in W$ iff $-\varphi \in W$). We compute
$$\int_{-1}^{1} u(x)\, \varphi'(x)\, dx = a \int_0^1 \varphi'(x)\, dx = a(\varphi(1) - \varphi(0)) = -a\varphi(0).$$
As $-1 \le \varphi(0) \le 1$, the maximum is $|a|$.
2D example

Let $B$ be an open set with regular boundary curve $\partial B$, $\Omega$ large enough to contain $B$, and $\chi_B$ the characteristic function of $B$:
$$\chi_B(x) = \begin{cases} 1 & x \in B \\ 0 & x \notin B \end{cases}$$
For $v \in W$, by the divergence theorem on $B$ and its boundary $\partial B$,
$$-\int_\Omega \chi_B(x) \operatorname{div} v(x)\, dx = -\int_B \operatorname{div} v(x)\, dx = -\int_{\partial B} v(s) \cdot n(s)\, ds$$
($n(s)$ is the exterior normal to $\partial B$).
This integral is maximized when $v = -n$ on $\partial B$: $J(\chi_B)$ is the length of $\partial B$, the perimeter of $B$.
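A numeric sanity check of $J(\chi_B) \approx$ perimeter for $B$ a disk, using the total_variation helper from earlier; the indicator is mollified over a few pixels because raw forward differences of a binary image behave anisotropically (the grid size n, radius r and ramp width eps below are my choices):

```python
import numpy as np

n, r = 512, 0.3
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing="ij")
h = x[1] - x[0]
eps = 4 * h                                    # smooth the jump over ~4 pixels
rho = np.sqrt(X**2 + Y**2)
u = np.clip((r - rho) / eps + 0.5, 0.0, 1.0)   # ~chi_B with a thin linear ramp
print(total_variation(u) * h, 2 * np.pi * r)   # both close to 1.885
```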