Matrix Problems: From now on, the coefficient matrix A is allowed to have more rows than columns, i.e., A ∈ R^{m×n} with m ≥ n. For m > n it is natural to consider the least squares problem...



When we say "naive solution" we either mean the solution A^{-1} b (when m = n) or the least squares solution (when m > n). We emphasize the convenient fact that the naive solution has precisely the same SVD expansion in both cases:

x_naive = Σ_{i=1}^n (u_i^T b / σ_i) v_i.

Intro to Inverse Problems, Chapter 4: Regularization Methods
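As a quick sanity check of this expansion, here is a minimal NumPy sketch (not part of the course's Regularization Tools toolbox; the function name is illustrative) that forms the naive solution from the thin SVD and compares it against an off-the-shelf least squares solve:

```python
import numpy as np

def naive_solution(A, b):
    """Naive solution A^{-1} b (m = n) or least squares solution (m > n),
    computed via the SVD expansion sum_i (u_i^T b / sigma_i) v_i."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # thin SVD
    return Vt.T @ ((U.T @ b) / s)

# Small well-conditioned test: the SVD expansion must match a direct LS solve.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))
b = rng.standard_normal(8)
x = naive_solution(A, b)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x, x_ls))  # True
```

For a well-conditioned A this works fine; the rest of the chapter is about why it fails when A is ill-conditioned.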

Naive Solutions are Useless

[Figure] Exact solutions (blue smooth lines) together with the naive solutions (jagged green lines) to two test problems. Left: deriv2 with n = 64. Middle and right: gravity with n = 32 and n = 53. Due to the large condition numbers (especially for gravity) the small perturbations lead to useless naive solutions.

Need For Regularization

Discrete ill-posed problems are characterized by having coefficient matrices with a very large condition number. The naive solution is very sensitive to any perturbation of the right-hand side, representing the errors in the data. Specifically, assume that the exact and perturbed solutions x_exact and x satisfy

A x_exact = b_exact,   A x = b = b_exact + e,

where e denotes the perturbation. Then classical perturbation theory leads to the bound

‖x_exact − x‖_2 / ‖x_exact‖_2 ≤ cond(A) · ‖e‖_2 / ‖b_exact‖_2.

Since cond(A) = σ_1/σ_n is large, this implies that x can be very far from x_exact.

Illustration of Ill Conditioning and Regularization

[Diagram] In R^n = span{v_1, ..., v_n}: the exact solution x_exact, with the Tikhonov solution x_λ and the TSVD solution x_k nearby, and the naive solution x_naive far away. In R^m = span{u_1, ..., u_m}: the exact right-hand side b_exact = A x_exact and the perturbed right-hand side b = b_exact + e.

Regularization Methods → Spectral Filtering

Almost all the regularization methods treated in this course produce solutions which can be expressed as a filtered SVD expansion of the form

x_reg = Σ_{i=1}^n φ_i (u_i^T b / σ_i) v_i,

where φ_i are the filter factors associated with the method. These methods are called spectral filtering methods because the SVD basis can be considered as a spectral basis.

Truncated SVD

A simple way to reduce the influence of the noise is to discard the SVD coefficients corresponding to the smallest singular values. We can define the truncated SVD (TSVD) solution as

x_k = Σ_{i=1}^k (u_i^T b / σ_i) v_i,   k < n.

Regularization Tools: tsvd.

Alternatively we can define x_k as the solution of the problem

min_x ‖x‖_2   s.t.   ‖A_k x − b‖_2 = min,

where we introduce the rank-k matrix

A_k = U Σ_k V^T = Σ_{i=1}^k σ_i u_i v_i^T,   Σ_k = diag(σ_1, ..., σ_k, 0, ..., 0).
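The toolbox's tsvd is MATLAB; a minimal NumPy stand-in (illustrative, not the toolbox code) is a three-liner. The demo below uses a Hilbert matrix as a classic severely ill-conditioned test matrix and checks that the solution norm can only grow as more components are included, which follows directly from the partial-sum formula:

```python
import numpy as np

def tsvd_solution(A, b, k):
    """TSVD solution x_k: keep only the k largest SVD components."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ b) / s[:k])

# Hilbert matrix: a standard severely ill-conditioned test matrix.
n = 16
idx = np.arange(n)
A = 1.0 / (idx[:, None] + idx[None, :] + 1.0)
b = A @ np.ones(n)

# Each added term contributes (u_i^T b / sigma_i)^2 >= 0 to ||x_k||_2^2.
norms = [np.linalg.norm(tsvd_solution(A, b, k)) for k in range(1, n + 1)]
print(np.all(np.diff(norms) >= -1e-8))  # True
```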

The Truncation Parameter

Note: the truncation parameter k in

x_k = Σ_{i=1}^k (u_i^T b / σ_i) v_i

is dictated by the coefficients u_i^T b, not the singular values! Basically we should choose k as the index i where |u_i^T b| starts to "level off" due to the noise.

More About the Truncated SVD

One can show that if Cov(b) = η^2 I then

Cov(x_k) = η^2 Σ_{i=1}^k (1/σ_i^2) v_i v_i^T,

and thus we can expect that

‖x_k‖_2 ≪ ‖x_naive‖_2   and   ‖Cov(x_k)‖_2 ≪ ‖Cov(x_naive)‖_2.

The price we pay for smaller covariance is bias: E(x_k) ≠ E(x_naive).

Advantages of TSVD: Intuitive. Easy to compute if we have the SVD.

Drawbacks of TSVD: For large-scale problems it is infeasible to compute the SVD. The abrupt cut-off of SVD components may introduce artifacts.

Selective SVD

Consider a problem in which, say, every second SVD component is zero (v_2^T x_exact = v_4^T x_exact = v_6^T x_exact = ... = 0). There is no need to include these SVD components. A variant of the TSVD method called selective SVD (SSVD) includes, or selects, only those SVD components which make significant contributions to the regularized solution:

x_τ ≡ Σ_{|u_i^T b| > τ} (u_i^T b / σ_i) v_i.

Thus, the filter factors for the SSVD method are

φ_i^[τ] = 1 if |u_i^T b| > τ, and 0 otherwise.
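A corresponding NumPy sketch (the function name and test data are illustrative, not from Regularization Tools); the boolean mask plays the role of the 0/1 filter factors:

```python
import numpy as np

def ssvd_solution(A, b, tau):
    """SSVD: include only SVD components with |u_i^T b| > tau."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b              # SVD coefficients u_i^T b
    keep = np.abs(beta) > tau   # filter factors: 1 where kept, 0 otherwise
    return Vt[keep].T @ (beta[keep] / s[keep])

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)
# tau = 0 keeps everything (the naive LS solution); a huge tau keeps nothing.
print(np.allclose(ssvd_solution(A, b, 0.0),
                  np.linalg.lstsq(A, b, rcond=None)[0]))  # True
print(np.allclose(ssvd_solution(A, b, 1e9), 0.0))         # True
```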

SSVD Example

[Figure] Only the filled diamonds contribute to the SSVD solution.

Regularization – A General Approach

Regularization = stabilization: how to deal with (and filter) solution components corresponding to the small singular values. Most approaches involve the residual norm

ρ(f) = ‖ ∫_0^1 K(s, t) f(t) dt − g(s) ‖_2,

and a smoothing norm ω(f) that measures the "size" of the solution f. Examples of common choices:

ω(f)^2 = ‖f‖_2^2 = ∫_0^1 |f(t)|^2 dt   or   ω(f)^2 = ‖f^(p)‖_2^2 = ∫_0^1 |f^(p)(t)|^2 dt.

The underlying principle is that if we control the norm of the solution, or of its derivative, then we should be able to suppress some/most of the large noise components.

Discrete Tikhonov Regularization

Replace the continuous problem with a linear algebra problem. Minimization of the residual ρ is replaced by

min_x ‖A x − b‖_2,   A ∈ R^{m×n},

where A and b are obtained by discretization of the integral equation. We must also discretize the smoothing norm, Ω(x) ≈ ω(f). We focus on a common choice: Ω(x) = ‖x‖_2. The resulting discrete version of Tikhonov regularization is thus

min_x ‖A x − b‖_2^2 + λ^2 ‖x‖_2^2.

Regularization Tools: tikhonov.

More About Tikhonov Regularization

The standard-form Tikhonov problem:

min_x ‖A x − b‖_2^2 + λ^2 ‖x‖_2^2.

Here ‖A x − b‖_2^2 is the residual term (data-fitting term, data-fidelity term), ‖x‖_2^2 is the regularization term, and λ is a parameter that balances these two terms. Large λ → strong regularization, over-smoothing of the solution. Small λ → good fit, but the solution is dominated by noise.

Tikhonov Solutions

[Figure]

Other Smoothing Norms → Chapter 8

Another common choice: Ω(x) = ‖L x‖_2, where L approximates a derivative operator. Examples of the first and second derivative operators on a regular mesh:

L_1 = [ 1 −1           ]
      [    ⋱   ⋱       ]  ∈ R^{(n−1)×n},
      [       1  −1    ]

L_2 = [ 1 −2  1            ]
      [    ⋱   ⋱   ⋱       ]  ∈ R^{(n−2)×n}.
      [       1  −2  1     ]

Regularization Tools: get_l.
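The toolbox's get_l is MATLAB; the two operators above are easy to build with NumPy (the function below borrows the name get_l but is an illustrative sketch covering only these two cases). The checks exploit that L_1 annihilates constants and L_2 annihilates linear functions:

```python
import numpy as np

def get_l(n, d):
    """First (d=1) or second (d=2) difference operator on a regular mesh."""
    I = np.eye(n)
    if d == 1:
        return I[:-1] - I[1:]                # rows [1, -1, 0, ...]
    if d == 2:
        return I[:-2] - 2 * I[1:-1] + I[2:]  # rows [1, -2, 1, 0, ...]
    raise ValueError("only d = 1 or 2 in this sketch")

n = 8
L1, L2 = get_l(n, 1), get_l(n, 2)
print(L1.shape, L2.shape)               # (7, 8) (6, 8)
print(np.allclose(L1 @ np.ones(n), 0))  # L1 annihilates constants: True
print(np.allclose(L2 @ np.arange(n), 0))  # L2 annihilates linear functions: True
```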

Efficient Implementation

The original formulation:

min_x ‖A x − b‖_2^2 + λ^2 ‖x‖_2^2.

Two alternative formulations:

(A^T A + λ^2 I) x = A^T b   and   min_x ‖ [ A ; λ I ] x − [ b ; 0 ] ‖_2.

The first shows that we have a linear problem. The second shows how to solve it stably: treat it as a least squares problem, utilize any sparsity or structure.
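A minimal NumPy sketch of the stacked least squares formulation (illustrative; names and test data are my own), checked against the normal equations on a small well-conditioned problem:

```python
import numpy as np

def tikhonov_stacked(A, b, lam):
    """Tikhonov solution via min || [A; lam*I] x - [b; 0] ||_2,
    the numerically stable stacked least squares form."""
    n = A.shape[1]
    A_aug = np.vstack([A, lam * np.eye(n)])
    b_aug = np.concatenate([b, np.zeros(n)])
    x, *_ = np.linalg.lstsq(A_aug, b_aug, rcond=None)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 5))
b = rng.standard_normal(10)
lam = 0.1
x = tikhonov_stacked(A, b, lam)
# The normal-equations form gives the same solution (fine here, since
# this test problem is well conditioned).
x_ne = np.linalg.solve(A.T @ A + lam**2 * np.eye(5), A.T @ b)
print(np.allclose(x, x_ne))  # True
```

For ill-conditioned A the stacked form is preferred, since forming A^T A squares the condition number.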

SVD and Tikhonov Regularization

We can write the discrete Tikhonov solution x_λ in terms of the SVD of A as

x_λ = Σ_{i=1}^n [ σ_i^2 / (σ_i^2 + λ^2) ] (u_i^T b / σ_i) v_i = Σ_{i=1}^n φ_i^[λ] (u_i^T b / σ_i) v_i.

The filter factors are given by

φ_i^[λ] = σ_i^2 / (σ_i^2 + λ^2),

and their purpose is to dampen the components in the solution corresponding to small σ_i.
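The filtered expansion is itself a practical way to compute x_λ once the SVD is available. A NumPy sketch (illustrative, not the toolbox's tikhonov), verified against the normal equations:

```python
import numpy as np

def tikhonov_svd(A, b, lam):
    """Tikhonov solution as a filtered SVD expansion with
    filter factors phi_i = sigma_i^2 / (sigma_i^2 + lam^2)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    phi = s**2 / (s**2 + lam**2)      # damps components with sigma_i << lam
    return Vt.T @ (phi * (U.T @ b) / s)

rng = np.random.default_rng(1)
A = rng.standard_normal((9, 4))
b = rng.standard_normal(9)
lam = 0.5
x = tikhonov_svd(A, b, lam)
x_ne = np.linalg.solve(A.T @ A + lam**2 * np.eye(4), A.T @ b)
print(np.allclose(x, x_ne))  # True
```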

Tikhonov Filter Factors

φ_i^[λ] = σ_i^2 / (σ_i^2 + λ^2) ≈ 1 for σ_i ≫ λ,   and   ≈ σ_i^2/λ^2 for σ_i ≪ λ.

TSVD and Tikhonov Regularization

TSVD and Tikhonov solutions are both filtered SVD expansions. The regularization parameter is either k or λ. For each k, there exists a λ such that x_λ ≈ x_k.

Wiener Filtering

In certain applications, e.g., in image deblurring, the SVD basis vectors u_i and v_i can be replaced by the discrete Fourier vectors (which underlie the discrete Fourier transform). In these applications, Tikhonov regularization is known as Wiener filtering. It is typically derived in a stochastic setting. Here, λ^{−2} is the signal-to-noise power, i.e., the power of the exact solution divided by the power of the noise in the right-hand side. Available in MATLAB's Image Processing Toolbox as deconvwnr.

Other Spectral Filtering Methods

A few spectral filtering methods not mentioned in the book.

Damped SVD: φ_i^[λ] = σ_i / (σ_i + λ),   λ ≥ 0.

Exponential filtering: φ_i^[β] = 1 − exp(−β σ_i^2),   β ≥ 0.

Regularization Tools: fil_fac computes filter factors for DSVD, TSVD, Tikhonov, and TTLS (not covered here).
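All three filters behave alike qualitatively: close to 1 for large σ_i, close to 0 for small σ_i. A short NumPy comparison (parameter values are illustrative choices, not from the course):

```python
import numpy as np

# Evaluate the three filter-factor formulas over a range of "singular values".
s = np.logspace(0, -8, 9)      # sigma_i from 1 down to 1e-8
lam, beta = 1e-4, 1e8          # illustrative parameter choices
phi_tikh = s**2 / (s**2 + lam**2)   # Tikhonov
phi_dsvd = s / (s + lam)            # damped SVD
phi_expo = 1.0 - np.exp(-beta * s**2)  # exponential filtering

print(phi_tikh[0], phi_dsvd[0], phi_expo[0])  # all close to 1 for sigma = 1
print(phi_tikh[-1] < 1e-6, phi_dsvd[-1] < 1e-3, phi_expo[-1] < 1e-6)
```

Note how the damped SVD factors decay only linearly in σ_i, so they cut off the small components more gently than Tikhonov's quadratic decay.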

TSVD Perturbation Bound

Theorem. Let b = b_exact + e, and let x_k and x_k^exact denote the TSVD solutions computed with the same k. Then

‖x_k^exact − x_k‖_2 / ‖x_k‖_2 ≤ (σ_1/σ_k) · ‖e‖_2 / ‖A x_k‖_2.

We see that the perturbation bound for the TSVD solution is controlled by the factor κ_k = σ_1/σ_k, which can be much smaller than cond(A) = σ_1/σ_n.

Tikhonov Perturbation Bound

Theorem. Let b = b_exact + e, and let x_λ^exact and x_λ denote the solutions to

min_x ‖A x − b_exact‖_2^2 + λ^2 ‖x‖_2^2   and   min_x ‖A x − b‖_2^2 + λ^2 ‖x‖_2^2,

computed with the same λ. Then

‖x_λ^exact − x_λ‖_2 / ‖x_λ‖_2 ≤ (‖A‖_2/λ) · ‖e‖_2 / ‖A x_λ‖_2,

and hence the perturbation bound for the Tikhonov solution is controlled by the factor κ_λ = ‖A‖_2/λ = σ_1/λ. Again it can be much smaller than cond(A) = σ_1/σ_n.

Illustration of Sensitivity

[Figure] Red dots: x_λ for 25 random perturbations of b. Black crosses: unperturbed x_λ – note the bias.

Monotonic Behavior of the Norms

The TSVD solution and residual norms vary monotonically with k:

‖x_k‖_2^2 = Σ_{i=1}^k (u_i^T b / σ_i)^2 ≤ ‖x_{k+1}‖_2^2,

‖A x_k − b‖_2^2 = Σ_{i=k+1}^n (u_i^T b)^2 ≥ ‖A x_{k+1} − b‖_2^2   (we assume m = n).

The Tikhonov solution and residual norms also vary monotonically with λ:

‖x_λ‖_2^2 = Σ_{i=1}^n ( φ_i^[λ] u_i^T b / σ_i )^2,   ‖A x_λ − b‖_2^2 = Σ_{i=1}^n ( (1 − φ_i^[λ]) u_i^T b )^2.

The L-Curve for Tikhonov Regularization

[Figure] Plot of ‖x_λ‖_2 versus ‖A x_λ − b‖_2 in log-log scale.
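The data behind such a plot is cheap to generate once the SVD is known. A NumPy sketch (illustrative; the function name and test problem are my own) that also confirms the monotonicity claimed on the previous slide:

```python
import numpy as np

def l_curve_points(A, b, lambdas):
    """Residual norms ||A x_lam - b||_2 and solution norms ||x_lam||_2
    of the Tikhonov solutions for a range of regularization parameters."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    rho, eta = [], []
    for lam in lambdas:
        phi = s**2 / (s**2 + lam**2)     # Tikhonov filter factors
        x = Vt.T @ (phi * beta / s)
        rho.append(np.linalg.norm(A @ x - b))
        eta.append(np.linalg.norm(x))
    return np.array(rho), np.array(eta)

rng = np.random.default_rng(2)
A = rng.standard_normal((12, 6))
b = rng.standard_normal(12)
lambdas = np.logspace(-6, 2, 30)         # increasing lambda
rho, eta = l_curve_points(A, b, lambdas)
print(np.all(np.diff(rho) >= -1e-10))    # residual norm increases: True
print(np.all(np.diff(eta) <= 1e-10))     # solution norm decreases: True
```

Plotting (log rho, log eta), e.g. with matplotlib's loglog, reproduces the characteristic L shape for an ill-conditioned A.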

Properties of the L-Curve

The norm ‖x_λ‖_2 is a monotonically decreasing convex function of the norm ‖A x_λ − b‖_2. Define the "inconsistency"

δ_0^2 = Σ_{i=n+1}^m (u_i^T b)^2   (= 0 when m = n).

Then

δ_0 ≤ ‖A x_λ − b‖_2 ≤ ‖b‖_2,   0 ≤ ‖x_λ‖_2 ≤ ‖x_naive‖_2.

Any point (δ, η) on the L-curve is a solution to the following two inequality-constrained least squares problems:

δ = min_x ‖A x − b‖_2   subject to   ‖x‖_2 ≤ η,
η = min_x ‖x‖_2   subject to   ‖A x − b‖_2 ≤ δ.

More Properties

For small values of λ, many SVD components are included in the Tikhonov solution, and hence it is dominated by the perturbation errors coming from the inverted noise – the solution is under-smoothed, and we have

‖x_λ‖_2 increases with λ^{−1}   and   ‖A x_λ − b‖_2 ≈ ‖e‖_2 (a constant).

When λ gets larger (but not very large), x_λ is dominated by SVD coefficients whose main contribution is from the exact right-hand side b_exact – and the solution becomes over-smoothed. A careful analysis shows that for such values of λ we have

‖x_λ‖_2 ≈ ‖x_exact‖_2 (a constant),   ‖A x_λ − b‖_2 increases with λ.

As λ → ∞ we have ‖x_λ‖_2 → 0 and ‖A x_λ − b‖_2 → ‖b‖_2. Thus the L-curve has two distinctly different parts: a part that is approximately horizontal, and a part that is approximately vertical.

Log-Log Scale Separates Over- and Under-Smoothing

The features become more pronounced (and easier to inspect) when the L-curve is plotted in double-logarithmic scale:

( log ‖A x_λ − b‖_2 , log ‖x_λ‖_2 ).

The "corner" that separates these horizontal and vertical parts is located roughly at the point ( log ‖e‖_2 , log ‖x_exact‖_2 ). Towards the right, for λ → ∞, the L-curve starts to bend down as the increasing amount of regularization forces the solution norm towards zero.