
Sparse Solutions to Linear Inverse Problems

The aim of this thesis is to investigate the framework of approximate message passing [DMM09] for finding sparse solutions to linear inverse problems.

This work is mainly motivated by the problem of source localization for electroencephalography (EEG) [BML01, NS06], which aims to localize the sources of neural electrical activity in the human brain using non-invasive measurements of electromagnetic signals. Thus, the distribution of the neural electrical brain activity is the desired state and the electromagnetic signals are the measurements (see appendix D for more details). However, the methods and algorithms discussed in this thesis apply to ill-posed linear inverse problems in general.

[Figure 1.2 shows two panels: (a) an ill-posed problem, (b) an ill-conditioned problem.]

Figure 1.2: (a) Illustration of an ill-posed problem, where more than one solution maps to the same set of measurements. (b) Illustration of the concept of an ill-conditioned problem, where two similar solutions map to very different measurements.

Formally, a linear inverse problem has the following form:

y = Ax + e,    (1.1)

where y ∈ R^m is a vector of m measurements, A ∈ R^(m×n) is the so-called forward matrix, e ∈ R^m is a corruptive noise vector and x ∈ R^n is the desired state of the system. The objective is then to reconstruct x from the pair (A, y).

From this formulation, it is immediately seen that the (noiseless) measurements are easily computed based on knowledge of the forward matrix A and the true solution x.
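The forward direction is straightforward to simulate. Below is a minimal sketch of generating measurements according to eq. (1.1); the random Gaussian forward matrix, the dimensions and the noise level are purely illustrative assumptions. The inverse direction, recovering x from (A, y), is the hard part.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 50                       # m measurements, n unknowns (m << n)
A = rng.standard_normal((m, n))     # forward matrix
x = rng.standard_normal(n)          # true (unknown) state of the system
e = 0.01 * rng.standard_normal(m)   # corruptive noise
y = A @ x + e                       # eq. (1.1): the forward direction is easy
```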

For many problems of practical interest (including EEG source localization), the number of measurements is much smaller than the dimension of the desired state vector x, i.e. m ≪ n, and this effectively makes the linear system of equations in eq. (1.1) under-determined. This implies that, if a solution exists, then it is not unique and therefore, the problem is indeed ill-posed according to the Hadamard definition.

Besides being ill-posed, many linear inverse problems also pose another difficult challenge: they are ill-conditioned, meaning that the solution is highly sensitive to small perturbations in the measurements. This is of course an undesirable characteristic for a noisy system. The "degree" of ill-conditionedness can be measured by the condition number¹ of the forward matrix A, where a high condition number implies that the problem is ill-conditioned. Figure 1.2 illustrates the differences between a problem being ill-posed and being ill-conditioned.
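The condition number is simple to compute numerically; a quick check, with an arbitrary random matrix standing in for A, might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 50))
s = np.linalg.svd(A, compute_uv=False)
print(s.max() / s.min())     # ratio of largest to smallest singular value
print(np.linalg.cond(A))     # the same quantity via numpy's built-in
```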

Because of these issues, additional assumptions or constraints have to be imposed on the system in eq. (1.1) in order for it to be solved. Put another way, the system needs to be regularized, where the classical approach is Tikhonov regularization, in which solutions with smaller norms are preferred [TA77, McD09].

¹The condition number of a matrix A is the ratio of the largest and smallest singular values of A.
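As a minimal sketch of the Tikhonov idea, the regularized least-squares solution can be computed in closed form; the regularization weight lam is a free parameter, not a value prescribed here.

```python
import numpy as np

def tikhonov(A, y, lam):
    """Solve argmin_x ||A x - y||^2 + lam * ||x||^2 in closed form."""
    m = A.shape[0]
    # equivalent to (A^T A + lam*I)^{-1} A^T y; this m-by-m system
    # is cheaper to solve when m << n
    return A.T @ np.linalg.solve(A @ A.T + lam * np.eye(m), y)
```

As lam approaches 0 this tends to the minimum-norm solution, while larger values of lam trade data fit for a smaller-norm, more stable estimate.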

Another approach is to assume that the desired solution x is sparse [Ela12], which is the approach taken in this thesis.

So what does it mean for x to be sparse and how does it help? It means that most of the entries in x ∈ R^n are zero, or put another way: the energy of x is concentrated in a small number of positions. The vector x is said to be k-sparse if it has at most k ≤ n non-zero entries. In the noiseless case, i.e. e = 0, and under certain assumptions on the forward matrix A, it has been shown that if x is sufficiently sparse compared to the undersampling ratio m/n, then exact recovery of x is possible [CRT06, DT10]. Informally, the higher the degree of sparsity, i.e. the smaller k, the fewer samples are required to achieve perfect reconstruction. This is referred to as the sparsity-undersampling trade-off. Besides mathematical convenience, sparsity also has the advantage that sparse models are usually easier to interpret than fully "dense" models. But it is important to stress that the locations of the non-zero elements in x, also known as the support of x, are usually not known in advance.
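To make the exact-recovery claim concrete, the sketch below uses orthogonal matching pursuit, a classical greedy recovery algorithm (not the approximate message passing framework studied in this thesis), on a small synthetic problem; the dimensions and the Gaussian forward matrix are illustrative assumptions.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily recover a k-sparse x from y = A x."""
    n = A.shape[1]
    residual = y.copy()
    support = []
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # refit by least squares on the chosen columns
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
m, n, k = 40, 100, 3                            # heavy undersampling: m << n
A = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
x_hat = omp(A, A @ x, k)                        # noiseless measurements
print(np.allclose(x, x_hat))                    # exact recovery (w.h.p.)
```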

We now return to the simple problem of recovering (x1, x2) from the arithmetic average a, for which we concluded that an infinite number of solutions existed. Suppose now that the desired solution (x1, x2) is 1-sparse; then it is easily seen that (2a, 0) and (0, 2a) are the only solutions to the system. Thus, the solution is still not unique, but the infinite set of solutions is effectively reduced to a set of only 2 solutions.
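A two-line check of this claim, with an arbitrary value for a:

```python
a = 3.0                                      # the observed average of (x1, x2)
for x1, x2 in [(2 * a, 0.0), (0.0, 2 * a)]:  # the only 1-sparse candidates
    assert (x1 + x2) / 2 == a
```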

However, many natural signals of interest do not have an exact sparse representation, especially not under contamination of noise. But it turns out that many signals can be well approximated by a sparse signal when represented in a suitable basis. For instance, a signal dominated by a single sine wave is well approximated by a 1-sparse signal in a suitable Fourier basis, many natural images can be well approximated by a sparse signal using a Wavelet or Curvelet representation [SFM10], and so on. Signals which are well approximated by a sparse signal in a known basis are referred to as compressible signals.
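The sine-wave case is easy to verify numerically: after a discrete Fourier transform, essentially all of the signal's energy sits in a single coefficient. The signal length and frequency below are arbitrary choices.

```python
import numpy as np

t = np.arange(256)
signal = np.sin(2 * np.pi * 5 * t / 256)   # five full cycles of a sine wave
energy = np.abs(np.fft.rfft(signal)) ** 2
# one Fourier coefficient (index 5) carries essentially all the energy
print(np.argmax(energy), energy.max() / energy.sum())
```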

A natural generalization of the linear inverse problem in eq. (1.1) is the so-called multiple measurement vector problem (MMV) [CREKD05], where, as the name suggests, multiple measurement vectors {y_t} are available at consecutive time steps t = 1, ..., T:

y_t = A_t x_t + e_t.    (1.2)

In the context of the MMV problem, the formulation of the linear inverse problem in eq. (1.1) is referred to as a single measurement vector problem (SMV).

Clearly, the MMV problem can be approached by solving T individual SMV problems, in which case nothing is gained. On the other hand, if the measurement vectors {y_t}_{t=1}^T are obtained within a small period of time, it is often reasonable to assume that the support of the solutions x_t is constant w.r.t. time, i.e., the locations of the non-zero elements in x_t are the same for all t = 1, ..., T. This is often referred to as the common sparsity assumption [CREKD05] or joint sparsity assumption [vdBF10]. In this case, the estimates of the solution vectors {x_t}_{t=1}^T can be greatly improved by using multiple measurement vectors [ZR13, WR07].

However, note that the index t does not necessarily have to be a time index. That is, the evolution of the measurements {y_1, y_2, ..., y_T} does not necessarily have to be temporal; it can also be spatial, etc., as long as it fits into the formulation in eq. (1.2) and satisfies the common sparsity assumption. Here a temporal evolution is assumed without loss of generality. A few examples of applications of the MMV formulation are face recognition from video [MW12], through-the-wall radar imagery [BTA11] and EEG-based source localization [ZR13].

By assuming the forward matrix A does not change with the time index t, and by concatenating the measurement vectors into a matrix, i.e. Y = [y_1 y_2 ... y_T] ∈ R^(m×T), and similarly for the solution vectors {x_t}_{t=1}^T and the error vectors {e_t}_{t=1}^T, the MMV problem can be reformulated using matrices:

Y = AX + E,    (1.3)

where Y ∈ R^(m×T) is the measurement matrix, E ∈ R^(m×T) is the error matrix and X ∈ R^(n×T) is the desired solution matrix. Using this formulation, the common sparsity assumption is then manifested as row sparsity of X. The non-zero rows of X are referred to as the true signals or sources.
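A small sketch of what a row-sparse MMV instance looks like (all sizes below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, T, k = 20, 100, 10, 3
A = rng.standard_normal((m, n))
rows = rng.choice(n, size=k, replace=False)   # the common support
X = np.zeros((n, T))
X[rows, :] = rng.standard_normal((k, T))      # k non-zero rows = the sources
E = 0.01 * rng.standard_normal((m, T))
Y = A @ X + E                                 # eq. (1.3)
# the row support is shared by every column x_t of X
print(np.flatnonzero(np.any(X != 0, axis=1)))
```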

Consider the case where multiple measurement vectors are available, but the underlying sparsity pattern is changing with time, i.e. the common sparsity assumption is violated. If the sparsity patterns at two subsequent time steps are independent, then the problem should be approached as independent SMV problems. On the other hand, if the sparsity pattern is changing slowly with time, this can be incorporated into the model. Such models are referred to as MMV with dynamic support as opposed to MMV with static support. The use of such models can be justified for many applications. For instance, this type of model is appropriate for EEG source localization due to the dynamic nature of the human brain [NS06, AM12].
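As a toy illustration of a slowly changing support (the drift probability and sizes are arbitrary assumptions, not a model from the cited work), one can let a single active location occasionally move between consecutive time steps:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 100, 50, 5
support = list(rng.choice(n, size=k, replace=False))
supports = []
for t in range(T):
    if rng.random() < 0.1:                    # occasionally one source moves
        support.remove(int(rng.choice(support)))
        inactive = [i for i in range(n) if i not in support]
        support.append(int(rng.choice(inactive)))
    supports.append(sorted(support))
# consecutive supports overlap heavily, unlike the independent-SMV case
print(sum(len(set(a) & set(b)) for a, b in zip(supports, supports[1:])) / (T - 1))
```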