
3.4 Downsampling - a crude approach to removing load changes

Finally, as a simple alternative, it was investigated whether the timing changes in the AEE signals could be removed by downsampling. Comparing the upper and lower panels of Figure 3.12, the timing changes visible in the upper panel have been removed. While the downsampling removed the timing changes, it did not remove the amplitude changes, so some load dependent processing would still be necessary. Alternatively, the AEE signals in each period could simply be summed to preserve energy; timing information would still be needed to select the proper intervals. With this setup the landmarks would be used like crank pulses in the crank angle conversion (section 2.4), i.e., the observed samples between each landmark would be summed. This can be seen as a more advanced use of domain knowledge compared to the work by Chandroth and Sharkey [1999], where domain knowledge is applied by only considering the most important part of the cycle.
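To make the landmark-based summation concrete, here is a minimal sketch (in Python, with hypothetical names not taken from the thesis code) that reduces each inter-landmark interval of an AEE cycle to a single summed energy value:

```python
import numpy as np

def sum_between_landmarks(aee, landmarks):
    """Sum AE energy samples between consecutive landmarks.

    Sketch only: `aee` is one cycle of the AEE signal and `landmarks` are
    sample indices of detected events, used here like crank pulses in the
    crank angle conversion.
    """
    landmarks = np.asarray(landmarks)
    # One summed value per inter-landmark interval, preserving energy.
    return np.array([aee[a:b].sum() for a, b in zip(landmarks[:-1], landmarks[1:])])

# Example: 4800 samples per cycle, 5 landmarks -> 4 load-independent values.
rng = np.random.default_rng(0)
cycle = rng.random(4800)
print(sum_between_landmarks(cycle, [0, 1200, 2400, 3600, 4800]))
```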

A simple test was conducted where the necessary downsampling factor for the whole cycle was roughly estimated from the obtained landmarks to be around 50. In the example in Figure 3.12 the downsampling factor is 96. Recall that downsampling with factor 96 amounts to lowpass filtering followed by selection of every 96th sample.


Figure 3.12: The timing differences have been downsampled away (factor 96).


The factor is higher than the rough estimate because all events need to fall within the same 96-sample block regardless of load; an event moving from one block to another would still produce timing changes. However, some faults were also removed in this way: the faulty water brake resulted in unstable timing of the events [Pontoppidan and Larsen, 2003], and downsampling with factor 8 led to decreased detection of that fault [Pontoppidan and Larsen, 2004].
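The sketch below illustrates this kind of downsampling with scipy, following the usual recommendation of decimating in stages for large factors; the factor and signal are synthetic, not the thesis data:

```python
import numpy as np
from scipy import signal

def downsample(x, factor):
    """Lowpass filter and keep every `factor`-th sample.

    Sketch only: scipy recommends decimating in stages for factors above
    roughly 13, so a factor of 96 is split into 8 * 12 here.
    """
    for q in (8, factor // 8) if factor > 13 else (factor,):
        x = signal.decimate(x, q, ftype="fir", zero_phase=True)
    return x

rng = np.random.default_rng(0)
cycle = rng.standard_normal(96 * 100)    # synthetic cycle, 9600 samples
print(downsample(cycle, 96).shape)       # -> (100,)
```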


Chapter 4

Condition modeling

In this chapter, I outline the various methods that I have used to model the engine condition in observed data. The methods share some important properties, namely being generative and capable of data reduction. The two main equations in this chapter are

$x = As + \nu, \qquad \nu \sim \mathcal{N}(0, \Sigma)$ (4.1)

$X = AS + \Gamma$, (4.2)

where $x$ is the observation vector of size $d \times 1$, $A$ the mixing matrix of size $d \times k$, $s$ the source signal of size $k \times 1$, and $\nu$ the additive noise, also of size $d \times 1$.

$d$ is the number of features and $k$ the number of components, with $k \ll d$.

The observation matrix $X$ is generated by stacking several realizations of the observation vector. Here the different realizations come from different engine cycles acquired with the same sensor, i.e., they are not simultaneously recorded as in the classical blind source separation problems [Bell and Sejnowski, 1995, Molgedey and Schuster, 1994]. Similarly, the source matrix $S$ and the noise matrix $\Gamma$ come from stacking the $N$ source vectors and noise vectors:

$X = \{x_1, x_2, \ldots, x_N\}$ (4.3)
$S = \{s_1, s_2, \ldots, s_N\}$ (4.4)
$\Gamma = \{\nu_1, \nu_2, \ldots, \nu_N\}$ (4.5)
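A small synthetic example of Equations 4.1 and 4.2 may help fix the matrix shapes; the dimensions are arbitrary and chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, N = 512, 5, 40                       # features per cycle, components, engine cycles

A = rng.standard_normal((d, k))            # mixing matrix: k "core" signals as columns
S = rng.standard_normal((k, N))            # source matrix: one activation vector per cycle
Gamma = 0.1 * rng.standard_normal((d, N))  # additive Gaussian noise, Sigma = sigma^2 I

X = A @ S + Gamma                          # Equation 4.2: stacked observations
x_n = X[:, 0]                              # a single cycle follows Equation 4.1
print(X.shape, x_n.shape)                  # (512, 40) (512,)
```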


The assumption that the noise is Gaussian with zero mean does not hold completely, as the sensor noise and the signal are added and squared in the root mean square (RMS). Being uncorrelated they add as energy signals, i.e., $\mathrm{rms} = \sqrt{s^2 + 2ns + n^2}$, with $s$ and $n$ denoting signal and noise respectively. Since the signal and noise are uncorrelated, the mean of the $2ns$ term is zero, and since the overall noise level is low compared to the signal, the remaining non-zero mean contribution from the $n^2$ term can be neglected.
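A quick numerical check of this argument, using a synthetic signal and noise level chosen only for illustration, confirms that the cross term averages to approximately zero and the noise energy is small compared to the signal energy:

```python
import numpy as np

rng = np.random.default_rng(2)
s = np.sin(np.linspace(0, 20 * np.pi, 100_000)) + 1.5   # stand-in "signal"
n = 0.05 * rng.standard_normal(s.size)                   # low-level sensor noise

print(np.mean(2 * n * s))   # cross term: close to 0 (uncorrelated)
print(np.mean(n ** 2))      # noise energy: small relative to ...
print(np.mean(s ** 2))      # ... the signal energy
```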

Equation 4.1 describes how the $k$ hidden signals in $A$ are weighted by the coefficients in $s$ to generate the observed signal $x$. In other words, the $A$ matrix contains those signal parts that the observed signals can be made up from - it acts like a basis for the normal condition. The idea is to learn this basis set from a collection of normal condition data, making the model capable of generating the different modes in the observed training data. By applying the component analysis methods, the orthogonal/independent directions in the observed data should result in a basis, i.e., columns of the mixing matrix, containing signatures with a clear descriptive quality - for example, source 3 (the third row of $S$) modeling the amplitude of the injector event signature held in column 3 of the mixing matrix. As Figure 4.1 shows, such clear descriptive quality is not always encountered, since the columns of the mixing matrix seem to model parts of all events in the cycle.
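As an illustration of learning such a basis, the sketch below uses a plain PCA/SVD decomposition; the thesis considers several component analysis methods, so this is only one possible instantiation, and the function names are hypothetical:

```python
import numpy as np

def fit_basis(X_normal, k):
    """Learn a k-column basis (mixing matrix) from normal-condition cycles.

    Minimal PCA sketch via the SVD; ICA or other component analysis methods
    would replace this step.
    """
    mean = X_normal.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X_normal - mean, full_matrices=False)
    return U[:, :k], mean                  # columns of A, plus the mean cycle

def encode(A, mean, x):
    """Project one cycle onto the basis: the source vector s in x ~ A s."""
    return A.T @ (x - mean.ravel())

# Usage: A, mu = fit_basis(X_normal, k=5); s = encode(A, mu, X_normal[:, 0])
```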

In the introduction of the chapter, data reduction was stated as a property of the models. Consider a group of observations as in Equation 4.2, and assume that $k < N$ and $k < d$; then the $Nd$ values of the observation matrix are modeled by the much smaller $k(d+N)$ values, i.e., each example $x$ is modeled by its $k$ source values multiplied onto the $k$ core signals in the mixing matrix.
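For example, with illustrative (not thesis-specific) sizes the reduction is roughly two orders of magnitude:

```python
d, N, k = 512, 1000, 5
full = d * N               # values in the observation matrix X
reduced = k * (d + N)      # values in A (d*k) plus S (k*N)
print(full, reduced, full / reduced)   # 512000 7560 ~68x reduction
```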

In addition, noise model assumptions can be made. In this thesis, two assumptions have been considered. Either $\Sigma = \sigma^2 I$, i.e., independent and identically distributed (iid) noise, which assumes a constant noise level throughout the engine cycle; or, alternatively, the more advanced diagonal covariance $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_d^2)$, where the noise level is assumed to vary throughout the engine cycle (see subsection 4.2.3).
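A minimal sketch of how the two noise assumptions would enter a Gaussian negative log-likelihood of the reconstruction residual; the function name and interface are assumptions, not the thesis implementation:

```python
import numpy as np

def residual_nll(r, sigma2):
    """Negative log-likelihood of a residual r = x - A s under Gaussian noise.

    `sigma2` is either a scalar (Sigma = sigma^2 I, constant noise level) or a
    length-d vector (diagonal Sigma, noise level varying over the cycle).
    Constants follow the standard multivariate Gaussian.
    """
    sigma2 = np.broadcast_to(np.asarray(sigma2, dtype=float), r.shape)
    return 0.5 * np.sum(r ** 2 / sigma2 + np.log(2 * np.pi * sigma2))
```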

Solely examples acquired under normal conditions have been modeled, but the same procedure could be repeated for other important faulty conditions, as this would enable the identification of these faults. This way the simple normal/faulty condition monitoring system could be expanded into a more complete diagnosis system.

The models describe signals originating from one sensor by already observed signals from that sensor. Conceptually, the model knows how that engine normally sounds with that sensor in that position. The knowledge of the normal sound pattern is used to output the negative log-likelihood (NLL) for each example, given the normal condition model and possibly the expected noise level.


Figure 4.1: Graphical explanation of the matrix setup. Here an observation matrix with 2 examples is expressed by the weighted sum of the two columns of the mixing matrix. The elements in the source matrix are the gains or activations of the core signals found in the observed data using the component analysis methods.

The identification of a fault is performed by monitoring the NLL against a threshold. Depending on the amount of data, this threshold can be established by comparing the NLL of known normal and faulty examples (supervised), or, when only normal examples are available, by selecting an inherent rejection rate of the normal examples. In chapter 5 the handling of the NLL values and thresholds is described further.
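A sketch of the unsupervised thresholding alternative, where the threshold is set from a chosen rejection rate on normal-condition NLL values only; the names and the default rate are illustrative:

```python
import numpy as np

def nll_threshold(nll_normal, reject_rate=0.01):
    """Set the detection threshold from normal-condition NLL values only.

    Chooses the threshold so that roughly `reject_rate` of the normal
    examples fall above it (the inherent rejection rate).
    """
    return np.quantile(nll_normal, 1.0 - reject_rate)

def is_faulty(nll_value, threshold):
    return nll_value > threshold
```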

4.1 Properties: Independent, orthogonal and uncorrelated

Prior to the description of the actual algorithms, some properties of variables are considered. Column vectors $x$ and $y$ are orthogonal, uncorrelated or statistically independent if the corresponding equations hold:

$x^T y = 0$ (4.6)
$E\{x^T y\} - E\{x\}^T E\{y\} = 0$ (4.7)
$\dfrac{p(x, y)}{p(x)\,p(y)} = 1$ (4.8)

Statistical independence means that the joint distribution can be factorized into the marginal distributions: $p(x, y) = p(x)\,p(y)$, making the numerator and denominator in Equation 4.8 equal.

Another definition of independence is

$E\{g_1(y_i)\, g_2(y_j)\} - E\{g_1(y_i)\}\, E\{g_2(y_j)\} = 0, \quad i \neq j$ (4.9)

However, this is not very convenient as it requires trying all measurable functions $g_1$ and $g_2$ [Hyvärinen, 1999]. Nevertheless, it shows what statistical independence is really about - that there should be absolutely no way of linking the observations in $y_i$ with those in $y_j$.
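The sketch below turns these definitions into simple sample-based checks; the tolerances and the single $(g_1, g_2)$ pair are arbitrary choices, so the last check is only a necessary condition for independence, not a proof of it:

```python
import numpy as np

def orthogonal(x, y, tol=1e-8):
    # Inner product is zero.
    return abs(x @ y) < tol

def uncorrelated(x, y, tol=1e-2):
    # Sample covariance is (close to) zero.
    return abs(np.mean(x * y) - x.mean() * y.mean()) < tol

def nonlinearly_decorrelated(x, y, g1=np.tanh, g2=np.square, tol=1e-2):
    # One (g1, g2) pair of the condition in Equation 4.9.
    return abs(np.mean(g1(x) * g2(y)) - np.mean(g1(x)) * np.mean(g2(y))) < tol
```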

For jointly Gaussian distributions, independent and uncorrelated are equivalent [Hyvärinen, 1999], since the covariance matrix becomes diagonal and the distribution factorizes into the marginal distributions. For zero mean signals, uncorrelated and orthogonal are equivalent; combined, these two properties imply that zero mean, orthogonal Gaussian variables are independent. However, this is a problem because the independent component analysis (ICA) algorithms require that at most one source is Gaussian in order to recover the mixing matrix directions. This is due to the summation property of the alpha-stable distributions, of which the Cauchy and the Gaussian are the most common. The Cauchy distribution, which has heavier tails than the Gaussian, was considered as a source prior for monaural ICA in my Masters Thesis [Dyrholm and Pontoppidan, 2002]. The summation property is that adding two independently drawn numbers from a Gaussian (or Cauchy) results in a sum that follows a Gaussian (or Cauchy) distribution with changed parameters [Conradsen, 1995]. Another way of looking at the summation property is that convolving two alpha-stable distributions of the same family results in a new distribution of that family, i.e., it only changes the parameter values. In contrast, adding two uniform stochastic variables results in a triangular distribution, and thus the family changes. For deeper insight on alpha-stable distributions see Kidmose [2001].
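The summation property, and its failure for the uniform family, can be verified numerically; the following is just an illustrative simulation:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
g = rng.standard_normal((2, 100_000)).sum(axis=0)   # Gaussian + Gaussian
u = rng.uniform(0, 1, (2, 100_000)).sum(axis=0)     # uniform + uniform

# The Gaussian sum stays Gaussian (excess kurtosis ~ 0); the uniform sum
# becomes triangular, i.e., it leaves the uniform family.
print(stats.kurtosis(g))            # ~ 0
print(np.histogram(u, bins=10)[0])  # counts rise towards the middle: triangular
```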

In order to investigate what the statistical independence and orthogonality constraints imply for the source separation and identification problem, a small example with synthetic data is conducted. Figure 4.2 shows the results of a little experiment with two classes of data, where ICA is capable of discriminating the