Academic year: 2023


02402 Indicative solutions to homework assignments and exercises, Week 12

0.1 Exercise 12.6, page 368 (7ed: 12.6, page 414 and 6ed: 12.6, page 403)

This exercise is a standard one-way analysis of variance. We want to investigate whether the three lubricants perform equally well, or whether a difference between them can be demonstrated; such a difference would show up as a difference in the weight loss due to friction.

We often plot the data as in the following figure. From the figure we see that the spread of the data is approximately the same for the three groups A, B and C, so the group variances can be assumed to be equal (there are formal tests of this hypothesis; one of them is Bartlett’s test, but there are several).

[Figure: plot of the data for lubricants A, B and C on a common axis from 5 to 25]

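The mathematical model (cf. page 405) is:

\[ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij} \]

where $\alpha_1 + \alpha_2 + \alpha_3 = 0$, it is assumed that $\varepsilon_{ij} \sim N(0, \sigma^2)$, and $\mu$ is a constant.

Hypothesis: $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$ and $H_1$: all alternatives.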

Lubricant   Data                                              Sum     Sum of squares
A           12.2  11.8  13.1  11.0   3.9   4.1  10.3   8.4     74.8       789.36
B           10.9   5.7  13.5   9.4  11.4  15.7  10.8  14.0     91.4      1111.00
C           12.7  19.9  13.6  11.7  18.3  14.3  22.8  20.4    133.7      2354.53
Total                                                         299.9      4254.89

Total sum of squares: $SST = 4254.89 - 299.9^2/24 = 507.39$
Sum of squares between treatments: $SS_{treatment} = (74.8^2 + 91.4^2 + 133.7^2)/8 - 299.9^2/24 = 230.59$
Sum of squares within treatments: $SSE = SST - SS_{treatment} = 276.80$

These sums of squares are entered in the following analysis of variance table:

Source of variation   Sum of squares   Degrees of freedom   s^2      F-value
Lubricants                230.59       3 - 1 = 2            115.29   8.75
Error                     276.80       3(8 - 1) = 21         13.18
Total variation           507.39       24 - 1 = 23
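The sums of squares and the F-value in the table can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `one_way_anova` and the dictionary layout are illustrative choices, not something taken from the book.

```python
# One-way ANOVA for the lubricant data (exercise 12.6), standard library only.
data = {
    "A": [12.2, 11.8, 13.1, 11.0, 3.9, 4.1, 10.3, 8.4],
    "B": [10.9, 5.7, 13.5, 9.4, 11.4, 15.7, 10.8, 14.0],
    "C": [12.7, 19.9, 13.6, 11.7, 18.3, 14.3, 22.8, 20.4],
}


def one_way_anova(groups):
    """Return (SST, SStreatment, SSE, df_treatment, df_error, F)."""
    values = [y for g in groups.values() for y in g]
    n_total = len(values)
    correction = sum(values) ** 2 / n_total  # (grand total)^2 / N
    sst = sum(y * y for y in values) - correction
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups.values()) - correction
    sse = sst - ss_treat
    df_treat = len(groups) - 1
    df_error = n_total - len(groups)
    f_value = (ss_treat / df_treat) / (sse / df_error)
    return sst, ss_treat, sse, df_treat, df_error, f_value


sst, ss_treat, sse, df_treat, df_error, f_value = one_way_anova(data)
print(f"SST = {sst:.2f}, SStreatment = {ss_treat:.2f}, SSE = {sse:.2f}")
print(f"F({df_treat},{df_error}) = {f_value:.2f}")
```

The same function also works when the groups have different sizes, because each group total is divided by its own number of observations.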

The calculated F-value should be compared with an F(2,21)-distribution, for which $F(2,21)_{0.01} = 5.78$.

[Figure: density of the F(2,21)-distribution with the critical value 5.78 for $\alpha = 0.01$ and the observed value 8.75 marked]

We see that the F-value for the lubricants (8.75) lies in the critical region for a test at the 1% significance level.

We therefore reject $H_0$ and conclude that the mean values for the three lubricants are not all equal.

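Conclusion:

\[ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij} \]

Now we estimate all the parameters of the model.

Parameter     Estimate                         Calculated value
$\mu$         $\bar{Y}_{..}$                   299.9/24 = 12.50
$\alpha_1$    $\bar{Y}_{1.} - \bar{Y}_{..}$    74.8/8 - 12.50 = -3.15
$\alpha_2$    $\bar{Y}_{2.} - \bar{Y}_{..}$    91.4/8 - 12.50 = -1.07
$\alpha_3$    $\bar{Y}_{3.} - \bar{Y}_{..}$    133.7/8 - 12.50 = 4.22
$\sigma^2$    $s^2_{rest}$                     13.18 = 3.63^2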

In most cases we are interested in the mean values of the individual groups, that is $\mu_i = \mu + \alpha_i$, and we would also like to state a confidence interval for each group mean.

The general formula for a two-sided $(1-\alpha)$ confidence interval for the mean of a group is:

\[ I[\mu_i]_{1-\alpha} = \bar{Y}_{i.} \pm \frac{s_{rest}}{\sqrt{n_i}} \cdot t(f_{rest})_{\alpha/2} \]

where $\bar{Y}_{i.}$ is the mean of the group, $n_i$ is the number of measurements in the group, $s_{rest}$ is the estimate of $\sigma$ with $f_{rest}$ degrees of freedom, and $t(f_{rest})_{\alpha/2}$ is the $\alpha/2$ value of the t-distribution with $f_{rest}$ degrees of freedom.

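\[ I\begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix}_{0.95} = \begin{bmatrix} 9.35 \\ 11.43 \\ 16.72 \end{bmatrix} \pm \frac{3.63}{\sqrt{8}} \cdot t(21)_{0.025} = \begin{bmatrix} 9.35 \pm 2.67 \\ 11.43 \pm 2.67 \\ 16.72 \pm 2.67 \end{bmatrix} \]

since $t(21)_{0.025} = 2.08$ and $\frac{3.63}{\sqrt{8}} \cdot 2.08 = 2.67$ are the same for all three groups ($n_1 = n_2 = n_3 = 8$).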
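As a rough check of the three intervals, the half-width and the end points can be computed directly. The sketch below takes the group sums, $s_{rest} = 3.63$ and the table value $t(21)_{0.025} = 2.08$ from the text; the variable names are illustrative.

```python
import math

group_sums = {"A": 74.8, "B": 91.4, "C": 133.7}  # group sums from the data table
n = 8            # observations per group
s_rest = 3.63    # square root of the error mean square (21 degrees of freedom)
t_crit = 2.08    # t(21) at 0.025, table value quoted in the text

# Same half-width for all groups because the group sizes are equal.
half_width = s_rest / math.sqrt(n) * t_crit

intervals = {
    name: (total / n - half_width, total / n + half_width)
    for name, total in group_sums.items()
}
print(f"half-width = {half_width:.2f}")
for name, (lower, upper) in intervals.items():
    print(f"{name}: [{lower:.2f}, {upper:.2f}]")
```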

The three confidence intervals are shown in the following figure.

[Figure: 95% confidence intervals for the means of lubricants A, B and C on a common axis from 5 to 25]

Note that even though the intervals overlap, we have found a significant difference in the underlying mean values.

There are several methods for checking whether all the mean values can be regarded as different, or whether only one of the three mean values differs from the others. One such method is Duncan's multiple range test, which is also mentioned in the book, but there are several others.

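Finally we could find a 95% confidence interval for $\sigma^2$:

\[ I[\sigma^2]_{0.95} = \left[ \frac{f \cdot \hat{\sigma}^2}{\chi^2(f)_{0.025}} ,\; \frac{f \cdot \hat{\sigma}^2}{\chi^2(f)_{0.975}} \right] \]

where $f$ is the number of degrees of freedom of the estimate $\hat{\sigma}^2$.

We get $f = 21$, $\hat{\sigma}^2 = 3.63^2$, $\chi^2(21)_{0.025} = 35.479$ and $\chi^2(21)_{0.975} = 10.283$, so that

\[ I[\sigma^2]_{0.95} = \left[ \frac{21 \cdot 3.63^2}{35.479} ,\; \frac{21 \cdot 3.63^2}{10.283} \right] = [\, 2.79^2 ,\; 5.19^2 \,] \]

Model control: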

As in a regression analysis, we should check that the residuals can be assumed to be normally distributed.

A simple and frequently used method is to draw a normal probability plot of the deviations between the data and the fitted model, i.e. of the residuals.

The residuals are in our case simply the deviations from the mean within each group, since

\[ \hat{\varepsilon}_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \hat{\mu} - \hat{\alpha}_i = y_{ij} - \bar{y}_{i.} \]

As an estimate of the standard deviation of the data we get $\hat{\sigma} = 3.63$.

[Figure: normal probability plot of the residuals for exercise 12.6; horizontal axis: normal scores, from -2.5 to 2.5; vertical axis: residuals, from -8 to 8]

If the mean of the residuals is called $\bar{\varepsilon}$ and their estimated standard deviation is called $\hat{\sigma}$, then the reference line in the plot goes through the point $(0, \bar{\varepsilon})$ and has slope $\hat{\sigma}$. In this case we have $\bar{\varepsilon} = 0$ and $\hat{\sigma} = 3.63$. In a usual analysis of variance (where there is a constant term) the mean of the residuals is 0.
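The points of such a plot can be produced as follows. This is a minimal sketch: the residuals are the deviations from the group means as described above, while the normal scores use Blom's plotting positions $(i - 0.375)/(n + 0.25)$, which is an assumed convention and not necessarily the one used for the figure.

```python
import math
from statistics import NormalDist, mean

data = {
    "A": [12.2, 11.8, 13.1, 11.0, 3.9, 4.1, 10.3, 8.4],
    "B": [10.9, 5.7, 13.5, 9.4, 11.4, 15.7, 10.8, 14.0],
    "C": [12.7, 19.9, 13.6, 11.7, 18.3, 14.3, 22.8, 20.4],
}

# Residuals: deviation of each observation from its own group mean, sorted.
residuals = sorted(y - mean(g) for g in data.values() for y in g)
n = len(residuals)
df_error = n - len(data)

# Estimated standard deviation: sqrt(SSE / error degrees of freedom).
sigma_hat = math.sqrt(sum(r * r for r in residuals) / df_error)

# Normal scores from Blom's plotting positions (assumed convention).
scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

print(f"mean residual = {mean(residuals):.2e}, sigma_hat = {sigma_hat:.2f}")
for z, r in zip(scores, residuals):
    # Third column: height of the reference line through (0, 0) with slope sigma_hat.
    print(f"{z:6.2f} {r:7.2f} {sigma_hat * z:7.2f}")
```

Plotting the second column against the first gives the normal probability plot; the third column is the straight line the points should follow if the residuals are normally distributed.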

The figure shows a slight deviation from the normal distribution, but we judge it not to be so dramatic that it would call the result of the analysis into question. To settle the question you would have to carry out a formal test. You could also investigate the residuals a little more carefully, and for instance check whether the largest residuals come from the same group.

I haven’t done the calculations, but I am pretty sure that a test for normality (among the tests we would normally use) would not show a significant deviation in this case.

Exercise 12.10, page 369 (7ed: 12.10, page 415 and 6ed: 12.10, page 404)

This exercise is just like exercise 12.6, except that the number of measurements varies between the groups. The model and the approach are the same; only the calculation of the sums of squares differs.

The data are plotted in the following figure. From the figure we see that the spread of the data within the four groups 1, 2, 3 and 4 is almost the same, so the variance within the groups can be assumed to be equal. The mean value of each group is indicated with a short horizontal mark.

[Figure: plot of the data for Alloy 1, Alloy 2, Alloy 3 and Alloy 4; axis: Current, from 0.980 to 1.100]

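The mathematical model (cf. page 405) is:

\[ Y_{ij} = \mu + \alpha_i + \varepsilon_{ij} \]

where $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4 = 0$, it is assumed that $\varepsilon_{ij} \sim N(0, \sigma^2)$, and $\mu$ is a constant.

Hypothesis: $H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$ and $H_1$: all alternatives.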

Alloy      (Data - 1.000) × 1000         n    Sum   Squared sum
Alloy 1     85   16    9   34            4    144       8718
Alloy 2     51   -7   22                 3     66       3134
Alloy 3    -15    1  -10  -12   11       5    -25        591
Alloy 4    101   15                      2    116      10426
Total                                   14    301      22869

Total sum of squares: $SST = 22869 - 301^2/14 = 16397.5$
Treatment sum of squares: $SS_{treatment} = 144^2/4 + 66^2/3 + (-25)^2/5 + 116^2/2 - 301^2/14 = 7017.5$
Error sum of squares: $SSE = SST - SS_{treatment} = 9380.0$

These sums of squares are entered in the analysis of variance table below:

Source of variation   Sum of squares   Degrees of freedom   s^2      F-value
Treatment                 7017.5       4 - 1 = 3            2339.2   2.49
Error                     9380.0       14 - 4 = 10           938.0
Total variation          16397.5       14 - 1 = 13
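With unequal group sizes the only change in the calculation is that each group total is divided by its own $n_i$. The following sketch recomputes the table from the coded data, (Data - 1.000) × 1000; the variable names are illustrative.

```python
# One-way ANOVA with unequal group sizes (exercise 12.10), coded data.
alloys = {
    "Alloy 1": [85, 16, 9, 34],
    "Alloy 2": [51, -7, 22],
    "Alloy 3": [-15, 1, -10, -12, 11],
    "Alloy 4": [101, 15],
}

values = [y for g in alloys.values() for y in g]
n_total = len(values)
correction = sum(values) ** 2 / n_total  # (grand total)^2 / N

sst = sum(y * y for y in values) - correction
# Each squared group total is divided by that group's own size.
ss_treat = sum(sum(g) ** 2 / len(g) for g in alloys.values()) - correction
sse = sst - ss_treat

df_treat = len(alloys) - 1
df_error = n_total - len(alloys)
f_value = (ss_treat / df_treat) / (sse / df_error)

print(f"SST = {sst:.1f}, SStreatment = {ss_treat:.1f}, SSE = {sse:.1f}")
print(f"F({df_treat},{df_error}) = {f_value:.2f}")
```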

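The calculated F-value should be compared with an F(3,10)-distribution, for which $F(3,10)_{0.05} = 3.71$ and $F(3,10)_{0.01} = 6.55$.

[Figure: density of the F(3,10)-distribution with the critical values 3.71 ($\alpha = 0.05$) and 6.55 ($\alpha = 0.01$) and the observed value 2.49 marked]

We see that the F-value (2.49) does not lie in the critical region for a test at either the 5% or the 1% significance level.

We therefore accept $H_0$ (that is, we cannot reject it) and conclude that the mean values for the four alloys are not significantly different.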

Conclusion: $\alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$, and the reduced model is then

\[ Y_{ij} = \mu + \varepsilon_{ij} \]

Now we estimate the parameters of the model:

Parameter     Estimate          Calculated value
$\mu$         $\bar{Y}_{..}$    301/14 = 21.5
$\sigma^2$    $s^2_{total}$     16397.5/13 = 35.52^2

We could state a 95% confidence interval for the common mean value:

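\[ I[\mu]_{1-\alpha} = \bar{Y}_{..} \pm \frac{\hat{\sigma}}{\sqrt{N}} \cdot t(N-1)_{\alpha/2} \]

where $\bar{Y}_{..}$ is the common mean, $N$ is the total number of measurements and $t(N-1)_{\alpha/2}$ is the $\alpha/2$ value of the t-distribution with $N-1$ degrees of freedom. Since $t(13)_{0.025} = 2.16$ we get

\[ I[\mu]_{0.95} = 21.50 \pm \frac{35.52}{\sqrt{14}} \cdot t(13)_{0.025} = 21.50 \pm 20.51 \simeq [\, 1.0 ,\; 42.0 \,] \]

In the original units the interval is:

\[ I[\mu]_{0.95} = [\, 1.001 ,\; 1.042 \,] \text{ Ampere} \]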
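The interval and its conversion back to the original units can be sketched as follows. The sum 301, $SST = 16397.5$ and the table value $t(13)_{0.025} = 2.16$ are taken from the text; the back-transformation simply undoes the coding (Data - 1.000) × 1000.

```python
import math

n_total = 14      # total number of measurements
total = 301       # sum of the coded data
sst = 16397.5     # total sum of squares of the coded data
t_crit = 2.16     # t(13) at 0.025, table value quoted in the text

mean_coded = total / n_total
s_total = math.sqrt(sst / (n_total - 1))  # estimate of sigma in the reduced model
half_width = s_total / math.sqrt(n_total) * t_crit

lower_coded = mean_coded - half_width
upper_coded = mean_coded + half_width

# Undo the coding: original value = 1.000 + coded value / 1000.
lower_amp = 1.000 + lower_coded / 1000
upper_amp = 1.000 + upper_coded / 1000

print(f"coded: {mean_coded:.2f} +/- {half_width:.2f}")
print(f"original units: [{lower_amp:.3f}, {upper_amp:.3f}] Ampere")
```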

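We could also have estimated a 95% confidence interval for $\sigma^2$:

\[ I[\sigma^2]_{0.95} = \left[ \frac{f \cdot \hat{\sigma}^2}{\chi^2(f)_{0.025}} ,\; \frac{f \cdot \hat{\sigma}^2}{\chi^2(f)_{0.975}} \right] \]

where $f$ is the number of degrees of freedom of the estimate $\hat{\sigma}^2$. We have $f = 10$, $\hat{\sigma}^2 = 35.52^2$, $\chi^2(10)_{0.025} = 20.483$ and $\chi^2(10)_{0.975} = 3.247$, so that

\[ I[\sigma^2]_{0.95} = \left[ \frac{10 \cdot 35.52^2}{20.483} ,\; \frac{10 \cdot 35.52^2}{3.247} \right] \simeq [\, 25^2 ,\; 62^2 \,] \]

In the original units

\[ I[\sigma^2]_{0.95} \simeq [\, 0.025^2 ,\; 0.062^2 \,] \text{ Ampere}^2 \]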


Exercise 12.47 page 388 (7ed: 12.50, page 446 and 6ed: 12.48, page 434)

This exercise is a typical example of a randomized block design with one measurement per ’cell’, that is, per combination of ’Agency’ and ’Site’.

               Site A    Site B    Site C    Site D    Site E       Sum
Agency 1        23.8       7.6      15.4      30.6       4.2       81.6
Agency 2        19.2       6.8      13.2      22.5       3.9       65.6
Agency 3        20.9       5.9      14.0      27.1       3.0       70.9
Sum             63.9      20.3      42.6      80.2      11.1      218.1
Squared sum   1371.89    138.81    607.40   2177.02     41.85    4336.97

The usual mathematical model is (cf. page 418):

\[ Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij} \]

where $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, it is assumed that $\varepsilon_{ij} \sim N(0, \sigma^2)$, and $\mu$ is a constant.

In the model, $\sigma^2$ is the variance of the measurement uncertainty in the experiment. It includes partly the variation of the chemical method and partly the variation between the samples taken at the same ’Site’.

The data are plotted in the following figure. From this we see that the three ’Agencies’ approximately follow the level of the ’Sites’. This supports the chosen model, which assumes additivity between ’Agencies’ and ’Sites’ in the measurement result (there may be a tendency towards larger variation at higher concentrations; a widely used method to compensate for this is shown as an example at the end of this solution).

[Figure: concentration (axis from 5 to 35) at Site 1 to Site 5 for Agency 1, Agency 2 and Agency 3]

This exercise does not ask you to check if the ’Sites’ are equal (it is obvious that they aren’t), but instead it is desired to check if the three ’Agencies’ can be equal, i.e., do you find consistent results, when the concentration is varied.

Hypothesis: $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$ and $H_1$: all alternatives.

The following calculations are performed:

Total sum of squares: $SST = 4336.97 - 218.1^2/15 = 1165.80$
Sum of squares between ’Agencies’: $SS_{agency} = (81.6^2 + 65.6^2 + 70.9^2)/5 - 218.1^2/15 = 26.57$
Sum of squares between ’Sites’: $SS_{site} = (63.9^2 + 20.3^2 + 42.6^2 + 80.2^2 + 11.1^2)/3 - 218.1^2/15 = 1117.26$
Error sum of squares: $SSE = SST - SS_{agency} - SS_{site} = 21.96$

These sums of squares are entered in the following analysis of variance table:

Source of variation   Sum of squares   Degrees of freedom        s^2      F
Agencies                   26.57       3 - 1 = 2                  13.29   4.84
Sites (blocks)           1117.26       5 - 1 = 4                 279.32   (101.57)
Error                      21.96       (3 - 1) × (5 - 1) = 8       2.75
Total variation          1165.80       15 - 1 = 14
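The randomized block calculations can be reproduced from the data table with the sketch below (standard library only; the variable names are illustrative). Row totals give the ’Agency’ sum of squares and column totals give the ’Site’ sum of squares.

```python
# Randomized block ANOVA (exercise 12.47): rows are agencies, columns are sites A to E.
table = {
    "Agency 1": [23.8, 7.6, 15.4, 30.6, 4.2],
    "Agency 2": [19.2, 6.8, 13.2, 22.5, 3.9],
    "Agency 3": [20.9, 5.9, 14.0, 27.1, 3.0],
}

rows = list(table.values())
a = len(rows)        # number of agencies (treatments)
b = len(rows[0])     # number of sites (blocks)
values = [y for row in rows for y in row]
correction = sum(values) ** 2 / (a * b)

sst = sum(y * y for y in values) - correction
ss_agency = sum(sum(row) ** 2 for row in rows) / b - correction
ss_site = sum(sum(col) ** 2 for col in zip(*rows)) / a - correction
sse = sst - ss_agency - ss_site

df_agency, df_site, df_error = a - 1, b - 1, (a - 1) * (b - 1)
ms_error = sse / df_error
f_agency = (ss_agency / df_agency) / ms_error
f_site = (ss_site / df_site) / ms_error

print(f"SST = {sst:.2f}, SSagency = {ss_agency:.2f}, SSsite = {ss_site:.2f}, SSE = {sse:.2f}")
print(f"F_agency({df_agency},{df_error}) = {f_agency:.2f}")
print(f"F_site({df_site},{df_error}) = {f_site:.2f}")
```

Computed without intermediate rounding, the F-value for ’Sites’ comes out slightly above the 101.57 in the table, which is obtained by dividing by the rounded mean square 2.75; the conclusion is unaffected.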

The calculated F-value for ’Agencies’ should be compared with an F(2,8)-distribution, for which $F(2,8)_{0.05} = 4.46$ and $F(2,8)_{0.01} = 8.65$.

[Figure: density of the F(2,8)-distribution with the critical values 4.46 ($\alpha = 0.05$) and 8.65 ($\alpha = 0.01$) and the observed value 4.84 marked]

We see that there is a significant difference between the ’Agencies’ in a test at the 5% level, but the difference is not significant at the 1% level. One would conclude that there probably is a certain difference between the three ’Agencies’.

You could also perform a test to check whether the ’Sites’ are different (even though this is not what is asked for). The calculated $F_{sites} = 101.57 \gg F(4,8)_{0.01} = 7.01$, so there is a (huge) difference between the ’Sites’ (as was also seen in the figure).

Conclusion:

\[ Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij} \]

Now we estimate all the parameters of the model.

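Parameter     Estimate                         Calculated value
$\mu$         $\bar{Y}_{..}$                   218.1/15 = 14.54
$\alpha_1$    $\bar{Y}_{1.} - \bar{Y}_{..}$    81.6/5 - 14.54 = 1.78
$\alpha_2$    $\bar{Y}_{2.} - \bar{Y}_{..}$    65.6/5 - 14.54 = -1.42
$\alpha_3$    $\bar{Y}_{3.} - \bar{Y}_{..}$    70.9/5 - 14.54 = -0.36
$\beta_1$     $\bar{Y}_{.1} - \bar{Y}_{..}$    63.9/3 - 14.54 = 6.76
$\beta_2$     $\bar{Y}_{.2} - \bar{Y}_{..}$    20.3/3 - 14.54 = -7.77
$\beta_3$     $\bar{Y}_{.3} - \bar{Y}_{..}$    42.6/3 - 14.54 = -0.34
$\beta_4$     $\bar{Y}_{.4} - \bar{Y}_{..}$    80.2/3 - 14.54 = 12.19
$\beta_5$     $\bar{Y}_{.5} - \bar{Y}_{..}$    11.1/3 - 14.54 = -10.84
$\sigma^2$    $s^2_{rest}$                     2.75 = 1.66^2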
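The estimates, including their signs, can be checked by subtracting the grand mean from the row and column means. The sketch below is illustrative only; it also confirms that the estimated effects satisfy the side conditions $\sum_i \alpha_i = 0$ and $\sum_j \beta_j = 0$.

```python
from statistics import mean

rows = [
    [23.8, 7.6, 15.4, 30.6, 4.2],   # Agency 1, sites A to E
    [19.2, 6.8, 13.2, 22.5, 3.9],   # Agency 2
    [20.9, 5.9, 14.0, 27.1, 3.0],   # Agency 3
]

grand_mean = mean(y for row in rows for y in row)
alpha_hat = [mean(row) - grand_mean for row in rows]        # agency effects
beta_hat = [mean(col) - grand_mean for col in zip(*rows)]   # site effects

print(f"mu_hat = {grand_mean:.2f}")
print("alpha_hat =", [round(a, 2) for a in alpha_hat])
print("beta_hat  =", [round(b, 2) for b in beta_hat])
```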

We could be interested in the mean values of the individual ’Agencies’, that is $\mu_i = \mu + \alpha_i$, and in confidence intervals for the individual mean values.

The formula for a two-sided $(1-\alpha)$ confidence interval for the mean value in a given group (an ’Agency’) is:

\[ I[\mu_i]_{1-\alpha} = \bar{Y}_{i.} \pm \frac{s_{rest}}{\sqrt{n_i}} \cdot t(f_{rest})_{\alpha/2} \]

where $\bar{Y}_{i.}$ is the mean of the group, $n_i$ is the number of measurements in the group, $s_{rest}$ is the estimate of $\sigma$ with $f_{rest}$ degrees of freedom, and $t(f_{rest})_{\alpha/2}$ is the $\alpha/2$ value of the t-distribution with $f_{rest}$ degrees of freedom.

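\[ I\begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix}_{0.95} = \begin{bmatrix} 16.32 \\ 13.12 \\ 14.18 \end{bmatrix} \pm \frac{1.66}{\sqrt{5}} \cdot t(8)_{0.025} = \begin{bmatrix} 16.32 \pm 1.71 \\ 13.12 \pm 1.71 \\ 14.18 \pm 1.71 \end{bmatrix} \]

since $t(8)_{0.025} = 2.306$ and $\frac{1.66}{\sqrt{5}} \cdot 2.306 = 1.71$ are the same for all the groups ($n_1 = n_2 = n_3 = 5$).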

The three confidence intervals are illustrated in the following figure, where also the data is shown, marked with ’Site’ number. Notice that the confidence intervals don’t make any sense without knowing which specific ’Site’, we are dealing with.

[Figure: 95% confidence intervals for Agency 1, Agency 2 and Agency 3 together with the data points, each marked with its ’Site’ number 1 to 5; axis: concentration, from 5 to 35]

We could also notice that the intervals overlap each other a lot, even though we found a significant difference between the three ’Agencies’.

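Again we could find a 95% confidence interval for $\sigma^2$:

\[ I[\sigma^2]_{0.95} = \left[ \frac{f \cdot \hat{\sigma}^2}{\chi^2(f)_{0.025}} ,\; \frac{f \cdot \hat{\sigma}^2}{\chi^2(f)_{0.975}} \right] \]

where $f$ is the number of degrees of freedom of the estimate $\hat{\sigma}^2$.

We have $f = 8$, $\hat{\sigma}^2 = 1.66^2$, $\chi^2(8)_{0.025} = 17.535$ and $\chi^2(8)_{0.975} = 2.180$, so that

\[ I[\sigma^2]_{0.95} = \left[ \frac{8 \cdot 1.66^2}{17.535} ,\; \frac{8 \cdot 1.66^2}{2.180} \right] \simeq [\, 1.12^2 ,\; 3.18^2 \,] \]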
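The arithmetic of this interval is easy to check. In the sketch below the two $\chi^2$ values are the table values quoted in the text (they are not computed), and the limits are reported on the standard deviation scale, matching the squared form used above.

```python
import math

f = 8                  # error degrees of freedom
sigma2_hat = 1.66 ** 2 # estimate of sigma^2 (error mean square, rounded as in the text)
chi2_0025 = 17.535     # chi-square(8) value with 2.5% in the upper tail (table value)
chi2_0975 = 2.180      # chi-square(8) value with 97.5% in the upper tail (table value)

var_lower = f * sigma2_hat / chi2_0025
var_upper = f * sigma2_hat / chi2_0975

# Square roots give the limits in the "squared" form used in the text.
sd_lower, sd_upper = math.sqrt(var_lower), math.sqrt(var_upper)
print(f"sigma^2 in [{sd_lower:.2f}^2, {sd_upper:.2f}^2]")
```

Dividing by the larger $\chi^2$ value gives the lower limit, which is why the 0.025 value appears in the first denominator.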

Dec04.4

The correct answer is number 5, since we have a ”two-sample” situation (different people in the two groups).

Dec04.7

The correct answer is number 4, since the same patients are measured several times (= randomized blocks).

Dec04.14

We have a two-way analysis of variance (= randomized blocks), so the number of degrees of freedom for the error (the denominator in the F-test) is $(5-1)(4-1) = 12$, and since five groups are to be compared, the number of degrees of freedom for the numerator is $5 - 1 = 4$. The correct answer is therefore number 2.

Dec04.15

This is an analysis of variance (different animals in different herds). The usual test statistic is therefore (page 406 (396)):

\[ F = \frac{SS(Tr)/(3-1)}{SSE/(166-3)} = \frac{SS(Tr)/2}{SSE/163} \]

From the information given, we have:

\[ SS(Tr) = 39 \cdot (4.823 - 4.896)^2 + 67 \cdot (5.293 - 4.896)^2 + 60 \cdot (4.499 - 4.896)^2 \]

and

\[ SSE = 38 \cdot 1.654^2 + 66 \cdot 1.706^2 + 59 \cdot 1.746^2 \]

and the correct answer is therefore number 5.
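For reference, the two sums of squares and the resulting F-value can be evaluated from the summary statistics alone. In the sketch below the tuples (size, mean, standard deviation) are read off the two formulas above, and the grand mean 4.896 is the value used there; the names are illustrative.

```python
# (group size, group mean, group standard deviation), read off the formulas above
groups = [
    (39, 4.823, 1.654),
    (67, 5.293, 1.706),
    (60, 4.499, 1.746),
]
grand_mean = 4.896  # overall mean used in the text

n_total = sum(n for n, _, _ in groups)
k = len(groups)

# Between-group sum of squares: sum of n_i * (mean_i - grand mean)^2
ss_tr = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
# Within-group sum of squares: sum of (n_i - 1) * s_i^2
sse = sum((n - 1) * s ** 2 for n, _, s in groups)

f_value = (ss_tr / (k - 1)) / (sse / (n_total - k))
print(f"SS(Tr) = {ss_tr:.3f}, SSE = {sse:.3f}, F({k - 1},{n_total - k}) = {f_value:.2f}")
```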


Dec04.25

By comparing the two p-values with 0.05, we get that the correct answer is answer number 2.

Dec04.26

If we disregard the information that the same taster tastes all three groups of cheese, the variation within the groups (SSE) is equal to the total variation minus the cheese variation, or (bottom of page 407 (397)):

\[ SSE_{Incorrect} = SST - SS(Tr) \]

For the correct analysis we have (page 419 (408)):

\[ SSE_{Correct} = SST - SS(Tr) - SS(Bl) \]

And then:

\[ SSE_{Incorrect} = SSE_{Correct} + SS(Bl) = 80.6120 + 18.0260 \]

with the corresponding degrees of freedom:

\[ DFE_{Incorrect} = DFE_{Correct} + DF(Bl) = 18 + 9 = 27 \]

and the correct answer is number 1.

Rex11.3.1

In the first analysis, where the ’kommune’ (municipality) is part of the analysis, we can see that there are 270 ’kommuner’ in total in the study (and 1555 schools). In this analysis we test the hypothesis:

\[ H_0: \mu_1 = \mu_2 = \dots = \mu_{270} \]

where $\mu_i$ is the level for the $i$'th ’kommune’. The p-value for this hypothesis is 0.05406.

Therefore we cannot demonstrate a difference between the ’kommuner’ (since the p-value is larger than the usual level of 5%). The variance (from school to school) within the ’kommuner’ is estimated as 0.3412, or equivalently: the school standard deviation within the ’kommuner’ is

\[ \sqrt{0.3412} = 0.584 \]

In the other analysis, where the ’amter’ (counties) are analyzed, we can see that there are 16 ’amter’ in total in the study (and, as before, 1555 schools in total). In this analysis we test the hypothesis:

\[ H_0: \mu_1 = \mu_2 = \dots = \mu_{16} \]

where $\mu_i$ is the level for the $i$'th ’amt’. The p-value for this hypothesis is 0.000038.

Therefore we can demonstrate a difference between the ’amter’ (since the p-value is small compared with the usual level of 5%). The variance (from school to school) within the ’amter’ is estimated as 0.3435, or equivalently: the school standard deviation within the ’amter’ is

\[ \sqrt{0.3435} = 0.586 \]


Rex12.3.1

The model can be written as (page 418):

\[ Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij} \]

where $\alpha_i$ gives the effect of the $i$'th thread and $\beta_j$ the effect of the $j$'th instrument.

There are 5 kinds of thread and 4 kinds of instrument. The following two hypotheses are tested:

\[ H_0: \alpha_1 = \alpha_2 = \dots = \alpha_5 \]

and

\[ H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 \]

The p-value for the first hypothesis, of no difference between the threads, is read from the table in the printout as 0.0018781, so there is a significant difference between the threads. The p-value for the second hypothesis, of no difference between the instruments, is read from the table in the printout as 0.9835259, so we cannot demonstrate a difference between the instruments.

The standard deviation of these measurements is estimated as:

\[ \sqrt{2.10958} = 1.45 \]
