
Discussion and possible implications

Given the results of similar studies previously conducted on comparable data, the proposed clusters could be expected. However, since the data might be expected to include a large random error component, the existence of these clearly identifiable groups of countries is surprising.4 From a measurement point of view, such fluctuations are considered to be random errors, and in the developmental phase of the test material efforts are made to reduce the influence of these item-specific patterns across countries. Even though item-by-country interactions may be perceived as a source of error in the international measurement of achievement, there is, in the case of PISA, no reason to conclude that this error source threatens the aim of international comparisons. On average, the p-value residuals in the above analyses correspond to a standard error of international measurement (Wolfe, 1999) in the range of 0.5–1 percentage points.

4. The way these residuals are constructed ensures that they sum to zero for each country; in other words, they are, for each country, fluctuations around the overall item-by-country expected means. The most stable components of the p-values are cancelled out when subtracting the item and country averages. It is therefore reasonable to expect that the residuals include the major part of the random variation in the p-values.
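To make the construction described in footnote 4 concrete, the sketch below double-centres a small, entirely hypothetical country-by-item matrix of p-values. The numbers and the exact centring routine are illustrative assumptions consistent with the footnote, not the chapter's actual data or computation.

```python
import numpy as np

# Hypothetical country-by-item matrix of p-values (proportion correct);
# rows are countries, columns are items. Real analyses would use the
# full set of PISA mathematics items.
p = np.array([
    [0.62, 0.48, 0.71, 0.55],
    [0.58, 0.52, 0.66, 0.49],
    [0.70, 0.61, 0.77, 0.60],
])

# Double-centre: subtract the country means and the item means and add
# back the grand mean, so each residual is the deviation of a p-value
# from its expected value under an additive country + item model.
country_means = p.mean(axis=1, keepdims=True)
item_means = p.mean(axis=0, keepdims=True)
grand_mean = p.mean()
residuals = p - country_means - item_means + grand_mean

print(residuals.round(3))
# As stated in footnote 4, the residuals sum to zero for each country
# (and, by symmetry, for each item).
print(residuals.sum(axis=1).round(10))
print(residuals.sum(axis=0).round(10))
```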

In this chapter the p-value residuals have not been studied from a test perspective. Instead they have been considered as data points that give meaningful descriptions of countries’ relative achievement profiles. Hence, the residual values have been assumed to contain substantial information describing the interaction between the countries and the items. At the outset, it is not reasonable to expect that all items measure the same overall trait consistently. If the fluctuations expressed by the p-value residuals were only random errors of measurement, the analysis of the profiles would probably not generate a systematic pattern. Clearly, though, there are very systematic patterns in the data across countries. We can therefore be reasonably confident that the cluster structures in the data are reliable, since this pattern fits with several other similar analyses of comparable datasets. On the other hand, what these measures indicate is still not entirely clear. Chapter 4 by Olsen & Grønmo will explore this in more depth.

The overall motivation for this work was to explore the degree to which the Nordic countries have similar relative achievement profiles across the mathematics items in PISA. The analysis suggests that the Nordic countries have highly related profiles, and these profiles are also strongly related to those of five of the six English-speaking countries participating in PISA. It is interesting to note that, in a similar analysis of the science achievement data, the Nordic countries’ relative achievement profiles were strongly related to the German-speaking countries’ profiles (Olsen, 2005b).

This was also the case in similar analyses of the mathematics and science items in TIMSS 1995 (Grønmo et al., 2004), where a Nordic-English similarity for mathematics items and a Nordic-German similarity for science items were observed. This suggests that even if many of the same clusters of countries reappear in similar analyses based on other datasets, and in other domains, there are also domain-specific relationships between these clusters. This probably indicates that different school subjects may have different histories of policy exchange between countries.

The fact that the high-scoring Finnish students also have a relative achievement profile that is closely related to those of the other Nordic countries may appear surprising. One simple conclusion is that the Finnish students are better than their Nordic peers in all aspects of mathematics as defined by PISA, but that, in relative terms, they have many of the same kinds of strengths and weaknesses (to be identified more closely in chapter 4). Analysis of data from international comparative studies from a policy perspective is often driven by a desire to learn from others. In this particular case the analysis indicates that we cannot easily use the item-specific information to identify formulae for success in the Finnish curriculum. Another finding not included in the results presented above, namely the close-to-zero correlation between the Finnish profile and the profiles of the other high-performing countries in East Asia, tells us that there must be several different curricular recipes for success. The most likely conclusion in this quest for success factors is therefore that international differences in performance levels in achievement tests such as PISA and TIMSS are related to factors other than international differences in the subject matter of the curriculum.

This conclusion is also supported by the fact that, across all the domains tested in PISA, the same countries more or less consistently perform well. This suggests that, to the extent that factors related to curricula may contribute to the understanding of high achievement, they must be related to more general and overarching elements in the way the curriculum is organised or delivered, and not to specific parts of it, such as the different weights put on, for instance, pure or applied mathematics.

Large-scale international studies of students’ achievement are frequently criticised for being used mainly to rank countries. However, the analysis presented in this chapter has demonstrated that even if the process of test development used in PISA aimed to remove items with large item-by-country interactions, the remaining small p-value residuals may still be used to establish a clear cluster structure in countries’ relative achievement profiles. This shows that achievement studies like PISA provide data that may be used to report more than the average achievement of countries. There is a fine structure within the data, and this evidently does not only reflect random errors in the measurements, but can actually be used to describe and analyse the diversity of mathematical achievement across the participating countries. It is therefore reasonable to believe that analyses such as that presented here and in chapter 4 by Olsen & Grønmo can provide valuable information for further research in comparative education. However, in order to utilise this type of information to target specific issues, we need to find ways to link these relative achievement profiles to descriptions of the policy and teaching related to the subject of mathematics across education systems.
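As a minimal sketch of how such a cluster structure can be explored, the example below correlates hypothetical country residual profiles and groups the countries with an agglomerative (hierarchical) clustering routine. The country list, the random residuals, and the distance and linkage choices are placeholders for illustration; they are not the chapter's data or the settings actually used in the analyses reported here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical residual profiles (countries x items), standing in for
# double-centred p-value residuals; real analyses would use the PISA
# mathematics items for all participating countries.
rng = np.random.default_rng(0)
countries = ["Norway", "Sweden", "Denmark", "Finland", "Iceland", "Japan"]
residuals = rng.normal(scale=0.05, size=(len(countries), 40))

# Pairwise Pearson correlations between the countries' profiles,
# converted to distances (1 - r) for clustering.
corr = np.corrcoef(residuals)
dist = 1.0 - corr
condensed = squareform(dist, checks=False)  # condensed distance vector

# Average linkage is an illustrative choice, not the chapter's setting.
tree = linkage(condensed, method="average")
groups = fcluster(tree, t=2, criterion="maxclust")
for name, group in zip(countries, groups):
    print(f"{name}: cluster {group}")
```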

References

Adams, R., & Wu, M. (Eds.). (2002). PISA 2000 Technical Report. Paris: OECD Publications.

Angell, C., Kjærnsli, M., & Lie, S. (in press). Curricular and cultural effects in patterns of students’ responses to TIMSS science items. In S. J. Howie & T. Plomp (Eds.), Contexts of learning mathematics and science: Lessons learned from TIMSS. Lisse: Swets & Zeitlinger Publishers.

Everitt, B. S., Landau, S., & Leese, M. (2001). Cluster Analysis (4th ed.). London: Arnold.

Grønmo, L. S., Kjærnsli, M., & Lie, S. (2004). Looking for cultural and geographical factors in patterns of response to TIMSS items. In C. Papanastasiou (Ed.), Proceedings of the IRC-2004 TIMSS (Vol. 1, pp. 99-112). Lefkosia: Cyprus University Press.

Kjærnsli, M., & Lie, S. (2004). PISA and Scientific Literacy: similarities and differences between the Nordic countries. Scandinavian Journal of Educational Research, 48(3), 271-286.

Lie, S., Kjærnsli, M., & Brekke, G. (1997). Hva i all verden skjer i realfagene? Internasjonalt lys på trettenåringers kunnskaper, holdninger og undervisning i norsk skole. Oslo: Institutt for lærerutdanning og skoleutvikling, UiO.

Lie, S., & Roe, A. (2003). Unity and diversity of reading literacy profiles. In S. Lie, P. Linnakylä & A. Roe (Eds.), Northern Lights on PISA (pp. 147-157). Oslo: Department of Teacher Education and School Development, University of Oslo.

OECD. (2005). PISA 2003: Technical Report. Paris: OECD Publications.

Olsen, R. V. (2005a). Achievement tests from an item perspective. An exploration of single item data from the PISA and TIMSS studies, and how such data can inform us about students’ knowledge and thinking in science. Series of dissertations submitted to the Faculty of Education, University of Oslo, No. 48. Oslo: Unipub.

Olsen, R. V. (2005b). An exploration of cluster structure in scientific literacy in PISA: Evidence for a Nordic Dimension? Nordina, 1(1), 81-94.

Wolfe, R. G. (1999). Measurement Obstacles to International Comparisons and the Need for Regional Design and Analysis in Mathematics Surveys. In G. Kaiser, E. Luna & I. Huntley (Eds.), International Comparisons in Mathematics Education. London: Falmer Press.

Zabulionis, A. (2001). Similarity of Mathematics and Science Achievement of Various Nations. Education Policy Analysis Archives, 9(33).

Chapter 4

What are the Characteristics of the