

5.3.2 Composite score of implementation level

The importance of documenting the reliability and validity of the implementation measures used has been highlighted by Durlak & DuPre;53 however, to my knowledge, there is no consensus in this field on the most reliable and valid way to measure level of implementation.10, 132 Further, it is acknowledged50, 53 that a common standard is difficult to achieve in implementation studies, since all programs differ and it is therefore unlikely that all aspects of implementation can be measured by standard instruments. Thus, developing individual study-specific measures, as has been done for the third sub-study of this PhD, can be necessary.

While reviews of implementation studies have pointed out that many studies rely on only one, or sometimes two, implementation components,53, 68, 78, 84 more or less implicitly treating these as “replacements” for overall implementation quality, the present study uses four implementation components which are summed up to a composite score indicating the overall level to which implementation has been achieved. This procedure is based on the evaluation framework by Linnan and Steckler, who suggested determining program implementation as a summary score based precisely on reach, dose delivered, dose received, and fidelity.50 By simultaneously taking different components or subdimensions into account, such a measure has the advantage of allowing for a more thorough, comprehensive, and balanced assessment of implementation. Seen from this perspective, the subdimensions reflect different demands for implementation, all of which need to be met to a certain extent for a project to operate successfully. A certain level of “dose delivered” by the program operators, for instance, is a necessary precondition for changes in the target group; without an adequate level of actual “dose received” (e.g. because potential recipients are not physically present at delivery), however, delivery alone will not bring about this change.

A summary measure of implementation can therefore be conceptualized as a formative index reflecting the joint total of partly related, partly separate individual components (causal indicators).
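To make the construction of such a formative index concrete, a minimal sketch is given below. The component names follow the study, but the per-class data values, the 0-100 scaling, and the equal weighting are purely illustrative assumptions, not the study's actual coding.

```python
import numpy as np

# Hypothetical per-class scores for the four implementation components,
# each assumed to be scaled 0-100 (the study's actual coding may differ).
reach          = np.array([85.0, 60.0, 92.0, 74.0])
dose_delivered = np.array([70.0, 55.0, 88.0, 66.0])
dose_received  = np.array([65.0, 50.0, 80.0, 71.0])
fidelity       = np.array([75.0, 58.0, 90.0, 69.0])

# Formative index: the composite is formed by its components (here an
# unweighted mean) rather than reflecting a single latent factor.
components = np.vstack([reach, dose_delivered, dose_received, fidelity])
composite = components.mean(axis=0)  # one overall implementation level per class
print(composite)
```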

For the third sub-study of this PhD, empirical investigation of the associations indeed indicated modest to medium-sized associations between the different subdimensions, with the highest correlation found for dose delivered and fidelity (r = .58; p < 0.001) and the lowest for reach and fidelity (r = .23; p < 0.001) (see table 5 in appendix 8). That all subdimensions are correlated to some degree shows that there is a certain overlap between them and that they share some variance. However, as none of the correlations exceeds r = .58, the subdimensions do in fact tap into specific, different aspects of implementation.
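For reference, pairwise associations of this kind are standard Pearson correlations. A self-contained sketch follows, again using hypothetical per-class scores (illustrative values only, not the study's data):

```python
import numpy as np
from itertools import combinations
from scipy.stats import pearsonr

# Hypothetical per-class scores (illustrative only, not the study's data).
subdimensions = {
    "reach":          np.array([85.0, 60.0, 92.0, 74.0, 81.0]),
    "dose_delivered": np.array([70.0, 55.0, 88.0, 66.0, 77.0]),
    "dose_received":  np.array([65.0, 50.0, 80.0, 71.0, 69.0]),
    "fidelity":       np.array([75.0, 58.0, 90.0, 69.0, 83.0]),
}

# Pairwise Pearson correlations between all four subdimensions,
# analogous in form to the matrix reported in table 5 of appendix 8.
for (name_a, a), (name_b, b) in combinations(subdimensions.items(), 2):
    r, p = pearsonr(a, b)
    print(f"{name_a} vs {name_b}: r = {r:.2f}, p = {p:.3f}")
```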

The extent of the associations may depend not only on the different “contents” targeted but may also reflect the different sources performing the ratings. Not surprisingly, the strongest correlations were found between the subdimensions which came from more or less the same data source, that is, student/student (reach and dose received) and teacher/teacher & observations (dose delivered and fidelity), so there is shared method variance (see table 5 in appendix 8). Further, dose delivered and dose received are among the subdimensions with the lowest correlation (r = .31; p < 0.001) (see appendix 8).

This could be explained by the fact that these subdimensions capture both shared and distinct aspects of program implementation. Shared aspects included, for instance, instructing/conducting frisbee exercises and dancing to the music video, whereas distinct aspects included, e.g., the teacher registering the class score on the webpage (dose delivered) and the physical activity behavior of students at home and on the way to and from school (dose received).

Furthermore, these subdimensions utilized partly different response categories tailored to the different types of respondents (children vs. adults). Thus, for instance, students were asked to rate the degree of program usage, while teachers indicated both the number of times program components were used and the degree of usage. For these reasons, dose delivered and dose received cannot be expected to be highly correlated.

Prior studies of implementation of school-based health promotion programs have likewise used a composite score based on several components.72, 133-136 However, studies adhering to the specific implementation components suggested by Linnan and Steckler50 have primarily used one or more of these components separately.137-149 To my knowledge, no other studies have based the composite implementation score on the four components of reach, dose delivered, dose received, and fidelity. A Danish study131 on the effectiveness of a school-based hand hygiene program did, however, create an implementation index combining reach (defined as the proportion of children in each class who received the intervention component) and dose received (defined as the extent to which the child indicated that they participated in or experienced the program).

Using a composite score of implementation does not, however, solve the problem of comparing results across studies, as inconsistencies still exist in the way implementation is defined and measured,53, 69, 75, 81, 82 regardless of whether separate implementation components or a composite score is used.

It could be discussed, though, whether a simpler implementation score could have been constructed using only one data source. In previous implementation research, teachers’ self-report on program adherence (which relates to the construct of fidelity) has been found to be negatively correlated with assessments of adherence made by independent observers.83 This could provide an argument for relying on observations alone to study level of implementation, assuming that independent observers are more “objective” than teachers, who are likely to have a vested interest in the success of their inputs.150 Such an approach was not deemed possible, though, given the circumstances. Firstly, I assessed fidelity both by observation and by a teacher survey, since some aspects of program implementation could not be assessed by observation alone (e.g. the teacher registering the class score on the webpage, distributing the parent information sheet, etc.). Further, assessing reach, dose delivered, and dose received throughout the three program weeks by observation was not deemed feasible due to 1) the immense resources this would have required, and 2) the fact that such an approach would very likely have affected the students’ usage of the program and the teachers’ implementation of the program to an unacceptable degree. Conversely, using solely data from student questionnaires or teacher questionnaires was not possible, since the included implementation components inherently require the perspectives of the different groups, e.g. dose delivered (from teachers) and dose received (from students). Thus, constructing an implementation score from a single data source was not deemed possible, or at least seemed reductionist.
