Academic year: 2022


(1)

Institute for Transport Studies

FACULTY OF ENVIRONMENT

State-of-the-art estimation of site-specific safety effects

Mike Maher

(2)

Introduction

Estimation of treatment effect at a site

– just compare “after” accidents with “before” accidents, surely?

• Unfortunately, not quite so simple!

– accident frequencies are random quantities: x = m + e

– there may be a general (eg national) trend, so accidents might have gone up/down anyway

– treatment may lead to a change in flow (eg speed humps), so accidents may “migrate” to parallel routes

– sites not selected for treatment randomly, but because they have high accident frequency: “regression to mean” effect

Simple before-and-after estimate of the treatment effect:

$\hat{\theta} = 100\left(\dfrac{x_{\text{after}}}{x_{\text{before}}} - 1\right)\%$

(3)

Regression to the Mean

• Regression to the Mean (RTM) problem

– sites selected for treatment on basis of high accident frequency x

– but x = m + e, where m is true mean frequency, and e is Poisson error

– selected because x is high: so likely that m is high, and e > 0

– so observed frequency x is an overestimate of true value m

• In the after period, observed frequency is unbiassed

– “bad luck” does not persist

– comparison of “after” with “before” will exaggerate treatment effect

• OK – but is it a serious issue? Does it really matter?
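The selection effect described on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential spread of true means, the mean of 1.0, the threshold of 4 and the function names `poisson` and `simulate_rtm` are all hypothetical choices, not taken from the slides. No treatment is applied, so any drop among the selected sites is pure RTM.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) count by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_rtm(n_sites=20000, mean_m=1.0, threshold=4, seed=1):
    """Simulate untreated sites and return (n_selected, before_mean, after_mean)."""
    rng = random.Random(seed)
    n_selected = 0
    before_sum = 0
    after_sum = 0
    for _ in range(n_sites):
        m = rng.expovariate(1.0 / mean_m)  # hypothetical spread of true means m
        before = poisson(m, rng)           # observed x = m + e in the before period
        after = poisson(m, rng)            # same m afterwards: nothing was treated
        if before >= threshold:            # site selected because x is high
            n_selected += 1
            before_sum += before
            after_sum += after
    return n_selected, before_sum / n_selected, after_sum / n_selected


n_selected, before_mean, after_mean = simulate_rtm()
change = 100.0 * (after_mean / before_mean - 1.0)
print(f"{n_selected} selected sites: before {before_mean:.2f}, "
      f"after {after_mean:.2f}, change {change:.0f}%")
```

Under these assumptions the selected sites should show a clearly lower mean in the after period, even though their true means m never changed.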

(4)

An illustrative example (1)

Number of sites Nk with k accidents, 1992-94

k     0     1     2    3    4   5   6   7  8  9  11  13
Nk    7411  1645  341  117  38  26  13  7  2  1  1   1

Data from 9603 sites in North Lanarkshire, Scotland

Sites with at least 4 accidents called “cluster sites”

Earmarked for remedial treatment

(5)

An illustrative example (2)


Accidents        1992-94   1995-97   change
Whole network    3136      2799      -11%
Cluster sites    458       233       -49%


But – no treatment was actually applied!!
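The figures in this example can be re-derived from the frequency table of sites by accident count. The sketch below assumes the simple before-and-after percentage change, 100(after/before - 1), and uses hypothetical variable names. The 1995-97 totals are copied from the slide because they cannot be recomputed from the table.

```python
# Number of sites Nk with k accidents in 1992-94 (from the slide).
sites_by_count = {0: 7411, 1: 1645, 2: 341, 3: 117, 4: 38, 5: 26,
                  6: 13, 7: 7, 8: 2, 9: 1, 11: 1, 13: 1}

n_sites = sum(sites_by_count.values())
network_before = sum(k * n for k, n in sites_by_count.items())
# "Cluster sites" are those with at least 4 accidents in 1992-94.
cluster_before = sum(k * n for k, n in sites_by_count.items() if k >= 4)

# Accident totals for 1995-97, as reported on the slide.
network_after = 2799
cluster_after = 233


def naive_change(before, after):
    """Simple before-and-after percentage change."""
    return 100.0 * (after / before - 1.0)


print(n_sites, network_before, cluster_before)
print(f"whole network: {naive_change(network_before, network_after):.0f}%")
print(f"cluster sites: {naive_change(cluster_before, cluster_after):.0f}%")
```

The table reproduces the 9603 sites, the 3136 network accidents and the 458 cluster-site accidents, so the apparent 49% fall at untreated cluster sites follows directly from the data.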

(7)

Why “regression to the mean”?

• Sir Francis Galton (1822 – 1911), eugenicist, biometrician, statistician, observed:

Tall fathers tend to have sons who are also tall – but who are not as tall as themselves

(8)

Galton height data

[Figure: scatter plot of son's height against father's height, both axes running from 58 to 78]

(9)

Galton height data

[Figure: scatter plot of son's height against father's height, with the tall fathers highlighted]

(10)

RTM appears in other places, too

• Golf tournaments:

The players who score best in the first round tend, on average, to score well in the second round too; but not as well as they did in the first

(11)

British Open Golf 2010

Pos  Name                  Rounds 1-2  Rounds 3-4
1    Louis Oosthuizen      132
2    Mark Calcavecchia     137
3    Lee Westwood          138
4    Paul Casey            138
5    Jin Jeong             138
6    Alejandro Canizares   138
7    Retief Goosen         139
8    Sean O’Hair           139
9    Tom Lehman            139
10   Graeme McDowell       139


(13)

British Open Golf 2010

Pos  Name                  Rounds 1-2  Rounds 3-4
1    Louis Oosthuizen      132         140
2    Mark Calcavecchia     137         157
3    Lee Westwood          138         141
4    Paul Casey            138         142
5    Jin Jeong             138         146
6    Alejandro Canizares   138         148
7    Retief Goosen         139         142
8    Sean O’Hair           139         143
9    Tom Lehman            139         145
10   Graeme McDowell       139         146

Average increase = 7.3

(14)

British Open Golf 2009

Pos  Name                  Rounds 1-2  Rounds 3-4
1    Tom Watson            135         141
2    Steve Marino          135         144
3    Mark Calcavecchia     136         146
4    Retief Goosen         137         141
5    Angel Jimenez         137         149
6    Ross Fisher           137         138
7    Kenichi Kuboya        137         147
8    Vijay Singh           137         145
9    Stewart Cink          138         143
10   Lee Westwood (+3)     138         143 (av)

Average increase = 6.5

(15)

So – what is to be done?

• Ideally would like to avoid RTM

– select sites for treatment randomly, rather than on basis of x’s, but this is not generally practicable or ethical

– or select and then wait before treatment (gives a “lag period”)

• Use of lag period (Mountain et al, 1998):

– comparison of after with before: treatment effect = 43%

– comparison of after with lag: treatment effect = 23%

– so, RTM effect = 20%

• Alternatively, need to correct for RTM

– the Empirical Bayes (EB) method does this

– adjusts before accident frequency, by use of predictive accident model

[Figure: timeline showing the before, lag and after periods]

(16)

The Empirical Bayes method

• What is expected value of mean m

– given observed value of no. accidents x?

• Bayes’ Theorem

– prior distribution for m from predictive accident model

– models the “overdispersion”

– combine with observed x

– to give posterior estimate of m

• Depends on model and its precision Cv

$E(m \mid x) = \alpha\mu + (1 - \alpha)x$, where $\alpha = \dfrac{1}{1 + \mu C_v^2}$

Weighted average of prediction µ and observed value x

[Figure: prior distribution f(m) plotted against m, with the predicted mean µ marked]
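The weighted-average formula on this slide, E(m | x) = αµ + (1 − α)x with α = 1/(1 + µCv²), can be sketched in a few lines. The function names are hypothetical, and the value Cv = 0.5 in the example is an assumption chosen for illustration. µ = 5.1 is the prediction quoted later in the slides for 4-arm roundabouts.

```python
def eb_weight(mu, cv):
    """Weight alpha given to the model prediction mu, for a prior with coefficient of variation cv."""
    return 1.0 / (1.0 + mu * cv ** 2)


def eb_estimate(x, mu, cv):
    """Empirical Bayes estimate E(m | x): weighted average of prediction mu and observed count x."""
    alpha = eb_weight(mu, cv)
    return alpha * mu + (1.0 - alpha) * x


mu = 5.1   # predicted mean from the accident model
cv = 0.5   # hypothetical coefficient of variation of the prior
x = 20     # observed accidents in the before period

print(f"alpha = {eb_weight(mu, cv):.3f}")
print(f"E(m | x) = {eb_estimate(x, mu, cv):.2f}")
```

A perfectly precise model (Cv = 0) gives α = 1, so the estimate equals the prediction µ. As Cv grows, α falls towards 0 and the estimate moves towards the observed x.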

(17)

Coefficient of variation Cv

$C_v = \dfrac{\text{standard deviation}}{\text{mean}}$

[Figure: prior density f(m) against m for Cv = 0.1, 0.3 and 0.5]

(18)

Types of predictive accident models

• Given what we know about the site (before x is known):

– how many accidents would we expect there to be?

• Models differ in their level of detail (and precision)

• From very simple, coarse models …

– Hauer’s “reference population” (eg North Lanarkshire data)

– or a simple “rate” model for that type of road (accs/veh-km)

• … through to more complex, detailed models

– regression models: function of flows and geometry, by arm of junction

– eg approaching accidents at 4-arm roundabouts: function of entry flow, entry curvature, entry width

$\hat{\mu} = k\,Q_e^{1.7} \exp(20\,C_e - 0.1\,e)$

(19)

Requirements for EB method

• Existing, current predictive model for that site type

– data available for all explanatory variables in the model

• Measure of precision of the predictions

– spread of the prior distribution – eg coefficient of variation Cv

– dictates the weights given to x and prediction

• But also the form of the prior distribution

– conventional EB method assumes a gamma – because it makes the maths easy!

– but could be lognormal, Weibull ….

– which fits the data best?

[Figure: prior density f(m) against m]

(20)

MCMC methods for fitting

• Conventional method of fitting

– Negative Binomial error structure for accident frequencies

– NB regression modelling available in most statistical packages

– equivalent to gamma prior + Poisson

– but this is for convenience: no reason why it should be a gamma

• Modern methods now make it possible to fit any form of prior

• Markov Chain Monte Carlo (MCMC) methods

– available in software (eg WinBUGS)

– gamma, lognormal, Weibull … for distribution of m

• Does it make a difference which one?

– does the estimate E(m | x) depend on the form of distribution?
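The question of whether E(m | x) depends on the form of the prior can be explored without MCMC when there is a single site: the posterior mean is a one-dimensional integral. The sketch below uses plain grid integration, which is a stand-in for, not a reproduction of, the WinBUGS fitting mentioned above. The three priors are matched on mean µ and coefficient of variation Cv. The value Cv = 0.42 and all function names are assumptions for illustration.

```python
import math


def weibull_shape_for_cv(cv):
    """Find the Weibull shape k whose coefficient of variation equals cv (bisection)."""
    lo, hi = 0.5, 20.0  # cv is about 2.2 at k = 0.5 and about 0.06 at k = 20
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        cv_sq = math.gamma(1 + 2 / mid) / math.gamma(1 + 1 / mid) ** 2 - 1
        if cv_sq > cv ** 2:
            lo = mid  # still too dispersed, so the shape must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def make_log_prior(family, mu, cv):
    """Return log f(m), up to a constant, for a prior with mean mu and coefficient of variation cv."""
    if family == "gamma":
        shape = 1 / cv ** 2
        return lambda m: (shape - 1) * math.log(m) - shape * m / mu
    if family == "lognormal":
        var = math.log(1 + cv ** 2)
        loc = math.log(mu) - var / 2
        return lambda m: -math.log(m) - (math.log(m) - loc) ** 2 / (2 * var)
    if family == "weibull":
        k = weibull_shape_for_cv(cv)
        scale = mu / math.gamma(1 + 1 / k)
        return lambda m: (k - 1) * math.log(m) - (m / scale) ** k
    raise ValueError(f"unknown family: {family}")


def posterior_mean(x, family, mu, cv, upper=60.0, step=0.01):
    """E(m | x) for a Poisson count x, by midpoint-rule integration over m in (0, upper)."""
    log_prior = make_log_prior(family, mu, cv)
    grid = [step * (i + 0.5) for i in range(int(upper / step))]
    log_w = [x * math.log(m) - m + log_prior(m) for m in grid]
    top = max(log_w)  # subtract the maximum to avoid overflow
    weights = [math.exp(v - top) for v in log_w]
    return sum(m * w for m, w in zip(grid, weights)) / sum(weights)


MU, CV, X = 5.1, 0.42, 20  # CV is a hypothetical value, not taken from the slides
for family in ("weibull", "gamma", "lognormal"):
    print(family, round(posterior_mean(X, family, MU, CV), 2))
```

For the gamma prior the numerical result should agree with the closed-form weighted average αµ + (1 − α)x. The other two priors have no such closed form, which is why a numerical method is needed at all.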

(21)

Effect of different prior distributions

[Figure: E(m | x) plotted against x for 4-arm roundabouts (µ = 5.1), with curves labelled "No RTM", Weibull, lognormal, gamma and "v-s gamma"; both axes run from 0 to 25]

(22)

Estimates of RTM and treatment effect

• So, if x = 20 in before period:

– Weibull: E(m | x) = 9.26, and RTM = -53%

– gamma: E(m | x) = 12.13, and RTM = -39%

– lognormal: E(m | x) = 14.12, and RTM = -29%

• Then if, in the after period, there are 8 accidents, the estimates of the treatment effect are:

– Weibull: -14%

– gamma: -34%

– lognormal: -43%

• So it can make a big difference!
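The percentages on this slide follow from the three E(m | x) values by simple ratios. As a hedged sketch (the function and variable names are hypothetical): the RTM effect is taken as the change from the observed x = 20 to E(m | x), and the treatment effect as the change from E(m | x) to the 8 accidents observed afterwards.

```python
def percent_change(new, old):
    """Percentage change from old to new."""
    return 100.0 * (new / old - 1.0)


x_before = 20  # observed accidents in the before period
x_after = 8    # observed accidents in the after period

# E(m | x) under each assumed prior, as quoted on the slide.
eb_estimates = {"Weibull": 9.26, "gamma": 12.13, "lognormal": 14.12}

results = {}
for name, m_hat in eb_estimates.items():
    rtm = percent_change(m_hat, x_before)     # part of the drop explained by RTM
    effect = percent_change(x_after, m_hat)   # treatment effect after RTM correction
    results[name] = (rtm, effect)
    print(f"{name:10s} RTM {rtm:6.1f}%   treatment effect {effect:6.1f}%")

# Uncorrected comparison of after with before, for reference.
print(f"naive      treatment effect {percent_change(x_after, x_before):6.1f}%")
```

From the rounded E(m | x) values this reproduces the slide's treatment effects and the gamma and lognormal RTM figures. The Weibull RTM comes out at about -53.7%, where the slide quotes -53%, presumably because the slide worked from unrounded values.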

(23)

RTM in speed camera data

• Asked by UK DfT to work with UCL and PA on the four-year report on speed camera effectiveness (published December 2005)

– previous reports had not allowed for RTM

– EPSRC research project (Mountain and Maher)

– carry out our analysis on subset of data

– 216 urban sites (30 and 40 mph)

– allow for trend and RTM

– see how much apparent effect of cameras is due to RTM, and how much is real

Overall reduction = trend + RTM + camera effect

(24)

The predictive accident model

• Obtained from a previous, unconnected study

• Separate models for:

– personal injury collisions (PICs)

– fatal and serious collisions (FSCs)

• Relevant factors:

– flow

– road class

– carriageway type

– speed limit

– number of minor junctions per km

eg for PICs on 30 mph, single carriageway, B road:

$\hat{\mu} = 0.626\,q^{0.72}\,L\,\exp(0.083\,n/L)$

(25)

Results – for PICs

PICs/site/year: before 4.65, after 3.22 (-31%)

                 overall     trend     RTM      camera
Reduction        1.43     =  0.37   +  0.31  +  0.75
% of before      31%      =  8%     +  7%    +  16%

Reduction relative to what would have been:

– 19% allowing for trend + RTM

– 25% allowing for trend

(26)

Results – for FSCs

FSCs/site/year: before 1.05, after 0.48 (-54%)

                 overall     trend     RTM      camera
Reduction        0.57     =  0.10   +  0.36  +  0.11
% of before      54%      =  10%    +  34%   +  10%

Reduction relative to what would have been:

– 19% allowing for trend + RTM

– 50% allowing for trend
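The decomposition "overall reduction = trend + RTM + camera effect" on the two results slides can be checked with a little arithmetic. The sketch below is one reading that reproduces the slide figures, not necessarily the authors' exact calculation: "what would have been" is taken as the before rate minus the trend component (and minus the RTM component where stated). The function name `decompose` and the dictionary keys are hypothetical.

```python
def decompose(before, after, trend, rtm):
    """Split the overall reduction into trend, RTM and camera parts (accidents/site/year)."""
    overall = before - after
    camera = overall - trend - rtm
    expected_trend_rtm = before - trend - rtm  # no-camera expectation, allowing for trend + RTM
    expected_trend = before - trend            # no-camera expectation, allowing for trend only
    return {
        "overall_pct": 100.0 * overall / before,
        "camera": camera,
        "camera_pct_of_before": 100.0 * camera / before,
        "vs_trend_and_rtm_pct": 100.0 * (expected_trend_rtm - after) / expected_trend_rtm,
        "vs_trend_pct": 100.0 * (expected_trend - after) / expected_trend,
    }


pics = decompose(before=4.65, after=3.22, trend=0.37, rtm=0.31)
fscs = decompose(before=1.05, after=0.48, trend=0.10, rtm=0.36)

for label, res in (("PICs", pics), ("FSCs", fscs)):
    print(f"{label}: camera {res['camera']:.2f}/site/year, "
          f"{res['vs_trend_and_rtm_pct']:.1f}% allowing for trend + RTM, "
          f"{res['vs_trend_pct']:.1f}% allowing for trend only")
```

Because the inputs are the rounded slide values, a result can differ from the slide by a percentage point. The contrast is clearest for FSCs, where ignoring RTM would credit the cameras with roughly a 50% reduction instead of about 19%.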

(27)

Conclusions

• EB method provides sound basis for allowing for RTM

• Requires a predictive accident model relevant to the site

– take weighted average of observed x and prediction µ

– should then give unbiassed estimate of before accident mean

• But the model needs to be good (current, well-fitted …)

– if out-of-date, will not correct fully for RTM

– uncertainty of estimate depends on precision of model (Cv)

– the more precise the model, the less uncertainty in the estimate …

– … and hence the more precise the estimate of treatment effect

• But assumed form of prior distribution may be important

– not necessarily safe to assume gamma / negative binomial

(28)

Many thanks!

Questions?
