• Ingen resultater fundet

Multi point calculation using F ourier T ransforms

3.4 Multi Point Analysis

3.4.3 Multi point calculation using F ourier T ransforms

between marker 1 and 2. In the gure

v 2 = w 3

is updated according to all

statesof

v 1

.

The idea behind multi point analysis is that we exploit thefact that few

crossoversaremost likelyto occurbetween two markers, sincethe

recombina-tionfractionbetweentwoadjacentmarkersisalwayslessthan

1 2

. Forinstance,

if we have observed the exact state of

v 1

as

v 3

, that is

P ( v 1 = v 3 |G 1 ) = 1

,

thentheprobabilityofall inheritancevectorsat

v 2

close24to

v 3

getincreased

probability,whereasinheritancevectorsfar 25

from

v 3

getdecreasedprobability. More precisely, the contribution from any state

v i

of

v 1

is given by the

probability oftheinheritance vector at marker1conditioned on thegenotype

informationat thatlocus,

P (v i |G 1 )

,timesthetransitionprobability

P (w 3 | v i )

.

The transition probability is given by the Hamming distance between the

two inheritance vectors which indicates the number of odd crossovers that

have occurred between two markers on a chromosome. The probability of

d

crossoversoccurringbetweentwo markerswithaninheritance vectoroflength

n

is

θ d 1 · (1 θ 1 ) n d

.

We thensum over thecontribution to

w 3

of every vector

v i

, which

corre-spondsto marginalizing out

v 1

. The sum inthen multiplied with the

condi-tionalprobabilityof

w 3

given

G 2

. Thisproductisproportionalto

P (w 3 |G 1 , G 2 )

.

We compute this product for every inheritance vector at marker

M 2

to

getthe full probability distribution,

P ( v 2 |G 1 , G 2 )

26. Theupdated probability distribution at marker 2 can then be used to calculate the left-conditioned

probability at marker 3 and so forth. The right-conditioned probability is

calculatedinat similar fashion.

Using this procedure we can compute

P( v i |G all )

for any

v i

. Thesevalues

canthenbeusedinthescoringfunctions,i.e. LODscore,todeterminelinkage

between markers and traits.

where

|v|

isthe numberof inheritance vectors. A convolution hasto be done twice for each marker: Once for calculating the left conditioned probability

andoncefor the right conditioned probability.

In[Gud00 ]Gudbjartssonshowshowthe algorithmsforFastFourier

Trans-forms can berewritten for better utilisation of CPU cache and registers with

speeds-upranging fromafactor of1.9 to5.3dependingon thedataanalysed,

[Gud00 ,page 32-34]. The speed-ups areobtained througha reordering of the

computationssothat they take advantage ofthe cache memory,and through

unrollingloopswhich reduces the amount of book-keeping.

For more on Fourier Transforms within the eld of linkage analysis, see

[KL98 ]and [Gud00 ].

3.4.4 Founder Reduction

One way of reducing the number of computations for the set of inheritance

vectorsfor a pedigree isbyapplying thesocalled founderreduction, [Gud00,

page34]. Theintuitiveideaisthatconsistentlychangingtheallelepointingto

afounder, frompaternal to maternaland viceversa for all children, yields an

newinheritance vector withthesame probability astheoriginal vector. This

isbecause we cannotdistinguish between maternaland thepaternal alleles of

founderssince the phaseis unknown.

Noticethatitisnotareductioninthenumberofinheritancevectors,rather

anexploitation of symmetriesinthe probabilitydistribution.

For the

{ p, m }

describing paternal and maternal inheritance we dene in-versionsuchthat:

p = m

and

m = p.

Thefounderreductionappliesfor anyfounder,butfor convenience we just

state the reduction for one male founder. The reduction for all founders is

equivalent. Stated formally:

Theorem 6 (Founder Reduction) Let

h F

be a male founder in some

pedigree

P = h F, N, f ather, mother i

. Let

C

denote the set of children of

h

,

thatis

C = { n N | f ather(n) = h }

. Let

v

and

w

be two inheritance vectors, suchthat for any

n N

:

w(n, p) =

( v(n, p) : n C v(n, p) :

otherwise;

w(n, m) = v(n, m).

Givensomegenotypeinformation

G = hL, astatesi

onthepedigreethefollowing

alwaysholds:

P(v |G ) = P(w |G ).

(3-21)

Intuitively,thedenitionof

v

and

w

statesthatifachildof

h

hasinherited

h

's paternal allele according to

v

, then that child has inherited

h

's maternal

allelein

w

,and viceversa.

Proof: Prior to proving for multi point linkage analysiswe prove thatit

holds for single point linkage analysis. The denition of

v

and

w

implies the

following relationship between

F v

and

F w

:

F w (n, $) =

 

 

(h, m) : F v (n, $) = (h, p) (h, p) : F v (n, $) = (h, m) F v (n, $) :

otherwise

,

where

$ ∈ { p, m }

. Theorem1 statesthat:

ϕ v G def = ^

n ∈L

(f F v ( n,p ) = a 1 f F v ( n,m ) = a 2 ) (f F v ( n,p ) = a 2 f F v ( n,m ) = a 1 ) ,

where

a 1 , a 2 astates(n)

. By denition

ϕ w G

is equal to

ϕ v G

with regard to all

sub-expressions,exceptthatanysub-expression:

(f h,$ = a 1 f n,$ 0 = a 2 ) (f h,$ = a 2 f n,$ 0 = a 1 ),

in

ϕ v G

lookslike:

(f h,$ = a 1 f n,$ 0 = a 2 ) (f h,$ = a 2 f n,$ 0 = a 1 ),

in

ϕ w G

. Thismeans that any constraints on founder allele

(h, $)

in

ϕ v G

is also

aconstraint on

(h, $)

in

ϕ w G

.

Thisimpliesthatforanyfounderalleleassignment,

Z M

,whichsatises

ϕ v G

that assigns the allelic state

a

to

(h, $)

, there exists a similar founder allele

assignment with equal probability,

Z M 0

, which satises

ϕ w G

that assigns the

allelicstate

a

to

(h, $)

. Inother words:

Z M [[ϕ v G ]] Z M 0 [[ϕ w G ]] n F. Z M 0 (n, $) = (

Z M (n, $) : n = h Z M (n, $) :

otherwise,

fora xed

h

. Theoppositeisalso truesincewecan applythesame argument

using

v

as

w

and

w

as

v

. Furthermore,

P (Z M ) = P (Z M 0 )

since they assign

exactlythe same set of allelic states to founder alleles, which in turnimplies

that:

X

Z M ∈[[ ϕ v G ]]

P(Z M ) = X

Z M 0 ∈[[ ϕ w G ]]

P (Z M 0 )

andfrom (3-5)we deduce:

P(v |G ) = P(w |G ).

case. To do this we use the fact that founder reduction holds in the single

point case, thatis weusethat(3-22)holds. We considertwo markers

M 1

and

M 2

withcomputed single point probabilities. We prove that transferring the probabilities from

M 1

to

M 2

preserves the equalitybetween the probabilities of the two inheritance vectors

v 2

and

w 2

, where

v 2

and

w 2

are inheritance vectors for marker

M 2

, and

v 2

and

w 2

are dened at

v

and

w

, respectively, in Theorem 6. Hence, we want to show that two inheritance vectors in the

sameequivalence class,due tothefounderreductionat marker

M 2

,remainin

the same equivalence class after we have propagated the evidence (genotype

information)observed at marker

M 1

.

Assumethatavector,

v 1

,issomeinheritancevectorformarker

M 1

,and

w 1

isdened intermsof

v 1

asdescribedin Theorem6. Observe that

P (v 1 |G 1 ) = P(w 1 |G 1 )

, and

P (v 2 |G 2 ) = P (w 2 |G 2 )

due to (3-22), where

G 1

and

G 2

are the

genotype information for marker

M 1

and

M 2

,respectively. We want to show that:

P (v 2 |G 1 , G 2 ) = P (w 2 |G 1 , G 2 )

(3-23)

Lettherecombinationfractionbetween

M 1

and

M 2

be

θ 1

. Then

contribu-tion in terms of probability from

v 1

to

v 2

is given by

θ 1 d v (1 θ 1 ) n d v

, where

d v = Ham(v 1 , v 2 )

is the Hamming distance between thevectors and

n

is the

lengthofaninheritance vector forthe pedigree. To showthat(3-23) holdswe

needtoshowthatthecontributionof

v 1

and

w 1

totheprobabilityof

v 2

,is

ex-actlythesameasthecontributionof

v 1

and

w 1

to

w 2

. Todothisweprovethat

Ham(v 1 , v 2 ) = Ham(w 1 , w 2 )

and that

Ham(v 1 , w 2 ) = Ham(w 1 , v 2 )

,because

this implies that

v 2

and

w 2

are updated with exactly the same probabilities given the observed genotype for

M 1

. We only prove this for

Ham(v 1 , v 2 ) = Ham(w 1 , w 2 )

sincethe oppositeis similar.

The value

w 1

and

v 1

are identical for all

(n, $)

, except inthe case where

$ = p

and

f ather(n) = h

, in which case the value diers. Since the same

appliesfor

v 2

and

w 2

weneedonlyconcentrateonHammingdistancesbetween theinheritance vectorsforchildrenof

h

. Forany

(n, $)

if

v 2 (n, $) 6 = v 1 (n, $)

then,bythe denitionof

w 2

and

w 1

itfollowsthat

w 2 (n, $) 6 = w 1 (n, $)

. The

same argument applies for parts of the inheritance vectors where

v 2 (n, $) = v 1 (n, $)

. Inthiscase, bydenition

w 2 (n, $) = w 1 (n, $)

,hencetheHamming

distancebetween

v 1

and

v 2

isequalto theHammingdistancebetween

w 1

and

w 2

.

The same can be shown for

Ham(v 1 , w 2 ) = Ham(w 1 , v 2 )

with the same

arguments. For this reason, in the multi point step the founder reduction

holds.

In[Gud00 ]Gudbjartssondenesfoundercouplereduction asanadditionalway

of reducing the number of computations performed during single and multi

point linkage analysis. Intuitively, founder reduction states that we cannot

distinguish between the maternal and paternal alleles of founders, that is,

we can switch between their two alleles. Founder couple reduction states,

intuitively,thatwecannotdistinguishthemaleandfemalefounderofafounder

couple. Afounder couple istwo founderswho share ospringinthepedigree.

Westate this formallyinthefollowing theorem:

Theorem 7 (Founder Couple Reduction) Let

h, h 0 F

be the male and

femalefounder,respectively, of a founder couple insome pedigree

P = hF, N, f ather, motheri

. Let

C

denote theset of childrenof

h

and

h 0

, that

is

C = { n N | f ather(n) = h

and

mother(n) = h 0 }

. Let

v

and

w

be two

inheritance vectors, suchthat for any

n N

:

w(n, $) =

 

 

v(n, $) : n C

v(n, $) : f ather(n) C

or

mother(n) C v(n, $) :

otherwise

.

Given some genotype information

G = hL, astatesi

on the pedigree where

h, h 0 ∈ L /

and that neither

h

nor

h 0

has any children in

V

not in

C

, the

followingalways holds:

P(v|G ) = P(w|G ).

(3-24)

We do not prove this, but refer the keen reader to [Gud00 , page 35] for

moreinformation on foundercouple reduction.