Multi point calculation using Fourier Transforms

3.4 Multi Point Analysis

3.4.3 Multi point calculation using Fourier Transforms

between marker 1 and 2. In the figure, $v_2 = w_3$ is updated according to all states of $v_1$.

The idea behind multi point analysis is that we exploit the fact that few crossovers are most likely to occur between two markers, since the recombination fraction between two adjacent markers is always less than $\frac{1}{2}$. For instance, if we have observed the exact state of $v_1$ as $v_3$, that is $P(v_1 = v_3 \mid G_1) = 1$, then all inheritance vectors at $v_2$ close to $v_3$ get increased probability, whereas inheritance vectors far from $v_3$ get decreased probability. More precisely, the contribution from any state $v_i$ of $v_1$ is given by the probability of the inheritance vector at marker 1 conditioned on the genotype information at that locus, $P(v_i \mid G_1)$, times the transition probability $P(w_3 \mid v_i)$. The transition probability is given by the Hamming distance between the two inheritance vectors, which indicates the number of odd crossovers that have occurred between two markers on a chromosome. The probability of $d$ crossovers occurring between two markers with an inheritance vector of length $n$ is $\theta_1^{d} \cdot (1 - \theta_1)^{n-d}$. We then sum over the contribution to $w_3$ of every vector $v_i$, which corresponds to marginalizing out $v_1$. The sum is then multiplied with the conditional probability of $w_3$ given $G_2$. This product is proportional to $P(w_3 \mid G_1, G_2)$. We compute this product for every inheritance vector at marker $M_2$ to get the full probability distribution, $P(v_2 \mid G_1, G_2)$. The updated probability distribution at marker 2 can then be used to calculate the left-conditioned probability at marker 3 and so forth. The right-conditioned probability is calculated in a similar fashion.
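The update described above can be sketched directly in code. The following is a minimal illustration, not the report's implementation: it assumes that inheritance vectors of length $n$ are encoded as integers in the range $0 \ldots 2^n - 1$ and that the single point distributions are given as plain lists indexed by those integers. All function and variable names are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two inheritance vectors differ."""
    return bin(a ^ b).count("1")


def transition(v: int, w: int, theta: float, n: int) -> float:
    """theta^d * (1 - theta)^(n - d), where d is the Hamming distance."""
    d = hamming(v, w)
    return theta ** d * (1.0 - theta) ** (n - d)


def propagate_left(p_left, p_single, theta, n):
    """Combine P(v_1 | G_1) and P(v_2 | G_2) into P(v_2 | G_1, G_2)."""
    size = 1 << n
    unnormalised = [
        # marginalise out v_1, then multiply with the evidence at marker 2
        p_single[w] * sum(p_left[v] * transition(v, w, theta, n) for v in range(size))
        for w in range(size)
    ]
    total = sum(unnormalised)
    return [x / total for x in unnormalised]


n = 2
theta = 0.1
p1 = [0.0, 0.0, 0.0, 1.0]       # marker 1 is known to be in state 0b11
p2 = [0.25, 0.25, 0.25, 0.25]   # marker 2 carries no information (assumed)
posterior = propagate_left(p1, p2, theta, n)
```

In this toy setting the state at marker 2 that equals the observed state at marker 1 receives the largest probability, and the probability drops with every additional differing bit, which matches the "close" and "far" intuition in the text.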

Using this procedure we can compute $P(v_i \mid G_{all})$ for any $v_i$. These values can then be used in the scoring functions, for instance the LOD score, to determine linkage between markers and traits.

where $|v|$ is the number of inheritance vectors. A convolution has to be done twice for each marker: once for calculating the left-conditioned probability and once for the right-conditioned probability.
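As an illustration of how such a convolution can be computed with a Fourier transform, the sketch below assumes that inheritance vectors are encoded as $n$-bit integers, so that the transition probability depends only on the bitwise XOR of the two vectors. Under that assumption the sum over all vectors at the neighbouring marker is a convolution over XOR, and the matching transform is the Walsh-Hadamard transform. This is a hedged sketch with hypothetical names, not the algorithm of [KL98] or [Gud00] verbatim.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform of a list of length 2^n."""
    a = list(values)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a


def xor_convolve(p, kernel):
    """result[w] = sum over v of p[v] * kernel[v XOR w]."""
    size = len(p)
    product = [x * y for x, y in zip(fwht(p), fwht(kernel))]
    # applying the transform again and dividing by the size inverts it
    return [x / size for x in fwht(product)]


def transition_kernel(theta, n):
    """kernel[u] = theta^d * (1 - theta)^(n - d), d = number of set bits in u."""
    return [
        theta ** bin(u).count("1") * (1.0 - theta) ** (n - bin(u).count("1"))
        for u in range(1 << n)
    ]


n, theta = 3, 0.2
p = [0.05, 0.10, 0.15, 0.20, 0.05, 0.10, 0.15, 0.20]
kernel = transition_kernel(theta, n)
fast = xor_convolve(p, kernel)
# direct summation for comparison: one term per pair of vectors
naive = [sum(p[v] * kernel[v ^ w] for v in range(1 << n)) for w in range(1 << n)]
```

The direct summation touches every pair of inheritance vectors, whereas the transform-based version performs three transforms whose cost grows as the number of vectors times its logarithm.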

In [Gud00] Gudbjartsson shows how the algorithms for Fast Fourier Transforms can be rewritten for better utilisation of CPU cache and registers, with speed-ups ranging from a factor of 1.9 to 5.3 depending on the data analysed [Gud00, pages 32-34]. The speed-ups are obtained through a reordering of the computations so that they take advantage of the cache memory, and through unrolling loops, which reduces the amount of book-keeping.

For more on Fourier Transforms within the field of linkage analysis, see [KL98] and [Gud00].

3.4.4 Founder Reduction

One way of reducing the number of computations for the set of inheritance vectors for a pedigree is by applying the so-called founder reduction [Gud00, page 34]. The intuitive idea is that consistently changing the allele pointing to a founder, from paternal to maternal and vice versa for all children, yields a new inheritance vector with the same probability as the original vector. This is because we cannot distinguish between the maternal and the paternal alleles of founders, since the phase is unknown.

Notice that it is not a reduction in the number of inheritance vectors, rather an exploitation of symmetries in the probability distribution.

For the set $\{p, m\}$ describing paternal and maternal inheritance we define inversion such that: $\overline{p} = m$ and $\overline{m} = p$.

The founder reduction applies for any founder, but for convenience we just state the reduction for one male founder. The reduction for all founders is equivalent. Stated formally:

Theorem 6 (Founder Reduction) Let $h \in F$ be a male founder in some pedigree $P = \langle F, N, father, mother \rangle$. Let $C$ denote the set of children of $h$, that is $C = \{ n \in N \mid father(n) = h \}$. Let $v$ and $w$ be two inheritance vectors, such that for any $n \in N$

$$w(n, p) = \begin{cases} \overline{v(n, p)} & : n \in C \\ v(n, p) & : \text{otherwise;} \end{cases} \qquad w(n, m) = v(n, m).$$

Given some genotype information $G = \langle L, astates \rangle$ on the pedigree the following always holds:

$$P(v \mid G) = P(w \mid G). \tag{3-21}$$

Intuitively, the definition of $v$ and $w$ states that if a child of $h$ has inherited $h$'s paternal allele according to $v$, then that child has inherited $h$'s maternal allele in $w$, and vice versa.
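The construction of $w$ from $v$ can be made concrete with a small sketch. The pedigree, the dictionary encoding of an inheritance vector as a map from $(n, \$)$ to `"p"` or `"m"`, and the name `flip_founder` are all assumptions made for this illustration.

```python
INVERT = {"p": "m", "m": "p"}

# Hypothetical pedigree: founders h (male) and g (female) have children c1 and c2;
# c1 and the founder s have the child c3.
father = {"c1": "h", "c2": "h", "c3": "c1"}
mother = {"c1": "g", "c2": "g", "c3": "s"}


def flip_founder(v, h):
    """Return w as in Theorem 6: invert v(n, p) for every child n of h."""
    return {
        (n, side): INVERT[val] if side == "p" and father[n] == h else val
        for (n, side), val in v.items()
    }


v = {
    ("c1", "p"): "p", ("c1", "m"): "m",
    ("c2", "p"): "m", ("c2", "m"): "m",
    ("c3", "p"): "p", ("c3", "m"): "p",
}
w = flip_founder(v, "h")
```

Only the paternal entries of `c1` and `c2` change. Applying `flip_founder` a second time returns the original vector, so the inheritance vectors fall into pairs that share one probability.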

Proof: Prior to proving for multi point linkage analysis we prove that it holds for single point linkage analysis. The definition of $v$ and $w$ implies the following relationship between $F^v$ and $F^w$:

$$F^w(n, \$) = \begin{cases} (h, m) & : F^v(n, \$) = (h, p) \\ (h, p) & : F^v(n, \$) = (h, m) \\ F^v(n, \$) & : \text{otherwise,} \end{cases}$$

where $\$ \in \{p, m\}$. Theorem 1 states that:

$$\varphi^v_G \overset{def}{=} \bigwedge_{n \in L} (f_{F^v(n,p)} = a_1 \wedge f_{F^v(n,m)} = a_2) \vee (f_{F^v(n,p)} = a_2 \wedge f_{F^v(n,m)} = a_1),$$

where $a_1, a_2 \in astates(n)$. By definition $\varphi^w_G$ is equal to $\varphi^v_G$ with regard to all sub-expressions, except that any sub-expression

$$(f_{h,\$} = a_1 \wedge f_{n,\$'} = a_2) \vee (f_{h,\$} = a_2 \wedge f_{n,\$'} = a_1)$$

in $\varphi^v_G$ looks like

$$(f_{h,\overline{\$}} = a_1 \wedge f_{n,\$'} = a_2) \vee (f_{h,\overline{\$}} = a_2 \wedge f_{n,\$'} = a_1)$$

in $\varphi^w_G$. This means that any constraint on founder allele $(h, \$)$ in $\varphi^v_G$ is also a constraint on $(h, \overline{\$})$ in $\varphi^w_G$.

This implies that for any founder allele assignment, $Z_M$, which satisfies $\varphi^v_G$ and assigns the allelic state $a$ to $(h, \$)$, there exists a similar founder allele assignment with equal probability, $Z'_M$, which satisfies $\varphi^w_G$ and assigns the allelic state $a$ to $(h, \overline{\$})$. In other words:

$$\forall Z_M \in [\![\varphi^v_G]\!] \; \exists Z'_M \in [\![\varphi^w_G]\!] \; \forall n \in F. \; Z'_M(n, \$) = \begin{cases} Z_M(n, \overline{\$}) & : n = h \\ Z_M(n, \$) & : \text{otherwise,} \end{cases}$$

for a fixed $h$. The opposite is also true since we can apply the same argument using $v$ as $w$ and $w$ as $v$. Furthermore, $P(Z_M) = P(Z'_M)$ since they assign exactly the same set of allelic states to founder alleles, which in turn implies that:

$$\sum_{Z_M \in [\![\varphi^v_G]\!]} P(Z_M) = \sum_{Z'_M \in [\![\varphi^w_G]\!]} P(Z'_M)$$

and from (3-5) we deduce:

$$P(v \mid G) = P(w \mid G). \tag{3-22}$$
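The single point argument can be checked by brute force on a toy example. The sketch below enumerates every founder allele assignment and adds up the probability of those consistent with the observed genotypes, once for $v$ and once for $w$. The pedigree, the allele frequencies, the genotypes and all names are assumptions made for this illustration, and the sums are the unnormalised quantities from the proof, not the normalised $P(v \mid G)$.

```python
from itertools import product

# Hypothetical nuclear family: founders h and g, children c1 and c2.
father = {"c1": "h", "c2": "h"}
mother = {"c1": "g", "c2": "g"}
founders = ["h", "g"]
freq = {1: 0.7, 2: 0.3}                    # assumed allele frequencies
genotypes = {"c1": (1, 2), "c2": (2, 2)}   # unordered genotypes of the members of L


def source(v, n, side):
    """Founder allele F^v(n, side) from which the given allele of n descends."""
    if n in founders:
        return (n, side)
    parent = father[n] if side == "p" else mother[n]
    return source(v, parent, v[(n, side)])


def consistent_mass(v):
    """Sum of P(Z) over founder allele assignments Z that satisfy the genotypes."""
    slots = [(f, side) for f in founders for side in "pm"]
    total = 0.0
    for states in product(freq, repeat=len(slots)):
        z = dict(zip(slots, states))
        ok = all(
            sorted((z[source(v, n, "p")], z[source(v, n, "m")])) == sorted(g)
            for n, g in genotypes.items()
        )
        if ok:
            prob = 1.0
            for a in states:
                prob *= freq[a]
            total += prob
    return total


def flip_founder(v, h):
    """w from Theorem 6: invert the paternal entry of every child of h."""
    swap = {"p": "m", "m": "p"}
    return {
        (n, s): swap[x] if s == "p" and father[n] == h else x
        for (n, s), x in v.items()
    }


v = {("c1", "p"): "p", ("c1", "m"): "m", ("c2", "p"): "m", ("c2", "m"): "m"}
w = flip_founder(v, "h")
mass_v, mass_w = consistent_mass(v), consistent_mass(w)
```

Every assignment counted for `v` has a partner counted for `w` in which the two alleles of `h` are exchanged, so the two sums agree.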

case. To do this we use the fact that founder reduction holds in the single point case, that is, we use that (3-22) holds. We consider two markers $M_1$ and $M_2$ with computed single point probabilities. We prove that transferring the probabilities from $M_1$ to $M_2$ preserves the equality between the probabilities of the two inheritance vectors $v_2$ and $w_2$, where $v_2$ and $w_2$ are inheritance vectors for marker $M_2$, and $v_2$ and $w_2$ are defined as $v$ and $w$, respectively, in Theorem 6. Hence, we want to show that two inheritance vectors in the same equivalence class, due to the founder reduction at marker $M_2$, remain in the same equivalence class after we have propagated the evidence (genotype information) observed at marker $M_1$.

Assume that a vector, $v_1$, is some inheritance vector for marker $M_1$, and $w_1$ is defined in terms of $v_1$ as described in Theorem 6. Observe that $P(v_1 \mid G_1) = P(w_1 \mid G_1)$, and $P(v_2 \mid G_2) = P(w_2 \mid G_2)$ due to (3-22), where $G_1$ and $G_2$ are the genotype information for marker $M_1$ and $M_2$, respectively. We want to show that:

$$P(v_2 \mid G_1, G_2) = P(w_2 \mid G_1, G_2) \tag{3-23}$$

Let the recombination fraction between $M_1$ and $M_2$ be $\theta_1$. Then the contribution in terms of probability from $v_1$ to $v_2$ is given by $\theta_1^{d_v} (1 - \theta_1)^{n - d_v}$, where $d_v = Ham(v_1, v_2)$ is the Hamming distance between the vectors and $n$ is the length of an inheritance vector for the pedigree. To show that (3-23) holds we need to show that the contribution of $v_1$ and $w_1$ to the probability of $v_2$ is exactly the same as the contribution of $v_1$ and $w_1$ to $w_2$. To do this we prove that $Ham(v_1, v_2) = Ham(w_1, w_2)$ and that $Ham(v_1, w_2) = Ham(w_1, v_2)$, because this implies that $v_2$ and $w_2$ are updated with exactly the same probabilities given the observed genotype for $M_1$. We only prove this for $Ham(v_1, v_2) = Ham(w_1, w_2)$ since the opposite is similar.

The values of $w_1$ and $v_1$ are identical for all $(n, \$)$, except in the case where $\$ = p$ and $father(n) = h$, in which case the values differ. Since the same applies for $v_2$ and $w_2$, we need only concentrate on Hamming distances between the parts of the inheritance vectors that concern children of $h$. For any $(n, \$)$, if $v_2(n, \$) \neq v_1(n, \$)$ then, by the definition of $w_2$ and $w_1$, it follows that $w_2(n, \$) \neq w_1(n, \$)$. The same argument applies for parts of the inheritance vectors where $v_2(n, \$) = v_1(n, \$)$. In this case, by definition $w_2(n, \$) = w_1(n, \$)$, hence the Hamming distance between $v_1$ and $v_2$ is equal to the Hamming distance between $w_1$ and $w_2$.

The same can be shown for $Ham(v_1, w_2) = Ham(w_1, v_2)$ with the same arguments. For this reason, the founder reduction holds in the multi point step.

♦
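The two Hamming identities used in the proof, and the resulting equality (3-23), can be checked numerically. The sketch assumes a bit encoding of inheritance vectors in which passing from $v$ to $w$ amounts to an XOR with a fixed mask marking the paternal bits of the children of $h$. The mask value, the random symmetric distributions and all names are assumptions made for this illustration.

```python
import random


def hamming(a, b):
    """Number of differing bits between two encoded inheritance vectors."""
    return bin(a ^ b).count("1")


def propagate(p1, p2, theta, n):
    """Unnormalised P(v_2 | G_1, G_2) for every v_2, by direct summation."""
    size = 1 << n
    out = []
    for w in range(size):
        s = 0.0
        for v in range(size):
            d = hamming(v, w)
            s += p1[v] * theta ** d * (1.0 - theta) ** (n - d)
        out.append(p2[w] * s)
    return out


def symmetric_distribution(n, mask, rng):
    """Random weights with p[v] == p[v ^ mask], mimicking (3-22)."""
    p = [0.0] * (1 << n)
    for v in range(1 << n):
        if v <= v ^ mask:
            p[v] = p[v ^ mask] = rng.random()
    return p


n, theta = 4, 0.15
mask = 0b0101          # assumed positions of the paternal bits of h's children
rng = random.Random(0)

# Ham(v1, v2) = Ham(w1, w2) and Ham(v1, w2) = Ham(w1, v2) for all pairs
hamming_ok = all(
    hamming(v1, v2) == hamming(v1 ^ mask, v2 ^ mask)
    and hamming(v1, v2 ^ mask) == hamming(v1 ^ mask, v2)
    for v1 in range(1 << n)
    for v2 in range(1 << n)
)

p1 = symmetric_distribution(n, mask, rng)
p2 = symmetric_distribution(n, mask, rng)
post = propagate(p1, p2, theta, n)
# the propagated weights of v_2 and w_2 coincide up to rounding
symmetry_ok = all(abs(post[v] - post[v ^ mask]) < 1e-12 for v in range(1 << n))
```

Both flags evaluate to true for any non-zero mask, since XOR with the same mask on both arguments leaves the set of differing bits unchanged.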

In [Gud00] Gudbjartsson defines founder couple reduction as an additional way of reducing the number of computations performed during single and multi point linkage analysis. Intuitively, founder reduction states that we cannot distinguish between the maternal and paternal alleles of founders, that is, we can switch between their two alleles. Founder couple reduction states, intuitively, that we cannot distinguish the male and female founder of a founder couple. A founder couple is two founders who share offspring in the pedigree. We state this formally in the following theorem:

Theorem 7 (Founder Couple Reduction) Let $h, h' \in F$ be the male and female founder, respectively, of a founder couple in some pedigree $P = \langle F, N, father, mother \rangle$. Let $C$ denote the set of children of $h$ and $h'$, that is $C = \{ n \in N \mid father(n) = h \text{ and } mother(n) = h' \}$. Let $v$ and $w$ be two inheritance vectors, such that for any $n \in N$

$$w(n, \$) = \begin{cases} v(n, \overline{\$}) & : n \in C \\ \overline{v(n, \$)} & : father(n) \in C \text{ or } mother(n) \in C \\ v(n, \$) & : \text{otherwise.} \end{cases}$$

Given some genotype information $G = \langle L, astates \rangle$ on the pedigree, where $h, h' \notin L$ and where neither $h$ nor $h'$ has any children in $V$ not in $C$, the following always holds:

$$P(v \mid G) = P(w \mid G). \tag{3-24}$$

We do not prove this, but refer the keen reader to [Gud00, page 35] for more information on founder couple reduction.

In document BRICS Basic Research in Computer Science (pages 69-73)
