

3.4 Multi Point Analysis

3.4.2 Multi Point Probability Computation

The objective of multi point probability computation is to calculate a better probabilistic estimate of the inheritance pattern for each marker given all available genotype information, based on the single point probability distribution. The method we describe is inspired by [LG87], where the problem is described by means of a hidden Markov model (HMM)^21. This approach is widely used because its complexity grows only linearly with the number of markers introduced, although it grows exponentially with the size of the pedigree.

^20 See page 9 for more on crossover.
^21 See Appendix D for more on hidden Markov models.

Furthermore, we show that HMMs are too restrictive to represent the problem, but that an ordinary Bayesian network with structure similar to a HMM can be used.

Bayesian Networks

In this section we use Bayesian networks as a framework for computing the multi point probability distribution. This is consistent with the theory used in [Gud00] and [LG87]. The description of Bayesian theory used here is based on [Jen01].

Definition 12 (Bayesian Network) A Bayesian network is a tuple ⟨V, E⟩ where:

- V: a set of variables, each with a set of mutually exclusive states,
- E ⊆ V × V: a set of directed edges between variables, and
- each variable A ∈ V with parents B_1, ..., B_n has a (conditional) probability table P(A | B_1, ..., B_n) attached.

If (B, A) ∈ E then the edge is directed from B to A, and B is said to be the parent of A. We denote the states of a variable A by A = (a_0, ..., a_k).

When modeling a problem using Bayesian networks, the directed edges should represent the causality of the problem domain. That is, there is an edge from B to A if the state of B affects the state of A.

The joint probability distribution^22 over all variables in a Bayesian network is calculated using the chain rule of Bayesian networks:

Theorem 5 (Chain Rule) Let BN be a Bayesian network over the variables V = {A_1, ..., A_n}. Then the joint probability distribution P(V) is given by:

P(V) = ∏_i P(A_i | pa(A_i)),

where pa(A_i) is the parent set of A_i.

For the proof we refer to [Jen01, page 21].
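As an illustration of the chain rule, the following minimal sketch in Python computes the joint distribution for a three-variable network A → B → C; all probability values are made up for the example:

```python
import numpy as np

# P(A), P(B | A), and P(C | B) for a small network A -> B -> C.
# The numbers are illustrative, not taken from the genetics domain.
p_a = np.array([0.6, 0.4])                   # P(A)
p_b_given_a = np.array([[0.9, 0.1],          # P(B | A): rows indexed by A
                        [0.3, 0.7]])
p_c_given_b = np.array([[0.5, 0.5],          # P(C | B): rows indexed by B
                        [0.2, 0.8]])

# Chain rule: P(A, B, C) = P(A) * P(B | A) * P(C | B),
# stored as a 3-dimensional table of size |A| * |B| * |C|.
joint = (p_a[:, None, None]
         * p_b_given_a[:, :, None]
         * p_c_given_b[None, :, :])

assert np.isclose(joint.sum(), 1.0)          # a joint distribution sums to 1

# Marginalizing out A and B yields P(C).
p_c = joint.sum(axis=(0, 1))
```

This also illustrates footnote 22: with three variables the joint is a 3-dimensional table of size |A| · |B| · |C|.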

Hidden Markov models belong to a special kind of Bayesian network, namely dynamic Bayesian networks^23. The structure of a hidden Markov model as a dynamic Bayesian network is illustrated in Figure 3-10. Each step, t_i, is referred to as a time slice or time step. For every time slice i, the conditional probability table P(O_i | S_i) is the same and all transition probabilities between time slices are the same, that is, P(S_i | S_{i-1}) = P(S_j | S_{j-1}) for all i and j. This is consistent with the definition of [Jen01].

^22 If the universe, V, consists of three variables A, B, and C, then the joint probability P(V) = P(A, B, C), which is a 3-dimensional table of size |A| · |B| · |C|.

Figure 3-10: The structure of a hidden Markov model as a dynamic Bayesian network.

In the remaining part of this section we explain how multi point probability computation shares structure with hidden Markov models. Under some assumptions the problem can be described as a HMM. These assumptions state that the possible observable genotype information at each marker is the same, and that the recombination fraction between each pair of adjacent markers is the same. This is necessary in order to keep each conditional probability table identical over time slices (according to the definition of HMMs). We follow the construction introduced by Lander and Green in [LG87] and show why the constructed Bayesian network is a HMM under these assumptions only.

As introduced by [LG87], the construction of the Bayesian network goes as follows. The parent variable of time slice i, v_i, becomes a variable with states corresponding to the set of inheritance vectors. The child variable for the same time slice, G_i, has states corresponding to the possible genotype information. In this way we keep the causality that the inheritance vectors change the probability of observable genotype information. The i-th probability distribution over inheritance vectors refers to the inheritance distribution for the i-th marker on the chromosome. The a priori probability table of v_1 is the standard Mendelian inheritance distribution stating that, initially, all inheritance vectors have equal probability. The conditional probability table of v_j for j > 1 is given by:

P(v_j = v | v_{j-1} = w) = θ_{j-1}^d · (1 − θ_{j-1})^{n−d},

where n is the length of the inheritance vector, d is the Hamming distance between v and w, and θ_{j-1} is the recombination fraction between markers M_{j-1} and M_j. This is because the contribution from adjacent markers is given by the number of crossovers that have occurred, which is exactly the Hamming distance between the inheritance vectors at the two markers.
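The transition table P(v_j | v_{j-1}) can be sketched as below. The integer encoding of inheritance vectors and the `transition_table` helper are illustrative assumptions, not part of the original construction; the formula itself is the one above:

```python
import numpy as np

# Sketch of P(v_j = v | v_{j-1} = w) = theta^d * (1 - theta)^(n - d),
# where d is the Hamming distance between the inheritance vectors v and w.
# Inheritance vectors of length n are encoded as the integers 0 .. 2^n - 1,
# so the Hamming distance is the popcount of the XOR of the two encodings.

def transition_table(n, theta):
    """Return the 2^n x 2^n table P(v_j | v_{j-1}) for recombination fraction theta."""
    size = 2 ** n
    table = np.empty((size, size))
    for w in range(size):
        for v in range(size):
            d = bin(w ^ v).count("1")        # d = Ham(v, w)
            table[w, v] = theta ** d * (1 - theta) ** (n - d)
    return table

t = transition_table(n=4, theta=0.1)
# Each row is a probability distribution over the next inheritance vector,
# since sum_d C(n, d) * theta^d * (1 - theta)^(n - d) = 1.
assert np.allclose(t.sum(axis=1), 1.0)
```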

The structure is similar to that of a HMM of length m, where m is the number of markers, but since both P(v_j | v_{j-1}) and P(G_j | v_j) differ at each time slice this is not a HMM.

Figure 3-11: Multi point probability computation represented as a Bayesian network with similar structure to a hidden Markov model.

The parent node of each time slice is a variable representing all inheritance vectors, whereas the child node is an instance of genotype information, that is, a state of the genotype variable. This is not consistent with the definition of Bayesian networks, but we introduce it to symbolise that the genotype information is always observable, that is, we know the state of the genotype variable. This observation might be that no genotype information is available. Furthermore, to propagate evidence in the form of genotype information we need only P(G_i | v_i). In this sense, the state space in each parent node remains the same, but the observations and probabilities change over each time slice.

Using this model we are able to compute the probability of all markers given the available genotype information. Let G_i denote the observed genotype information at locus i and let G_all = {G_i | 1 ≤ i ≤ m} denote the set of all genotype information. We can then compute P(v_1, ..., v_m | G_all) by applying Bayes' rule:

P(v_1, ..., v_m | G_all) = P(v_1, ..., v_m, G_all) / P(G_all).

Notice that the denominator is just the normalization factor, so we need only calculate the numerator. By applying the chain rule for Bayesian networks the numerator is given as:

P(v_1, ..., v_m, G_all) = P(v_1)P(G_1 | v_1)P(v_2 | v_1)P(G_2 | v_2)P(v_3 | v_2) ··· P(v_m | v_{m-1})P(G_m | v_m).
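To make the cost of this expansion concrete, here is a brute-force sketch that evaluates the numerator for every assignment of inheritance vectors. The tables are random stand-ins for the real CPTs, and the state space is kept tiny; with 2^|v| inheritance vectors per marker, the number of assignments is (2^|v|)^m:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
m, states = 3, 4                 # m markers, 2^|v| = 4 inheritance vectors

prior = np.full(states, 1.0 / states)     # Mendelian prior P(v_1): uniform
# trans[j] is a random stand-in for P(v_{j+2} | v_{j+1}); rows sum to 1.
trans = [rng.dirichlet(np.ones(states), size=states) for _ in range(m - 1)]
# obs[i] is a stand-in for the likelihoods P(G_i | v_i) of the observed G_i.
obs = [rng.random(states) for _ in range(m)]

def numerator(assignment):
    """P(v_1)P(G_1|v_1)P(v_2|v_1)P(G_2|v_2)...P(v_m|v_{m-1})P(G_m|v_m)."""
    p = prior[assignment[0]] * obs[0][assignment[0]]
    for j in range(1, m):
        p *= trans[j - 1][assignment[j - 1], assignment[j]] * obs[j][assignment[j]]
    return p

# P(G_all) is the sum over all states^m assignments -- exponential in m.
p_g_all = sum(numerator(a) for a in product(range(states), repeat=m))
```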

Finding the conditional probability distributions for the individual v_i is just a matter of marginalizing out the other variables.

This approach is inefficient since P(v_1, ..., v_m, G_all) has 2^{|v|·m} entries, where |v| is the length of the inheritance vector. The joint distribution can, however, be rewritten as:

P(v_1, ..., v_m, G_all) ∝ P(v_1 | G_1)P(v_2 | v_1)P(v_2 | G_2)P(v_3 | v_2) ··· P(v_m | v_{m-1})P(v_m | G_m),

since P(v_1) is the Mendelian probability distribution and from (3-4) in single point probability computation we have that P(G_i | v_i) ∝ P(v_i | G_i). So far we have achieved no reduction in size, but the reduction becomes apparent when we want to calculate the probability of a single variable given the genotype information. The reason is that we can use the commutative and distributive properties of marginalization [Jen01, page 16], which means that to marginalize out a variable we need only consider the probability tables containing the variable. This enables us to define the left-conditioned probability, P^L_i, and the right-conditioned probability, P^R_i, recursively as:

P^L_{i+1} = P^L_i · Σ_{v_i} P(v_i | G_i)P(v_{i+1} | v_i), for 1 ≤ i < m, and    (3-18)

P^R_{i-1} = P^R_i · Σ_{v_i} P(v_i | G_i)P(v_i | v_{i-1}), for 1 < i ≤ m,    (3-19)

with base cases P^L_1 = 1 = P^R_m. From this it follows that the probability of the variable v_i conditioned on all genotype information is:

P(v_i | G_all) ∝ P^L_i · P(v_i | G_i) · P^R_i.    (3-20)
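Equations (3-18)–(3-20) can be sketched as the following recursion. The single point distributions P(v_i | G_i) and the transition tables are random stand-ins here; note that P^L_i and P^R_i are functions of v_i and therefore sit inside the sums:

```python
import numpy as np

rng = np.random.default_rng(1)
m, states = 5, 4                 # m markers, a tiny inheritance-vector space

single = [rng.dirichlet(np.ones(states)) for _ in range(m)]  # P(v_i | G_i)
# trans[j][w, v] stands in for P(v_{j+2} = v | v_{j+1} = w); rows sum to 1.
trans = [rng.dirichlet(np.ones(states), size=states) for _ in range(m - 1)]

# Base cases: P^L_1 = 1 = P^R_m.
left = [np.ones(states)] + [None] * (m - 1)
right = [None] * (m - 1) + [np.ones(states)]

# (3-18): P^L_{i+1}(v_{i+1}) = sum_{v_i} P^L_i(v_i) P(v_i | G_i) P(v_{i+1} | v_i)
for i in range(m - 1):
    left[i + 1] = (left[i] * single[i]) @ trans[i]

# (3-19): P^R_{i-1}(v_{i-1}) = sum_{v_i} P^R_i(v_i) P(v_i | G_i) P(v_i | v_{i-1})
for i in range(m - 1, 0, -1):
    right[i - 1] = trans[i - 1] @ (right[i] * single[i])

def posterior(i):
    """P(v_i | G_all), proportional to P^L_i * P(v_i | G_i) * P^R_i  (3-20)."""
    p = left[i] * single[i] * right[i]
    return p / p.sum()           # the denominator is just the normalization factor

# Every marginal is a proper distribution over the inheritance vectors.
assert all(np.isclose(posterior(i).sum(), 1.0) for i in range(m))
```

Each step touches only the tables containing the marginalized variable, so the cost is linear in m rather than exponential.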

Figure 3-12: One step in the left-conditioned probability calculations, showing the transition P(w_3 | v_i) = θ_1^{d_i} · (1 − θ_1)^{n−d_i}, where d_i = Ham(v_i, w_3).

Intuitively, we first calculate all the single point probabilities at each marker locus. We then transfer the updated probabilities one step at a time across the chromosome towards the marker for which we want to calculate P(v_i | G_all). Figure 3-12 shows one such step between markers 1 and 2; in the figure, v_2 = w_3 is updated according to all states of v_1.

The idea behind multi point analysis is that we exploit the fact that few crossovers are most likely to occur between two markers, since the recombination fraction between two adjacent markers is always less than 1/2. For instance, if we have observed the exact state of v_1 as v_3, that is, P(v_1 = v_3 | G_1) = 1, then all inheritance vectors at v_2 close^24 to v_3 get increased probability, whereas inheritance vectors far^25 from v_3 get decreased probability. More precisely, the contribution from any state v_i of v_1 is given by the probability of the inheritance vector at marker 1 conditioned on the genotype information at that locus, P(v_i | G_1), times the transition probability P(w_3 | v_i). The transition probability is determined by the Hamming distance between the two inheritance vectors, which indicates the number of odd crossovers that have occurred between the two markers on a chromosome. The probability of d crossovers occurring between two markers with an inheritance vector of length n is θ_1^d · (1 − θ_1)^{n−d}.

We then sum over the contribution to w_3 of every vector v_i, which corresponds to marginalizing out v_1. The sum is then multiplied with the conditional probability of w_3 given G_2. This product is proportional to P(w_3 | G_1, G_2). We compute this product for every inheritance vector at marker M_2 to get the full probability distribution, P(v_2 | G_1, G_2)^26. The updated probability distribution at marker 2 can then be used to calculate the left-conditioned probability at marker 3, and so forth. The right-conditioned probability is calculated in a similar fashion.

Using this procedure we can compute P(v_i | G_all) for any v_i. These values can then be used in the scoring functions, e.g. the LOD score, to determine linkage between markers and traits.