CASE: Analysing Genetic Transcription

Pathway Analysis

8.5 CASE: Analysing Genetic Transcription

We now turn to the examination of the abstract model of genetic transcription that was presented in Section 3.1.2. When subjecting this model to Pathway Analysis we obtain the pathway automaton shown in Fig. 8.3, regardless of the CFA used for computingF_⋆.

The result is rather imprecise, and in more than one way:

q_0

Figure 8.3: Pathway analysis of genetic transcription model.

First of all, the system is not expected to have dead end states. Indeed, inspec-tion reveals that the edges (12,20),(11,20),(7,19) are analysis artifacts, and, if this is taken into account, the model seems quite reasonable.

Secondly, the edges (11,19),(12,19), corresponding to the attachment of actual nucleotides, always appear in pairs. This is adequately explained by the fact that the Pathway Analysis does not take the binding of names into account.

8.6 Concluding Remarks

The CFAs of the previous chapters are mainly concerned with the spatial prop-erties of reachable configurations. In contrast, the Pathway Analysis focuses on the causal properties of the realisable transition sequences. Given a model, P⋆, the Pathway Analysis computes a finite, partially deterministic, automaton that safely over-approximates the set of, possibly infinite, sequential behaviours that are realisable byP⋆.

In aims and scope the Pathway Analysis is superficially related to previous occurrence counting analyses of ambient calculi [HJNN99, LM04, GL05]. How-ever, the Pathway Analysis counts capability prefixes, rather than ambients, and, technically, it resorts to an adaptation of classical Monotone Frameworks [KU77, NNH99], rather than Abstract Interpretation. Thus, the analysis ex-tends a line of work on data flow related analyses for the CCS family of process calculi [NN06, NN08].

Pathway automata model system configurations by extended multisets of ex-posed prefixes, The dynamic nature of processes is captured by transfer functions in the manner of classical Bit-vector Frameworks. These functions determine how extended multisets grow and shrink as prefixes react. A worklist algorithm

computes the final automaton. It expands the state space iteratively until no new states arises by application of the transfer functions.

The practical value of the Pathway Analysis is, in part, determined by its com-putational complexity. As presented here the algorithm is exponential, but the presented algorithm is remarkably flexible in this respect. Three mechanisms in total are used to control the tradeoff between precision and termination prop-erties of the algorithm:

Firstly, the analysis counts and hence a notion of widening, in the manner of Abstract Interpretation [CC77, CC79, NNH99], helps to ensure termination in the presence of infinite multiplicities.

Secondly, theF⋆relation, computed by either of the previously presented CFAs, is used to bound the number of enabled transitions explored by the algorithm.

Finally, an equivalence function on states is used to decide whether a computed state is new or constitutes an update to an already existing state. The only such equivalence explored in this dissertation is equality for dom(E), but in practice a variety of functions may be used – as long as they are finitary, stable, and injective in the terminology of [NN08]. A fine-grained equivalence is likely to generate fairly precise analysis results at a high complexity. A coarser equiva-lence relation will give more approximative analysis results at a lower (perhaps even low polynomial) complexity.

Analysis results obtained in the context of the two ongoing case studies are promising, but also indicate viable avenues of improvement:

When applied to the normal receptor LDL pathway model of Section 3.3 the analysis retrieves an automaton that quite clearly identifies the interaction se-quences that constitute the pathway. There are a few spurious edges. These analysis artifacts owe to the fact that the analysis does not take context infor-mation into account. The results obtained from the defect receptor models are also promising. In the case of exoplasmic defects the resulting pathway automa-ton basically accounts for the findings of the CFAs. But, in the case of cytosolic defects, the pathway automaton excludes the interactions that might lead to the wild over-approximations observed in the corresponding CFA results. The obtained result is invariant with respect to the CFA used to produce F_⋆. When applied the transcription model of Section 3.4 the analysis retrieves an automaton that exhibits the overall behaviour of the transcription process, i.e.

a step-wise elongation procedure that may be interrupted at any point, should the required metabolites not materialise. Again, however, the result contains analysis artifacts that are clearly connected to the fact that the analysis does

not take the binding of names into account.

As this model is flat, in the sense that it does not model spatial structure, it is not surprising that the obtained result is also invariant with respect to the CFA used for producing F_⋆. This is in contrast to the LDL pathway result, where one might have hoped that 2CFA would lead to betterF_⋆estimates.

Technically, the two analysis approaches are complementary — one is context sensitive and considers the bindings of names, while the other is flow-sensitive but ignores bindings and context. This is evident in the results, which seem to indicate that an iterative positive-negative feedback loop between the two anal-yses would improve the precision of both; an idea that immediately motivates the developments of the following chapter.

In document Language Based Techniques for Systems Biology (Sider 187-191)