On the Precision of the Analysis - A Control Flow Analysis of LySa

3.2 A Control Flow Analysis of LySa

3.2.4 On the Precision of the Analysis

The analysis makes use of approximations, which means that the analysis results can, in general, be imprecise. The section discusses in pragmatic terms how the approximations effect the analysis result.

First of all, it should be clear from a number of the results in the previous section that the analysis cannot tell two names apart if their canonical names

are identical. This approximation may cause imprecise result whenever matching of two name take place. Semantically, a matching of two names will fail if the names are different. However, if the two names have the same canonical name, the corresponding match mimicked by the analysis will succeed, thereby, leading to an imprecise result. For example, names generated at a replicated restriction, such as !(ν n)P, cannot be distinguished by the analysis. This may lead to imprecise results, if a process relies on exact matching of the different names generated at the replicated instances of (ν n) . The analysis will, on the other hand, be capable of distinguishing names generated at restrictions that occur at different places in the syntax of the process that is analysed.

In LySa pattern matching takes place at input and at decryption. The analysis of this matching is carried out in a fairly precise manner such that the process following a matching construct only is analysed if the matching may succeed.

However, imprecision may still occur in connection with matching because the analysis records variable binding inρwithout any regard to the context in which the variables are used. This is best illustrated by an example.

Example 3.13 The first parallel branch in the process below sends a message mess1 together with a copy of the messages signed with the key K⁻. The second branch receives a pair and checks whether the second half of the pair is a signed version of the first; if that is the case then the message is sent out on the network. The third branch also sends a pair, which may be received by the second branch. However, semantically this pair does not pass the match in the decryption because there is no signed message. Consequently,mess2 will never be sent as a one-tuple on the network.

hmess1,{|mess1|}_K−i.0|(;y, x).decryptxas{|y;|}_K+inhyi.0

| hmess2,garbagei.0

However, the following is the best analysis result for the process.

ρ(bxc) = {{|bmess1c|}_bK−c,bgarbagec}

ρ(byc) = {bmess1c,bmess2c}

κ = {bmess1c {|bmess1c|}_bK−c,bmess2c bgarbagec, bmess1c,bmess2c }

Notice, in particular, that the analysis reports that mess2 may be sent as a one-tuple on the network. This happens because the matching at decryption succeeds semantically and, consequently, the analysis requires all elements in

ρ(byc) to also be inκ.

The problem illustrated in Example 3.13 is that the variables are recorded in too crude a way by the analysis. It is easy to come up with more sophisticate schemes to record the binding of variables that will improve the way that the

analysis handles these situations. The difficult part is, however, to do this with-out significantly increasing the computational complexity of the analysis. For example, [34] suggests a compromise that allows you to syntactically indicate for which matchings the analysis should spend extra efforts.

Changing the analysis is not the only way around the problem. Often, it is possible adapt the LySa process in such a way that it still models the same application but without getting the imprecise results. For example, the problem in Example 3.13 can solved by utilising that the decrypt construct for validating signatures can both validate a signature and extract its content at the same time.

Hence, it suffices to send only the signed message and leave out the message in clear. That is, the first part of the process in Example 3.13 may instead be modelled as

h{|mess1|}_K−i.0|(;x).decryptxas{|;y|}_K+inhyi.0

Now, the analysis can correctly report thatywill only become bound to messages that have been signed by K⁻. Consequently, only these messages will be sent on the network. It is, of course, hard to say how often similar modifications can be found to solve various practical modelling issues. However, the imprecision in the analysis results that arise on the account of crude way variable bindings are recorded turn out not to be a problem for surprisingly many of the practical applications on which the analysis has been deployed.

Finally, another deliberate approximation is that the analysis of communication does not take into account that communication is synchronous. That is, seman-tically there need to be two parties present for the communication to succeed.

The reason for ignoring this is that the analysis is tailored for a setup where an attacker is present. This attacker will always be ready to participate in any communication. Consequently, there is no reason to spend effort on detailed analysis of whether communication may succeed, because in a setup containing an attacker, it always will. Analysis of a setup where the attacker is present is the topic of Chapter 5. First, however, Chapter 4 describes how the analysis can be implemented.

Implementation

The goal for an implementation of an analysis is to compute the analysis result for a given process. For the analysis of LySa given in Chapter 3 this means that given a processP the implementation will computeρandκsuch thatρ, κ|=P. The overall layout of the implementation consists of two steps that must be carried out whenever a processP is analysed:

(1) generate a formula; apply a functionG:Proc→Formula to P,

(2) find a solution that satisfies the formula; apply a functionS:Formula→ Solution toG(P).

This implementation strategy is typical for implementations of static analy-ses [109, Chapter 6]. Often the formula is referred to as a constraint (on the analysis components) and for this reason these static analysis techniques are also known as constraint based analysis.

To attain an implementation of a specific analysis the two function G and S must be given. This chapter describes how to attain these functions to provide an implementation of the analysis of LySa. The implementation described here is meant as a proof-of-concept implementation that aims only at illustrating that the analysis can indeed be implemented and that an analysis result is computable in reasonable time for realistic examples. Thus, considerations with regard to low level optimisation of run-time, software engineering principals, etc. are not the focal point of this presentation. Instead, the focus will be on attainingGand

S in a correct manner with respect to the original formulation of the analysis in Table 3.1.

The development of a constraint solver, S, is quite an extensive undertaking.

The easiest way forward is, therefore, to use an already developed solver as the function S. This is also preferable from a complexity point of view because solvers developed purely for the purpose of solving constraints are usually man-ufactured by people who pay much attention to complexity aspects of their tools.

Deciding on which solver to use relies on several considerations such as:

• what solvers are available,

• how close is the constraint format to the analysis specification, and

• what techniques are available to prove the implementation correct.

Considering the specification style of the analysis, it is most natural to look for solvers that are based around some kind of logical formalism. This choice is likely to make the reasoning about the correctness of the implementation easier than if one has to relate to a completely different framework.

A main challenge in implementing the analysis of LySa is that it is specified over infinite set of terms built from names and encryption. Therefore, the chosen solver needs to be able to deal with infinite set of terms but unfortunately such tools are limited in supply. The options include tools for solving systems of set constraints [7] such as Bane [12] and Banshee [13]. Another option is Mona [102], which provides checking of satisfiability of fairly powerful logic over first order terms built by a successor operation [83]. It is, however, not a priori clear that these tools cater for an encoding of the analysis of LySa.

Alternatively, one may consider Prolog inspired tools based around Horn clauses where predicates may range over arbitrary sets of terms. Prolog based tools have long been used for protocol analysis [93] but these tools do, however, not have a general guarantee of termination. A recent stand of work [18, 2, 19, 20]

shows that termination of Horn clause based tools need not be a problem for a large class of relevant examples. This development, however, requires quite an amount of hand-crafting of the solving engine used.

The implementation made here takes another direction that will guarantee ter-mination. It uses the Succinct Solver [112] and encodes the analysis domains in a finite way such that termination is guaranteed. The main motivation for choosing this approach is that it has already proven a viable and efficient tool for the implementation of a control flow analysis of the Spi-calculus [111]. It is plausible that the analysis of LySa may be implemented using a similar strategy because LySa is a close relative of the Spi-calculus and the two analyses are relatively similar. Thus, the implementation presented in this chapter will be an adaption of technique presented in [111] into the LySa setting and any significant

differences will be noted whenever it is appropriate. With the implementation of the analysis of LySa it is illustrated that the technique of [111] can be ported to other settings and that it scales well to application on large, practical examples.

Furthermore, a number of optimisations will be presented that relies on manip-ulation of labels (which are introduced in Section 4.1). These optimisations are all novel contributions of this thesis.

The input to the Succinct Solver is a formula in Alternation-free Least Fixed Point logic (ALFP). This logic may be regarded as an extension of Horn clauses or as a fragment of first order predicate logic. When the ALFP formula is interpreted over a finite, unstructured universe the Succinct Solver will compute interpretations of predicates that satisfy the formula. The Succinct Solver may thereby be regarded as the solving functionS.

To obtain a formula generation function, G, one can use the definition of the analysis predicate in Table 3.1 by viewing it as a function that takes the process on the left-hand-side of the iff as an argument and returns the formula on the right-hand-side. However, the formulae in Table 3.1 are not ALFP formulae interpreted over a finite universe. To instead obtain a generation function that produces ALFP formulae one has to find a formulation that only uses features available in ALFP. This will basically be done by encoding the analysis com-ponents as predicates and transforming the analysis accordingly. The encoding will, however, be somewhat involved and to make presentation easier, the defi-nition of the generation function,G, will be derived in a number of steps in the following sections.

In document Automated Analysis of Security in Networking Systems (Sider 53-59)