

problem with this approach is that seemingly the only free and mature tool, ABC, only outputs the best candidate and has no option to emit all possible designs, thus making this approach impractical.

3. Formulate the problem to be interpreted by a solver: These types of problems can typically be specified as constraint satisfaction problems (CSPs) or Boolean satisfiability (SAT) problems. Though most solvers are very efficient, it is very difficult to predict the performance of solving a particular problem, potentially making the problems intractable. Some properties, such as how to capture the structure of a Boolean expression, can be very hard to formulate as a CSP or SAT instance.

In order to get a very efficient solution with complete transparency for further extensions we have chosen option 1: to implement a simple technology mapping algorithm by tailoring existing EDA algorithms.

8.3 Library based technology mapping

Given a logic function f, technology mapping is the process of composing a layout of logic gates from a library L to express f. Often several solutions exist to this kind of problem, which is why the process is guided by an objective function to find the best solution.

This is a four-step procedure (heuristic):

1. Minimisation: Represent f as a minimal SoP.

2. Decomposition: Represent the minimised SoP as an and-inverter graph (AIG).

3. Partitioning: Translate the AIG into a forest of single-output AIGs.

4. Covering: Cover each of these AIGs with parts from L.

Several minimal expressions may exist, and due to the increased difficulty of finding valid designs in GDA mapping compared to EDA mapping, we need to examine all minimal forms of a given function.

8.3.1 Minimisation

Several methods for minimising Boolean equations have been developed. Some of the best known are the exact but inefficient Quine-McCluskey algorithm¹ from McCluskey (1956) and the inexact though efficient Espresso heuristic from Brayton et al. (1984). Here exactness refers to the ability to find the minimal expression in terms of literals, and there is a trade-off: larger chip designs often have to rely on "near optimum algorithms" to be tractable to lay out. Muroga (1979) surveys many of the classical minimisation methods. As will be shown in Sec. 8.3.2, the complexity of the synthesised system is very much dependent on the form of the SoP used. In GDA, where it is crucial to keep the overall complexity at a minimum, selecting a good minimisation procedure is of great importance.

As these genetic systems often will be quite simple, at least compared to the complexity of micro-chips, we will rely on the exact Quine-McCluskey algorithm, which is a three-step procedure:

1. Produce a minterm expansion (SoP form) of the function f.

2. Eliminate as many literals as possible by systematically applying XY + XY' = X to obtain the prime implicants.

3. Use a prime implicant chart to select a minimum set of prime implicants that, when ORed together, produce f and contain a minimum number of literals.

The following explanation of these steps is greatly inspired by Frenzel, using the truth-table below as an example:

m     A B C D   f
m0    0 0 0 0   0
m1    0 0 0 1   0
m2    0 0 1 0   0
m3    0 0 1 1   0
m4    0 1 0 0   1
m5    0 1 0 1   0
m6    0 1 1 0   0
m7    0 1 1 1   0
m8    1 0 0 0   1
m9    1 0 0 1   -
m10   1 0 1 0   1
m11   1 0 1 1   1
m12   1 1 0 0   1
m13   1 1 0 1   0
m14   1 1 1 0   -
m15   1 1 1 1   1

¹ Originally named The method of prime implicants, but also referred to as The Tabular Method of Minimisation.


Each row is called a minterm, m0-m15, and "-" means don't care, which is interpreted as: f is allowed to be either true or false. The term implicant is used to describe contractions of one or more minterms. A truth-table is often specified in the equivalent but more compact representation:

f(A, B, C, D) = Σ m(4, 8, 10, 11, 12, 15) + Σ d(9, 14)

1. Minterm expansion

The minterm expansion SoP of this expression is obtained by simply selecting each true minterm:

f(A, B, C, D) = A'BC'D' + AB'C'D' + AB'CD' + AB'CD + ABC'D' + ABCD
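As an illustration, this expansion step can be sketched in a few lines of Python that map minterm indices to product terms; the helper name minterm_to_product and the prime notation are our own choices for this example, not part of any existing tool:

def minterm_to_product(index, variables=("A", "B", "C", "D")):
    # Render a minterm index as a product term, e.g. 4 (binary 0100) -> A'BC'D'
    bits = format(index, "0{}b".format(len(variables)))
    return "".join(v if b == "1" else v + "'" for v, b in zip(variables, bits))

true_minterms = [4, 8, 10, 11, 12, 15]
print(" + ".join(minterm_to_product(m) for m in true_minterms))
# prints A'BC'D' + AB'C'D' + AB'CD' + AB'CD + ABC'D' + ABCD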

2. Identifying the Prime Implicants

In order to minimise this expression, all true and don't care minterms are put in a minterm table, which is grouped on the number of 1s in each minterm:

Group   Minterm   Representation
1       m4        0100
        m8        1000
2       m9        1001
        m10       1010
        m12       1100
3       m11       1011
        m14       1110
4       m15       1111

The rule XY + XY' = X is applied repeatedly to create as large contractions as possible by comparing implicants that differ in only a single bit position.

For example, the two minterms m4 = 0100 and m12 = 1100 differ only in the most significant bit, so they can be represented by the single implicant (m4, m12) = -100. The minterm table grouping makes applying this rule efficient, as each group only needs to be checked against its neighbouring group.
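This contraction rule is simple to mechanise. The sketch below, using our own dash-string representation of implicants, combines two implicants if and only if they agree everywhere except in exactly one non-dash position:

def combine(a, b):
    # Apply XY + XY' = X to two implicants written as strings over {'0', '1', '-'}.
    # Returns the contracted implicant, or None if the rule does not apply.
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and a[diff[0]] != "-" and b[diff[0]] != "-":
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None

print(combine("0100", "1100"))  # '-100', i.e. the implicant (m4, m12)
print(combine("0100", "1010"))  # None, the minterms differ in more than one bit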

The first stage of contraction yields:

Implicant   Representation
m4, m12     -100
m8, m9      100-
m8, m10     10-0
m8, m12     1-00
m9, m11     10-1
m10, m11    101-
m10, m14    1-10
m12, m14    11-0
m11, m15    1-11
m14, m15    111-

In the second stage each of the first-stage contractions is, where possible, reduced further, yielding:

Implicant            Representation
m8, m9, m10, m11     10--
m8, m10, m12, m14    1--0
m10, m11, m14, m15   1-1-

These stages continue until no new reductions can be made; in this case no new reductions are possible after the second stage.


Now the prime implicants, which are all the implicants that cannot be reduced further, can simply be identified by selecting implicants from the highest possible stage until all minterms have been covered. For example, none of the second-stage implicants cover m4, hence the (m4, m12) implicant from the first stage is selected:

f(A, B, C, D) = AB' + AD' + AC + BC'D'

3. Selecting the minimum set of Prime Implicants

The current expression is already greatly reduced compared to the minterm expansion SoP, but it is possible to reduce it further using a prime implicant chart, which is used to analyse which prime implicants cover the necessary minterms, ignoring the don't care minterms m9 and m14.

        m4   m8   m10  m11  m12  m15
AB'          X    X    X
AD'          X    X         X
AC                X    X          X
BC'D'   X                   X

If a minterm is covered by only one prime implicant, that prime implicant is called an essential prime implicant and has to be part of the minimal expression, thus these are identified first. In this case BC'D' and AC are identified, because they are the only prime implicants covering m4 and m15, respectively. Together these two implicants cover the minterms m4, m10, m11, m12 and m15, leaving the remaining minterm m8 to be covered using the remaining prime implicants.

Here both AD' and AB' cover m8 and are therefore both valid candidates. Thus the two minimal SoPs become:

f1(A, B, C, D) = AB' + AC + BC'D'
f2(A, B, C, D) = AD' + AC + BC'D'

In more advanced examples, selecting the minimal set of prime implicants covering all remaining minterms is not necessarily trivial. These situations are typically resolved using either an exhaustive approach or Petrick's method, introduced in McCluskey (1956).
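For charts of the size used in this chapter, an exhaustive search over ever larger subsets of prime implicants is perfectly adequate; Petrick's method only becomes necessary for larger instances. A small, self-contained sketch (with our own naming, and with the don't care minterms already removed from the chart) reproduces both minimal SoPs of the example:

from itertools import combinations

def minimal_covers(chart, required):
    # chart: prime implicant -> set of required minterms it covers.
    # Returns all smallest subsets of prime implicants that together cover 'required'.
    names = list(chart)
    for size in range(1, len(names) + 1):
        covers = [c for c in combinations(names, size)
                  if required <= set().union(*(chart[n] for n in c))]
        if covers:
            return covers
    return []

chart = {"AB'": {8, 10, 11}, "AD'": {8, 10, 12}, "AC": {10, 11, 15}, "BC'D'": {4, 12}}
print(minimal_covers(chart, {4, 8, 10, 11, 12, 15}))
# two covers: (AB', AC, BC'D') and (AD', AC, BC'D'), i.e. f1 and f2 above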

8.3.2 Decomposition

The minimal SoP is translated into an AIG, as this is the de facto standard data-structure for EDA-based technology mapping. The popularity of this representation stems from its very efficient mapping algorithms as well as the ability to easily perform optimisations by rewriting.

An AIG is a directed acyclic graph that by its structure implements logical expressions. An AIG has only two types of nodes, namely the two-input And-node and the Inverter-node. Creating an AIG representation of f is relatively straightforward using De Morgan's theorem and basic propositional algebra, informally summarised as: every OR-gate can be replaced by an AND-gate with inverters on all in- and outputs. See an example in Fig. 8.3.

Figure 8.3: Equivalent representations of O = (AB) + C'. (a) Regular layout; (b) in AIG.
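A minimal sketch of the two node types and of the De Morgan rewrite, building the AIG of Fig. 8.3; the class and function names are our own and only serve to illustrate the construction:

class Input:
    def __init__(self, name): self.name = name

class Not:
    def __init__(self, child): self.child = child

class And:
    def __init__(self, left, right): self.left, self.right = left, right

def aig_or(x, y):
    # De Morgan: x + y = (x'y')', expressed with only And- and Inverter-nodes.
    return Not(And(Not(x), Not(y)))

def evaluate(node, env):
    if isinstance(node, Input):
        return env[node.name]
    if isinstance(node, Not):
        return not evaluate(node.child, env)
    return evaluate(node.left, env) and evaluate(node.right, env)

A, B, C = Input("A"), Input("B"), Input("C")
O = aig_or(And(A, B), Not(C))  # the AIG form of O = (AB) + C'

# The AIG agrees with the original expression on all eight input combinations.
assert all(evaluate(O, {"A": a, "B": b, "C": c}) == ((a and b) or not c)
           for a in (False, True) for b in (False, True) for c in (False, True))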

Unfortunately these kinds of translations are non-canonical when the sum has more than two products or these products have more than two literals. Consequently the AIG is not as general as the SoP, since an ordering is imposed upon how to convert N-input gates to 2-input gates; Fig. 8.4 illustrates this. As a result, the pattern-matching may fail to find viable designs. Within the world of digital electronics this fact is typically only of minor concern, as such situations are usually resolved by simply using the layout leading to the shortest delay (or another objective). Such criteria can be applied there because an enormous number of reusable gates is typically available, and these can be arranged in any order.

Synthesis of genetic systems is somewhat different, as we usually have a very limited number of parts that can only be arranged in specific arrangements due to the constraining protein-interactions and orthogonality requirements presented in Sec. 8.2. Hence each possible permutation of the AIG should be obtained and tried for matching, where each gate with a fan-in of more than 2 causes an exponential increase in the number of different AIGs. For larger systems this can lead to an intractable amount of matching, where the canonical AIG-With-Choice representation by Chatterjee (2007) could prove beneficial, as it is specifically developed to overcome this problem by enabling multiple AIGs to be encoded in a single AIG-With-Choice by detecting and storing functional equivalences for each node. Further discussion of the benefits of using different data-structures can be found in Sec. 8.6.


Figure 8.4: Left) Canonical representation of a 3-input OR-gate. Right) Two examples of possible conversions using 2-input OR-gates (non-canonical).
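To give a concrete feeling for this non-canonicity, the sketch below enumerates, for a fixed input order, every way of realising an N-input associative gate with 2-input gates; the count grows as the Catalan numbers, and allowing input permutations on top of this makes the growth even steeper. The function name is our own:

def bracketings(inputs):
    # All ways of building an N-input associative gate from 2-input gates,
    # keeping the given input order fixed.
    if len(inputs) == 1:
        return [inputs[0]]
    results = []
    for i in range(1, len(inputs)):
        for left in bracketings(inputs[:i]):
            for right in bracketings(inputs[i:]):
                results.append((left, right))
    return results

print(bracketings(["A", "B", "C"]))
# [('A', ('B', 'C')), (('A', 'B'), 'C')] -- the two conversions shown in Fig. 8.4
print(len(bracketings(["A", "B", "C", "D", "E"])))
# 14 possible layouts already for a single 5-input gate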

8.3.3 Partitioning

As the output of f is not limited to a single output protein², the corresponding AIG does not necessarily need to be a tree, which makes the matching problem NP-hard. As a result the AIG is partitioned into a forest of trees, allowing seamless top-down matching that can be solved in polynomial time, Keutzer (1987).

A trivial partition strategy is to break the graph at all multiple-fanout points and introduce intermediate inputs as needed; see the example in Fig. 8.5. This approach is quite natural for GDA, as the intermediate inputs introduced correspond to intermediate proteins that need to be mapped anyway. Unfortunately it can result in a lot of small trees, which in general cannot be covered as well (with respect to an objective function) as a few larger trees can, Alpert and Kahng (1995). More sophisticated strategies, such as single-cone partitioning, duplicate some of the logic in order to obtain as few trees as possible, but it is not directly clear whether this approach is applicable in GDA partitioning, as the library parts, as opposed to EDA, are large indivisible entities. An extensive survey of partitioning strategies has been conducted by Alpert and Kahng (1995) and could serve as inspiration for further research.

² In which case the minimisation procedure gets somewhat more complex than explained in Sec. 8.3.1. Techniques to solve this can be found in e.g. Jiang et al. (2002). For practical purposes one can think of the minimised result as a set of SoP expressions.

Figure 8.5: Example of the naive partitioning strategy. The AIG is partitioned into three AIG-trees at the multiple fan-out node by introducing the intermediate input i.
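A sketch of this naive strategy, using our own ad-hoc representation (a dict mapping each internal node to its fan-in names; primary inputs simply do not appear as keys):

def partition_at_fanout(nodes, outputs):
    # Naive partitioning: every node referenced more than once becomes the root of
    # its own tree and is replaced by an intermediate input wherever it is used.
    fanout = {}
    for children in nodes.values():
        for c in children:
            fanout[c] = fanout.get(c, 0) + 1
    roots = set(outputs) | {n for n, k in fanout.items() if k > 1 and n in nodes}

    def build(n, root):
        if n not in nodes or (n != root and n in roots):
            return n  # primary input, or an intermediate input created by the cut
        return (n, tuple(build(c, root) for c in nodes[n]))

    return {root: build(root, root) for root in roots}

# a small AIG with one multiple-fanout node 'x' feeding both outputs, in the spirit of Fig. 8.5
aig = {"o1": ("x", "A"), "o2": ("x", "B"), "x": ("C", "D")}
print(partition_at_fanout(aig, ["o1", "o2"]))
# three trees rooted at o1, o2 and x; 'x' appears as an intermediate input in o1 and o2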

It should be clear that a general limitation of partitioning is that exact solutions to the individual sub-problems do not guarantee an exact solution to the problem itself.

8.3.4 Covering

The last step is to cover each of the trees using the parts from L. This happens recursively using Algorithm 2, by initiating it on the output-node. As mentioned previously, this covering step is different from that of digital electronics, as orthogonality and protein-compatibility also have to be ensured; this is reflected in the library part selection method. An unmatched branch b refers to the sub-tree rooted at node b. The algorithm relies on a procedure to determine whether a given part can be placed at a branch b. Given the AIG structure, this is basically just a matter of tree pattern-matching.

An example of this algorithm can be seen in Fig. 8.6. The example is somewhat simplified in that it only shows how to find a single structural solution.
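A much simplified sketch of such a recursive covering is given below; it performs plain tree pattern-matching and returns a single structural solution, whereas the actual Algorithm 2 additionally enforces orthogonality and protein-compatibility when selecting parts. The tuple encoding and part names are invented for the example:

def match(pattern, tree):
    # Return the sub-trees bound to the pattern's open inputs ('*'), or None on mismatch.
    if pattern == "*":
        return [tree]
    if isinstance(pattern, str) or isinstance(tree, str):
        return [] if pattern == tree else None
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return None
    bound = []
    for p, t in zip(pattern[1:], tree[1:]):
        sub = match(p, t)
        if sub is None:
            return None
        bound += sub
    return bound

def cover(tree, library):
    # Try every part whose pattern matches at the root, then cover the unmatched branches.
    if isinstance(tree, str):
        return tree  # a primary input needs no part
    for name, pattern in library:
        bound = match(pattern, tree)
        if bound is not None:
            branches = [cover(b, library) for b in bound]
            if all(b is not None for b in branches):
                return (name, branches)
    return None

# two invented parts: one covering NOT(AND(x, y)) in a single step, one covering NOT(x)
library = [("nand_part", ("NOT", ("AND", "*", "*"))), ("not_part", ("NOT", "*"))]
print(cover(("NOT", ("AND", "A", ("NOT", "B"))), library))
# ('nand_part', ['A', ('not_part', ['B'])])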

In EDA it is very common to optimise the AIG by e.g. applying structural hashing, which identifies repeated patterns and compacts the graph accordingly. Due to the strict orthogonality enforcement such repeated patterns will never occur here, but one could imagine these requirements being relaxed in the future, making structural hashing applicable to GDA technology mapping as well.
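For completeness, structural hashing itself is a very small mechanism; a sketch with an invented helper name:

def make_and(node_table, a, b):
    # Structural hashing (hash-consing): an AND of the same two operands is only
    # ever created once; AND is commutative, so the operand order is normalised.
    key = ("AND",) + tuple(sorted((a, b)))
    return node_table.setdefault(key, key)

table = {}
n1 = make_and(table, "A", "B")
n2 = make_and(table, "B", "A")
print(n1 is n2, len(table))  # True 1 -- the repeated pattern is stored only once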