Statistical Analysis and Optimization of Asynchronous Digital Circuits


Tsung-Te Liu and Jan M. Rabaey University of California, Berkeley

(2)

Outline

•  Motivation

•  Variability model of CMOS digital circuit

•  Performance model for different timing schemes

•  Performance comparison

•  Conclusion

(3)

Variability Continues to Increase as Technology and Voltage Scale Down

[Figure: device variability vs. technology node, with histograms (count vs. normalized delay) of the delay spread due to process variations: -80% to +110% at 0.3 V and -40% to +30% at 1 V.]

•  Higher variability with finer design rules and larger wafers

•  Higher variability with lower supply voltages

[Cao, ASU]

(4)

Circuit Performance Characteristics with Different Timing Schemes

[Figure: probability distributions of computation delay for the original circuit, the self-timed circuit, and the conventional synchronous circuit.]

•  Self-timed circuit is a variation-monitoring circuit by itself

•  Becomes advantageous when the variation is large (B>A)

•  Statistical analysis framework is necessary

A: protocol circuit delay; B: 3σ delay variation

(5)

Statistical Analysis Framework

Circuit Variability Model

•  Supply voltage 

•  Logic depth

•  Width and length

•  Body bias

Performance Model

•  Computation overhead 

•  Communication overhead

•  Delay and energy performance

[Figure: energy vs. delay design space, with application regions labeled Processors, Communications, and Sensors.]

Determine the optimal timing strategy in the presence of variability

(7)

Delay Model of CMOS Digital Circuit

•  One unified current model across different operating regions

•  Model error <2% from 0.3V to 1V

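[Figure: 4-stage FO4 INV chain. Delay [FO4 at VDD = 1 V] vs. supply voltage [V] on a log scale; curves: simulation data and model. A second panel plots model error [%] vs. supply voltage [V].]

Above threshold:

\[ I \propto \frac{(V_{DD} - V_{th})^2}{1 + \dfrac{V_{DD} - V_{th}}{E_{sat} L}} \]

Subthreshold:

\[ I \propto \exp\left(\frac{V_{DD} - V_{th}}{S}\right) \]

Unified model:

\[ I \propto \frac{\left[\ln\left(1 + \exp\left(\dfrac{V_{DD} - V_{th}}{2S}\right)\right)\right]^2}{1 + \ln\left(1 + \exp\left(\dfrac{V_{DD} - V_{th}}{E_{sat} L}\right)\right)} \]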
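
As a rough illustration of how the unified current expression on this slide could be turned into a delay-versus-voltage curve, here is a minimal Python sketch. The parameter values (`vth`, `s`, `esat_l`) are placeholders rather than fitted values from the talk, and the relation T_d ∝ VDD / I is the usual first-order CMOS delay assumption, not something stated on the slide.

```python
import math


def unified_current(vdd, vth=0.35, s=0.04, esat_l=0.6):
    """Unified current model, up to a constant factor.

    vth, s and esat_l are illustrative placeholder values (assumptions).
    """
    x = vdd - vth
    # Smooth interpolation: ~exp(x / s) far below threshold,
    # ~(x / (2 s))^2 far above threshold.
    numerator = math.log1p(math.exp(x / (2.0 * s))) ** 2
    # Velocity-saturation style denominator from the slide.
    denominator = 1.0 + math.log1p(math.exp(x / esat_l))
    return numerator / denominator


def normalized_delay(vdd, vref=1.0, **params):
    """Delay relative to the delay at vref, assuming T_d ~ VDD / I."""
    delay = vdd / unified_current(vdd, **params)
    delay_ref = vref / unified_current(vref, **params)
    return delay / delay_ref


if __name__ == "__main__":
    for v in (0.3, 0.5, 0.7, 1.0):
        print(f"VDD = {v:.1f} V -> normalized delay = {normalized_delay(v):.2f}")
```

With these placeholder parameters the normalized delay rises steeply as the supply approaches and drops below the threshold voltage, which is the qualitative shape the slide's delay plot describes.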

(8)

Delay Variability Model

Within-die variation (WID): “local mismatch”

Die-to-die variation (DTD): “global variation”

\[ \frac{\sigma_{T_d}}{\mu_{T_d}} = \sqrt{\left(S^{T_d}_{V_{th}}\right)^2 \left(\frac{\sigma_{V_{th}}}{\mu_{V_{th}}}\right)^2 + \left(S^{T_d}_{K}\right)^2 \left(\frac{\sigma_{K}}{\mu_{K}}\right)^2} \]

The first term is the threshold-voltage contribution and the second is the geometry contribution, with sensitivities

\[ S^{T_d}_{V_{th}} = \frac{\partial T_d / T_d}{\partial V_{th} / V_{th}}, \qquad S^{T_d}_{K} = \frac{\partial T_d / T_d}{\partial K / K} \]

[Figure: σ/μ [%] vs. supply voltage [V], one panel for WID and one for DTD; curves in each panel: simulation data, model, model (threshold voltage), model (geometry).]

(9)

Delay Variability Model

[Figure: σ/μ [%] vs. supply voltage [V]; curves: simulation data, model (total), model (DTD), model (WID). A second panel plots model error [%] vs. supply voltage [V].]

\[ \frac{\sigma_{T_d,\text{total}}}{\mu_{T_d,\text{total}}} = \sqrt{\left(\frac{\sigma_{T_d,\text{DTD}}}{\mu_{T_d,\text{DTD}}}\right)^2 + \left(\frac{\sigma_{T_d,\text{WID}}}{\mu_{T_d,\text{WID}}}\right)^2} \]

•  Model error <8% from 0.3V to 1V

•  Local mismatch dominates at low supply voltages
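
The way the variability contributions combine can be sketched in a few lines of Python. This is only an illustration: it reads the slide equations as root-sum-square combinations (which presumes the contributions are independent), and every numeric input is a made-up placeholder, not data from the talk.

```python
import math


def relative_sigma(sens_vth, rel_sigma_vth, sens_k, rel_sigma_k):
    """sigma/mu of delay from the threshold-voltage and geometry terms.

    Each term is a sensitivity times the relative spread of the parameter;
    the two terms are combined as a root-sum-square.
    """
    return math.hypot(sens_vth * rel_sigma_vth, sens_k * rel_sigma_k)


def total_relative_sigma(rel_dtd, rel_wid):
    """Total sigma/mu from die-to-die (DTD) and within-die (WID) parts."""
    return math.hypot(rel_dtd, rel_wid)


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the functions.
    wid = relative_sigma(sens_vth=2.0, rel_sigma_vth=0.05, sens_k=1.0, rel_sigma_k=0.02)
    dtd = relative_sigma(sens_vth=2.0, rel_sigma_vth=0.03, sens_k=1.0, rel_sigma_k=0.03)
    print(f"WID = {wid:.3f}, DTD = {dtd:.3f}, total = {total_relative_sigma(dtd, wid):.3f}")
```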

(10)

Delay Variability Model with Different Logic Depths

[Figure: σ/μ [%] vs. supply voltage [V]; curves: simulation data and model for n = 4, n = 8, and n = 24. A second panel plots model error [%] vs. supply voltage [V] for n = 4, n = 8, and n = 24.]

\[ \frac{\sigma_{T_d,\text{total}\_n}}{\mu_{T_d,\text{total}\_n}} = \sqrt{\left(\frac{\sigma_{T_d,\text{DTD}\_4}}{\mu_{T_d,\text{DTD}\_4}}\right)^2 + \left(\frac{4}{n}\right)\left(\frac{\sigma_{T_d,\text{WID}\_4}}{\mu_{T_d,\text{WID}\_4}}\right)^2} \]

•  Use 4-stage inverter chain model as baseline model

•  Model error <13% for n=8 and <15% for n=24
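
A small Python sketch of the logic-depth scaling follows. It takes the 4-stage chain as the baseline, as the slide does, and scales only the within-die term by 4/n. The usual reading is that global (DTD) variation shifts every stage together and so does not average out, while local (WID) mismatch is independent per stage and averages as 1/√n. That interpretation and all numeric inputs are assumptions for illustration.

```python
import math


def total_relative_sigma_n(rel_dtd_4, rel_wid_4, n):
    """sigma/mu of an n-stage chain from 4-stage baseline values.

    rel_dtd_4 and rel_wid_4 are the DTD and WID sigma/mu of the 4-stage chain.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of stages")
    return math.sqrt(rel_dtd_4 ** 2 + (4.0 / n) * rel_wid_4 ** 2)


if __name__ == "__main__":
    # Hypothetical baseline values, not taken from the slides.
    for n in (4, 8, 24):
        print(f"n = {n:2d}: sigma/mu = {total_relative_sigma_n(0.10, 0.20, n):.3f}")
```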

(12)

Delay Overhead Evaluation

[Figure: probability distributions of computation delay for the original circuit, dual-rail timing, and synchronous timing.]

•  Assumption: process variation follows a Gaussian distribution

•  Dual-rail approach: has only protocol overhead, no delay overhead

•  Synchronous approach: has only delay overhead

A: protocol circuit delay; B: 3σ delay variation

For 99.7% yield:

\[ D_{\text{sync}} = \frac{3\sigma_{\text{logic,total}}}{\mu_{\text{logic,total}}} \]
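
To make the synchronous margin concrete, the following Python sketch computes the delay overhead for a given σ/μ and checks the Gaussian coverage of a k-sigma margin. The 99.7% figure matches the familiar ±3σ coverage of a normal distribution; treating the slide's yield number that way is an assumption, and the input value is a placeholder.

```python
import math


def sync_delay_overhead(rel_sigma_total, k=3.0):
    """Synchronous delay overhead: a k-sigma margin relative to the mean delay."""
    return k * rel_sigma_total


def two_sided_coverage(k):
    """Probability that a Gaussian sample lies within k sigma of its mean."""
    return math.erf(k / math.sqrt(2.0))


if __name__ == "__main__":
    rel_sigma = 0.10  # hypothetical total sigma/mu of the logic delay
    print(f"D_sync = {sync_delay_overhead(rel_sigma):.2f}")
    print(f"coverage at 3 sigma = {two_sided_coverage(3.0):.4f}")
```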

(13)

Bundled-Data Self-Timed Approach

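[Figure: main data path with a replica delay line; probability distributions of computation delay for the main data path and the replica delay line.]

Assume main data path and replica delay line exhibit similar statistics:

\[ f_{\text{logic}}(t) = N\left(\mu_{\text{logic}}, \sigma_{\text{logic}}^2\right), \qquad f_{\text{delay-line}}(t) = N\left(\mu_{\text{delay-line}}, \sigma_{\text{delay-line}}^2\right) \]

Goal:

\[ P\left(t_{\text{logic}} \le t_{\text{delay-line}}\right) \approx 1 \]

\[ D_{\text{bundled-data}} = \frac{\mu_{\text{delay-line}} - \mu_{\text{logic}}}{\mu_{\text{logic}}} \]

For 99.7% yield:

\[ D_{\text{bundled-data}} = D_{\text{variation}}^2 \left(0.5 + \sqrt{0.25 + \frac{2}{D_{\text{variation}}^2}}\right) \]

where

\[ D_{\text{variation}} = \frac{3\sigma_{\text{logic,WID}}}{\mu_{\text{logic,WID}}} \]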
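
The closed form for the bundled-data overhead is given on this slide without a derivation. One way to arrive at it, offered as a sketch rather than as the authors' derivation, rests on three assumptions that the source does not state: the replica delay line is built from the same kind of stages as the logic, so its delay variance scales with its mean, σ²_delay-line = σ²_logic (1 + D); the two delays are independent Gaussians; and the mean margin must cover three standard deviations of their difference. Writing D for D_bundled-data:

```latex
\begin{aligned}
\mu_{\text{delay-line}} - \mu_{\text{logic}}
  &= 3\sqrt{\sigma_{\text{logic}}^2 + \sigma_{\text{delay-line}}^2}
   = 3\,\sigma_{\text{logic}}\sqrt{2 + D} \\
D &= D_{\text{variation}}\sqrt{2 + D},
  \qquad D_{\text{variation}} = \frac{3\sigma_{\text{logic}}}{\mu_{\text{logic}}} \\
0 &= D^2 - D_{\text{variation}}^2\, D - 2 D_{\text{variation}}^2 \\
D &= \frac{D_{\text{variation}}^2}{2} + \sqrt{\frac{D_{\text{variation}}^4}{4} + 2 D_{\text{variation}}^2}
   = D_{\text{variation}}^2\left(0.5 + \sqrt{0.25 + \frac{2}{D_{\text{variation}}^2}}\right)
\end{aligned}
```

The positive root of the quadratic reproduces the slide's expression term by term, which suggests these assumptions are at least consistent with it.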

(14)

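Bundled-Data Delay Overhead

[Figure: delay overhead [%] vs. process variability [%]; the curve is marked O(n) at low variability and O(n²) at high variability.]

\[ D_{\text{bundled-data}} \approx \begin{cases} \sqrt{2}\, D_{\text{variation}}, & \text{when } D_{\text{variation}} \to 0 \\ D_{\text{variation}}^2, & \text{when } D_{\text{variation}} \to \infty \end{cases} \]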

•  Delay overhead becomes much larger as process variability increases!
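
The bundled-data overhead formula and its two limits can be checked numerically with a short Python sketch. The function implements the closed form from the slides; the sample values of D_variation are arbitrary.

```python
import math


def bundled_data_overhead(d_variation):
    """Bundled-data delay overhead as a function of D_variation (3 sigma / mu)."""
    if d_variation <= 0:
        raise ValueError("d_variation must be positive")
    d2 = d_variation ** 2
    return d2 * (0.5 + math.sqrt(0.25 + 2.0 / d2))


if __name__ == "__main__":
    for d in (0.01, 0.5, 1.0, 2.0, 100.0):
        exact = bundled_data_overhead(d)
        small = math.sqrt(2.0) * d  # limit for small variability
        large = d ** 2              # limit for large variability
        print(f"D_var = {d:7.2f}: exact = {exact:10.4f}, "
              f"sqrt(2)*D = {small:10.4f}, D^2 = {large:10.4f}")
```

For small D_variation the exact value tracks the linear limit, and for large D_variation it tracks the quadratic limit, which is the growth the slide's bullet warns about.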

(15)

Performance Model under Variations

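Original delay and energy model:

\[ \begin{aligned}
T_{\text{comp}} &= T_{\text{delay}} \\
E_{\text{dynamic}} &= \alpha C_{\text{switch}} V^2 \\
E_{\text{leakage}} &= V I_{\text{leakage}} T_{\text{delay}} \\
E_{\text{total}} &= \alpha C_{\text{switch}} V^2 + V I_{\text{leakage}} T_{\text{delay}}
\end{aligned} \]

Statistical delay and energy model:

\[ \begin{aligned}
T_{\text{comp}} &= T_{\text{delay}} (1 + P + D) \\
E_{\text{dynamic}} &= \alpha C_{\text{switch}} (1 + P) V^2 \\
E_{\text{leakage}} &= V I_{\text{leakage}} (1 + P)\, T_{\text{delay}} (1 + P + D) \\
E_{\text{total}} &= \alpha C_{\text{switch}} (1 + P) V^2 + V I_{\text{leakage}} (1 + P)\, T_{\text{delay}} (1 + P + D)
\end{aligned} \]

Timing scheme            Synchronous    Bundled-Data      Dual-Rail
Delay overhead (D)       D_sync         D_bundled-data    0
Protocol overhead (P)    0              P_bundled-data    P_dual-rail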

•  Evaluate computation delay and energy under variations

•  Overhead changes with supply voltage and logic depth
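
The statistical delay and energy model on this slide can be sketched in Python as follows. The `Overheads` container and the function name are hypothetical, the numeric inputs are placeholders in arbitrary but mutually consistent units, and the three schemes differ only in which of P and D is nonzero, as in the slide's table.

```python
from dataclasses import dataclass


@dataclass
class Overheads:
    protocol: float  # P, protocol overhead as a fraction
    delay: float     # D, delay overhead as a fraction


def statistical_performance(t_delay, v, alpha, c_switch, i_leakage, ov):
    """Return (T_comp, E_total) under the statistical model.

    With ov = Overheads(0, 0) this reduces to the original model.
    """
    t_comp = t_delay * (1.0 + ov.protocol + ov.delay)
    e_dynamic = alpha * c_switch * (1.0 + ov.protocol) * v ** 2
    e_leakage = v * i_leakage * (1.0 + ov.protocol) * t_comp
    return t_comp, e_dynamic + e_leakage


if __name__ == "__main__":
    # Hypothetical overhead values for the three timing schemes.
    schemes = {
        "synchronous": Overheads(protocol=0.0, delay=0.30),
        "bundled-data": Overheads(protocol=0.05, delay=0.20),
        "dual-rail": Overheads(protocol=0.10, delay=0.0),
    }
    for name, ov in schemes.items():
        t, e = statistical_performance(1.0, 1.0, 0.1, 1.0, 0.01, ov)
        print(f"{name:12s}: T_comp = {t:.3f}, E_total = {e:.4f}")
```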

(17)

Delay Overhead Comparison

[Figure: delay overhead [%] vs. supply voltage [V], one panel for the 4-stage FO4 INV chain and one for the 24-stage FO4 INV chain; curves: Synchronous Timing, Bundled-Data Self-Timing.]

•  Global variation affects only synchronous approach

•  Local mismatch dominates at low supply voltages

•  Local mismatch has less impact on longer critical path

(18)

Speed Performance Comparison

•  Assumption: P_bundled-data = 1 T_FO4; P_dual-rail = 2 T_FO4

[Figure: normalized delay vs. supply voltage [V], one panel for the 4-stage FO4 INV chain and one for the 24-stage FO4 INV chain; curves: Dual-Rail Self-Timing, Bundled-Data Self-Timing.]

•  Synchronous scheme is better for small critical path at high supply voltages

•  Dual-rail scheme is better for large critical path at low supply voltages

(19)

Energy Performance Comparison

[Figure: 24-stage FO4 INV chain. Energy [fJ] vs. supply voltage [V] in two panels, one for α = 0.1 and one for α = 0.01; curves in each panel: Synchronous Timing, Dual-Rail Self-Timing, Bundled-Data Self-Timing. A panel label "Energy-Delay Plot" also appears.]

•  Synchronous scheme is better for high activity at high supply voltages

•  Dual-rail scheme is better for low activity at low supply voltages

•  Leakage dominates for low activity at low supply voltages

(20)

Conclusion

•  A statistical analysis framework is proposed to evaluate the performance of CMOS digital circuits in the presence of process variations.

•  Designers can efficiently determine the optimal timing strategy, pipeline depth, and supply voltage based on the proposed variability and statistical performance models.

•  Asynchronous design exhibits better energy and delay characteristics for circuits with low activity and larger critical-path delay under process variations.

(21)

Acknowledgement

•  Berkeley Wireless Research Center

•  NSF Infrastructure Grant

•  STMicroelectronics

•  Multiscale System Center

Thank you!
