• Ingen resultater fundet

Efficient  Timing  Channel  Protec3on  for  On-­‐Chip  Networks

N/A
N/A
Info
Hent
Protected

Academic year: 2022

Del "Efficient  Timing  Channel  Protec3on  for  On-­‐Chip  Networks"

Copied!
21
0
0

Indlæser.... (se fuldtekst nu)

Hele teksten

(1)

Efficient  Timing  Channel  Protec3on   for  On-­‐Chip  Networks  

Yao  Wang  and  G.  Edward  Suh  

Cornell  University  

(2)

  Future  large-­‐scale  mul3-­‐cores  will  be  shared  among   mul3ple  applica3ons  /  virtual  machines  

On-­‐Chip  Networks  are  Shared  Resources  

Virtual  Machine  A  

Virtual  Machine  B  

(3)

Problem:  Timing  Channels  

  Shared  NoC  causes  interference  

  Network  interference  introduces  3ming  channels  

•  Side  channel  

•  Covert  channel  

  High  assurance  systems  requires  security  guarantee  

•  Example:  Corporate  virtual  machines  on  the  cloud  

(4)

RSA  Example  

  RSA  :  a  public  key  cryptographic  algorithm  

•  Prone  to  3ming  channel  aVacks  

Core  0   MC  0  

Core  1   MC  1  

Core  2   MC  2  

Crossbar  

RSA   AVacker  

key:  0110…  

(5)

RSA  Example  

  RSA  :  a  public  key  cryptographic  algorithm  

•  Prone  to  3ming  channel  aVacks  

0 0.2 0.4 0.6 0.8 1

220 230 240 250 260 270 280

Fraction of bit ’1’ in RSA key

Time

(6)

Outline  

  Objec3ve:  Eliminate  3ming  channels  through  the   shared  on-­‐chip  networks  

•  Completely  eliminate  informa3on  leakage  

•  Low  performance  overhead  

  Rest  of  the  talk  

•  Poten3al  approaches  

•  Our  solu3on    

•  Evalua3on  

•  Related  work  

•  Conclusion  

(7)

Use  Quality-­‐of-­‐Service?  

  QoS  techniques  provide  performance  isola3on  to  different   network  flows  

  QoS  techniques  are  not  enough  for  security  

•  A  flow  can  use  bandwidth  beyond  its  alloca3on  

•  Bandwidth  u3liza3on  reveals  the  flow  demand  

Flow  A  

Demand   Flow  B  

Demand   Flow  A   BW  u3liza3on   100%   100%  

100%   0%  

1   2  

A   B  

Bandwidth  alloca3on        A:  50%  

     B:  50%  

50%  

100%  

(8)

Sta3c  Par33oning  

  To  eliminate  3ming  channels,  resource  alloca3on  cannot   depend  on  run-­‐3me  demands  

   Sta3c  par33oning  

•  Spa3al  Network  Par33oning  (SNP)  

•  Temporal  Network  Par33oning  (TNP)  

  Completely  eliminate  the  3ming  channels  

•  High  performance  overhead  

SNP   TNP  

Cycle  0   Cycle  1   …  

VM  A   VM  A  

VM  B   VM  B  

(9)

One-­‐Way  Informa3on  Leak  Protec3on  

  Usually  only  one-­‐way  informa3on  protec3on  is   needed  

•  Mul3-­‐level  security  (MLS)  model  

  One-­‐way  protec3on  is  the  key  for  efficient  3ming   channel  protec3on  

Personal  APP   (Music  Player)  

Business  App   (Bank  Management)  

Regular  VM   Corporate  VM  

Low-­‐Security  Domain   High-­‐Security  Domain  

Informa3on     flow  

PC   Cloud  Compu3ng   In  general  

(10)

Timing  Channel  through  NoC  

HS  demand  

0   1  

1  

LS  th ro ug hp ut  

HS:  High-­‐Security  Domain   LS:  Low-­‐Security  Domain  

HS  -­‐-­‐-­‐>  LS  

(11)

  Reversed  Priority  

•  Assign  high  priority  to  low-­‐security  domain  

•  The  behavior  (throughput,  latency)  of  low-­‐security  domain  is   not  affected  by  high-­‐security  domain  

  Sta3c  Limits  

•  Low-­‐security  domain  could  ini3alize  Denial-­‐of-­‐Service  (DoS)   aVack  

•  Sta3c  limit  controls  the  amount  of  traffic  that  low-­‐security   domain  can  send  during  a  certain  interval    

Reversed  Priority  with  Sta3c  Limits  (RPSL)  

(12)

Implementa3on:  Avoid  Interference  

  Priority-­‐based  separable  allocator  

•  Input  arbiter  &  Output  arbiter  

  Sta3c  virtual  channel  alloca3on  

•  Avoid  head-­‐of-­‐line  blocking  

Router  

Virtual  Channels  

Input  link  

0  

1   2   3  

Low-­‐security  Domain  

High-­‐security  Domain  

(13)

  Sta3c  limit  control  mechanism  

•  Counter  &  Control  logic  

  Apply  to  both  input  and  output  arbiter  

Implementa3on:  Avoid  DoS  

Priority-­‐

based   Arbiter  

Counter   Requests  

Winning   Request  

Sta3c  limit   Control  

Low-­‐security  Domain   High-­‐security  Domain  

Input  Arbiter  

(14)

Benefits  of  One-­‐Way  Protec3on  

Time  

BW   U3liza3on   Total  BW  

HS  

LS   Round-­‐robin  Allocator  

LS   HS  

Time  

BW   U3liza3on   Total  BW  

HS  

LS   Temporal  Network  Par33oning  

LS   HS  

BW   U3liza3on   Total  BW  

HS  

LS  

LS   HS  

RPSL  

LS  

(15)

Experimental  Setup  

  Goals  of  experiments  

•  Timing  channel  protec3on  

•  DoS  protec3on  

•  Performance  overhead    

  Darsim:  cycle-­‐level  NoC  simulator  

  Comparison  of  three  schemes  

•  Round-­‐robin  allocator  (ISLIP)  

•  Temporal  Network  Par33oning  (TNP)  

•  Reversed  Priority  with  Sta3c  Limits  (RPSL)  

(16)

Timing  Channel:  No  Protec3on    

  Simple  network  

  Round-­‐robin  allocator  

1   2   3   4   Low-­‐security  Domain  

High-­‐security  Domain  

HS   LS  

0 0.2 0.4 0.6 0.8 1

0 0.2 0.4 0.6 0.8 1

HS offered BW (flit/cycle)

LS observed BW (flit/cycle)

1.0 flit/cycle

0 0.2 0.4 0.6 0.8 1

0 0.2 0.4 0.6 0.8 1

LS offered BW (flit/cycle)

HS observed BW (flit/cycle)

1.0 flit/cycle

(17)

Timing  Channel:  Two-­‐way  Protec3on  

  Simple  network  

  Temporal  Network  Par33oning  

1   2   3   4   Low-­‐security  Domain  

High-­‐security  Domain  

HS   LS  

0 0.2 0.4 0.6 0.8 1

0 0.2 0.4 0.6 0.8 1

HS offered BW (flit/cycle)

LS observed BW (flit/cycle)

1.0 flit/cycle 1.0 flit/cycle 1.0 flit/cycle

0 0.2 0.4 0.6 0.8 1

0 0.2 0.4 0.6 0.8 1

LS offered BW (flit/cycle)

HS observed BW (flit/cycle)

1.0 flit/cycle

0.4/1.0  

(18)

Timing  Channel:  One-­‐way  Protec3on  

  Simple  network  

  Reversed  Priority  with  Sta3c  Limits  (Sta3c  limit  =  0.8)  

1   2   3   4   Low-­‐security  Domain  

High-­‐security  Domain  

HS   LS  

0 0.2 0.4 0.6 0.8 1

0 0.2 0.4 0.6 0.8 1

HS offered BW (flit/cycle)

LS observed BW (flit/cycle)

1.0 flit/cycle

0 0.2 0.4 0.6 0.8 1

0 0.2 0.4 0.6 0.8 1

LS offered BW (flit/cycle)

HS observed BW (flit/cycle)

1.0 flit/cycle

1.0/1.0  

(19)

Performance  

  Applica3ons  show  bursty  traffic  

  RPSL  is  efficient  for  bursty  traffic  

0 20 40 60 80 100

0 500 1000 1500 2000 2500 3000

Timeline/ (million cycles)

Injected flits every million cycles

FFT OCEAN

HIGH  

LOW  

(20)

Related  Work  

  Side-­‐channel  protec3on  

•  Shared  resources  are  prone  to  side-­‐channel  aVacks,  e.g.  shared   caches,  branch  predic3on  

•  Cannot  be  applied  to  NoC  

  QoS  schemes  

•  Allows  resource  usage  beyond  alloca3on  

•  Insufficient  to  prevent  3ming  channel  aVacks  

  Composability  

•  Remove  interference  between  applica3ons  for  fast  integra3on  

•  Require  bi-­‐direc3onal  non-­‐interference,  incur  high  performance  

overhead  

(21)

Conclusion  

  Shared  on-­‐chip  networks  introduce  3ming  channels  

•  Prevent  effec3ve  sharing  of  large-­‐scale  NoC  in  high  assurance   systems  

  One-­‐way  3ming  channel  protec3on  is  sufficient  in   many  situa3ons  

  RPSL  provides  efficient  one-­‐way  3ming  channel   protec3on  

•  Incurs  low  performance  overhead  

Referencer

RELATEREDE DOKUMENTER

computation communication Algorithm on Chip (ASIC) hardwired hardwired System on Chip (SoC) soft hardwired. Network on Chip (NoC)

~8 bytes per core per 50000-cycle time window Constant power drain..

ƒ … use stochastic sensitivity analysis to derive upper and lower robustness bounds. ƒ Minimum Guaranteed

When performing delay matching of an asyn- chronous circuit a delay element is inserted in the request path to delay the request signal by an equal amount of time compared to the

This thesis investigates design of on-chip net- work links using asynchronous circuits, and presents three link designs of which two are providing virtual channels.. The link

• An low-cost monitoring platform called ARBNET for 3D stacked mesh architectures was proposed which can be efficiently used for various system management purposes. • A fully

In the pursuit of higher resolution simulation models that use all seismic, geological, and dynamic reservoir data - and to make use of modern parallel computing archi- tectures -

Direct contact with people diagnosed with the disease in question (Appendix, Figure 9) The most common means of obtaining knowledge is through databases/webpages. The sec- ond