
(1)

Timed Games UPPAAL-TIGA

Alexandre David

1.2.05

(2)

Overview

• Timed Games.
  - Algorithm (CONCUR’05).
  - Strategies.
  - Code generation.
  - Architecture of UPPAAL-TIGA.
  - Interactive game.
• Timed Games with Partial Observability.
  - Algorithm (ATVA’07).

(3)

Why Timed Games?

• Real-time systems:
  - Systems where correctness depends on the logical order of events and on their timings!
  - ... in addition to correct computation.
• Real-time model-checking:
  - Model the environment + the tasks.

[Figure: plant with continuous sensors and actuators.]

(4)

Why Timed Games?

• Controller synthesis:
  - Model the environment + what a controller can do.
  - Generate the controller so that controller ⊨ φ! Generate the right code automatically.
  - 2-player timed game: environment moves vs. controller moves.
    ⇒ Timed Game Automata.

[Figure: plant with continuous dynamics, controller program with discrete sensors and actuators.]

(5)

Controller Synthesis/TGA

• Given
  - system moves S,
  - controller moves C,
  - and a property φ,
• find
  - a strategy Sc s.t. Sc || S ⊨ φ,
  - or prove there is no such strategy.

(6)

Timed Game Automata

• Introduced by Maler, Pnueli, Sifakis [Maler & al. ’95].
• The controller continuously observes the system (all delays & moves are observable).
• The controller can
  - wait (delay action),
  - take a controllable move, or
  - prevent delay by taking a controllable move.

(7)

Timed Game Automata

• Timed automata with controllable and uncontrollable transitions.
• Reachability & safety games:
  - control: A<> TGA.goal
  - control: A[] not TGA.L4
• Memoryless strategy: a mapping from states to actions (see the example on the next slide).

(8)

TGA – Let’s Play!

• control: A<> TGA.goal

[Figure: the TGA annotated with a winning strategy, e.g. x<1 : λ (wait), x==1 : c; x<2 : λ, x≥2 : c; x≤1 : c.]

Note: this is one strategy; there are other solutions.

(9)

Results

• [Maler & al. ’95, De Alfaro & al. ’01]
  There is a symbolic iterative algorithm to compute the set W* of winning states for timed games.
• [Henzinger & Kopke ’99]
  Safety and reachability control are EXPTIME-complete.
• [Cassez & al. ’05]
  Efficient on-the-fly algorithm for safety & reachability games.
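For orientation, one standard way to write that iterative computation for reachability (in the style of [Maler & al. ’95] and [Cassez & al. ’05]; the notation below is paraphrased, not taken from the slides) is:

  W* = μX. ( Goal ∪ Pred_t( X ∪ cPred(X), uPred(X̄) ) )

where cPred/uPred are the discrete predecessors along controllable/uncontrollable edges, X̄ is the complement of X, and Pred_t(G, B) collects the states that can delay into G while avoiding B along the way.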

(10)

Algorithm

• On-the-fly forward algorithm with a backward fixpoint computation of the winning/losing sets.
  - Uses all the features of UPPAAL in the forward direction.
  - Possible to mix forward & backward exploration.
• Solved by Liu & Smolka 1998 for untimed games.
• Extended symbolic version at CONCUR’05.
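To make the shape of the computation concrete, here is a minimal, untimed sketch of the winning-set part (a plain backward fixpoint over a finite, already-explored game graph). UPPAAL-TIGA itself works on-the-fly and symbolically on zones; the names succ_c, succ_u and goal are illustrative placeholders, not TIGA's API.

```python
# Minimal, untimed sketch of the winning-set computation for a reachability
# game: a state is winning if it is a goal, or if some controllable move
# reaches the winning set while no uncontrollable move can escape it.
# This is a discrete simplification of the symbolic algorithm of CONCUR'05.

def winning_states(states, succ_c, succ_u, goal):
    """succ_c(s)/succ_u(s): controllable/uncontrollable successors of s."""
    win = set(goal)
    changed = True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            if any(t in win for t in succ_c(s)) and \
               all(t in win for t in succ_u(s)):
                win.add(s)
                changed = True
    return win
```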

(11)
(12)

Backward Propagation

[Figure: example over locations L0–L3 with a clock axis 0–2, showing how the winning states are obtained.]

Back-propagate when the goal is reached.

Note: this is not a strategy, it is only the set of winning states.

(13)

Backward Propagation

[Figure: a good zone G and bad zones B.]

Predecessors of G avoiding B? ⇒ the timed-predecessor operator pred_t.

(14)

pred_t – From Federation to Zone

With G = ⋃_i G_i and B = ⋃_j B_j federations of zones, pred_t reduces to computations on zones:

  pred_t( ⋃_i G_i , ⋃_j B_j ) = ⋃_i ⋂_j pred_t( G_i , B_j )

where pred_t on a pair of zones is computed with down-closures (time predecessors), intersections and set differences on DBMs.
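As a sketch of how this decomposition could be coded, the function below lifts a zone-level pred_t to federations. The zone operations (predt_zone, inter_zone, down_zone) are passed in and are illustrative placeholders for the corresponding DBM operations, not UPPAAL's actual API.

```python
# pred_t(∪i Gi, ∪j Bj) = ∪i ∩j pred_t(Gi, Bj), with federations represented
# as lists of zones. predt_zone(g, b) must itself return a federation.

def predt_federation(G, B, predt_zone, inter_zone, down_zone):
    result = []
    for g in G:
        # With no bad zones, pred_t(g, ∅) is just the time predecessors of g.
        acc = [down_zone(g)]
        for b in B:
            part = predt_zone(g, b)
            # Intersect the federations acc and part zone by zone,
            # dropping empty intersections (returned here as None).
            acc = [z for x in acc for y in part
                     for z in [inter_zone(x, y)] if z is not None]
        result.extend(acc)
    return result
```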

(15)

Query Language (1)

• Reachability properties:
  - control: A[ p U q ]
  - control: A<> q ⇔ control: A[ true U q ]
• Safety properties:
  - control: A[ p W q ]
  - control: A[] p ⇔ control: A[ p W false ]
• Tuning:
  - change the search ordering,
  - add back-propagation of winning/losing states
    (e.g. back-propagate winning states with BFS+DFS, or losing states with BFS-BFS).

(16)

Query Language (2)

• Time-optimality:
  - control_t*(u,g): A[ p U q ]
  - u is an upper bound used to prune the search; it acts like an invariant, but on the path (an expression on the current state).
  - g is the time to the goal from the current state (in fact a lower bound), also used to prune the search.
  - States with t+g > u are pruned (sketched after this slide).
• Cooperative strategies:
  - E<> control: φ
  - The property is satisfied iff φ is reachable, but the obtained strategy is maximal.
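A minimal sketch of the time-optimal pruning rule mentioned above; elapsed_t, remaining_g and upper_u are illustrative names for the elapsed time, the lower bound g on the remaining time to the goal, and the upper bound u.

```python
# A state is dropped when its elapsed time plus the remaining lower bound
# already exceeds the upper bound u: it can no longer beat the bound.

def should_prune(elapsed_t, remaining_g, upper_u):
    return elapsed_t + remaining_g > upper_u

assert should_prune(3.0, 2.5, 5.0)        # 5.5 > 5.0 -> pruned
assert not should_prune(3.0, 1.5, 5.0)    # 4.5 <= 5.0 -> kept
```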

(17)

Cooperative Strategies

• The state-space is partitioned between states from which there is a winning strategy and those from which there is none.
• The cooperative strategy suggests moves from the opponent that would “help” the controller.

[Figure: maximal partition with winning strategies.]

(18)

Strategies?

• The algorithm computes sets of winning and losing states, not strategies.
• Strategies are computed on top:
  - Take actions that lead to winning states (reachability).
  - Take actions to avoid losing states (safety).
  - Partition states with actions to guarantee progress.
  - This is done on-the-fly, and the obtained strategy depends on the exploration order.
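As an illustration of “take actions that lead to winning states”, below is a sketch of extracting a memoryless reachability strategy from a known winning set (e.g. the win set from the earlier sketch). It ranks winning states by their controllable distance to the goal so the chosen action always makes progress; delays and the symbolic zone representation are ignored, and all names are illustrative.

```python
# Pick, in every winning non-goal state, a controllable move into the winning
# set that strictly decreases the distance to the goal (no loops possible).

def extract_strategy(win, succ_c, goal):
    rank = {s: 0 for s in goal}      # rank = controllable steps to the goal
    frontier = set(goal)
    strategy = {}
    while frontier:
        new_frontier = set()
        for s in win:
            if s in rank:
                continue
            for t in succ_c(s):
                if t in frontier:            # t is one step closer to the goal
                    rank[s] = rank[t] + 1
                    strategy[s] = t          # progress is guaranteed
                    new_frontier.add(s)
                    break
        frontier = new_frontier
    return strategy
```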

(19)

Winning States → Strategy

[Figure: the winning states from the earlier example (locations L0–L3), annotated with one extracted strategy: wait / L0→L1, L1→goal, L2→L3, L3→L1; other choices are also possible.]

(20)

Strategies as Partitions

• Built on-the-fly.
• Guarantee progress in the strategy:
  - no loop.
• Deterministic strategy.
• Different problem than computing the set of winning states.
• Different search orderings can give different strategies ... with possibly the same set of winning states.

(21)

Code Generation

• Mapping state → action:
  - # entries = # states.
• Decision graph state → action:
  - # tests = # variables.
  - More compact.
  - Based on a hybrid BDD/CDD with multi-terminals.

(22)

Decision Graph

• BDD: boolean variables.
• CDD: constraints on clocks.
• Multi-terminals: actions.
• It works because we have a partition.
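As a toy illustration of the idea (not the actual BDD/CDD data structure used by UPPAAL-TIGA), a decision graph can map a state to an action by testing one variable, or one clock against a bound, per node, with the actions sitting in the multi-terminals:

```python
# Internal nodes test one variable (boolean or clock vs. bound); terminals
# carry the action. Looking up an action costs one test per variable.

class Terminal:
    def __init__(self, action):
        self.action = action                  # multi-terminal: the action

class Node:
    def __init__(self, var, bound, low, high):
        self.var, self.bound = var, bound     # test: state[var] > bound ?
        self.low, self.high = low, high

def lookup(node, state):
    while isinstance(node, Node):
        node = node.high if state[node.var] > node.bound else node.low
    return node.action

# Example: if x > 1 take the controllable action c, otherwise wait.
graph = Node("x", 1, Terminal("wait"), Terminal("c"))
assert lookup(graph, {"x": 1.4}) == "c"
```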

(23)

Graph Reduction

• Testing consecutive bits:
  - replace by a single test with a mask,
  - can span several variables.
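A small sketch of this reduction at bit level (the values are illustrative): instead of testing k consecutive bits one at a time, shift once, mask, and compare.

```python
def bits_match_one_by_one(word, offset, expected_bits):
    return all(((word >> (offset + i)) & 1) == b
               for i, b in enumerate(expected_bits))

def bits_match_masked(word, offset, expected_bits):
    mask = (1 << len(expected_bits)) - 1
    expected = sum(b << i for i, b in enumerate(expected_bits))
    return (word >> offset) & mask == expected

w = 0b101100
assert bits_match_masked(w, 2, [1, 1, 0, 1])
assert bits_match_one_by_one(w, 2, [1, 1, 0, 1]) == bits_match_masked(w, 2, [1, 1, 0, 1])
```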

(24)

Decision Graph

(25)

Pipeline Architecture

[Diagram: generic pipeline components – Source, Filter, Buffer, Sink – passing data (states and their successors) along the pipeline.]

(26)

Pipeline Architecture

[Diagram: the UPPAAL-TIGA pipeline – state-graph and waiting queue; a forward path (source: transition, successor, delay, extrapolation, producing s′,F) and a backward path (destination: predecessor pred_t, win/lose updates on s,B and s*,B), ending in an inclusion check + add.]

(27)

Interactive Game

• How to play a (timed) strategy against the user?
  - Concrete simulator.
  - Actions depend on the point in time.
  - Allowed delays depend on the actions.
  - The GUI has limited feedback for showing counter-actions.

(28)

Interactive Game

• Goal: play the game inside the UPPAAL GUI.
• Problem: the GUI is not as talkative as a command-line simulator!

(29)

From Symbolic to Concrete

[Figure: a zone over clocks x and y with the move "take L0→L1"; from a concrete point the strategy must yield how long to wait before taking the action and the valid interval for taking it.]

(30)

From Symbolic to Concrete

• Strategy = mapping from sets of states to actions (incl. wait).
• Simulation with a given clock valuation, e.g.:
  take L0→L1, wait 2.3, take L1→L2, wait 1.7, take L2→L0, take L0→L3.
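A minimal sketch of turning one symbolic strategy entry into a concrete recommendation at a given valuation; the interval [lo, hi] on clock x and all names are illustrative, not TIGA's representation.

```python
# From "take `action` once x >= lo (and while x <= hi)" to a concrete move.

def concrete_move(x, lo, hi, action):
    """Return ('wait', delay), ('take', action), or None if too late."""
    if x < lo:
        return ("wait", lo - x)     # how long to wait before acting
    if x <= hi:
        return ("take", action)     # inside the valid interval
    return None                     # the interval has been missed

assert concrete_move(0.5, 2.0, 3.0, "L0->L1") == ("wait", 1.5)
assert concrete_move(2.3, 2.0, 3.0, "L0->L1") == ("take", "L0->L1")
```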

(31)

Interactive Game – GUI

• Avoid “your action has been countered”: restrict the selection w.r.t. the strategy.
• What is a “selectable action” for the user?
  - His own transition – if he can take it before TIGA.
  - The choice of TIGA.
  - The other actions are not selectable.

(32)

Query Language (4)

• Partially observable systems:
  - { obs1, obs2 ... } control: reachability | safety
  - Generate a controller based on partial observations of the system (obs1, obs2 ...).
• Games with Büchi accepting states (new):
  - control: A[] ( p && A<> q )
  - control: A[] A<> q

(33)

Query Language (5)

• Simulation checking of TA & TGA (new):
  - { A, B ... } <= { C, D ... }
  - Built-in new algorithm that bypasses a manual encoding we had [LSV tech. report KGL, AD, TC].
  - Strategies & simulation available in the GUI.
  - FORMATS’09 – Peter Bulychev, Thomas Chatain, Alexandre David, Kim G. Larsen.

(34)

TGA with Buchi Accepting States

• Use TIGA’s algorithm as an intermediate step in a fixpoint, but modified:
  - winning states are states that can reach the goals,
  - goal states are defined by the query and are not necessarily winning.
• Fixpoint:

  Win = SOTF(goal)
  while (Win ∩ goal) ≠ goal:
      goal = Win ∩ goal
      Win = SOTF(goal)
  return s0 ∈ Win

(Didier Lime)
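A minimal discrete sketch of this fixpoint, under one possible reading of the modified SOTF step (a goal state is not automatically winning; it must be able to force another visit to the goal set). The real algorithm is symbolic and on-the-fly; names here are illustrative.

```python
# SOTF step: states from which the controller can force a visit to `goal`
# after at least one move (goal membership alone does not make a state win).

def can_reach_again(states, succ_c, succ_u, goal):
    win, changed = set(), True
    while changed:
        changed = False
        for s in states:
            if s in win:
                continue
            good = lambda t: t in goal or t in win
            if any(good(t) for t in succ_c(s)) and all(good(t) for t in succ_u(s)):
                win.add(s)
                changed = True
    return win

def buchi_control(s0, states, succ_c, succ_u, goal):
    goal = set(goal)
    win = can_reach_again(states, succ_c, succ_u, goal)     # Win = SOTF(goal)
    while win & goal != goal:
        goal = win & goal                                   # shrink the goal set
        win = can_reach_again(states, succ_c, succ_u, goal)
    return s0 in win
```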

(35)

Application: Non-zeno Strategies

• Add a monitor composed with the rest of the system.
• Ask: control: A[] (p && A<> Monitor.Check)

[Figure: monitor automaton – an urgent location, an edge with guard x==1 and reset x=0, and a location Check.]

(36)

Timed Games with Partial Observability

• Previous: perfect information.
  - Not always suitable for controllers.
• Partial observation:
  - States or events – here, states.
  - Distinguish states w.r.t. observations.
  - The strategy keeps track of states w.r.t. observations.
  - Observations = predicates over states.

(37)

Results

• Discrete event systems:
  - [Kupferman & Vardi ’99, Reif ’84, Arnold & al. ’03] Game given as a modal logic formula: full observation as hard as partial observation.
  - [Chatterjee & al. ’06, De Wulf & al. ’06] Game given as an explicit graph: full observation PTIME, partial observation EXPTIME.
• Timed systems, game given as a TA:
  - [Cassez & al. ’07] Efficient on-the-fly algorithm, EXPTIME.

(38)

State Based Full Observation

• 2-player reachability game, controllable + uncontrollable actions.
• Full observation: in l2 do c1, in l3 do c2.

Franck Cassez ATVA’07

(39)

State Based Partial Observation

• Partition the state-space: l2 = l3.
• Can’t win here.

Franck Cassez ATVA’07

(40)

State Based Partial Observation

Franck Cassez ATVA’07

(41)

Observation For Timed Systems

Franck Cassez ATVA’07

(42)

Stuttering-Free Invariant Observations

Franck Cassez ATVA’07

(43)

TGA with PO: Rules

• If player 1 wants to take an action c, then
  - player 2 can choose to play any of his actions or c, as long as the observation stays the same, or
  - player 2 can delay as long as c is not enabled and the observation stays the same.
  - ⇒ c is urgent.
• If player 1 wants to delay, then
  - player 2 can delay or take any of his actions, as long as the observation stays the same.
• The turn goes back to player 1 as soon as the observation changes.

(44)

On-the-Fly Algorithm

(45)

Algorithm

Partition the state-space w.r.t. the observations O1, O2, O3.
Winning/losing is observable.

[Figure: the partition cells are the observation conjunctions O1∧O2∧O3, O1∧O2∧¬O3, O1∧¬O2∧O3, O1∧¬O2∧¬O3, ¬O1∧O2∧O3, ¬O1∧O2∧¬O3, ¬O1∧¬O2∧O3, ¬O1∧¬O2∧¬O3.]
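A small sketch of such a partition: two states land in the same cell exactly when they satisfy the same subset of the observation predicates. The toy states and predicates are illustrative.

```python
from collections import defaultdict

def partition_by_observations(states, observations):
    """observations: list of predicates state -> bool."""
    cells = defaultdict(set)
    for s in states:
        key = tuple(obs(s) for obs in observations)   # e.g. (True, False, True)
        cells[key].add(s)
    return cells

# Toy example with integer states.
O1 = lambda s: s % 2 == 0
O2 = lambda s: s > 3
cells = partition_by_observations(range(6), [O1, O2])
assert cells[(True, False)] == {0, 2}
```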

(46)

Algorithm

Initial state in some partition.

Compute successors { set of states } w.r.t. a controllable action.

Successors distinguished by observations.

(47)

Algorithm

Construct the graph of sets of symbolic states.

Back-propagate winning/losing states.

(48)

Notes

• Stuttering-free invariant observations:
  - sinks are possible in observations (deadlock/livelock/loop).
• Actions are urgent:
  - delay until actions are enabled (or the observation changes).

(49)

Algorithm

• Forward exploration:
  - constrained by action + observations,
  - delay is handled specially.
• Back-propagation:
  - If all a-successors are winning, declare the current state winning; strategy: take action a.
  - If one a-successor is losing, avoid action a.
  - If no action is winning, the current state is losing.
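A minimal sketch of this back-propagation rule over the graph of knowledge (belief) sets; actions, succ, is_winning and is_losing are illustrative placeholders, and in the real on-the-fly algorithm this re-evaluation is repeated until a fixpoint, so "lose" is only final once all successors are resolved.

```python
def classify(W, actions, succ, is_winning, is_losing):
    """One evaluation of a belief set W, following the rules above."""
    for a in actions(W):
        succs = succ(W, a)                    # belief successors under a,
                                              # split by observation
        if any(is_losing(Wp) for Wp in succs):
            continue                          # avoid action a
        if all(is_winning(Wp) for Wp in succs):
            return ("win", a)                 # strategy: take a in W
    return ("lose", None)                     # no winning action found
```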

(50)

Example

Observations: L, H, E, B, y in [0,1[

(51)

Example

[Figure: the generated strategy graph for the example; state identifiers omitted.]

Partition: y, Ly, Hy, Ey.
Actions: delay, y=0, eject!
