Ambiguity in Context-Free Grammars, Introduction to Pushdown Automata
Martin Fr ¨anzle
Informatics and Mathematical Modelling The Technical University of Denmark
Context-free languages II – p.1/25
What you’ll learn
1. Context-free grammars:
What’s ambiguity Why it is a problem How to avoid it
How to build useful CFGs 2. Pushdown automata:
Machines recognizing CFLs
Operational approach towards CFLs
Tools for CFLs
Context-free languages II – p.2/25
Ambiguity in CFGs
Context-free languages II – p.3/25
Pragmatics
The use cases of regular languages and context-free languages are subtly different:
For RLs, one is primarily interested in language membership:
finding strings in a text,
classifying words/phrases in a text,
With CFLs, one is often also interested in the structural information conveyed by a parse tree:
BE then S
if
BE then S
if
S else
S
S else
S
BE then S
if
BE then S
if
Context-free languages II – p.4/25
A classical ambiguity
Given
if
then else
if
then
true
false
what shall
if false then if false then print true else print false output?
BE then S
if
BE then S
if
S else
S
S else
S
BE then S
if
BE then S
if
Context-free languages II – p.5/25
A classical ambiguity
Given
if
then else
if
then
true
false
what shall
if false then if false then print true else print false output?
BE then S
if
BE then S
if
S else
S
S else
S
BE then S
if
BE then S
if
Context-free languages II – p.5/25
A classical ambiguity — resolved
Given
if
then
if
then else
true
false
what shall
if false then if false then print true else print false output?
S else
S
BE then S if
else S
BE then S
if
BE then S’
if S
BE then S’
if
S’
Context-free languages II – p.6/25
Ambiguity
Def: A CFG is called ambiguous iff there is
s.t. has (at least) two different parse trees wrt. .
N.B. It is not the existence of multiple derivations for some string
that renders ambiguous!
Context-free languages II – p.7/25
Removing ambiguity
There is no algorithm for resolving ambiguity (in the sense of automatically deriving an unambiguous grammar from a given grammar).
There is not even an algorithm for finding out whether a given CFG is ambiguous.
However, there are standard techniques for writing an unambiguous grammar that help in most cases.
Context-free languages II – p.8/25
Removing ambiguity
There is no algorithm for resolving ambiguity (in the sense of automatically deriving an unambiguous grammar from a given grammar).
There is not even an algorithm for finding out whether a given CFG is ambiguous.
However, there are standard techniques for writing an unambiguous grammar that help in most cases.
Context-free languages II – p.8/25
Removing ambiguity
There is no algorithm for resolving ambiguity (in the sense of automatically deriving an unambiguous grammar from a given grammar).
There is not even an algorithm for finding out whether a given CFG is ambiguous.
However, there are standard techniques for writing an unambiguous grammar that help in most cases.
Context-free languages II – p.8/25
Expression grammars
Ambiguous Unambiguous
Context-free languages II – p.9/25
Expression grammars
Ambiguous Unambiguous
Context-free languages II – p.9/25
Derivations vs. ambiguity
Let
be a grammar and .
Thm: The following statements are equivalent:
has more than one parse tree
has more than one leftmost derivation
has more than one rightmost derivation.
Prf. Leftmost (rightmost, resp.) derivations can be understood as a canonical way of traversing a parse tree and are thus in one-to-one correspondence to parse trees.
Context-free languages II – p.10/25
Inherent ambiguity
Def: A CFL is called inherently ambiguous iff all CFGs with
are ambiguous.
N.B. The mere existence of one ambiguous CFG for is not sufficient to render inherently ambiguous.
(In fact, any non-empty CFL has an ambiguous CFG.)
All the previous examples had an unambiguous CFG and are thus not inherently ambiguous.
Context-free languages II – p.11/25
An inherently ambiguous language
The language
is inherently ambiguous.
An example grammar:
Context-free languages II – p.12/25
An inherently ambiguous language
The language
is inherently ambiguous.
An example grammar:
Context-free languages II – p.12/25
Pushdown Automata
Context-free languages II – p.13/25
Idea of pushdown automaton
Take an -NFA and
1. add a stack for unbounded storage, yet with access limited to
“last-in-first-out”;
2. make transitions dependent on input symbol and stack top;
3. add side effect on stack to transitions:
transitions replace stack top by an arbitrary string, thus being able to
“pop” stack (replacement by ),
preserve stack (replace stack top by ),
“push” onto stack (replace stack top by ),
“pop” and then “push” (replace stack top by ).
Context-free languages II – p.14/25
Formal definition
A pushdown automaton (PDA) is a seven-tuple
with
being a finite set of states,
being the input alphabet, i.e. the finite set of input symbols,
being the stack alphabet,
fin
being the transition function,
implies that can move from to
when the input symbol is (or arbitrary in case ) and the stack top is . is then replaced by .
being the start state,
being the start symbol on the stack (i.e., execution starts with the stack containing exactely one item, namely ),
being the set of accepting states.
Context-free languages II – p.15/25
Instantaneous descriptions (IDs)
An instantaneous description (ID) is a triple
, where
denotes the current state,
denotes the remaining input,
denotes the stack contents (top of the stack
first letter of ).
IDs describe configurations of PDA computations.
is a relation between IDs of PDA formalizing “moves” of : iff
Context-free languages II – p.16/25
Instantaneous descriptions (IDs)
An instantaneous description (ID) is a triple
, where
denotes the current state,
denotes the remaining input,
denotes the stack contents (top of the stack
first letter of ).
IDs describe configurations of PDA computations.
is a relation between IDs of PDA formalizing “moves” of :
iff
Context-free languages II – p.16/25
Translation invariance
Thm: If
is a PDA and and then
implies
I.e., PDA behaviours are preserved under stack extension and/or input extension.
Prf: It is easy to see from the definition that
implies
The theorem follows by induction on the number of steps.
Context-free languages II – p.17/25
Translation invariance
Thm: If
is a PDA and then
implies
I.e., PDA behaviour is independent from trailing input.
N.B. The same is not true wrt. trailing stack contents, as the PDA might well pop off, inspect, and then restore some part of the stack contents.
Prf: It is easy to see from the definition that
implies
The theorem follows by induction on the number of steps.
Context-free languages II – p.18/25
Language of a PDA
Def: The language
of a PDA
is
def
This form of language acceptance is often called “acceptance by final state”.
Def: The language of a PDA is
def
This form of language acceptance is often called “acceptance by empty stack”.
N.B. In general, .
Context-free languages II – p.19/25
Language of a PDA
Def: The language
of a PDA
is
def
This form of language acceptance is often called “acceptance by final state”.
Def: The language
of a PDA
is
def
This form of language acceptance is often called “acceptance by empty stack”.
N.B. In general,
.
Context-free languages II – p.19/25
Expressiveness of the acceptance criteria
Thm: If
for some pushdown automaton then there is a pushdown automaton with
.
(Given , the construction of is completely mechanic.)
Thm: If for some pushdown automaton then there is a pushdown automaton with .
(Given , the construction of is completely mechanic.)
Both forms of acceptance are thus equally “expressive”.
Context-free languages II – p.20/25
Expressiveness of the acceptance criteria
Thm: If
for some pushdown automaton then there is a pushdown automaton with
.
(Given , the construction of is completely mechanic.)
Thm: If
for some pushdown automaton then there is a pushdown automaton with
.
(Given , the construction of is completely mechanic.)
Both forms of acceptance are thus equally “expressive”.
Context-free languages II – p.20/25
PDAs vs. CFGs
Accepted by a PDA by empty
stack
Definable by a CFG
Accepted by a PDA
by final state
Claim: PDAs and CFGs define the same languages.
Context-free languages II – p.21/25
Proof of the claim
Have already shown equivalence between
being accepted by some PDA by empty stack being accepted by some PDA by final state.
Suffices to show equivalence of being definable by a CFG
to one of the two forms of PDA definability.
We will show equivalence of CFG-definabilty to being accepted by some PDA by empty stack.
Context-free languages II – p.22/25
From CFG to PDA
Thm: If
for some contextfree grammar then there is a pushdown automaton with
.
(Given , the construction of is completely mechanic.)
Prf: (Essence of) Given , a corresponding 1-state PDA can be defined by
for for
otherwise.
Then show that for any and any
iff lm
Context-free languages II – p.23/25
From CFG to PDA
Thm: If
for some contextfree grammar then there is a pushdown automaton with
.
(Given , the construction of is completely mechanic.)
Prf: (Essence of) Given
, a corresponding 1-state PDA can be defined by
for
for
otherwise.
Then show that for any and any
iff lm
Context-free languages II – p.23/25
From PDA to CFG
Thm: If
for some pushdown automaton then there is a contextfree grammar with
.
(Given , the construction of is completely mechanic.)
Prf: Given , a corresponding CFG can be defined by
Then show that for any , any , and any iff
Context-free languages II – p.24/25
From PDA to CFG
Thm: If
for some pushdown automaton then there is a contextfree grammar with
.
(Given , the construction of is completely mechanic.)
Prf: Given
, a corresponding CFG can be defined by
Then show that for any , any , and any
iff
Context-free languages II – p.24/25
Stack symbols can simulate states
Cor: For each PDA there is a PDA
with
such that
.
Prf: 1. Convert to an equivalent CFG using the construction of the previous theorem,
2. convert to a one-state PDA
using the construction of the second-last theorem.
Prize is blowup in the size of the stack alphabet!
Context-free languages II – p.25/25
Stack symbols can simulate states
Cor: For each PDA there is a PDA
with
such that
.
Prf: 1. Convert to an equivalent CFG using the construction of the previous theorem,
2. convert to a one-state PDA
using the construction of the second-last theorem.
Prize is blowup in the size of the stack alphabet!
Context-free languages II – p.25/25