- In each case below, state what language is generated by the context free grammar
(CFG) productions indicated.
- S → aSa | bSb | A,
A → aBb | bBa,
B → aBa | bBb | a | b | λ
Sample solution:
This describes strings over { a, b } which would be palindromes
except for the presence of exactly one character mismatch
- S → aS | aSbS | λ
Sample solution:
This describes strings over { a, b } for which every prefix
string has at least as many a's as b's
- S → aS | bS | a
Sample solution:
This describes strings over { a, b } which end in a
- S → SS | bS | a
Sample solution:
This also describes strings over { a, b } which end in a
- Let L be the language generated by the CFG with productions
S → aSb | ab | SS
Use mathematical induction to show that no string in L
begins with abb.
Sample solution:
Proof by induction on the number of steps in the derivation
Base case: 1 step
The only string which can be derived in a single step
is ab, via the rule S→ab
and, clearly, ab does not start with abb.
Induction hypothesis: we assume that, for all strings which
can be generated by up to k applications of the grammar's
production rules, the strings do not begin with abb
Induction step: let X be a string generated by k+1 applications
of the grammars production rules.
Observation: no string can be generated which begins
with b. This is clear from inspection of the rules,
the only places terminal b is introduced are after
terminal a's. Thus any string generated must begin with an a.
There are two derivation cases we need to consider:
(i) The first rule in the derivation was S→aSb
Then the remaining S must produce a string of terminals, Y,
in k steps, and by the induction hypothesis Y begins with
an a.
Thus the overall string generated, aYb, cannot begin with abb
(ii) The first rule in the deriation was S→SS
then the remaining two S's must produce two strings of
terminals, X and Y, in a combined total of k steps
Thus, by our induction assumption, neither X nor Y can
begin with abb, and by our observation both of them
begin with a's.
If X has length at least 3, then the beginning of X is
also the beginning of our overall string, so the overall
string cannot begin with abb.
If X has length 2, then the overall string must begin with
aba (ab from X, and the first a from Y) hence the overall
string does not begin with abb
Thus no string generated by the grammar can begin with the string abb.
- Give a CFG that generates the language of all strings over { 0, 1 }
that contain 3 times as many 1's as 0's. Provide a clear argument
describing why your grammar is correct for the language.
Sample solution:
For each 0 added, we must add three 1's,
in any spot within the overall string
The solution below is overkill, but correct,
considering every possible way to arrange the
new zero and three one's
The final S → λ allows us to eliminate
any and all S's once the string terminals have
been generated in a desired order.
S → S0S1S1S1S | S1S0S1S1S | S1S1S0S1S | S1S1S1S0S
S → λ
- Give the state diagram for a push-down automaton (PDA) that recognizes
the language of all non-palindromes over { a, b, c }.
Provide a clear argument describing why your PDA is correct for the language.
Sample solution:
Create the CFG for non-palindromes,
and use our standard techinque to build a 3-state
PDA from the CFG
S → aSa | bSb | cSc | A
A → aBb | bBa | aBc | cBa | bBc | cBb
B → S | A | a | b | c | λ
The grammar works correctly, since rule S→A must be
applied at some point, and guarantees the resulting string
cannot be a palindrome.
Rule S→A can be applied at any point, and once it is applied
any string can be generated in place of the A.
Given that our construction technique for CFG→PDA is valid,
the resulting PDA must also accept exactly the set of non-palindromes.
For an argument as to why our construction technique is valid,
see the sample solution to question 6 below.
Since the state diagrams are difficult to depict online,
here is a description of the construction:
There are 3 states:
q0 is the start state,
and has a single transition going to state q1
This transition takes place as the first step
(empty stack, consuming no input) and pushes S on the stack
q2 is the accept state,
and there is a single transition from q1 to q2
which takes place when the stack is empty and the entire
input string has been consumed
q1 is the working state, and has numerous transitions
each of which goes from q1 to q1
for each production rule, X→α there is one transition
if X is on the top of the stack then replace it with
α without consuming any input characters
for each terminal (i.e. a, b, c) there is a single transition
if the next character in the input string is the
same as the character on the top of the stack,
then pop that character from the top of the stack
- A transition table is given below for a PDA with start state q0
and accept state q2. Clearly describe what language is accepted
by the PDA.
Current Input TopOfStack Next Top-ofStack
State Symbol Symbol State String
q0 a z0 q0 xz0
q0 b z0 q0 xz0
q0 a x q0 xx
q0 b x q0 xx
q0 c x q1 x
q0 c z0 q1 z0
q1 a x q1 λ
q1 b x q1 λ
q1 λ z0 q2 z0
all other combinations reject
Sample solution:
The set of strings of the form
(a+b)ic(a+b)i where i ≥ 0
(push an x for every a or b seen,
until you see a single c,
then pop an x for every a or b seen
accept if the numbers were equal)
- Give either the state diagram or the
transition table for a PDA that recognizes the same language
as that generated by the CFG with the following production rules
S → S+T | T
T → T*F | F
F → (S) | a
Provide a clear argument describing why your PDA is correct for the language.
Sample solution:
Current Input TopOfStack Next TopOfStack
State Symbol Symbol State String
q0 λ z0 q1 Sz0
q1 + + q1 λ
q1 * * q1 λ
q1 ( ( q1 λ
q1 ) ) q1 λ
q1 a a q1 λ
q1 λ S q1 S+T
q1 λ S q1 T
q1 λ T q1 T*F
q1 λ T q1 F
q1 λ F q1 (S)
q1 λ F q1 a
q1 λ z0 q2 z0
all others reject
Argument: why the CFG and the PDA describe exactly the same language
(1) any string generated by the CFG can be accepted by the PDA:
For any string generated by the language, there exists
a leftmost derivation,
e.g. S->S+T
->T+T
->F+T
->a+T
->a+T*F
->a+F*F
->a+a*F
->a+a*a
To accept the string, the PDA must merely select the
transition corresponding to the correct production
at each step involving a nonterminal
e.g. Replace S with S+T,
Replace S with T
Replace T with F
Replace F with a (then pop/match the a and +)
Replace T with T*F
Replace T with F
Replace F with a (then pop/match the a and *)
Replace F with a (then pop/match the a)
(2) any string accepted by the PDA can be generated by the CFG:
The PDA can only accept a string by matching terminals
on the stack to the symbols in the input string
The only way a sequence of terminals can be pushed on the
PDA stack (and hence popped/matched to consume the
input string) is via the application of transitions
which correspond to the production rules in the CFG
Thus if there exists a sequence of transitions by which
the PDA can accept the string, then there also exist
a sequence of productions by which the CFG can
generate the string.
For a full proof, see theorem 7.2 in the text (pp. 212-213)