C SC 320: Assignment 3 sample solutions

C SC 320: Assignment 3

In each case below, state what language is generated by the context free grammar (CFG) productions indicated.

S → aSa | bSb | A,
A → aBb | bBa,
B → aBa | bBb | a | b | λ

Sample solution:
This describes strings over { a, b } which would be palindromes 
  except for the presence of exactly one character mismatch

S → aS | aSbS | λ

Sample solution:
This describes strings over { a, b } for which every prefix
   string has at least as many a's as b's

S → aS | bS | a

Sample solution:
This describes strings over { a, b } which end in a

S → SS | bS | a

Sample solution:
This also describes strings over { a, b } which end in a

Let L be the language generated by the CFG with productions S → aSb | ab | SS
Use mathematical induction to show that no string in L begins with abb.

Sample solution:
Proof by induction on the number of steps in the derivation

Base case: 1 step
  The only string which can be derived in a single step
      is ab, via the rule S→ab
  and, clearly, ab does not start with abb.

Induction hypothesis: we assume that, for all strings which
  can be generated by up to k applications of the grammar's
  production rules, the strings do not begin with abb

Induction step: let X be a string generated by k+1 applications
     of the grammars production rules. 
  Observation: no string can be generated which begins
     with b.  This is clear from inspection of the rules,
     the only places terminal b is introduced are after
     terminal a's.  Thus any string generated must begin with an a.
  There are two derivation cases we need to consider:
  (i) The first rule in the derivation was S→aSb
        Then the remaining S must produce a string of terminals, Y,
           in k steps, and by the induction hypothesis Y begins with
           an a.
        Thus the overall string generated, aYb, cannot begin with abb
     
  (ii) The first rule in the deriation was S→SS
        then the remaining two S's must produce two strings of 
           terminals, X and Y, in a combined total of k steps
        Thus, by our induction assumption, neither X nor Y can
           begin with abb, and by our observation both of them
           begin with a's.
        If X has length at least 3, then the beginning of X is
           also the beginning of our overall string, so the overall
           string cannot begin with abb.
        If X has length 2, then the overall string must begin with
           aba (ab from X, and the first a from Y) hence the overall
           string does not begin with abb

Thus no string generated by the grammar can begin with the string abb.

Give a CFG that generates the language of all strings over { 0, 1 } that contain 3 times as many 1's as 0's. Provide a clear argument describing why your grammar is correct for the language.

Sample solution:
For each 0 added, we must add three 1's,
    in any spot within the overall string

The solution below is overkill, but correct,
    considering every possible way to arrange the
       new zero and three one's
The final S → λ allows us to eliminate
     any and all S's once the string terminals have
     been generated in a desired order.

S → S0S1S1S1S | S1S0S1S1S | S1S1S0S1S | S1S1S1S0S
S → λ

Give the state diagram for a push-down automaton (PDA) that recognizes the language of all non-palindromes over { a, b, c }. Provide a clear argument describing why your PDA is correct for the language.

Sample solution:
Create the CFG for non-palindromes,
and use our standard techinque to build a 3-state
    PDA from the CFG

S → aSa | bSb | cSc | A
A → aBb | bBa | aBc | cBa | bBc | cBb
B → S | A | a | b | c | λ

The grammar works correctly, since rule S→A must be
applied at some point, and guarantees the resulting string
cannot be a palindrome.

Rule S→A can be applied at any point, and once it is applied
any string can be generated in place of the A.

Given that our construction technique for CFG→PDA is valid,
the resulting PDA must also accept exactly the set of non-palindromes.

For an argument as to why our construction technique is valid,
see the sample solution to question 6 below.

Since the state diagrams are difficult to depict online,
here is a description of the construction:
  There are 3 states:
     q₀ is the start state,
        and has a single transition going to state q₁
        This transition takes place as the first step
        (empty stack, consuming no input) and pushes S on the stack
     q₂ is the accept state,
        and there is a single transition from q₁ to q₂
        which takes place when the stack is empty and the entire
        input string has been consumed
     q₁ is the working state, and has numerous transitions
        each of which goes from q₁ to q₁
        for each production rule, X→α there is one transition
            if X is on the top of the stack then replace it with
               α without consuming any input characters
        for each terminal (i.e. a, b, c) there is a single transition
            if the next character in the input string is the
               same as the character on the top of the stack,
               then pop that character from the top of the stack

A transition table is given below for a PDA with start state q₀ and accept state q₂. Clearly describe what language is accepted by the PDA.

Current Input  TopOfStack Next  Top-ofStack
 State  Symbol   Symbol   State    String
  q₀      a         z₀     q₀        xz₀
  q₀      b         z₀     q₀        xz₀
  q₀      a         x      q₀        xx
  q₀      b         x      q₀        xx
  q₀      c         x      q₁        x
  q₀      c         z₀     q₁        z₀
  q₁      a         x      q₁        λ
  q₁      b         x      q₁        λ
  q₁      λ         z₀     q₂        z₀
all other combinations reject

Sample solution:
The set of strings of the form
(a+b)ⁱc(a+b)ⁱ where i ≥ 0

(push an x for every a or b seen,
 until you see a single c,
 then pop an x for every a or b seen
 accept if the numbers were equal)

Give either the state diagram or the transition table for a PDA that recognizes the same language as that generated by the CFG with the following production rules

S → S+T | T
T → T*F | F
F → (S) | a

Provide a clear argument describing why your PDA is correct for the language.

Sample solution:
Current Input TopOfStack Next TopOfStack
 State  Symbol  Symbol   State  String
  q₀     λ         z₀      q₁     Sz₀
  q₁     +         +       q₁      λ
  q₁     *         *       q₁      λ
  q₁     (         (       q₁      λ
  q₁     )         )       q₁      λ
  q₁     a         a       q₁      λ
  q₁     λ         S       q₁      S+T
  q₁     λ         S       q₁      T
  q₁     λ         T       q₁      T*F
  q₁     λ         T       q₁      F
  q₁     λ         F       q₁      (S)
  q₁     λ         F       q₁      a
  q₁     λ         z₀      q₂      z₀
all others reject

Argument: why the CFG and the PDA describe exactly the same language

  (1) any string generated by the CFG can be accepted by the PDA:
          For any string generated by the language, there exists
              a leftmost derivation, 
          e.g. S->S+T
                ->T+T
                ->F+T
                ->a+T
                ->a+T*F
                ->a+F*F
                ->a+a*F
                ->a+a*a
  
          To accept the string, the PDA must merely select the
              transition corresponding to the correct production
              at each step involving a nonterminal
          e.g. Replace S with S+T,
               Replace S with T
               Replace T with F
               Replace F with a (then pop/match the a and +)
               Replace T with T*F
               Replace T with F
               Replace F with a (then pop/match the a and *)
               Replace F with a (then pop/match the a)

  (2) any string accepted by the PDA can be generated by the CFG:
          The PDA can only accept a string by matching terminals
               on the stack to the symbols in the input string
          The only way a sequence of terminals can be pushed on the
               PDA stack (and hence popped/matched to consume the
               input string) is via the application of transitions
               which correspond to the production rules in the CFG
          Thus if there exists a sequence of transitions by which
               the PDA can accept the string, then there also exist
               a sequence of productions by which the CFG can
               generate the string.

For a full proof, see theorem 7.2 in the text (pp. 212-213)