C SC 320: Assignment 3

  1. In each case below, state what language is generated by the context free grammar (CFG) productions indicated.
    1. S → aSa | bSb | A,
      A → aBb | bBa,
      B → aBa | bBb | a | b | λ
      Sample solution:
      This describes strings over { a, b } which would be palindromes 
        except for the presence of exactly one character mismatch
      
    2. S → aS | aSbS | λ
      Sample solution:
      This describes strings over { a, b } for which every prefix
         string has at least as many a's as b's
      
    3. S → aS | bS | a
      Sample solution:
      This describes strings over { a, b } which end in a
      
    4. S → SS | bS | a
      Sample solution:
      This also describes strings over { a, b } which end in a
      

  2. Let L be the language generated by the CFG with productions S → aSb | ab | SS
    Use mathematical induction to show that no string in L begins with abb.
    Sample solution:
    Proof by induction on the number of steps in the derivation
    
    Base case: 1 step
      The only string which can be derived in a single step
          is ab, via the rule S→ab
      and, clearly, ab does not start with abb.
    
    Induction hypothesis: we assume that, for all strings which
      can be generated by up to k applications of the grammar's
      production rules, the strings do not begin with abb
    
    Induction step: let X be a string generated by k+1 applications
         of the grammars production rules. 
      Observation: no string can be generated which begins
         with b.  This is clear from inspection of the rules,
         the only places terminal b is introduced are after
         terminal a's.  Thus any string generated must begin with an a.
      There are two derivation cases we need to consider:
      (i) The first rule in the derivation was S→aSb
            Then the remaining S must produce a string of terminals, Y,
               in k steps, and by the induction hypothesis Y begins with
               an a.
            Thus the overall string generated, aYb, cannot begin with abb
         
      (ii) The first rule in the deriation was S→SS
            then the remaining two S's must produce two strings of 
               terminals, X and Y, in a combined total of k steps
            Thus, by our induction assumption, neither X nor Y can
               begin with abb, and by our observation both of them
               begin with a's.
            If X has length at least 3, then the beginning of X is
               also the beginning of our overall string, so the overall
               string cannot begin with abb.
            If X has length 2, then the overall string must begin with
               aba (ab from X, and the first a from Y) hence the overall
               string does not begin with abb
    
    Thus no string generated by the grammar can begin with the string abb.
    

  3. Give a CFG that generates the language of all strings over { 0, 1 } that contain 3 times as many 1's as 0's. Provide a clear argument describing why your grammar is correct for the language.
    Sample solution:
    For each 0 added, we must add three 1's,
        in any spot within the overall string
    
    The solution below is overkill, but correct,
        considering every possible way to arrange the
           new zero and three one's
    The final S → λ allows us to eliminate
         any and all S's once the string terminals have
         been generated in a desired order.
    
    S → S0S1S1S1S | S1S0S1S1S | S1S1S0S1S | S1S1S1S0S
    S → λ
    

  4. Give the state diagram for a push-down automaton (PDA) that recognizes the language of all non-palindromes over { a, b, c }. Provide a clear argument describing why your PDA is correct for the language.
    Sample solution:
    Create the CFG for non-palindromes,
    and use our standard techinque to build a 3-state
        PDA from the CFG
    
    S → aSa | bSb | cSc | A
    A → aBb | bBa | aBc | cBa | bBc | cBb
    B → S | A | a | b | c | λ
    
    The grammar works correctly, since rule S→A must be
    applied at some point, and guarantees the resulting string
    cannot be a palindrome.
    
    Rule S→A can be applied at any point, and once it is applied
    any string can be generated in place of the A.
    
    Given that our construction technique for CFG→PDA is valid,
    the resulting PDA must also accept exactly the set of non-palindromes.
    
    For an argument as to why our construction technique is valid,
    see the sample solution to question 6 below.
    
    Since the state diagrams are difficult to depict online,
    here is a description of the construction:
      There are 3 states:
         q0 is the start state,
            and has a single transition going to state q1
            This transition takes place as the first step
            (empty stack, consuming no input) and pushes S on the stack
         q2 is the accept state,
            and there is a single transition from q1 to q2
            which takes place when the stack is empty and the entire
            input string has been consumed
         q1 is the working state, and has numerous transitions
            each of which goes from q1 to q1
            for each production rule, X→α there is one transition
                if X is on the top of the stack then replace it with
                   α without consuming any input characters
            for each terminal (i.e. a, b, c) there is a single transition
                if the next character in the input string is the
                   same as the character on the top of the stack,
                   then pop that character from the top of the stack
            
    

  5. A transition table is given below for a PDA with start state q0 and accept state q2. Clearly describe what language is accepted by the PDA.
    Current Input  TopOfStack Next  Top-ofStack
     State  Symbol   Symbol   State    String
      q0      a         z0     q0        xz0
      q0      b         z0     q0        xz0
      q0      a         x      q0        xx
      q0      b         x      q0        xx
      q0      c         x      q1        x
      q0      c         z0     q1        z0
      q1      a         x      q1        λ
      q1      b         x      q1        λ
      q1      λ         z0     q2        z0
    all other combinations reject
    
    Sample solution:
    The set of strings of the form
    (a+b)ic(a+b)i where i ≥ 0
    
    (push an x for every a or b seen,
     until you see a single c,
     then pop an x for every a or b seen
     accept if the numbers were equal)
    

  6. Give either the state diagram or the transition table for a PDA that recognizes the same language as that generated by the CFG with the following production rules
    S → S+T | T
    T → T*F | F
    F → (S) | a
    
    Provide a clear argument describing why your PDA is correct for the language.
    Sample solution:
    Current Input TopOfStack Next TopOfStack
     State  Symbol  Symbol   State  String
      q0     λ         z0      q1     Sz0
      q1     +         +       q1      λ
      q1     *         *       q1      λ
      q1     (         (       q1      λ
      q1     )         )       q1      λ
      q1     a         a       q1      λ
      q1     λ         S       q1      S+T
      q1     λ         S       q1      T
      q1     λ         T       q1      T*F
      q1     λ         T       q1      F
      q1     λ         F       q1      (S)
      q1     λ         F       q1      a
      q1     λ         z0      q2      z0
    all others reject
    
    Argument: why the CFG and the PDA describe exactly the same language
    
      (1) any string generated by the CFG can be accepted by the PDA:
              For any string generated by the language, there exists
                  a leftmost derivation, 
              e.g. S->S+T
                    ->T+T
                    ->F+T
                    ->a+T
                    ->a+T*F
                    ->a+F*F
                    ->a+a*F
                    ->a+a*a
      
              To accept the string, the PDA must merely select the
                  transition corresponding to the correct production
                  at each step involving a nonterminal
              e.g. Replace S with S+T,
                   Replace S with T
                   Replace T with F
                   Replace F with a (then pop/match the a and +)
                   Replace T with T*F
                   Replace T with F
                   Replace F with a (then pop/match the a and *)
                   Replace F with a (then pop/match the a)
    
      (2) any string accepted by the PDA can be generated by the CFG:
              The PDA can only accept a string by matching terminals
                   on the stack to the symbols in the input string
              The only way a sequence of terminals can be pushed on the
                   PDA stack (and hence popped/matched to consume the
                   input string) is via the application of transitions
                   which correspond to the production rules in the CFG
              Thus if there exists a sequence of transitions by which
                   the PDA can accept the string, then there also exist
                   a sequence of productions by which the CFG can
                   generate the string.
    
    For a full proof, see theorem 7.2 in the text (pp. 212-213)