CSCI 162 Lab 6: theory of computation


Schedule:

In the labs of weeks 11 and 12 (Apr. 3 and 10) we'll discuss different aspects of the theory of computation: finite automata, regular expressions, and algorithm complexity.

The lab exercise is due at 5pm on Friday April 12th.


Reference material:

Regular expressions in practice
Most programming languages provide some mechanism for describing a pattern using a regular expression and then testing other strings to see if they match that pattern.

The most common notation for the patterns is as follows (with some variations depending on the specific programming language in use of course):
regexmeaning
abc the string "abc"
x* 0 or more instances of x
x+ 1 or more instances of x
x? 0 or 1 instances of x
. any single character
[afg] any one of characters a, f or g
[^afg] any one character except a, f or g
[A-Z0-9] any one of character from A-Z or from 0-9
(pattern1 | pattern2) matches pattern 1 or pattern 2
^ matches the start of a string
$ matches the end of a string
Many languages say a string matches a pattern if it contains the pattern, so the notation "^pattern$" allows us to say the string must exactly match the pattern (not simply contain it).

The brackets for grouping allow more complex expressions, e.g. pattern
( xyz | [13579]* )+ would match strings like "xyz", "113", "79xyzxyz31755", etc.


Examples of matching syntax in different languages
Suppose we had a variable named pat that held our regular expression, and a variable named str that held the string we wanted to check. The syntax for checking for a match in various languages would look like:
  • C++: if (std::regex_match(str, pat)) // need to #include <regex>
  • lisp: (if (si::string-match pat str)
  • bash: if [[ $str =~ $pat ]]
  • perl: if ($str =~ $pat)

Examples of complexity analysis
For this lab I'm less concerned with rigourous math/theory proofs, and more concerned with your ability to recognize and describe the key factors for particular algorithms.

Note that we are using pseudo-code, not "real" code, since all we are interested in is the algorithm complexity.

Example 1: change an element in an array of size N
Algorithm:
   arr[i] = x
The time taken to update arr[i] is constant time, it does not depend on N at all, thus O(1).


Example 2: linear search of an array of size N

Algorithm:
   for i = 0 to N-1:
      if (arr[i] equals target) return i
   return -1
In the worst case, we need to search all N elements of the array, thus O(N), or linear time.


Example 3: binary search of an array of size N

Algorithm:
   int bsearch(arr, low, high, target)
      if low > high return -1
      mid = (low+high) / 2
      if (target > mid) return bsearch(arr, mid+1, high, target)
      else if (target < mid) return bsearch(arr, low, mid-1, target)
      else return i

original call:
   pos = bsearch(array, 0, N-1, x)
We cut the search space in half on each call to bsearch, thus there are at most log2(N) calls before we get down to a single element to check/reject.

On each individual call we are doing a fixed amount of work, independent of N, call that constant k (at worst a computation, three comparisons, and a call)

so worst case is k * log2(N) for constant k, i.e. O(log(N)), or logarithmic time.


Example 4: printing multiplication tables for up to NxN

Algorithm:
   for i = 1 to N do:
       for j = 1 to N do:
           print i "*" j "=" (i*j) "\n"
We make N passes through the outer loop (i loop), and for each of those we make N passes through the inner loop (j loop) and for each of those we do one print statement.

The amount of work for the print is a constant, independent of N, call it k.

Thus the total work is N * N * k, or O(N2), or quadratic time.


Lab repositories:

Obtain the lab 6 repository using the same techniques as previous labs, i.e.

The remaining instructions for lab 3 are in the README file.


Lab exercises:

Once you have completed all parts of the lab, be sure to add, commit, and push your updates:

REMEMBER: if you miss the add, the commit, or the push then your changes never make it back to the central repository, so they'll never get marked!