Bracket matching

One interesting problem we encounter is how to check that the brackets in a typical program 'match up' correctly: i.e. there are equal numbers of each type of opening and closing brackets, and in an appropriate order for the language. Compilers must check this, but many editors and other software development tools also try to help the programmer by highlighting matching brackets while editing.

If we ignore all the text between the brackets, and for the moment focus purely on the brackets themselves, we can examine the problem a little more simply.

Consider the example of a valid sequence below: D closes C, F closes E, G closes B, and H closes A, so the ordering is valid.
{ { ( ) { } } }
A B C D E F G H
In the example of an invalid sequence below, we have a round bracket, D, attempting to close the open curly Bracket, E:
{ { ( { ) } } }
A B C E D F G H

We can define what are valid sequences of brackets by providing a simple set of rules on how to build such sequences, e.g.:

  1. { } is a valid sequence
  2. [ ] is a valid sequence
  3. ( ) is a valid sequence
  4. If S is a valid sequence then so are { S } and [ S ] and ( S )
  5. If S1 and S2 are valid sequences then so is S1 followed by S2

Combinations of these rules would allow us to build any valid bracket sequence.
For example, to build { { () [] } } we could use:

Rule 3 for                  ( )
Rule 2 for                  [ ]
Rule 5 to put them together ( ) [ ]
Rule 1 to enclose them:     { ( ) [ ] }
Rule 1 again:               { { ( ) [ ] } }

To try and check a sequence to see if it was valid could be done by trying to derive a series of steps that produced it, but there is a simpler stack-based algorithm.

The basic concept is that we use the stack to remember what sequence of brackets are open right now, so we know what order they need to be closed in. Whenever we see a new opening bracket we push it on the top of the stack, whenever we see a closing bracket we remove the most recent (top) opening bracket from the stack and make sure they match. If they don't match there is an error, if the stack is empty when we see a closing bracket there is an error, and if the stack is not empty at the end of the input then there is an error.

The algorithm is outlined below.

open the file containing the code to be checked,
   and make sure it opened successfully
start with an empty stack
examine each character, one at a time, until
   you reach the end of file or detect an error
      if it is an opening bracket 
         then push it on the stack
      otherwise, if it is a closing bracket 
         then pop the top bracket off the stack
            and see if they match: [ with ], ( with ), etc
         if they don't match the bracket sequence is invalid
         or if there was nothing to pop off the stack
            then the sequence is invalid since there
            was no opening bracket waiting to be closed
after the last character, if the stack is not empty
   then the sequence is invalid
   (since some open brackets never got closed)
otherwise the sequence is valid
close the file

Version 1

The checkBrackets function below tries to validate the order of brackets in a file that can contain mixes of round brackets ( ), square brackets [ ], and curly brackets { }. checkBrackets ignores any other characters in the file.


// Assuming we have a class for a Stack of chars, with methods 
//    bool Stack::push(char c);
//    bool Stack::pop(char &c); 
//    bool Stack::isempty();

// bool checkBrackets(char filename[])
// ===================================
// The filename parameter gives the name of a file containing, between
//     various other characters, a sequence of opening and closing
//     brackets of the following types: { } ( ) [ ]
// The purpose of the function is to check that the order of brackets
//     within the file is valid for standard nesting rules.
// (All characters other than brackets are ignored by the function.)
// If the file is readable and the bracket ordering is valid 
//     then checkBrackets should return true.
// If the file is unreadable or the bracket ordering is invalid
//     then checkBrackets should return false.
bool checkBrackets(char filename[])
{
   // use a stack to track the currently open brackets
   Stack S;

   // storage for characters currently being processed
   char c, c2;

   // try opening the file
   ifstream fpin;
   fpin.open(filename);
   if (fpin.fail()) {
      cout << "Unable to open file " << filename << endl;
      return false;
   }

   // assume the file contents are valid unless/until we determine otherwise
   bool valid = true;

   // process the file one character at a time
   while (valid && !fpin.eof()) {

      // get the next character and check for end of file
      fpin.get(c);
      if (!fpin.eof()) {

         // if it is an opening bracket then push it on the stack
         if ((c == '{') || (c == '(') || (c == '[')) {
            if (!S.push(c)) {
               cout << "Stack failure, unable to continue processing\n";
               valid = false;
            }
         }

         // if it is a closing bracket then pop the top bracket off
         //    the stack if possible (otherwise it is an error) 
         else if ((c == '}') || (c == ')') || (c == ']')) {
             if (S.isempty()) {
                cout << "Found a closing " << c;
                cout << " when no brackets were open\n";
                valid = false;
             } else {
                if (!S.pop(c2)) {
                   cout << "Stack failure, unable to continue processing\n";
                   valid = false;
                } else {
                   if ((c == '}') && (c2 != '{')) valid = false;
                   if ((c == ')') && (c2 != '(')) valid = false;
                   if ((c == ']') && (c2 != '[')) valid = false;
                   if (!valid) {
                      cout << "Bracket mismatch: " << c2;
                      cout << " and " << c << endl;
                   }
                }
             }
         }

         // if it is anything else then the character is ignored   
         else { }
      }
   }

   // once we have read every character in the file,
   // make sure there are no brackets left unclosed
   if (valid && !S.isempty()) {
      cout << "The following open bracket(s) were left unmatched at the ";
      cout << "end of the file:";
      while (!S.isempty()) {
         S.pop(c);
         cout << c;
      }
      cout << endl;
      valid = false;
   }

   // close the file and return the final result
   fpin.close();
   return valid;
} 

Version 2

If an error is detected, it would be handy if we could identify exactly which brackets in the file didn't match - for instance if we specified what line each bracket was on and which character position in the line.

To do that, we'll need to keep track of line numbers and character positions within the current line as we read the file (done by watching for newline characters), and whenever we push a bracket onto the stack we include the line number and character position.

Prototypes for the altered stack methods are given below, along with the new version of checkBrackets.


// Assuming we have a class for a Stack of chars, with methods 
//    bool Stack::push(char c, int line, int pos);
//    bool Stack::pop(char &c, int &line, int &pos); 
//    bool Stack::isempty();

// bool checkBrackets(char filename[])
// ===================================
// The filename parameter gives the name of a file containing, between
//     various other characters, a sequence of opening and closing
//     brackets of the following types: { } ( ) [ ]
// The purpose of the function is to check that the order of brackets
//     within the file is valid for standard nesting rules.
// (All characters other than brackets are ignored by the function.)
// If the file is readable and the bracket ordering is valid 
//     then checkBrackets should return true.
// If the file is unreadable or the bracket ordering is invalid
//     then checkBrackets should return false.
bool checkBrackets(char filename[])
{
   // use a stack to track the currently open brackets
   Stack S;

   // storage for characters currently being processed
   char c, c2;

   // track the current line number and position within the line
   int position = 1;
   int line = 1;

   // try opening the file
   ifstream fpin;
   fpin.open(filename);
   if (fpin.fail()) {
      cout << "Unable to open file " << filename << endl;
      return false;
   }

   // assume the file contents are valid unless/until we determine otherwise
   bool valid = true;

   // process the file one character at a time
   while (valid && !fpin.eof()) {

      // get the next character and check for end of file
      fpin.get(c);
      if (!fpin.eof()) {

         // increment the character position within the line
         position++;

         // if it is an opening bracket then push it on the stack
         if ((c == '{') || (c == '(') || (c == '[')) {
            if (!S.push(c, line, position)) {
               cout << "Stack failure, unable to continue processing\n";
               valid = false;
            }
         }

         // if it is a closing bracket then pop the top bracket off
         //    the stack if possible (otherwise it is an error) 
         else if ((c == '}') || (c == ')') || (c == ']')) {
             if (S.isempty()) {
                cout << "Found a closing " << c << " from line:pos ";
                cout << line << ":" << position;
                cout << " when no brackets were open\n";
                valid = false;
             } else {
                // lookup the stored bracket, line, and position
                int pos2, line2;
                if (!S.pop(c2, line2, pos2)) {
                   cout << "Stack failure, unable to continue processing\n";
                   valid = false;
                } else {
                   if ((c == '}') && (c2 != '{')) valid = false;
                   if ((c == ')') && (c2 != '(')) valid = false;
                   if ((c == ']') && (c2 != '[')) valid = false;
                   if (!valid) {
                      cout << "Bracket mismatch: " << c2 << " from line:pos ";
                      cout << line2 << ":" << pos2 << " and " << c;
                      cout << " " << line << ":" << position << endl;
                   }
                }
             }
         }

         // if it is a new line adjust the line number and position
         else if (c == '\n') { 
            position = 1;
            line++;
         }
      }
   }

   // once we have read every character in the file,
   // make sure there are no brackets left unclosed
   if (valid && !S.isempty()) {
      cout << "The following open bracket(s) were left unmatched at the ";
      cout << "end of the file:\n";
      while (!S.isempty()) {
         S.pop(c, line, position);
         cout << "   " << c << " line:pos " << line << ":" << position << endl;
      }
      cout << endl;
      valid = false;
   }

   // close the file and return the final result
   fpin.close();
   return valid;
} 

Version 3

The version above works well, but it checks ALL brackets in the file.

There are times when that isn't desirable, e.g. if we wrote a single bracket in a comment somewhere, e.g.
// blah blah blah { blah blah
/* blah blah blah ( blah blah */

Similarly someone could write a single bracket inside a text string, like
cout << " blah blah blah [ ";

Or, perhaps in a character literal, e.g. char c = '{';

To handle those cases, we can introduce different reading modes:


// Assuming we have a class for a Stack of chars, with methods 
//    bool Stack::push(char c, int line, int pos);
//    bool Stack::pop(char &c, int &line, int &pos); 
//    bool Stack::isempty();

// bool checkBrackets(char filename[])
// ===================================
// The filename parameter gives the name of a file containing, between
//     various other characters, a sequence of opening and closing
//     brackets of the following types: { } ( ) [ ]
// The purpose of the function is to check that the order of brackets
//     within the file is valid for standard nesting rules.
// (All characters other than brackets are ignored by the function.)
// If the file is readable and the bracket ordering is valid 
//     then checkBrackets should return true.
// If the file is unreadable or the bracket ordering is invalid
//     then checkBrackets should return false.
bool checkBrackets(char filename[])
{
   // there are five different reading modes while
   // testing a file, we start off in Checking mode
   //    Checking (the regular bracket checking mode)
   //    Multi (processing a multi-line comment, e.g. /* ... */)
   //    Single (processing a single-line comment, e.g // ...)
   //    TextLit (processing a text literal, e.g. ".....")
   //    CharLit (processing a char literal), e.g. '.')
   enum Modes { Checking, Multi, Single, TextLit, CharLit };
   Modes mode = Checking;

   // use a stack to track the currently open brackets
   Stack S;

   // storage for characters currently being processed
   char c, c2;

   // track the current line number and position within the line
   int position = 1;
   int line = 1;

   // try opening the file
   ifstream fpin;
   fpin.open(filename);
   if (fpin.fail()) {
      cout << "Unable to open file " << filename << endl;
      return false;
   }

   // assume the file contents are valid unless/until we determine otherwise
   bool valid = true;

   // process the file one character at a time
   while (valid && !fpin.eof()) {

      // get the next character and check for end of file
      fpin.get(c);
      if (!fpin.eof()) {

         // increment the character position within the line
         position++;

         // if it is a new line adjust the line number and position
         if (c == '\n') { 
            position = 1;
            line++;
         }

         // handle cases where we are in text literal mode
         if (mode == TextLit) {
              // switch back to regular checking mode after "
              if (c == '"') mode = Checking;

              // skip an extra char if you see a \ during the lit,
              // since that could be a '\"'
              if (c == '\\') {
                 fpin.get(c);
                 if (!fpin.eof()) position++;
              }
         }

         // handle cases where we are in char literal mode
         else if (mode == CharLit) {
              // switch back to regular checking mode after '
              if (c == '\'') mode = Checking;

              // skip an extra char if you see a \ during the lit,
              // since that could be a '\''
              if (c == '\\') {
                 fpin.get(c);
                 if (!fpin.eof()) position++;
              }
         }

         // handle cases where we are in multi-line comment mode
         else if (mode == Multi) {
            // switch back to regular checking mode after */
            if (c == '*') {
               // look ahead to see if the next char is a /
               c2 = fpin.peek();
               if (c2 == '/') mode = Checking;
            }
         }

         // handle cases where we are in single-line comment mode
         else if (mode == Single) {
            // switch back to regular checking mode at end of line
            if (c == '\n') mode = Checking;
         }

         // handle cases where we are in regular bracket checking mode
         else {
            // see if we need to switch to one of the other modes
            if (c == '"') mode = TextLit;
            else if (c == '\'') mode = CharLit;
            else if (c == '/') {
               // peek ahead and see if this is the start of a comment
               c2 = fpin.peek();
               if (c2 == '*') {
                  mode = Multi;
                  fpin.get(c);
                  if (!fpin.eof()) position++;
               } else if (c2 == '/') {
                  mode = Single;
                  fpin.get(c);
                  if (!fpin.eof()) position++;
               }
            }

            // if it is an opening bracket then push it on the stack
            else if ((c == '{') || (c == '(') || (c == '[')) {
               if (!S.push(c, line, position)) {
                  cout << "Stack failure, unable to continue processing\n";
                  valid = false;
               }
            }
   
            // if it is a closing bracket then pop the top bracket off
            //    the stack if possible (otherwise it is an error) 
            else if ((c == '}') || (c == ')') || (c == ']')) {
                if (S.isempty()) {
                   cout << "Found a closing " << c << " from line:pos ";
                   cout << line << ":" << position;
                   cout << " when no brackets were open\n";
                   valid = false;
                } else {
                   // lookup the stored bracket, line, and position
                   int pos2, line2;
                   if (!S.pop(c2, line2, pos2)) {
                      cout << "Stack failure, unable to continue processing\n";
                      valid = false;
                   } else {
                      if ((c == '}') && (c2 != '{')) valid = false;
                      if ((c == ')') && (c2 != '(')) valid = false;
                      if ((c == ']') && (c2 != '[')) valid = false;
                      if (!valid) {
                         cout << "Bracket mismatch: " << c2 << " from line:pos ";
                         cout << line2 << ":" << pos2 << " and " << c;
                         cout << " " << line << ":" << position << endl;
                      }
                   }
                }
            }
         }
      }
   }

   // once we have read every character in the file,
   // make sure there are no brackets left unclosed
   if (valid && !S.isempty()) {
      cout << "The following open bracket(s) were left unmatched at the ";
      cout << "end of the file:\n";
      while (!S.isempty()) {
         S.pop(c, line, position);
         cout << "   " << c << " line:pos " << line << ":" << position << endl;
      }
      cout << endl;
      valid = false;
   }

   // close the file and return the final result
   fpin.close();
   return valid;
}