CSCI 161 Lab 3: sections S26N01/2

See the main lab page for submission deadlines and late penalties.

It is assumed you are keeping up to date with the lectures and labs.
Some additional git information is available through the git quick notes, git project/lab submission, and lecture resources pages.

This lab builds on the ideas from lab2, but now stores the data in a linked list, rather than an array, and uses mergesort on the linked list (instead of quicksort on an array).

The marking for this lab will be stricter in terms of the code standards, so be sure you have reviewed them carefully. Marks will be deducted for failing to follow standards.

lab3 product requirements/specification/design

As mentioned at the top of this page, the desired functionality for lab3 is the same as lab2, but with two key differences:

  1. Rather than using arrays to store the data we will now use a linked list, adding a new element to the back of the existing list each time you read a new valid pair from the file.

  2. The sorting algorithm you are using must be mergesort this time (not quicksort), and it must operate on your linked list (not an array).

    Hand-built linked list development is a core part of this lab. As such, you must use the linked-list of structs approach for this lab, i.e. not C++ library lists or queues or similar data structures to store your collection of rankings.

    Similarly, the implementation, testing, and debugging of a recursive mergesort algorithm (with separate merge function) is another core part, and you are required to implement each of them yourself - i.e. do not use the various C++ algorithm/sort libraries and functions.

Design ideas

As with labs 1 and 2, you are encouraged to come up with your own design, but are free to use the sample design below.

The design shows a similar set of functions as those in lab2, but using two new structs, one for individual items and one for a list of rankings as a whole.

Underneath the sample design code you'll find discussion of the more challenging aspects of switching to work with a linked list and switching to work with mergesort.

// sample main from lab3.cpp

int main(int argc, char* argv[])
{
   // quit if the user didn't pass valid arguments
   if (!checkUsage(argc, argv)) {
      cerr << "Program terminated early, unable to proceed\n";
      return 0;
   }

   string filename = argv[1]; // filename is more intuitive later
   welcome(); // program intro for user

   // load the rankings from the named file
   ItemList rankings;
   initList(rankings);
   if (!readRankings(filename, rankings)) {
      cerr << "Error: could not load file, ending program" << endl;
      deallocate(rankings);
      return 0;
   }

   // run head-to-head matchups, skipping unsuccessful ones
   int matches = getNumMatches();
   for (int m = 1; m <= matches; m++) {
      if (!runMatchup(rankings)) {
         cerr << "Error: match " << m << " unsuccessful, rankings not updated\n";
      }
   }

   // sort the rankings (lowest to highest)
   msortRanks(rankings);

   // write the updated rankings back to the file
   if (!writeRankings(filename, rankings)) {
      cerr << "Unable to updated rankings file " << filename << endl;
   }

   deallocate(rankings);
   return 0;
}


//sample ranking.h #pragma once #include <iostream> #include <string> using std::cout; using std::cin; using std::cerr; using std::endl; using std::string; const bool DEBUGMODE = false; // used to turn debugging output on/off // representation of a single item within a linked list struct ItemRank { int rank; // positive integer rank string description; // text description of item ItemRank* next; // pointer to the item after/behind this item (closer to the back) ItemRank* prev; // pointer to the item ahead/before this item (closer to the front) }; // representation of a list of items struct ItemList { ItemRank* front; // the first item in line ItemRank* back; // the last item in line int size; // number of items in the list }; // delete all the allocated items in a list then re-initialize it void deallocate(ItemList &rankings); // load item rankings from the named file, store in rankings, // return true if successful, false otherwise bool readRankings(string filename, ItemList &rankings); // initialize a list as empty void initList(ItemList &rankings); // insert the item at the back of the rankings list // return true if successful, false otherwise bool insertBack(ItemList &rankings, ItemRank* item); // remove the front item from the rankings list // return the pointer to it (null if list was empty) ItemRank* removeFront(ItemList &rankings); // return a pointer to the nth item in the list, null if N not valid ItemRank* getNthItem(ItemList rankings, int N); // return true iff descript is a valid item description // (i.e. contains at least one non-whitespace character) bool validDescription(string descript); // run a head-to-head matchup // - pick distinct pair of items // - present user with choice, get valid response // - update rankings based on response // return true iff the matchup is successful bool runMatchup(ItemList &rankings); // write rankings to a file // return true iff successful bool writeRankings(string filename, ItemList rankings); // calculate and return new rank for X based on current ranks for X and Y // and score from X's perspective: 1=X win, 0.5=draw, 0=Y win // return unchanged RankX in the event of an invalid parameter int calcNewRank(int RankX, int RankY, float score); // present the user with a choice between items X and Y // return 1 if they choose X // 0 if they choose Y // 0.5 if it's a draw // (repeat until they make a valid choice) float getChoice(ItemRank* X, ItemRank* Y); // remove rankings from the two sorted source lists, merging them into the destination list bool merge(ItemList &source1, ItemList &source2, ItemList &destination); // sort the rankings from lowest rank to highest void msortRanks(ItemList &rankings);


How our read routine changes

In our lab2 version of reading from a file we could simply put the next item into the next array position. Now we need to create a new item (like the allocate function from the lecture on Feb. 25) and then insert it into the list of items (like the insertAtBack function from the Feb25 lecture).


How picking two random items for a match changes

In our lab2 version for an array of N elements we could simply pick two different random numbers in the range 0..N-1 and grab those two items for our matchup.

In lab3, however, the list doesn't inherently have a size associated with it and we have no way to jump directly to the i'th item in the list.

One possible solution is to add a size field to the list struct: initialize it to 0 and increment it every time you do an insert (so the size always matches the number of elements currently in the list).

Then we can still pick two random numbers in the range 0..N-1 (or 1..N), but we need to scan the list from the front to find the i'th item. This is essentially just a for loop that counts how many items to skip across, then return a pointer to the item you land on. If you look at the example design above, you'll see a getNthItem function that does exactly that.


Mergesort on linked lists

If you're looking at the slides/sample code online, you'll see the mergesort and merge in that example ( mergesort.cpp ) is based on sorting an array, whereas here in lab3 you'll be using mergesort on a linked list (like the msortRanks and merge functions in the design above).

This does involve some extra juggling to translate an array-based solution into a linked-list style. The core steps are still the same for the recursive mergesort function itself (msortRanks above):

  1. split the original linked list into two roughly-equal halves
  2. call mergesort (msortRanks) on each half
  3. merge the two sorted half-lists back into one sorted whole

The big changes are in steps 1 and 3.

  - In step one, to split the list in half, we can't simply point at the middle and break the list in half since we only have pointers to the two ends (front and back). What we can do, is create two new empty lists and go through the 'big' list one item at a time, removing them from the big list and alternating between putting them in little list one and little list two.

  - In step three, to merge the two lists together into one big one, we just keep looking at the current front item in each of the two little lists, picking the smaller of them and moving it from the little list (e.g. removeFront) and putting it into the big list (e.g. insertBack).


Second week discussion: planning test cases

Just to reiterate points discussed earlier in the term, it's crucial for developers to create good habits in thoroughly testing their own code: both the program as a whole and the individual functions.

A substantial part of this comes down to recognizing what constitutes good test cases: how do we ensure that the code we've written actually does behave as intended, both on 'good/normal' user data and also in its handling of user mistakes.

Going through the specifications/requirements for a product, nearly every sentence should trigger testing ideas in the minds of the developer, covering all the different ways in which valid/invalid data may be provided by the user.

It is important not to make the assumption that our code correctly handles anything, valid or invalid, without actually verifying that with a corresponding test case.

Considering our current program as a case study, there are four key areas where the user provides data:

In each of these areas, we should come up with a list of typical valid data options with an extra focus on edge cases (valid but barely), and typical erroneous data with (again) a focus on edge cases (invalid but barely).

For each of the test scenarios we come up with, we should come up with the precise data to be used for the test case and the precise results that we should then see from the program (all output messages and all changes to the file content).

In our case study, the kinds of test cases we would need (at least) would include:

Before submitting a solution, look at the list above and ask yourself if you have tested your final submission of the lab against each of those scenarios.


Second week discussion: debuggers/gdb

Another key developer skill is that of debugging: being able to track down the root cause of errors in our code. We may learn of those errors through testing (as above) or through feedback from users or other developers, but we must then have a plan of attack for detecting the source of the issue.

The first crucial step is to come up with one or more hypotheses as to the possible sources, and then develop a plan for either confirming or rejecting each hypothesis. Such plans might include refining our test cases to narrow down specific possibilities, and might also include examining the code's behaviour as it executes during a test case. This latter approach is made significantly easier through the use of debuggers: tools which allow us to pause the program as it runs and examine (or even modify) the values of variables and memory to get greater insights as to its actual run-time behaviour (as opposed to what we thought it *should* be doing).

We'll look at the use of one basic debugger, gdb, but most available debuggers and IDEs (integrated development environments, such as VSCode) provide a very similar suite of capabilities.

I strongly recommend you try these steps on your lab3x, and get used to routinely using debuggers as part of your programming toolbox.