CSCI 161 Spring 2026: lab 3, S26N01/2

CSCI 161 Lab 3: sections S26N01/2

See the main lab page for submission deadlines and late penalties.

It is assumed you are keeping up to date with the lectures and labs.
Some additional git information is available through the git quick notes, git project/lab submission, and lecture resources pages.

This lab builds on the ideas from lab2, but now stores the data in a linked list, rather than an array, and uses mergesort on the linked list (instead of quicksort on an array).

The marking for this lab will be stricter in terms of the code standards, so be sure you have reviewed them carefully. Marks will be deducted for failing to follow standards.

lab3 product requirements/specification/design

As mentioned at the top of this page, the desired functionality for lab3 is the same as lab2, but with two key differences:

Rather than using arrays to store the data we will now use a linked list, adding a new element to the back of the existing list each time you read a new valid pair from the file.
The sorting algorithm you are using must be mergesort this time (not quicksort), and it must operate on your linked list (not an array).
Hand-built linked list development is a core part of this lab. As such, you must use the linked-list of structs approach for this lab, i.e. not C++ library lists or queues or similar data structures to store your collection of rankings.
Similarly, the implementation, testing, and debugging of a recursive mergesort algorithm (with separate merge function) is another core part, and you are required to implement each of them yourself - i.e. do not use the various C++ algorithm/sort libraries and functions.

Design ideas

As with labs 1 and 2, you are encouraged to come up with your own design, but are free to use the sample design below.

The design shows a similar set of functions as those in lab2, but using two new structs, one for individual items and one for a list of rankings as a whole.

Underneath the sample design code you'll find discussion of the more challenging aspects of switching to work with a linked list and switching to work with mergesort.

// sample main from lab3.cpp int main(int argc, char* argv[]) { // quit if the user didn't pass valid arguments if (!checkUsage(argc, argv)) { cerr << "Program terminated early, unable to proceed\n"; return 0; } string filename = argv[1]; // filename is more intuitive later welcome(); // program intro for user // load the rankings from the named file ItemList rankings; initList(rankings); if (!readRankings(filename, rankings)) { cerr << "Error: could not load file, ending program" << endl; deallocate(rankings); return 0; } // run head-to-head matchups, skipping unsuccessful ones int matches = getNumMatches(); for (int m = 1; m <= matches; m++) { if (!runMatchup(rankings)) { cerr << "Error: match " << m << " unsuccessful, rankings not updated\n"; } } // sort the rankings (lowest to highest) msortRanks(rankings); // write the updated rankings back to the file if (!writeRankings(filename, rankings)) { cerr << "Unable to updated rankings file " << filename << endl; } deallocate(rankings); return 0; }
//sample ranking.h #pragma once #include <iostream> #include <string> using std::cout; using std::cin; using std::cerr; using std::endl; using std::string; const bool DEBUGMODE = false; // used to turn debugging output on/off // representation of a single item within a linked list struct ItemRank { int rank; // positive integer rank string description; // text description of item ItemRank* next; // pointer to the item after/behind this item (closer to the back) ItemRank* prev; // pointer to the item ahead/before this item (closer to the front) }; // representation of a list of items struct ItemList { ItemRank* front; // the first item in line ItemRank* back; // the last item in line int size; // number of items in the list }; // delete all the allocated items in a list then re-initialize it void deallocate(ItemList &rankings); // load item rankings from the named file, store in rankings, // return true if successful, false otherwise bool readRankings(string filename, ItemList &rankings); // initialize a list as empty void initList(ItemList &rankings); // insert the item at the back of the rankings list // return true if successful, false otherwise bool insertBack(ItemList &rankings, ItemRank* item); // remove the front item from the rankings list // return the pointer to it (null if list was empty) ItemRank* removeFront(ItemList &rankings); // return a pointer to the nth item in the list, null if N not valid ItemRank* getNthItem(ItemList rankings, int N); // return true iff descript is a valid item description // (i.e. contains at least one non-whitespace character) bool validDescription(string descript); // run a head-to-head matchup // - pick distinct pair of items // - present user with choice, get valid response // - update rankings based on response // return true iff the matchup is successful bool runMatchup(ItemList &rankings); // write rankings to a file // return true iff successful bool writeRankings(string filename, ItemList rankings); // calculate and return new rank for X based on current ranks for X and Y // and score from X's perspective: 1=X win, 0.5=draw, 0=Y win // return unchanged RankX in the event of an invalid parameter int calcNewRank(int RankX, int RankY, float score); // present the user with a choice between items X and Y // return 1 if they choose X // 0 if they choose Y // 0.5 if it's a draw // (repeat until they make a valid choice) float getChoice(ItemRank* X, ItemRank* Y); // remove rankings from the two sorted source lists, merging them into the destination list bool merge(ItemList &source1, ItemList &source2, ItemList &destination); // sort the rankings from lowest rank to highest void msortRanks(ItemList &rankings);

How our read routine changes

In our lab2 version of reading from a file we could simply put the next item into the next array position. Now we need to create a new item (like the allocate function from the lecture on Feb. 25) and then insert it into the list of items (like the insertAtBack function from the Feb25 lecture).

How picking two random items for a match changes

In our lab2 version for an array of N elements we could simply pick two different random numbers in the range 0..N-1 and grab those two items for our matchup.

In lab3, however, the list doesn't inherently have a size associated with it and we have no way to jump directly to the i'th item in the list.

One possible solution is to add a size field to the list struct: initialize it to 0 and increment it every time you do an insert (so the size always matches the number of elements currently in the list).

Then we can still pick two random numbers in the range 0..N-1 (or 1..N), but we need to scan the list from the front to find the i'th item. This is essentially just a for loop that counts how many items to skip across, then return a pointer to the item you land on. If you look at the example design above, you'll see a getNthItem function that does exactly that.

Mergesort on linked lists

If you're looking at the slides/sample code online, you'll see the mergesort and merge in that example ( mergesort.cpp ) is based on sorting an array, whereas here in lab3 you'll be using mergesort on a linked list (like the msortRanks and merge functions in the design above).

This does involve some extra juggling to translate an array-based solution into a linked-list style. The core steps are still the same for the recursive mergesort function itself (msortRanks above):

split the original linked list into two roughly-equal halves
call mergesort (msortRanks) on each half
merge the two sorted half-lists back into one sorted whole

The big changes are in steps 1 and 3.

- In step one, to split the list in half, we can't simply point at the middle and break the list in half since we only have pointers to the two ends (front and back). What we can do, is create two new empty lists and go through the 'big' list one item at a time, removing them from the big list and alternating between putting them in little list one and little list two.

- In step three, to merge the two lists together into one big one, we just keep looking at the current front item in each of the two little lists, picking the smaller of them and moving it from the little list (e.g. removeFront) and putting it into the big list (e.g. insertBack).

Second week discussion: planning test cases

Just to reiterate points discussed earlier in the term, it's crucial for developers to create good habits in thoroughly testing their own code: both the program as a whole and the individual functions.

A substantial part of this comes down to recognizing what constitutes good test cases: how do we ensure that the code we've written actually does behave as intended, both on 'good/normal' user data and also in its handling of user mistakes.

Going through the specifications/requirements for a product, nearly every sentence should trigger testing ideas in the minds of the developer, covering all the different ways in which valid/invalid data may be provided by the user.

It is important not to make the assumption that our code correctly handles anything, valid or invalid, without actually verifying that with a corresponding test case.

Considering our current program as a case study, there are four key areas where the user provides data:

the filenames provided as command line arguments
the contents of those files
the user's selection of how many head-to-head matchups to run
the user's responses to the individual head-to-head matchups

In each of these areas, we should come up with a list of typical valid data options with an extra focus on edge cases (valid but barely), and typical erroneous data with (again) a focus on edge cases (invalid but barely).

For each of the test scenarios we come up with, we should come up with the precise data to be used for the test case and the precise results that we should then see from the program (all output messages and all changes to the file content).

In our case study, the kinds of test cases we would need (at least) would include:

for the filenames provided as command line arguments:
- a simple valid filename, e.g. "test17", where the file has valid content
- a filename and path, e.g. "tests/whatever"
- no filename (error message expected)
- the name of a file that does not exist (error message expected)
- multiple filenames (error message expected)
for the contents of those files:
- a file with just two simple valid entries
- a file with multiple simple valid entries
- a file with blank lines and whitespace lines in between valid entries, should skip these
- a file where the descriptions are multiple words long with a variety of embedded space
- a file with minimum valid ranks (would also do some with maximum valid ranks if we had defined an upper limit)
- a file with descriptions that are just punctuation (should be ok)
- a file with descriptions that are just single characters (should be ok)
- a file entirely filled with whitespace, error message expected
- test with a completely empty file (no content at all), error message expected
- a file with lines missing the ranks, error message expected, lines skipped
- a file with lines with negative ranks, error message expected, lines skipped
- a file with lines with non-integer ranks, error message expected, lines skipped
- a file with lines missing the descriptions, error message expected, lines skipped
for the user's selection of how many head-to-head matchups to run:
- just one matchup
- two matchups
- some larger number of matchups
- an attempt to use 0 matchups (should error and prompt again)
- an attempt to use negative matchups (should error and prompt again)
- an attempt to use nonnumeric matchups (should error and prompt again)
- an attempt with each of the above errors multiple times (to check the repeat until valid)
for the user's responses to the individual head-to-head matchups:
- a response of F
- a response of f
- a response of S
- a response of s
- a response of D
- a response of d
- a response with any other character (should error and prompt again)
- multiple consecutive invalid responses (to check the repeat until valid)

Before submitting a solution, look at the list above and ask yourself if you have tested your final submission of the lab against each of those scenarios.

Second week discussion: debuggers/gdb

Another key developer skill is that of debugging: being able to track down the root cause of errors in our code. We may learn of those errors through testing (as above) or through feedback from users or other developers, but we must then have a plan of attack for detecting the source of the issue.

The first crucial step is to come up with one or more hypotheses as to the possible sources, and then develop a plan for either confirming or rejecting each hypothesis. Such plans might include refining our test cases to narrow down specific possibilities, and might also include examining the code's behaviour as it executes during a test case. This latter approach is made significantly easier through the use of debuggers: tools which allow us to pause the program as it runs and examine (or even modify) the values of variables and memory to get greater insights as to its actual run-time behaviour (as opposed to what we thought it *should* be doing).

We'll look at the use of one basic debugger, gdb, but most available debuggers and IDEs (integrated development environments, such as VSCode) provide a very similar suite of capabilities.

I strongly recommend you try these steps on your lab3x, and get used to routinely using debuggers as part of your programming toolbox.

Preparing for gdb use
- gdb requires that we compile with the -g option (already done in the makefiles if you're using gdb on your lab code)
Starting gdb and setting up break points
- To start gdb on an executable file use gdb ./filename, e.g.
  gdb ./lab3x
- After printing a bunch of startup/loading information, gdb will provide you with a command prompt that looks like (gdb)
- At this point we'll instruct gdb where we want it to pause and wait for more instructions.
  These can be line numbers or functions within one of the .cpp files, and are set using the break command, e.g. the following would tell it to pause whenever it hits line 39 in lab3.cpp and whenever it starts the runMatchup function
```
break ranking.cpp:runMatchup
break lab3.cpp:39
```
Running/controlling the program through gdb
- To start the program we use run and whatever command line arguments we would ordinarily pass to the program we're testing, e.g. to run lab3x on file tests/posted we would enter the following command at the (gdb) prompt run tests/posted
- The program will run normally (well, maybe a bit slower than normal), including displaying output and waiting for user input at the usual spots.
- If/when the program hits one of the breakpoints we set earlier then it will pause and give us another (gdb) prompt and wait for us to tell it what to do next.
  The most common options are as follows, but there are many many others:
  - tell it to run just one more instruction then pause again using
    step
  - tell it to run just one more instruction then pause again, but treat function calls as a single action (rather than going 'inside' them to step there)
    next
  - get it to display the value of a variable or constant (that is accessible in the current function), e.g. for variable x
    print x
  - set a new value for a(n accessible) variable, e.g. set x to 200
    set variable x = 200
  - show the sequence of function calls that led to the current spot in the program (especially useful right after a crash)
    backtrace
  - show the lines of code immediately before/after the current location in the code
    list
  - tell it to resume running normally (until the next breakpoint) using
    continue
  - tell it to quit/close (it may ask for confirmation to kill the running program)
    quit
  Note that most of these commands can be abbreviated to just a single letter, e.g. c for continue, n for next, etc.
Recommended experimentation Whether in gdb or vscode, practice using debuggers to examine your code's runtime behaviour.
I recommend trying it out today with your lab3x: set some breakpoints and examine the variable values and code sequence with p, n, s, c, etc. It's far handier than needing to embed cout statements to print variable values when you're trying to figure out what/where the code is going wrong as it executes!