CSCE 2014 - Homework 2
Due Date - 02/23/2012 at 11:59 PM

1. Problem Statement:

The purpose of this assignment is to give students experience using linked lists to solve data processing problems. Your task is to read the words of a book stored in an ASCII file and perform a variety of operations on these words. Linked lists are ideal for this task because we do not know in advance what words to expect in the book.

The goal of your program will be to process the input file and calculate the following interesting facts about the book:
1) How many words start with each of the 26 letters 'a' to 'z'?
2) Which letter starts the fewest words?
3) Which letter starts the most words?
4) Which words in the book start with a given letter?
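
Questions 2 and 3 reduce to finding the smallest and largest entries in an array of 26 counts. The sketch below assumes the counts for 'a' to 'z' have already been stored in an int array. The function names are invented for this example, and breaking ties in favor of the earliest letter is an arbitrary choice.

```cpp
// Returns the index (0 for 'a' up to 25 for 'z') of the smallest count.
// Ties go to the earliest letter.
int FewestIndex(const int counts[26])
{
    int best = 0;
    for (int i = 1; i < 26; i++)
        if (counts[i] < counts[best])
            best = i;
    return best;
}

// Returns the index (0 for 'a' up to 25 for 'z') of the largest count.
// Ties go to the earliest letter.
int MostIndex(const int counts[26])
{
    int best = 0;
    for (int i = 1; i < 26; i++)
        if (counts[i] > counts[best])
            best = i;
    return best;
}
```

An index converts back to a letter with static_cast<char>('a' + index). You may also want to decide whether a letter that starts no words at all should be reported as the one with the fewest.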

To test and evaluate your program, you can use all or part of "copperfield.txt", "crusoe.txt", or "time-machine.txt" in the CSCE 2014 source folder. These books were downloaded from http://www.gutenberg.org, which is a great place to find free books to read (or process).

2. Design:

To answer these questions, you will need to create an array of 26 linked lists to store the words that begin with 'a' to 'z'. You should use the List class defined in "list.h" and "list.cpp" to do this. The basic Insert, Search, and Delete methods should let you store the word data, and you can use Print to see the contents of a linked list.
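
One way to pick the right list for a word is to turn its first letter into an array index. The helper below is a sketch only: LetterIndex is a made-up name that is not part of "list.h", and returning -1 for words that do not start with a letter is an assumption about how you want to handle such input.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of list.h): maps a word to an index from
// 0 ('a') to 25 ('z') based on its first letter, ignoring case.
// Returns -1 if the word is empty or does not start with a letter.
int LetterIndex(const std::string &word)
{
    if (word.empty())
        return -1;
    unsigned char first = static_cast<unsigned char>(word[0]);
    if (!std::isalpha(first))
        return -1;
    return std::tolower(first) - 'a';
}
```

With an array declared as List lists[26], a word whose index is not -1 would then be stored in lists[LetterIndex(word)] using whatever Insert signature "list.h" declares.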

The tricky part of the problem is counting words. In particular, what should you do when you see a word multiple times? Add it to the linked list multiple times? Update a counter of how many times you have seen the word? You will get different answers to questions 1-3 based on your choice. My advice is to start with the easiest solution, debug your program, and then try the more complex solution once the easy one works.
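
To make the two choices concrete, here is a small stand-in list that supports both. It is an illustration under assumptions, not the List class from "list.h": the names WordNode, WordList, InsertAlways, and InsertOrCount are invented for this sketch.

```cpp
#include <string>

// Stand-in node for illustration only; the node in list.h may differ.
struct WordNode
{
    std::string word;
    int count;
    WordNode *next;
};

class WordList
{
public:
    WordList() : head(nullptr) {}
    ~WordList()
    {
        while (head != nullptr)
        {
            WordNode *old = head;
            head = head->next;
            delete old;
        }
    }
    WordList(const WordList &) = delete;
    WordList &operator=(const WordList &) = delete;

    // Easy choice: always add a node, so a repeated word is stored repeatedly.
    void InsertAlways(const std::string &word)
    {
        head = new WordNode{word, 1, head};
    }

    // More complex choice: search first and bump the counter if the word
    // is already present; add a node only for a new word.
    void InsertOrCount(const std::string &word)
    {
        for (WordNode *p = head; p != nullptr; p = p->next)
        {
            if (p->word == word)
            {
                p->count++;
                return;
            }
        }
        head = new WordNode{word, 1, head};
    }

    // Number of nodes in the list.
    int NodeCount() const
    {
        int n = 0;
        for (WordNode *p = head; p != nullptr; p = p->next)
            n++;
        return n;
    }

    // Sum of all counters, i.e. how many words were inserted in total.
    int TotalCount() const
    {
        int n = 0;
        for (WordNode *p = head; p != nullptr; p = p->next)
            n += p->count;
        return n;
    }

private:
    WordNode *head;
};
```

Inserting "the", "the", "tree" with InsertAlways gives three nodes, while InsertOrCount gives two nodes whose counters sum to three. That difference is why the answers to questions 1-3 depend on your choice.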

Another problem is counting the number of words that start with a given letter. To answer this, you need to know the number of nodes in a linked list. Since there is no built-in method for this, you need to extend the List class to provide one. There are several design choices here, each with different pros and cons. Again, my advice is to keep it as simple as possible.
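
Two common designs are to walk the list and count nodes each time, or to keep a counter that Insert and Delete update. The sketch below shows both on a stand-in class. CountedList and its method names are assumptions made for this example; the real List class in "list.h" may store its data differently.

```cpp
#include <string>

// Stand-in list for illustration only; not the List class from list.h.
class CountedList
{
private:
    struct Node
    {
        std::string word;
        Node *next;
    };
    Node *head;
    int size; // kept up to date by Insert and Delete

public:
    CountedList() : head(nullptr), size(0) {}
    ~CountedList()
    {
        while (head != nullptr)
        {
            Node *old = head;
            head = head->next;
            delete old;
        }
    }
    CountedList(const CountedList &) = delete;
    CountedList &operator=(const CountedList &) = delete;

    // Adds a word at the front of the list.
    void Insert(const std::string &word)
    {
        head = new Node{word, head};
        size++;
    }

    // Removes the first node holding word; returns true if one was found.
    bool Delete(const std::string &word)
    {
        Node **link = &head;
        while (*link != nullptr && (*link)->word != word)
            link = &(*link)->next;
        if (*link == nullptr)
            return false;
        Node *old = *link;
        *link = old->next;
        delete old;
        size--;
        return true;
    }

    // Design 1: walk the list. No extra state, but takes time
    // proportional to the list length on every call.
    int CountByTraversal() const
    {
        int n = 0;
        for (Node *p = head; p != nullptr; p = p->next)
            n++;
        return n;
    }

    // Design 2: return the stored counter. Constant time, but every
    // method that adds or removes a node must remember to update it.
    int Length() const
    {
        return size;
    }
};
```

The traversal version is the simpler change to an existing class, since it touches no other method. The counter version is faster to query but is easy to get wrong if one method forgets to update it.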

3. Implementation:

You can implement this program using either a bottom-up approach or a top-down approach. If you go for a bottom-up approach, start by creating basic methods and classes, and test these methods using a simple main program that calls each method. When this is working, you can create the main program that uses these methods to solve the problem above.

If you go for a top-down approach, start by creating a main program that reads user input and calls empty methods that pretend to solve the problem. Then add in the code for these methods one at a time. This way, you will get an idea of how the whole program will work before you dive into the details of implementing each method and class.

Regardless of which technique you choose, you should develop your code incrementally: add code, compile, and debug a little bit at a time. This way, you always have a program that "does something" even if it is not complete.

When you think you are about halfway through the program, upload a copy of your source code and your program output at that point. Be sure to hand in something that compiles even if it does not do much when it runs.

4. Testing:

Test your program to check that it operates correctly for all of the requirements listed above. Also check the error-handling capabilities of your code. Try your program on two or three input documents, and save your testing output in text files for submission on the program due date.

5. Documentation:

When you have completed your C++ program, write a short report (less than one page long) describing what the objectives were, what you did, and the status of the program. Does it work properly for all test cases? Are there any known problems? Save this report in a separate text file to be submitted electronically.

6. Project Submission:

In this class, we will be using electronic project submission to make sure that all students hand in their programming projects and labs on time, and to perform automatic analysis of all programs that are submitted. When you have completed the tasks above, go to the class web site to "submit" your documentation, C++ program, and testing files.

The dates on your electronic submission will be used to verify that you met the due date above. All late projects will receive reduced credit (50% off if less than 24 hours late, no credit if more than 24 hours late), so hand in your best effort on the due date.

You should also PRINT a copy of these files and hand them in to your teaching assistant in your next lab. Include a title page that has your name and UAID, and attach your handwritten design notes from above.