Due Date - 11/19/2009 at 11:59 pm

The goal of this assignment is to learn more about binary search trees by implementing a sorting algorithm called "tree_sort". As you know, when you traverse a binary search tree, you can print all of the data values in numerical order if you do an "in-order" traversal. We can use this information to devise a fast but simple sorting algorithm:

- Initialize an array of N unsorted data values using a random number generator or reading from a file.
- Insert the N values into a binary search tree.
- Extract sorted data into an array using an "in-order" traversal of the binary search tree.

When the data array in in random order, the insertion process will take O(logN) steps to insert each value, so the overall cost of inserting N items is O(N*logN). The traversal phase looks at each data node once, so it takes O(N) steps. Hence, the "tree_sort" algorithm should take O(N*logN) time like "quick_sort" and "merge_sort".

One downside for this algorithm is the memory overhead. Each node in the binary search tree has one data value, and two pointers. Hence, the total space for sorting N values is 4*N (the original array plus the 3 storage locations per node).

The other downside for this algorithm is the worst case behavior. As you saw in lab, the binary tree becomes very tall and skinny, like a linked list, when the data is inserted in sorted order. In this case, the insertions take O(N) steps each, so the algorithm is O(N^2) like "quick_sort" for sorted data. To avoid this worst case behavior we have three options:

- Check for sorted data prior to calling "tree_sort" and do nothing if the data is already sorted.
- Randomize the input data to "unsort" the data before insertion.
- Use a fancy insertion algorithm that traverses the input array in "binary search order" during insertions.

Your program should implement ONE of the methods above to avoid O(N^2) worst case behavior. Do this AFTER you have the basic algorithm implemented and tested.

Your primary design task is to decide what functions to implement in your main program, what methods to add to the binary tree class, and how to call to perform the sorting task. For example, one option would be to implement a function called "tree_sort(int data[], int low, int high)" and BinaryTree methods "Extract(...)" and "ExtractHelper(..., Node *Tree)" that are called by "tree_sort". That way, at some future date you could merge your code with the sort.cpp code if you wanted to compare the speeds of "tree_sort" to other sorting algorithms.

Since we already have an implementation of binary search trees in tree.h and tree.cpp, you should use this code to build your "tree_sort" program. You are also welcome to borrow code from "sort.cpp" as needed to create and initialize arrays with N random values for sorting.

If you look closely at our binary search tree implementation, you
will see that we are currently **discarding** duplicate data values.
You will need to change this so duplicates are correctly stored in the tree.
This should be a minor code change, but with a big impact on correctness.

You need to model your code to extract data from the binary search tree on the current "Print" method. To do so, you will probably need to add some additional parameters and modify the logic to copy data into an output array instead of printing to cout. To test this you can print out data from the array and compare this to what the "Print" command outputs.

Test your program to check that it operates correctly for all of the requirements listed above. Try a few different array sizes, and also unsorted and sorted data to confirm that your program does what you intended. Finally, check the error handling capabilities of your code. Save your testing output in text files for submission on the program due date.

When you have completed your C++ program, write a short report (less than one page long) describing what the objectives were, what you did, and the status of the program. Does it work properly for all test cases? Are there any known problems? Save this report in a separate text file to be submitted electronically.

In this class, we will be using electronic project submission to make sure that all students hand their programming projects and labs on time, and to perform automatic analysis of all programs that are submitted. When you have completed the tasks above go to the class web site to "submit" your documentation, C++ program, and testing files.

The dates on your electronic submission will be used to verify that you met the due date above. All late projects will receive reduced credit (50% off if less than 24 hours late, no credit if more than 24 hours late), so hand in your best effort on the due date.

You should also PRINT a copy of these files and hand them into your teaching assistant in your next lab. Include a title page which has your name and uaid, and attach your hand written design notes from above.