- Introduction to Sorting
- Generating Unsorted Data
- Comparing Bubble Sort and Selection Sort
- bubble_sort on N unsorted data (use random_data)
- bubble_sort on N semi-sorted data (use shuffle_data)
- bubble_sort on N sorted data (use sort twice)
- selection sort on N unsorted data (use random_data)
- selection sort on N semi-sorted data (use shuffle_data)
- selection sort on N sorted data (use sort twice)
- Submit Work

The purpose of this lab is to evaluate the performance of two well known sorting algorithms: bubble sort and selection sort. We will do this by calculating the cpu times to sort different types of input data (unsorted, semi-sorted, and sorted) using different sorting algorithms. As we will see there can be some dramatic differences in performance.

The purpose of sorting is to put a sequence of data records in ascending or descending order based on a sorting key. For example, to sort student records based on last name, or sort baseball player records based on batting average. The problem of sorting data has been around a long, long time. See the Wikipedia entries for "Hollerith card" and "tabulating machine" for more information.

There have been dozens of different sorting algorithms invented over the years, each with different advantages and disadvantages. In the next two labs, we will be studying six classic sorting techniques: bubble sort, insertion sort, selection sort, quick sort, merge sort, and bucket sort. These are described in more detail in your text book, and also on the "sorting" Wikipedia page.

There are two ways to generate unsorted data for sorting experiments. The first way is to use a random number generator and create N values within a range [L..H]. This is like rolling dice over and over again. The second method starts with sorted data and randomly shuffles data around to create unsorted data. This is essentially what we do when we shuffle a deck of cards. These two approaches are used in the code below to initialize an array of integers.

//------------------------------------------------------------------ // Initialize data array with random values between 0 and range-1 //------------------------------------------------------------------ void random_data(int data[], int count, int range) { // Put specified count of random numbers into data array for (int index = 0; index < count; index++) data[index] = rand() % range; } //------------------------------------------------------------------ // Shuffle data values inside data array //------------------------------------------------------------------ void shuffle_data(int data[], int count, int shuffles) { // Shuffle data by swapping random pairs of values for (int index = 0; index < shuffles; index++) { int pos1 = rand() % count; int pos2 = rand() % count; int temp = data[pos1]; data[pos1] = data[pos2]; data[pos2] = temp; } } //------------------------------------------------------------------ // Write data array to output file //------------------------------------------------------------------ void write_data(string name, int data[], int count) { // Open output file ofstream dout; dout.open(name.c_str()); if (dout.fail()) cout << "Error: could not open output file\n"; // Write the data dout << count; for (int i = 0; i < count; i++) { if (i % 20 == 0) dout << endl; dout << data[i] << " "; } // Close the file dout.close(); }

Create a new program in NetBeans and cut and paste the
functions "random_data", "shuffle_data" and "write_data"
above into your program. Now create a main program that declares
an array of integers, and calls "random_data" and "write_data"
to create a data file "random.txt" with 100 random values
between 0 and 99. If you look at this file, it should look
almost as *unsorted* as possible.

Now extend your main program to call the "shuffle_data"
function on an array of 100 integers to create the file
"shuffle.txt". Before you call the "shuffle_data" function,
you need to initialize the array to contain values
[0, 1, 2, ... 99] in ascending order. Try a few different
values of the "shuffles" parameter to choose a value
that creates data that looks *semi-sorted* (some sequences of
values are in the correct order, but others are out of order).

Cut and paste your main program below:

Cut and paste one of your shuffle_data results below:

Cut and paste the functions "bubble_sort" and "selection_sort" below into your program.

//----------------------------------------------------------------- // Bubble sort algorithm //----------------------------------------------------------------- void bubble_sort(int data[], int count) { int pass = 1; int exchange = 1; // Bubble largest value to the right N times while ((pass < count) && (exchange > 0)) { // Scan unsorted part of data array exchange = 0; for (int index = 0; index < count - pass; index++) { // Swap two data values if out of order if (data[index] > data[index + 1]) { int temp = data[index]; data[index] = data[index + 1]; data[index + 1] = temp; exchange++; } } pass++; } } //--------------------------------------------------------------- // Selection sort algorithm //--------------------------------------------------------------- void selection_sort(int data[], int low, int high) { // Put largest unsorted value at end of sorted list for (int last = high; last > low; last--) { // Find index of largest value in unsorted array int largest = low; for (int index = low + 1; index <= last; index++) if (data[index] > data[largest]) largest = index; // Swap with last element in unsorted array int temp = data[last]; data[last] = data[largest]; data[largest] = temp; } }

Now we are ready to do some experiments to compare the speed of these two sorting algorithms. Your next task is to modify your main program to calculate the CPU times for the following tasks:

Use the code below to measure the CPU time. Calculate the run times for N=1000. If the times are too small to measure, increase N as needed until you can get a CPU time greater than a few seconds. Stop your program if it uses more than a minute of CPU time and decrease N accordingly.

// Get start time clock_t time1=clock(); // Some chunk of code // Get end time clock_t time2=clock(); double run_time = (time2-time1)/(double)CLOCKS_PER_SEC; cout << "Run time: " << run_time << " seconds\n";

Cut and paste your modified main program below:

Enter the results of your six timing experiments below:

This lab assignment will be submitted electronically to the TAs once you fill in the fields below and click on the "submit" button. You do NOT need to print a copy of this document to hand in.

Your UAID number: | |

Your website PASSWORD: |