CSCE 5013 - Homework 2
Due Date - September 29, 2011 at 11:59 PM

1. Problem Statement:

The goal of this programming assignment is to perform the character matching portion of optical character recogition (OCR). You will be given a folder of license plate letter images that have been automatically extracted from car photographs. These letter images are of different sizes and may have distortion due to motion blur or uneven illumination. Your task is to compare these N images to each other to determine which characters match each other. Students have the option of performing either "template matching" or "feature matching" to compare characters. Your NEXT programming assignment will combine automatic letter extraction from car images, character identification via template or feature matching, and spatial analysis of letters to perform full OCR on licence plate images.

2. Design:

The letter images that are provided in the src/hw2/letters folder were automatically extracted from the car images in src/hw2/cars. As you will see, these images have a wide range of sizes and shapes, so one of your first tasks is to "normalize for size". One option is to use the "Interpolate" or "MakeIcon" method in your program to resize each letter image prior to template matching or feature matching. Another option is to embed size normalization in the matching algorithm itself.

The letter images are stored in grey scale JPEG images, so you will need to convert these into a B/W image using either the "Threshold" or "ThresholdAutomatic" methods. If the output images look rough or noisy, you may want to smooth the input image or the threshold image using "Average", "Binomial", "Median" or another smoothing method. You can use the "menu" command to try a few of these smoothing techniques to select what you want for your program.

The "template matching" approach requires that we compare each pixel in the input image to the corresponding pixel in the template image. To do this, you can use (a) sum of matching black pixels, (b) sum of matching white pixels, (c) sum of matching black or white pixels, or (d) Euclidean distance to get a match score. Even with "size normalization" it is possible that the input image will not be perfectly aligned with the template image. For this reason, you need to consider a range of possible alignment offsets between the input image and the template image in order to find the best match score. For example, if the offset is (dx,dy) this will mean that you need to compare input[y+dy][x+dx] to template[y][x] for all valid (x,y) values. Make sure you check that (x+dx,y+dy) are valid image coordinates and perform zero padding or wrap around as needed.

The "feature matching" approach requires that we calculate K features for each letter image, and then we compare these features using either Euclidean distances (sum of feature differences squared) or Absolute distances (sum of absolute feature differences) to get a match score. Since features are typically much smaller than the images themselves feature matching is faster than template matching in most applications. The easiest letter features to calculate are central moments:

M0 = area
M1(x) = meanx
M1(y) = meany
M2(xx) = variance xx = sum[(x-meanx)(x-meanx)]/area
M2(xy) = covariance xy = sum[(x-meanx)(y-meany)]/area
M2(yy) = variance yy = sum[(y-meany)(y-meany)]/area
Other letter features can be computed by dividing the input image into N by M regions, and calculating the percentage of black pixels in each sub-region. This is logically very similar to interpolating to an N by M image and doing template matching.

Once you have implemented and tested your function for matching one input image to one template image, you can create a main program that does an "exhaustive search" of all possible character matches. For example, if you have L letter images, you need to match each letter to the L-1 other letters to see which gives the BEST match score. For evaluation purposes, output your results in an ascii file with one line per image in the following format: "best_match_score input_image_name matching_image_name". If you sort this ascii file using "sort -n" you can see which characters had the best match scores and which had the worst match scores, and look at the corresponding images to see if they match correctly or not. You can also use the "menu subtract image1 image2 image1-image2" command to create a difference image to look at.

3. Implementation:

You can implement this program using either a bottom-up approach or a top-down approach. If you go for a bottom-up approach, start by creating basic methods and classes, and test theses methods using a simple main program that calls each method. When this is working, you can create the main program that uses these methods to solve the problem above.

If you go for a top-down approach, start by creating your main program that reads user input, and calls empty methods to pretend to solve the problem. Then add in the code for these methods one at a time. This way, you will get an idea of how the whole program will work before you dive into the details of implementing each method and class.

Regardless of which technique you choose to use, you should develop your code incrementally adding code, compiling, debugging, a little bit at a time. This way, you always have a program that "does something" even if it is not complete.

When you think you are about 1/2 way through the program, upload a copy of your source code and your program output at that point. Be sure to hand in something that compiles even if it does not do much when it runs.

4. Testing:

Test your program to check that it operates correctly for all of the requirements listed above. Also check for the error handling capabilities of the code. Try your program on 2-3 input documents, and save your testing output in text files for submission on the program due date.

5. Documentation:

When you have completed your C++ program, write a short report (1-2 pages long) describing what the objectives were, what you did, and the status of the program. Does it work properly for all test cases? Are there any known problems? Save this report in a separate text file to be submitted electronically.

6. Project Submission:

In this class, we will be using electronic project submission to make sure that all students hand their programming projects and labs on time, and to perform automatic analysis of all programs that are submitted. When you have completed the tasks above go to the class web site to "submit" your documentation, C++ program, and testing files.

The dates on your electronic submission will be used to verify that you met the due date above. All late projects will receive reduced credit (50% off if less than 24 hours late, no credit if more than 24 hours late), so hand in your best effort on the due date.

You should also PRINT a copy of these files and hand them into your teaching assistant in your next lab. Include a title page which has your name and uaid, and attach your hand written design notes from above.