CSCI 241 - Homework 6: Sorting it all out
Creating a Unix tool
Due by 11:59.59pm Wednesday, April 12th
The Github URL for this assignment is https://classroom.github.com/a/a3Kgxt2x As there are a number of components to this assignment, I’d encourage you to get started before the last minute.
Introduction
You will be creating a copy of a commonly used Unix tool; a tool that allows you to sort text in various ways.
You’ll also learn about the C99 replacement for atoi() and get some additional experience working with strings.
As with the previous assignment, I’d like you to read through the project specification and then formulate an estimate of the length of time that you expect it will take to complete the project. Record your estimate in the README before you begin coding.
sort
The tool you’ll be creating is sort, modeled after the Unix tool of the same name.
This program reads in lines of text and then sorts them based on specified criteria. By default, it does so using the ordering of strcmp().
Example
INPUT: ALICE bob BOB CAROL alice carol
OUTPUT: ALICE BOB CAROL alice bob carol
Folding case
Notice how all the capital letters come before the lower case ones? That is not a mistake. The numeric value of ‘A’ is 32 less than the value of ‘a’ so it is natural for them to come first. (See notes below if odd behavior occurs.)
However, sometimes you don’t want this to take place. Therefore you should support the -f flag which folds lowercase letters into uppercase ones (i.e., case insensitive comparisons).
OUTPUT: ALICE alice BOB bob CAROL carol
Ordering of otherwise identical upper vs. lower case lines is up to the sort, not something you need to handle.
Numeric sorting
Sometimes, you want to sort lines based on a leading number. To do that you’ll use the -n flag. Note that this should ignore leading whitespace, treat the first thing as a number, then sort based on the rest of the line as before if multiple lines have the same leading number.
INPUT: 0 Alice 99 Bob 20 Carol 172 Eve 76 Mallory 9 Trent 59 Plod 14 Steve
OUTPUT: 0 Alice 9 Trent 14 Steve 20 Carol 59 Plod 76 Mallory 99 Bob 172 Eve
I’d like you to do this by implementing long mystrtol(char *start, char **rest) based on the strtol() function added in to C99 to replace atoi().
This function should take a pointer to the start of a string, skip leading whitespace, and attempt to convert the next characters into a long integer. If it isn’t a number, you should return 0.
The second parameter is where you can pass in the address to a character pointer. If this value is not NULL, you should store the address of the first character after the number you interpreted. If no number was interpreted, you should just set it to the start of the line, whitespace included.
NOTE: The real version of strtol() takes a third parameter that lets you specify the base of the number being interpreted, but we aren’t implementing that.
/* Example of mystrtol */ char line[] = " -56 Cards in a negative deck"; long value; char *rest; value = mystrtol(line, &rest); // After the call, the following should be true assert(value == -56); assert(rest == &line[4]);
Reversing the sort
You sometimes want things in descending order instead of ascending order. To do this, you will support the -r flag to reverse the sort, regardless of the other options specified. So you can do reverse numeric, folded, or normal sorts.
Programming notes
Some guidelines you should follow when working on your solution:
- Options are always given as separate fields. You don't need to handle things like -js for this.
- See the strcmp() man page for pointers to other comparison functions, like those that ignore case.
- If strcmp() seems to already be ignoring case, check the value of the locale command. Do things change if you do something like "LC_COLLATE=C sort" when you run your command?
- Read all of your input in from stdin. You can do this using getchar() -- just don't forget to include a null byte at the end of each line.
- The maximum line length you should read is 1024 bytes. If an input line is longer than this, only store the first 1024 bytes (but you need to consume the rest of the line).
- Use an array of character pointers to store your pointers to lines.
- Use a constant to set the size of the array of char pointers at 1,048,576 lines (1024 * 1024). You will need to dynamically allocate the array (i.e., use malloc).
- Use malloc() to allocate storage for each line you read.
- Use free() at the end of your program to deallocate everything.
- Use valgrind to verify that you did deallocate it all.
- Use the built-in qsort() function to sort your lines.
More on command line arguments
In addition to the flags listed above, I’d like you to implement a -h and -? flag that prints out a brief usage message and then exits the program with a non-zero value. You should do the same behavior if an unknown flag is passed to your program.
You should have your program’s main function return 0 upon successful completion of the assigned task.
handin
README
Create a file called README that contains
- Your name and a description of the program
- A listing of the files with a short one line description of the contents
- Any known bugs or incomplete functions
- Your estimated time from the start and the actual time taken, feel free to speculate as to the reason for any differences
- Your affirmation as to the honor code if you followed it
Now you should make clean to get rid of your executables and handin your folder containing your source files, Makefile, and README.
% git add % git commit % git push
Extra Credit
For Extra Credit, implement QuickSort or MergeSort instead of using the built-in function. Compare how fast your implementation is to the library one.
Grading
Here is what I am looking for in this assignment:
- A working Makefile with your program, all, and clean as targets
- Appropriately modular code
- Good comments
- A working mystrtol() implementation
- All flags working as described above
- Programs which run under valgrind without errors
- A statement about any valgrind errors which occur
- A README with the information above. The listing of known bugs is important.
Last Modified: Oct 29, 2015 - Roberto Hoyle
</body> </html>