CSCI 415/515: Fall 2022
Systems Programming
Project 2: fed

Due: Tue, Sept 27, 7:00am


In this project, you will implement a basic file editor, which we will call fed. Later in the semester, we will learn about ed and sed, which are two powerful command-line file editors.

Name

fed - edit a file

Synopsis

fed [-h] [-s START] [-e END] [-r] [-k] FILE

Description

The fed editor either prints or modifies FILE according to an operation, which the user gives as an option. The operations are:

print
Print (part of) the file to stdout.
remove
Remove bytes from the file.
keep
Keep bytes in the file, removing the others.

The default operation is print. Only one operation may be specified (the process should report an error and exit without performing any operations if the user supplies more than one operation).

An operation usually operates on the range of bytes in the file corresponding to byte indices [START, END), where index 0 is the first byte in the file. If not specified, START defaults to index 0, and END to the file's size (that is, one more than the last byte index). We denote the file's original size as FSIZE.
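To make the half-open range concrete, here is a minimal sketch of the print operation over [START, END). The function name print_range and its outfd parameter are assumptions made for illustration (a real fed would pass STDOUT_FILENO); the sketch also assumes the range has already been validated.

```c
#define _XOPEN_SOURCE 700   /* expose pread under strict -std=c11 */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy bytes [start, end) of path to outfd.
 * Returns 0 on success and -1 on any error. */
int print_range(const char *path, off_t start, off_t end, int outfd)
{
    char buf[4096];
    off_t pos = start;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;

    while (pos < end) {
        off_t left = end - pos;
        size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
        ssize_t n = pread(fd, buf, want, pos);
        ssize_t off = 0;

        if (n <= 0)            /* error, or file shorter than expected */
            break;
        while (off < n) {      /* write may be partial, so loop */
            ssize_t w = write(outfd, buf + off, (size_t)(n - off));
            if (w == -1) {
                close(fd);
                return -1;
            }
            off += w;
        }
        pos += n;
    }

    close(fd);
    return pos == end ? 0 : -1;
}
```

Because END is exclusive, the number of bytes printed is always END - START, and an empty range prints nothing.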

Options

-h, --help

Print a usage statement to stdout and exit with status 0.

-s, --start START

The start index for an operation. If not specified, START defaults to 0. START must be in the range [0, FSIZE]. (N.B., FSIZE is a valid value so that the insert command can append data.)

-e, --end END

The end index for an operation. If not specified, END defaults to the file's size. END must be in the range [0, FSIZE]. It is an error if START > END.
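The constraints on START and END can be checked with two small helpers. This is only a sketch: the names parse_index and range_is_valid are hypothetical, and it assumes the option arguments are plain decimal integers.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: parse a non-negative decimal index from s.
 * Returns 0 and stores the value in *out, or returns -1 if s is empty,
 * has trailing junk, is negative, or overflows. */
int parse_index(const char *s, off_t *out)
{
    char *endp;
    long long v;

    errno = 0;
    v = strtoll(s, &endp, 10);
    if (errno != 0 || endp == s || *endp != '\0' || v < 0)
        return -1;
    *out = (off_t)v;
    return 0;
}

/* Hypothetical helper: nonzero if 0 <= start <= end <= fsize.
 * That single chain implies both START and END lie in [0, FSIZE]. */
int range_is_valid(off_t start, off_t end, off_t fsize)
{
    return start >= 0 && start <= end && end <= fsize;
}
```

With the 26-byte alphabet file, (10, 27) and (10, 9) are rejected, while the empty range (20, 20) is accepted.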

-r, --remove

The remove operation. Remove the bytes in the file from indices [START, END). Any remaining bytes from [END, FSIZE) are shifted down to START. The file's new size is (FSIZE - (END - START)).

-k, --keep

The keep operation. Keep the bytes in the file from indices [START, END), and remove all others. The kept bytes are shifted down to index 0. The file's new size is (END - START).

Exit Status

The exit status is 0 if the operation succeeded, and non-zero (conventionally 1) if the operation failed or the process encountered an error parsing the command-line options.

Bonus 1

Implement the following operation:

-x, --expunge

The expunge operation. Overwrite the bytes in the file from indices [START, END) with * characters. The file size does not change.
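One possible way to sketch expunge is to overwrite the range in place with pwrite. The name expunge_range is an assumption, and the sketch assumes the range has already been validated against FSIZE.

```c
#define _XOPEN_SOURCE 700   /* expose pwrite under strict -std=c11 */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite bytes [start, end) of path with '*'.
 * Returns 0 on success and -1 on error. */
int expunge_range(const char *path, off_t start, off_t end)
{
    char stars[4096];
    off_t pos = start;
    int fd = open(path, O_WRONLY);   /* no O_TRUNC: the size must not change */

    if (fd == -1)
        return -1;
    memset(stars, '*', sizeof(stars));

    while (pos < end) {
        off_t left = end - pos;
        size_t want = left < (off_t)sizeof(stars) ? (size_t)left : sizeof(stars);
        ssize_t n = pwrite(fd, stars, want, pos);

        if (n == -1) {
            close(fd);
            return -1;
        }
        pos += n;
    }
    return close(fd);
}
```

Since every write lands inside [START, END) and END is at most FSIZE, the file never grows.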

Remember to include a blank file called bonus1 in your project submission so that I know to grade this feature.

Bonus 2

Implement the following operation:

-i, --insert STR

The insert operation. Insert the string STR into the file at index START, shifting the existing bytes up. The file's new size is (FSIZE + strlen(STR)).
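Shifting bytes up is the mirror image of shifting them down: the copy has to run from the end of the file toward START, or it would overwrite bytes it has not yet moved. The following is a rough sketch under that assumption; the name insert_string is hypothetical, and a short read or write is simply treated as an error.

```c
#define _XOPEN_SOURCE 700   /* expose pread/pwrite under strict -std=c11 */
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: insert str into path at index start.
 * Returns 0 on success and -1 on error. */
int insert_string(const char *path, off_t start, const char *str)
{
    char buf[4096];
    off_t len = (off_t)strlen(str);
    off_t fsize, pos;
    int fd = open(path, O_RDWR);

    if (fd == -1)
        return -1;
    fsize = lseek(fd, 0, SEEK_END);
    if (fsize == -1 || start < 0 || start > fsize) {
        close(fd);
        return -1;
    }

    /* Move the tail [start, fsize) up by len bytes, last chunk first.
     * Each chunk is fully read into buf before it is written, and its
     * destination only overlaps bytes that were already moved. */
    pos = fsize;
    while (pos > start) {
        off_t left = pos - start;
        size_t want = left < (off_t)sizeof(buf) ? (size_t)left : sizeof(buf);
        off_t from = pos - (off_t)want;

        if (pread(fd, buf, want, from) != (ssize_t)want ||
            pwrite(fd, buf, want, from + len) != (ssize_t)want) {
            close(fd);
            return -1;
        }
        pos = from;
    }

    /* The gap [start, start + len) is now free for the new string. */
    if (pwrite(fd, str, (size_t)len, start) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

When START equals FSIZE the loop body never runs and the final pwrite simply appends, which is why FSIZE is a legal START value.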

Remember to include a blank file called bonus2 in your project submission so that I know to grade this feature.

Simplifying Assumptions

We will assume that our single fed process has exclusive access to the file; thus, no other process will modify the file concurrently. In other words, our program can assume that the file's contents and size do not change from under it during an operation.

Submitting

Submit your project as a zip file via Gradescope. Your project must include a Makefile that builds an executable called fed. Please refer to the instructions for submitting an assignment for details on how to log in to Gradescope and properly zip your project.

Rubric

Input Files: alphabet

When working on the project, it's helpful to have a mapping from byte index to alphabet letter in front of you:


                1111111111222222
      01234567890123456789012345
      abcdefghijklmnopqrstuvwxyz
      

-h, --help


1.1 Print a usage statement (2 pts)


        ./fed --help
        

Prints a usage statement to stdout. The statement must start with either Usage or usage; you decide the rest of the message. Conventionally, this option either prints the synopsis or a more verbose statement that also includes a description of the options.

1.2 Zero exit status (2 pts)


        ./fed --help
        echo $?
        0
        

The exit status is zero.

Bad usage: More than One Operation


2.1 Print error message (1 pt)


        ./fed -k -r -s10 -e15 alphabet
        

Print a one-line error message to stderr (you decide the contents of the message).

2.2 Nonzero exit status (1 pt)


        ./fed -k -r -s10 -e15 alphabet
        

The exit status is nonzero.

Bad usage: Unknown option


3.1 Print error message (1 pt)


        ./fed -s10 -e15 -q alphabet
        

Print a one-line error message to stderr (you decide the message).

3.2 Nonzero exit status (1 pt)


        ./fed -s10 -e15 -q alphabet
        

The exit status is nonzero.

Bad usage: Missing option argument


4.1 Print error message (1 pt)


        ./fed -s -e 9 alphabet
        

Print a one-line error message to stderr (you decide the message).

4.2 Nonzero exit status (1 pt)


        ./fed -s -e 9 alphabet
        

The exit status is nonzero.

Bad Usage: Invalid range


5.1 Negative start error message (2 pts)


        ./fed -s -1 alphabet
        

Print a one-line error message to stderr (you decide the message).

5.2 Negative start nonzero exit status (2 pts)


        ./fed -s -1 alphabet
        

The exit status is nonzero.

5.3 End too large error message (2 pts)


        ./fed --start 10 --end 27 alphabet
        

Print a one-line error message to stderr (you decide the message).

5.4 Start after end (2 pts)


        ./fed -s 10 -e 9 alphabet
        

Print a one-line error message to stderr (you decide the message).

File does not exist


6.1 Print error message (1 pt)


        ./fed nonexistent
        

Prints a message to stderr that includes the text No such file (case-insensitive).

6.2 Nonzero exit status (1 pt)


        ./fed nonexistent
        

The exit status is nonzero.

Print operation


7.1 Cat (5 pts)


        ./fed alphabet
        abcdefghijklmnopqrstuvwxyz$
        

Has the above output (stdout). Note that the output does not have a newline, and so the shell prompt immediately follows.

7.2 Exit status (5 pts)


        ./fed alphabet
        echo $?
        0
        

Exit status is zero.

7.3 Tail (5 pts)


        ./fed -s 10 alphabet
        klmnopqrstuvwxyz$ 
        

Has the above output (stdout). Note that the output does not have a newline, and so the shell prompt immediately follows.

7.4 Head (5 pts)


        ./fed --end 7 alphabet
        abcdefg$ 
        

Has the above output (stdout). Note that the output does not have a newline, and so the shell prompt immediately follows.

7.5 Inner (5 pts)


        ./fed --start 14 --end 24 alphabet
        opqrstuvwx$ 
        

Has the above output (stdout). Note that the output does not have a newline, and so the shell prompt immediately follows.

Remove operation


8.1 Head (5 pts)


        ./fed -r -e5 alphabet
        cat alphabet
        fghijklmnopqrstuvwxyz$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

8.2 Exit status (5 pts)


        ./fed -r -e5 alphabet
        echo $?
        0
        

The exit status is zero.

8.3 Tail (5 pts)


        ./fed -r -s15 alphabet
        cat alphabet
        abcdefghijklmno$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

8.4 Tail explicit (5 pts)


        ./fed -r -s15 -e26 alphabet
        cat alphabet
        abcdefghijklmno$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

8.5 Inner (5 pts)


        ./fed -r -s10 -e20 alphabet
        cat alphabet
        abcdefghijuvwxyz$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

8.6 Empty range (5 pts)


        ./fed -r -s20 -e20 alphabet
        cat alphabet
        abcdefghijklmnopqrstuvwxyz$
        

The file contents are not modified. Note that the output does not have a newline, and so the shell prompt immediately follows.

Keep operation


9.1 Head (5 pts)


        ./fed -k --end=7 alphabet
        cat alphabet
        abcdefg$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

9.2 Exit status (5 pts)


        ./fed -k --end=7 alphabet
        echo $?
        0
        

The exit status is zero.

9.3 Tail (5 pts)


        ./fed -k --start=20 alphabet
        cat alphabet
        uvwxyz$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

9.4 Inner char (5 pts)


        ./fed -k -s20 -e21  alphabet
        cat alphabet
        u$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

9.5 None (5 pts)


        ./fed -k -s5 -e5 alphabet
        cat alphabet
        

The file is now empty.

Bonus 1: Expunge operation


100.1 Head (2 pts)


        ./fed -x -e17 alphabet
        cat alphabet
        *****************rstuvwxyz$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

100.2 Tail (2 pts)


        ./fed -x -s23 alphabet
        cat alphabet
        abcdefghijklmnopqrstuvw***$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

100.3 Inner (2 pts)


        ./fed -x -s5 -e20 alphabet
        cat alphabet
        abcde***************uvwxyz$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

100.4 Inner char (2 pts)


        ./fed --expunge --start 5 --end 6 alphabet
        cat alphabet
        abcde*ghijklmnopqrstuvwxyz$ 
        

The file is modified as shown above. Note that the output does not have a newline, and so the shell prompt immediately follows.

100.5 Exit status (2 pts)


        ./fed --expunge --start 5 --end 6 alphabet
        echo $?
        0
        

The exit status is 0.

Bonus 2: Insert


200.1 Head (2 pts)


        ./fed -i 1234 alphabet
        cat alphabet
        1234abcdefghijklmnopqrstuvwxyz$ 
        

The file is modified as shown above. Note that the file does not have a newline, and so the shell prompt immediately follows.

200.2 Tail (2 pts)


        ./fed -i 12345 -s 26 alphabet
        cat alphabet
        abcdefghijklmnopqrstuvwxyz12345$ 
        

The file is modified as shown above. Note that the file does not have a newline, and so the shell prompt immediately follows.

200.3 Inner char (2 pts)


        ./fed --insert @ -s4 alphabet
        cat alphabet
        abcd@efghijklmnopqrstuvwxyz$ 
        

The file is modified as shown above. Note that the file does not have a newline, and so the shell prompt immediately follows.

200.4 Inner string (2 pts)


        ./fed -i 123456789 -s23 alphabet
        cat alphabet
        abcdefghijklmnopqrstuvw123456789xyz$ 
        

The file is modified as shown above. Note that the file does not have a newline, and so the shell prompt immediately follows.

200.5 Exit status (2 pts)


        ./fed -i 123456789 -s23 alphabet
        echo $?
        0
        

The exit status is 0.

Hint 1

The project description states that the user may specify only one operation (print, remove, keep, expunge, or insert); the program should print a one-line error message and exit with a nonzero status if the user specifies more than one operation.

The following is skeleton code to validate that the user specified only a single operation. The idea is to use a bit mask: each operation corresponds to a bit in an unsigned integer. While processing the command-line arguments with getopt_long, if we notice that the user specified an operation, we set that operation's corresponding bit in the integer variable cmd using the bitwise OR operator (|).

To make the bit mask values more readable, we use the preprocessor (the #define directives) to define the mask for each operation. These #defines use the left shift operator (<<) to set a single bit: CMD_REMOVE is the value 1 (only the first bit is set), CMD_KEEP is the value 2 (only the second bit is set), CMD_EXPUNGE is the value 4 (only the third bit is set), and so on.

When getopt_long finishes parsing the command-line options, we then use a switch statement to check cmd's value against each operation's value: for the default operation (print), this is the zero value (no bits set); for the other operations, it is a value with only a single bit set (the #define'd values). If more than one bit is set, the cmd value matches none of the cases, and we report an error in the switch statement's default case.
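The bit-mask idea described above might look roughly like the sketch below. It is a reconstruction, not the original skeleton: the wrapper function parse_operation and the value CMD_INSERT are assumptions, and the handling of -h, -s, and -e is left out.

```c
#include <getopt.h>
#include <stdio.h>

#define CMD_REMOVE  (1u << 0)   /* value 1 */
#define CMD_KEEP    (1u << 1)   /* value 2 */
#define CMD_EXPUNGE (1u << 2)   /* value 4 */
#define CMD_INSERT  (1u << 3)   /* value 8 (assumed, following "and so on") */

/* Hypothetical helper: scan argv and record which operation was requested.
 * Returns 0 and stores the mask in *cmd_out if at most one operation was
 * given; returns -1 on a usage error. */
int parse_operation(int argc, char *argv[], unsigned int *cmd_out)
{
    static const struct option longopts[] = {
        {"help",    no_argument,       NULL, 'h'},
        {"start",   required_argument, NULL, 's'},
        {"end",     required_argument, NULL, 'e'},
        {"remove",  no_argument,       NULL, 'r'},
        {"keep",    no_argument,       NULL, 'k'},
        {"expunge", no_argument,       NULL, 'x'},
        {"insert",  required_argument, NULL, 'i'},
        {NULL, 0, NULL, 0}
    };
    unsigned int cmd = 0;
    int opt;

    optind = 1;   /* restart scanning in case of an earlier call */
    while ((opt = getopt_long(argc, argv, "hs:e:rkxi:", longopts, NULL)) != -1) {
        switch (opt) {
        case 'r': cmd |= CMD_REMOVE;  break;
        case 'k': cmd |= CMD_KEEP;    break;
        case 'x': cmd |= CMD_EXPUNGE; break;
        case 'i': cmd |= CMD_INSERT;  break;
        case 'h':
        case 's':
        case 'e':
            break;       /* a full program would handle these here */
        default:
            return -1;   /* unknown option or missing argument;
                            getopt_long has already printed a message */
        }
    }

    switch (cmd) {
    case 0:              /* no bit set: the default operation, print */
    case CMD_REMOVE:
    case CMD_KEEP:
    case CMD_EXPUNGE:
    case CMD_INSERT:
        *cmd_out = cmd;
        return 0;
    default:             /* two or more bits set */
        fprintf(stderr, "fed: only one operation may be specified\n");
        return -1;
    }
}
```

For example, -k -r sets cmd to 3 (CMD_KEEP | CMD_REMOVE), which matches none of the single-bit cases and therefore lands in the default branch.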


      

Hint 2

The remove operation is probably the trickiest feature to implement. The way to approach this feature is to break it down into cases of whether we are removing bytes from the front, middle, or end of the file:

Case 1: Remove head
      | Remove | Keep |
Case 2: Remove inner
      | Keep | Remove | Keep |
Case 3: Remove tail
      | Keep | Remove |
Case 4: Remove all
      | Remove |

Case 2 is the most general, and the other cases can be seen as special cases of it. Programmatically, however, it's helpful to note that we can handle Case 3 and Case 4 simply by calling truncate(file, start) to remove the bytes.

The way to handle Case 2 is to open the file twice: once in read mode and once in write mode. Your program will have two file descriptors open, and we'll call the reading one rfd and the writing one wfd. Use lseek to position rfd at the end byte, and to position wfd at the start byte. Read a chunk from rfd, and write the chunk to wfd, repeating until rfd encounters the end of the file. At this point, you will have effectively "shifted" the tail bytes that you are keeping down, overwriting (some or all) of the "remove" bytes. The last thing to do is to truncate the file to the proper smaller size: FSIZE - (end - start). In other words, the new size is the old file size (FSIZE) minus the number of bytes that you were to remove.
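The two-descriptor approach described above can be sketched as follows. The function name remove_range is an assumption; rfd and wfd follow the names used in the hint. This is one possible arrangement, not the required one.

```c
#define _XOPEN_SOURCE 700   /* expose truncate under strict -std=c11 */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: remove bytes [start, end) from path.
 * Returns 0 on success and -1 on error. */
int remove_range(const char *path, off_t start, off_t end)
{
    char buf[4096];
    struct stat st;
    ssize_t n;
    int rfd = -1, wfd = -1;

    if (stat(path, &st) == -1)
        return -1;
    if (start < 0 || start > end || end > st.st_size)
        return -1;

    rfd = open(path, O_RDONLY);
    wfd = open(path, O_WRONLY);
    if (rfd == -1 || wfd == -1)
        goto fail;

    /* The reader starts at END, the writer at START. */
    if (lseek(rfd, end, SEEK_SET) == -1 || lseek(wfd, start, SEEK_SET) == -1)
        goto fail;

    /* Shift the tail down. The writer never passes the reader because
     * start <= end, so no unread byte is overwritten. */
    while ((n = read(rfd, buf, sizeof(buf))) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(wfd, buf + off, (size_t)(n - off));
            if (w == -1)
                goto fail;
            off += w;
        }
    }
    if (n == -1)
        goto fail;

    close(rfd);
    close(wfd);
    /* New size: FSIZE - (END - START). */
    return truncate(path, st.st_size - (end - start));

fail:
    if (rfd != -1)
        close(rfd);
    if (wfd != -1)
        close(wfd);
    return -1;
}
```

The same loop also covers Cases 3 and 4: when END equals FSIZE the first read returns 0, and the final truncate shortens the file to START.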

Once you've implemented a function to handle the remove operation, implementing the keep operation is simple: call your remove function twice; once to remove the tail bytes you are not keeping, and once to remove the head bytes you are not keeping.
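As a sketch of that idea, keep can be written as two removals. To keep the example self-contained it includes a compact stand-in named remove_bytes (a single-descriptor variant using pread and pwrite); both function names are hypothetical, and your own remove function would take its place.

```c
#define _XOPEN_SOURCE 700   /* expose pread/pwrite/ftruncate under -std=c11 */
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Compact stand-in for a remove function: drop bytes [start, end). */
int remove_bytes(const char *path, off_t start, off_t end)
{
    char buf[4096];
    off_t src = end, dst = start;
    ssize_t n;
    int fd = open(path, O_RDWR);

    if (fd == -1)
        return -1;
    while ((n = pread(fd, buf, sizeof(buf), src)) > 0) {
        if (pwrite(fd, buf, (size_t)n, dst) != n) {
            close(fd);
            return -1;
        }
        src += n;
        dst += n;
    }
    /* dst is now start + (FSIZE - end), which is the new file size. */
    if (n == -1 || ftruncate(fd, dst) == -1) {
        close(fd);
        return -1;
    }
    return close(fd);
}

/* Hypothetical helper: keep only bytes [start, end) of path. */
int keep_range(const char *path, off_t start, off_t end)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    if (start < 0 || start > end || end > st.st_size)
        return -1;

    /* Remove the tail first, so that START still refers to the original
     * byte indices when the head is removed. */
    if (remove_bytes(path, end, st.st_size) == -1)
        return -1;
    return remove_bytes(path, 0, start);
}
```

The order matters: removing the head first would shift every remaining byte down by START, and the original END would no longer point at the right place.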