As our programs grow in complexity, we will need to support more complicated command-line usages beyond mandatory positional arguments, to also include optional arguments. For instance, while the basic usage of grep is:
grep PATTERN FILE
grep also includes optional flag arguments, like:
grep -i PATTERN FILE # case-insensitive search
as well as options that take an argument:
grep -m 5 PATTERN FILE # stop searching after 5 matches
These options can be used together, and the short forms can even be "grouped" behind a single hyphen:
grep -i -m 5 PATTERN FILE # case-insensitive; stop searching after 5 matches
grep -im 5 PATTERN FILE # equivalent syntax
Moreover, each of these options has an equivalent long name:
grep --ignore-case PATTERN FILE # same as -i
grep --max-count 5 PATTERN FILE # same as -m
grep --max-count=5 PATTERN FILE # alternative syntax that uses '='
Writing your own code to parse command-line options like these is tedious and error-prone. Fortunately, libc provides us with the getopt and getopt_long library functions.
The major difference between these two functions is that getopt handles only short options (e.g., -i and -m) and is a POSIX standard, whereas getopt_long handles both short and long (e.g., --ignore-case, --max-count) options, but is a GNU extension (though widely supported). Since long options are a nice feature that requires little additional work on our part, we will focus on how to use getopt_long.
In this example, we will implement the command-line parsing for a subset of grep's options, namely:
-h, --help: Show usage statement and exit.
-i, --ignore-case: Perform a case-insensitive match for PATTERN.
-m NUM, --max-count NUM: Stop reading FILE after NUM matching lines.
We will call our program grep_stub to denote that the program only performs the command-line parsing, and does not actually search a file. We will build up the program in steps, and then show the complete source code at the end.
The basic template for using getopt_long is:
const char *short_opts = ":him:";
struct option long_opts[] = {
    {"help",        no_argument,       NULL, 'h'},
    {"ignore-case", no_argument,       NULL, 'i'},
    {"max-count",   required_argument, NULL, 'm'},
    {NULL, 0, NULL, 0}
};
int opt;

while (1) {
    opt = getopt_long(argc, argv, short_opts, long_opts, NULL);
    if (opt == -1)
        break;

    switch (opt) {
    case 'h': /* ... */
        break;
    case 'i': /* ... */
        break;
    case 'm': /* ... optarg points to the option's argument */
        break;
    case '?': /* ... unknown option */
        break;
    case ':': /* ... option missing required argument */
        break;
    default:
        /* ... unexpected; here for completeness */
        break;
    }
}
On each successive call, getopt_long parses the next option it finds in argv and returns a value corresponding to that option. We then use a switch statement to determine the specific option to process. After getopt_long has parsed all options, it returns -1.
When encountering a short option, the value that getopt_long returns is the value of the character. When encountering a long option, we have a few choices on what we would like getopt_long to return based on how we specify the struct option long_opts array on line 2. Here, we use the conventional approach of having getopt_long return the same value as for the short option; we achieve this by specifying that the third element of a struct option is NULL, and the fourth element is the corresponding short option character.
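One of the other choices mentioned above is to have getopt_long store a value in a variable rather than return a short option character. The following is a minimal sketch of that variant, not part of grep_stub: the --verbose option, the verbose_flag variable, and the count_ignore_case function are all hypothetical names introduced for illustration. When the third element of a struct option is non-NULL, getopt_long returns 0 and writes the fourth element into the int that the third element points to.

```c
#define _GNU_SOURCE
#include <getopt.h>
#include <stddef.h>

/* Hypothetical setting for illustration; not part of grep_stub. */
static int verbose_flag = 0;

/*
 * Return the number of -i / --ignore-case occurrences in argv,
 * or -1 on an unknown option. --verbose is handled through the
 * flag pointer, so it shows up in the switch only as the value 0.
 */
static int
count_ignore_case(int argc, char *argv[])
{
    static const struct option long_opts[] = {
        /* third element non-NULL: return 0 and set verbose_flag = 1 */
        {"verbose",     no_argument, &verbose_flag, 1},
        /* third element NULL: return the fourth element, 'i' */
        {"ignore-case", no_argument, NULL, 'i'},
        {NULL, 0, NULL, 0}
    };
    int opt;
    int count = 0;

    optind = 1; /* restart scanning; only matters if called more than once */
    while ((opt = getopt_long(argc, argv, ":i", long_opts, NULL)) != -1) {
        switch (opt) {
        case 0:   /* a flag-style long option was already stored for us */
            break;
        case 'i':
            count++;
            break;
        default:  /* '?' or ':' */
            return -1;
        }
    }
    return count;
}
```

The flag-style approach saves a case label for simple on/off long options, at the cost of having no short equivalent unless one is handled separately.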
The short_opts variable on line 1 is a string of the short options. An option that takes a required argument is followed by a :. The leading : suppresses getopt_long's normal (terrible) error handling. By default getopt_long prints an error message if it cannot parse the options; we'd prefer to print our own messages. Also by default, getopt_long returns '?' on error: either an unknown option or an option that is missing a required argument. The leading : says to disambiguate these two cases by having getopt_long return '?' for an unknown option and ':' for an option that is missing a required argument.
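To make the two error cases concrete, here is a small sketch that uses the same option string and reports which kind of error occurred. The enum parse_status and the function check_options are hypothetical names used only for this illustration, and the error messages are placeholders. The sketch relies on optopt, which getopt_long sets to the offending option character.

```c
#define _GNU_SOURCE
#include <getopt.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical result type for this illustration. */
enum parse_status { PARSE_OK, PARSE_UNKNOWN_OPT, PARSE_MISSING_ARG };

static enum parse_status
check_options(int argc, char *argv[])
{
    static const struct option long_opts[] = {
        {"help",        no_argument,       NULL, 'h'},
        {"ignore-case", no_argument,       NULL, 'i'},
        {"max-count",   required_argument, NULL, 'm'},
        {NULL, 0, NULL, 0}
    };
    int opt;

    optind = 1; /* restart scanning; only matters if called more than once */
    /* The leading ':' silences getopt_long's own messages. */
    while ((opt = getopt_long(argc, argv, ":him:", long_opts, NULL)) != -1) {
        switch (opt) {
        case '?':
            /* On glibc, optopt is 0 for an unknown long option; in that
             * case fall back to printing the argument just scanned. */
            if (optopt != 0)
                fprintf(stderr, "unknown option -%c\n", optopt);
            else
                fprintf(stderr, "unknown option %s\n", argv[optind - 1]);
            return PARSE_UNKNOWN_OPT;
        case ':':
            fprintf(stderr, "option -%c requires an argument\n", optopt);
            return PARSE_MISSING_ARG;
        default:
            break; /* 'h', 'i', 'm': nothing to do in this sketch */
        }
    }
    return PARSE_OK;
}
```

Note that a missing argument is only detected when the option is the last word on the command line: in grep_stub -m PATTERN FILE, getopt_long takes PATTERN as the argument to -m, so validating the value is still up to the program.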
The following is the complete main function. For now, disregard mu_str_to_int and the die_* functions; the former simply converts a string to an int, and the latter print an error and terminate the program.
The important points are:
The complete grep_stub program appears below. A few important things to note. First, as per the Feature Test Macros section of the getopt_long man page, to ensure that the libc header files expose the getopt_long function, we must include the following define before including any header files:
#define _GNU_SOURCE
You can read about such defines on the feature_test_macros man page; _GNU_SOURCE is a very common one.
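Second, the usage statement uses the trick that the compiler concatenates adjacent C string literals that are separated only by whitespace (including newlines), which lets us write one long string across several source lines.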
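As a sketch of the concatenation trick, the usage text below is written as several adjacent literals that the compiler joins into one string. The exact wording of the text and the names usage_text and print_usage are assumptions for illustration, not necessarily those of the full program.

```c
#include <stdio.h>

/* Adjacent string literals are joined at compile time into one string. */
static const char usage_text[] =
    "Usage: grep_stub [-h] [-i] [-m NUM] PATTERN FILE\n"
    "\n"
    "  -h, --help            Show usage statement and exit.\n"
    "  -i, --ignore-case     Perform a case-insensitive match for PATTERN.\n"
    "  -m, --max-count NUM   Stop reading FILE after NUM matching lines.\n";

/* Write the usage statement to the given stream (stdout or stderr). */
static void
print_usage(FILE *stream)
{
    fputs(usage_text, stream);
}
```

Keeping one literal per output line makes the source mirror what the user sees, and avoids a separate printf call for every line.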
Finally, the order of optional arguments and positional arguments is immaterial: getopt_long will permute argv such that, after it is finished processing the options, all positional arguments are at the end of argv, starting at index optind. Note that some programs require that options come first and positional arguments last, as this is the POSIX-ly correct behavior; getopt_long includes ways to enable such behavior, if desired (for example, a leading '+' in the short options string, or setting the POSIXLY_CORRECT environment variable).
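The permutation can be observed directly. The sketch below, using a hypothetical helper named first_positional, runs the parsing loop and returns optind, which is the index of the first positional argument once getopt_long has finished. It assumes the GNU default behavior, that is, POSIXLY_CORRECT is not set in the environment.

```c
#define _GNU_SOURCE
#include <getopt.h>
#include <stddef.h>

/*
 * Parse all options and return the index in argv of the first
 * positional argument, or -1 on a parsing error. As a side effect,
 * getopt_long reorders argv so that the positional arguments occupy
 * argv[optind] through argv[argc - 1].
 */
static int
first_positional(int argc, char *argv[])
{
    static const struct option long_opts[] = {
        {"help",        no_argument,       NULL, 'h'},
        {"ignore-case", no_argument,       NULL, 'i'},
        {"max-count",   required_argument, NULL, 'm'},
        {NULL, 0, NULL, 0}
    };
    int opt;

    optind = 1; /* restart scanning; only matters if called more than once */
    while ((opt = getopt_long(argc, argv, ":him:", long_opts, NULL)) != -1) {
        if (opt == '?' || opt == ':')
            return -1;
    }
    return optind;
}
```

For a command line such as grep_stub PATTERN -i FILE --max-count=5, the program can check that argc - optind equals 2 and then read PATTERN and FILE from argv[optind] and argv[optind + 1], regardless of where the user placed the options.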
The complete program is: