Why getopt?

I use getopt almost exclusively in software that I write by myself, and often insist on using it when collaborating with others, even when the language’s convention is to use something else.

The reason is simple: getopt is a part of the user interface, and user interfaces should strive to be simple and consistent. As an end user, I find it jarring when, for example, I have to run a script by specifying the interpreter by hand, or when the language-specific extension is a part of the file name. These are implementation details which should not concern me - the #! should take care of that for me. Argument parsing is no different: getopt is over 40 years old, is supported nearly universally, and is easy to understand both for the user and the programmer.

It’s a matter of UX

Users don’t like to be surprised when interacting with a program. If the platform’s convention is to put the “OK” button on the right, and the “cancel” button on its left, presenting them in the opposite order is like laying a trap; even if you don’t get tripped by it, you still have to spend additional energy on interpreting the situation.

It’s the same with command line argument parsing. Some people might be used to typing rm -rf, others have rm -fr in their muscle memory.

However, a program written using e.g. Go’s flag package might trip someone up, since a single dash is allowed to introduce a long option, rather than a set of short options; in an extreme example, -fr and -rf can mean completely different things.
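For contrast, here is a minimal sketch of the getopt behaviour, using Python’s standard getopt module. The option letters r and f merely mirror the rm example, and the parse helper is a made-up name for illustration. A cluster of short options is expanded one letter at a time, so the order within the cluster does not change which options end up set:

```python
import getopt

def parse(argv):
    """Return the set of short options found in argv (hypothetical helper)."""
    opts, _args = getopt.getopt(argv, "rf")
    return {opt for opt, _value in opts}

# Both spellings expand to the same two options.
print(parse(["-rf", "dir"]) == parse(["-fr", "dir"]))  # True
print(sorted(parse(["-rf", "dir"])))                   # ['-f', '-r']
```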

It’s a matter of code and documentation quality

Complex libraries, such as Python’s argparse, hide what is actually going on in your program’s argument handling code. While they allow very fancy things to be expressed tersely, the actual logic becomes opaque to the reader. Consider this example from argparse’s introduction:

import argparse

parser = argparse.ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

args = parser.parse_args()
print(args.accumulate(args.integers))

Here’s (almost) identical logic written using getopt:

import sys
import getopt

# getopt.getopt() returns (options, remaining positional arguments)
opts, args = getopt.getopt(sys.argv[1:], "", ["sum"])
func = max
for opt, arg in opts:
    if opt == "--sum":
        func = sum
print(func(int(arg) for arg in args))

Now there are of course two things missing (which makes for a very good counter-argument against getopt): documentation, and validation/error handling.
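To make the second gap concrete, here is a small sketch that wraps the same getopt logic in a function. The name accumulate and passing argv as a parameter are my assumptions for the sake of the demonstration. Each of the three inputs below ends in an exception that nothing catches in the short version:

```python
import getopt

def accumulate(argv):
    """The short getopt logic, taking argv as a parameter (hypothetical wrapper)."""
    opts, args = getopt.getopt(argv, "", ["sum"])
    func = max
    for opt, _value in opts:
        if opt == "--sum":
            func = sum
    return func(int(arg) for arg in args)

# An unknown option, a non-integer argument, and no arguments at all.
for argv in (["--bogus", "1"], ["1", "x"], []):
    try:
        accumulate(argv)
    except (getopt.GetoptError, ValueError) as err:
        # Without this handler the user would see a raw traceback.
        print(type(err).__name__, "-", err)
```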

Let’s have another look at the documentation that was auto-generated by argparse:

usage: prog.py [-h] [--sum] N [N ...]

Process some integers.

positional arguments:
 N           an integer for the accumulator

options:
 -h, --help  show this help message and exit
 --sum       sum the integers (default: find the max)

In my opinion, this message could just as well be hardcoded in the program source. Its presence there provides an excellent reference to whoever is reading the code, and encourages the author to focus on the clarity of the message. It is a good idea to start writing the program by first writing this help message. If I were to implement prog.py from scratch, I would write the help message as follows:

Usage: accumulate [-h | --help] [--func=F] ARGS

This utility accumulates ARGS (each interpreted as a number),
according to the function F (which by default is max).

Options:
 -h, --help  Show this help message and exit.
 --func=F    Use function F to accumulate the numbers.

The function F can be one of:
 max         Find the largest number among the arguments. (Default.)
             You must provide at least one argument.
 sum         Sum the arguments. A sum of zero arguments is zero.

By writing the documentation first, we’ve achieved the following:

Our program now has a name (accumulate), and a more clearly defined purpose.
We’ve identified the edge/error cases, such as attempting to find the maximum of zero numbers; meanwhile a sum of zero numbers is the additive identity (zero), so it would make sense to allow that.
We’ve generalized our program to handle numbers, rather than integers. Python has a module for exact decimal arithmetic, so why not use that?
We’ve made the interface more extensible, leaving enough space to allow adding a hundred more functions in the future, without cluttering the option namespace, or painting ourselves into a corner by introducing mutually exclusive options.
The printed text is (somewhat structured) hand-written prose, which reads more easily than the auto-generated text.
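The edge cases and the choice of number type from the list above can be checked directly against the standard library. A small sketch, using only Python built-ins and the decimal module:

```python
from decimal import Decimal

# Binary floats accumulate rounding error; Decimal keeps decimal fractions exact.
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True

# A sum of zero numbers is the additive identity.
print(sum(Decimal(s) for s in []))  # 0

# A maximum of zero numbers is undefined; Python raises ValueError.
try:
    max(Decimal(s) for s in [])
except ValueError:
    print("max() needs at least one argument")
```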

So what does the code to handle all of that look like now?

import sys
import getopt
import decimal

def show_usage():
    print("Usage: accumulate [-h | --help] [--func=F] ARGS")

def show_help():
    show_usage()
    print("""
This utility...
""")  # omitted for brevity

def main():
    try:
        # getopt.getopt() returns (options, remaining positional arguments)
        opts, args = getopt.getopt(sys.argv[1:], "h", ["help", "func="])
    except getopt.GetoptError:
        show_usage()
        sys.exit(1)
    funcs = {"max": max, "sum": sum}
    func = max
    for opt, arg in opts:
        if opt in ["-h", "--help"]:
            show_help()
            sys.exit()
        elif opt == "--func":
            try:
                func = funcs[arg]
            except LookupError:
                show_usage()
                sys.exit(1)
    # max() of nothing is undefined, while sum() of nothing is simply zero.
    if func == max and len(args) == 0:
        print("Error: cannot find the maximum of zero numbers.")
        sys.exit(1)
    print(func(decimal.Decimal(arg) for arg in args))

if __name__ == "__main__":
    main()

So, is this a lot of error handling code? No, I don’t think so. Real-world programs need to handle such edge cases all of the time.

Is this too much code for such a small utility? After all, we’ve gone from ten to dozens of lines of code. Again, I don’t think so. Even the tiniest utility (many of which will never ever get a proper manual page) will greatly benefit from a carefully written --help-style reference. The task at hand happens to fit the example given in argparse’s introduction, but many real-world utilities won’t. Resorting to using every single one of argparse’s capabilities in an attempt to write fewer lines of code is just golf.

Click, Typer, etc

Don’t even get me started!

Appendix A: boilerplate

The argument-parsing boilerplate for different languages can be trivially copy-pasted from a template; I keep a couple of such copypastas in my dotfiles:

Python
Shell
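To give a rough idea of the shape of such a template, here is a minimal Python sketch. It is not a copy of the linked file: the program name prog, the options -v and -o, and the parse/main split are all hypothetical choices made for illustration.

```python
import getopt
import sys

# Hypothetical interface; adjust to the program at hand.
USAGE = "Usage: prog [-h | --help] [-v | --verbose] [-o FILE | --output=FILE] ARGS"

def parse(argv):
    """Parse argv into (settings, positional args); raises getopt.GetoptError."""
    opts, args = getopt.getopt(argv, "hvo:", ["help", "verbose", "output="])
    settings = {"help": False, "verbose": False, "output": None}
    for opt, value in opts:
        if opt in ("-h", "--help"):
            settings["help"] = True
        elif opt in ("-v", "--verbose"):
            settings["verbose"] = True
        elif opt in ("-o", "--output"):
            settings["output"] = value
    return settings, args

def main(argv):
    """Return the process exit status instead of exiting, which eases testing."""
    try:
        settings, args = parse(argv)
    except getopt.GetoptError as err:
        print(f"{err}\n{USAGE}", file=sys.stderr)
        return 1
    if settings["help"]:
        print(USAGE)
        return 0
    print(settings, args)  # placeholder for the program's real work
    return 0

print(main(["-v", "-o", "out.txt", "input"]))
```

In a real script, the last line would instead be sys.exit(main(sys.argv[1:])) under the usual if __name__ == "__main__" guard.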

I’ve taken on maintaining a fork of an excellent getopt library for Go, and provided some boilerplate in the examples directory:

Go (original)

Appendix B: support for getopt

If you see a glaring omission, feel free to tickle me and suggest an edit!

Operating systems / platforms

POSIX; and via POSIX: glibc, musl, and therefore, probably every Linux distro in existence.
Windows/WSL, per installed Linux distro.
OpenBSD
NetBSD
FreeBSD
DragonFly BSD
illumos
Solaris
SerenityOS

Programming languages

GNU Awk
Go (package)
Python
Ruby
Rust (crate)
Shell

Also, check out the article on Rosetta Code.

Notable for doing something completely different

Windows (because DOS (because CP/M (because VMS)))
find(1) (GNU, BSD)
X11
Go
plan9
suckless.org projects