Code » limb » commit 4d7aa65

Add loadopt.h & loadopt() for "full" options parsing

author Olivier Brunel
2023-03-25 21:09:57 UTC
committer Olivier Brunel
2023-03-26 14:03:03 UTC
parent 692e59d5b2c6d04e89155104162e635beaa632cc

That is, parses options from command-line (via parseopt(3)), then
(optionally) from configuration file (if not already set).

Also handles options to be required, and checks for required/optional
post-options command-line arguments.

Unlike parseopt(3), it /does/ throw appropriate warnings on errors.
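
As a rough illustration of the precedence described above (command-line first, configuration file only for what is still unset, then the required check), here is a minimal standalone sketch. It does not use limb at all: every `demo_*` and `DEMO_*` name is hypothetical, and the real loadopt() works on *struct option* arrays through parseopt(3) rather than on pre-split name/value pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are NOT limb's types or flags. */
enum { DEMO_SET = 1 << 0, DEMO_SKIP = 1 << 1, DEMO_REQ = 1 << 2 };

struct demo_opt {
    const char *name;   /* long option name, without the "--" */
    const char *value;  /* NULL until the option is set */
    unsigned flags;     /* DEMO_* bits */
};

struct demo_kv {
    const char *name;
    const char *value;
};

static struct demo_opt *
demo_find(struct demo_opt *opts, size_t n, const char *name)
{
    for (size_t i = 0; i < n; ++i)
        if (!strcmp(opts[i].name, name))
            return &opts[i];
    return NULL;
}

/* Apply one source of settings; returns -1 on an unknown option name.
 * When from_file is non-zero, options already set or marked DEMO_SKIP are
 * left untouched, so the command line always wins. */
static int
demo_apply(struct demo_opt *opts, size_t n,
           const struct demo_kv *kv, size_t nkv, int from_file)
{
    assert(opts != NULL && kv != NULL);
    for (size_t i = 0; i < nkv; ++i) {
        struct demo_opt *o = demo_find(opts, n, kv[i].name);
        if (!o)
            return -1;
        if (from_file && (o->flags & (DEMO_SET | DEMO_SKIP)))
            continue;
        o->value = kv[i].value;
        o->flags |= DEMO_SET;
    }
    return 0;
}

/* Index of the first required option that is still unset, or -1 if none. */
static int
demo_missing(const struct demo_opt *opts, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if ((opts[i].flags & (DEMO_REQ | DEMO_SET)) == DEMO_REQ)
            return (int) i;
    return -1;
}
```

A caller would run demo_apply() once with the command-line pairs (from_file = 0), once with the pairs read from the file (from_file = 1), and finally demo_missing() to report a required option that neither source provided.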

doc/loadopt.3.md +150 -0
doc/loadopt.h.0.md +50 -0
include/limb/loadopt.h +57 -0
include/loadopt.h +33 -0
meta/libs/limb +3 -0
src/loadopt.c +297 -0
src/loadopt_handle_noconfig.c +31 -0

diff --git a/doc/loadopt.3.md b/doc/loadopt.3.md
new file mode 100644
index 0000000..d17deed
--- /dev/null
+++ b/doc/loadopt.3.md
@@ -0,0 +1,150 @@
+% limb manual
+% loadopt(3)
+
+# NAME
+
+loadopt - parse options from command-line and (optionally) configuration file
+
+# SYNOPSIS
+
+    #include <limb/loadopt.h>
+
+```pre hl
+int loadopt(int *<em>first</em>, int <em>argc</em>, const char **<em>argv</em>, const struct option *<em>options</em>,
+            const char *<em>file</em>, const char *<em>section</em>, unsigned int <em>flags</em>,
+            struct loadopt *<em>ctx</em>)
+```
+
+# DESCRIPTION
+
+The `loadopt`() function parses command-line arguments. Then, optionally, it
+will read a configuration file to load any options that haven't been set yet.
+After that it ensures any option marked as required has been set, then may check
+for non-option arguments.
+
+The actual parsing of options is done through [parseopt](3), as such many of the
+arguments to `loadopt`() are the same as to [parseopt](3), namely `first`,
+`argc`, `argv`, `options` and `flags`. Please refer to [parseopt](3) for more
+on those.
+
+It is important to note, however, that the member `flags` of *struct option* is
+relevant here : it allows defining option-specific flags. Specifically, its
+value is constructed as a bitwise-inclusive OR of the following :
+
+: *OPT_SKIP*
+:: To indicate this option shall not be set from configuration file, i.e. can
+:: only be used from command-line.
+
+: *OPT_REQ*
+:: To indicate this option is required, i.e. error out if after having parsed
+:: all command-line options (and, optionally, options set in configuration
+:: file), the option has not been set.
+:: Mostly useful for options that require an argument.
+
+The last argument `ctx` is a semi-opaque structure, that should be initialized
+to all zeroes. It contains the same members as *struct parseopt* with the same
+meaning/use as described in [parseopt](3).
+Additionally, the following members are of interest :
+
+: `from_file`
+:: When an option was found, set to 0 when the option was set on command-line,
+:: set to 1 when it comes from the configuration file.
+
+## Configuration File
+
+When called, `loadopt`() will first defer to [parseopt](3) to handle parsing of
+command-line arguments. Once done, if `file` was specified (i.e. is not NULL)
+then the corresponding file will be read and parsed as an INI-like configuration
+file.
+
+That is, it expects to find on every line a long option name (without the "--"),
+optionally followed by an argument. No escaping of any kind is supported.
+Such options are then simply processed through [parseopt](3) as-if they had been
+specified on command-line, and can thus be processed exactly as such by the
+caller.
+Note however that any option already set (on command-line) will be ignored, as
+command-line options override the defaults from configuration file.
+
+! HINT:
+! If you need to know whether an option was specified on command-line or from
+! configuration file, you can check the member `from_file` of the *struct
+! loadopt*.
+
+If `section` was not NULL, `loadopt`() will first look for a section of the
+specified name in the file (defined by a line containing nothing but an open
+square bracket '[', the section's name as specified (case-sensitive), and a
+close square bracket ']'), and only process options from within said section.
+
+Very little is done in parsing the file, however any space at the beginning of
+a line will be skipped/ignored (except for the section header, as described
+above). Additionally, if the first character (save for spaces) on a line is
+either a semi-colon ';' or a number sign '#' then the entire line is ignored,
+allowing for comments.
+
+## Command-line Arguments
+
+Lastly, `loadopt`() can perform some (minimal) checking for post-options
+arguments on the command line once parsing of all options has been successful,
+which includes making sure all options marked required were indeed found
+(either from file or on command-line).
+
+For this to happen, you need to specify a special element in the array `options`
+using the macro *LOADOPT_ARGUMENTS*. Every element /after/ it will be handled as
+referring to an argument instead of an option.
+Note that this can be the first element of the array, if there are no options.
+
+For those elements, members `shortopt`, `id` and `flags` are ignored. The
+`longopt` member is only used in warnings to the user, to refer to said
+argument.
+
+The important member is `arg` which is treated as defining whether the
+argument is required (*ARG_REQ*) or optional (*ARG_OPT*). If required and no
+argument was given on command-line, an error occurs. If optional and not
+specified, `loadopt`() successfully ends, returning -1.
+
+Finally, after all such argument definitions in the array `options`, you must
+terminate it with either one of the macros *LOADOPT_DONE_OPEN* or
+*LOADOPT_DONE*.
+
+If the former was specified, `loadopt`() ends successfully regardless of whether
+more arguments were specified on command-line or not. With the latter however,
+any more arguments will result in an error for "too many arguments".
+
+Note that either of those can be specified even without any actual arguments,
+i.e. as last element right after all the options.
+
+In other words, *LOADOPT_DONE* is the same as *OPTION_DONE* as described in
+[parseopt](3).
+
+# RETURN VALUE
+
+If an option was successfully found, `loadopt`() returns the option `id` if
+non-zero, else its `shortopt`. When all options, and, optionally, arguments,
+have been successfully parsed -1 is returned.
+
+If an error occurs, a warning is emitted to *stderr* and a negative value (other
+than -1) is returned, depending on the error.
+
+## Warnings
+
+The warnings are actually written out using one of the [warn](3) family of
+functions. Refer to it for more on how and where data is being written to.
+
+# ERRORS
+
+The `loadopt`() function may fail and return :
+
+: *LOADOPT_ERR_FILE*
+:: Unable to open/read the configuration file
+
+: *LOADOPT_ERR_MISSINGOPT*
+:: A required option was missing/not specified
+
+: *LOADOPT_ERR_MISSINGARG*
+:: A required argument was missing/not specified
+
+: *LOADOPT_ERR_TOOMANY*
+:: Too many arguments were specified
+
+The `loadopt`() function may also fail and return any of the errors specified
+for [parseopt](3).
diff --git a/doc/loadopt.h.0.md b/doc/loadopt.h.0.md
new file mode 100644
index 0000000..4697e5e
--- /dev/null
+++ b/doc/loadopt.h.0.md
@@ -0,0 +1,50 @@
+% limb manual
+% loadopt.h(0)
+
+# NAME
+
+loadopt.h - parse options from command-line & configuration file
+
+# SYNOPSIS
+
+    #include <limb/loadopt.h>
+
+# DESCRIPTION
+
+This header defines functions to parse options from command-line, and
+optionally configuration file.
+
+## Constants
+
+The following constants are defined :
+
+: *OPT_SKIP*
+:: Set an option not to be loaded from configuration file
+
+: *OPT_REQ*
+:: Set an option to be required
+
+: *LOADOPT_ERR_FILE*, *LOADOPT_ERR_MISSINGOPT*, *LOADOPT_ERR_MISSINGARG*
+: *LOADOPT_ERR_TOOMANY*
+:: Possible return values for [loadopt](3)
+
+## Macros
+
+The following macros are defined :
+
+: *LOADOPT_ARGUMENTS*, *LOADOPT_DONE_OPEN*, *LOADOPT_DONE*
+:: To be used as an element in a *struct option* array
+
+## Structures
+
+The following structure is defined :
+
+: *struct loadopt*
+:: A semi-opaque structure to be passed to [loadopt](3)
+
+## Functions
+
+The following functions are defined :
+
+: [loadopt](3)
+:: To parse options from command-line & optionally configuration file
diff --git a/include/limb/loadopt.h b/include/limb/loadopt.h
new file mode 100644
index 0000000..f2bbcee
--- /dev/null
+++ b/include/limb/loadopt.h
@@ -0,0 +1,57 @@
+/* This file is part of limb                           https://lila.oss/limb
+ * Copyright (C) 2023 Olivier Brunel                          jjk@jjacky.com */
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef LIMB_LOADOPT_H
+#define LIMB_LOADOPT_H
+
+#include <skalibs/stralloc.h>
+#include "limb/parseopt.h"
+
+enum {
+    /* private/internal */
+    OPT_WAIT    = 0,
+    OPT_DONE    = 1 << 0,
+    OPT_SET     = 1 << 1,
+    /* public */
+    OPT_SKIP    = 1 << 2,   /* don't load from file */
+    OPT_REQ     = 1 << 3,   /* must be specified */
+};
+
+#define LOADOPT_ARGUMENTS       { 0, 0, ARG_REQ,  OPT_DONE }
+#define LOADOPT_DONE_OPEN       { 0, 0, ARG_OPT,  OPT_DONE }
+#define LOADOPT_DONE            { 0, 0, ARG_NONE, OPT_DONE }
+
+
+enum {
+    LOADOPT_ERR_FILE        = -6,
+    LOADOPT_ERR_MISSINGOPT  = -7,
+    LOADOPT_ERR_MISSINGARG  = -8,
+    LOADOPT_ERR_TOOMANY     = -9,
+    LOADOPT_ERR_INVALIDARG  = -10,
+};
+
+struct loadopt {
+    /* struct parseopt */
+    u16 cur;
+    u16 off;
+    const char *arg;
+    /* loadopt */
+    stralloc sa;
+    size_t saoff;
+    u8 optflags[64];
+    u16 loflags         : 4;    /* LOADOPT_* */
+    u16 state           : 4;
+    u16 _unused         : 7;
+    /* public (read-only) */
+    u16 from_file       : 1;
+};
+
+enum {
+    LOADOPT_ID_NOCONFIG  = 1,
+};
+
+extern int loadopt(int *first, int argc, const char **argv, const struct option *options,
+                   const char *file, const char *section, unsigned int poflags,
+                   struct loadopt *ctx);
+
+#endif /* LIMB_LOADOPT_H */
diff --git a/include/loadopt.h b/include/loadopt.h
new file mode 100644
index 0000000..3dfa494
--- /dev/null
+++ b/include/loadopt.h
@@ -0,0 +1,33 @@
+/* This file is part of limb                           https://lila.oss/limb
+ * Copyright (C) 2023 Olivier Brunel                          jjk@jjacky.com */
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef LIMB_LIMB_LOADOPT_H
+#define LIMB_LIMB_LOADOPT_H
+
+#include "limb/parseopt.h"
+
+enum state {
+    STATE_NONE = 0,
+    STATE_INIT,
+    STATE_CMDLINE,
+    STATE_FILE,
+    STATE_SECTION,
+    STATE_CONFIG,
+    STATE_OPTIONS,
+    STATE_ARGS,
+    STATE_DONE
+};
+
+enum {
+    LOADOPT_REFILL      = 1 << 0,
+    LOADOPT_IN_COMMENTS = 1 << 1,
+    LOADOPT_REFILLED    = 1 << 2,
+    LOADOPT_EOF         = 1 << 3,
+};
+
+void add_optflags(u8 *optflags, u16 idx, u8 val);
+u8 get_optflags(const u8 *optflags, u16 idx);
+
+int loadopt_handle_noconfig(int idx, const struct option *options, struct loadopt *ctx);
+
+#endif /* LIMB_LIMB_LOADOPT_H */
diff --git a/meta/libs/limb b/meta/libs/limb
index 130ac73..8948676 100644
--- a/meta/libs/limb
+++ b/meta/libs/limb
@@ -48,6 +48,9 @@ obj/out_putmsgdie.o
 obj/err_putmsgdie.o
 # parseopt.h
 obj/parseopt.o
+# loadopt.h
+obj/loadopt_handle_noconfig.o
+obj/loadopt.o
 # find msb
 obj/msb64.o
 # {,un}pack u64
diff --git a/src/loadopt.c b/src/loadopt.c
new file mode 100644
index 0000000..eb928c6
--- /dev/null
+++ b/src/loadopt.c
@@ -0,0 +1,297 @@
+/* This file is part of limb                           https://lila.oss/limb
+ * Copyright (C) 2023 Olivier Brunel                          jjk@jjacky.com */
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include <ctype.h>
+#include "limb/bytestr.h"
+#include "limb/djbunix.h"
+#include "limb/loadopt.h"
+#include "limb/output.h"
+#include "loadopt.h"
+
+void
+add_optflags(u8 *optflags, u16 idx, u8 val)
+{
+    if (idx % 2)
+        val <<= 4;
+    else
+        val &= 0xf;
+    optflags[idx / 2] |= val;
+}
+
+u8
+get_optflags(const u8 *optflags, u16 idx)
+{
+    u8 b = optflags[idx / 2];
+    if (idx % 2) b >>= 4;
+    return b & 0xf;
+}
+
+static void
+parseopt_warn(int c, int idx, const char **argv, const struct option *options,
+             const struct parseopt *ctx)
+{
+    if (c >= 0 || c == PARSEOPT_DONE)
+        return;
+
+    switch (c) {
+        case PARSEOPT_ERR_NONAME:
+            warn("option name missing");
+            break;
+        case PARSEOPT_ERR_UNKNOWN:
+            if (!strncmp(argv[ctx->cur], "--", 2)) {
+                warn("unknown option: ", argv[ctx->cur]);
+            } else {
+                char buf[3] = { '-', argv[ctx->cur][ctx->off], 0 };
+                warn("unknown option: ", buf);
+            }
+            break;
+        case PARSEOPT_ERR_ARGREQ:
+            {
+                char buf[2] = { options[idx].shortopt, 0 };
+                warn("option --", options[idx].longopt,
+                     (*buf) ? "/-" : "", buf, " requires an argument");
+            }
+            break;
+    }
+}
+
+static int
+loadopt_handle(int c, int first, const char **argv, const struct option *options,
+               int from_file, struct parseopt *ctx)
+{
+    if (!from_file && c >= 0 && options[first].id == LOADOPT_ID_NOCONFIG)
+        return loadopt_handle_noconfig(first, options, (struct loadopt *) ctx);
+
+    if (c == PARSEOPT_ERR_UNKNOWN && first >= 0) {
+        const char *s = argv[ctx->cur] + ctx->off;
+        size_t l = strlen(s);
+        adde("did you mean --", options[first].longopt);
+        while (options[++first].longopt)
+            if (!strncmp(s, options[first].longopt, l))
+                adde(" or --", options[first].longopt);
+        err(" ?");
+    }
+
+    return c;
+}
+
+int
+loadopt(int *first, int argc, const char **argv, const struct option *options,
+        const char *file, const char *section, unsigned int poflags,
+        struct loadopt *ctx)
+{
+    /* init */
+    if (!ctx->state) {
+        for (int i = 0; options[i].longopt; ++i)
+            if (options[i].flags & (OPT_SKIP | OPT_REQ)) /* OPT_REQ is read back below */
+                add_optflags(ctx->optflags, i, options[i].flags & (OPT_SKIP | OPT_REQ));
+        ctx->state = STATE_INIT;
+    }
+
+    /* init is done, parse options from command line */
+nextopt:
+    if (ctx->state == STATE_INIT) {
+        int c, idx = -1;
+        c = parseopt(&idx, argc, argv, options, poflags, (struct parseopt *) ctx);
+
+        if (c >= 0)
+            add_optflags(ctx->optflags, idx, OPT_SET);
+        else
+            parseopt_warn(c, idx, argv, options, (struct parseopt *) ctx);
+
+        if (c != PARSEOPT_DONE) {
+            c = loadopt_handle(c, idx, argv, options, 0, (struct parseopt *) ctx);
+            if (c == PARSEOPT_DONE)
+                goto nextopt;
+        }
+        if (c != PARSEOPT_DONE) {
+            if (first)
+                *first = idx;
+            return c;
+        }
+        ctx->state = STATE_CMDLINE;
+    }
+
+    /* command line is done, check if there's still a need to load from file */
+    if (ctx->state == STATE_CMDLINE || ctx->state == STATE_SECTION) {
+        if (file) {
+            int i;
+            /* find next option to be read from file */
+            for (i = 0; options[i].longopt; ++i)
+                if (!(get_optflags(ctx->optflags, i) & (OPT_SET | OPT_SKIP)))
+                    break;
+            if (!options[i].longopt) {
+                /* all options are set or to be skipped from file */
+                if (ctx->state == STATE_SECTION)
+                    stralloc_free(&ctx->sa);
+                ctx->state = STATE_CONFIG;
+            }
+        } else {
+            ctx->state = STATE_CONFIG;
+        }
+    }
+
+    /* command line is done, now onto the defined file */
+    if (ctx->state == STATE_CMDLINE) {
+        /* prepend buffer with LF to make searching for section easier */
+        if (!stralloc_catb(&ctx->sa, "\n", 1)
+                || !openslurpclose(&ctx->sa, file)) {
+            warnusys("read ", ESC, file, ESC);
+            stralloc_free(&ctx->sa);
+            ctx->state = STATE_CONFIG;
+            return LOADOPT_ERR_FILE;
+        }
+        ctx->state = STATE_FILE;
+    }
+
+    /* file is read, position into the right section */
+    if (ctx->state == STATE_FILE) {
+        if (section) {
+            /* we prepend a LF as we did into the sa to make searching easier,
+             * we also append another LF because if there's not, it's EOF and
+             * therefore an empty section, which means nothing to do */
+            size_t llen = 4 + strlen(section);
+            char line[llen];
+            line[0] = '\n';
+            line[1] = '[';
+            memcpy(line + 2, section, llen - 4);
+            line[llen - 2] = ']';
+            line[llen - 1] = '\n';
+
+            size_t o = byte_str(ctx->sa.s, ctx->sa.len, line, llen);
+            /* eof? section doesn't exist, so we're done */
+            if (o == ctx->sa.len) {
+                stralloc_free(&ctx->sa);
+                ctx->state = STATE_CONFIG;
+            } else {
+                /* position into the section */
+                ctx->saoff = o + llen;
+                ctx->state = STATE_SECTION;
+            }
+        } else {
+            ctx->state = STATE_SECTION;
+        }
+    }
+
+    /* file is opened at the right section, parse options */
+    if (ctx->state == STATE_SECTION) {
+nextfileopt:
+
+        /* allow/ignore spaces at beginning of line */
+        for ( ; ctx->saoff < ctx->sa.len; ++ctx->saoff) {
+            char c = ctx->sa.s[ctx->saoff];
+            /* allow comments */
+            if (c == ';' || c == '#')
+                ctx->saoff += byte_chr(ctx->sa.s + ctx->saoff,
+                                       ctx->sa.len - ctx->saoff, '\n');
+            else if (!isspace(c))
+                break;
+        }
+
+        if (ctx->saoff >= ctx->sa.len || ctx->sa.s[ctx->saoff] == '[') {
+            stralloc_free(&ctx->sa);
+            ctx->state = STATE_CONFIG;
+        } else {
+            /* we want a full line */
+            size_t o = ctx->saoff + byte_chr(ctx->sa.s + ctx->saoff,
+                                             ctx->sa.len - ctx->saoff, '\n');
+            /* if found (i.e. not last line in the file) NUL-terminate it */
+            if (o < ctx->sa.len)
+                ctx->sa.s[o] = 0;
+
+            struct parseopt po = { 0 };
+            const char *argv[] = { "", ctx->sa.s + ctx->saoff };
+
+            int c, idx = -1;
+            c = parseopt(&idx, 2, argv, options, PARSEOPT_IS_LONG | PARSEOPT_STRICT, &po);
+
+            /* seek past the line */
+            ctx->saoff = o + 1;
+
+            /* ignore already set or to-be-skipped option, and "argument
+             * required" error for such an option */
+            if ((c >= 0 || c == PARSEOPT_ERR_ARGREQ)
+                    && (get_optflags(ctx->optflags, idx) & (OPT_SET | OPT_SKIP))) {
+                goto nextfileopt;
+            }
+
+            if (c >= 0)
+                add_optflags(ctx->optflags, idx, OPT_SET);
+            else
+                parseopt_warn(c, idx, argv, options, &po);
+
+            if (c != PARSEOPT_DONE) {
+                c = loadopt_handle(c, idx, argv, options, 1, &po);
+                if (c == PARSEOPT_DONE)
+                    goto nextfileopt;
+            }
+            if (c != PARSEOPT_DONE) {
+                if (first)
+                    *first = idx;
+                /* adjust returned value */
+                ctx->arg = po.arg;
+                ctx->from_file = 1;
+                return c;
+            }
+
+            stralloc_free(&ctx->sa);
+            ctx->state = STATE_CONFIG;
+        }
+    }
+
+    /* config file done, check all required options were set */
+    if (ctx->state == STATE_CONFIG) {
+        int i;
+        for (i = 0; options[i].longopt; ++i) {
+            if ((get_optflags(ctx->optflags, i) & (OPT_REQ | OPT_SET)) == OPT_REQ) {
+                char buf[2] = { options[i].shortopt, 0 };
+                warn("option --", options[i].longopt, (*buf) ? "/-" : "", buf, " missing");
+                return LOADOPT_ERR_MISSINGOPT;
+            }
+        }
+        /* re-use off as the current argument's index from w/in options */
+        ctx->off = i;
+        /* set arg as the first argument */
+        ctx->arg = (const char *) (uintptr_t) ctx->cur;
+        ctx->state = STATE_OPTIONS;
+    }
+
+    /* options done, on to arguments */
+    if (ctx->state == STATE_OPTIONS) {
+        while (ctx->state == STATE_OPTIONS) {
+            const struct option *arg = &options[ctx->off];
+            if (arg->flags & OPT_DONE) {
+                if (arg->arg == ARG_NONE && ctx->cur < argc) {
+                    warn("too many arguments");
+                    return LOADOPT_ERR_TOOMANY;
+                } else if (arg->arg == ARG_REQ) {
+                    ++ctx->off;
+                    continue;
+                }
+                /* ARG_NONE w/out args, or ARG_OPT; i.e. ok we're done */
+                break;
+            }
+
+            if (arg->arg == ARG_REQ && ctx->cur == argc) {
+                warn("argument ", ESC, arg->longopt, ESC," missing");
+                return LOADOPT_ERR_MISSINGARG;
+            }
+
+            /* next argument from command line */
+            ++ctx->cur;
+            /* next argument defined/to check for */
+            ++ctx->off;
+        }
+        /* set cur to the first argument */
+        ctx->cur = (uintptr_t) ctx->arg & 0xffff;
+        /* done */
+        ctx->state = STATE_ARGS;
+    }
+
+    /* arguments done, we're finally done */
+    if (ctx->state == STATE_ARGS)
+        ctx->state = STATE_DONE;
+
+    /* this is the end... */
+    return PARSEOPT_DONE;
+}
diff --git a/src/loadopt_handle_noconfig.c b/src/loadopt_handle_noconfig.c
new file mode 100644
index 0000000..1585dad
--- /dev/null
+++ b/src/loadopt_handle_noconfig.c
@@ -0,0 +1,31 @@
+#include "limb/loadopt.h"
+#include "limb/output.h"
+#include "loadopt.h"
+
+int
+loadopt_handle_noconfig(int idx, const struct option *options, struct loadopt *ctx)
+{
+    if (!ctx->arg) {
+        for (int i = 0; options[i].longopt; ++i)
+            if (!(get_optflags(ctx->optflags, i) & OPT_SET))
+                add_optflags(ctx->optflags, i, OPT_SKIP);
+    } else {
+        for (char *s = strtok((char *) ctx->arg, ","); s; s = strtok(NULL, ",")) {
+            int i;
+            for (i = 0; options[i].longopt; ++i) {
+                if (!strcmp(s, options[i].longopt)) {
+                    add_optflags(ctx->optflags, i, OPT_SKIP);
+                    break;
+                }
+            }
+            if (!options[i].longopt) {
+                char buf[2] = { options[idx].shortopt, 0 };
+                warn("invalid argument to option --", options[idx].longopt,
+                     (*buf) ? "/-" : "", buf, ": ", s); /* strtok cut ctx->arg; s is the bad name */
+                return LOADOPT_ERR_INVALIDARG;
+            }
+        }
+    }
+
+    return PARSEOPT_DONE;
+}
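
For readers skimming src/loadopt.c above: add_optflags() and get_optflags() store one 4-bit flag set per option, two options per byte, which is why the 64-byte `optflags` member can track up to 128 options. The following standalone sketch shows the same packing idea outside of limb. The `nib_*` and `NIB_*` names are hypothetical, plain `uint8_t` stands in for limb's `u8`, and the bounds assertion is an addition that the code in this commit does not have.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values mirroring OPT_SET, OPT_SKIP and OPT_REQ. */
enum { NIB_SET = 1 << 1, NIB_SKIP = 1 << 2, NIB_REQ = 1 << 3 };

#define NIB_BYTES   64
#define NIB_MAX     (NIB_BYTES * 2)     /* two 4-bit entries per byte */

/* OR the low 4 bits of val into the entry for idx:
 * even indexes use the low nibble, odd indexes the high nibble. */
void
nib_add(uint8_t *flags, unsigned idx, uint8_t val)
{
    assert(idx < NIB_MAX);
    val &= 0xf;
    if (idx % 2)
        val = (uint8_t) (val << 4);
    flags[idx / 2] |= val;
}

/* Read back the 4-bit entry for idx. */
uint8_t
nib_get(const uint8_t *flags, unsigned idx)
{
    assert(idx < NIB_MAX);
    uint8_t b = flags[idx / 2];
    if (idx % 2)
        b >>= 4;
    return b & 0xf;
}
```

Since entries are only ever OR-ed in, a flag cannot be cleared once set; that matches how the state machine above uses them, where an option goes from unset to set and never back.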