Welcome to little lamb

Code » limb » commit fbd5e24

hmap: Fix things, add hmap_next()

author Olivier Brunel
2023-05-29 17:35:00 UTC
committer Olivier Brunel
2023-07-05 07:39:26 UTC
parent 8e4c7f5954ed9621c779ca7ff7f943f6b77a5b85

hmap: Fix things, add hmap_next()

It was buggy. Very much so. Now things should work as expected, however
it is simply not possible to remove elements from a hmap.
Should that be needed, a new design shall be implemented instead.

Add hmap_next() to iterate over an entire hmap: not an optimized path and
one that requires *no changes* in the hmap during iteration (that is, the
table's content must remain unchanged, but changing the element's data
is fine), but there it is.

src/doc/hmap.h.0.md +3 -0
src/doc/hmap.h/hmap_init.3.md +23 -8
src/include/hmap.h +3 -2
src/liblimb/hmap.h/grow.c +45 -38
src/liblimb/hmap.h/hmap_next.c +39 -0
src/liblimb/hmap.h/hmap_set.c +9 -13
src/liblimb/include/limb/hmap.h +1 -0

diff --git a/src/doc/hmap.h.0.md b/src/doc/hmap.h.0.md
index 6af3f56..c6916ff 100644
--- a/src/doc/hmap.h.0.md
+++ b/src/doc/hmap.h.0.md
@@ -36,6 +36,9 @@ The following functions are defined :
 : [hmap_init](3)
 :: Initializes an hash table, setting the size of its elements.
 
+: [hmap_next](3)
+:: To iterate over all elements in an hash table.
+
 : [hmap_set](3)
 :: Sets (or adds) the data associated with the given key.
 
diff --git a/src/doc/hmap.h/hmap_init.3.md b/src/doc/hmap.h/hmap_init.3.md
index 5e51228..d980bbd 100644
--- a/src/doc/hmap.h/hmap_init.3.md
+++ b/src/doc/hmap.h/hmap_init.3.md
@@ -3,7 +3,7 @@
 
 # NAME
 
-hmap\_init, hmap\_set, hmap\_get, hmap\_free - hash table management
+hmap_init, hmap_next, hmap_set, hmap_get, hmap_free - hash table management
 
 # SYNOPSIS
 
@@ -11,6 +11,7 @@ hmap\_init, hmap\_set, hmap\_get, hmap\_free - hash table management
 
 ```pre hl
 int hmap_init(size_t <em>dlen</em>, size_t <em>n</em>, hmap *<em>hmap</em>)
+void *hmap_next(u32 *<em>key</em>, hmap *<em>hmap</em>)
 int hmap_set(u32 <em>key</em>, void *<em>data</em>, hmap *<em>hmap</em>)
 void *hmap_get(u32 <em>key</em>, hmap *<em>hmap</em>)
 void hmap_free(hmap *<em>hmap</em>)
@@ -33,9 +34,10 @@ pointed to by `hmap` for a hash table of `n` elements, using `dlen` bytes as
 per-element data size.
 
 The hash table will automatically grow as needed when adding new elements, but
-it is not possible to shrunk it down. Note also that the actual size of the
-table must be a power of two, and will therefore be adjusted accordingly (e.g.
-using 10 as `n` will actually result in a table initialized for 16 elements).
+it is not possible to shrunk it down, nor is it possible to remove elements.
+Note also that the actual size of the table must be a power of two, and will
+therefore be adjusted accordingly (e.g.  using 10 as `n` will actually result in
+a table initialized for 16 elements).
 
 The `hmap_set`() function searches the table for an element with the given `key`
 and sets its associated data to that pointed to by `data` (which must be of the
@@ -43,17 +45,26 @@ size given as `dlen` to `hmap_init`() earlier). Note that the actual data
 pointed will be copied into the table.
 
 If not such element exists, it is added to the table. If needed the table might
-grow its size in order to add the new element. If `data` is NULL then the
-element is "removed" from the table - that is, the memory is emptied out to be
-available for use by another element, the table itself does not shrunk in size.
+grow its size in order to add the new element.
 
 To get a valid 32bit key from a data blob (e.g. a string), you might want to use
 [hlookup32](3).
 
-The `hmap_get`() function searches the tale for an element with the given `key`
+The `hmap_get`() function searches the table for an element with the given `key`
 and returns a pointer to its associated data when found, or NULL if no such
 element exists in the table.
 
+The `hmap_next`() function allows to iterate over all elements in the table,
+although this is not an optimized path. It will locate the element for the key
+pointed to by `key`, set it to the key of the next element and return a pointer
+to its associated data. If there are no more elements in the table, it will
+return *NULL*.
+
+The first call to `hmap_next`() must have the value pointe by `key` by zero,
+successive calls must them re-use the value in order to move forward. Although
+changing an element's /data/ is allowed, one should not change the table's
+content during such iteration (i.e. do not add elements to the table).
+
 The `hmap_free`() function frees all memory allocated by the table and returns
 it into a uninitialized state. It is then possible to re-use it with
 `hmap_init`() with different parameters.
@@ -66,6 +77,10 @@ error, with `errno` set to indicate the cause of the error.
 The `hmap_get`() function returns a pointer to the data associated with the
 element if found, or NULL otherwise. It cannot fail.
 
+The `hmap_next`() function returns a pointer to the data associated with the
+element, or NULL is the table is empty or has been fully iterated (no more
+elements past the one whose key is pointed to by `key`). It cannot fail.
+
 # ERRORS
 
 The `hmap_init`() function may fail if :
diff --git a/src/include/hmap.h b/src/include/hmap.h
index 0bb4e39..1202bfc 100644
--- a/src/include/hmap.h
+++ b/src/include/hmap.h
@@ -11,15 +11,16 @@
 #define MINSIZE     8
 #define MAXSIZE     ((u32) -1 / 2 + 1)
 
-/* IMMPORANT: the stralloc hmap->sa is *NOT* used as expected in here. Indeed,
+/* IMPORTANT: the stralloc hmap->sa is *NOT* used as expected in here. Indeed,
  * we determine the *exact* size we want allocated and ask for it.
  *
- * Then, sa.a is truned into our mask, i.e. the number of elements - 1
+ * Then, sa.a is turned into our mask, i.e. the number of elements - 1
  * Similarly, sa.len is used as counter of used elements, but those aren't used
  * in order and every element (even empty/zero ones) should be considered in
  * use by the hmap.
  */
 
+/* keep in sync w/ fullitem in grow.c */
 struct item {
     u32 key;
     u8 data[];
diff --git a/src/liblimb/hmap.h/grow.c b/src/liblimb/hmap.h/grow.c
index 687e3e2..e7cc7b4 100644
--- a/src/liblimb/hmap.h/grow.c
+++ b/src/liblimb/hmap.h/grow.c
@@ -4,47 +4,53 @@
 #include "hmap.h"
 
 static void
-relocate(u32 idx, u32 min, u32 max, u32 *left, hmap *hmap)
+relocate(u32 idx, u32 min, u32 max, u32 *left, u32 relocated[], hmap *hmap)
 {
+    /* keep in sync w/ item in hmap.h */
     struct fullitem {
         u32 key;
         u8 data[DLEN(hmap)];
     } * it;
 
     it = ITEM(idx, hmap);
-    if (it->key) {
-        struct fullitem itmp, *new;
-
-        /* remove item from current spot */
-        itmp = *it;
-        memset(it, 0, sizeof(*it));
-
-        /* and look for its natural spot, or first available collision spot
-         * if needed */
-        u32 i, j;
-        for (i = it->key, j = 1; ; i += j++) {
-            /* get item's wanted spot */
-            new = ITEM(i & hmap->sa.a, hmap);
-
-            /* put it back if it's where it was */
-            if (new == it)
-                break;
-
-            /* make sure its wanted spot has been relocated if needed */
-            relocate(i & hmap->sa.a, min, max, left, hmap);
-
-            /* settle there if it's now available */
-            if (!new->key)
-                break;
-        }
-        /* put item in its (new) spot */
-        *new = itmp;
-        /* decrease left iif it was within [min,max] *and* its new spot is
-         * outside of ]min,max] */
-        i &= hmap->sa.a;
-        if (idx >= min && idx <= max && (i > max || i <= min))
-            --*left;
+
+    u32 n = idx / 32;
+    u32 b = 1 << (idx - n);
+    if ((relocated[n] & b) || !it->key)
+        return;
+
+    struct fullitem itmp, *new;
+
+    /* remove item from current spot */
+    itmp = *it;
+    memset(it, 0, sizeof(*it));
+
+    /* and look for its natural spot, or first available collision spot
+     * if needed */
+    u32 i, j;
+    for (i = itmp.key, j = 1; ; i += j++) {
+        /* get item's wanted spot */
+        new = ITEM(i & hmap->sa.a, hmap);
+
+        /* put it back if it's where it was */
+        if (new == it)
+            break;
+
+        /* make sure its wanted spot has been relocated if needed */
+        relocate(i & hmap->sa.a, min, max, left, relocated, hmap);
+
+        /* settle there if it's now available */
+        if (!new->key)
+            break;
     }
+    /* put item in its (new) spot */
+    *new = itmp;
+    n = (i & hmap->sa.a) / 32;
+    b = 1 << ((i & hmap->sa.a) - n);
+    relocated[n] |= b;
+    /* decrease left iif it was within [min,max] */
+    if (idx >= min && idx <= max)
+        --*left;
 }
 
 int
@@ -64,9 +70,10 @@ grow(size_t n, hmap *hmap)
         ++hmap->sa.a;
         hmap->sa.a *= hmap->ilen;
     }
-    oldsize = hmap->sa.a;
+    oldsize = hmap->sa.a / hmap->ilen;
 
     if (!stralloc_ready_tuned(&hmap->sa, newsize * hmap->ilen, 0, 0, 1)) {
+        hmap->sa.a /= hmap->ilen;
         --hmap->sa.a;
         return 0;
     }
@@ -81,12 +88,12 @@ grow(size_t n, hmap *hmap)
     /* we'll (try to) relocate all items to their (new) "natural" spots, or
      * a new collision/alternate spot if one got freed */
     u32 left = hmap->sa.len;
+    u32 relocated[((hmap->sa.a + 1) / 32) + 1];
+    memset(relocated, 0, sizeof(relocated));
     for (u32 i = 0; left; ++i) {
         struct item *it = ITEM(i, hmap);
-        if (it->key) {
-            relocate(i, i, oldsize, &left, hmap);
-            --left;
-        }
+        if (it->key)
+            relocate(i, i, oldsize, &left, relocated, hmap);
     }
 
     return 1;
diff --git a/src/liblimb/hmap.h/hmap_next.c b/src/liblimb/hmap.h/hmap_next.c
new file mode 100644
index 0000000..05518a1
--- /dev/null
+++ b/src/liblimb/hmap.h/hmap_next.c
@@ -0,0 +1,39 @@
+/* This file is part of limb                           https://lila.oss/limb
+ * Copyright (C) 2023 Olivier Brunel                          jjk@jjacky.com */
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include "hmap.h"
+
+void *
+hmap_next(u32 *key, hmap *hmap)
+{
+    struct item *it;
+    u32 i, j;
+
+    if (!hmap->sa.len)
+        return NULL;
+
+    if (*key) {
+        for (i = *key, j = 1; ; i += j++) {
+            it = ITEM(i & hmap->sa.a, hmap);
+            if (!it->key || it->key == *key)
+                break;
+        }
+        i &= hmap->sa.a;
+        if (++i > hmap->sa.a)
+            return NULL;
+    } else {
+        i = 0;
+        it = ITEM(i, hmap);
+    }
+
+    for ( ; i <= hmap->sa.a; ++i) {
+        it = ITEM(i, hmap);
+        if (it->key) break;
+    }
+
+    if (!it->key)
+        return NULL;
+
+    *key = it->key;
+    return it->data;
+}
diff --git a/src/liblimb/hmap.h/hmap_set.c b/src/liblimb/hmap.h/hmap_set.c
index d2a19d7..64e95df 100644
--- a/src/liblimb/hmap.h/hmap_set.c
+++ b/src/liblimb/hmap.h/hmap_set.c
@@ -10,23 +10,19 @@ hmap_set(u32 key, void *data, hmap *hmap)
     struct item *it;
 
     it = lookup(key, hmap);
-    if (!data) {
-        if (!it->key)
-            return (errno = ENOENT, 0);
-        memset(it->data, 0, hmap->ilen);
-        return 1;
+    if (!it->key) {
+        /* should we grow? */
+        if (1 + hmap->sa.len > hmap->sa.a - hmap->sa.a / 4) {
+            if (!grow(2 * hmap->sa.len, hmap))
+                return (errno = ENOMEM, 0);
+            /* find possibly new location after growth */
+            it = lookup(key, hmap);
+        }
+        ++hmap->sa.len;
     }
 
     it->key = key;
     memcpy(it->data, data, DLEN(hmap));
 
-    if (++hmap->sa.len > hmap->sa.a - hmap->sa.a / 4) {
-        if (!grow(2 * hmap->sa.len, hmap)) {
-            --hmap->sa.len;
-            memset(it, 0, hmap->ilen);
-            return (errno = ENOMEM, 0);
-        }
-    }
-
     return 1;
 }
diff --git a/src/liblimb/include/limb/hmap.h b/src/liblimb/include/limb/hmap.h
index d9c98dd..9e549de 100644
--- a/src/liblimb/include/limb/hmap.h
+++ b/src/liblimb/include/limb/hmap.h
@@ -18,6 +18,7 @@ struct hmap {
 
 extern int hmap_init(size_t dlen, size_t n, hmap *hmap);
 extern int hmap_set(u32 key, void *data, hmap *hmap);
+extern void *hmap_next(u32 *key, hmap *hmap);
 extern void *hmap_get(u32 key, hmap *hmap);
 extern void hmap_free(hmap *hmap);