digital-domain.net

NGINX Unit Serialised Pointers

In NGINX Unit we make use of what we call serialised pointers. In simplest terms these are nothing more than offsets into memory. However, the way they are implemented is somewhat non-obvious.

These are needed when we want to share memory (containing pointers) via Inter Process Communications methods.
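To see why raw pointers are a problem here, consider the following minimal sketch. It is not Unit code: the struct raw_blk and the function raw_ptr_follows_copy() are hypothetical names, and a memcpy(3) to a second buffer merely stands in for another process seeing the same bytes at a different address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block: a header holding a raw pointer into its own data. */
struct raw_blk {
        char *name;     /* absolute address, only meaningful in this copy */
        char  data[8];
};

/*
 * Copy the block and report whether the copied pointer refers to the
 * copy's own data (1) or still to the original's data (0).
 */
static int raw_ptr_follows_copy(void)
{
        struct raw_blk a, b;

        strcpy(a.data, "toor");
        a.name = a.data;
        assert(a.name == a.data);

        /* Stand-in for the bytes turning up at a different address. */
        memcpy(&b, &a, sizeof(a));

        return b.name == b.data;
}
```

The function returns 0: the copied pointer still refers to the original block, which is exactly what an offset avoids.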

This text will attempt to explain them.

In Unit it is common to have a chunk of memory that starts with a structure and is followed by some data, such as a number of (possibly nul-terminated) strings.

Each of these strings has an associated nxt_unit_sptr_t structure member, which is defined as:

union nxt_unit_sptr_u {
    uint8_t   base[1];
    uint32_t  offset;
};

.base[1] is only used to get the address of this union. The array decays to a pointer to its first element and, as every member of a union starts at the address of the union itself, .base is the address of the union.

This is really the key to the whole thing: we never set (or retrieve) .base; it merely exists to provide the address of the union.

.offset is then the offset from the .base address to the start of the data in question.
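As a quick check of the claim about .base, here is a small sketch. The union mirrors the definition above; base_is_union_address() is a hypothetical helper, and the size comparison assumes a typical platform where the union has no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

union sptr_u {
        uint8_t   base[1];
        uint32_t  offset;
};

/*
 * Returns 1 if .base names the same address as the union itself and
 * the union is no larger than its uint32_t member.
 */
static int base_is_union_address(void)
{
        union sptr_u sp;

        sp.offset = 1234;       /* storing an offset does not move the union */

        return (void *)sp.base == (void *)&sp
               && sizeof(sp) == sizeof(uint32_t);
}
```

Note that .base is only ever used for its address here, never read or written as a value.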

(This could also have been implemented using a simple integer type, taking the address of the integer itself as the base.)
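A rough sketch of what that integer-only variant might look like follows. None of these names (isptr_t, isptr_set(), isptr_get(), struct one_name, isptr_demo()) exist in Unit; they are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical integer-only serialised pointer. */
typedef uint32_t isptr_t;

static void isptr_set(isptr_t *sptr, void *ptr)
{
        /* The integer's own address plays the role of .base. */
        *sptr = (uint32_t)((uint8_t *)ptr - (uint8_t *)sptr);
}

static void *isptr_get(isptr_t *sptr)
{
        return (uint8_t *)sptr + *sptr;
}

/* Hypothetical block: one serialised pointer followed by its data. */
struct one_name {
        isptr_t name;
        char    data[8];
};

/* Store "toor" in the block and read it back through the offset. */
static const char *isptr_demo(struct one_name *blk)
{
        strcpy(blk->data, "toor");
        isptr_set(&blk->name, blk->data);

        return isptr_get(&blk->name);
}
```

The casts to uint8_t * are what the union's .base member otherwise saves us from writing.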

The following example program and diagram will hopefully make things clear:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>

union sptr_u {
        uint8_t base[1];
        uint32_t offset;
};
typedef union sptr_u sptr_t;

struct s {
        uint8_t name1_len;
        uint8_t name2_len;
        uint8_t name3_len;

        sptr_t name1;
        sptr_t name2;
        sptr_t name3;
};

static void sptr_set(sptr_t *sptr, void *ptr)
{
        sptr->offset = (uint8_t *)ptr - sptr->base;
}

static void *sptr_get(sptr_t *sptr)
{
        return sptr->base + sptr->offset;
}

int main(void)
{
        static const char * const names[] = { "toor", "foobar", "baz" };
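        /* Room for the structure, the three strings and their nul bytes. */
        struct s *s = malloc(sizeof(struct s) +
                             strlen(names[0]) + strlen(names[1]) +
                             strlen(names[2]) + 3);
        char *p;

        if (s == NULL)
                exit(EXIT_FAILURE);

        /* The string data starts immediately after the structure. */
        p = (char *)(s) + sizeof(struct s);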

        sptr_set(&s->name1, p);
        p = stpcpy(p, names[0]);

        p++;
        sptr_set(&s->name2, p);
        p = stpcpy(p, names[1]);

        p++;
        sptr_set(&s->name3, p);
        p = stpcpy(p, names[2]);

        printf("name1 : %s\n", (const char *)sptr_get(&s->name1));
        printf("name2 : %s\n", (const char *)sptr_get(&s->name2));
        printf("name3 : %s\n", (const char *)sptr_get(&s->name3));

        free(s);

        exit(EXIT_SUCCESS);
}

The above program results in a memory layout something like the following:

Fig 1. structure memory layout

pahole(1) shows:

union sptr_u {
        uint8_t                    base[1];            /*     0     1 */
        uint32_t                   offset;             /*     0     4 */
};
struct s {
        uint8_t                    name1_len;            /*     0     1 */
        uint8_t                    name2_len;            /*     1     1 */
        uint8_t                    name3_len;            /*     2     1 */

        /* XXX 1 byte hole, try to pack */

        sptr_t                     name1;                /*     4     4 */
        sptr_t                     name2;                /*     8     4 */
        sptr_t                     name3;                /*    12     4 */

        /* size: 16, cachelines: 1, members: 6 */
        /* sum members: 15, holes: 1, sum holes: 1 */
        /* last cacheline: 16 bytes */
};
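If pahole(1) is not to hand, the same layout can be checked from C itself. The following is only a sketch: it uses C11's _Static_assert, assumes a common ABI where uint32_t is 4-byte aligned, and s_hole_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

union sptr_u {
        uint8_t base[1];
        uint32_t offset;
};
typedef union sptr_u sptr_t;

struct s {
        uint8_t name1_len;
        uint8_t name2_len;
        uint8_t name3_len;

        sptr_t name1;
        sptr_t name2;
        sptr_t name3;
};

/* Compile-time checks mirroring the pahole(1) output (ABI assumption). */
_Static_assert(sizeof(sptr_t) == 4, "sptr_t is 4 bytes");
_Static_assert(offsetof(struct s, name1) == 4, "name1 at offset 4");
_Static_assert(offsetof(struct s, name2) == 8, "name2 at offset 8");
_Static_assert(offsetof(struct s, name3) == 12, "name3 at offset 12");
_Static_assert(sizeof(struct s) == 16, "struct s is 16 bytes");

/* Size of the padding hole between name3_len and name1. */
static size_t s_hole_bytes(void)
{
        return offsetof(struct s, name1)
               - (offsetof(struct s, name3_len) + sizeof(uint8_t));
}
```

If any of the assumptions do not hold on a given platform, the compile fails rather than the program misbehaving.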

So we have three strings: “toor”, “foobar” and “baz”.

“toor” starts at the address of s->name1 + 12; 12 is sizeof(sptr_t) * 3.

“foobar” starts at the address of s->name2 + 13; 13 is sizeof(sptr_t) * 2 plus the length of “toor\0” (5).

“baz” starts at the address of s->name3 + 16; 16 is sizeof(sptr_t) plus the combined lengths of “toor\0” and “foobar\0” (12).
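The offsets above, and the fact that they survive the block being moved, can be checked with a sketch along the following lines. The union, structure, sptr_set() and sptr_get() are as in the example program; build_block() and relocate_block() are hypothetical helpers, and copying the block with memcpy(3) merely stands in for another process mapping the same bytes at a different address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

union sptr_u {
        uint8_t base[1];
        uint32_t offset;
};
typedef union sptr_u sptr_t;

struct s {
        uint8_t name1_len;
        uint8_t name2_len;
        uint8_t name3_len;

        sptr_t name1;
        sptr_t name2;
        sptr_t name3;
};

static void sptr_set(sptr_t *sptr, void *ptr)
{
        sptr->offset = (uint8_t *)ptr - sptr->base;
}

static void *sptr_get(sptr_t *sptr)
{
        return sptr->base + sptr->offset;
}

/* Build the structure + strings block; the caller frees it. */
static struct s *build_block(size_t *size)
{
        static const char * const names[] = { "toor", "foobar", "baz" };
        struct s *s;
        char *p;

        *size = sizeof(struct s) + strlen(names[0]) + strlen(names[1]) +
                strlen(names[2]) + 3;
        s = malloc(*size);
        if (s == NULL)
                return NULL;

        s->name1_len = strlen(names[0]);
        s->name2_len = strlen(names[1]);
        s->name3_len = strlen(names[2]);

        /* The strings are packed straight after the structure. */
        p = (char *)s + sizeof(struct s);

        sptr_set(&s->name1, p);
        strcpy(p, names[0]);
        p += s->name1_len + 1;

        sptr_set(&s->name2, p);
        strcpy(p, names[1]);
        p += s->name2_len + 1;

        sptr_set(&s->name3, p);
        strcpy(p, names[2]);

        return s;
}

/* Copy the whole block to a new address; the offsets are left untouched. */
static struct s *relocate_block(const struct s *s, size_t size)
{
        struct s *copy = malloc(size);

        if (copy != NULL)
                memcpy(copy, s, size);

        return copy;
}
```

Calling sptr_get() on the members of the copy yields the copy's own strings without any fix-up, which is the property that makes these usable across processes.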


Andrew Clayton, Apr 8th 2024