Andrei reported an issue on arm64 where he was seeing the following
error message when running the tests
2024/01/17 18:32:31.109 [error] 54904#54904 "gidmap" field has an entry with "size": 1, but for unprivileged unit it must be 1.
This error message is guarded by the following if statement
if (nxt_slow_path(m.size > 1))
It turns out size was indeed > 1; in this case it was 289356276058554369.
m.size is defined as a nxt_int_t, which on arm64 is actually 8 bytes,
but it was being printed as a signed int (4 bytes), which by
chance/undefined behaviour comes out as 1.
But why is size so big? In this case it should have just been 1 with a
config of
'gidmap': [{'container': 0, 'host': os.getegid(), 'size': 1}],
This is due to nxt_int_t being 64bits on arm64 while the conf type used
was NXT_CONF_MAP_INT, which means that in nxt_conf_map_object() we would
do (using our m.size variable as an example)
ptr = nxt_pointer_to(data, map[i].offset);
...
ptr->i = num;
Here ptr is a union pointer that now points at our m.size.
Next we set m.size to the value of num (which is 1 in this case) via
ptr->i, where i is a member of that union of type int.
So here we are setting a 64bit memory location (nxt_int_t on arm64)
through a 32bit (int) union alias, which means we are only setting the
lower half (4 bytes) of it.
Whatever happens to be in the upper 4 bytes remains, giving us our
exceptionally large value.
This is demonstrated by this program
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int64_t num = -1; /* All 1's in two's complement */
    union {
        int32_t i32;
        int64_t i64;
    } *ptr;

    ptr = (void *)&num;

    ptr->i32 = 1;
    printf("num : %lu / %ld\n", num, num);
    ptr->i64 = 1;
    printf("num : %ld\n", num);

    return 0;
}
$ make union-32-64-issue
cc union-32-64-issue.c -o union-32-64-issue
$ ./union-32-64-issue
num : 18446744069414584321 / -4294967295
num : 1
However that is not the only issue. The members of
nxt_clone_map_entry_t were specified as nxt_int_t's, which on the likes
of x86_64 is a 32bit signed integer. However uid/gids on Linux at least
are defined as unsigned integers, so a nxt_int_t would not be big enough
to hold all potential values.
We could make them nxt_uint_t's but then we're back to the above union
aliasing problem.
We could just set the memory for these variables to 0 and that would
work, however that's really just papering over the problem.
The right thing is to use a large enough sized type to store these
things, hence the previously introduced nxt_cred_t. This is an int64_t
which is plenty large enough.
So we switch the nxt_clone_map_entry_t structure members over to
nxt_cred_t's and use NXT_CONF_MAP_INT64 as the conf type, which then
uses the right sized union member in nxt_conf_map_object() to set these
variables.
Reported-by: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This is a generic type to represent a uid_t/gid_t on Linux when user
namespaces are in use.
Technically this only needs to be an unsigned int, but we make it an
int64_t so we can make use of the existing NXT_CONF_MAP_INT64 type.
This will be used in subsequent commits.
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Since the previous commit, this is no longer used.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Since the previous commit, this is no longer used.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
The process abstraction has changed to:
setup(task, process)
start(task, process_data)
prefork(task, process, mp)
The prefork() occurs in the main process right before fork.
The file src/nxt_main_process.c is completely free of process
specific logic.
The creation of a process now supports a PROCESS_CREATED state. The
setup() function of each process can set its state to either
created or ready. If created, a MSG_PROCESS_CREATED is sent to the main
process, where external setup can be done (required for rootfs under
container).
The core processes (discovery, controller and router) don't need
external setup, so they all proceed to their start() function
straight away.
In the case of applications, the module is loaded at process setup()
time and the module's init() function has changed to be the start()
of the process.
The module API has changed to:
setup(task, process, conf)
start(task, data)
As a direct benefit of the PROCESS_CREATED message, the clone(2) of
processes using pid namespaces no longer needs to create a pipe
to make the child block until the parent sets up the uid/gid mappings,
nor does it need to receive the child pid.
|
|
The setuid/setgid syscalls require root capabilities, but if the kernel
supports unprivileged user namespaces then the child process has the full
set of capabilities in the new namespace, so we can allow setting "user"
and "group" in such cases (this is a common security use case).
Tests were added to ensure the user gets meaningful error messages for
uid/gid mapping misconfigurations.
|