summaryrefslogtreecommitdiffhomepage
path: root/src/nxt_isolation.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2024-06-18fs: Rename nxt_fs_mkdir_all() => nxt_fs_mkdir_p()Alejandro Colomar1-1/+1
"all" is too generic of an attribute to be meaningful. In the context of mkdir(), "parents" is used for this meaning, as in mkdir -p, so it should be more straightforward to readers. Tested-by: Andy Postnikov <apostnikov@gmail.com> Tested-by: Andrew Clayton <a.clayton@nginx.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@kernel.org>
2024-04-25Constify a bunch of static local variablesAndrew Clayton1-12/+12
A common pattern was to declare variables in functions like static nxt_str_t ... Not sure why static, as they were being treated more like string literals (and of course they are _not_ thread safe), let's actually make them constants (qualifier wise). This handles core code conversion. Reviewed-by: Zhidao HONG <z.hong@f5.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2024-01-30Isolation: Use an appropriate type for storing uid/gidsAndrew Clayton1-3/+3
Andrei reported an issue on arm64 where he was seeing the following error message when running the tests 2024/01/17 18:32:31.109 [error] 54904#54904 "gidmap" field has an entry with "size": 1, but for unprivileged unit it must be 1. This error message is guarded by the following if statement if (nxt_slow_path(m.size > 1) Turns out size was indeed > 1, in this case it was 289356276058554369, m.size is defined as a nxt_int_t, which on arm64 is actually 8 bytes, but was being printed as a signed int (4 bytes) and by chance/undefined behaviour comes out as 1. But why is size so big? In this case it should have just been 1 with a config of 'gidmap': [{'container': 0, 'host': os.getegid(), 'size': 1}], This is due to nxt_int_t being 64bits on arm64 but using a conf type of NXT_CONF_MAP_INT which means in nxt_conf_map_object() we would do (using our m.size variable as an example) ptr = nxt_pointer_to(data, map[i].offset); ... ptr->i = num; Where ptr is a union pointer and is now pointing at our m.size Next we set m.size to the value of num (which is 1 in this case), via ptr->i where i is a member of that union of type int. So here we are setting a 64bit memory location (nxt_int_t on arm64) through a 32bit (int) union alias, this means we are only setting the lower half (4) of the bytes. Whatever happens to be in the upper 4 bytes will remain, giving us our exceptionally large value. This is demonstrated by this program #include <stdio.h> #include <stdint.h> int main(void) { int64_t num = -1; /* All 1's in two's complement */ union { int32_t i32; int64_t i64; } *ptr; ptr = (void *)&num; ptr->i32 = 1; printf("num : %lu / %ld\n", num, num); ptr->i64 = 1; printf("num : %ld\n", num); return 0; } $ make union-32-64-issue cc union-32-64-issue.c -o union-32-64-issue $ ./union-32-64-issue num : 18446744069414584321 / -4294967295 num : 1 However that is not the only issue, because the members of nxt_clone_map_entry_t were specified as nxt_int_t's on the likes of x86_64 this would be a 32bit signed integer. However uid/gids on Linux at least are defined as unsigned integers, so a nxt_int_t would not be big enough to hold all potential values. We could make the nxt_uint_t's but then we're back to the above union aliasing problem. We could just set the memory for these variables to 0 and that would work, however that's really just papering over the problem. The right thing is to use a large enough sized type to store these things, hence the previously introduced nxt_cred_t. This is an int64_t which is plenty large enough. So we switch the nxt_clone_map_entry_t structure members over to nxt_cred_t's and use NXT_CONF_MAP_INT64 as the conf type, which then uses the right sized union member in nxt_conf_map_object() to set these variables. Reported-by: Andrei Zeliankou <zelenkov@nginx.com> Reviewed-by: Zhidao Hong <z.hong@f5.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-03-17Default PR_SET_NO_NEW_PRIVS to off.Andrew Clayton1-0/+4
This prctl(2) option was enabled in commit 0277d8f1 ("Isolation: Fix the enablement of PR_SET_NO_NEW_PRIVS.") and this was being set by default. This prctl(2) when enabled renders (amongst other things) the set-UID and set-GID bits on executables ineffective after an execve(2). This causes an issue for applications that want to execute the sendmail(8) binary, this includes the PHP mail() function, which is usually set-GID. After some internal discussion it was decided to disable this option by default. Closes: <https://github.com/nginx/unit/issues/852> Fixes: 0277d8f1 ("Isolation: Fix the enablement of PR_SET_NO_NEW_PRIVS.") Fixes: e2b53e16 ("Added "rootfs" feature.") Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-24Isolation: rootfs: Set the sticky bit on the tmp directory.Andrew Clayton1-1/+1
When using the 'rootfs' isolation option, by default a tmpfs filesystem is mounted on tmp/. Currently this is mounted with a mode of 0777, i.e drwxrwxrwx. 3 root root 60 Feb 22 11:56 tmp however this should really have the sticky bit[0] set (as is per-normal for such directories) to prevent users from having free reign on the files contained within. What we really want is it mounted with a mode of 01777, i.e drwxrwxrwt. 3 root root 60 Feb 22 11:57 tmp [0]: To quote inode(7) "The sticky bit (S_ISVTX) on a directory means that a file in that directory can be renamed or deleted only by the owner of the file, by the owner of the directory, and by a privileged process." Reviewed-by: Liam Crilly <liam@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-02-17Isolation: Rename NXT_HAVE_CLONE -> NXT_HAVE_LINUX_NS.Andrew Clayton1-4/+4
Due to the need to replace our use of clone/__NR_clone on Linux with fork(2)/unshare(2) for enabling Linux namespaces(7) to keep the pthreads(7) API working. Let's rename NXT_HAVE_CLONE to NXT_HAVE_LINUX_NS, i.e name it after the feature, not how it's implemented, then in future if we change how we do namespaces again we don't have to rename this. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-10Isolation: wired up per-application cgroup support internally.Andrew Clayton1-0/+50
This commit hooks into the cgroup infrastructure added in the previous commit to create per-application cgroups. It does this by adding each "prototype process" into its own cgroup, then each child process inherits its parents cgroup. If we fail to create a cgroup we simply fail the process. This behaviour may get enhanced in the future. This won't actually do anything yet. Subsequent commits will hook this up to the build and config systems. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-08-02Rejecting non-Linux pivot_root(2).Alejandro Colomar1-3/+3
Some non-Linux systems implement pivot_root(2), even if they don't document that. An example is MacOS: $ grepc pivot_root / 2>/dev/null .../sys/sysproto.h:3012: int pivot_root(struct proc *, struct pivot_root_args *, int *); Since the prototype of the syscall differs from that of Linux, we can't use that syscall. Let's make sure the test only detects pivot_root(2) under Linux. Also, rename the feature macro to make clear that it's only about Linux's pivot_root(2). This closes #737 issue on GitHub.
2022-08-02Including <mntent.h> iff it exists.Alejandro Colomar1-1/+1
With NXT_HAVE_PIVOT_ROOT, we had issues in MacOS. Headers should normally be included unconditionally, except of course if they don't exist. This fixes part of the #737 issue on GitHub.
2022-07-18Replaced Linux syscall macros by libc macros.Alejandro Colomar1-1/+1
User-space programs should use the SYS_*form, as documented in syscall(2). That also adds compatibility to non-Linux systems.
2021-08-03Fixed dead assignments.Max Romanov1-2/+2
Found by Clang Static Analyzer.
2020-12-14Isolation: fixed unmounting when mnt namespace is in place.Tiago Natel de Moura1-6/+0
The code had a wrong assumption that "mount namespaces" automatically unmounts process mounts when exits but this happens only with unprivileged mounts.
2020-11-16Isolation: added option to disable "procfs" mount.Tiago Natel de Moura1-18/+27
Now users can disable the default procfs mount point in the rootfs. { "isolation": { "automount": { "procfs": false } } }
2020-11-13Isolation: added option to disable tmpfs mount.Tiago Natel de Moura1-19/+29
Now users can disable the default tmpfs mount point in the rootfs. { "isolation": { "automount": { "tmpfs": false } } }
2020-10-29Isolation: mounting of procfs by default when using "rootfs".Tiago Natel de Moura1-37/+49
2020-10-29Isolation: correctly unmount non-dependent paths first.Tiago Natel de Moura1-4/+36
When mount points reside within other mount points, this patch sorts them by path length and then unmounts then in an order reverse to their mounting. This results in independent paths being unmounted first. This fixes an issue in buildbots where dependent paths failed to unmount, leading to the build script removing system-wide language libraries.
2020-09-18Updated racially charged language in messages and comments.Artem Konev1-3/+3
2020-09-16Isolation: remove redundant macro.Tiago Natel de Moura1-1/+1
2020-08-25Isolation: added "automount" option.Tiago Natel de Moura1-10/+57
Now it's possible to disable default bind mounts of languages by setting: { "isolation": { "automount": { "language_deps": false } } } In this case, the user is responsible to provide a "rootfs" containing the language libraries and required files for the application.
2020-08-20Isolation: mount tmpfs by default.Tiago Natel de Moura1-49/+56
2020-08-20Moved isolation related code to "nxt_isolation.c".Tiago Natel de Moura1-0/+958