Age | Commit message (Collapse) | Author | Files | Lines |
|
Due to the need to replace our use of clone/__NR_clone on Linux with
fork(2)/unshare(2) for enabling Linux namespaces(7) to keep the
pthreads(7) API working. Let's rename NXT_HAVE_CLONE to
NXT_HAVE_LINUX_NS, i.e name it after the feature, not how it's
implemented, then in future if we change how we do namespaces again we
don't have to rename this.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This commit hooks into the cgroup infrastructure added in the previous
commit to create per-application cgroups.
It does this by adding each "prototype process" into its own cgroup,
then each child process inherits its parents cgroup.
If we fail to create a cgroup we simply fail the process. This behaviour
may get enhanced in the future.
This won't actually do anything yet. Subsequent commits will hook this
up to the build and config systems.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Registering an isolated PID in the global PID hash is wrong
because it can be duplicated. Isolated processes are stored only
in the children list until the response for the WHOAMI message is
processed and the global PID is discovered.
To remove isolated siblings, a pointer to the children list is
introduced in the nxt_process_init_t struct.
This closes #633 issue on GitHub.
|
|
posix_spawn(3POSIX) was introduced by POSIX.1d
(IEEE Std 1003.1d-1999), and was later consolidated in
POSIX.1-2001, requiring it in all POSIX-compliant systems.
It's safe to assume it's always available, more than 20 years
after its standardization.
Link: <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/spawn.h.html>
|
|
Splitting the process type connectivity matrix to 'keep ports' and 'send
ports'; the 'keep ports' matrix is used to clean up unnecessary ports after
forking a new process, and the 'send ports' matrix determines which process
types expect to get created process ports.
Unfortunately, the original single connectivity matrix no longer works because
of an application stop delay caused by prototypes. Existing applications
should not get the new router port at the moment.
|
|
|
|
This enables the reuse of process creation functions.
|
|
The socket is required for intercontextual communication in multithreaded apps.
|
|
|
|
Generic process-to-process shared memory exchange is no more required. Here,
it is transformed into a router-to-application pattern. The outgoing shared
memory segments collection is now the property of the application structure.
The applications connect to the router only, and the process only needs to group
the ports.
|
|
The application process needs to request the port from the router instead of the
latter pushing the port before sending a request to the application. This is
required to simplify the communication between the router and the application
and to prepare the router to use the application shared port and then the queue.
|
|
- Changed the port management callbacks to notifications, which e. g. avoids
the need to call the libunit function
- Added context and library instance reference counts for a safer resource
release
- Added the router main port initialization
|
|
|
|
|
|
The process abstraction has changed to:
setup(task, process)
start(task, process_data)
prefork(task, process, mp)
The prefork() occurs in the main process right before fork.
The file src/nxt_main_process.c is completely free of process
specific logic.
The creation of a process now supports a PROCESS_CREATED state. The
The setup() function of each process can set its state to either
created or ready. If created, a MSG_PROCESS_CREATED is sent to main
process, where external setup can be done (required for rootfs under
container).
The core processes (discovery, controller and router) doesn't need
external setup, then they all proceeds to their start() function
straight away.
In the case of applications, the load of the module happens at the
process setup() time and The module's init() function has changed
to be the start() of the process.
The module API has changed to:
setup(task, process, conf)
start(task, data)
As a direct benefit of the PROCESS_CREATED message, the clone(2) of
processes using pid namespaces now doesn't need to create a pipe
to make the child block until parent setup uid/gid mappings nor it
needs to receive the child pid.
|
|
An earlier attempt (ad6265786871) to resolve this condition on the
router's side added a new issue: the app could get a request before
acquiring a port.
|
|
Missing error log messages added.
|
|
The setuid/setgid syscalls requires root capabilities but if the kernel
supports unprivileged user namespace then the child process has the full
set of capabilities in the new namespace, then we can allow setting "user"
and "group" in such cases (this is a common security use case).
Tests were added to ensure user gets meaningful error messages for
uid/gid mapping misconfigurations.
|
|
This is required to avoid include cycles, as some nxt_clone_* functions
depend on the credential structures, but nxt_process depends on clone
structures.
|
|
Introduces the functions nxt_process_init_create() and
nxt_process_init_creds_set().
|
|
Now the nxt_user_groups_get() function uses getgrouplist(3) when available
(except MacOS, see below). For some platforms, getgrouplist() supports
a method of probing how much groups the user has but the behavior is not
consistent. The method used here consists of optimistically trying to get up
to min(256, NGROUPS_MAX) groups; only if ngroups returned exceeds the original
value, we do a second call. This method can block main's process if LDAP/NDIS+
is in use.
MacOS has getgrouplist(3) but it's buggy. It doesn't update ngroups if the
value passed is smaller than the number of groups the user has. Some
projects (like Go stdlib) call getgrouplist() in a loop, increasing ngroups
until it exceeds the number of groups user belongs to or fail when a limit
is reached. For performance reasons, this is to be avoided and MacOS is
handled in the fallback implementation.
The fallback implementation is the old Unit approach. It saves main's
user groups (getgroups(2)) and then calls initgroups(3) to load application's
groups in main, then does a second getgroups(2) to store the gids and restore
main's groups in the end. Because of initgroups(3)' call to setgroups(2),
this method requires root capabilities. In the case of OSX, which has
small NGROUPS_MAX by default (16), it's not possible to restore main's groups
if it's large; if so, this method fallbacks again: user_cred gids aren't
stored, and the worker process calls initgroups() itself and may block for
some time if LDAP/NDIS+ is in use.
|
|
- Introduced nxt_runtime_process_port_create().
- Moved nxt_process_use() into nxt_process.c from nxt_runtime.c.
- Renamed nxt_runtime_process_remove_pid() as nxt_runtime_process_remove().
- Some public functions transformed to static.
This closes #327 issue on GitHub.
|
|
Now it's possible to pass -DNXT_HAVE_CLONE=0 for debugging.
|
|
When using "credential: true", the new namespace starts with a completely
empty uid and gid ranges. Then, any setuid/setgid/setgroups calls using ids
not properly mapped with uidmap and gidmap fields return EINVAL, meaning
the id is not valid inside the new namespace.
|
|
|
|
The leak has been introduced in 325b315e48c4.
This closes #322 issue in GitHub.
|
|
|
|
This closes #228 issue on GitHub.
|
|
|
|
|
|
The bug had appeared in 5cc5002a788e when process type has been
converted to bitmask. This commit reverts the type back to a number.
This commit is related to #131 issue on GitHub.
|
|
|
|
|
|
|
|
CID 200496
CID 200494
CID 200490
CID 200489
CID 200483
CID 200482
CID 200472
CID 200465
|
|
- Main process should be connected to all other processes.
- Controller should be connected to Router.
- Router should be connected to Controller and all Workers.
- Workers should be connected to Router worker thread ports only.
This filtering helps to avoid unnecessary communication and various errors
during massive application workers stop / restart.
|
|
Two different router threads may send different requests to single
application worker. In this case shared memory fds from worker
to router will be send over 2 different router ports. These fds
will be received and processed by different threads in any order.
This patch made possible to add incoming shared memory segments in
arbitrary order. Additionally, array and memory pool are no longer
used to store segments because of pool's single threaded nature.
Custom array-like structure nxt_port_mmaps_t introduced.
|
|
This helps to decouple process removal from port memory pool cleanups.
|
|
Use counter helps to simplify logic around port and application free.
Port 'post' function introduced to simplify post execution of particular
function to original port engine's thread.
Write message queue is protected by mutex which makes port write operation
thread safe.
|
|
Memory pool is not used by port_hash and it was a mistake to pass it into
'add' and 'remove' functions. port_hash enrties are allocated from heap.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Used for connection mem pool cleanup, which can be used by buffers.
Used for port mem pool to safely destroy linked process.
|
|
|
|
Application process start request DATA message from router to master.
Master notifies router via NEW_PORT message after worker process become ready.
|