Age | Commit message (Collapse) | Author | Files | Lines |
|
Due to the need to replace our use of clone/__NR_clone on Linux with
fork(2)/unshare(2) for enabling Linux namespaces(7) to keep the
pthreads(7) API working. Let's rename NXT_HAVE_CLONE to
NXT_HAVE_LINUX_NS, i.e name it after the feature, not how it's
implemented, then in future if we change how we do namespaces again we
don't have to rename this.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This commit hooks into the cgroup infrastructure added in the previous
commit to create per-application cgroups.
It does this by adding each "prototype process" into its own cgroup,
then each child process inherits its parents cgroup.
If we fail to create a cgroup we simply fail the process. This behaviour
may get enhanced in the future.
This won't actually do anything yet. Subsequent commits will hook this
up to the build and config systems.
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
Unix domain sockets are normally backed by files in the
filesystem. This has historically been problematic when closing
and opening again such sockets, since SO_REUSEADDR is ignored for
Unix sockets (POSIX left the behavior of SO_REUSEADDR as
implementation-defined, and most --if not all-- implementations
decided to just ignore this flag).
Many solutions are available for this problem, but all of them
have important caveats:
- unlink(2) the file when it's not needed anymore.
This is not easy, because the process that controls the fd may
not be the same process that created the file, and may not have
file permissions to remove it.
Further solutions can be applied to that caveat:
- unlink(2) the file right after creation.
This will remove the pathname from the filesystem without
closing the socket (it will continue to live until the last fd
is closed). This is not useful for us, since we need the
pathname of the socket as its interface.
- chown(2) or chmod(2) the directory that contains the socket.
For removing a file from the filesystem, a process needs
write permissions in the containing directory. We could
put sockets in dummy directories that can be chown(2)ed to
nobody. This could be dangerous, though, as we don't control
the socket names. It is our users who configure the socket
name in their configuration, and so it's easy that they don't
understand the many implications of not chosing an appropriate
socket pathname. A user could unknowingly put the socket in a
directory that is not supposed to be owned by user nobody, and
if we blindly chown(2) or chmod(2) the directory, we could be
creating a big security hole.
- Ask the main process to remove the socket.
This would require a very complex communication mechanism with
the main process, which is not impossible, but let's avoid it
if there are simpler solutions.
- Give the child process the CAP_DAC_OVERRIDE capability.
That is one of the most powerful capabilities. A process with
that capability can be considered root for most practical
aspects. Even if the capability is disabled for most of the
lifetime of the process, there's a slight chance that a
malicious actor could activate it and then easily do serious
damage to the system.
- unlink(2) the file right before calling bind(2).
This is dangerous because another process (for example, another
running instance of unitd(8)), could be using the socket, and
removing the pathname from the filesystem would be problematic.
To do this correctly, a lot of checks should be added before the
actual unlink(2), which is error-prone, and difficult to do
correctly, and atomically.
- Use abstract-namespace Unix domain sockets.
This is the simplest solution, as it only requires accepting a
slightly different syntax (basically a @ prefix) for the socket
name, to transform it into a string starting with a null byte
('\0') that the kernel can understand. The patch is minimal.
Since abstract sockets live in an abstract namespace, they don't
create files in the filesystem, so there's no need to remove
them later. The kernel removes the name when the last fd to it
has been closed.
One caveat is that only Linux currently supports this kind of
Unix sockets. Of course, a solution to that could be to ask
other kernels to implement such a feature.
Another caveat is that filesystem permissions can't be used to
control access to the socket file (since, of course, there's no
file). Anyone knowing the socket name can access to it. The
only method to control access to it is by using
network_namespaces(7). Since in unitd(8) we're using 0666 file
sockets, abstract sockets should be no more insecure than that
(anyone can already read/write to the listener sockets).
- Ask the kernel to implement a simpler way to unlink(2) socket
files when they are not needed anymore. I've suggested that to
the <linux-fsdevel@vger.kernel.org> mailing list, in:
<lore.kernel.org/linux-fsdevel/0bc5f919-bcfd-8fd0-a16b-9f060088158a@gmail.com/T>
In this commit, I decided to go for the easiest/simplest solution,
which is abstract sockets. In fact, we already had partial
support. This commit only fixes some small bug in the existing
code so that abstract Unix sockets work:
- Don't chmod(2) the socket if it's an abstract one.
This fixes the creation of abstract sockets, but doesn't make them
usable, since we produce them with a trailing '\0' in their name.
That will be fixed in the following commit.
This closes #669 issue on GitHub.
|
|
Some lines (incorrectly) had an indentation of 3 or 5, or 7 or 9,
or 11 or 13, or 15 or 17 spaces instead of 4, 8, 12, or 16. Fix them.
Found with:
$ find src -type f | xargs grep -n '^ [^ ]';
$ find src -type f | xargs grep -n '^ [^ *]';
$ find src -type f | xargs grep -n '^ [^ ]';
$ find src -type f | xargs grep -n '^ [^ *]';
$ find src -type f | xargs grep -n '^ [^ +]';
$ find src -type f | xargs grep -n '^ [^ *+]';
$ find src -type f | xargs grep -n '^ [^ +]';
$ find src -type f | xargs grep -n '^ [^ *+]';
|
|
After the c8790d2a89bb commit, the SIGCHLD handler may return before processing
all awaiting PIDs. To avoid zombie processes and ensure successful main
process termination, waitpid() must be called until an error is returned.
This closes #600 issue on GitHub.
|
|
After the c8790d2a89bb commit, the SIGCHLD handler may return before processing
all awaiting PIDs. To avoid zombie processes and ensure successful main
process termination, waitpid() must be called until an error is returned.
This closes #600 issue on GitHub.
|
|
Application process started with shared port (and queue) already configured.
But still waits for PORT_ACK message from router to start request processing
(so-called "ready state").
Waiting for router confirmation is necessary. Otherwise, the application may
produce response and send it to router before the router have the information
about the application process. This is a subject of further optimizations.
|
|
|
|
This enables the reuse of process creation functions.
|
|
|
|
Introducting application graceful stop. For now only used when application
process reach request limit value.
This closes #585 issue on GitHub.
|
|
Declarations became unused after 6976d36be926.
No functional changes.
|
|
|
|
Found by Clang Static Analyzer.
|
|
This feature allows one to specify blocks of code that are called when certain
lifecycle events occur. A user configures a "hooks" property on the app
configuration that points to a script. This script will be evaluated on boot
and should contain blocks of code that will be called on specific events.
An example of configuration:
{
"type": "ruby",
"processes": 2,
"threads": 2,
"user": "vagrant",
"group": "vagrant",
"script": "config.ru",
"hooks": "hooks.rb",
"working_directory": "/home/vagrant/unit/rbhooks",
"environment": {
"GEM_HOME": "/home/vagrant/.ruby"
}
}
An example of a valid "hooks.rb" file follows:
File.write("./hooks.#{Process.pid}", "hooks evaluated")
on_worker_boot do
File.write("./worker_boot.#{Process.pid}", "worker booted")
end
on_thread_boot do
File.write("./thread_boot.#{Process.pid}.#{Thread.current.object_id}",
"thread booted")
end
on_thread_shutdown do
File.write("./thread_shutdown.#{Process.pid}.#{Thread.current.object_id}",
"thread shutdown")
end
on_worker_shutdown do
File.write("./worker_shutdown.#{Process.pid}", "worker shutdown")
end
This closes issue #535 on GitHub.
|
|
|
|
For listen socket request reply port can be NULL if Router crashes immediately
after issuing the request.
Found by Coverity (CID 366310).
|
|
This patch is required to remove fragmented messages functionality.
|
|
|
|
Introducing manual protocol selection for 'universal' apps and frameworks.
|
|
This closes #486 issue on GitHub.
|
|
This closes #482 issue on GitHub.
|
|
This closes #458 issue on GitHub.
|
|
This closes #459 issue on GitHub.
|
|
|
|
Now it is possible to specify the name of the application callable using
optional parameter 'callable'. Default value is 'application'.
This closes #290 issue on GitHub.
|
|
Now it's possible to disable default bind mounts of
languages by setting:
{
"isolation": {
"automount": {
"language_deps": false
}
}
}
In this case, the user is responsible to provide a "rootfs"
containing the language libraries and required files for
the application.
|
|
|
|
Previously, an error during the prefork phase triggered assert:
src/nxt_port.c:27 assertion failed: port->pair[0] == -1
and resulted in exiting of the main process.
This could be easily reproduced by pushing a configuration with "rootfs",
when daemon is running without required permissions.
|
|
|
|
The process abstraction has changed to:
setup(task, process)
start(task, process_data)
prefork(task, process, mp)
The prefork() occurs in the main process right before fork.
The file src/nxt_main_process.c is completely free of process
specific logic.
The creation of a process now supports a PROCESS_CREATED state. The
The setup() function of each process can set its state to either
created or ready. If created, a MSG_PROCESS_CREATED is sent to main
process, where external setup can be done (required for rootfs under
container).
The core processes (discovery, controller and router) doesn't need
external setup, then they all proceeds to their start() function
straight away.
In the case of applications, the load of the module happens at the
process setup() time and The module's init() function has changed
to be the start() of the process.
The module API has changed to:
setup(task, process, conf)
start(task, data)
As a direct benefit of the PROCESS_CREATED message, the clone(2) of
processes using pid namespaces now doesn't need to create a pipe
to make the child block until parent setup uid/gid mappings nor it
needs to receive the child pid.
|
|
This allows to specify multiple subsequent targets inside PHP applications.
For example:
{
"listeners": {
"*:80": {
"pass": "routes"
}
},
"routes": [
{
"match": {
"uri": "/info"
},
"action": {
"pass": "applications/my_app/phpinfo"
}
},
{
"match": {
"uri": "/hello"
},
"action": {
"pass": "applications/my_app/hello"
}
},
{
"action": {
"pass": "applications/my_app/rest"
}
}
],
"applications": {
"my_app": {
"type": "php",
"targets": {
"phpinfo": {
"script": "phpinfo.php",
"root": "/www/data/admin",
},
"hello": {
"script": "hello.php",
"root": "/www/data/test",
},
"rest": {
"root": "/www/data/example.com",
"index": "index.php"
},
}
}
}
}
|
|
|
|
The setuid/setgid syscalls requires root capabilities but if the kernel
supports unprivileged user namespace then the child process has the full
set of capabilities in the new namespace, then we can allow setting "user"
and "group" in such cases (this is a common security use case).
Tests were added to ensure user gets meaningful error messages for
uid/gid mapping misconfigurations.
|
|
This is required to avoid include cycles, as some nxt_clone_* functions
depend on the credential structures, but nxt_process depends on clone
structures.
|
|
Introduces the functions nxt_process_init_create() and
nxt_process_init_creds_set().
|
|
- Introduced nxt_runtime_process_port_create().
- Moved nxt_process_use() into nxt_process.c from nxt_runtime.c.
- Renamed nxt_runtime_process_remove_pid() as nxt_runtime_process_remove().
- Some public functions transformed to static.
This closes #327 issue on GitHub.
|
|
This avoids memory leak reports from the address sanitizer.
|
|
This patch closes #328 in github.
|
|
|
|
When Unit starts, the main process waits for module discovery message for a
while. If a QUIT signal arrives at this time, the router and controller
processes created by main and Unit stay running. Also, the main process
doesn't stop them after the second QUIT signal is received in this case.
|
|
The <sched.h> is already included by nxt_unix.h.
This closes #314 PR on GitHub.
|
|
Found by Coverity (CID 349485).
|
|
|
|
This closes #312 issue on GitHub.
|
|
|
|
|
|
|
|
This is related to issue #198 on GitHub.
|