summaryrefslogtreecommitdiffhomepage
path: root/src (follow)
AgeCommit message (Collapse)AuthorFilesLines
2023-01-20Python: ASGI: Switch away from asyncio.get_event_loop().Andrew Clayton1-1/+20
Several users on GitHub reported issues with running Python ASGI apps on Unit with Python 3.11.1 (this would also effect Python 3.10.9) with the following error from Unit 2023/01/15 22:43:22 [alert] 0#77128 [unit] Python failed to call 'asyncio.get_event_loop' TL;DR asyncio.get_event_loop() is currently broken due to the process of deprecating part or all of it. First some history. In Unit we had this commit commit 8dcb0b9987033d0349a6ecf528014a9daa574787 Author: Max Romanov <max.romanov@nginx.com> Date: Thu Nov 5 00:04:59 2020 +0300 Python: request processing in multiple threads. One of things this did was to create a new asyncio event loop in each thread using asyncio.new_event_loop(). It's perhaps worth noting that all these asyncio.* functions are Python functions that we call from the C code in Unit. Then we had this commit commit f27fbd9b4d2bdaddf1e7001d0d0bc5586ba04cd4 Author: Max Romanov <max.romanov@nginx.com> Date: Tue Jul 20 10:37:54 2021 +0300 Python: using default event_loop for main thread for ASGI. This changed things so that Unit calls asyncio.get_event_loop() in the _main_ thread (but still calls asyncio.new_event_loop() in the other threads). asyncio.get_event_loop() up until recently would either return an already running event loop or return a newly created one. This was done for $reasons that the commit message and GitHub issue #560 hint at. But the intimation is that there can already be an event loop running from the application (I assume it's referring to the users application) at this point and if there is we should use it. Now for the Python side of things. On the main branch we had commit 172c0f2752d8708b6dda7b42e6c5a3519420a4e8 Author: Serhiy Storchaka <storchaka@gmail.com> Date: Sun Apr 25 13:40:44 2021 +0300 bpo-39529: Deprecate creating new event loop in asyncio.get_event_loop() (GH-23554) This commit began the deprecating of asyncio.get_event_loop(). commit fd38a2f0ec03b4eec5e3cfd41241d198b1ee555a Author: Serhiy Storchaka <storchaka@gmail.com> Date: Tue Dec 6 19:42:12 2022 +0200 gh-93453: No longer create an event loop in get_event_loop() (#98440) This turned asyncio.get_event_loop() into a RuntimeError _if_ there isn't a current event loop. commit e5bd5ad70d9e549eeb80aadb4f3ccb0f2f23266d Author: Serhiy Storchaka <storchaka@gmail.com> Date: Fri Jan 13 14:40:29 2023 +0200 gh-100160: Restore and deprecate implicit creation of an event loop (GH-100410) This re-creates the event loop if there wasn't one and emits a deprecation warning. After at least the last two commits Unit no longer works with the Python _main_ branch. Meanwhile on the 3.11 branch we had commit 3fae04b10e2655a20a3aadb5e0d63e87206d0c67 Author: Serhiy Storchaka <storchaka@gmail.com> Date: Tue Dec 6 17:15:44 2022 +0200 [3.11] gh-93453: Only emit deprecation warning in asyncio.get_event_loop when a new event loop is created (#99949) which is what caused our breakage, though perhaps unintentionally as we get the following traceback Traceback (most recent call last): File "/usr/lib64/python3.11/asyncio/events.py", line 676, in get_event_loop f = sys._getframe(1) ^^^^^^^^^^^^^^^^ ValueError: call stack is not deep enough 2023/01/18 02:46:10 [alert] 0#180279 [unit] Python failed to call 'asyncio.get_event_loop' However, regardless, it is clear we need to stop using asyncio.get_event_loop(). One option is to switch to the higher level asyncio.run() API, however that is a rather large change. This commit takes the simpler approach of using asyncio.get_running_loop() (which it seems get_event_loop() will eventually be an alias of) in the _main_ thread to return the currently running event loop, or if there is no current event loop, it will call asyncio.new_event_loop() to return a newly created event loop. I believe this mimics the current behaviour. In my testing get_event_loop() seemed to always return a newly created loop, as when just calling get_running_loop() it would return NULL and we would fail out. When running two processes each with 2 threads we would get the following loops with Python 3.11.0 and unpatched Unit <_UnixSelectorEventLoop running=False closed=False debug=False> <_UnixSelectorEventLoop running=False closed=False debug=False> <_UnixSelectorEventLoop running=False closed=False debug=False> <_UnixSelectorEventLoop running=False closed=False debug=False> and with Python 3.11.1 and a patched Unit we would get <_UnixSelectorEventLoop running=False closed=False debug=False> <_UnixSelectorEventLoop running=False closed=False debug=False> <_UnixSelectorEventLoop running=False closed=False debug=False> <_UnixSelectorEventLoop running=False closed=False debug=False> Tested-by: Rafał Safin <rafal.safin12@gmail.com> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-01-20Python: ASGI: Factor out event loop creation to its own function.Andrew Clayton1-21/+35
This is a preparatory patch that factors out the asyncio event loop creation code from nxt_python_asgi_ctx_data_alloc() into its own function, to facilitate being called multiple times. This a part of the work to move away from using the asyncio.get_event_loop() function due to it no longer creating event loops if there wasn't one running. See the following commit for the gory details. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-11-07PHP: Fix a potential problem parsing the path.Andrew Clayton1-1/+2
@dward on GitHub reported an issue with a URL like http://foo.bar/test.php?blah=test.php/foo where we would end up trying to run the script test.php?blah=test.php In the PHP module the format 'file.php/' is treated as a special case in nxt_php_dynamic_request() where we check the _path_ part of the url for the string '.php/'. The problem is that the path actually also contains the query string, thus we were finding 'test.php/' in the above URL and treating that whole path as the script to run. The fix is simple, replace the strstr(3) with a memmem(3), where we can limit the amount of path we use for the check. The trick here and what is not obvious from the code is that while path.start points to the whole path including the query string, path.length only contains the length of the _path_ part. NOTE: memmem(3) is a GNU extension and is neither specified by POSIX or ISO C, however it is available on a number of other systems, including: FreeBSD, OpenBSD, NetBSD, illumos, and macOS. If it comes to it we can implement a simple alternative for systems which lack memmem(3). This also adds a test case (provided by @dward) to cover this. Closes: <https://github.com/nginx/unit/issues/781> Cc: Andrei Zeliankou <zelenkov@nginx.com> Reviewed-by: Alejandro Colomar <alx@nginx.com> Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com> [test] Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-02Fix endianness detection in nxt_websocket_header_t.Andrew Clayton1-2/+2
The nxt_websocket_header_t structure defines the layout of a websocket frame header. As the websocket frame is mapped directly onto this structure its layout needs to match how it's coming off the network. The network being big endian means on big endian systems the structure layout can simply match that of the websocket frame header. On little endian systems we need to reverse the two bytes. This was done via the BYTE_ORDER, BIG_ENDIAN and LITTLE_ENDIAN macros, however these are not universal, e.g they are _not_ defined on illumos (OpenSolaris / OpenIndiana) and so we get the following compiler errors In file included from src/nxt_h1proto.c:12:0: src/nxt_websocket_header.h:25:13: error: duplicate member 'opcode' uint8_t opcode:4; ^~~~~~ src/nxt_websocket_header.h:26:13: error: duplicate member 'rsv3' uint8_t rsv3:1; ^~~~ src/nxt_websocket_header.h:27:13: error: duplicate member 'rsv2' uint8_t rsv2:1; ^~~~ src/nxt_websocket_header.h:28:13: error: duplicate member 'rsv1' uint8_t rsv1:1; ^~~~ src/nxt_websocket_header.h:29:13: error: duplicate member 'fin' uint8_t fin:1; ^~~ src/nxt_websocket_header.h:31:13: error: duplicate member 'payload_len' uint8_t payload_len:7; ^~~~~~~~~~~ src/nxt_websocket_header.h:32:13: error: duplicate member 'mask' uint8_t mask:1; ^~~~ This commit fixes that by using the new NXT_HAVE_{BIG,LITTLE}_ENDIAN macros introduced in the previous commit. Closes: <https://github.com/nginx/unit/issues/297> Fixes: e501c74 ("Introducing websocket support in router and libunit.") Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-01-05Python: Fix enabling of UTF-8 in some situations.Andrew Clayton1-0/+14
There was a couple of reports of Python applications failing due to the following type of error File "/opt/netbox/netbox/netbox/configuration.py", line 25, in _import print(f"\U0001f9ec loaded config '{path}'") UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f9ec' in position 0: ordinal not in range(128) due to the use of Unicode text in the print() statement. This only happened for python 3.8+ when using the "home" configuration option as this meant we were going through the new PyConfig configuration. When using this new configuration method with the 'isolated' specific API (for embedded Python) UTF-8 is disabled by default, PyPreConfig->utf8_mode = 0. To fix this we need to setup the Python pre config and enable utf-8 mode. However rather than enable utf-8 unconditionally we can set to it to -1 so that it will use the LC_CTYPE environment variable to determine whether to enable utf-8 mode or not. utf-8 mode will be enabled if LC_CTYPE is either: C, POSIX or some specific UTF-8 locale. This is the default utf8_mode setting when using the non-isolated PyPreConfig API. Reported-by: Tobias Genannt <tobias.genannt@kappa-velorum.net> Tested-by: Tobias Genannt <tobias.genannt@kappa-velorum.net> Link: <https://peps.python.org/pep-0587/> Link: <https://docs.python.org/3/c-api/init_config.html#c.PyPreConfig.utf8_mode> Fixes: 491d0f70 ("Python: Added support for Python 3.11.") Closes: <https://github.com/nginx/unit/issues/817> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-30Python: Do some cleanup in nxt_python3_init_config().Andrew Clayton1-10/+12
This is a preparatory patch for future work and cleans up the code a little in the Python 3.8+ variant of nxt_python3_init_config(). The main advantage being we no longer have calls to PyConfig_Clear() in two different paths. The variables have a little extra space in their declarations to allow for the next patch which introduces a variable with a longer type name, which will help reduce the size of the diff. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-14Python: Added "prefix" to configuration.OutOfFocus46-23/+155
This patch gives users the option to set a `"prefix"` attribute for Python applications, either at the top level or for specific `"target"`s. If the attribute is present, the value of `"prefix"` must be a string beginning with `"/"`. If the value of the `"prefix"` attribute is longer than 1 character and ends in `"/"`, the trailing `"/"` is stripped. The purpose of the `"prefix"` attribute is to set the `SCRIPT_NAME` context value for WSGI applications and the `root_path` context value for ASGI applications, allowing applications to properly route requests regardless of the path that the server uses to expose the application. The context value is only set if the request's URL path begins with the value of the `"prefix"` attribute. In all other cases, the `SCRIPT_NAME` or `root_path` values are not set. In addition, for WSGI applications, the value of `"prefix"` will be stripped from the beginning of the request's URL path before it is sent to the application. Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com> Reviewed-by: Artem Konev <artem.konev@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-12-14Removed dead code.OutOfFocus41-1/+0
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-12-13Configuration: made large_header_buffers a valid setting.Andrew Clayton1-0/+3
This is an extension to the previous commit, which made large_header_buffer_size a valid configuration setting. This commit makes a related value, large_header_buffers, a valid configuration setting. While large_header_buffer_size effectively limits the maximum size of any single header (although unit will try to pack multiple headers into a buffer if they wholly fit). large_header_buffers limits how many of these 'large' buffers are available. It makes sense to also allow this to be user set. large_header_buffers is already set by the configuration system in nxt_router.c it just isn't set as a valid config option in nxt_conf_validation.c With this change users can set this option in their config if required by the following "settings": { "http": { "large_header_buffers": 8 } }, It retains its default value of 4 if this is not set. NOTE: This is being released as undocumented and subject to change as it exposes internal workings of unit. Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-13Configuration: made large_header_buffer_size a valid setting.Andrew Clayton1-0/+3
@JanMikes and @tagur87 on GitHub both reported issues with long URLs that were exceeding the 8192 byte large_header_buffer_size setting, which resulted in a HTTP 431 error (Request Header Fields Too Large). This can be resolved in the code by updating the following line in src/nxt_router.c::nxt_router_conf_create() skcf->large_header_buffer_size = 8192; However, requiring users to modify unit and install custom versions is less than ideal. We could increase the value, but to what? This commit takes the option of allowing the user to set this option in their config by making large_header_buffer_size a valid configuration setting. large_header_buffer_size is already set by the configuration system in nxt_router.c it just isn't set as a valid config option in nxt_conf_validation.c With this change users can set this option in their config if required by the following "settings": { "http": { "large_header_buffer_size": 16384 } }, It retains its default value of 8192 bytes if this is not set. With this commit, without the above setting or too low a value, with a long URL you get a 431 error. With the above setting set to a large enough value, the request is successful. NOTE: This setting really determines the maximum size of any single header _value_. Also, unit will try and place multiple values into a buffer _if_ they fully fit. NOTE: This is being released as undocumented and subject to change as it exposes internal workings of unit. Closes: <https://github.com/nginx/unit/issues/521> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-10Isolation: wired up cgroup support to the config system.Andrew Clayton1-0/+62
This hooks the cgroup support up to the config system so it can actually be used. To make use of this in unit a new "cgroup" section has been added to the isolation configuration. e.g "applications": { "python": { "type": "python", "processes": 5, "path": "/opt/unit/unit-cgroup-test/", "module": "app", "isolation": { "cgroup": { "path": "app/python" } } } } Now there are two ways to specify the path, relative, like the above (without a leading '/') and absolute (with a leading '/'). In the above case the "python" application is placed into its own cgroup under CGROUP_ROOT/<main unit process cgroup>/app/python. Whereas if you specified say "path": "/unit/app/python" Then the python application would be placed under CGROUP_ROOT/unit/app/python The first option allows you to easily take advantage of any resource limits that have already been configured for unit. With the second method (absolute pathname) if you know of an already existing cgroup where you'd like to place it, you can, e.g "path": "/system.slice/unit/python" Where system.slice has already been created by systemd and may already have some overall system limits applied which would also apply to unit. Limits apply down the hierarchy and lower groups can't exceed the previous group limits. So what does this actually look like? Lets take the unit-calculator application[0] and have each of its applications placed into their own cgroup. If we give each application a new section like "isolation": { "cgroup": { "path": "/unit/unit-calculator/add" } } changing the path for each one, we can visualise the result with the systemd-cgls command, e.g │ └─session-5.scope (#4561) │ ├─ 6667 sshd: andrew [priv] │ ├─ 6684 sshd: andrew@pts/0 │ ├─ 6685 -bash │ ├─ 12632 unit: main v1.28.0 [/opt/unit/sbin/unitd --control 127.0.0.1:808> │ ├─ 12634 unit: controller │ ├─ 12635 unit: router │ ├─ 13550 systemd-cgls │ └─ 13551 less ├─unit (#4759) │ └─unit-calculator (#5037) │ ├─subtract (#5069) │ │ ├─ 12650 unit: "subtract" prototype │ │ └─ 12651 unit: "subtract" application │ ├─multiply (#5085) │ │ ├─ 12653 unit: "multiply" prototype │ │ └─ 12654 unit: "multiply" application │ ├─divide (#5101) │ │ ├─ 12671 unit: "divide" prototype │ │ └─ 12672 node divide.js │ ├─sqroot (#5117) │ │ ├─ 12679 unit: "sqroot" prototype │ │ └─ 12680 /home/andrew/src/unit-calculator/sqroot/sqroot │ └─add (#5053) │ ├─ 12648 unit: "add" prototype │ └─ 12649 unit: "add" application We used an absolute path so the cgroups will be created relative to the main cgroupfs mount, e.g /sys/fs/cgroup We can see that the main unit processes are in the same cgroup as the shell from where they were started, by default child process are placed into the same cgroup as the parent. Then we can see that each application has been placed into its own cgroup under /sys/fs/cgroup Taking another example of a simple 5 process python application, with "isolation": { "cgroup": { "path": "app/python" } } Here we have specified a relative path and thus the python application will be placed below the existing cgroup that contains the main unit process. E.g │ │ │ ├─app-glib-cinnamon\x2dcustom\x2dlauncher\x2d3-43951.scope (#90951) │ │ │ │ ├─ 988 unit: main v1.28.0 [/opt/unit/sbin/unitd --no-daemon] │ │ │ │ ├─ 990 unit: controller │ │ │ │ ├─ 991 unit: router │ │ │ │ ├─ 43951 xterm -bg rgb:20/20/20 -fg white -fa DejaVu Sans Mono │ │ │ │ ├─ 43956 bash │ │ │ │ ├─ 58828 sudo -i │ │ │ │ ├─ 58831 -bash │ │ │ │ └─app (#107351) │ │ │ │ └─python (#107367) │ │ │ │ ├─ 992 unit: "python" prototype │ │ │ │ ├─ 993 unit: "python" application │ │ │ │ ├─ 994 unit: "python" application │ │ │ │ ├─ 995 unit: "python" application │ │ │ │ ├─ 996 unit: "python" application │ │ │ │ └─ 997 unit: "python" application [0]: <https://github.com/lcrilly/unit-calculator> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-10Isolation: wired up per-application cgroup support internally.Andrew Clayton4-0/+79
This commit hooks into the cgroup infrastructure added in the previous commit to create per-application cgroups. It does this by adding each "prototype process" into its own cgroup, then each child process inherits its parents cgroup. If we fail to create a cgroup we simply fail the process. This behaviour may get enhanced in the future. This won't actually do anything yet. Subsequent commits will hook this up to the build and config systems. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-10Isolation: added core cgroup infrastructure.Andrew Clayton2-0/+188
Firstly, this is not to be confused with CLONE_NEWCGROUP which unit already supports and is related to namespaces. To re-cap, namespaces allow processes to have different views of various parts of the system such as filesystem mounts, networking, hostname etc. Whereas cgroup[0] is a Linux kernel facility for collecting a bunch of processes together to perform some task on the group as a whole, for example to implement resource limits. There are two parts to cgroup, the core part of organising processes into a hierarchy and the controllers which are responsible for enforcing resource limits etc. There are currently two versions of the cgroup sub-system, the original cgroup and a version 2[1] introduced in 3.16 (August 2014) and marked stable in 4.5 (March 2016). This commit supports the cgroup V2 API and implements the ability to place applications into their own cgroup on a per-application basis. You can put them each into their own cgroup or you can group some together. The ability to set resource limits can easily be added in future. The initial use case of this would be to aid in observability of unit applications which becomes much easier if you can just monitor them on a per cgroup basis. One thing to note about cgroup, is that unlike namespaces which are controlled via system calls such as clone(2) and unshare(2), cgroups are setup and controlled through the cgroupfs pseudo-filesystem. cgroup is Linux only and this support will only be enabled if configure finds the cgroup2 filesystem mount, e.g cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate,memory_recursiveprot) The cgroups are removed on shutdown or as required on reconfiguration. This commit just adds the basic infrastructure for using cgroups within unit. Subsequent commits will wire up this support. It supports creating cgroups relative to the main cgroup root and also below the cgroup of the main unit process. [0]: <https://man7.org/linux/man-pages/man7/cgroups.7.html> [1]: <https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html> Cc: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-10Added simple wrappers for fopen(3) and fclose(3).Andrew Clayton2-0/+41
Add simple wrapper functions for fopen(3) and fclose(3) that are somewhat akin to the nxt_file_open() and nxt_file_close() wrappers that log errors. Suggested-by: Alejandro Colomar <alx@nginx.com> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-08Fix compilation with GCC and -O0.Andrew Clayton1-2/+2
Andrei reported an issue with building unit when using '-O0' with GCC producing the following compiler errors cc -c -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wmissing-prototypes -Werror -g -O0 -I src -I build \ \ \ -o build/src/nxt_unit.o \ -MMD -MF build/src/nxt_unit.dep -MT build/src/nxt_unit.o \ src/nxt_unit.c src/nxt_unit.c: In function ‘nxt_unit_log’: src/nxt_unit.c:6601:9: error: ‘msg’ may be used uninitialized [-Werror=maybe-uninitialized] 6601 | p = nxt_unit_snprint_prefix(p, end, pid, level); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/nxt_unit.c:6682:1: note: by argument 2 of type ‘const char *’ to ‘nxt_unit_snprint_prefix’ declared here 6682 | nxt_unit_snprint_prefix(char *p, const char *end, pid_t pid, int level) | ^~~~~~~~~~~~~~~~~~~~~~~ src/nxt_unit.c:6582:22: note: ‘msg’ declared here 6582 | char msg[NXT_MAX_ERROR_STR], *p, *end; | ^~~ src/nxt_unit.c: In function ‘nxt_unit_req_log’: src/nxt_unit.c:6645:9: error: ‘msg’ may be used uninitialized [-Werror=maybe-uninitialized] 6645 | p = nxt_unit_snprint_prefix(p, end, pid, level); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ src/nxt_unit.c:6682:1: note: by argument 2 of type ‘const char *’ to ‘nxt_unit_snprint_prefix’ declared here 6682 | nxt_unit_snprint_prefix(char *p, const char *end, pid_t pid, int level) | ^~~~~~~~~~~~~~~~~~~~~~~ src/nxt_unit.c:6625:35: note: ‘msg’ declared here 6625 | char msg[NXT_MAX_ERROR_STR], *p, *end; | ^~~ cc1: all warnings being treated as errors The above was reproduced with $ ./configure --cc-opt=-O0 && ./configure python && make -j4 This warning doesn't happen on clang (15.0.4) or GCC (8.3) and seems to have been introduced in GCC 11. The above is from GCC (12.2.1, Fedora 37). The trigger of this GCC issue is actually part of a commit I introduced a few months back to constify some function parameters and it seems the consensus for how to resolve this problem is to simply remove the const qualifier from the *end parameter to nxt_unit_snprint_prefix(). Reported-by: Andrei Zeliankou <zelenkov@nginx.com> Link: <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100417> Link: <https://github.com/samtools/htslib/pull/1285> Link: <https://gcc.gnu.org/gcc-11/changes.html> Fixes: 4418f99 ("Constified numerous function parameters.") Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-12-06Node.js: added "shortCircuit" option for ES modules hook.Andrei Zeliankou1-2/+4
Starting from Node.js v18.6.0 return value from all hooks must have "shortCircuit: true" option specified. For more information see: https://github.com/nodejs/node/commit/10bcad5c6e
2022-12-06Python: Added support for Python 3.11.Andrew Clayton1-2/+67
Python 3.8 added a new Python initialisation configuration API[0]. Python 3.11 marked the old API as deprecated resulting in the following compiler warnings which we treat as errors, failing the build src/python/nxt_python.c: In function ‘nxt_python_start’: src/python/nxt_python.c:130:13: error: ‘Py_SetProgramName’ is deprecated [-Werror=deprecated-declarations] 130 | Py_SetProgramName(nxt_py_home); | ^~~~~~~~~~~~~~~~~ In file included from /opt/python-3.11/include/python3.11/Python.h:94, from src/python/nxt_python.c:7: /opt/python-3.11/include/python3.11/pylifecycle.h:37:38: note: declared here 37 | Py_DEPRECATED(3.11) PyAPI_FUNC(void) Py_SetProgramName(const wchar_t *); | ^~~~~~~~~~~~~~~~~ src/python/nxt_python.c:134:13: error: ‘Py_SetPythonHome’ is deprecated [-Werror=deprecated-declarations] 134 | Py_SetPythonHome(nxt_py_home); | ^~~~~~~~~~~~~~~~ /opt/python-3.11/include/python3.11/pylifecycle.h:40:38: note: declared here 40 | Py_DEPRECATED(3.11) PyAPI_FUNC(void) Py_SetPythonHome(const wchar_t *); | ^~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors We actually have a few config scenarios: Python < 3, Python >= 3.0 < 3.8 and for Python 3 we have two configs where we select one based on virtual environment setup. Factor out the Python 3 config initialisation into its own function. We actually create two functions, one for Python 3.8+ and one for older Python 3. We pick the right function to use at build time. The new API also has error checking (where the old API doesn't) which we handle. [0]: https://peps.python.org/pep-0587/ Closes: <https://github.com/nginx/unit/issues/710> [ Andrew: Expanded upon patch from @sandeep-gh ] Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-11-22NJS: added http request prototype.Zhidao HONG6-3/+358
2022-11-20Basic njs support.Zhidao HONG13-44/+445
2022-11-20Var: separating nxt_tstr_t from nxt_var_t.Zhidao HONG16-226/+402
It's for the introduction of njs support. For each option that supports native variable and JS template literals introduced next, it's unified as template string. No functional changes.
2022-11-20Var: improved variable parsing with empty names.Zhidao HONG1-43/+32
Unit parsed the case of "$uri$$host" into unknown variables. This commit makes it invalid variable instead.
2022-11-17Refactored functions that set WSGI variables.OutOfFocus41-6/+24
Splitting `nxt_python_add_sptr` into several functions will make future additions easier. Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-11-17Removed dead code.OutOfFocus43-16/+1
Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-11-15Optimization for the "--no-unix-sockets" case.Andrei Zeliankou1-21/+19
2022-11-04Removed the unsafe nxt_memchr() wrapper for memchr(3).Alejandro Colomar10-27/+23
The casts are unnecessary, since memchr(3)'s argument is 'const void *'. It might have been necessary in the times of K&R, where 'void *' didn't exist. Nowadays, it's unnecessary, and _very_ unsafe, since casts can hide all classes of bugs by silencing most compiler warnings. The changes from nxt_memchr() to memchr(3) were scripted: $ find src/ -type f \ | grep '\.[ch]$' \ | xargs sed -i 's/nxt_memchr/memchr/' Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-11-04Removed the unsafe nxt_memcmp() wrapper for memcmp(3).Alejandro Colomar18-46/+42
The casts are unnecessary, since memcmp(3)'s arguments are 'void *'. It might have been necessary in the times of K&R, where 'void *' didn't exist. Nowadays, it's unnecessary, and _very_ unsafe, since casts can hide all classes of bugs by silencing most compiler warnings. The changes from nxt_memcmp() to memcmp(3) were scripted: $ find src/ -type f \ | grep '\.[ch]$' \ | xargs sed -i 's/nxt_memcmp/memcmp/' Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-11-02PHP: allowed to specify URLs without a trailing '/'.Andrew Clayton3-6/+92
Both @lucatacconi & @mwoodpatrick reported what appears to be the same issue on GitHub. Namely that when using the PHP language module and trying to access a URL that is a directory but without specifying the trailing '/', they were getting a '503 Service Unavailable' error. Note: This is when _not_ using the 'script' option. E.g with the following config { "listeners": { "[::1]:8080": { "pass": "applications/php" } }, "applications": { "php": { "type": "php", "root": "/var/tmp/unit-php" } } } and with a directory path of /var/tmp/unit-php/foo containing an index.php, you would see the following $ curl http://localhost/foo <title>Error 503</title> Error 503 However $ curl http://localhost/foo/ would work and serve up the index.php This commit fixes the above so you get the desired behaviour without specifying the trailing '/' by doing the following 1] If the URL doesn't end in .php and doesn't have a trailing '/' then check if the requested path is a directory. 2) If it is a directory then create a 301 re-direct pointing to it. This matches the behaviour of the likes of nginx, Apache and lighttpd. This also matches the behaviour of the "share" action in Unit. This doesn't effect the behaviour of the 'script' option which bypasses the nxt_php_dynamic_request() function. This also adds a couple of tests to test/test_php_application.py to ensure this continues to work. Closes: <https://github.com/nginx/unit/issues/717> Closes: <https://github.com/nginx/unit/issues/753> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-10-28Fixed some function definitions.Andrew Clayton4-5/+5
Future releases of GCC will render function definitions like func() invalid by default. See the previous commit 09f88c9 ("Fixed main() prototypes in auto tests.") for details. Such functions should be defined like func(void) This is a good thing to do regardless of the upcoming GCC changes. Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-10-21TLS: Using ERR_get_error_all() with OpenSSL 3.Remi Collet1-0/+4
Link: <https://www.openssl.org/docs/man3.0/man7/migration_guide.html> Cc: Andy Postnikov <apostnikov@gmail.com> Cc: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Remi Collet <remi@remirepo.net> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-10-20Preferring system crypto policy.Remi Collet1-7/+7
If we don't call SSL_CTX_set_cipher_list(), then it uses the system's default. Link: <https://fedoraproject.org/wiki/Changes/CryptoPolicy> Link: <https://docs.fedoraproject.org/en-US/packaging-guidelines/CryptoPolicies/> Link: <https://www.redhat.com/en/blog/consistent-security-crypto-policies-red-hat-enterprise-linux-8> Signed-off-by: Remi Collet <remi@remirepo.net> Acked-by: Andrei Belov <defan@nginx.com> [ alx: add changelog and tweak commit message ] Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-10-14Configuration: stopped automatic migration to the "share" behavior.Zhidao HONG1-21/+0
This commit removed the $uri auto-append for the "share" option introduced in rev be6409cdb028. The main reason is that it causes problems when preparing Unit configurations to be loaded at startup from the state directory. E.g. Docker. A valid conf.json file with $uri references will end up with $uri$uri due to the auto-append.
2022-10-19Added parentheses for consistency.Remi Collet1-8/+8
Reported-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Remi Collet <remi@remirepo.net> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-10-19PHP: Fixed php_module_startup() call for PHP 8.2.Remi Collet1-0/+4
PHP 8.2 changed the prototype of the function, removing the last parameter. Signed-off-by: Remi Collet <remi@remirepo.net> Cc: Timo Stark <t.stark@nginx.com> Cc: George Peter Banyard <girgias@php.net> Tested-by: Andy Postnikov <apostnikov@gmail.com> Acked-by: Andy Postnikov <apostnikov@gmail.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-10-14Added missing error checking in the C API.Alex Colomar1-10/+28
pthread_mutex_init(3) may fail for several reasons, and failing to check will cause Undefined Behavior when those errors happen. Add missing checks, and correctly deinitialize previously created stuff before exiting from the API. Signed-off-by: Alejandro Colomar <alx@nginx.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> Reviewed-by: Zhidao HONG <z.hong@f5.com>
2022-10-14Fixed the build on MacOS (and others).Andrew Clayton5-257/+279
@alejandro-colomar reported that the build was broken on MacOS cc -o build/unitd -pipe -fPIC -fvisibility=hidden -O -W -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -fstrict-aliasing -Wstrict-overflow=5 -Wmissing-prototypes -Werror -g \ build/src/nxt_main.o build/libnxt.a \ \ \ -L/usr/local/Cellar/pcre2/10.40/lib -lpcre2-8 Undefined symbols for architecture x86_64: "_nxt_fs_mkdir_parent", referenced from: _nxt_runtime_pid_file_create in libnxt.a(nxt_runtime.o) _nxt_runtime_controller_socket in libnxt.a(nxt_controller.o) ld: symbol(s) not found for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation) make: *** [build/unitd] Error 1 This was due to commit 57fc920 ("Socket: Created control socket & pid file directories."). This happened because this commit introduced the usage of nxt_fs_mkdir_parent() in core code which uses nxt_fs_mkdir(), both of these are defined in src/nxt_fs.c. It turns out however that this file doesn't get built on MacOS (or any system that isn't Linux or that lacks a FreeBSD compatible nmount(2) system call) due to the following In auto/sources we have if [ $NXT_HAVE_ROOTFS = YES ]; then NXT_LIB_SRCS="$NXT_LIB_SRCS src/nxt_fs.c" fi NXT_HAVE_ROOTFS is set in auto/isolation If [ $NXT_HAVE_MOUNT = YES -a $NXT_HAVE_UNMOUNT = YES ]; then NXT_HAVE_ROOTFS=YES cat << END >> $NXT_AUTO_CONFIG_H #ifndef NXT_HAVE_ISOLATION_ROOTFS #define NXT_HAVE_ISOLATION_ROOTFS 1 #endif END fi While we do have a check for a generic umount(2) which is found on MacOS, for mount(2) we currently only check for the Linux mount(2) and FreeBSD nmount(2) system calls. So NXT_HAVE_ROOTFS is set to NO on MacOS and we don't build src/nxt_fs.c This fixes the immediate build issue by taking the mount/umount OS support out of nxt_fs.c into a new nxt_fs_mount.c file which is guarded by the above while we now build nxt_fs.c unconditionally. This should fix the build on any _supported_ system. Reported-by: Alejandro Colomar <alx@nginx.com> Fixes: 57fc920 ("Socket: Created control socket & pid file directories.") Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-10-12HTTP: added a $request_time variable.Zhidao HONG3-0/+37
2022-10-04Ruby: used nxt_ruby_exception_log() in nxt_ruby_rack_init().Andrew Clayton1-1/+1
For consistency use nxt_ruby_exception_log() rather than nxt_alert() in nxt_ruby_rack_init(). Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-10-04Ruby: added support for rack V3.Zhidao HONG1-1/+6
Ruby applications would fail to start if they were using rack v3 2022/09/28 15:48:46 [alert] 0#80912 [unit] Ruby: Failed to parse rack script 2022/09/28 15:48:46 [notice] 80911#80911 app process 80912 exited with code 1 This was due to a change in the rack API Rack V2 def self.load_file(path, opts = Server::Options.new) ... cfgfile.sub!(/^__END__\n.*\Z/m, '') app = new_from_string cfgfile, path return app, options end Rack V3 def self.load_file(path) ... return new_from_string(config, path) end This patch handles _both_ the above APIs by correctly handling the cases where we do and don't get an array returned from nxt_ruby_rack_parse_script(). Closes: <https://github.com/nginx/unit/issues/755> Tested-by: Andrew Clayton <a.clayton@nginx.com> Reviewed-by: Andrew Clayton <a.clayton@nginx.com> [ Andrew: Patch by Zhidao, commit message by me with input from Zhidao ] Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-10-03Renamed a couple of members of nxt_unit_request_t.Andrew Clayton10-23/+24
This is a preparatory patch that renames the 'local' and 'local_length' members of the nxt_unit_request_t structure to 'local_addr' and 'local_addr_length' in preparation for the adding of 'local_port' and 'local_port_length' members. Suggested-by: Zhidao HONG <z.hong@f5.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-10-03Socket: Created control socket & pid file directories.Andrew Clayton4-0/+36
@alejandro-colomar reported an issue on GitHub whereby Unit would fail to start due to not being able to create the control socket (a Unix Domain Socket) 2022/08/05 20:12:22 [alert] 21613#21613 bind(6, unix:/opt/local/unit/var/run/unit/control.unit.sock.tmp) failed (2: No such file or directory) This could happen if the control socket was set to a directory that doesn't exist. A common place to put the control socket would be under /run/unit, and while /run will exist, /run/unit may well not (/run is/should be cleared on each boot). The pid file would also generally go under /run/unit, though this is created after the control socket, however it could go someplace else so we should also ensure its directory exists. This commit will try to create the pid file and control sockets parent directory. In some cases the user will need to ensure that the rest of the path already exists. This adds a new nxt_fs_mkdir_parent() function that given a full path to a file (or directory), strips the last component off before passing the remaining directory path to nxt_fs_mkdir(). Cc: Konstantin Pavlov <thresh@nginx.com> Closes: <https://github.com/nginx/unit/issues/742> Reported-by: Alejandro Colomar <alx@nginx.com> Reviewed-by: Alejandro Colomar <alx@nginx.com> Tested-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2022-09-22Status: fixed error in connection statistics.Zhidao HONG2-4/+7
When proxy is used, the number of accepted connections is not counted, This also results in the wrong number of active connections.
2022-09-19HTTP: fixed cookie parsing.Zhidao HONG1-5/+2
The fixing supports the cookie value with the '=' character. This is related to #756 PR on Github. Thanks to changxiaocui.
2022-09-10Fixed a mutex leak in the C API.Alex Colomar1-12/+8
In nxt_unit_create() we could leak a mutex created in nxt_unit_ctx_init(). This could happen if nxt_unit_ctx_init() succeeded but later on we bailed out of nxt_unit_create(), we would destroy the mutex created in nxt_unit_create() but not the one created in nxt_unit_ctx_init(). Reorder things so that we do the call to nxt_unit_create() after all the other checks so if it fails we don't leak the mutex it created. Co-developed-by: Andrew Clayton <a.clayton@f5.com> Signed-off-by: Andrew Clayton <a.clayton@f5.com> Signed-off-by: Alex Colomar <a.colomar@f5.com>
2022-09-06Status: fixed incorrect pointer in test operation.Zhidao HONG1-3/+3
Found by Coverity (CID 380755).
2022-08-29Status: added requests count.Zhidao HONG5-1/+9
2022-08-29Implemented basic statistics API.Valentin Bartenev12-16/+452
2022-08-31Ruby: prevented a segfault on receiving SIGINT (^C).Andrew Clayton1-0/+2
As was reported[0] by @travisbell on GitHub, if running unit from the terminal in the foreground when hitting ^C to exit it, the ruby application processes would segfault if they were using threads. It's not 100% clear where the actual problem lies, but it _looks_ like it may be in ruby. The simplest way to deal with this for now is to just ignore SIGINT in the ruby application processes. Unit will still receive and handle it, cleanly shutting everything down. For people who want to handle SIGINT in their ruby application running under unit they can still trap SIGINT and it will override the ignore. [0]: https://github.com/nginx/unit/issues/562#issuecomment-1223229585 Closes: https://github.com/nginx/unit/issues/562
2022-08-18Disallowed abstract unix socket syntax in non-Linux systems.Alejandro Colomar1-4/+5
The previous commit added/fixed support for abstract Unix domain sockets on Linux with a leading '@' or '\0'. To be consistent in all platforms, treat those prefixes as markers for abstract sockets in all platforms, and fail if abstract sockets are not supported by the platform. That will avoid mistakes when copying a config file from a Linux system and using it in non-Linux, which would surprisingly create a normal socket.
2022-08-18Storing abstract sockets with @ internally.Alejandro Colomar1-1/+6
We accept both "\u0000socket-name" and "@socket-name" as abstract unix sockets. The first one is passed to the kernel pristine, while the second is transformed '@'->'\0'. The commit that added support for unix sockets accepts both variants, but we internally stored it in the same way, using "\u0000..." for both. We want to support abstract sockets transparently to the user, so that if the user configures unitd with '@', if we receive a query about the current configuration, the user should see the same exact thing that was configured. So, this commit avoids the transformation in the internal state file, storing user input pristine, and we only transform the '@' for a string that will be used internally (not user-visible). This commit (indirectly) fixes a small bug, where we created abstract sockets with a trailing '\0' in their name due to calling twice nxt_sockaddr_parse() on the same string. By calling that function only once with each copy of the string, we have fixed that bug.
2022-08-18Fixed support for abstract Unix sockets.Alejandro Colomar1-1/+3
Unix domain sockets are normally backed by files in the filesystem. This has historically been problematic when closing and opening again such sockets, since SO_REUSEADDR is ignored for Unix sockets (POSIX left the behavior of SO_REUSEADDR as implementation-defined, and most --if not all-- implementations decided to just ignore this flag). Many solutions are available for this problem, but all of them have important caveats: - unlink(2) the file when it's not needed anymore. This is not easy, because the process that controls the fd may not be the same process that created the file, and may not have file permissions to remove it. Further solutions can be applied to that caveat: - unlink(2) the file right after creation. This will remove the pathname from the filesystem without closing the socket (it will continue to live until the last fd is closed). This is not useful for us, since we need the pathname of the socket as its interface. - chown(2) or chmod(2) the directory that contains the socket. For removing a file from the filesystem, a process needs write permissions in the containing directory. We could put sockets in dummy directories that can be chown(2)ed to nobody. This could be dangerous, though, as we don't control the socket names. It is our users who configure the socket name in their configuration, and so it's easy that they don't understand the many implications of not chosing an appropriate socket pathname. A user could unknowingly put the socket in a directory that is not supposed to be owned by user nobody, and if we blindly chown(2) or chmod(2) the directory, we could be creating a big security hole. - Ask the main process to remove the socket. This would require a very complex communication mechanism with the main process, which is not impossible, but let's avoid it if there are simpler solutions. - Give the child process the CAP_DAC_OVERRIDE capability. That is one of the most powerful capabilities. A process with that capability can be considered root for most practical aspects. Even if the capability is disabled for most of the lifetime of the process, there's a slight chance that a malicious actor could activate it and then easily do serious damage to the system. - unlink(2) the file right before calling bind(2). This is dangerous because another process (for example, another running instance of unitd(8)), could be using the socket, and removing the pathname from the filesystem would be problematic. To do this correctly, a lot of checks should be added before the actual unlink(2), which is error-prone, and difficult to do correctly, and atomically. - Use abstract-namespace Unix domain sockets. This is the simplest solution, as it only requires accepting a slightly different syntax (basically a @ prefix) for the socket name, to transform it into a string starting with a null byte ('\0') that the kernel can understand. The patch is minimal. Since abstract sockets live in an abstract namespace, they don't create files in the filesystem, so there's no need to remove them later. The kernel removes the name when the last fd to it has been closed. One caveat is that only Linux currently supports this kind of Unix sockets. Of course, a solution to that could be to ask other kernels to implement such a feature. Another caveat is that filesystem permissions can't be used to control access to the socket file (since, of course, there's no file). Anyone knowing the socket name can access to it. The only method to control access to it is by using network_namespaces(7). Since in unitd(8) we're using 0666 file sockets, abstract sockets should be no more insecure than that (anyone can already read/write to the listener sockets). - Ask the kernel to implement a simpler way to unlink(2) socket files when they are not needed anymore. I've suggested that to the <linux-fsdevel@vger.kernel.org> mailing list, in: <lore.kernel.org/linux-fsdevel/0bc5f919-bcfd-8fd0-a16b-9f060088158a@gmail.com/T> In this commit, I decided to go for the easiest/simplest solution, which is abstract sockets. In fact, we already had partial support. This commit only fixes some small bug in the existing code so that abstract Unix sockets work: - Don't chmod(2) the socket if it's an abstract one. This fixes the creation of abstract sockets, but doesn't make them usable, since we produce them with a trailing '\0' in their name. That will be fixed in the following commit. This closes #669 issue on GitHub.