Age | Commit message | Author | Files | Lines |
|
@filiphanes requested support for bytearray
and memoryview in the request body here:
<https://github.com/nginx/unit/issues/648>
This patch implements bytearray body support only.
Memoryview body support still needs to be implemented.
|
|
|
|
In that particular issue the compiled nuxt files end up importing the
http module as node:http rather than plain http. This bypasses Unit's
custom loader implementation, which only checks for the http and unit-http
modules and their websocket counterparts.
This changeset adds replace sources for both the node:http and
node:websocket import signatures.
Closes: https://github.com/nginx/unit/issues/1013
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This commit introduces the 'vars' JavaScript object to NJS,
enabling direct access to native variables such as $uri and $arg_foo.
The syntax is `${vars.var_name}` or `${vars['var_name']}`.
For example:
    {
        "action": {
            "share": "`/www/html${vars.uri}`"
        }
    }
|
|
|
|
This commit prepares for subsequent commits that will support njs variable
access. In this commit, nxt_var_get() is introduced to extend
the variable handling capabilities. Concurrently, nxt_var_ref_get()
has been refactored so it can be used in both the configuration and request phases.
|
|
This commit enhances nxt_var_cache_value() to enable variable access
using string names, complementing the existing reference index method.
The modification ensures future compatibility with njs variable access.
|
|
|
|
|
|
This is a simple temporary fix (doesn't address the underlying problem)
for an issue reported by a user on GitHub whereby downloading of files
from a PHP application would cause the router process to crash.
This is actually a generic problem that will affect anything sending
data via nxt_unit_response_write().
This is just a simple fix for the 1.32 release, after which the full
correct fix will be worked out.
Link: <https://github.com/nginx/unit/issues/1125>
Reported-by: rustedsword <https://github.com/rustedsword>
Co-developed-by: rustedsword <https://github.com/rustedsword>
Tested-by: rustedsword <https://github.com/rustedsword>
Tested-by: Andrew Clayton <a.clayton@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Functionally identical, but marginally more idiomatic.
Refines: fbeb2065b180e2376088387ee150d3975dc08cd5
|
|
Several users on GitHub have asked for the ability to set the
permissions of the unitd Unix domain control socket.
This can of course be done externally, but it can be done much more cleanly
by Unit itself.
This commit adds three new options
  --control-mode   Set the mode of the socket, e.g. 644
  --control-user   Set the user/owner of the socket, e.g. unit
  --control-group  Set the group of the socket, e.g. unit
Of course these only have an effect when using a Unix domain socket for
the control socket.
Requested-by: michaelkosir <https://github.com/michaelkosir>
Requested-by: chopanovv <https://github.com/chopanovv>
Link: <https://github.com/nginx/unit/issues/254>
Link: <https://github.com/nginx/unit/issues/980>
Closes: https://github.com/nginx/unit/issues/840
Tested-by: Liam Crilly <liam.crilly@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This wraps chown(2) but takes the user/owner and group as strings.
It's a little long-winded as it uses the thread-safe versions of
getpwnam()/getgrnam(), which require a little more work.
This function will be used by the following commit that allows setting
the permissions of the Unix domain control socket.
We need to cast uid & gid to long in the call to nxt_thread_log_alert()
to appease clang-ast, as it's adamant that uid/gid are unsigned ints, but
chown(2) takes -1 for these values to indicate "don't change this item",
and it'd be nice to show them in the error message.
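As a rough illustration, a minimal sketch of such a wrapper (hypothetical
name and simplified error handling; not Unit's actual code) might look like:
    #include <sys/types.h>
    #include <unistd.h>
    #include <pwd.h>
    #include <grp.h>
    #include <stdio.h>

    /* Hypothetical sketch: chown(2) by user/group name using the
     * thread-safe getpwnam_r()/getgrnam_r(); -1 means "don't change". */
    static int
    chown_by_names(const char *path, const char *user, const char *group)
    {
        char           buf[1024];
        uid_t          uid = (uid_t) -1;
        gid_t          gid = (gid_t) -1;
        struct group   grp, *gr;
        struct passwd  pwd, *pw;

        if (user != NULL) {
            if (getpwnam_r(user, &pwd, buf, sizeof(buf), &pw) != 0
                || pw == NULL)
            {
                fprintf(stderr, "getpwnam_r(\"%s\", ...) failed\n", user);
                return -1;
            }
            uid = pw->pw_uid;
        }

        if (group != NULL) {
            if (getgrnam_r(group, &grp, buf, sizeof(buf), &gr) != 0
                || gr == NULL)
            {
                fprintf(stderr, "getgrnam_r(\"%s\", ...) failed\n", group);
                return -1;
            }
            gid = gr->gr_gid;
        }

        /* Cast to long purely for printing; chown(2) accepts -1 to mean
         * "leave this id unchanged". */
        if (chown(path, uid, gid) != 0) {
            fprintf(stderr, "chown(\"%s\", %ld, %ld) failed\n",
                    path, (long) uid, (long) gid);
            return -1;
        }

        return 0;
    }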
Note that getpwnam()/getgrnam() don't define "not found" as an error, as
per their man page:
    The formulation given above under "RETURN VALUE" is from POSIX.1-2001.
    It does not call "not found" an error, and hence does not specify what
    value errno might have in this situation. But that makes it impossible
    to recognize errors. One might argue that according to POSIX errno
    should be left unchanged if an entry is not found. Experiments on
    various UNIX-like systems show that lots of different values occur in
    this situation: 0, ENOENT, EBADF, ESRCH, EWOULDBLOCK, EPERM, and
    probably others.
Thus if we log an error from these functions we can end up with the
slightly humorous error message
2024/02/12 15:15:12 [alert] 99404#99404 getpwnam_r("noddy", ...) failed (0: Success) (User not found) while creating listening socket on unix:/opt/unit/control.unit.sock
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
* Take options as well as requestListener
Unit-http has not kept up with the signature of Node.js's http package
development. Node.js allows an optional `options` object to be passed to
the `createServer` function; we didn't. This resulted in function
signature errors when user code that did make use of the options arg
tried to call unit's replaced function.
This change updates the signature to be more in line with how Node.js
does it, accepting the options argument but discarding it and printing a
message to stdout.
* Add test file to start node application with options
* Add changes to docs/changes.xml
Closes: https://github.com/nginx/unit/issues/1043
|
|
It's an integer, not a floating-point number.
Fixes: 68c6b67ffc84 ("Configuration: support for rational numbers.")
Closes: https://github.com/nginx/unit/issues/1115
Link: <https://github.com/nginx/unit/pull/1116>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Dan Callahan <d.callahan@f5.com>
Cc: Valentin Bartenev <vbartenev@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
|
|
The commit that added support for Unix sockets accepts abstract sockets
using '@' in the config, but we stored them internally using '\0'.
We want to support abstract sockets transparently to the user, so that
if the user configures unitd with '@' and we receive a query about the
current configuration, the user should see the exact same thing that was
configured. So, this commit avoids the transformation in the internal
state file, storing the user input pristine, and we only transform the '@'
in temporary strings.
This commit fixes another bug, where we would try to connect to abstract
sockets with a trailing '\0' in their name due to calling
nxt_sockaddr_parse() twice on the same string. By calling that function only
once with each copy of the string, we have fixed that bug.
The following code was responsible for this bug; the second time
it was called, it considered these sockets as file-backed (not abstract)
Unix sockets, and so appended a '\0' to the socket name.
$ grepc -tfd nxt_sockaddr_unix_parse . | grep -A10 @
    if (path[0] == '@') {
        path[0] = '\0';
        socklen--;
#if !(NXT_LINUX)
        nxt_thread_log_error(NXT_LOG_ERR,
                             "abstract unix domain sockets are not supported");
        return NULL;
#endif
    }

    sa = nxt_sockaddr_alloc(mp, socklen, addr->length);
This bug was found thanks to some experimentation with using 'const' for
some strings.
And here's some history:
- 9041d276fc6a ("nxt_sockaddr_parse() introducted.")
This commit introduced support for abstract Unix sockets, but they
only worked as "servers", and not as "listeners". We corrupted the
JSON config file, and stored a \u0000. This also caused calling
connect(2) with a bogus trailing null byte, which tried to connect to
a different abstract socket.
- d8e0768a5bae ("Fixed support for abstract Unix sockets.")
This commit (partially) fixed support for abstract Unix sockets, so
that they also worked as listeners. We still corrupted the JSON
config file, and stored a \u0000. This caused calling connect(2)
(and now bind(2) too) with a bogus trailing null byte.
- e2aec6686a4d ("Storing abstract sockets with @ internally.")
This commit fixed the problem by which we were corrupting the config
file, but only for "listeners", not for "servers". (It also fixes
the issue about the terminating '\0'.) We completely forgot about
"servers", and other callers of the same function.
To reproduce the problem, I used the following config:
```json
{
    "listeners": {
        "*:80": {
            "pass": "routes/u"
        },
        "unix:@abstract": {
            "pass": "routes/a"
        }
    },

    "routes": {
        "u": [{
            "action": {
                "pass": "upstreams/u"
            }
        }],
        "a": [{
            "action": {
                "return": 302,
                "location": "/i/am/not/at/home/"
            }
        }]
    },

    "upstreams": {
        "u": {
            "servers": {
                "unix:@abstract": {}
            }
        }
    }
}
```
And then check the state file:
$ sudo cat /opt/local/nginx/unit/master/var/lib/unit/conf.json \
| jq . \
| grep unix;
"unix:@abstract": {
"unix:\u0000abstract": {}
After this patch, the state file has a '@' as expected:
$ sudo cat /opt/local/nginx/unit/unix/var/lib/unit/conf.json \
| jq . \
| grep unix;
"unix:@abstract": {
"unix:@abstract": {}
Regarding the trailing null byte, here are some tests:
$ sudo strace -f -e 'bind,connect' /opt/local/nginx/unit/d8e0/sbin/unitd \
|& grep abstract;
[pid 22406] bind(10, {sa_family=AF_UNIX, sun_path=@"abstract\0"}, 12) = 0
[pid 22410] connect(134, {sa_family=AF_UNIX, sun_path=@"abstract\0"}, 12) = 0
^C
$ sudo killall unitd
$ sudo strace -f -e 'bind,connect' /opt/local/nginx/unit/master/sbin/unitd \
|& grep abstract;
[pid 22449] bind(10, {sa_family=AF_UNIX, sun_path=@"abstract"}, 11) = 0
[pid 22453] connect(134, {sa_family=AF_UNIX, sun_path=@"abstract\0"}, 12) = -1 ECONNREFUSED (Connection refused)
^C
$ sudo killall unitd
$ sudo strace -f -e 'bind,connect' /opt/local/nginx/unit/unix/sbin/unitd \
|& grep abstract;
[pid 22488] bind(10, {sa_family=AF_UNIX, sun_path=@"abstract"}, 11) = 0
[pid 22492] connect(134, {sa_family=AF_UNIX, sun_path=@"abstract"}, 11) = 0
^C
Fixes: 9041d276fc6a ("nxt_sockaddr_parse() introducted.")
Fixes: d8e0768a5bae ("Fixed support for abstract Unix sockets.")
Fixes: e2aec6686a4d ("Storing abstract sockets with @ internally.")
Link: <https://github.com/nginx/unit/pull/1108>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Liam Crilly <liam.crilly@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
|
|
Refactor.
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
|
|
This function is like nxt_conf_get_string(), but creates a new copy,
so that it can be modified without corrupting the configuration string.
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
|
|
With the previous commit, which introduced the use of the
NXT_CONF_VLDT_REQUIRED flag, we no longer need to do this separate
validation; its only purpose was to check if the three uidmap/gidmap
settings had been provided.
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Use the NXT_CONF_VLDT_REQUIRED flag on the app_procmap members. These
three settings are required.
These are for the uidmap & gidmap settings in the config.
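For illustration, these settings sit inside an application's isolation
configuration and look something like this (values are examples only):
    {
        "isolation": {
            "namespaces": {
                "credential": true
            },
            "uidmap": [
                {"container": 0, "host": 1000, "size": 1}
            ],
            "gidmap": [
                {"container": 0, "host": 1000, "size": 1}
            ]
        }
    }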
Suggested-by: Zhidao HONG <z.hong@f5.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Andrei reported an issue on arm64 where he was seeing the following
error message when running the tests
2024/01/17 18:32:31.109 [error] 54904#54904 "gidmap" field has an entry with "size": 1, but for unprivileged unit it must be 1.
This error message is guarded by the following if statement
if (nxt_slow_path(m.size > 1)
Turns out size was indeed > 1, in this case it was 289356276058554369,
m.size is defined as a nxt_int_t, which on arm64 is actually 8 bytes,
but was being printed as a signed int (4 bytes) and by chance/undefined
behaviour comes out as 1.
But why is size so big? In this case it should have just been 1 with a
config of
'gidmap': [{'container': 0, 'host': os.getegid(), 'size': 1}],
This is due to nxt_int_t being 64 bits on arm64 but using a conf type of
NXT_CONF_MAP_INT, which means in nxt_conf_map_object() we would do (using
our m.size variable as an example)
    ptr = nxt_pointer_to(data, map[i].offset);
    ...
    ptr->i = num;
Where ptr is a union pointer that now points at our m.size.
Next we set m.size to the value of num (which is 1 in this case), via
ptr->i, where i is a member of that union of type int.
So here we are setting a 64-bit memory location (nxt_int_t on arm64)
through a 32-bit (int) union alias, which means we are only setting the
lower 4 bytes.
Whatever happens to be in the upper 4 bytes will remain, giving us our
exceptionally large value.
This is demonstrated by this program
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        int64_t num = -1;   /* All 1's in two's complement */
        union {
            int32_t i32;
            int64_t i64;
        } *ptr;

        ptr = (void *)&num;

        ptr->i32 = 1;
        printf("num : %lu / %ld\n", num, num);

        ptr->i64 = 1;
        printf("num : %ld\n", num);

        return 0;
    }
$ make union-32-64-issue
cc union-32-64-issue.c -o union-32-64-issue
$ ./union-32-64-issue
num : 18446744069414584321 / -4294967295
num : 1
However that is not the only issue. Because the members of
nxt_clone_map_entry_t were specified as nxt_int_t's, on the likes of
x86_64 this would be a 32-bit signed integer. However uid/gids, on Linux
at least, are defined as unsigned integers, so a nxt_int_t would not be
big enough to hold all potential values.
We could make them nxt_uint_t's, but then we're back to the above union
aliasing problem.
We could just set the memory for these variables to 0 and that would
work, however that's really just papering over the problem.
The right thing is to use a large enough sized type to store these
things, hence the previously introduced nxt_cred_t. This is an int64_t
which is plenty large enough.
So we switch the nxt_clone_map_entry_t structure members over to
nxt_cred_t's and use NXT_CONF_MAP_INT64 as the conf type, which then
uses the right sized union member in nxt_conf_map_object() to set these
variables.
Reported-by: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This is a generic type to represent a uid_t/gid_t on Linux when user
namespaces are in use.
Technically this only needs to be an unsigned int, but we make it an
int64_t so we can make use of the existing NXT_CONF_MAP_INT64 type.
This will be used in subsequent commits.
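Presumably the definition amounts to a simple typedef along these lines
(sketch, not necessarily the exact source line):
    typedef int64_t  nxt_cred_t;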
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This feature allows users to specify conditions that control whether the
access log should be recorded. The "if" option supports a string and
JavaScript code. If its value is empty, 0, false, null, or undefined, the
logs will not be recorded. A '!' prefix inverts the condition.
Example 1: Only log requests that sent a session cookie.
    {
        "access_log": {
            "if": "$cookie_session",
            "path": "..."
        }
    }
Example 2: Do not log health check requests.
    {
        "access_log": {
            "if": "`${uri == '/health' ? false : true}`",
            "path": "..."
        }
    }
Example 3: Only log requests when the time is before 22:00.
    {
        "access_log": {
            "if": "`${new Date().getHours() < 22}`",
            "path": "..."
        }
    }
or
    {
        "access_log": {
            "if": "!`${new Date().getHours() >= 22}`",
            "path": "..."
        }
    }
Closes: https://github.com/nginx/unit/issues/594
|
|
This is in preparation for adding conditional access logging.
No functional changes.
|
|
According to the Node.js documentation this variable
should only include the numbering scheme.
Thanks to @dbit-xia.
Closes: https://github.com/nginx/unit/issues/1085
|
|
On GH, @tonychuuy reported an issue where, when using Unit's 'share' action,
they would get the following error in the unit log
2024/01/15 17:53:41 [error] 49#52 *103 file "/var/www/html/public/vendor/telescope/app.css" has changed while sending response to a client
This would happen when trying to serve files over a certain size and the
requested file would not be sent.
This is due to a somewhat bogus check in
nxt_http_static_buf_completion().
I say bogus because it's not clear what the check is trying to
accomplish, and the error message is not entirely accurate either.
The check in question goes like
    n = pread(file->fd, buf, size, offset);
    return n;

    ...

    if (n != size) {
        if (n >= 0) {
            /* log file changed error and finish */
            /* >> Problem is here << */
        }

        /* log general error and finish */
    }
If the number of bytes read is not what we asked for and is > -1 (i.e.
not an error) then it says the file has changed, but really it only
checks if the file has _shrunk_ (we can't get back _more_ bytes than we
asked for) since it was stat'd.
This is what happens
recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
We get a request from a client, open the requested file and stat(2) it to
get the file size.
We would then go into a pread/writev loop reading the file data and
sending it to the client until it's all been sent.
However what was happening in this case was this (showing a dummy file
of 149922 bytes)
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
write(2, "2024/01/17 15:30:50 [error] 1849"..., 109) = 109
We wanted to read 131072 bytes but only read 61440 bytes, the above
check triggered and the file transfer was aborted and the above error
message logged.
Normally for a regular file you will only get fewer bytes than asked for
if the read call is interrupted by a signal or you're near the end of the
file.
There is however at least one other situation where this may happen: if
the file in question is being served from a network filesystem.
It turns out that was indeed the case here; the files were being served
over the 9P filesystem protocol. Unit was running in a Docker container
in an Ubuntu VM under Windows/WSL2 and the files were being passed
through to the VM from Windows over 9P.
Whatever the intention of this check, it is clearly causing issues in
real-world scenarios.
If we really wanted to check whether the file had changed since it was
opened/stat'd then it would require a different methodology and be a
patch for another day. But as it stands this current check does more
harm than good, so let's just remove it.
With it removed we now get for the above test file
recvfrom(22, "GET /tfile HTTP/1.1\r\nHost: local"..., 2048, 0, NULL, NULL) = 82
openat(AT_FDCWD, "/mnt/9p/tfile", O_RDONLY|O_NONBLOCK) = 23
newfstatat(23, "", {st_mode=S_IFREG|0644, st_size=149922, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 135168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f367817b000
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 131072, 0) = 61440
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 18850, 61440) = 18850
writev(22, [{iov_base="HTTP/1.1 200 OK\r\nLast-Modified: "..., iov_len=171}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=18850}], 3) = 80461
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 69632, 80290) = 61440
pread64(23, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192, 141730) = 8192
close(23) = 0
writev(22, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=61440}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8192}], 2) = 69632
So we can see we do two pread(2)s and a writev(2), then another two
pread(2)s and another writev(2), and all the file data has been read and
sent to the client.
Reported-by: tonychuuy <https://github.com/tonychuuy>
Link: <https://en.wikipedia.org/wiki/9P_(protocol)>
Fixes: 08a8d1510 ("Basic support for serving static files.")
Closes: https://github.com/nginx/unit/issues/1064
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Closes: <https://github.com/nginx/unit/pull/1062>
|
|
This is to improve error messages for rewrite configuration.
Take this configuration as an example:
    {
        "rewrite": "`${a + "
    }
Previously, when applying it the user would see this error message:
failed to apply previous configuration
After this change, the user will see this improved error message:
the previous configuration is invalid: "SyntaxError: Unexpected end of input in default:1" in the "rewrite" value.
|
|
Coverity picked up a potential issue with the previous commit d9f5f1fb7
("Ruby: Handle response field arrays") in that a size_t could wrap
around to SIZE_MAX - 1.
This would happen if we were given an empty array of header values.
Fixes: d9f5f1fb7 ("Ruby: Handle response field arrays")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
@xeron on GitHub reported an issue whereby with a Rails 7.1 application
they were getting the following error
2023/10/22 20:57:28 [error] 56#56 [unit] #8: Ruby: Wrong header entry 'value' from application
2023/10/22 20:57:28 [error] 56#56 [unit] #8: Ruby: Failed to run ruby script
After some back and forth debugging it turns out rack was trying to send
back a header comprised of an array of values, e.g.
    app = Proc.new do |env|
      ["200", {
        "Content-Type" => "text/plain",
        "X-Array-Header" => ["Item-1", "Item-2"],
      }, ["Hello World\n"]]
    end

    run app
It seems this became a possibility in rack v3.0[0].
So along with a header value type of T_STRING we need to also allow
T_ARRAY.
If we get a T_ARRAY we need to build up the header field using the given
values. E.g.
    "X-Array-Header" => ["Item-1", "", "Item-3", "Item-4"],
becomes
    X-Array-Header: Item-1; ; Item-3; Item-4
[0]: <https://github.com/rack/rack/blob/main/UPGRADE-GUIDE.md?plain=1#L26>
Reported-by: Ivan Larionov <xeron.oskom@gmail.com>
Closes: <https://github.com/nginx/unit/issues/974>
Link: <https://github.com/nginx/unit/pull/998>
Tested-by: Timo Stark <t.stark@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
This closes #1006 issue on GitHub.
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
|
|
In nxt_php_execute() it is possible we could bail out before cleaning up
the FILE * representing the PHP script to execute.
At this point we only need to call fclose(3) on it.
We could have possibly moved the opening of this file to later in the
function, but it is probably good to bail out as early as possible if we
can't open it.
This was found by Coverity.
Fixes: bebc03c72 ("PHP: Implement better error handling.")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This fixes some typos and grammatical errors in the comments of
src/nxt_unit.h
Link: <https://github.com/nginx/unit/pull/889>
[ Adjust summary and write commit message as this just contains the
fixes from the PR and not actual changes - Andrew ]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
The isb instruction is a good fit for spin loops, where it allows the CPU
to save power.
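A minimal sketch of the idea (illustrative macro names, not Unit's actual
definitions), assuming GCC/Clang-style inline asm for a spin-wait hint:
    #if defined(__aarch64__)
    #define cpu_pause()  __asm__ volatile ("isb" ::: "memory")
    #elif defined(__x86_64__) || defined(__i386__)
    #define cpu_pause()  __asm__ volatile ("pause" ::: "memory")
    #endif

    /* Typical use while spinning on a flag:
     *     while (!ready) cpu_pause();
     */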
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
On GitHub, @RomainMou reported an issue whereby HTTP header field values
were being incorrectly reported as non-ascii by the Python .isascii()
method.
For example, using the following test application
    def application(environ, start_response):
        t = environ['HTTP_ASCIITEST']
        t = "'" + t + "'" + " (" + str(len(t)) + ")"

        if t.isascii():
            t = t + " [ascii]"
        else:
            t = t + " [non-ascii]"

        resp = t + "\n\n"

        start_response("200 OK", [("Content-Type", "text/plain")])

        return (bytes(resp, 'latin1'))
You would see the following
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [non-ascii]
'$' has an ASCII code of 0x24 (36).
The initial idea was to adjust the second parameter to the
PyUnicode_New() call from 255 to 127. This unfortunately had the
opposite effect.
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [ascii]
Good. However...
$ curl -H "ASCIITEST: £" http://localhost:8080/
'£' (2) [ascii]
Not good. Let's take a closer look at this.
'£' is not in basic ASCII, but is in extended ASCII with a value of 0xA3
(163). Its UTF-8 encoding is 0xC2 0xA3, hence the length of 2 bytes
above.
$ strace -s 256 -e sendto,recvfrom curl -H "ASCIITEST: £" http://localhost:8080/
sendto(5, "GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\nASCIITEST: \302\243\r\n\r\n", 92, MSG_NOSIGNAL, NULL, 0) = 92
recvfrom(5, "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nServer: Unit/1.30.0\r\nDate: Mon, 22 May 2023 12:44:11 GMT\r\nTransfer-Encoding: chunked\r\n\r\n12\r\n'\302\243' (2) [ascii]\n\n\r\n0\r\n\r\n", 102400, 0, NULL, NULL) = 160
'£' (2) [ascii]
So we can see curl sent it UTF-8 encoded as '\302\243', which is C octal
escaped UTF-8 for 0xC2 0xA3, and we got the same back. But it should not
be marked as ASCII.
When doing PyUnicode_New(size, 127) it marks the buffer as ASCII. So we
need to use another function, and that function would appear to be
PyUnicode_DecodeCharmap(), which creates a Unicode object with the correct
ascii/non-ascii properties based on the character encoding.
With this function we now get
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [ascii]
$ curl -H "ASCIITEST: £" http://localhost:8080/
'£' (2) [non-ascii]
and for good measure
$ curl -H "ASCIITEST: $ £" http://localhost:8080/
'$ £' (4) [non-ascii]
$ curl -H "ASCIITEST: $" -H "ASCIITEST: £" http://localhost:8080/
'$, £' (5) [non-ascii]
PyUnicode_DecodeCharmap() does require having the full string up front, so
we need to build up the potentially comma-separated header field values
string before invoking this function.
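A sketch of the call (variable and function names here are illustrative;
CPython documents that a NULL mapping applies Latin-1 decoding, which gives
the resulting object the correct ascii/non-ascii flag):
    #include <Python.h>

    /* vbuf/vlen are assumed to hold the fully built, possibly
     * comma-separated header field value and its length. */
    static PyObject *
    decode_header_value(const char *vbuf, Py_ssize_t vlen)
    {
        return PyUnicode_DecodeCharmap(vbuf, vlen, NULL, NULL);
    }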
I did not want to touch the Python 2.7 code (which may or may not even
be affected by this) so kept these changes completely isolated from
that, hence a slight duplication with the for () loop.
Python 2.7 was sunset on January 1st 2020[0], so this code will
hopefully just disappear soon anyway.
I also purposefully didn't touch other code that may well have similar
issues (such as the HTTP header field names); if we ever get issue
reports about them, we'll deal with them then.
[0]: <https://www.python.org/doc/sunset-python-2/>
Link: <https://docs.python.org/3/c-api/unicode.html>
Closes: <https://github.com/nginx/unit/issues/868>
Reported-by: RomainMou <https://github.com/RomainMou>
Tested-by: RomainMou <https://github.com/RomainMou>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This is a preparatory patch for fixing an issue with the encoding of
http header field values.
This patch simply moves the nxt_unit_sptr_get() to the top of the
function where we will need it in the next commit.
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
This variable contains a string that is formed using random data and
can be used as a unique request identifier.
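Assuming the variable is exposed as $request_id (the name isn't shown in
this excerpt), it could be used in a custom access log format, for example
(format string is purely illustrative):
    {
        "access_log": {
            "path": "/var/log/unit/access.log",
            "format": "$remote_addr - $request_id \"$request_line\" $status"
        }
    }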
This closes #714 issue on GitHub.
|
|
|
|
|
|
This was inadvertently removed in 76086d6d ("Wasm: Allow to set the HTTP
response status.")
Fixes: 76086d6d ("Wasm: Allow to set the HTTP response status.")
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
On GitHub, @rlandgrebe reported an issue when trying to rewrite URLs
that contained query strings.
With the PHP language module we were in fact segfaulting (SIGSEGV) in
libphp
[93960.462952] unitd[20940]: segfault at 7f307cef6476 ip 00007f2f81a94577 sp 00007fff28a777d0 error 4 in libphp-8.2.so[7f2f818df000+2fd000] likely on CPU 0 (core 0, socket 0)
#0 0x00007f2abd494577 in php_default_treat_data (arg=1, str=0x0,
destArray=<optimized out>)
at /usr/src/debug/php-8.2.10-1.fc38.x86_64/main/php_variables.c:488
488 if (c_var && *c_var) {
(gdb) p c_var
$1 = 0x7f2bb8880676 <error: Cannot access memory at address 0x7f2bb8880676>
This was when trying to get the query string, which somehow is pointing
off into the woods.
This gdb debug session, when doing a rewrite, basically shows the core of
the issue
(gdb) x /64bs req->fields
...
0x7f7eaaaa8090: "GET"
0x7f7eaaaa8094: "HTTP/1.1"
0x7f7eaaaa809d: "::1"
0x7f7eaaaa80a1: "::1"
0x7f7eaaaa80a5: "8080"
0x7f7eaaaa80aa: "localhost"
0x7f7eaaaa80b4: "/test?q=a"
0x7f7eaaaa80be: "/test"
...
(gdb) p target_pos
$4 = (void *) 0x7f7eaaaa80b4
(gdb) p query_pos
$6 = (void *) 0x7f7eaaaa6af6
(gdb) p r->args->start
$8 = (u_char *) 0x7f7ea4002b02 "q=a HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\n\r\n"
(gdb) p r->target.start
$9 = (u_char *) 0x7f7ea40040c0 "/test?q=a"
That last address, 0x7f7ea40040c0, looks out of whack; it should be
smaller than r->args->start.
That results in a calculation in nxt_router_prepare_msg()
    if (r->args->start != NULL) {
        query_pos = nxt_pointer_to(target_pos,
                                   r->args->start - r->target.start);

        nxt_unit_sptr_set(&req->query, query_pos);

    } else {
that goes negative, which then gets stored in req->query.offset, which is a
uint32_t, and so wraps backwards from UINT_MAX to give us an offset of a
little under 4GiB, hence the above invalid memory access.
All this happens because in nxt_http_rewrite(), if we have a URL with a
query string, we create a new memory allocation to store the transformed
URL and query string.
We set r->target to point to this new allocation, but we also need to
point r->args->start to the start of the query string in this new
allocation.
Reported-by: René Landgrebe <https://github.com/rlandgrebe>
Tested-by: René Landgrebe <https://github.com/rlandgrebe>
Tested-by: Liam Crilly <liam.crilly@nginx.com>
Fixes: 14d6d97b ("HTTP: added basic URI rewrite.")
Closes: <https://github.com/nginx/unit/issues/964>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
The default configuration previously contained just a listeners and an
applications object. Since routes is now a principal configuration object,
and a recommended way of configuring Unit, it is now included in the
default configuration.
This change benefits new users because it explicitly introduces the three
principal configuration objects, which leads more intuitively to the
documentation. Experienced users may choose to ignore or delete routes.
routes is defined as an array instead of an object because this change
is designed to assist new users, and the simpler form of routes is
easier to understand.
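The resulting default configuration is presumably along these lines (a
sketch, not necessarily the exact shipped defaults):
    {
        "listeners": {},
        "routes": [],
        "applications": {}
    }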
|
|
We need to take into account the size of the nxt_unit_response_t
structure itself when calculating where to start appending data to in
memory.
Closes: <https://github.com/nginx/unit/issues/923>
Reported-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Andrew Clayton <a.clayton@nginx.org>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This closes #871 issue on GitHub.
|
|
Starting from Node.js 15.0.0 the chunk parameter of response.write()
can be a Uint8Array.
This closes #870 issue on GitHub.
|
|
Currently Wasm modules are limited to a 32-bit address space (until at
least the memory64 work is completed). All the counters etc. in the
request structure were u32's, which matched the 32-bit memory
limitation.
However there is really no need to disallow >4GiB uploads that can be
saved off to disk or some such.
To do this we just need to increase the ->content_len &
->total_content_sent members to u64's.
However, because we need the request structure to have the exact same
layout on 32-bit (for Wasm modules) as it does on 64-bit, we need to re-jig
the order of some of these members and add a four-byte padding member.
Thus the request structure now looks like on 64bit (as shown by
pahole(1))
struct nxt_wasm_request_s {
        uint32_t                   method_off;           /*     0     4 */
        uint32_t                   method_len;           /*     4     4 */
        uint32_t                   version_off;          /*     8     4 */
        uint32_t                   version_len;          /*    12     4 */
        uint32_t                   path_off;             /*    16     4 */
        uint32_t                   path_len;             /*    20     4 */
        uint32_t                   query_off;            /*    24     4 */
        uint32_t                   query_len;            /*    28     4 */
        uint32_t                   remote_off;           /*    32     4 */
        uint32_t                   remote_len;           /*    36     4 */
        uint32_t                   local_addr_off;       /*    40     4 */
        uint32_t                   local_addr_len;       /*    44     4 */
        uint32_t                   local_port_off;       /*    48     4 */
        uint32_t                   local_port_len;       /*    52     4 */
        uint32_t                   server_name_off;      /*    56     4 */
        uint32_t                   server_name_len;      /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        uint64_t                   content_len;          /*    64     8 */
        uint64_t                   total_content_sent;   /*    72     8 */
        uint32_t                   content_sent;         /*    80     4 */
        uint32_t                   content_off;          /*    84     4 */
        uint32_t                   request_size;         /*    88     4 */
        uint32_t                   nfields;              /*    92     4 */
        uint32_t                   tls;                  /*    96     4 */
        char                       __pad[4];             /*   100     4 */
        nxt_wasm_http_field_t      fields[];             /*   104     0 */

        /* size: 104, cachelines: 2, members: 25 */
        /* last cacheline: 40 bytes */
};
and the same structure (taken from unit-wasm) compiled as 32bit
struct luw_req {
        u32                        method_off;           /*     0     4 */
        u32                        method_len;           /*     4     4 */
        u32                        version_off;          /*     8     4 */
        u32                        version_len;          /*    12     4 */
        u32                        path_off;             /*    16     4 */
        u32                        path_len;             /*    20     4 */
        u32                        query_off;            /*    24     4 */
        u32                        query_len;            /*    28     4 */
        u32                        remote_off;           /*    32     4 */
        u32                        remote_len;           /*    36     4 */
        u32                        local_addr_off;       /*    40     4 */
        u32                        local_addr_len;       /*    44     4 */
        u32                        local_port_off;       /*    48     4 */
        u32                        local_port_len;       /*    52     4 */
        u32                        server_name_off;      /*    56     4 */
        u32                        server_name_len;      /*    60     4 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u64                        content_len;          /*    64     8 */
        u64                        total_content_sent;   /*    72     8 */
        u32                        content_sent;         /*    80     4 */
        u32                        content_off;          /*    84     4 */
        u32                        request_size;         /*    88     4 */
        u32                        nr_fields;            /*    92     4 */
        u32                        tls;                  /*    96     4 */
        char                       __pad[4];             /*   100     4 */
        struct luw_hdr_field       fields[];             /*   104     0 */

        /* size: 104, cachelines: 2, members: 25 */
        /* last cacheline: 40 bytes */
};
We can see the structures have the same layout, same size and no
padding.
We need the __pad member as otherwise I saw gcc and clang on Alpine
Linux automatically add the 'packed' attribute to the structure which
made the two structures not match.
Link: <https://github.com/WebAssembly/memory64>
Link: <https://github.com/nginx/unit-wasm>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
When trying to upload files to the luw-upload-reflector demo[0] above a
certain size (one that would mean Unit would need to make more than two
calls to the request_handler function in the Wasm module), we would get the
following error from wasmtime and the upload would stall on the third
call to the request_handler
WASMTIME ERROR: failed to call function [->wasm_request_handler]
error while executing at wasm backtrace:
0: 0x5ce2 - <unknown>!memcpy
1: 0x7df - luw_req_buf_append
at /home/andrew/src/unit-wasm/src/c/libunit-wasm.c:308:14
2: 0x3a1 - luw_request_handler
at /home/andrew/src/unit-wasm/examples/c/luw-upload-reflector.c:110:3
Caused by:
wasm trap: out of bounds memory access
This was due to ->content_off (the offset of where the actual body
content starts in the request structure/memory) being some overly large
value.
This was largely down to me being an idiot!
Before calling the loop that makes the calls to the request_handler we
would calculate the new offset, which is now just the size of the
request structure, as we don't re-send all the HTTP metadata and headers
etc. However, because this value is in the request structure, which is in
the shared memory, and we use this same memory for requests and
responses, when we make a response we overwrite this request structure
with the response structure, so our ->content_off is now some whacked-out
value when we make the next call to the request_handler.
To fix this we just need to reset ->content_off each time round the
loop.
There's also no point in setting ->nfields to 0; it has the same issue
as above, but doesn't get re-used by the Wasm module anyway.
[0]: <https://github.com/nginx/unit-wasm/blob/main/examples/c/luw-upload-reflector.c>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|