Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
|
|
Don't reconstruct a new string for the $request_line from the parsed
method, target, and HTTP version, but rather keep a pointer to the
original memory where the request line was received.
This will be necessary for implementing URI rewrites, since we want to
log the original request line, and not one constructed from the
rewritten target.
This implementation changes behavior (only for invalid requests) in the
following way:
Previous behavior was to log as many tokens from the request line as
were parsed validly, thus:
Request -> access log ; error log
"GET / HTTP/1.1" -> "GET / HTTP/1.1" OK ; =
"GET / HTTP/1.1" -> "GET / HTTP/1.1" [1] ; =
"GET / HTTP/2.1" -> "GET / HTTP/2.1" OK ; =
"GET / HTTP/1." -> "GET / HTTP/1." [2] ; "GET / HTTP/1. [null]"
"GET / food" -> "GET / food" [2] ; "GET / food [null]"
"GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] ; =
"GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] ; =
"GET food HTTP/1.1" -> "GET" ; "GET [null] [null]"
"OPTIONS * HTTP/1.1" -> "OPTIONS" [3] ; "OPTIONS [null] [null]"
"FOOBAR baz HTTP/1.1"-> "FOOBAR" ; "FOOBAR [null] [null]"
"FOOBAR / HTTP/1.1" -> "FOOBAR / HTTP/1.1" ; =
"get / HTTP/1.1" -> "-" ; " [null] [null]"
"" -> "-" ; " [null] [null]"
This behavior was rather inconsistent. We have several options to go
forward with this patch:
- NGINX behavior.
Log the entire request line, up to '\r' | '\n', even if it was
invalid.
This is the most informative alternative. However, RFC-complying
requests will probably not send invalid requests.
This information would be interesting to users where debugging
requests constructed manually via netcat(1) or a similar tool, or
maybe for debugging a client, are important. It might be interesting
to support this in the future if our users are interested; for now,
since this approach requires looping over invalid requests twice,
that's an overhead that we better avoid.
- Previous Unit behavior
This is relatively fast (almost as fast as the next alternative, the
one we chose), but the implementation is ugly, in that we need to
perform the same operation in many places around the code.
If we want performance, probably the next alternative is better; if
we want to be informative, then the first one is better (maybe in
combination with the third one too).
- Chosen behavior
Only logging request lines when the request is valid. For any
invalid request, or even unsupported ones, the request line will be
logged as "-". Thus:
Request -> access log [4]
"GET / HTTP/1.1" -> "GET / HTTP/1.1" OK
"GET / HTTP/1.1" -> "GET / HTTP/1.1" [1]
"GET / HTTP/2.1" -> "-" [3]
"GET / HTTP/1." -> "-"
"GET / food" -> "-"
"GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2]
"GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2]
"GET food HTTP/1.1" -> "-"
"OPTIONS * HTTP/1.1" -> "-"
"FOOBAR baz HTTP/1.1"-> "-"
"FOOBAR / HTTP/1.1" -> "FOOBAR / HTTP/1.1"
"get / HTTP/1.1" -> "-"
"" -> "-"
This is less informative than previous behavior, but considering how
inconsistent it was, and that RFC-complying agents will probably not
send us such requests, we're ready to lose that information in the
log. This is of course the fastest and simplest implementation we
can get.
We've chosen to implement this alternative in this patch. Since we
modified the behavior, this patch also changes the affected tests.
[1]: Multiple successive spaces as a token delimiter is allowed by the
RFC, but it is discouraged, and considered a security risk. It is
currently supported by Unit, but we will probably drop support for
it in the future.
[2]: Unit currently supports spaces in the request-target. This is
a violation of the relevant RFC (linked below), and will be fixed
in the future, and consider those targets as invalid, returning
a 400 (Bad Request), and thus the log lines with the previous
inconsistent behavior would be changed.
[3]: Not yet supported.
[4]: In the error log, regarding the "log_routes" conditional logging
of the request line, we only need to log the request line if it
was valid. It doesn't make sense to log "" or "-" in case that
the request was invalid, since this is only useful for
understanding decisions of the router. In this case, the access
log is more appropriate, which shows that the request was invalid,
and a 400 was returned. When the request line is valid, it is
printed in the error log exactly as in the access log.
Link: <https://datatracker.ietf.org/doc/html/rfc9112#section-3>
Suggested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Cc: Timo Stark <t.stark@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Artem Konev <a.konev@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
This makes the build tree more organized, which is good for adding new
stuff. Now, it's useful for example for adding manual pages in man3/,
but it may be useful in the future for example for extending the build
system to run linters (e.g., clang-tidy(1), Clang analyzer, ...) on the
C source code.
Previously, the build tree was quite flat, and looked like this (after
`./configure && make`):
$ tree -I src build
build
├── Makefile
├── autoconf.data
├── autoconf.err
├── echo
├── libnxt.a
├── nxt_auto_config.h
├── nxt_version.h
├── unitd
└── unitd.8
1 directory, 9 files
And after this patch, it looks like this:
$ tree -I src build
build
├── Makefile
├── autoconf.data
├── autoconf.err
├── bin
│ └── echo
├── include
│ ├── nxt_auto_config.h
│ └── nxt_version.h
├── lib
│ ├── libnxt.a
│ └── unit
│ └── modules
├── sbin
│ └── unitd
├── share
│ └── man
│ └── man8
│ └── unitd.8
└── var
├── lib
│ └── unit
├── log
│ └── unit
└── run
└── unit
17 directories, 9 files
It also solves one issue introduced in
5a37171f733f ("Added default values for pathnames."). Before that
commit, it was possible to run unitd from the build system
(`./build/unitd`). Now, since it expects files in a very specific
location, that has been broken. By having a directory structure that
mirrors the installation, it's possible to trick it to believe it's
installed, and run it from there:
$ ./configure --prefix=./build
$ make
$ ./build/sbin/unitd
Fixes: 5a37171f733f ("Added default values for pathnames.")
Reported-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Zhidao Hong <z.hong@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
In BSD systems, it's usually </var/db> or some other dir under </var>
that is not </var/lib>, so $statedir is a more generic name. See
hier(7).
Reported-by: Andrei Zeliankou <zelenkov@nginx.com>
Reported-by: Zhidao Hong <z.hong@f5.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Cc: Liam Crilly <liam@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
We install jars with names like websocket-api-${NXT_JAVA_MODULE}-$NXT_VERSION.jar,
which translates to versioned NXT_JAVA_MODULE in the packaging system, e.g.
websocket-api-java11-1.30.0.jar.
|
|
|
|
|
|
|
|
Also added temporary directory clearing after checking available
modules to prevent garbage environment when tests start.
|
|
Previously, it was necessary to support older versions of Python for
compatibility. F-strings were released in Python 3.6. Python 3.5 was
marked as unsupported by the end of 2020, so now it's possible to start
using f-strings safely for better readability and performance.
|
|
|
|
Mutable types as default arguments is bad practice since
they are evaluated only once when the function is defined.
|
|
This allows one to simply run `./configure` and expect it to
produce sane defaults for an install.
Previously, without specifying `--prefix=...`, `make install`
would simply fail, recommending to set `--prefix` or `DESTDIR`,
but that recommendation was incomplete at best, since it didn't
set many of the subdirs needed for a good organization.
Setting `DESTDIR` was even worse, since that shouldn't even affect
an installation (it is required to be transparent to the
installation).
/usr/local is the historic Unix standard path to use for
installations from source made manually by the admin of the
system. Some package managers (Homebrew, I'm looking specifically
at you) have abused that path to install their things, but 1) it's
not our fault that someone else incorrectly abuses that path (and
they seem to be fixing it for newer archs; e.g., they started
using /opt/homebrew for Apple Silicon), 2) there's no better path
than /usr/local, 3) we still allow changing it for systems where
this might not be the desired path (MacOS Intel with hombrew), and
4) it's _the standard_.
See a related conversation with Ingo (OpenBSD maintainer):
On 7/27/22 16:16, Ingo Schwarze wrote:
> Hi Alejandro,
[...]
>
> Alejandro Colomar wrote on Sun, Jul 24, 2022 at 07:07:18PM +0200:
>> On 7/24/22 16:57, Ingo Schwarze wrote:
>>> Alejandro Colomar wrote on Sun, Jul 24, 2022 at 01:20:46PM +0200:
>
>>>> /usr/local is for sysadmins to build from source;
>
>>> Doing that is *very* strongly discouraged on OpenBSD.
>
>> I guess that's why the directory was reused in the BSDs to install ports
>> (probably ports were installed by the sysadmin there, and by extension,
>> ports are now always installed there, but that's just a guess).
>
> Maybe. In any case, the practice of using /usr/local for packages
> created from ports is significantly older than the recommendation
> to refrain from using upstream "make install" outside the ports
> framework.
>
> * The FreeBSD ports framework was started by Jordan Hubbard in 1993.
> * The ports framework was ported from FreeBSD to OpenBSD
> by Niklas Hallqvist in 1996.
> * NetBSD pkgsrc was forked from FreeBSD ports by Alistair G. Crooks
> and Hubert Feyrer in 1997.
>
> I failed to quickly find Jordan's original version, but rev. 1.1
> of /usr/ports/infrastructure/mk/bsd.port.mk in OpenBSD (dated Jun 3
> 22:47:10 1996 UTC) already said
>
> LOCALBASE ?= /usr/local
> PREFIX ?= ${LOCALBASE}
>
[...]
>> I had a discussion in NGINX Unit about it, and
>> the decission for now has been: "support prefix=/usr/local for default
>> manual installation through the Makefile, and let BSD users adjust to
>> their preferred path".
>
> That's an *excellent* solution for the task, thanks for doing it
> the right way. By setting PREFIX=/usr/local by default in the
> upstream Makefile, you are minimizing the work for *BSD porters.
>
> The BSD ports frameworks will typically run the upstreak "make install"
> with the variable DESTDIR set to a custom value, for example
>
> DESTDIR=/usr/ports/pobj/groff-1.23.0/fake-amd64
>
> so if the upstream Makefile sets PREFIX=/usr/local ,
> that's perfect, everything gets installed to the right place
> without an intervention by the person doing the porting.
>
> Of course, if the upstream Makefile would use some other PREFIX,
> that would not be a huge obstacle. All we have to do in that case
> is pass the option --prefix=/usr/local to the ./configure script,
> or something equivalent if the software isn't using GNU configure.
>
>> We were concerned that we might get collisions
>> with the BSD port also installing in /usr/local, but that's the least
>> evil (and considering BSD users don't typically run `make install`, it's
>> not so bad).
>
> It's not bad at all. It's perfect.
>
> Of course, if a user wants to install *without* the ports framework,
> they have to provide their own --prefix. But that's not an issue
> because it is easy to do, and installing without a port is discouraged
> anyway.
===
Directory variables should never contain a trailing slash (I've
learned that the hard way, where some things would break
unexpectedly). Especially, make(1) is likely to have problems
when things have double slashes or a trailing slash, since it
treats filenames as text strings. I've removed the trailing slash
from the prefix, and added it to the derivate variables just after
the prefix. pkg-config(1) also expects directory variables to have
no trailing slash.
===
I also removed the code that would set variables as depending on
the prefix if they didn't start with a slash, because that is a
rather non-obvious behavior, and things should not always depend
on prefix, but other dirs such as $(runstatedir), so if we keep
a similar behavior it would be very unreliable. Better keep
variables intact if set, or use the default if unset.
===
Print the real defaults for ./configure --help, rather than the actual
values.
===
I used a subdirectory under the standard /var/lib for NXT_STATE,
instead of a homemade "state" dir that does the same thing.
===
Modified the Makefile to create some dirs that weren't being
created, and also remove those that weren't being removed in
uninstall, probably because someone forgot to add them.
===
Add new options for setting the new variables, and rename some to be
consistent with the standard names. Keep the old ones at configuration
time for compatibility, but mark them as deprecated. Don't keep the old
ones at exec time.
===
A summary of the default config is:
Unit configuration summary:
bin directory: ............. "/usr/local/bin"
sbin directory: ............ "/usr/local/sbin"
lib directory: ............. "/usr/local/lib"
include directory: ......... "/usr/local/include"
man pages directory: ....... "/usr/local/share/man"
modules directory: ......... "/usr/local/lib/unit/modules"
state directory: ........... "/usr/local/var/lib/unit"
tmp directory: ............. "/tmp"
pid file: .................. "/usr/local/var/run/unit/unit.pid"
log file: .................. "/usr/local/var/log/unit/unit.log"
control API socket: ........ "unix:/usr/local/var/run/unit/control.unit.sock"
Link: <https://www.gnu.org/prep/standards/html_node/Directory-Variables.html>
Link: <https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html>
Reviewed-by: Artem Konev <a.konev@f5.com>
Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Tested-by: Andrew Clayton <a.clayton@nginx.com>
Reviewed-by: Konstantin Pavlov <thresh@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
Since the previous commit, we now properly handle 403 Forbidden & 404
Not Found errors in the PHP language module.
This adds a test for 403 Forbidden to test/test_php_application.py, but
also fixes a test in test/test_php_targets.py where we were checking for
503 but should have been a 404, which we now do.
Acked-by: Alejandro Colomar <alx@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
[ Incorporates a couple of small test cleanups from Andrei ]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
|
|
@dward on GitHub reported an issue with a URL like
http://foo.bar/test.php?blah=test.php/foo
where we would end up trying to run the script
test.php?blah=test.php
In the PHP module the format 'file.php/' is treated as a special case in
nxt_php_dynamic_request() where we check the _path_ part of the url for
the string '.php/'.
The problem is that the path actually also contains the query string,
thus we were finding 'test.php/' in the above URL and treating that
whole path as the script to run.
The fix is simple, replace the strstr(3) with a memmem(3), where we can
limit the amount of path we use for the check.
The trick here and what is not obvious from the code is that while
path.start points to the whole path including the query string,
path.length only contains the length of the _path_ part.
NOTE: memmem(3) is a GNU extension and is neither specified by POSIX or
ISO C, however it is available on a number of other systems, including:
FreeBSD, OpenBSD, NetBSD, illumos, and macOS.
If it comes to it we can implement a simple alternative for systems
which lack memmem(3).
This also adds a test case (provided by @dward) to cover this.
Closes: <https://github.com/nginx/unit/issues/781>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com> [test]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
This patch gives users the option to set a `"prefix"` attribute
for Python applications, either at the top level or for specific
`"target"`s. If the attribute is present, the value of `"prefix"`
must be a string beginning with `"/"`. If the value of the `"prefix"`
attribute is longer than 1 character and ends in `"/"`, the
trailing `"/"` is stripped.
The purpose of the `"prefix"` attribute is to set the `SCRIPT_NAME`
context value for WSGI applications and the `root_path` context
value for ASGI applications, allowing applications to properly route
requests regardless of the path that the server uses to expose the
application.
The context value is only set if the request's URL path begins with
the value of the `"prefix"` attribute. In all other cases, the
`SCRIPT_NAME` or `root_path` values are not set. In addition, for
WSGI applications, the value of `"prefix"` will be stripped from
the beginning of the request's URL path before it is sent to the
application.
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Artem Konev <artem.konev@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
|
|
Added tests for the "large_header_buffer_size" and
"large_header_buffers" configuration options.
|
|
|
|
Hide expected alerts by default.
Silence succesfull "go build" information.
|
|
|
|
|
|
Unit parsed the case of "$uri$$host" into unknown variables.
This commit makes it invalid variable instead.
|
|
@dward on GitHub reported an issue with a URL like
http://foo.bar/test.php?blah=test.php/foo
where we would end up trying to run the script
test.php?blah=test.php
In the PHP module the format 'file.php/' is treated as a special case in
nxt_php_dynamic_request() where we check the _path_ part of the url for
the string '.php/'.
The problem is that the path actually also contains the query string,
thus we were finding 'test.php/' in the above URL and treating that
whole path as the script to run.
The fix is simple, replace the strstr(3) with a memmem(3), where we can
limit the amount of path we use for the check.
The trick here and what is not obvious from the code is that while
path.start points to the whole path including the query string,
path.length only contains the length of the _path_ part.
NOTE: memmem(3) is a GNU extension and is neither specified by POSIX or
ISO C, however it is available on a number of other systems, including:
FreeBSD, OpenBSD, NetBSD, illumos, and macOS.
If it comes to it we can implement a simple alternative for systems
which lack memmem(3).
This also adds a test case (provided by @dward) to cover this.
Closes: <https://github.com/nginx/unit/issues/781>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Alejandro Colomar <alx@nginx.com>
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com> [test]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
Migration of "share" behaviour was dropped after b57b4749b993.
|
|
|
|
Now version output evaluates only once.
OpenSSL checks more carefully.
|
|
Both @lucatacconi & @mwoodpatrick reported what appears to be the same
issue on GitHub. Namely that when using the PHP language module and
trying to access a URL that is a directory but without specifying the
trailing '/', they were getting a '503 Service Unavailable' error.
Note: This is when _not_ using the 'script' option.
E.g with the following config
{
"listeners": {
"[::1]:8080": {
"pass": "applications/php"
}
},
"applications": {
"php": {
"type": "php",
"root": "/var/tmp/unit-php"
}
}
}
and with a directory path of /var/tmp/unit-php/foo containing an
index.php, you would see the following
$ curl http://localhost/foo
<title>Error 503</title>
Error 503
However
$ curl http://localhost/foo/
would work and serve up the index.php
This commit fixes the above so you get the desired behaviour without
specifying the trailing '/' by doing the following
1] If the URL doesn't end in .php and doesn't have a trailing '/'
then check if the requested path is a directory.
2) If it is a directory then create a 301 re-direct pointing to it.
This matches the behaviour of the likes of nginx, Apache and
lighttpd.
This also matches the behaviour of the "share" action in Unit.
This doesn't effect the behaviour of the 'script' option which bypasses
the nxt_php_dynamic_request() function.
This also adds a couple of tests to test/test_php_application.py to
ensure this continues to work.
Closes: <https://github.com/nginx/unit/issues/717>
Closes: <https://github.com/nginx/unit/issues/753>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
|
|
Access log used for the variables testing instead of limited routing.
Added missed test for $status variable.
Some tests moved from "test_access_log.py" to "test_variables.py".
|
|
|
|
|
|
The fixing supports the cookie value with the '=' character.
This is related to #756 PR on Github.
Thanks to changxiaocui.
|
|
|
|
|
|
|
|
|
|
Registering an isolated PID in the global PID hash is wrong
because it can be duplicated. Isolated processes are stored only
in the children list until the response for the WHOAMI message is
processed and the global PID is discovered.
To remove isolated siblings, a pointer to the children list is
introduced in the nxt_process_init_t struct.
This closes #633 issue on GitHub.
|
|
|
|
Also added tests for the following variables:
$request_line, $time_local, $bytes_sent, and $status.
|
|
Also removed unnesessary re.compile() calls.
|
|
Having the basename of the script pathname was incorrect. While
we don't have something more accurate, the best thing to do is to
have it empty (which should be the right thing most of the time).
This closes #715 issue on GitHub.
The bug was introduced in git commit
0032543fa65f454c471c968998190b027c1ff270
'Ruby: added the Rack environment parameter "SCRIPT_NAME".'.
|
|
If you need to specify a $ in a URI you can now use '$dollar' or
'${dollar}'.
Added some tests for the above to test_variables.py setting a Location
string.
|
|
|
|
|