path: root/src/nxt_http_variables.c
Age | Commit message | Author | Files | Lines
2023-04-12 | HTTP: optimizing $request_line. | Alejandro Colomar | 1 | -30/+1
Don't reconstruct a new string for the $request_line from the parsed method, target, and HTTP version, but rather keep a pointer to the original memory where the request line was received.

This will be necessary for implementing URI rewrites, since we want to log the original request line, and not one constructed from the rewritten target.

This implementation changes behavior (only for invalid requests) in the following way:

Previous behavior was to log as many tokens from the request line as were parsed validly, thus:

    Request                ->  access log               ;  error log

    "GET / HTTP/1.1"       ->  "GET / HTTP/1.1" OK      ;  =
    "GET / HTTP/1.1"       ->  "GET / HTTP/1.1" [1]     ;  =
    "GET / HTTP/2.1"       ->  "GET / HTTP/2.1" OK      ;  =
    "GET / HTTP/1."        ->  "GET / HTTP/1." [2]      ;  "GET / HTTP/1. [null]"
    "GET / food"           ->  "GET / food" [2]         ;  "GET / food [null]"
    "GET / / HTTP/1.1"     ->  "GET / / HTTP/1.1" [2]   ;  =
    "GET / / HTTP/1.1"     ->  "GET / / HTTP/1.1" [2]   ;  =
    "GET food HTTP/1.1"    ->  "GET"                    ;  "GET [null] [null]"
    "OPTIONS * HTTP/1.1"   ->  "OPTIONS" [3]            ;  "OPTIONS [null] [null]"
    "FOOBAR baz HTTP/1.1"  ->  "FOOBAR"                 ;  "FOOBAR [null] [null]"
    "FOOBAR / HTTP/1.1"    ->  "FOOBAR / HTTP/1.1"      ;  =
    "get / HTTP/1.1"       ->  "-"                      ;  " [null] [null]"
    ""                     ->  "-"                      ;  " [null] [null]"

This behavior was rather inconsistent. We have several options to go forward with this patch:

- NGINX behavior.

  Log the entire request line, up to '\r' | '\n', even if it was invalid.

  This is the most informative alternative. However, RFC-complying agents will probably not send invalid requests.

  This information would be interesting to users where debugging requests constructed manually via netcat(1) or a similar tool, or maybe for debugging a client, are important. It might be interesting to support this in the future if our users are interested; for now, since this approach requires looping over invalid requests twice, that's an overhead that we'd better avoid.

- Previous Unit behavior

  This is relatively fast (almost as fast as the next alternative, the one we chose), but the implementation is ugly, in that we need to perform the same operation in many places around the code.

  If we want performance, probably the next alternative is better; if we want to be informative, then the first one is better (maybe in combination with the third one too).

- Chosen behavior

  Only logging request lines when the request is valid. For any invalid request, or even unsupported ones, the request line will be logged as "-". Thus:

    Request                ->  access log [4]

    "GET / HTTP/1.1"       ->  "GET / HTTP/1.1" OK
    "GET / HTTP/1.1"       ->  "GET / HTTP/1.1" [1]
    "GET / HTTP/2.1"       ->  "-" [3]
    "GET / HTTP/1."        ->  "-"
    "GET / food"           ->  "-"
    "GET / / HTTP/1.1"     ->  "GET / / HTTP/1.1" [2]
    "GET / / HTTP/1.1"     ->  "GET / / HTTP/1.1" [2]
    "GET food HTTP/1.1"    ->  "-"
    "OPTIONS * HTTP/1.1"   ->  "-"
    "FOOBAR baz HTTP/1.1"  ->  "-"
    "FOOBAR / HTTP/1.1"    ->  "FOOBAR / HTTP/1.1"
    "get / HTTP/1.1"       ->  "-"
    ""                     ->  "-"

  This is less informative than previous behavior, but considering how inconsistent it was, and that RFC-complying agents will probably not send us such requests, we're ready to lose that information in the log. This is of course the fastest and simplest implementation we can get.

  We've chosen to implement this alternative in this patch. Since we modified the behavior, this patch also changes the affected tests.

[1]: Multiple successive spaces as a token delimiter is allowed by the RFC, but it is discouraged, and considered a security risk. It is currently supported by Unit, but we will probably drop support for it in the future.

[2]: Unit currently supports spaces in the request-target. This is a violation of the relevant RFC (linked below), and will be fixed in the future by considering those targets invalid and returning a 400 (Bad Request); the log lines with the previous inconsistent behavior would then change.

[3]: Not yet supported.

[4]: In the error log, regarding the "log_routes" conditional logging of the request line, we only need to log the request line if it was valid. It doesn't make sense to log "" or "-" in case that the request was invalid, since this is only useful for understanding decisions of the router. In this case, the access log is more appropriate, which shows that the request was invalid, and a 400 was returned. When the request line is valid, it is printed in the error log exactly as in the access log.

Link: <https://datatracker.ietf.org/doc/html/rfc9112#section-3>
Suggested-by: Liam Crilly <liam@nginx.com>
Reviewed-by: Zhidao Hong <z.hong@f5.com>
Cc: Timo Stark <t.stark@nginx.com>
Cc: Andrei Zeliankou <zelenkov@nginx.com>
Cc: Andrew Clayton <a.clayton@nginx.com>
Cc: Artem Konev <a.konev@f5.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
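The chosen behavior can be illustrated with a minimal C sketch. It is not Unit's code: str_view_t, request_t, request_line_save() and request_line_log_value() are hypothetical names made up for this example, and the parser is assumed to know where the request line ends and whether it was valid.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string view: a pointer into an existing buffer plus a length. */
typedef struct {
    const char  *start;
    size_t      length;
} str_view_t;

/* Hypothetical request record; only the fields this sketch needs. */
typedef struct {
    str_view_t  request_line;   /* points into the receive buffer, no copy */
    int         valid;          /* nonzero if the request line parsed cleanly */
} request_t;

/*
 * Remember where the request line sits in the receive buffer instead of
 * rebuilding it from the parsed method, target, and version.
 */
static void
request_line_save(request_t *r, const char *start, size_t length, int valid)
{
    assert(r != NULL && start != NULL);

    r->request_line.start = start;
    r->request_line.length = length;
    r->valid = valid;
}

/* Value to log: the original bytes for a valid request, "-" otherwise. */
static str_view_t
request_line_log_value(const request_t *r)
{
    static const str_view_t  dash = { "-", 1 };

    assert(r != NULL);

    return r->valid ? r->request_line : dash;
}
```

Because the view aliases the receive buffer, a later rewrite of the parsed target would not affect what gets logged, which is the property the commit message asks for.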
2022-11-20 | Var: separating nxt_tstr_t from nxt_var_t. | Zhidao HONG | 1 | -3/+3
This prepares for the introduction of njs support. Each option that supports native variables and the JS template literals introduced next is unified as a template string.

No functional changes.
2022-11-04 | Removed the unsafe nxt_memcmp() wrapper for memcmp(3). | Alejandro Colomar | 1 | -2/+2
The casts are unnecessary, since memcmp(3)'s arguments are 'void *'. It might have been necessary in the times of K&R, where 'void *' didn't exist. Nowadays, it's unnecessary, and _very_ unsafe, since casts can hide all classes of bugs by silencing most compiler warnings.

The changes from nxt_memcmp() to memcmp(3) were scripted:

    $ find src/ -type f \
      | grep '\.[ch]$' \
      | xargs sed -i 's/nxt_memcmp/memcmp/'

Reviewed-by: Andrew Clayton <a.clayton@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
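To make the argument about casts concrete, here is a small C sketch. The cast_memcmp() macro is a hypothetical stand-in written in the style of the removed wrapper, not a copy of it, and has_prefix() is an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper that casts at the call site.  The explicit casts
 * accept, for example, a pointer-width integer passed by mistake, where
 * a plain memcmp(3) call would make the compiler complain.
 */
#define cast_memcmp(s1, s2, n)  memcmp((char *) (s1), (char *) (s2), (n))

/*
 * Plain memcmp(3): object pointers convert to 'const void *' implicitly,
 * so no cast is needed and the compiler keeps checking the argument types.
 */
static int
has_prefix(const char *s, size_t len, const char *prefix)
{
    size_t  plen;

    assert(s != NULL && prefix != NULL);

    plen = strlen(prefix);

    return len >= plen && memcmp(s, prefix, plen) == 0;
}
```

For correct arguments both forms behave identically; the difference is only in which mistakes the compiler is still allowed to report.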
2022-10-12 | HTTP: added a $request_time variable. | Zhidao HONG | 1 | -0/+33
2022-07-20 | Var: added a $dollar variable that translates to a '$'. | Andrew Clayton | 1 | -0/+14
Allow $dollar (or ${dollar}) to translate to a literal $ to allow support for sub-delimiters in URIs.

It is possible to have URLs like

    https://example.com/path/15$1588/9925$2976.html

and thus it would be useful to be able to specify them in various bits of the unit config such as the location setting.

However, this hadn't been possible due to $ being used to denote variables for substitution, e.g. $host.

As was noted in the GitHub issue below, @VBart suggested using $sign to represent a literal $. However, I feel $dollar is more appropriate, so that we have a variable named after the thing it represents. Also, @tippexs found[0] that &dollar is used in HTML to represent a $, so there is some somewhat related precedent. (The other idea, to use $$, was rejected in my original pull-request[1] for this issue.)

This means the above URL could be specified as

    https://example.com/path/15${dollar}1588/9925${dollar}2976.html

in the unit config.

This is done by adding a variable called 'dollar' which is loaded into the variables hash table which translates into a literal $.

This is then handled in nxt_var_next_part() where variables are parsed for lookup and $dollar is set for substitution by a literal '$'. Actual variable substitution happens in nxt_var_query_finish().

[0]: https://github.com/nginx/unit/pull/693#issuecomment-1130412323
[1]: https://github.com/nginx/unit/pull/693

Closes: https://github.com/nginx/unit/issues/675
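The substitution described above can be sketched in isolation. The following C is not the code in nxt_var_next_part() or nxt_var_query_finish(); expand_dollar() and is_name_char() are hypothetical helpers, and the rule that variable names consist of letters, digits, and underscores is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumption: variable names are made of letters, digits, and '_'. */
static int
is_name_char(char c)
{
    return isalnum((unsigned char) c) || c == '_';
}

/*
 * Copy 'tmpl' into 'dst', replacing "${dollar}" and "$dollar" with '$'.
 * All other text, including other variables, is copied unchanged.
 * Returns the output length, or (size_t) -1 if 'dst' is too small.
 */
static size_t
expand_dollar(const char *tmpl, char *dst, size_t size)
{
    size_t  n = 0;

    static const char  braced[] = "${dollar}";
    static const char  plain[] = "$dollar";

    assert(tmpl != NULL && dst != NULL);

    if (size == 0) {
        return (size_t) -1;
    }

    while (*tmpl != '\0') {
        char  c = *tmpl;

        if (strncmp(tmpl, braced, sizeof(braced) - 1) == 0) {
            tmpl += sizeof(braced) - 1;
            c = '$';

        } else if (strncmp(tmpl, plain, sizeof(plain) - 1) == 0
                   && !is_name_char(tmpl[sizeof(plain) - 1]))
        {
            /* "$dollar" must not be the start of a longer name. */
            tmpl += sizeof(plain) - 1;
            c = '$';

        } else {
            tmpl++;
        }

        if (n + 1 >= size) {
            return (size_t) -1;
        }

        dst[n++] = c;
    }

    dst[n] = '\0';

    return n;
}
```

The braced form matters for the URL in the commit message: under the naming assumption above, "$dollar1588" would read as a variable named "dollar1588", while "${dollar}1588" is unambiguous.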
2022-07-14 | HTTP: added more variables. | Zhidao HONG | 1 | -0/+235
This commit adds the following variables: $remote_addr, $time_local, $request_line, $status, $body_bytes_sent, $header_referer, $header_user_agent.
2022-07-14 | Var: dynamic variables support. | Zhidao HONG | 1 | -23/+164
This commit adds the variables $arg_NAME, $header_NAME, and $cookie_NAME.
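One way to picture these dynamic variables is as a fixed prefix followed by a caller-chosen NAME. The sketch below only illustrates that naming scheme; var_kind_t and dynamic_var_classify() are hypothetical, and rejecting an empty NAME is an assumption of this example rather than documented Unit behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a variable name by its prefix. */
typedef enum {
    VAR_UNKNOWN = 0,
    VAR_ARG,
    VAR_HEADER,
    VAR_COOKIE
} var_kind_t;

/*
 * Split a variable name such as "header_user_agent" into its kind and the
 * NAME part ("user_agent").  On success '*field' points at NAME inside
 * 'name'; otherwise it is set to NULL and VAR_UNKNOWN is returned.
 */
static var_kind_t
dynamic_var_classify(const char *name, const char **field)
{
    size_t  i;

    static const struct {
        const char  *prefix;
        var_kind_t  kind;
    } prefixes[] = {
        { "arg_",    VAR_ARG },
        { "header_", VAR_HEADER },
        { "cookie_", VAR_COOKIE },
    };

    assert(name != NULL && field != NULL);

    for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        size_t  len = strlen(prefixes[i].prefix);

        /* Assumption of this sketch: an empty NAME is not accepted. */
        if (strncmp(name, prefixes[i].prefix, len) == 0
            && name[len] != '\0')
        {
            *field = name + len;
            return prefixes[i].kind;
        }
    }

    *field = NULL;

    return VAR_UNKNOWN;
}
```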
2022-06-02 | Summary: Var: removing all async stuff. | Zhidao HONG | 1 | -16/+10
No functional changes.
2022-05-31 | Var: Added $request_uri (as in NGINX). | Alejandro Colomar | 1 | -0/+20
This supports a new variable $request_uri that contains the path and the query (See RFC 3986, section 3). Its contents are percent encoded. This is useful for example to redirect HTTP to HTTPS:

    {
        "return": "301",
        "location": "https://$host$request_uri"
    }

When <http://example.com/foo%23bar?baz> is requested, the server redirects to <https://example.com/foo%23bar?baz>.

===

Testing:

//diff --git a/src/nxt_http_return.c b/src/nxt_http_return.c
//index 82c9156..adeb3a1 100644
//--- a/src/nxt_http_return.c
//+++ b/src/nxt_http_return.c
//@@ -196,6 +196,7 @@ nxt_http_return_send_ready(nxt_task_t *task, void *obj, void *data)
//         field->value = ctx->encoded.start;
//         field->value_length = ctx->encoded.length;
//     }
//+    fprintf(stderr, "ALX: target[%1$i]: <%2$.*1$s>\n", (int)r->target.length, r->target.start);
//
//     r->state = &nxt_http_return_send_state;
//

    {
        "listeners": { "*:81": { "pass": "routes/ru" } },
        "routes": { "ru": [{ "action": { "return": 301, "location": "$request_uri" } }] }
    }

    $ curl -i http://localhost:81/*foo%2Abar?baz#arg
    HTTP/1.1 301 Moved Permanently
    Location: /*foo%2Abar?baz
    Server: Unit/1.27.0
    Date: Mon, 30 May 2022 16:04:30 GMT
    Content-Length: 0

    $ sudo cat /usr/local/unit.log | grep ALX
    ALX: target[15]: </*foo%2Abar?baz>
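The HTTP-to-HTTPS redirect above amounts to concatenating the host with the request URI exactly as received, so percent-encoding such as %23 survives. A minimal C sketch of that concatenation follows; https_location() is a hypothetical helper for illustration, not a Unit function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Build "https://" + host + request_uri into 'dst'.  The request URI
 * (path and query) is used as received, without decoding or re-encoding.
 * Returns the length written, or -1 if 'dst' is too small.
 */
static int
https_location(char *dst, size_t size, const char *host,
    const char *request_uri)
{
    int  n;

    assert(dst != NULL && host != NULL && request_uri != NULL);

    n = snprintf(dst, size, "https://%s%s", host, request_uri);

    return (n >= 0 && (size_t) n < size) ? n : -1;
}
```

With host "example.com" and request URI "/foo%23bar?baz", this yields the redirect target quoted in the commit message.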
2020-08-28 | Vars: added $host. | Valentin Bartenev | 1 | -0/+20
This closes #407 issue on GitHub.
2020-08-13 | Basic variables support. | Valentin Bartenev | 1 | -0/+59