summaryrefslogtreecommitdiffhomepage
path: root/src/nxt_http_parse.h (follow)
AgeCommit message (Collapse)AuthorFilesLines
2023-04-12HTTP: optimizing $request_line.Alejandro Colomar1-0/+1
Don't reconstruct a new string for the $request_line from the parsed method, target, and HTTP version, but rather keep a pointer to the original memory where the request line was received. This will be necessary for implementing URI rewrites, since we want to log the original request line, and not one constructed from the rewritten target. This implementation changes behavior (only for invalid requests) in the following way: Previous behavior was to log as many tokens from the request line as were parsed validly, thus: Request -> access log ; error log "GET / HTTP/1.1" -> "GET / HTTP/1.1" OK ; = "GET / HTTP/1.1" -> "GET / HTTP/1.1" [1] ; = "GET / HTTP/2.1" -> "GET / HTTP/2.1" OK ; = "GET / HTTP/1." -> "GET / HTTP/1." [2] ; "GET / HTTP/1. [null]" "GET / food" -> "GET / food" [2] ; "GET / food [null]" "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] ; = "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] ; = "GET food HTTP/1.1" -> "GET" ; "GET [null] [null]" "OPTIONS * HTTP/1.1" -> "OPTIONS" [3] ; "OPTIONS [null] [null]" "FOOBAR baz HTTP/1.1"-> "FOOBAR" ; "FOOBAR [null] [null]" "FOOBAR / HTTP/1.1" -> "FOOBAR / HTTP/1.1" ; = "get / HTTP/1.1" -> "-" ; " [null] [null]" "" -> "-" ; " [null] [null]" This behavior was rather inconsistent. We have several options to go forward with this patch: - NGINX behavior. Log the entire request line, up to '\r' | '\n', even if it was invalid. This is the most informative alternative. However, RFC-complying requests will probably not send invalid requests. This information would be interesting to users where debugging requests constructed manually via netcat(1) or a similar tool, or maybe for debugging a client, are important. It might be interesting to support this in the future if our users are interested; for now, since this approach requires looping over invalid requests twice, that's an overhead that we better avoid. - Previous Unit behavior This is relatively fast (almost as fast as the next alternative, the one we chose), but the implementation is ugly, in that we need to perform the same operation in many places around the code. If we want performance, probably the next alternative is better; if we want to be informative, then the first one is better (maybe in combination with the third one too). - Chosen behavior Only logging request lines when the request is valid. For any invalid request, or even unsupported ones, the request line will be logged as "-". Thus: Request -> access log [4] "GET / HTTP/1.1" -> "GET / HTTP/1.1" OK "GET / HTTP/1.1" -> "GET / HTTP/1.1" [1] "GET / HTTP/2.1" -> "-" [3] "GET / HTTP/1." -> "-" "GET / food" -> "-" "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] "GET / / HTTP/1.1" -> "GET / / HTTP/1.1" [2] "GET food HTTP/1.1" -> "-" "OPTIONS * HTTP/1.1" -> "-" "FOOBAR baz HTTP/1.1"-> "-" "FOOBAR / HTTP/1.1" -> "FOOBAR / HTTP/1.1" "get / HTTP/1.1" -> "-" "" -> "-" This is less informative than previous behavior, but considering how inconsistent it was, and that RFC-complying agents will probably not send us such requests, we're ready to lose that information in the log. This is of course the fastest and simplest implementation we can get. We've chosen to implement this alternative in this patch. Since we modified the behavior, this patch also changes the affected tests. [1]: Multiple successive spaces as a token delimiter is allowed by the RFC, but it is discouraged, and considered a security risk. It is currently supported by Unit, but we will probably drop support for it in the future. [2]: Unit currently supports spaces in the request-target. This is a violation of the relevant RFC (linked below), and will be fixed in the future, and consider those targets as invalid, returning a 400 (Bad Request), and thus the log lines with the previous inconsistent behavior would be changed. [3]: Not yet supported. [4]: In the error log, regarding the "log_routes" conditional logging of the request line, we only need to log the request line if it was valid. It doesn't make sense to log "" or "-" in case that the request was invalid, since this is only useful for understanding decisions of the router. In this case, the access log is more appropriate, which shows that the request was invalid, and a 400 was returned. When the request line is valid, it is printed in the error log exactly as in the access log. Link: <https://datatracker.ietf.org/doc/html/rfc9112#section-3> Suggested-by: Liam Crilly <liam@nginx.com> Reviewed-by: Zhidao Hong <z.hong@f5.com> Cc: Timo Stark <t.stark@nginx.com> Cc: Andrei Zeliankou <zelenkov@nginx.com> Cc: Andrew Clayton <a.clayton@nginx.com> Cc: Artem Konev <a.konev@f5.com> Signed-off-by: Alejandro Colomar <alx@nginx.com>
2022-06-22Constified numerous function parameters.Andrew Clayton1-1/+1
As was pointed out by the cppcheck[0] static code analysis utility we can mark numerous function parameters as 'const'. This acts as a hint to the compiler about our intentions and the compiler will tell us when we deviate from them. [0]: https://cppcheck.sourceforge.io/
2020-11-17HTTP parser: allowed more characters in header field names.Valentin Bartenev1-5/+9
Previously, all requests that contained in header field names characters other than alphanumeric, or "-", or "_" were rejected with a 400 "Bad Request" error response. Now, the parser allows the same set of characters as specified in RFC 7230, including: "!", "#", "$", "%", "&", "'", "*", "+", ".", "^", "`", "|", and "~". Header field names that contain only these characters are considered valid. Also, there's a new option introduced: "discard_unsafe_fields". It accepts boolean value and it is set to "true" by default. When this option is "true", all header field names that contain characters in valid range, but other than alphanumeric or "-" are skipped during parsing. When the option is "false", these header fields aren't skipped. Requests with non-valid characters in header field names according to RFC 7230 are rejected regardless of "discard_unsafe_fields" setting. This closes #422 issue on GitHub.
2020-06-23Upstream chunked transfer encoding support.Igor Sysoev1-0/+16
2020-04-16Using malloc/free for the http fields hash.Max Romanov1-2/+2
This is required due to lack of a graceful shutdown: there is a small gap between the runtime's memory pool release and router process's exit. Thus, a worker thread may start processing a request between these two operations, which may result in an http fields hash access and subsequent crash. To simplify issue reproduction, it makes sense to add a 2 sec sleep before exit() in nxt_runtime_exit().
2019-11-14Initial proxy support.Igor Sysoev1-1/+2
2019-09-30HTTP parser: removed unused "exten" field.Valentin Bartenev1-1/+0
This field was intended for MIME type lookup by file extension when serving static files, but this use case is too narrow; only a fraction of requests targets static content, and the URI presumably isn't rewritten. Moreover, current implementation uses the entire filename for MIME type lookup if the file has no extension. Instead of extracting filenames and extensions when parsing requests, it's easier to obtain them right before serving static content; this behavior is already implemented. Thus, we can drop excessive logic from parser.
2019-09-16HTTP parser: removed unused "plus_in_target" flag.Valentin Bartenev1-6/+4
2019-09-16HTTP parser: removed unused "offset" field.Valentin Bartenev1-2/+0
Thanks to 洪志道 (Hong Zhi Dao).
2019-09-16HTTP parser: removed unused "exten_start" and "args_start" fields.Valentin Bartenev1-2/+0
2019-09-16Configuration: added ability to access object members with slashes.Valentin Bartenev1-0/+3
Now URI encoding can be used to escape "/" in the request path: GET /config/listeners/unix:%2Fpath%2Fto%2Fsocket/
2019-09-09Added "extern" to nxt_http_fields_hash_proto to avoid link issues.Max Romanov1-1/+1
2019-08-16Improving response header fields processing.Max Romanov1-0/+24
Fields are filtered one by one before being added to fields list. This avoids adding and then skipping connection-specific fields.
2019-05-30Added routing based on header fields.Igor Sysoev1-0/+5
2018-01-15Checking for major HTTP version.Valentin Bartenev1-0/+1
2018-01-15Improved HTTP version representation.Valentin Bartenev1-0/+7
2018-01-15HTTP parser: improved error reporting.Valentin Bartenev1-0/+6
2017-12-27HTTP parser: introduced nxt_http_parse_fields().Valentin Bartenev1-0/+2
2017-12-25HTTP parser: reworked header fields handling.Valentin Bartenev1-34/+21
2017-07-05Complex target parser copied from NGINX.Max Romanov1-0/+7
nxt_app_request_header_t fields renamed: - 'path' renamed to 'target'. - 'path_no_query' renamed to 'path' and contains parsed value.
2017-06-20HTTP parser: reduced memory consumption of header fields list.Valentin Bartenev1-32/+36
2017-06-20Using new memory pool implementation.Igor Sysoev1-2/+2
2017-06-13HTTP parser: decoupled header fields processing.Valentin Bartenev1-30/+53
2017-03-01HTTP parser.Valentin Bartenev1-0/+74
2017-03-01Removed legacy HTTP parser.Valentin Bartenev1-79/+0
2017-01-23Introducing tasks.Igor Sysoev1-2/+2
2017-01-17Initial version.Igor Sysoev1-0/+79