|
On GitHub, @RomainMou reported an issue whereby HTTP header field values
were being incorrectly reported as non-ascii by the Python .isascii()
method.
For example, using the following test application
  def application(environ, start_response):
      t = environ['HTTP_ASCIITEST']
      t = "'" + t + "'" + " (" + str(len(t)) + ")"

      if t.isascii():
          t = t + " [ascii]"
      else:
          t = t + " [non-ascii]"

      resp = t + "\n\n"

      start_response("200 OK", [("Content-Type", "text/plain")])

      return (bytes(resp, 'latin1'))
You would see the following
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [non-ascii]
'$' has an ASCII code of 0x24 (36).
The initial idea was to adjust the second parameter of the
PyUnicode_New() call from 255 to 127. Unfortunately, that only traded
one problem for another.
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [ascii]
Good. However...
$ curl -H "ASCIITEST: £" http://localhost:8080/
'£' (2) [ascii]
Not good. Let's take a closer look at this.
'£' is not in basic ASCII, but is in extended ASCII with a value of 0xA3
(163). Its UTF-8 encoding is 0xC2 0xA3, hence the length of 2 bytes
above.
$ strace -s 256 -e sendto,recvfrom curl -H "ASCIITEST: £" http://localhost:8080/
sendto(5, "GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0.1\r\nAccept: */*\r\nASCIITEST: \302\243\r\n\r\n", 92, MSG_NOSIGNAL, NULL, 0) = 92
recvfrom(5, "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nServer: Unit/1.30.0\r\nDate: Mon, 22 May 2023 12:44:11 GMT\r\nTransfer-Encoding: chunked\r\n\r\n12\r\n'\302\243' (2) [ascii]\n\n\r\n0\r\n\r\n", 102400, 0, NULL, NULL) = 160
'£' (2) [ascii]
So we can see curl sent it UTF-8 encoded: '\302\243', which is C octal
escaped UTF-8 for 0xC2 0xA3, and we got the same back. But it should not
be marked as ASCII.
When doing PyUnicode_New(size, 127), it marks the buffer as ASCII. So we
need to use another function, and that function would appear to be
PyUnicode_DecodeCharmap(), which creates a Unicode object with the
correct ascii/non-ascii properties based on the character encoding.
With this function we now get
$ curl -H "ASCIITEST: $" http://localhost:8080/
'$' (1) [ascii]
$ curl -H "ASCIITEST: £" http://localhost:8080/
'£' (2) [non-ascii]
and for good measure
$ curl -H "ASCIITEST: $ £" http://localhost:8080/
'$ £' (4) [non-ascii]
$ curl -H "ASCIITEST: $" -H "ASCIITEST: £" http://localhost:8080/
'$, £' (5) [non-ascii]
PyUnicode_DecodeCharmap() does require having the full string upfront, so
we need to build up the potentially comma-separated header field value
string before invoking this function.
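For illustration only (the helper name is made up, this is not the code
from the patch), decoding the assembled value could look like:

  #include <Python.h>

  /*
   * Sketch: with a NULL mapping, PyUnicode_DecodeCharmap() performs
   * Latin-1 decoding, so the resulting object is only flagged as ASCII
   * when every byte of the field value is below 0x80.
   */
  static PyObject *
  decode_field_value(const char *value, Py_ssize_t length)
  {
      return PyUnicode_DecodeCharmap(value, length, NULL, NULL);
  }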
I did not want to touch the Python 2.7 code (which may or may not even
be affected by this), so I kept these changes completely isolated from
that, hence a slight duplication of the for () loop.
Python 2.7 was sunset on January 1st 2020[0], so this code will
hopefully just disappear soon anyway.
I also purposefully didn't touch other code that may well have similar
issues (such as the HTTP header field names). If we ever get issue
reports about them, we'll deal with them then.
[0]: <https://www.python.org/doc/sunset-python-2/>
Link: <https://docs.python.org/3/c-api/unicode.html>
Closes: <https://github.com/nginx/unit/issues/868>
Reported-by: RomainMou <https://github.com/RomainMou>
Tested-by: RomainMou <https://github.com/RomainMou>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This is a preparatory patch for fixing an issue with the encoding of
HTTP header field values.
This patch simply moves the nxt_unit_sptr_get() call to the top of the
function, where we will need it in the next commit.
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
This patch gives users the option to set a `"prefix"` attribute
for Python applications, either at the top level or for specific
`"target"`s. If the attribute is present, the value of `"prefix"`
must be a string beginning with `"/"`. If the value of the `"prefix"`
attribute is longer than 1 character and ends in `"/"`, the
trailing `"/"` is stripped.
The purpose of the `"prefix"` attribute is to set the `SCRIPT_NAME`
context value for WSGI applications and the `root_path` context
value for ASGI applications, allowing applications to properly route
requests regardless of the path that the server uses to expose the
application.
The context value is only set if the request's URL path begins with
the value of the `"prefix"` attribute. In all other cases, the
`SCRIPT_NAME` or `root_path` values are not set. In addition, for
WSGI applications, the value of `"prefix"` will be stripped from
the beginning of the request's URL path before it is sent to the
application.
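As a rough sketch of that matching rule (the helper below is
hypothetical, not the module code):

  #include <stdbool.h>
  #include <string.h>

  /*
   * If the URL path starts with the configured prefix, the prefix becomes
   * SCRIPT_NAME (WSGI) or root_path (ASGI) and, for WSGI, the remainder
   * becomes PATH_INFO; otherwise nothing is set.
   */
  static bool
  prefix_match(const char *prefix, const char *path,
               const char **script_name, const char **path_info)
  {
      size_t  len = strlen(prefix);

      if (strncmp(path, prefix, len) != 0) {
          return false;
      }

      *script_name = prefix;
      *path_info = path + len;

      return true;
  }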
Reviewed-by: Andrei Zeliankou <zelenkov@nginx.com>
Reviewed-by: Artem Konev <artem.konev@nginx.com>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
Splitting `nxt_python_add_sptr` into several functions will make future
additions easier.
Signed-off-by: Alejandro Colomar <alx@nginx.com>
|
|
This is a preparatory patch that renames the 'local' and 'local_length'
members of the nxt_unit_request_t structure to 'local_addr' and
'local_addr_length' in preparation for adding the 'local_port' and
'local_port_length' members.
Suggested-by: Zhidao HONG <z.hong@f5.com>
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
|
|
Since 542b5b8c0647, Unit's ASGI implementation creates a new event loop
to run an application in each thread. This may cause unexpected exceptions or
strange bugs if asyncio synchronisation primitives are initialised before the
application starts (e.g. globally).
Although the approach with a new event loop for the main thread is consistent
and helps to prepare the application to run in multiple threads, it can be a
source of pain for people who just want to run single-threaded ASGI
applications in Unit.
This is related to #560 issue on GitHub.
|
|
The WSGI environment dictionary contains a number of static items that are
pre-initialized on application start. It is then copied for each request and
filled with request-related data.
Now this dictionary copy operation is done between requests, which should
save some CPU cycles during request processing and thus reduce response
latency outside of peak load periods.
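The idea, sketched with illustrative names (not Unit's actual
identifiers):

  #include <Python.h>

  static PyObject  *environ_template;  /* static items, built at startup */
  static PyObject  *environ_prepared;  /* copy waiting for the next request */

  /* Called between requests, so the per-request code only fills in the
   * request-related keys of an already copied dictionary. */
  static int
  prepare_environ(void)
  {
      environ_prepared = PyDict_Copy(environ_template);

      return (environ_prepared != NULL) ? 0 : -1;
  }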
|
|
This closes #459 issue on GitHub.
|
|
PyUnicode_GET_SIZE() has been deprecated since 3.3 and will be removed in 3.12.
In version 3.9, it was explicitly marked with a deprecation warning, causing
a compilation error when building Unit.
PyUnicode_GET_LENGTH() must be used instead.
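A minimal before/after illustration:

  #include <Python.h>

  /* PyUnicode_GET_LENGTH() returns the length in code points and is the
   * non-deprecated replacement. */
  static Py_ssize_t
  string_length(PyObject *str)
  {
      return PyUnicode_GET_LENGTH(str);   /* was: PyUnicode_GET_SIZE(str) */
  }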
|
|
This closes #461 issue on GitHub.
|
|
The coming ASGI support requires HTTP headers in a raw format. The header
grouping and uppercasing code was moved to the WSGI module.
|
|
This is required for further ASGI implementation.
|
|
No functional changes. This prepares for an increase in the number of files.
|
|
Since the introduction of the rootfs feature, some language modules
can't be configured multiple times.
Now configure generates a separate nxt_<module>_mounts.h for each
compiled module.
|
|
The process abstraction has changed to:
setup(task, process)
start(task, process_data)
prefork(task, process, mp)
The prefork() occurs in the main process right before fork.
The file src/nxt_main_process.c is completely free of process
specific logic.
The creation of a process now supports a PROCESS_CREATED state. The
setup() function of each process can set its state to either created or
ready. If created, a MSG_PROCESS_CREATED message is sent to the main
process, where external setup can be done (required for rootfs under a
container).
The core processes (discovery, controller, and router) don't need
external setup, so they all proceed to their start() function
straight away.
In the case of applications, the module is loaded at process setup()
time, and the module's init() function has become the start() of the
process.
The module API has changed to:
setup(task, process, conf)
start(task, data)
As a direct benefit of the PROCESS_CREATED message, the clone(2) of
processes using pid namespaces no longer needs to create a pipe to make
the child block until the parent sets up the uid/gid mappings, nor does
it need to receive the child pid.
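Purely as an illustration of the shape of these hooks (placeholder
types; the real ones are Unit-internal):

  typedef struct {
      int  (*setup)(void *task, void *process);
      int  (*start)(void *task, void *process_data);
      int  (*prefork)(void *task, void *process, void *mp);
  } process_hooks_t;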
|
|
This is an optimization to avoid creating them at runtime on each request.
|
|
A quote from the Python 3 documentation:
| When interactive, stdout and stderr streams are line-buffered.
| Otherwise, they are block-buffered like regular text files.
As a result, if an exception occurred and PyErr_Print() was called, its output
could be buffered but not printed to the log for a while (ultimately, until
interpreter finalization). If the application process crashed shortly
afterwards, the backtrace was completely lost.
Buffering can be disabled by redefining the sys.stderr stream object.
However, interference with standard environment objects was deemed undesirable.
Instead, sys.stderr.flush() is called every time after printing exceptions.
A potential advantage here is that lines from backtraces won't be mixed
with other lines in the log.
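A sketch of the approach (not the exact Unit code):

  #include <Python.h>

  /* Print the pending exception, then explicitly flush sys.stderr so the
   * traceback reaches the log immediately instead of staying in a block
   * buffer. */
  static void
  print_exception_and_flush(void)
  {
      PyObject  *err, *res;

      PyErr_Print();

      err = PySys_GetObject("stderr");          /* borrowed reference */
      if (err == NULL) {
          return;
      }

      res = PyObject_CallMethod(err, "flush", NULL);
      if (res == NULL) {
          PyErr_Clear();
      } else {
          Py_DECREF(res);
      }
  }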
|
|
PyCallable_Check() doesn't produce errors.
The needless call was introduced in fdd6ed28e3b9.
|
|
PyObject_HasAttrString() is just a wrapper over PyObject_GetAttrString(),
while PyObject_CallMethod() calls it as the first step. As a result,
PyObject_GetAttrString() was called twice if close() was present.
To get rid of PyObject_HasAttrString() while keeping the same behaviour,
the PyObject_CallMethod() call has been decomposed into separate calls of
PyObject_GetAttrString() and PyObject_CallFunction().
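Sketched with illustrative names, the decomposition looks roughly like
this (one attribute lookup instead of two):

  #include <Python.h>

  static void
  call_close(PyObject *obj)
  {
      PyObject  *close, *res;

      close = PyObject_GetAttrString(obj, "close");
      if (close == NULL) {
          PyErr_Clear();            /* no close() attribute: nothing to do */
          return;
      }

      res = PyObject_CallFunction(close, NULL);
      if (res == NULL) {
          PyErr_Print();            /* close() raised an exception */
      } else {
          Py_DECREF(res);
      }

      Py_DECREF(close);
  }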
|
|
On success, PyObject_CallMethod() returns a new reference to
the result of the call, which previously got lost.
Also, error logging on failure was added.
The issue was introduced by b0148ec28c4d.
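A minimal sketch of the fix (error logging is only hinted at here):

  #include <Python.h>

  static void
  call_close_checked(PyObject *obj)
  {
      PyObject  *res;

      res = PyObject_CallMethod(obj, "close", NULL);
      if (res == NULL) {
          PyErr_Print();            /* log/report the failure */
          return;
      }

      Py_DECREF(res);               /* release the result's new reference */
  }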
|
|
According to the documentation, PyObject_GetIter():
| Raises TypeError and returns NULL if the object cannot be iterated.
Previously, this exception wasn't printed or cleared and remained unhandled.
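A sketch of the handling (illustrative function name):

  #include <Python.h>

  static PyObject *
  get_response_iterator(PyObject *response)
  {
      PyObject  *iterator;

      iterator = PyObject_GetIter(response);
      if (iterator == NULL) {
          PyErr_Print();            /* report and clear the TypeError */
      }

      return iterator;
  }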
|
|
According to the documentation, PyIter_Next():
| If there are no remaining values, returns NULL with no exception set.
| If an error occurs while retrieving the item, returns NULL and passes
| along the exception.
Previously, this exception wasn't properly handled and the response was
finalized as successful.
This issue was introduced in b0148ec28c4d.
A check for PyErr_Occurred() located in the code below might print this
traceback or occasionally catch an exception from one of the two response
close() calls.
Although exceptions from the close() calls also need to be caught,
it's clear that this particular check wasn't supposed to do so. This is
another issue and it will be fixed later.
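The distinction, sketched with illustrative names:

  #include <Python.h>

  /* NULL from PyIter_Next() means either normal exhaustion or an error;
   * only PyErr_Occurred() can tell the two cases apart. */
  static int
  consume_iterator(PyObject *iterator)
  {
      PyObject  *item;

      while ((item = PyIter_Next(iterator)) != NULL) {
          /* ... send the chunk to the client ... */
          Py_DECREF(item);
      }

      if (PyErr_Occurred()) {
          PyErr_Print();            /* error while retrieving an item */
          return -1;
      }

      return 0;                     /* iterator exhausted normally */
  }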
|
|
It unblocks other threads that can be forked by the application
to work in the background.
This closes #336 issue on GitHub.
|
|
Python 3.8 still has the 'tp_print' field in the PyTypeObject struct, but
the field is marked as deprecated. As a result, clang generates a warning
(which is turned into an error) when this field is initialized. On the
other hand, it is impossible to omit this field in positional
initialization. The solution is to use a designated initializer.
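For example (a made-up type, shown only to illustrate the technique):

  #include <Python.h>

  /* Only the needed slots are named; the deprecated tp_print slot is never
   * mentioned and is implicitly zero-initialized. */
  static PyTypeObject example_type = {
      PyVarObject_HEAD_INIT(NULL, 0)
      .tp_name      = "unit.Example",
      .tp_basicsize = sizeof(PyObject),
      .tp_flags     = Py_TPFLAGS_DEFAULT,
  };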
This also silences a usage message when configuring Python.
This is related to #331 issue on GitHub.
|
|
Thanks to tonyafanasyev.
This is related to #331 issue on GitHub.
|
|
This closes #223 issue on GitHub.
|
|
According to CGI/1.1 RFC 3875:
The server MUST set this variable; if the Script-URI does not include a
query component, the QUERY_STRING MUST be defined as an empty string ("").
Python's PEP 333(3) allows omitting it in the WSGI interface; the PHP docs
impose no requirement; the PSGI and Rack specifications require it even if
empty.
When nginx proxies requests over FastCGI, it always provides QUERY_STRING,
and some PHP apps have been observed to fail if it is missing (see issue
#201 on GitHub).
A drawback of this change (besides a small overhead) is that there will be
no easy way to tell a missing query string from an empty one (i.e. requests
with or without the "?" character); yet, it's negligible compared to the
possible benefits of wider application compatibility.
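A sketch of the rule (hypothetical helper):

  #include <string.h>

  /* Everything after '?' is the query string; with no '?' in the request
   * target, QUERY_STRING is set to an empty string. */
  static const char *
  query_string(const char *target)
  {
      const char  *q = strchr(target, '?');

      return (q != NULL) ? q + 1 : "";
  }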
This closes #226 issue on GitHub.
|
|
Previously, the nxt_router_prepare_msg() function expected the server host,
among other headers, to be unmodified. This is no longer true since
normalization of the Host header was introduced in 77aad2c142a0.
The nxt_unit_split_host() function was removed. It didn't work correctly with
IPv6 literals. Anyway, after 77aad2c142a0 the port splitting is done in the
router while processing the Host header.
|
|
PyErr_Print() writes the traceback to "sys.stderr", which is a file object
that can buffer the output. If the process exits immediately, the buffer can
be destroyed before being flushed to the log. As a result, the user doesn't
see the traceback.
Now Py_Finalize() is also called in case of any errors during initialization.
It finalizes the interpreter and flushes all data.
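A sketch of the error path (illustrative name):

  #include <Python.h>

  static int
  init_failed(void)
  {
      PyErr_Print();
      Py_Finalize();      /* finalization flushes the buffered streams */

      return -1;
  }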
|
|
- Removed surplus NULL assignments;
- Added missing nxt_slow_path();
- Style cleanup.
|
|
These function calls are equivalent.
No functional changes.
|
|
Previously, passing 0 resulted in reading the whole body and all negative
values raised an exception.
Now the behaviour is consistent with the io.RawIOBase.read() interface:
passing 0 returns an empty byte string, while -1 results in reading the
whole body.
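A sketch of the size handling (illustrative names; the body is assumed
to be fully buffered here):

  #include <Python.h>

  static PyObject *
  input_read(const char *body, Py_ssize_t body_length, Py_ssize_t size)
  {
      if (size == 0) {
          return PyBytes_FromStringAndSize("", 0);     /* empty bytes */
      }

      if (size < 0 || size > body_length) {
          size = body_length;                          /* read the rest */
      }

      return PyBytes_FromStringAndSize(body, size);
  }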
|
|
According to PEP 3333, header names and values should be decoded as Latin1.
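For illustration, decoding one header string that way (not the commit's
code):

  #include <Python.h>

  /* Latin-1 maps every byte 0x00-0xFF to the code point with the same
   * value, which is what PEP 3333 requires for header data. */
  static PyObject *
  decode_header(const char *data, Py_ssize_t length)
  {
      return PyUnicode_DecodeLatin1(data, length, NULL);
  }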
|
|
The library is now used in all language modules.
The old 'nxt_app_*' code was removed.
See src/test/nxt_unit_app_test.c for a usage sample.
|
|
This closes #115 issue on GitHub.
|
|
This closes #96 issue on GitHub.
|
|
According to PEP (3)333, the start_response() function must return
a write() callable.
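A hypothetical sketch of returning such a callable from a C
implementation of start_response() (the send step is only hinted at):

  #define PY_SSIZE_T_CLEAN
  #include <Python.h>

  static PyObject *
  wsgi_write(PyObject *self, PyObject *args)
  {
      const char  *data;
      Py_ssize_t  length;

      if (!PyArg_ParseTuple(args, "y#", &data, &length)) {
          return NULL;
      }

      /* ... send (data, length) as part of the response body ... */

      Py_RETURN_NONE;
  }

  static PyMethodDef  wsgi_write_def = {
      "write", wsgi_write, METH_VARARGS, "WSGI write() callable"
  };

  /* at the end of start_response(): */
  /*     return PyCFunction_New(&wsgi_write_def, NULL); */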
This closes #107 issue on GitHub.
|
|
An application can store request-related functions and mistakenly call them
outside of request processing. Previously, this resulted in a segmentation
fault due to an unset nxt_python_run_ctx. Now an exception is raised instead.
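A sketch of the guard (the context variable's real type is Unit-internal,
so a void pointer stands in for it here):

  #include <Python.h>

  static void  *nxt_python_run_ctx;   /* NULL outside request processing */

  static int
  check_run_ctx(void)
  {
      if (nxt_python_run_ctx == NULL) {
          PyErr_SetString(PyExc_RuntimeError,
                          "no request is being processed");
          return -1;
      }

      return 0;
  }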
|