unit.git - Universal Web Application Server

Age	Commit message (Collapse)	Author	Files	Lines
2023-10-10	Wasm: Re-add a removed 'const' qualifier in nxt_rt_wasmtime.c.	Andrew Clayton	2	-2/+2
	This was inadvertently removed in 76086d6d ("Wasm: Allow to set the HTTP response status.") Fixes: 76086d6d ("Wasm: Allow to set the HTTP response status.") Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25	Wasm: Allow uploads larger than 4GiB.	Andrew Clayton	1	-3/+5
	Currently Wasm modules are limited to a 32bit address space (until at least the memory64 work is completed). All the counters etc in the request structure were u32's. Which matched with 32bit memory limitation. However there is really no need to not allow >4GiB uploads that can be saved off to disk or some such. To do this we just need to increase the ->content_len & ->total_content_sent members to u64's. However because we need the request structure to have the exact same layout on 32bit (for Wasm modules) as it does on 64bit we need to re-jig the order of some of these members and add a four-byte padding member. Thus the request structure now looks like on 64bit (as shown by pahole(1)) struct nxt_wasm_request_s { uint32_t method_off; /* 0 4 / uint32_t method_len; / 4 4 / uint32_t version_off; / 8 4 / uint32_t version_len; / 12 4 / uint32_t path_off; / 16 4 / uint32_t path_len; / 20 4 / uint32_t query_off; / 24 4 / uint32_t query_len; / 28 4 / uint32_t remote_off; / 32 4 / uint32_t remote_len; / 36 4 / uint32_t local_addr_off; / 40 4 / uint32_t local_addr_len; / 44 4 / uint32_t local_port_off; / 48 4 / uint32_t local_port_len; / 52 4 / uint32_t server_name_off; / 56 4 / uint32_t server_name_len; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / uint64_t content_len; / 64 8 / uint64_t total_content_sent; / 72 8 / uint32_t content_sent; / 80 4 / uint32_t content_off; / 84 4 / uint32_t request_size; / 88 4 / uint32_t nfields; / 92 4 / uint32_t tls; / 96 4 / char __pad[4]; / 100 4 / nxt_wasm_http_field_t fields[]; / 104 0 / / size: 104, cachelines: 2, members: 25 / / last cacheline: 40 bytes / }; and the same structure (taken from unit-wasm) compiled as 32bit struct luw_req { u32 method_off; / 0 4 / u32 method_len; / 4 4 / u32 version_off; / 8 4 / u32 version_len; / 12 4 / u32 path_off; / 16 4 / u32 path_len; / 20 4 / u32 query_off; / 24 4 / u32 query_len; / 28 4 / u32 remote_off; / 32 4 / u32 remote_len; / 36 4 / u32 local_addr_off; / 40 4 / u32 local_addr_len; / 44 4 / u32 local_port_off; / 48 4 / u32 local_port_len; / 52 4 / u32 server_name_off; / 56 4 / u32 server_name_len; / 60 4 / / --- cacheline 1 boundary (64 bytes) --- / u64 content_len; / 64 8 / u64 total_content_sent; / 72 8 / u32 content_sent; / 80 4 / u32 content_off; / 84 4 / u32 request_size; / 88 4 / u32 nr_fields; / 92 4 / u32 tls; / 96 4 / char __pad[4]; / 100 4 / struct luw_hdr_field fields[]; / 104 0 / / size: 104, cachelines: 2, members: 25 / / last cacheline: 40 bytes */ }; We can see the structures have the same layout, same size and no padding. We need the __pad member as otherwise I saw gcc and clang on Alpine Linux automatically add the 'packed' attribute to the structure which made the two structures not match. Link: <https://github.com/WebAssembly/memory64> Link: <https://github.com/nginx/unit-wasm> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25	Wasm: Fix multiple successive calls to the request_handler.	Andrew Clayton	1	-2/+2
	When trying to upload files to the luw-upload-reflector demo[0] above a certain size that would mean Unit would need to make more than two calls to the request_handler function in the Wasm module we would get the following error from wasmtime and the upload would stall on the third call to the request_handler WASMTIME ERROR: failed to call function [->wasm_request_handler] error while executing at wasm backtrace: 0: 0x5ce2 - <unknown>!memcpy 1: 0x7df - luw_req_buf_append at /home/andrew/src/unit-wasm/src/c/libunit-wasm.c:308:14 2: 0x3a1 - luw_request_handler at /home/andrew/src/unit-wasm/examples/c/luw-upload-reflector.c:110:3 Caused by: wasm trap: out of bounds memory access This was due to ->content_off (the offset of where the actual body content starts in the request structure/memory) being some overly large value. This was largely down to me being an idiot! Before calling the loop that makes the calls to the request_handler we would calculate the new offset, which is now just the size of the request structure as we don't re-send all the HTTP meta data and headers etc. However because this value is in the request structure which is in the shared memory and we use this same memory for requests and responses, when we make a response we overwrite this request structure with the response structure, so our ->content_off is now some wacked out value when we make the next call to the request_handler. To fix this we just need to reset ->content_off each time round the loop. There's also no point in setting ->nfields to 0, it has the same issue as above, but doesn't get re-used by the Wasm module anyway. [0]: <https://github.com/nginx/unit-wasm/blob/main/examples/c/luw-upload-reflector.c> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-09-25	Wasm: Allow to set the HTTP response status.	Andrew Clayton	3	-7/+49
	This commit enables WebAssembly modules to set the HTTP response status to something other than the previously hard coded '200 OK'. To do this they can make a call to nxt_wasm_set_resp_status() providing the required status code. If this function isn't called the status code defaults to '200 OK'. The WebAssembly module can also return -1 from the request_handler function as a short cut to signal a '500 Internal Server Error'. Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17	Wasm: Add support for directory access.	Andrew Clayton	3	-1/+46
	Due to the sandboxed nature of WebAssembly, by default WASM modules don't have any access to the underlying filesystem. There is however a capabilities based mechanism[0] for allowing such access. This adds a config option to the 'wasm' application type; 'access.filesystem' which takes an array of directory paths that are then made available to the WASM module. This access works recursively, i.e everything under a specific path is allowed access to. Example config might look like "access" { "filesystem": [ "/tmp", "/var/tmp" ] } The actual mechanism used allows directories to be mapped differently in the guest. But at the moment we don't support that and just map say /tmp to /tmp. This can be revisited if it's something users clamour for. Network sockets are another resource that may be controlled in this manner, for example there is a wasi_config_preopen_socket() function, however this requires the runtime to open the network socket then effectively pass this through to the guest. This is something that can be revisited in the future if users desire it. [0]: <https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-capabilities.md> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
2023-08-17	Wasm: Add the core of initial WebAssembly language module support.	Andrew Clayton	3	-0/+801
	This adds the core of runtime WebAssembly[0] support. Future commits will enable this in the Unit core and expose the configuration. This introduces a new src/wasm directory for storing this source. We are initially using Wasmtime[0] as the WebAssembly runtime, however this has been designed with the ability to use different runtimes in mind. src/wasm/nxt_wasm.[ch] is the main interface to Unit. src/wasm/nxt_rt_wasmtime.c is the Wasmtime runtime support. This is nicely insulated from any knowledge of internal Unit workings. Wasmtime is what loads and runs the Wasm modules. The Wasm modules can export functions Wasmtime can call and Wasmtime can export functions that the module can call. We make use of both. The terminology used is that function exports are what the Wasm module exports and function imports are what the Wasm runtime exports to the module. We currently have four function imports (functions exported by the runtime to be called by the Wasm module). 1) nxt_wasm_get_init_mem_size This allows Wasm modules to get the size of the initially allocated shared memory. This is the size allocated at Unit startup and what the Wasm modules can assume they have access to (in reality this shared memory will likely be larger). The amount of memory allocated at startup is NXT_WASM_MEM_SIZE which as of this commit is 32MiB. We do actually allocate NXT_WASM_MEM_SIZE + NXT_WASM_PAGE_SIZE at startup which is an extra 64KiB (the smallest allocation unit), this is to allow room for the response structure and so module developers can just assume they have the full 32MiB for their actual response. 2) nxt_wasm_send_headers This allows WASM modules to send their headers. 3) nxt_wasm_send_response This allows WASM modules to send their response. 4) nxt_wasm_response_end This allows WASM modules to inform Unit they have finished sending their response. This calls nxt_unit_request_done() Then there are currently up to eight functions that a module can export. Three of which are required. These function can be named anything. I'll use the Unit configuration names to refer to them 1) request_handler The main driving function. This may be called multiple times for a single HTTP request if the request is larger than the shared memory. 2) malloc_handler Used to allocate a chunk of memory at language module startup. This memory is allocated from the WASM modules address space and is what is sued for communicating between the WASM module (the guest) and Unit (the host). 3) free_handler Used to free the memory from above at language module shutdown. Then there are the following optional handlers 1) module_init_handler If set, called at language module startup. 2) module_end_handler If set, called at language module shutdown. 3) request_init_handler If set, called at the start of request. Called only once per HTTP request. 4) request_end_handler If set, called once all of a request has been sent to the WASM module. 5) response_end_handler If set, called at the end of a request, once the WASM module has sent all its headers and data. 32bits We currently support 32bit WASM modules, I.e wasm32-wasi. Newer version of clang, 13+[2], do seem to have support for wasm64 as a target (which uses a LP64 model). However it's not entirely clear if the WASI SDK fully supports[3] this and by extension WASI libc/wasi-sysroot. 64bit support is something than can be explored more thoroughly in the future. As such in structures that are used to communicate between the host and guest we use 32bit ints. Even when a single byte might be enough. This is to avoid issues with structure layout differences between a 64bit host and 32bit guest (I.e WASM module) and the need for various bits of structure padding depending on host architecture. Instead everything is 4-byte aligned. [0]: <https://webassembly.org/> [1]: <https://wasmtime.dev/> [2]: <https://reviews.llvm.org/rG670944fb20b226fc22fa993ab521125f9adbd30a> [3]: <https://github.com/WebAssembly/wasi-sdk/issues/185> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

This was inadvertently removed in 76086d6d ("Wasm: Allow to set the HTTP response status.") Fixes: 76086d6d ("Wasm: Allow to set the HTTP response status.") Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

Currently Wasm modules are limited to a 32bit address space (until at least the memory64 work is completed). All the counters etc in the request structure were u32's. Which matched with 32bit memory limitation. However there is really no need to not allow >4GiB uploads that can be saved off to disk or some such. To do this we just need to increase the ->content_len & ->total_content_sent members to u64's. However because we need the request structure to have the exact same layout on 32bit (for Wasm modules) as it does on 64bit we need to re-jig the order of some of these members and add a four-byte padding member. Thus the request structure now looks like on 64bit (as shown by pahole(1)) struct nxt_wasm_request_s { uint32_t method_off; /* 0 4 */ uint32_t method_len; /* 4 4 */ uint32_t version_off; /* 8 4 */ uint32_t version_len; /* 12 4 */ uint32_t path_off; /* 16 4 */ uint32_t path_len; /* 20 4 */ uint32_t query_off; /* 24 4 */ uint32_t query_len; /* 28 4 */ uint32_t remote_off; /* 32 4 */ uint32_t remote_len; /* 36 4 */ uint32_t local_addr_off; /* 40 4 */ uint32_t local_addr_len; /* 44 4 */ uint32_t local_port_off; /* 48 4 */ uint32_t local_port_len; /* 52 4 */ uint32_t server_name_off; /* 56 4 */ uint32_t server_name_len; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ uint64_t content_len; /* 64 8 */ uint64_t total_content_sent; /* 72 8 */ uint32_t content_sent; /* 80 4 */ uint32_t content_off; /* 84 4 */ uint32_t request_size; /* 88 4 */ uint32_t nfields; /* 92 4 */ uint32_t tls; /* 96 4 */ char __pad[4]; /* 100 4 */ nxt_wasm_http_field_t fields[]; /* 104 0 */ /* size: 104, cachelines: 2, members: 25 */ /* last cacheline: 40 bytes */ }; and the same structure (taken from unit-wasm) compiled as 32bit struct luw_req { u32 method_off; /* 0 4 */ u32 method_len; /* 4 4 */ u32 version_off; /* 8 4 */ u32 version_len; /* 12 4 */ u32 path_off; /* 16 4 */ u32 path_len; /* 20 4 */ u32 query_off; /* 24 4 */ u32 query_len; /* 28 4 */ u32 remote_off; /* 32 4 */ u32 remote_len; /* 36 4 */ u32 local_addr_off; /* 40 4 */ u32 local_addr_len; /* 44 4 */ u32 local_port_off; /* 48 4 */ u32 local_port_len; /* 52 4 */ u32 server_name_off; /* 56 4 */ u32 server_name_len; /* 60 4 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 content_len; /* 64 8 */ u64 total_content_sent; /* 72 8 */ u32 content_sent; /* 80 4 */ u32 content_off; /* 84 4 */ u32 request_size; /* 88 4 */ u32 nr_fields; /* 92 4 */ u32 tls; /* 96 4 */ char __pad[4]; /* 100 4 */ struct luw_hdr_field fields[]; /* 104 0 */ /* size: 104, cachelines: 2, members: 25 */ /* last cacheline: 40 bytes */ }; We can see the structures have the same layout, same size and no padding. We need the __pad member as otherwise I saw gcc and clang on Alpine Linux automatically add the 'packed' attribute to the structure which made the two structures not match. Link: <https://github.com/WebAssembly/memory64> Link: <https://github.com/nginx/unit-wasm> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

When trying to upload files to the luw-upload-reflector demo[0] above a certain size that would mean Unit would need to make more than two calls to the request_handler function in the Wasm module we would get the following error from wasmtime and the upload would stall on the third call to the request_handler WASMTIME ERROR: failed to call function [->wasm_request_handler] error while executing at wasm backtrace: 0: 0x5ce2 - <unknown>!memcpy 1: 0x7df - luw_req_buf_append at /home/andrew/src/unit-wasm/src/c/libunit-wasm.c:308:14 2: 0x3a1 - luw_request_handler at /home/andrew/src/unit-wasm/examples/c/luw-upload-reflector.c:110:3 Caused by: wasm trap: out of bounds memory access This was due to ->content_off (the offset of where the actual body content starts in the request structure/memory) being some overly large value. This was largely down to me being an idiot! Before calling the loop that makes the calls to the request_handler we would calculate the new offset, which is now just the size of the request structure as we don't re-send all the HTTP meta data and headers etc. However because this value is in the request structure which is in the shared memory and we use this same memory for requests and responses, when we make a response we overwrite this request structure with the response structure, so our ->content_off is now some wacked out value when we make the next call to the request_handler. To fix this we just need to reset ->content_off each time round the loop. There's also no point in setting ->nfields to 0, it has the same issue as above, but doesn't get re-used by the Wasm module anyway. [0]: <https://github.com/nginx/unit-wasm/blob/main/examples/c/luw-upload-reflector.c> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

This commit enables WebAssembly modules to set the HTTP response status to something other than the previously hard coded '200 OK'. To do this they can make a call to nxt_wasm_set_resp_status() providing the required status code. If this function isn't called the status code defaults to '200 OK'. The WebAssembly module can also return -1 from the request_handler function as a short cut to signal a '500 Internal Server Error'. Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

Due to the sandboxed nature of WebAssembly, by default WASM modules don't have any access to the underlying filesystem. There is however a capabilities based mechanism[0] for allowing such access. This adds a config option to the 'wasm' application type; 'access.filesystem' which takes an array of directory paths that are then made available to the WASM module. This access works recursively, i.e everything under a specific path is allowed access to. Example config might look like "access" { "filesystem": [ "/tmp", "/var/tmp" ] } The actual mechanism used allows directories to be mapped differently in the guest. But at the moment we don't support that and just map say /tmp to /tmp. This can be revisited if it's something users clamour for. Network sockets are another resource that may be controlled in this manner, for example there is a wasi_config_preopen_socket() function, however this requires the runtime to open the network socket then effectively pass this through to the guest. This is something that can be revisited in the future if users desire it. [0]: <https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-capabilities.md> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>

This adds the core of runtime WebAssembly[0] support. Future commits will enable this in the Unit core and expose the configuration. This introduces a new src/wasm directory for storing this source. We are initially using Wasmtime[0] as the WebAssembly runtime, however this has been designed with the ability to use different runtimes in mind. src/wasm/nxt_wasm.[ch] is the main interface to Unit. src/wasm/nxt_rt_wasmtime.c is the Wasmtime runtime support. This is nicely insulated from any knowledge of internal Unit workings. Wasmtime is what loads and runs the Wasm modules. The Wasm modules can export functions Wasmtime can call and Wasmtime can export functions that the module can call. We make use of both. The terminology used is that function exports are what the Wasm module exports and function imports are what the Wasm runtime exports to the module. We currently have four function imports (functions exported by the runtime to be called by the Wasm module). 1) nxt_wasm_get_init_mem_size This allows Wasm modules to get the size of the initially allocated shared memory. This is the size allocated at Unit startup and what the Wasm modules can assume they have access to (in reality this shared memory will likely be larger). The amount of memory allocated at startup is NXT_WASM_MEM_SIZE which as of this commit is 32MiB. We do actually allocate NXT_WASM_MEM_SIZE + NXT_WASM_PAGE_SIZE at startup which is an extra 64KiB (the smallest allocation unit), this is to allow room for the response structure and so module developers can just assume they have the full 32MiB for their actual response. 2) nxt_wasm_send_headers This allows WASM modules to send their headers. 3) nxt_wasm_send_response This allows WASM modules to send their response. 4) nxt_wasm_response_end This allows WASM modules to inform Unit they have finished sending their response. This calls nxt_unit_request_done() Then there are currently up to eight functions that a module can export. Three of which are required. These function can be named anything. I'll use the Unit configuration names to refer to them 1) request_handler The main driving function. This may be called multiple times for a single HTTP request if the request is larger than the shared memory. 2) malloc_handler Used to allocate a chunk of memory at language module startup. This memory is allocated from the WASM modules address space and is what is sued for communicating between the WASM module (the guest) and Unit (the host). 3) free_handler Used to free the memory from above at language module shutdown. Then there are the following optional handlers 1) module_init_handler If set, called at language module startup. 2) module_end_handler If set, called at language module shutdown. 3) request_init_handler If set, called at the start of request. Called only once per HTTP request. 4) request_end_handler If set, called once all of a request has been sent to the WASM module. 5) response_end_handler If set, called at the end of a request, once the WASM module has sent all its headers and data. 32bits We currently support 32bit WASM modules, I.e wasm32-wasi. Newer version of clang, 13+[2], do seem to have support for wasm64 as a target (which uses a LP64 model). However it's not entirely clear if the WASI SDK fully supports[3] this and by extension WASI libc/wasi-sysroot. 64bit support is something than can be explored more thoroughly in the future. As such in structures that are used to communicate between the host and guest we use 32bit ints. Even when a single byte might be enough. This is to avoid issues with structure layout differences between a 64bit host and 32bit guest (I.e WASM module) and the need for various bits of structure padding depending on host architecture. Instead everything is 4-byte aligned. [0]: <https://webassembly.org/> [1]: <https://wasmtime.dev/> [2]: <https://reviews.llvm.org/rG670944fb20b226fc22fa993ab521125f9adbd30a> [3]: <https://github.com/WebAssembly/wasi-sdk/issues/185> Reviewed-by: Alejandro Colomar <alx@nginx.com> Signed-off-by: Andrew Clayton <a.clayton@nginx.com>