path: root/src/nxt_unit.c
Age  Commit message  Author  Files  Lines
2020-12-29  Libunit: processing single port message.  Max Romanov  1  -13/+0
This partially reverts the optimisation introduced in 1d84b9e4b459 to avoid an unpredictable block in nxt_unit_process_port_msg(). Under high load, this function may never return control to its caller, and the external event loop (in Node.js and Python asyncio) won't be able to process other scheduled events. To reproduce the issue, two request processing types are needed: 'fast' and 'furious'. The 'fast' one simply returns a small response, while the 'furious' one schedules asynchronous calls to external resources. Thus, if Unit is subjected to a large number of 'fast' requests, the 'furious' request processing freezes until the high load ends. The issue was found by Wu Jian Ping (@wujjpp) during Node.js stream implementation discussion and relates to PR #502 on GitHub.
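The difference between draining a port and handling a single message per call can be sketched as follows. This is only an illustration under stated assumptions: demo_port_t, demo_process_one() and demo_process_all() are hypothetical names, not libunit's API, and the "port" is a plain counter.

```c
/* Hypothetical stand-in for a port with queued messages; not a libunit type. */
typedef struct {
    int  pending;   /* messages waiting to be read */
    int  handled;   /* messages processed so far */
} demo_port_t;

/*
 * Handles at most one message and returns, so the caller's event loop
 * can run other scheduled events between calls.
 * Returns 1 if a message was handled, 0 if the port was empty.
 */
static int
demo_process_one(demo_port_t *port)
{
    if (port->pending == 0) {
        return 0;
    }

    port->pending--;
    port->handled++;

    return 1;
}

/*
 * Drains the port and returns the number of handled messages.
 * If a producer keeps refilling the port, this loop may not return
 * for a long time, which starves the external event loop.
 */
static int
demo_process_all(demo_port_t *port)
{
    int  n = 0;

    while (demo_process_one(port)) {
        n++;
    }

    return n;
}
```

With the single-message variant, an external event loop would call demo_process_one() once per readiness notification and then go back to its other events.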
2020-12-18  Libunit: fixed shared memory waiting.  Max Romanov  1  -1/+4
The nxt_unit_ctx_port_recv() function may return the NXT_UNIT_AGAIN code, in which case an attempt to reread the message should be made. The issue was reproduced in load testing with response sizes 16k and up. In the rare case of a NXT_UNIT_AGAIN result, a buffer of size -1 was processed, which triggered a 'message too small' alert; after that, the app process was terminated.
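A minimal sketch of the "reread on AGAIN" pattern, assuming a hypothetical receive function: demo_recv(), demo_src_t, DEMO_OK and DEMO_AGAIN are illustrative names and do not reproduce the real nxt_unit_ctx_port_recv() signature.

```c
#define DEMO_OK     0
#define DEMO_AGAIN  1

/* Hypothetical message source that reports "again" a few times first. */
typedef struct {
    int  again_left;   /* how many times to answer DEMO_AGAIN */
    int  size;         /* size of the message eventually delivered */
} demo_src_t;

/* On DEMO_AGAIN the size is -1 and must not be treated as a message. */
static int
demo_recv(demo_src_t *src, int *size)
{
    if (src->again_left > 0) {
        src->again_left--;
        *size = -1;
        return DEMO_AGAIN;
    }

    *size = src->size;

    return DEMO_OK;
}

/* Retries the read until the source returns something other than DEMO_AGAIN. */
static int
demo_recv_retry(demo_src_t *src, int *size)
{
    int  rc;

    do {
        rc = demo_recv(src, size);
    } while (rc == DEMO_AGAIN);

    return rc;
}
```

The point of the loop is that the -1 size produced alongside DEMO_AGAIN never reaches the code that validates message sizes.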
2020-12-18  Limiting app queue notifications count in socket.  Max Romanov  1  -1/+6
Under high load, a queue synchronization issue may occur, starting from the steady state when an app queue message is dequeued immediately after it has been enqueued. In this state, the router always puts the first message in the queue and is forced to notify the app about a new message in an empty queue using a socket pair. On the other hand, the application dequeues and processes the message without reading the notification from the socket, so the socket buffer overflows with notifications. The issue was reproduced during Unit load tests. After a socket buffer overflow, the router is unable to notify the app about a new first message. When another message is enqueued, a notification is not required, so the queue grows without being read by the app. As a result, request processing stops. This patch changes the notification algorithm by counting the notifications in the pipe instead of getting the number of messages in the queue.
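The idea of bounding notifications by counting them can be modelled as below. This is a single-threaded toy model under loud assumptions: demo_chan_t, the notif_limit cap and all function names are hypothetical, and the actual patch may count and limit notifications differently (and would need atomic operations on shared memory).

```c
/* Hypothetical model of a shared queue plus its notification socket. */
typedef struct {
    int  queued;        /* messages currently in the shared queue */
    int  notifs;        /* notifications written to the socket, not yet read */
    int  notif_limit;   /* assumed cap on unread notifications */
} demo_chan_t;

/*
 * Router side: enqueue a message and write a notification only while
 * the number of unread notifications is below the cap.
 * Returns 1 if a notification was written, 0 otherwise.
 */
static int
demo_enqueue(demo_chan_t *c)
{
    c->queued++;

    if (c->notifs < c->notif_limit) {
        c->notifs++;
        return 1;
    }

    return 0;
}

/* App side: take a message without touching the socket. */
static int
demo_dequeue(demo_chan_t *c)
{
    if (c->queued == 0) {
        return 0;
    }

    c->queued--;

    return 1;
}

/* App side: read one notification from the socket. */
static void
demo_read_notif(demo_chan_t *c)
{
    if (c->notifs > 0) {
        c->notifs--;
    }
}
```

In this model the socket never holds more than notif_limit unread notifications, even if the app repeatedly dequeues messages without reading the socket.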
2020-11-24  Libunit: improved error logging around initialization env variable.  Valentin Bartenev  1  -12/+31
2020-11-19  Libunit: fixing read buffer leakage.  Max Romanov  1  -0/+1
If the shared queue is empty, the allocated read buffer should be explicitly released. Found by Coverity (CID 363943). The issue was introduced in f5ba5973a0a3.
2020-11-18  Libunit: fixing read buffer allocations on exit.  Max Romanov  1  -5/+5
2020-11-18  Libunit: closing active requests on quit.  Max Romanov  1  -9/+28
2020-11-18  Libunit: making minor tweaks.  Max Romanov  1  -11/+4
Removing unnecessary context operations from shared queue processing loop. Initializing temporary queues only when required.
2020-11-18  Go: removing C proxy functions and re-using goroutines.  Max Romanov  1  -12/+63
2020-11-18  Libunit: fixing racing condition in request struct recycling.  Max Romanov  1  -2/+2
The issue occurred under highly concurrent request load in Go applications. Such applications are multi-threaded but use a single libunit context; any thread-safe code in the libunit context is only required for Go applications. As a result of improper request state reset, the recycled request structure was recovered in the released state, so further operations with this request resulted in 'response already sent' warnings. However, the actual response was never delivered to the router and the client.
2020-11-18  Libunit: fixing racing condition for port add / state change.  Max Romanov  1  -39/+87
The issue only occurred in Go applications because "port_send" is overloaded only in Go. To reproduce it, send multiple concurrent requests to the application after it has initialised. The warning message "[unit] [go] port NNN:dd not found" is the first visible aspect of the issue; the second and more valuable one is a closed connection, an error response, or a hanging response to some requests. When the application starts, it is unaware of the router's worker thread ports, so it requests the ports from the router after receiving requests from the corresponding router worker threads. When multiple requests are processed simultaneously, the router port may be required by several requests, so request processing starts only after the application receives the required port information. The port should be added to the Go port repository after its 'ready' flag is updated. Otherwise, Unit may start processing some requests and use the port before it is in the repository. The issue was introduced in changeset 78836321a126.
2020-11-18  Libunit: improving logging consistency.  Max Romanov  1  -0/+4
Debug logging depends on macros defined in nxt_auto_config.h.
2020-11-10  Fixing multi-buffer body send to application.  Max Romanov  1  -3/+7
The application shared queue can pass only one shared memory buffer. The remaining buffers in the chain need to be sent directly to the application in response to the REQ_HEADERS_ACK message. The issue can be reproduced with configurations where 'body_buffer_size' is greater than the memory segment size (10 Mb). Requests with a body larger than 10 Mb simply get stuck: they are not passed to the application and keep waiting for more data from the router. The bug was introduced in 1d84b9e4b459 (v1.19.0).
2020-10-28  Preserving the app port write socket.  Max Romanov  1  -6/+6
The socket is required for intercontextual communication in multithreaded apps.
2020-10-28  Libunit: waking another context with the RPC_READY message.  Max Romanov  1  -1/+37
2020-10-28  Router: introducing the PORT_ACK message.  Max Romanov  1  -5/+32
The PORT_ACK message is the router's response to the application's NEW_PORT message. After receiving PORT_ACK, the application is safe to process requests using this port. This message avoids a racing condition when the application starts processing a request from the shared queue and sends REQ_HEADERS_ACK. The REQ_HEADERS_ACK message contains the application port ID as reply_port, which the router uses to send request data. When the application creates a new port, it immediately sends it to the main router thread. Because the request is processed outside the main thread, a racing condition can occur between the receipt of the new port in the main thread and the receipt of REQ_HEADERS_ACK in the worker router thread where the same port is specified as reply_port.
2020-10-28  Libunit: releasing cached read buffers when destroying context.  Max Romanov  1  -0/+8
2020-10-28  Libunit: added a function to discern main and worker contexts.  Max Romanov  1  -0/+11
2020-10-28  Libunit: gracefully quitting a multicontext application.  Max Romanov  1  -24/+72
2020-10-28  Libunit: protecting the new mmap from being used in another thread.  Max Romanov  1  -1/+6
Until the mmap is received by the router, only the creator thread may use this mmap, so the "mmap not found" state in the router is avoided.
2020-10-06  Removing a meaningless warning message.  Max Romanov  1  -4/+0
Data in the queue and the socket are transmitted independently; special READ_QUEUE and READ_SOCKET message types are used for synchronization. The warning was accidentally committed with changeset 1d84b9e4b459.
2020-10-01  Publishing libunit's malloc() and free() wrappers for apps.  Max Romanov  1  -49/+53
2020-09-30  Fixing leakage caused by incorrect in_hash flag cleanup.  Max Romanov  1  -1/+3
Large-bodied requests are added to the request hash to be found when the body arrives. However, changeset 1d84b9e4b459 introduced a bug: the 'in_hash' flag, used to remove the request from the hash at request release, was cleared after the first successful request lookup. As a result, the entry was never removed.
2020-09-29  Wrapping libunit's malloc() and free() calls for logging purposes.  Max Romanov  1  -51/+82
This change aids heap usage analysis in applications. The alloc and free functions are also required for lvlhash due to the upcoming threading support, because using main nxt_memalign() and nxt_free() isn't safe in a multithreaded app environment. The reason is that these functions may use thread-local structures which aren't initialized properly in applications.
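A minimal sketch of allocation wrappers that log each call, which is the general technique the entry describes. The names demo_malloc(), demo_free() and demo_live_allocs are hypothetical and do not match the wrappers libunit actually exports; the live-allocation counter is an added assumption and is not thread-safe.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed bookkeeping: number of blocks allocated and not yet freed. */
static long  demo_live_allocs;

/* Allocates memory through malloc() and logs the request and result. */
static void *
demo_malloc(size_t size)
{
    void  *p = malloc(size);

    if (p != NULL) {
        demo_live_allocs++;
    }

    fprintf(stderr, "malloc(%zu): %p\n", size, p);

    return p;
}

/* Logs the pointer and releases it through free(). */
static void
demo_free(void *p)
{
    fprintf(stderr, "free(%p)\n", p);

    if (p != NULL) {
        demo_live_allocs--;
    }

    free(p);
}
```

Routing every allocation through one pair of functions gives a single place to log from, and a single place to swap the allocator if the default one is unsafe in a given environment.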
2020-09-15  Hardening header names comparison for grouping.  Max Romanov  1  -10/+67
2020-09-10  Fixing WebSocket frame retain function.  Max Romanov  1  -2/+13
Some of the pointers were not adjusted after the frame's memory reallocation. Fortunately, this function was not used, so the bug had no effect.
2020-08-11  Fixing return value initialization.  Max Romanov  1  -19/+25
2020-08-11  Style fixes for 2 file descriptors transfer over port.  Max Romanov  1  -35/+36
The two consecutive fd and fd2 fields are replaced with an array.
2020-08-11  Moving file descriptor blocking to libunit.  Max Romanov  1  -11/+39
The default libunit behavior relies on blocking the recv() call for port file descriptors, which an application may override if needed. For external applications, port file descriptors were toggled to blocking mode before the exec() call. If the exec() call failed, the descriptor remained in blocking mode, so the process hung while trying to read from it. This patch moves the file descriptor mode switch into libunit.
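Switching a descriptor to blocking mode is typically done with fcntl(). The helper below is a sketch of that standard POSIX technique; demo_set_blocking() is a hypothetical name, not the function libunit uses.

```c
#include <fcntl.h>

/*
 * Clears O_NONBLOCK on the descriptor so that reads block.
 * Returns 0 on success and -1 on error (errno is set by fcntl()).
 */
static int
demo_set_blocking(int fd)
{
    int  flags = fcntl(fd, F_GETFL);

    if (flags == -1) {
        return -1;
    }

    if ((flags & O_NONBLOCK) == 0) {
        /* Already blocking; nothing to change. */
        return 0;
    }

    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1) {
        return -1;
    }

    return 0;
}
```

Doing the switch inside the library, after the new program has started, means a failed exec() leaves the parent's descriptor in its original mode.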
2020-08-11  Wrapping close() call in libunit for logging.  Max Romanov  1  -35/+44
2020-08-11  Introducing application and port shared memory queues.  Max Romanov  1  -250/+921
The goal is to minimize the number of syscalls needed to deliver a message.
2020-08-11  Port message extended to transfer 2 file descriptors.  Max Romanov  1  -3/+16
2020-08-11  Process structures refactoring in runtime and libunit.  Max Romanov  1  -203/+93
Generic process-to-process shared memory exchange is no longer required. Here, it is transformed into a router-to-application pattern. The outgoing shared memory segments collection is now the property of the application structure. The applications connect to the router only, and the process only needs to group the ports.
2020-08-11  Introducing the shared application port.  Max Romanov  1  -106/+361
This is the port shared between all application processes, which use it to pass requests for processing. Using it significantly simplifies the request processing code in the router. The drawback is 2 more file descriptors per configured application and more complex libunit message wait/read code.
2020-08-11  Changing router to application shared memory exchange protocol.  Max Romanov  1  -108/+269
The application process needs to request the shared memory segment from the router instead of the latter pushing the segment before sending a request to the application. This is required to simplify the communication between the router and the application and to prepare the router for using the application shared port and then the queue.
2020-08-11  Changing router to application port exchange protocol.  Max Romanov  1  -27/+249
The application process needs to request the port from the router instead of the latter pushing the port before sending a request to the application. This is required to simplify the communication between the router and the application and to prepare the router to use the application shared port and then the queue.
2020-08-11  Adding a reference counter to the libunit port structure.  Max Romanov  1  -342/+336
The goal is to minimize the number of (pid, id) to port hash lookups which require a library mutex lock. The response port is found once per request, while the read port is initialized at startup.
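Reference counting lets a request hold a pointer to its port instead of repeating a locked hash lookup. A minimal sketch with C11 atomics follows; demo_port_t and the demo_port_*() functions are hypothetical and simpler than the real libunit port structure.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical port object owned through a reference count. */
typedef struct {
    atomic_int  use_count;   /* number of holders of this pointer */
    int         id;
} demo_port_t;

/* Creates a port with one reference owned by the caller. */
static demo_port_t *
demo_port_create(int id)
{
    demo_port_t  *port = malloc(sizeof(demo_port_t));

    if (port == NULL) {
        return NULL;
    }

    atomic_init(&port->use_count, 1);
    port->id = id;

    return port;
}

/* Takes an additional reference, e.g. when a request stores the pointer. */
static void
demo_port_use(demo_port_t *port)
{
    atomic_fetch_add(&port->use_count, 1);
}

/* Drops a reference; frees the port and returns 1 when it was the last one. */
static int
demo_port_release(demo_port_t *port)
{
    if (atomic_fetch_sub(&port->use_count, 1) == 1) {
        free(port);
        return 1;
    }

    return 0;
}
```

A request would call demo_port_use() once when it resolves its response port and demo_port_release() when it completes, so no further lookups are needed while the request is active.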
2020-08-11  Libunit refactoring: port management.  Max Romanov  1  -261/+319
- Changed the port management callbacks to notifications, which e.g. avoids the need to call the libunit function
- Added context and library instance reference counts for a safer resource release
- Added the router main port initialization
2020-07-21  Fixed non-debug log time format in libunit.  Valentin Bartenev  1  -0/+7
This makes the log format used in libunit consistent with the daemon, where milliseconds are printed only at the debug log level. Currently a compile-time switch is used, since there is no support yet for changing the log level at runtime. In the future this should be a runtime condition, similar to nxt_log_time_handler().
2020-03-09  Refactor of process management.  Tiago Natel de Moura  1  -1/+1
The process abstraction has changed to:
    setup(task, process)
    start(task, process_data)
    prefork(task, process, mp)
The prefork() occurs in the main process right before fork. The file src/nxt_main_process.c is completely free of process-specific logic.
The creation of a process now supports a PROCESS_CREATED state. The setup() function of each process can set its state to either created or ready. If created, a MSG_PROCESS_CREATED is sent to the main process, where external setup can be done (required for rootfs under container). The core processes (discovery, controller and router) don't need external setup, so they all proceed to their start() function straight away.
In the case of applications, the module is loaded at process setup() time, and the module's init() function has become the start() of the process. The module API has changed to:
    setup(task, process, conf)
    start(task, data)
As a direct benefit of the PROCESS_CREATED message, the clone(2) of processes using pid namespaces no longer needs to create a pipe to make the child block until the parent sets up uid/gid mappings, nor does it need to receive the child pid.
2020-04-10  Resolving a racing condition while adding ports on the app's side.  Max Romanov  1  -3/+25
An earlier attempt (ad6265786871) to resolve this condition on the router's side added a new issue: the app could get a request before acquiring a port.
2020-04-06  Fixing 'find & add' racing condition in connected ports hash.  Max Romanov  1  -0/+3
Missing error log messages added.
2020-03-30  Fixing application process infinite loop.  Max Romanov  1  -21/+28
Main process exiting before app process init may have caused hanging.
2020-03-30  Handling change file message in libunit.  Max Romanov  1  -0/+10
This is required for proper log file rotation action.
2020-03-30  Attributing libunit logging function for arguments validation.  Max Romanov  1  -5/+7
2020-03-12  Using disk file to store large request body.  Max Romanov  1  -7/+119
This closes #386 on GitHub.
2020-03-12  Introducing readline function in libunit.  Max Romanov  1  -0/+38
Ruby and Java modules now use this function instead of their own implementations.
2020-02-04  Removing duplicate macro definitions.  Max Romanov  1  -4/+0
This issue was introduced in 2c7f79bf0a1f.
2020-02-03  Initializing local buffer ctx_impl field for correct release.  Max Romanov  1  -0/+1
An uninitialized ctx_impl field may cause a crash in the application process. To reproduce the issue, trigger a shared memory buffer send error on the application side. In our case, the send error was caused by a router process crash. This issue was introduced in 2c7f79bf0a1f.
2020-02-03  Added missing stream argument to error message.  Max Romanov  1  -1/+2
Found by Coverity (CID 353386).