Mekhanik evgenii/fix 2284#2654
Conversation
5e1c6c8 to
71e4ce0
Compare
33e779d to
186a89f
Compare
cefbf32 to
1ebe940
Compare
b89b90b to
4d303b8
Compare
5ef34b1 to
8915a2b
Compare
e192eeb to
2b6fbc9
Compare
0ac79f5 to
7a53317
Compare
7a53317 to
1f9c14d
Compare
const-t
left a comment
There was a problem hiding this comment.
LGTM, but I left small suggestion.
55cef6f to
a5f86f7
Compare
a5f86f7 to
1d9a1d1
Compare
krizhanovsky
left a comment
There was a problem hiding this comment.
The pull request is probably fine, I reviewed only a couple of patches from it. The only reason for the reject is that I believe we need to adjust our process for so heavy code modifications. @const-t @EvgeniiMekhanik I want to have a discussion on this, probably on the upcoming meeting.
The main problem is that the whole PR is huge having 9 commits of changes from the skb layer to HTTP, reworked callback architecture and so on. The pull request is an implementation of #2284 feature. However, there is no any single commit, message or comment neither in the code nor in the issue or pull request about how all the 9 commits contribute to the problem solution (only 1d9a1d1 is an exception with a clear purpose).
For example, 1f9c14d :
Rework HTTTP/2 -> HTTP/1 request transformation
why do we need this to solve #2284?
Use pool-based approach for HTTP/2 to HTTP/1 request transformation.
Maybe it's fine
Reuse the same pool-based approach already used for HTTP/1 request forwarding, providing a unified implementation.
Eliminate extra header copying. Previously, headers were written into the linear skb area and then copied by_pskb_copy_fclone. Now headers are written directly into skb fragments, reducing memory usage and copy overhead.
I can review the change and see that the code is correct, doesn't hurt performance and doesn't impact security. However, why the change is required? Which problem it solves? It doesn't clear now and will not clear in a year to review what for this code change was made.
Please, in the pull request write a general design for the #2284 feature and describe each of the commits as the steps reaching the design goal.
Why this is important
- The big changes change the project architecture and the whole team must have clear and aligned understanding of the changes. We must review not only the code itself, but the design and architecture.
- Mature complex projects always have questions "why this code is written this way?" and the current state of the commits doesn't answer this question.
@ykargin @artem98 FYI - I think you also need to keep this in mind and integrate on your projects.
| } | ||
| if (*skb_head && !TFW_SKB_CB(*skb_head)->is_head) | ||
| ss_skb_setup_head_of_list(*skb_head, mark, tls_type); | ||
|
|
There was a problem hiding this comment.
@const-t @EvgeniiMekhanik I had a look onto b68d2ce , which introduced the code and IIRC this also fixed the case when we push a message into a socket and fail somewhere on a middle skb, so a partial message (broken data) is in the write queue. I'm I wrong and is it safe to remove the code now?
There was a problem hiding this comment.
Previously, we considered handling ENOMEM in a special way without tearing down the connection, later we decide to don't do it and immediately drop connection. For example in tfw_tls_encrypt we do not clean up the socket queue on error, potentially leaving broken data behind. This is currently acceptable because the connection is immediately terminated and the entire queue is subsequently cleaned up.
| @@ -1763,7 +1748,7 @@ void | |||
| ss_skb_dflt_destructor(struct sk_buff *skb) | |||
There was a problem hiding this comment.
Now it's just a regular function, not a destructor and not 'default', any more, so let's rename it to something more appropriate
1d9a1d1 to
dead18c
Compare
dead18c to
405c93e
Compare
`on_tcp_entail` is always called after `on_send`, so both callbacks can be combined into a single union, freeing an additional 8 bytes for callback private data. This private data will be used to store a pointer to the request, allowing us in the future to retrieve the request in `on_tcp_entail` from the skb head before it is enqueued into the socket write queue, in order to decide whether to drop or entail it. We decided to do so because the `TFW_SKB_CB` structure has very limited space, and there are no extra 8 bytes available to store a request pointer. We also cannot use the `opaque_data` pointer in `TFW_SKB_CB` to store a request pointer and implement a corresponding skb destructor, because if the server connection is destroyed, the destructor may be invoked from `ss_do_send -> ss_skb_queue_purge` or `ss_tx_action -> ss_skb_queue_purge`. In such cases, accessing the request structure may result in a use-after-free, since the request may already have been freed during server connection cleanup. This is a preparatory step toward fixing excessive memory usage caused by forwarded requests whose client connection has already been closed but remain in the server’s forwarding queue. These requests should be dropped before being placed into the server socket’s write queue. Part-of: release-server-queued-requests Issue: #2284
- Remove `opaque_data` from `skb->cb`, now use TfwClientMem instead. Use only default skb destructor to decrement client memory when skb is freed. - Save http2 response in `on_send_data`. Pass NULL (as a conn) in `on_send` callback, if appropriate socket is not active or conn is real NULL (previously we don't call `on_send` callback in such case). Destroy `on_send_data` directly in `on_send` callack if we can't send data (conn is NULL). This patch simplifies the code and makes it consistent with the previous changes to the `TFW_SKB_CB` structure, which already provides dedicated storage for data associated with the `on_send` and `on_tcp_entail` callbacks. Response objects are now stored in this dedicated storage. This allows us to apply the same approach to requests, allowing them to be accessed safely from the `on_send_data` and `on_tcp_entail_data` callbacks. Part-of: release-server-queued-requests Issue: #2284
- Rework `on_tcp_entail` callback invocation. Now it is called from `ss_skb_tcp_entail_list` not from `ss_skb_tcp_entail`. If this callback fails we drop all skbs from appropriate list. - Implement `tfw_http_req_on_tcp_entail` callback to check that client connection for sending request is not closed and drop skbs and request if connection is already closed. - Implement `ss_skb_copy_cb` function for correct copying of skb->cb (including all callbacks and destructors). Previously, requests that had already been placed into a server connection's `fwd_queue` were still forwarded to the server even after the originating client connection had been closed. Although the responses to such requests were no longer needed, the requests themselves were still fully transmitted to the backend. This patch changes this behavior by performing an additional client connection state check immediately before entailing a request's skbs to the server socket write queue for transmission. If the client connection has already been closed, the request is discarded together with all associated skbs instead of being sent to the server. The chosen approach has a theoretical disadvantage: if a large number of requests from a reset client connection coexist with many requests from active client connections and the TCP window is small, requests belonging to the inactive client connection may remain in the server connection queue for an extended period of time, consuming memory. An alternative approach would have been to drop the server connection and purge requests belonging to inactive client connections from the server send queue during reconnection. This approach was rejected because it would also impact active client connections by forcing their requests to be retransmitted. Since the server connection is also dropped upon receiving the first response byte for an inactive client connection (see bac89d1), the current implementation provides a practical and effective fix for issue #2284. Part-of: release-server-queued-requests Closes #2284
Previously there was a sofisticated logic in `ss_skb_tcp_entail_list` function, which we are planning to use to handle ENOMEM in a special way and do not drop connection. Since we decide to don't do it, remove this logic at all. (Made here since we already change `ss_skb_tcp_entail_list` in this PR).
- Remove unnecessary `skb_copy_cb` (`__pskb_copy_fclone` already do it). Instead of it zero all fields for the cloned skb, which are related to client memory accounting (since we set client memory for the new cloned skb manually). Zero all other fields for original skb, we need appropriate callbacks only for skbs, which we will send to the socket write queue. - Move `ss_skb_on_tcp_entail` under `if (TFW_SKB_CB(skb)->is_head)` to show that it should be called only to the head of the list. - Add comments. - introduce skb callback container. There are currently many cases where different callbacks are required for request/response and HTTP/2 frame sending. Since the space available in `skb->cb` is limited, introduce a callback container structure and store a pointer to it in `skb->cb` instead of embedding callback-specific data directly. In addition to improving code clarity and freeing space in skb->cb, this change allows an unlimited number of callbacks to be added in the future.
Since now we use only default skb destructor for client memory adjusting, we can remove appropriate pointer from `skb->cb` and save 8 bytes.
During making frames we adjust skb length to recalculate `snd_wnd`. It seems when we send http1 data, it's better to work in the same way. (Made here since we already change `ss_skb_tcp_entail_list` in this PR).
Use pool-based approach for HTTP/2 to HTTP/1 request transformation. - Reuse the same pool-based approach already used for HTTP/1 request forwarding, providing a unified implementation. - Eliminate extra header copying. Previously, headers were written into the linear skb area and then copied by `_pskb_copy_fclone`. Now headers are written directly into skb fragments, reducing memory usage and copy overhead. - do not allocate a new skb during http1 request/response preparing. We should remove all old skbs except skb_head. In `skb_head` we can remove all fragments and linear data and reuse it, instead of new skb allocation. This patch demonstrate a very good performance improvement for big requests without cache: 2284: finished in 50.04s, 330629.14 req/s, 202.22MB/s finished in 50.04s, 328837.60 req/s, 201.04MB/s finished in 50.04s, 329637.96 req/s, 201.61MB/s master: finished in 50.04s, 328963.94 req/s, 201.21MB/s finished in 50.06s, 329731.98 req/s, 201.58MB/s finished in 50.04s, 328985.10 req/s, 201.17MB/s 2284 2K: finished in 50.04s, 330025.50 req/s, 201.98MB/s finished in 50.04s, 329206.90 req/s, 201.32MB/s finished in 50.04s, 329906.58 req/s, 201.74MB/s master 2K: finished in 50.03s, 277741.36 req/s, 169.92MB/s finished in 50.03s, 294516.36 req/s, 180.08MB/s finished in 50.03s, 298020.96 req/s, 182.21MB/s 2284 4k: finished in 50.03s, 313275.10 req/s, 191.66MB/s finished in 50.03s, 313052.38 req/s, 191.45MB/s finished in 50.03s, 304237.44 req/s, 186.14MB/s master 4K: finished in 50.03s, 261815.08 req/s, 160.15MB/s finished in 50.03s, 273399.16 req/s, 167.18MB/s finished in 50.03s, 270743.96 req/s, 165.64MB/s This patch is also important in the context of fixing issue #2284. As described earlier, when a large number of requests from a reset client connection coexist with requests from active client connections and the TCP window is small, requests belonging to inactive client connections may remain queued on the server connection for an extended period of time, consuming memory. This patch significantly reduce this memory overhead. Previously, request headers were stored in the skb linear data and were copied during `_pskb_copy_fclone` (doubling memory consumption!) Request headers are now stored in skb fragments instead, avoiding the copy in `_pskb_copy_fclone` and reducing memory usage by approximately 50%.
Remove `ss_skb_dflt_destructor` - this function is currently used only in one place. Additionally, the function name does not accurately reflect its behavior.
405c93e to
adfa33a
Compare
No description provided.