
feat(web): allow to shutdown faster when there is no more requests #1193

Open
joelwurtz wants to merge 5 commits into HFQR:main from joelwurtz:feat/shutdown-when-no-requests

Conversation

@joelwurtz
Contributor

@joelwurtz joelwurtz commented Jan 30, 2025

Ref #1190

Still a draft, but this is an example of how we could shut down faster and cleanly even when there are still pending connections. This is a proof of concept on the h1 dispatcher; h2 and h3 can do something similar as well.

I used a tokio::sync::watch channel, then a tokio_util CancellationToken, to check for shutdown changes, but maybe there is a better solution (there are a lot of things to adapt, like naming, correct deps, etc., but this is mainly an example for the moment).
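
For illustration, a minimal sketch of the idea (not the PR's actual code): race a `tokio_util::sync::CancellationToken` against the connection's read side in a simplified h1 loop. `read_request` and `handle` are hypothetical stand-ins for the dispatcher's real I/O.

```rust
use tokio_util::sync::CancellationToken;

struct Request;

// hypothetical stand-ins for the dispatcher's real I/O
async fn read_request() -> Option<Request> { None }
async fn handle(_req: Request) {}

async fn dispatch_loop(token: CancellationToken) {
    loop {
        tokio::select! {
            // shutdown signalled while the connection is idle: exit
            // immediately instead of waiting out the keep-alive timeout
            _ = token.cancelled() => break,
            req = read_request() => match req {
                Some(req) => handle(req).await,
                None => break, // peer closed the connection
            },
        }
    }
}
```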

@joelwurtz joelwurtz force-pushed the feat/shutdown-when-no-requests branch from 52ce564 to 27dfc69 on January 30, 2025, 17:00
Comment thread on test/tests/h1.rs
assert_eq!("GET Response", body);
}

handle.try_handle()?.stop(false);
Contributor Author


Before this change this would wait for the keep-alive timeout; now it shuts down instantly (because there is no pending request).

@joelwurtz joelwurtz force-pushed the feat/shutdown-when-no-requests branch from 2f236c9 to 91a691c on February 27, 2025, 09:00
@joelwurtz joelwurtz changed the title feat(web): allow to shutdown faster when there is no more requests on h1 feat(web): allow to shutdown faster when there is no more requests Feb 27, 2025
@joelwurtz joelwurtz marked this pull request as ready for review February 27, 2025 09:01
@joelwurtz
Contributor Author

joelwurtz commented Feb 27, 2025

This implementation works to shut down faster on h1 / h2 and h3 when there are no more requests; however, I'm sure there is a better way to do it.

@joelwurtz joelwurtz force-pushed the feat/shutdown-when-no-requests branch 2 times, most recently from f2d2069 to cc92695 on February 27, 2025, 10:22
@joelwurtz joelwurtz force-pushed the feat/shutdown-when-no-requests branch from cc92695 to 6a58422 on January 13, 2026, 13:52
@joelwurtz joelwurtz force-pushed the feat/shutdown-when-no-requests branch from 6a58422 to 755b6c7 on April 14, 2026, 08:05
@fakeshadow
Collaborator

Can you leave out the h2 dispatcher part for now? The code base for it is still very messy, and some code paths around shutdown handling got refactored recently. I can take over that part after I rule out all the obvious bugs. The service and server parts are reasonable.

@joelwurtz
Contributor Author

Yeah, sure, I will remove it (but the last changes make it work fine).

Are you fine with the design of the "ShutdownToken"?

I removed the dependency on tokio-util to use a "simple" approach; however, it requires spin (which gives us a Mutex in a non-std env), and it also requires the "alloc" feature (to be able to box-pin the future).
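
For illustration, a minimal sketch of what such a token might look like under those constraints (spin for the lock, alloc for boxing); this is not the PR's code, and every name here is an assumption. Wakers are re-registered on every poll, which a real implementation would deduplicate.

```rust
// Hypothetical sketch only, not the PR's ShutdownToken. Requires the
// `alloc` feature in a no-std build (`extern crate alloc;` at crate root).
use alloc::{boxed::Box, sync::Arc, vec::Vec};
use core::{
    future::Future,
    pin::Pin,
    sync::atomic::{AtomicBool, Ordering},
    task::{Context, Poll, Waker},
};

#[derive(Clone, Default)]
pub struct ShutdownToken {
    inner: Arc<Inner>,
}

#[derive(Default)]
struct Inner {
    shutdown: AtomicBool,
    // spin::Mutex stands in for std::sync::Mutex in non-std environments
    wakers: spin::Mutex<Vec<Waker>>,
}

impl ShutdownToken {
    /// Flag the token as shut down and wake every pending waiter.
    pub fn shutdown(&self) {
        self.inner.shutdown.store(true, Ordering::Release);
        for waker in self.inner.wakers.lock().drain(..) {
            waker.wake();
        }
    }

    pub fn is_shutdown(&self) -> bool {
        self.inner.shutdown.load(Ordering::Acquire)
    }

    /// Future resolving once `shutdown` has been called; boxed, hence
    /// the `alloc` requirement mentioned above.
    pub fn wait(&self) -> Pin<Box<dyn Future<Output = ()> + Send + Sync>> {
        let inner = self.inner.clone();
        Box::pin(core::future::poll_fn(move |cx: &mut Context<'_>| {
            if inner.shutdown.load(Ordering::Acquire) {
                return Poll::Ready(());
            }
            inner.wakers.lock().push(cx.waker().clone());
            // re-check to avoid a lost wake-up between the first load
            // and registering the waker
            if inner.shutdown.load(Ordering::Acquire) {
                Poll::Ready(())
            } else {
                Poll::Pending
            }
        }))
    }
}
```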

@fakeshadow
Collaborator

I'm thinking about a "simpler" design which was partially working with the old http2 implementation: a "connection: close" header would cause connection shutdown. In http1 we already have this behavior embedded in Response encoding.

The new homebrew http2 dispatcher still lacks this behavior, but I can imagine the process being:
the response task polls Service::call to completion. We parse the response for a "connection" header, and if it contains the value "close" we yield from the chained select futures and issue a server-side GoAway frame with a NO_ERROR reason. Then we transition to a graceful GoAway shutdown, as if we had received a peer GoAway with NO_ERROR. This means the h2 connection would stop receiving new streams, let existing stream tasks run to completion, and then go into connection shutdown.

That's it for the dispatcher level. For high-level http users I can imagine a cancel token or some channel-based middleware that, after observing a certain condition, would add "connection: close" to all response headers (sketched below). This would result in a robust graceful shutdown for http2 and http3: you just send a request and the app goes away without dropping other potentially ongoing concurrent requests.

And of course we should allow hooking a cancel token up to acceptor types that listen for new streams, otherwise new streams may keep coming in.

This approach is not as direct as hooking a cancel token up to http services (you have to issue one request to immediately trigger the shutdown), but I guess it's more useful in general.

What do you think of this approach?
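
For illustration, a rough sketch of the "connection: close" middleware idea above, written against http crate types rather than xitca's own Service trait; `DRAINING`, `start_draining`, and `mark_close` are assumed names, not existing API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

use http::{
    header::{HeaderValue, CONNECTION},
    Response,
};

// set once a shutdown signal has been observed
static DRAINING: AtomicBool = AtomicBool::new(false);

/// Called by the shutdown signal handler to start draining.
fn start_draining() {
    DRAINING.store(true, Ordering::Release);
}

/// Applied to every outgoing response: once draining, ask the peer to
/// close after this exchange (h1), or take the GoAway path described
/// above (h2/h3).
fn mark_close<B>(res: &mut Response<B>) {
    if DRAINING.load(Ordering::Acquire) {
        res.headers_mut()
            .insert(CONNECTION, HeaderValue::from_static("close"));
    }
}
```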

@fakeshadow
Collaborator

BTW, if we are not adding a cancel token to Listener types, we can utilize the ReadyService impl to stop the http service from accepting new streams. In xitca-server we always call ReadyService::ready before accepting new streams, so the http service just needs to stay pending and throttle itself after observing the cancel token. This can also be achieved with middleware.
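
A sketch of the throttling idea only; the real ReadyService trait lives in xitca-service and its signature may differ. This reuses the hypothetical ShutdownToken from the earlier sketch.

```rust
// once the token reports shutdown, readiness never resolves, so the
// server (which awaits readiness before each accept) stops pulling in
// new connections
async fn ready(token: &ShutdownToken) {
    if token.is_shutdown() {
        core::future::pending::<()>().await;
    }
}
```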

@fakeshadow
Collaborator

#1366

We can start from this PR. From what I see, the cancel token is mostly about cancelling the KeepAlive timer. We can expand an arbitrary type into util::timer::KeepAlive, poll it before the Sleep, and yield KeepAliveOutput::Cancel. Then in the h1 dispatcher we just treat the cancel as a non-error break point for shutdown.
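
A rough sketch of that race; `KeepAliveOutput` and the `keep_alive` helper here are stand-ins, not xitca-http's actual util::timer types, and `ShutdownToken` is the hypothetical token from earlier.

```rust
enum KeepAliveOutput {
    Expired,
    Cancel,
}

async fn keep_alive(
    sleep: impl std::future::Future<Output = ()>,
    token: &ShutdownToken,
) -> KeepAliveOutput {
    tokio::select! {
        // poll cancellation before the timer
        biased;
        _ = token.wait() => KeepAliveOutput::Cancel,
        _ = sleep => KeepAliveOutput::Expired,
    }
}
```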

The cancel token can live inside HttpServiceConfig as a trait object like Arc<dyn CancelToken + Send + Sync> or Arc<dyn Fn() -> Box<dyn CancelToken> + Send + Sync>. The CancelToken itself would have throttle and cancel methods. throttle would be called in <HttpService as ReadyService>::ready to throttle the readiness of the service and prevent it from stealing new connections. cancel would be called in KeepAlive to race the timer to KeepAliveOutput::Cancel. One thing I'm still undecided on is whether we should expose the Token trait or a Timer trait. Token is the more direct approach, but Timer could offer more expanded functionality, with cancel being one of the extensions.
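
One possible shape of such a trait, purely as a sketch; these signatures are assumptions, not proposed xitca-http API.

```rust
use core::future::Future;
use core::pin::Pin;

pub trait CancelToken {
    /// Resolves when the service should stop accepting new connections;
    /// raced inside `<HttpService as ReadyService>::ready`.
    fn throttle(&self) -> Pin<Box<dyn Future<Output = ()> + Send + '_>>;

    /// Resolves when the keep-alive timer should be cancelled; raced
    /// against the `Sleep` to yield `KeepAliveOutput::Cancel`.
    fn cancel(&self) -> Pin<Box<dyn Future<Output = ()> + Send + '_>>;
}
```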

@joelwurtz
Contributor Author

Thanks for the follow-up. I think #1366 is a good start too; I was also wondering if the keep-alive could be updated to also support shutdown.

However, to make it work nicely, I think it would be good to also have a specific object which is public and can be shared by users of this library.

As an example, we work on a reverse proxy, and having something that informs us that the app is in shutdown mode is crucial to allow closing existing connections (like websockets).

All of my PRs are in a fork actually, and it works nicely. Not sure if this is wanted, but with the shutdown and some other changes we are actually able to restart the server without losing new connections.

Existing connections are still shut down, but properly, and when a new connection arrives during shutdown it stays in the accept queue until the new process has started. We need this shutdown to be fast so as not to fill up this queue when migrating the socket from the old process to the new one.

@fakeshadow
Collaborator

Yes. From the get-go we would only offer a default no-op type that can be overridden by an API like HttpServiceConfig::set_cancel_token(MyToken), where MyToken would implement the CancelToken trait from xitca-http's point of view.
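
Continuing the earlier trait sketch, the default no-op type could look like this; `NoOpToken` is a hypothetical name.

```rust
// never throttles readiness and never cancels the keep-alive timer:
// both futures simply stay pending forever
pub struct NoOpToken;

impl CancelToken for NoOpToken {
    fn throttle(&self) -> Pin<Box<dyn Future<Output = ()> + Send + '_>> {
        Box::pin(core::future::pending())
    }

    fn cancel(&self) -> Pin<Box<dyn Future<Output = ()> + Send + '_>> {
        Box::pin(core::future::pending())
    }
}
```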

We can later introduce a specific token implementation in xitca-web that emits a cloneable, thread-safe smart pointer that can be used anywhere the user likes.

That said, it would be even more useful if we simply exposed a Timer trait that can cover the cancel token's use case and possibly enable runtime independence for xitca-http (right now xitca-http has gotten rid of most async runtime dependence on tokio, except for timer-related stuff). Which is why I'm not decided yet whether we should go with the CancelToken route right away (the timer usage in xitca-http is kinda tricky due to the low-res timer, where a 500ms-resolution async timer is driven through tokio::spawn; it's the big and sole issue I want to work out right now).
