Config.uk, lwipopts.h: Expose TCP_MSL as configurable option#74

Open
harimishal1 wants to merge 1 commit into unikraft:staging from harimishal1:fix/expose-tcp-msl-config

harimishal1 commented Mar 21, 2026

While investigating catalog-core #77, which reports nginx's poor and unstable performance under repeated stress tests, I traced the root cause to lwIP's TIME_WAIT behavior.

When benchmarking nginx repeatedly with wrk -t 14 -d1m -c 30, throughput alternates between ~5,000 req/s and ~150 req/s. When connections close, they enter TIME_WAIT for 2 * TCP_MSL = 120 seconds (TCP_MSL defaults to 60000 ms). With MEMP_MEM_MALLOC=1 (heap mode), pool limits such as MEMP_NUM_TCP_PCB don't actually bound allocations, so increasing LWIP_NUM_TCPCON has no effect. The real issue is that TIME_WAIT PCBs accumulate and stall new connection establishment in the tcpip thread: I observed 20-30 second delays on TCP handshakes during "bad" runs.
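For reference, the TIME_WAIT figures in this PR follow directly from TCP_MSL. Below is a minimal sketch of that arithmetic, assuming the 500 ms slow-timer interval used by upstream lwIP. The macro and helper names are illustrative only and are not part of lwIP or of this PR.

```c
#include <stdint.h>

/* Assumption: lwIP's TCP slow timer fires every 500 ms. */
#define SKETCH_TCP_SLOW_INTERVAL_MS 500U

/* TIME_WAIT lasts two maximum segment lifetimes (2 * TCP_MSL). */
static uint32_t time_wait_ms(uint32_t tcp_msl_ms)
{
	return 2U * tcp_msl_ms;
}

/* Number of slow-timer ticks a PCB stays in TIME_WAIT before it can be freed. */
static uint32_t time_wait_ticks(uint32_t tcp_msl_ms)
{
	return time_wait_ms(tcp_msl_ms) / SKETCH_TCP_SLOW_INTERVAL_MS;
}
```

With the default of 60000 ms, `time_wait_ms` gives 120000 ms (240 ticks). With 5000 ms it gives 10000 ms (20 ticks), so each closed connection holds its PCB for a twelfth of the time.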

lwIP already defines TCP_MSL behind an #ifndef guard in tcp_priv.h, so it is designed to be overridable. This PR exposes it as a Kconfig option (LWIP_TCP_MSL) and keeps the default at 60000 ms, so existing behavior is unchanged.
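The override mechanism can be sketched as follows. This is an illustration under stated assumptions, not the PR's diff: it assumes Kconfig emits the option as a CONFIG_LWIP_TCP_MSL macro and that lwipopts.h forwards it to TCP_MSL. The value 5000 is a stand-in for what a user might select, and `configured_time_wait_ms` is a hypothetical helper.

```c
#include <stdint.h>

/* Stand-in for the macro Kconfig would generate from LWIP_TCP_MSL
 * (assumption: the usual CONFIG_ prefix; 5000 is an example value). */
#ifndef CONFIG_LWIP_TCP_MSL
#define CONFIG_LWIP_TCP_MSL 5000
#endif

/* lwipopts.h side: forward the configured value to lwIP. */
#define TCP_MSL CONFIG_LWIP_TCP_MSL

/* tcp_priv.h side: the guarded default only applies if nothing
 * defined TCP_MSL earlier, so the definition above wins. */
#ifndef TCP_MSL
#define TCP_MSL 60000UL /* maximum segment lifetime in milliseconds */
#endif

/* Hypothetical helper: resulting TIME_WAIT duration in milliseconds. */
static uint32_t configured_time_wait_ms(void)
{
	return 2U * (uint32_t)TCP_MSL;
}
```

Because lwipopts.h is included before tcp_priv.h, the guarded default is skipped whenever the Kconfig value is present, and no lwIP sources need patching.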

I tested with nginx using TCP_MSL=5000 (TIME_WAIT = 10s):

| Run | Default (TCP_MSL=60s) | TCP_MSL=5s |
|-----|-----------------------|------------|
| 1 | 5,303 req/s | 27,645 req/s |
| 2 | 193 req/s | 19,862 req/s |
| 3 | 3,170 req/s | 23,299 req/s |
| 4 | 164 req/s | 24,313 req/s |

Verified that c-http, cpp-http, click, and nginx all build correctly with the new option at the default value.

lwIP's TCP_MSL (Maximum Segment Lifetime) determines how long
connections remain in TIME_WAIT state (2 * TCP_MSL). The default
120-second TIME_WAIT causes severe performance degradation in
applications that rapidly cycle TCP connections, as TIME_WAIT PCBs
accumulate and stall new connection establishment.

Add LWIP_TCP_MSL as a Kconfig integer option under the TCP
configuration menu. The default remains 60000ms (preserving
current RFC-compliant behavior). Applications with high connection
churn (e.g. nginx under repeated benchmarks) can lower this value
to reduce TIME_WAIT duration and improve throughput stability.

Signed-off-by: misharu <harimishal1@gmail.com>