KEP-5709: Add well-known pod network readiness gate #5995
tssurya wants to merge 1 commit into kubernetes:master
Conversation
tssurya
commented
Apr 5, 2026
- One-line PR description: Adds a well-known pod network readiness gate
- Issue link: Pod readiness gate for network readiness #5709
- Other comments: Stems from KEP-4559 Redesigning Kubelet probes #4558 (comment)
Signed-off-by: Surya Seetharaman <suryaseetharaman.9@gmail.com>
/sig network
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: tssurya. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
> status: implementable
> creation-date: 2026-04-05
> reviewers:
>   - "@danwinship"
Who else should I add as reviewers and approvers, and how do I get these names?
I'm happy to help with a review. I'm unsure if there are requirements for a reviewer though.
thank you Adrian! I'll add you as well to the reviewers list
> # The following PRR answers are required at alpha release
> # List the feature gate name and the components for which it must be enabled
> feature-gates:
I think we don't need a feature gate but I can't tell...
You need a feature gate if you're modifying any core components (eg, kubelet), but not if the changes are all external to k/k
@tssurya: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
> <<[/UNRESOLVED]>>
> ### Approach A: Network plugin webhook (no core changes)
I think I'm going to get huge pushback for approaches B and C :D but wanted to put them up here first for review and then move them into alternatives. Some of the later sections are not filled in just to allow some time for convergence on the approach.
cc @fasaxc @caseydavenport @joestringer PTAL since this is something that might be of interest to the network plugins
> - **Con:** If the webhook is unavailable and `failurePolicy` is
>   `Ignore`, pods are created without the gate and silently lose
>   protection.
Is there another con for when the `failurePolicy` is set to `Fail`, which may cause pods to be unable to be created?
(I don't know if this is implementation specific, I don't have much CNI experience, but can it be possible for only Pods that require the CNI plugin to be matched by the webhook? I assume this may be out of scope of the KEP.)
> Is there another con for when the `failurePolicy` is set to `Fail`, which may cause pods to be unable to be created?
Yea, that's totally possible as well, and it has bigger impact, but I thought since people are opting into webhooks that's something they live with. I can also call this aspect out. Thanks for asking this.
> (I don't know if this is implementation specific, I don't have much CNI experience, but can it be possible for only Pods that require the CNI plugin to be matched by the webhook? I assume this may be out of scope of the KEP.)
This is a good question. I haven't implemented a webhook myself, but on investigating a bit more, it sounds like it can't differentiate; the closest we can get is `namespaceSelector` filtering. So the webhook would get all CREATE pod events, but inside the webhook handler I'd need to code the logic to check `spec.hostNetwork` and skip those pods...
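To make that concrete, the handler-side skip logic could look roughly like this. This is only a sketch with simplified stand-in types (a real webhook would decode the AdmissionReview into `k8s.io/api/core/v1` types), and the condition name is a placeholder I made up, not one this KEP has settled on:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the Pod fields the webhook cares about.
type ReadinessGate struct {
	ConditionType string `json:"conditionType"`
}

type PodSpec struct {
	HostNetwork    bool            `json:"hostNetwork,omitempty"`
	ReadinessGates []ReadinessGate `json:"readinessGates,omitempty"`
}

// Placeholder condition name, for illustration only.
const netReadyCondition = "example.k8s.io/PodNetworkReady"

// readinessGatePatch returns the JSON Patch operations the mutating webhook
// would emit for a CREATE pod event, or nil when no mutation is needed:
// hostNetwork pods never go through the CNI plugin, so they are skipped.
func readinessGatePatch(spec PodSpec) []map[string]any {
	if spec.HostNetwork {
		return nil
	}
	for _, g := range spec.ReadinessGates {
		if g.ConditionType == netReadyCondition {
			return nil // gate already present; nothing to do
		}
	}
	if len(spec.ReadinessGates) == 0 {
		// Field absent: create the whole list.
		return []map[string]any{{
			"op":    "add",
			"path":  "/spec/readinessGates",
			"value": []ReadinessGate{{ConditionType: netReadyCondition}},
		}}
	}
	// Field present: append to it.
	return []map[string]any{{
		"op":    "add",
		"path":  "/spec/readinessGates/-",
		"value": ReadinessGate{ConditionType: netReadyCondition},
	}}
}

func main() {
	patch, _ := json.Marshal(readinessGatePatch(PodSpec{}))
	fmt.Println(string(patch))
	fmt.Println(readinessGatePatch(PodSpec{HostNetwork: true}) == nil)
}
```

Note this only hides the hostNetwork case from the gate; it doesn't solve the `failurePolicy` fragility discussed above, since the webhook still has to be reachable to inject the gate at all.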
`failurePolicy` would basically have to be set to `Ignore` here. `Fail` is just way too fragile.
Aren't both options too fragile?
You either get your pod without the CNI (which I assume is an undesired state) or you don't get the pod at all (also undesired, but may be a better failure mode)
But also, if we're letting CNIs handle this webhook, they could do whatever they want when they register the webhook, so I assume we should document both modes as a "Con".
danwinship left a comment:
Didn't get all the way to the end, but there's already plenty to think about...
>   - "@tssurya"
> owning-sig: sig-network
> participating-sigs:
>   - sig-network
you don't need to list sig-network as both "owning" and "participating"
> Kubernetes currently has no explicit signal for whether a pod's
> network has been fully programmed and is ready to receive traffic.
Suggested change:
> Kubernetes currently has no explicit signal for whether a pod
> has been fully attached to the pod network and is ready to receive traffic.
> The closest existing condition, [`PodReadyToStartContainers`][KEP-3085],
> indicates that the pod sandbox has been created and CNI `ADD` has
> returned — but not that the network datapath is fully programmed.
> This KEP introduces a built-in [pod readiness gate][KEP-580]
Suggested change:
> This KEP introduces a well-known [pod readiness gate][KEP-580]
It's not built-in to Kubernetes, it's just a standard thing for pod network implementations to implement.
> condition that the network plugin sets to indicate network readiness,
> cleanly separating application readiness (answered by readiness probes) from
> network readiness (answered by the network plugin). This becomes
> especially important as [KEP-4559] moves kubelet probes to run
Suggested change:
> especially important as [KEP-4559] proposes to move kubelet probes to run
since it's still not even provisional yet...
> outside the cluster (e.g., Ingress or cloud load balancers), as those
> are separate concerns with their own readiness signals.
> ### User Stories
User stories are optional and can be omitted if they don't actually tell the reader anything new. (ie, don't just make up user stories to fill in the template, if you've already fully explained the problem to the extent that we understand it in the rest of the KEP)
> A compliance team requires that no traffic reach a pod before its
> NetworkPolicy rules are fully programmed. Today, there is a
> [documented race][np-pod-lifecycle] where a pod can receive traffic
No no no, the docs you're pointing to explicitly forbid implementations from having that race condition. You can mark the pod ready when some traffic is denied that should have been accepted, but you can't mark it ready when some traffic is accepted that should have been denied. This KEP should not change that (because requiring that all accept rules are fully programmed might drastically affect startup latency.)
> cluster network interface plus a high-speed RDMA interface. Each
> device may be programmed by a different plugin or driver, and each
> has its own readiness timeline. The pod should not receive traffic
> until all its network devices are fully plumbed. The network plugin
I don't think that's correct. The network readiness condition is just about "can the endpoint be reached by Services". As of right now, even when using DRA and multiple networks, Services are always reached over the cluster-default pod network, so that's what the network readiness condition should be checking.
If the code running within the pod needs access to secondary networks to do its job, then that's an application-level readiness issue, not a network readiness issue. (Even if the secondary network is attached, there's no guarantee that the remote database on that secondary network is actually up and running anyway; you would want to have your application-level readiness probe test that, and then in that case, there is no need to explicitly consider secondary-network-reachability.)
How any of this would interact with secondary networks in a future multi-network k8s environment depends on the multi-network networking model...
- If Services always point to endpoints on the cluster-default pod network, then the network readiness API doesn't need to consider other networks.
- If Services can exist on multiple networks, but any given Pod can only be an endpoint of Services on a single network, then we would want the Pod's Readiness to take into account its reachability only on that single network.
- If Services can exist on multiple networks, and a given Pod may be an endpoint of Services on multiple networks, then probably a Pod's overall Readiness should not be tied to its reachability on any particular network, and we just should keep the signal separate from Pod Readiness, and have the service proxy start tracking both Readiness and reachability separately, so that future multi-network service proxies can correctly distinguish things like "Pod A is not ready; Pod B is ready and reachable over Network X but not over Network Y; Pod C is ready and reachable over both Network X and Network Y."
- (though we could simplify and say that multi-network Pods can only be endpoints of Services when they are reachable on all of the networks they are attached to).
> convention described in [KEP-580]. Trade-off: there is no single
> condition name that operators and tooling can rely on across
> clusters, and Approach B (kubelet hardcodes the condition) would
> not work with per-plugin names.
Also, if there's no standard name, then you can't know for sure if the feature is being used in a given cluster (ie, if it's guaranteed that your pods won't become ready until the network is plumbed).
> Gives operators and tooling a consistent name to query across any
> cluster while still following the [KEP-580] naming convention.
> Trade-off: in multi-plugin clusters only one plugin can own the
> condition, or the plugins must coordinate who sets it.
While I'm not worried about the multi-network case, there is still the problem of clusters where the "pod network implementation" consists of multiple unrelated pieces. For example, if you're using flannel plus kube-network-policies, then both components affect whether the pod is fully reachable, but they don't coordinate with each other enough to be able to do a single condition... hm...
Thanks for opening this discussion. However, I'm not sure a readiness gate is enough; we've had a pretty strong signal from our users that they want the network to be ready before their process starts inside the pod. A readiness gate works for incoming service traffic, but it does nothing to delay start-up of the user's app inside the pod.

Calico's original design split the CNI plugin and network policy parts, so that the CNI plugin would return as soon as the IPAM was done and veth created. Our policy is arranged so that new pods get no connectivity until the daemonset kicks in and applies the pod-specific rules (be that iptables/nftables/BPF). While we generally "win" that race, we can lose in a large cluster when an app starts quickly and immediately starts making outgoing connections. Many apps are written to fail if their first few requests fail, or if DNS is not accessible immediately.

Overall, I'd much rather have a solution that delays container execution inside the pod until we set some flag. Calico now has a mode where the CNI plugin will wait for up to N seconds for the policy to be programmed before continuing. This closes the gap but it might be surprising to CRI/Kubelet if the CNI plugin takes longer than expected.
Yeah, that's another thing we could consider. I know in ovn-kubernetes, we intentionally return from the CNI plugin "early", because, IIRC, kubelet is basically blocked from starting up another pod until the sandbox creation completes, so if every CNI ADD call waits for the pod to be fully networked, it massively slows down the rate at which you can create new pods. Maybe we should fix that instead (since you're right, people really don't want their pods to start up with half-working networking...)