diff --git a/keps/sig-node/5999-h2c-container-probes/README.md b/keps/sig-node/5999-h2c-container-probes/README.md new file mode 100644 index 000000000000..9b2f611d3505 --- /dev/null +++ b/keps/sig-node/5999-h2c-container-probes/README.md @@ -0,0 +1,998 @@ + +# KEP-5999: HTTP/2 cleartext (h2c) container probes + + + + + + +- [Release Signoff Checklist](#release-signoff-checklist) +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) +- [Proposal](#proposal) + - [User Stories](#user-stories) + - [Story 1](#story-1) + - [Story 2](#story-2) + - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional) + - [Risks and Mitigations](#risks-and-mitigations) +- [Design Details](#design-details) + - [Test Plan](#test-plan) + - [Prerequisite testing updates](#prerequisite-testing-updates) + - [Unit tests](#unit-tests) + - [Integration tests](#integration-tests) + - [e2e tests](#e2e-tests) + - [Graduation Criteria](#graduation-criteria) + - [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + - [Version Skew Strategy](#version-skew-strategy) +- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire) + - [Feature Enablement and Rollback](#feature-enablement-and-rollback) + - [Rollout, Upgrade and Rollback Planning](#rollout-upgrade-and-rollback-planning) + - [Monitoring Requirements](#monitoring-requirements) + - [Dependencies](#dependencies) + - [Scalability](#scalability) + - [Troubleshooting](#troubleshooting) +- [Implementation History](#implementation-history) +- [Drawbacks](#drawbacks) +- [Alternatives](#alternatives) +- [Infrastructure Needed (Optional)](#infrastructure-needed-optional) + + +## Release Signoff Checklist + + + +Items marked with (R) are required *prior to targeting to a milestone / release*. 
+ +- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR) +- [ ] (R) KEP approvers have approved the KEP status as `implementable` +- [ ] (R) Design details are appropriately documented +- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors) + - [ ] e2e Tests for all Beta API Operations (endpoints) + - [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) + - [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free +- [ ] (R) Graduation criteria is in place + - [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md) within one minor version of promotion to GA +- [ ] (R) Production readiness review completed +- [ ] (R) Production readiness review approved +- [ ] "Implementation History" section is up-to-date for milestone +- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] +- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes + + + +[kubernetes.io]: https://kubernetes.io/ +[kubernetes/enhancements]: https://git.k8s.io/enhancements +[kubernetes/kubernetes]: https://git.k8s.io/kubernetes +[kubernetes/website]: https://git.k8s.io/website + +## Summary + + +HTTP/2 cleartext (h2c) is widely deployed where TLS terminates at the edge +(load balancers, ingress) while workloads still speak plain HTTP/2 on the pod +network—including gRPC and other HTTP/2 stacks. 
The wire format is defined in
+the IETF HTTP/2 specifications, which cover cleartext use with prior knowledge
+of HTTP/2 on the connection, and mainstream HTTP implementations support h2c
+when configured. Kubernetes probes should use that protocol rather than a
+separate HTTP/1.1-only listener.
+
+This KEP makes h2c (HTTP/2 cleartext) a supported option for container health
+probes, so operators are not forced to add a second HTTP/1.1-only probe port or
+depend on a `tcpSocket` probe that only shows the port is open without a
+meaningful HTTP response. Today, `httpGet` on plain TCP is effectively HTTP/1.1;
+`https` can use HTTP/2 after TLS, but there is no first-class way to probe h2c
+on the workload port. The KEP proposes opt-in API surface and kubelet behavior
+to perform those checks over h2c.
+
+## Motivation
+
+Community interest and prior discussion: [kubernetes/kubernetes#125599](https://github.com/kubernetes/kubernetes/issues/125599).
+
+HTTP/2 cleartext is defined in IETF HTTP/2 (including cleartext with prior
+knowledge of HTTP/2 on the connection) and implemented in widely used HTTP
+libraries, so the protocol is mature, not an ad hoc encoding. Together with
+widespread in-cluster use when TLS terminates outside the pod, that supports
+treating h2c as a first-class probe target.
+
+The kubelet already offers first-class `httpGet` (with `HTTP` and `HTTPS`
+schemes), `tcpSocket`, `grpc`, and `exec` probes, but none performs HTTP/2 on
+the workload’s cleartext listening port, so operators fall back to workarounds
+(an extra HTTP/1.1-only listener, or a `tcpSocket` probe without an HTTP-level
+check).
+
+Unlike gRPC probes, where part of the case was that the gRPC-related stack was
+already in the kubelet, first-class h2c in the HTTP probe path can add or
+deepen an HTTP/2 client dependency.
That cost is easier to accept
+when it matches how services already listen, parallels the HTTP/2 support
+already reachable via `https` probes after TLS, and keeps health checks
+declarative instead of pushing workloads toward `exec` to approximate
+HTTP-level checks on the real port. Broader use of HTTP/2 cleartext in other
+Kubernetes components over time would further support taking on this
+dependency in the kubelet.
+
+### Goals
+
+Enable HTTP/2 cleartext (h2c) container probes so apps are not forced to add a
+separate HTTP/1.1-only probe port or rely on TCP probes that do not check a
+real HTTP response on the workload port.
+
+### Non-Goals
+
+This KEP does not alter gRPC probes or ingress behavior, only how the kubelet
+performs cleartext HTTP/2 health checks when opted in.
+
+## Proposal
+
+We add opt-in support for HTTP/2 cleartext (h2c) container probes so the
+kubelet can perform a real HTTP request over h2c to the pod IP and port,
+without relying on a second HTTP/1.1-only listener or a tcpSocket probe for
+that case.
+
+1. **Option A: Extend httpGet** with an explicit way to request h2c while
+   keeping today’s behavior unchanged for specs that do not set it.
Currently `HTTPGetAction` supports multiple options:
+
+   ```yaml
+   readinessProbe:
+     httpGet:
+       path: /readyz
+       port: http
+       scheme: HTTP
+       httpHeaders:
+       - name: Custom-Header
+         value: value
+   ```
+
+   This proposal adds an `http2Cleartext` boolean:
+
+   ```yaml
+   readinessProbe:
+     httpGet:
+       path: /readyz
+       port: http
+       scheme: HTTP
+       http2Cleartext: true
+       httpHeaders:
+       - name: Custom-Header
+         value: value
+   ```
+
+   A new optional field `http2Cleartext` is added to the struct:
+
+   ```go
+   type HTTPGetAction struct {
+     Path string `json:"path,omitempty" protobuf:"bytes,1,opt,name=path"`
+     Port intstr.IntOrString `json:"port" protobuf:"bytes,2,opt,name=port"`
+     Host string `json:"host,omitempty" protobuf:"bytes,3,opt,name=host"`
+     Scheme URIScheme `json:"scheme,omitempty" protobuf:"bytes,4,opt,name=scheme,casttype=URIScheme"`
+     HTTPHeaders []HTTPHeader `json:"httpHeaders,omitempty" protobuf:"bytes,5,rep,name=httpHeaders"`
+
+     // When true, the probe uses HTTP/2 without TLS (often called "h2c") to connect
+     // to the container. When false or unset, behavior is unchanged from today (HTTP/1.1).
+     // Validation MUST reject incompatible combinations (see below).
+     // +optional
+     HTTP2Cleartext *bool `json:"http2Cleartext,omitempty" protobuf:"varint,6,opt,name=http2Cleartext"`
+   }
+   ```
+
+   **Pros:**
+
+   - Extends the existing httpGet probe, so users do not have to learn a new
+     probe type or maintain separate docs for another handler name.
+   - Lets the same httpGet probe speak HTTP/2 cleartext (h2c) to the app’s real
+     HTTP stack instead of forcing a separate HTTP/1.1 port or falling back to
+     a TCP probe that only checks the socket.
+   - postStart / preStop hooks already support httpGet, so if we extend httpGet
+     for h2c, hooks can use the same switch without inventing a second way to
+     do HTTP in lifecycle handlers.
+
+   **Cons:**
+
+   - It keeps today’s httpGet footguns (named ports, host override) unless we
+     add h2c-specific validation, which increases API and test surface.
- Treating h2c as full httpGet
+     behavior (redirects, headers, invalid combinations such as a user
+     selecting the `HTTPS` scheme by mistake, and so on) keeps the
+     implementation and test matrix large even if the h2c transport change
+     itself is small.
+   - The API must choose between overloading `scheme` (which blurs TLS with
+     HTTP version and needs sharp docs) and an `http2Cleartext` boolean (which
+     avoids overloading `scheme` but is less visible in manifests).
+
+2. **Option B: New probe type** (`h2cGet`, alongside httpGet / grpc) dedicated
+   to h2c, with the gRPC-style constraints reviewers prefer: numeric port only
+   (no named port), no host override, and the probe target remains the pod IP
+   and port.
+
+   This proposal adds a new probe type `h2cGet`:
+
+   ```yaml
+   readinessProbe:
+     h2cGet:
+       port: 8080
+       path: /readyz
+       httpHeaders:
+       - name: Host
+         value: my-service.local
+   ```
+
+   It adds a new struct named H2CGetAction:
+
+   ```go
+   // H2CGetAction describes an HTTP GET request over HTTP/2 cleartext (h2c) to the pod's IP address.
+   // The kubelet connects to status.podIP:port; there is no host field—use httpHeaders if a custom
+   // Host / :authority is required.
+   type H2CGetAction struct {
+     // Port number on the container. Must be in the range 1 to 65535.
+     // Named ports are not supported (unlike httpGet).
+     Port int32 `json:"port" protobuf:"bytes,1,opt,name=port"`
+     // Path to access on the HTTP server. Defaults to "/" if empty (define explicitly in validation).
+     // +optional
+     Path string `json:"path,omitempty" protobuf:"bytes,2,opt,name=path"`
+     // Custom headers to set on the request. HTTP allows repeated headers.
+     // +optional
+     // +listType=atomic
+     HTTPHeaders []HTTPHeader `json:"httpHeaders,omitempty" protobuf:"bytes,3,rep,name=httpHeaders"`
+   }
+   ```
+
+   **Pros:**
+
+   - A new probe type can enforce gRPC-style constraints directly (no named
+     ports, no host field), removing several httpGet problems without piling
+     extra rules onto httpGet.
- h2c-specific validation
+     and docs live on one struct, with less combinatorial explosion across
+     scheme / TLS / HTTP/1.1.
+   - Connects to the pod IP and a numeric port only, with no separate host
+     field to point the connection somewhere else, so it is harder to
+     misconfigure where the probe goes, matching the strict, clear style of
+     gRPC probes.
+   - If new options are needed in the future, the struct can be extended
+     without touching httpGet.
+
+   **Cons:**
+
+   - It duplicates most of httpGet (path, httpHeaders, etc.) under a new
+     handler, so we maintain two parallel APIs and two sets of docs for nearly
+     the same HTTP GET probe.
+   - LifecycleHandler does not mirror ProbeHandler (it has no grpc today), so
+     the KEP must state whether h2cGet is probe-only initially or also added
+     to lifecycle hooks (extra API work).
+
+### User Stories
+
+#### Story 1
+
+As a platform engineer operating Kubernetes behind a TLS-terminating load
+balancer, I want to configure liveness/readiness probes that speak HTTP/2
+cleartext (h2c) to my app’s main port, so that I don’t have to run a second
+HTTP/1.1-only port (with extra ingress rules and hardening) just to make probes
+succeed.
+
+#### Story 2
+
+As an application developer whose service listens with HTTP/2 without TLS inside
+the cluster, I want the kubelet to perform a real HTTP health check
+(status/path) over h2c, so that I am not forced to use a tcpSocket probe that
+only proves the port is open and does not confirm a valid HTTP response.
+
+### Notes/Constraints/Caveats (Optional)
+
+### Risks and Mitigations
+
+## Design Details
+
+This section will be completed after API review, once SIG Node settles on the
+probe shape (extend httpGet or add a new probe type) and field placement.
+
+### Test Plan
+
+- [ ] I/we understand the owners of the involved components may require updates
+      to existing tests to make this code solid enough prior to committing the
+      changes necessary to implement this enhancement.
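+Independent of which API shape is chosen, the wire-level behavior under test is
+an HTTP/2 "prior knowledge" GET over plain TCP. A hedged sketch of what a test
+fixture could exercise follows; this is not the kubelet implementation, the
+handler, port, and `/readyz` path are illustrative, and it assumes the
+`golang.org/x/net/http2` package (which the kubelet already vendors):
+
+```go
+package main
+
+import (
+	"context"
+	"crypto/tls"
+	"fmt"
+	"net"
+	"net/http"
+	"net/http/httptest"
+	"time"
+
+	"golang.org/x/net/http2"
+	"golang.org/x/net/http2/h2c"
+)
+
+// newH2CClient builds an HTTP client that speaks HTTP/2 over plain TCP
+// ("prior knowledge" h2c): AllowHTTP permits http:// URLs, and the
+// DialTLSContext override returns an unencrypted connection instead of TLS.
+func newH2CClient(timeout time.Duration) *http.Client {
+	return &http.Client{
+		Timeout: timeout,
+		Transport: &http2.Transport{
+			AllowHTTP: true,
+			DialTLSContext: func(ctx context.Context, network, addr string, _ *tls.Config) (net.Conn, error) {
+				var d net.Dialer
+				return d.DialContext(ctx, network, addr)
+			},
+		},
+	}
+}
+
+func main() {
+	// Stand-in for the workload: an h2c server on a cleartext port.
+	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		fmt.Fprintf(w, "ok via %s", r.Proto)
+	})
+	srv := httptest.NewServer(h2c.NewHandler(handler, &http2.Server{}))
+	defer srv.Close()
+
+	// Probe-side request: HTTP/2 frames from the first byte, no TLS, no Upgrade.
+	resp, err := newH2CClient(5 * time.Second).Get(srv.URL + "/readyz")
+	if err != nil {
+		fmt.Println("probe failed:", err)
+		return
+	}
+	defer resp.Body.Close()
+	fmt.Println(resp.Proto, resp.StatusCode) // HTTP/2.0 200
+}
+```
+
+Setting `AllowHTTP` together with a cleartext `DialTLSContext` is the standard
+way to make `http2.Transport` speak h2c; on the server side, `h2c.NewHandler`
+lets a plain listener accept prior-knowledge HTTP/2.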
+ +##### Prerequisite testing updates + + + +##### Unit tests + + + + + +- ``: `` - `` + +##### Integration tests + + + + + +- [test name](https://github.com/kubernetes/kubernetes/blob/2334b8469e1983c525c0c6382125710093a25883/test/integration/...): [integration master](https://testgrid.k8s.io/sig-release-master-blocking#integration-master?include-filter-by-regex=MyCoolFeature), [triage search](https://storage.googleapis.com/k8s-triage/index.html?test=MyCoolFeature) + +##### e2e tests + + + +- [test name](https://github.com/kubernetes/kubernetes/blob/2334b8469e1983c525c0c6382125710093a25883/test/e2e/...): [SIG ...](https://testgrid.k8s.io/sig-...?include-filter-by-regex=MyCoolFeature), [triage search](https://storage.googleapis.com/k8s-triage/index.html?test=MyCoolFeature) + +### Graduation Criteria + + + +### Upgrade / Downgrade Strategy + + + +### Version Skew Strategy + + + +## Production Readiness Review Questionnaire + + + +### Feature Enablement and Rollback + + + +###### How can this feature be enabled / disabled in a live cluster? + + + +- [ ] Feature gate (also fill in values in `kep.yaml`) + - Feature gate name: + - Components depending on the feature gate: +- [ ] Other + - Describe the mechanism: + - Will enabling / disabling the feature require downtime of the control + plane? + - Will enabling / disabling the feature require downtime or reprovisioning + of a node? + +###### Does enabling the feature change any default behavior? + + + +###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)? + + + +###### What happens if we reenable the feature if it was previously rolled back? + +###### Are there any tests for feature enablement/disablement? + + + +### Rollout, Upgrade and Rollback Planning + + + +###### How can a rollout or rollback fail? Can it impact already running workloads? + + + +###### What specific metrics should inform a rollback? + + + +###### Were upgrade and rollback tested? 
Was the upgrade->downgrade->upgrade path tested? + + + +###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? + + + +### Monitoring Requirements + + + +###### How can an operator determine if the feature is in use by workloads? + + + +###### How can someone using this feature know that it is working for their instance? + + + +- [ ] Events + - Event Reason: +- [ ] API .status + - Condition name: + - Other field: +- [ ] Other (treat as last resort) + - Details: + +###### What are the reasonable SLOs (Service Level Objectives) for the enhancement? + + + +###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service? + + + +- [ ] Metrics + - Metric name: + - [Optional] Aggregation method: + - Components exposing the metric: +- [ ] Other (treat as last resort) + - Details: + +###### Are there any missing metrics that would be useful to have to improve observability of this feature? + + + +### Dependencies + + + +###### Does this feature depend on any specific services running in the cluster? + + + +### Scalability + + + +###### Will enabling / using this feature result in any new API calls? + + + +###### Will enabling / using this feature result in introducing new API types? + + + +###### Will enabling / using this feature result in any new calls to the cloud provider? + + + +###### Will enabling / using this feature result in increasing size or count of the existing API objects? + + + +###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs? + + + +###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components? + + + +###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)? 
+ + + +### Troubleshooting + + + +###### How does this feature react if the API server and/or etcd is unavailable? + +###### What are other known failure modes? + + + +###### What steps should be taken if SLOs are not being met to determine the problem? + +## Implementation History + + + +## Drawbacks + + + +## Alternatives + + + +## Infrastructure Needed (Optional) + + diff --git a/keps/sig-node/5999-h2c-container-probes/kep.yaml b/keps/sig-node/5999-h2c-container-probes/kep.yaml new file mode 100644 index 000000000000..db76eac1c10d --- /dev/null +++ b/keps/sig-node/5999-h2c-container-probes/kep.yaml @@ -0,0 +1,35 @@ +title: HTTP/2 cleartext (h2c) container probes +kep-number: 5999 +authors: + - "@amritansh1502" + - "@ngopalak-redhat" +owning-sig: sig-node +participating-sigs: + - sig-network +status: provisional +creation-date: 2026-04-07 +reviewers: + - "@SergeyKanzhelev" +approvers: + - "@mrunalp" +see-also: + - https://github.com/kubernetes/kubernetes/issues/125599 +replaces: + +stage: alpha + +latest-milestone: "v1.37" +milestone: + alpha: "v1.37" + beta: "v1.38" + stable: "v1.39" + +feature-gates: + - name: H2CContainerProbe + components: + - kube-apiserver + - kubelet + +disable-supported: true + +metrics: \ No newline at end of file