KEP 5554: In-place update pod resources alongside static CPU manager policy (KEP creation, #5555)
Conversation
esotsal
commented
Sep 21, 2025
- One-line PR description: Create new KEP 5554: In place update pod resources alongside static cpu manager policy
- Issue link: Support In place update pod resources alongside static cpu manager policy #5554
- Other comments:
@esotsal: GitHub didn't allow me to request PR reviews from the following users: Chunxia202410. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
I think the most appropriate answer is: I know that I don't know :-(
Thanks for your time; please check the new KEP updates in preparation for v1.36. It includes the
Status update for the PRR reviewer, ahead of the upcoming PRR freeze on Wednesday 4 February 2026 (AoE) / Thursday 5 February 2026, 12:00 UTC (last updated 3 February 2026). Answered the open comments below.
Remaining open comments I am working on; these are not blocking for alpha and can be fine-tuned in beta:
For the remaining unresolved comments, it is up to the reviewers to decide whether the provided answers are sufficient or blocking for this KEP to go alpha in v1.36. I think the most important ones are below:
Please let me know if I've missed a comment. Thanks in advance!
natasha41575
left a comment
I see my previous comments have been addressed - thanks!
> When the topology manager cannot generate a topology hint which satisfies the topology manager policy, the pod resize is marked as [Deferred](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources#resize-status). This means that while the node theoretically has enough CPU for the resize, it's not currently possible but can be re-evaluated later, potentially becoming feasible as pod churn frees up capacity.
> Reasons for failure
(replying to @ffromani's comment on this thread: #5555 (comment))
I want us to be very careful about what we mark as Deferred, and err on the side of marking things Infeasible. We are planning Scheduler Preemption for IPPR in 1.37.
Scheduler preemption will be triggered on all Deferred resizes, meaning the scheduler logic will try to find pods to preempt based on priority class and the size of the pod. Because the scheduler is not NUMA-aware, it is only safe to mark a resize as "Deferred" if the kubelet knows that scheduler-triggered preemption can help the resize succeed. Otherwise the scheduler will preempt pods unnecessarily.
In your example scenario I can see it could be possible to do this, but do you think we can reliably implement this kind of logic? It seems both complex and fragile - it would require the kubelet to make a lot of assumptions of the scheduler's behavior and I'm not sure that's a direction we want to go.
My opinion is marking it as "Infeasible" is a safe step forward to unblock us now, while still leaving room to relax it to "Deferred" later if necessary.
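The conservative rule argued for above can be sketched as a small decision function. This is a hypothetical illustration, not kubelet code; all names and the simple capacity check are assumptions made for the sketch:

```go
package main

import "fmt"

type resizeDecision string

const (
	decisionDeferred   resizeDecision = "Deferred"
	decisionInfeasible resizeDecision = "Infeasible"
)

// classifyFailedResize sketches the rule: if the request exceeds the node's
// allocatable CPUs outright, no amount of preemption helps -> Infeasible.
// If capacity exists but the topology manager could not produce an aligned
// hint, the conservative choice is still Infeasible, because the scheduler
// is not NUMA-aware and preemption is not guaranteed to free CPUs on the
// right NUMA node. Only a case where preemption could plausibly help would
// be marked Deferred.
func classifyFailedResize(requestedCPUs, nodeAllocatableCPUs int, topologyHintAvailable bool) resizeDecision {
	if requestedCPUs > nodeAllocatableCPUs {
		return decisionInfeasible
	}
	if !topologyHintAvailable {
		// Conservative: do not trigger scheduler preemption that may not help.
		return decisionInfeasible
	}
	return decisionDeferred
}

func main() {
	fmt.Println(classifyFailedResize(16, 8, false)) // Infeasible
	fmt.Println(classifyFailedResize(4, 8, false))  // Infeasible (conservative)
	fmt.Println(classifyFailedResize(4, 8, true))   // Deferred
}
```

The point of the sketch is only that the "relax to Deferred later" path stays open: the final branch can be loosened in a future release without changing the Infeasible cases.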
I took a pass and the KEP content LGTM. I have no major comments about the plan outlined in the KEP and, of course, I fully agree with the goal. I see a future extension path to incorporate the memory manager (and maybe the device manager?).
I think there's room to polish the implementation details section a bit, but it's pretty minor.
If we both target the next cycle, the new fields from our features will have to be added to the new CPU manager V3 state in the same release, creating a dependency on the serialization changes of whichever feature merges first. I wonder what the process would be if the first merged feature is rolled back but the second one is kept. At least we could keep any non-feature-related changes that belong to the V3 state for the second feature to use. The path may be easier since we are not modifying the state in the same way.
Thanks for sharing your thoughts @KevinTMtz , I think likewise.
I think releasing in v1.36 is the desired outcome for both; according to the SIG Node v1.36 KEPs planning, both are Considered for release, KEP 5554 with High priority and KEP 5526 with Medium priority.
I agree; based on the above I think it is manageable and doable to add them both in the v1.36 release. I don't have a preference on the merge order, both work for me; it is up to the sig-node community, reviewers, and approvers to decide what is most suitable.
natasha41575
left a comment
ippr-specific bits LGTM
the rest of the content also LGTM, but admittedly I am not an expert in topology / cpu manager
/assign @ffromani @tallclair
This is an interesting part because I kinda feel the same in reverse. The cpu/topology manager bits make sense, but I can't really comment on the IPPR integration.
Is this a correct 10,000-foot summary of the flow?
We actually already run admission checks on resize. So the ideal flow I think would be to integrate TM feasibility checks (i.e. can TM generate a hint?) into the "admission" path, and integrate TM allocation of CPUs into the "resize actuation" path which happens during a pod sync. As an aside, I actually think kubernetes/kubernetes#133427 could help simplify the implementation of this KEP, because it unifies the existing resize feasibility checks with admission, allowing for different checks depending on whether we are adding or resizing a pod. So I can try to get this one in. IIUC TM already has its own admission handler, so we'd just need to make sure it does the right checks on resize. Does this make sense?
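The two-phase flow described here can be sketched minimally: the topology-feasibility question sits on the admission path, while the actual exclusive-CPU assignment happens later during resize actuation (pod sync). All types and functions below are illustrative stand-ins, not kubelet internals, and the flat "free CPU list" is a deliberate simplification with no NUMA awareness:

```go
package main

import "fmt"

type pod struct {
	name          string
	requestedCPUs int
}

type node struct {
	freeCPUs []int // CPU IDs currently unassigned (simplified, non-NUMA model)
}

// admitResize is the admission-path check: can a hint be generated at all?
// It must not mutate state; it only answers feasibility.
func (n *node) admitResize(p pod) bool {
	return len(n.freeCPUs) >= p.requestedCPUs
}

// actuateResize is the pod-sync step: commit the actual CPU assignment.
// It re-checks feasibility because state may have changed since admission.
func (n *node) actuateResize(p pod) ([]int, bool) {
	if !n.admitResize(p) {
		return nil, false
	}
	assigned := n.freeCPUs[:p.requestedCPUs]
	n.freeCPUs = n.freeCPUs[p.requestedCPUs:]
	return assigned, true
}

func main() {
	n := &node{freeCPUs: []int{0, 1, 2, 3}}
	p := pod{name: "gu-pod", requestedCPUs: 2}
	if n.admitResize(p) {
		cpus, _ := n.actuateResize(p)
		fmt.Println("assigned CPUs:", cpus)
	}
}
```

The design point the sketch illustrates is the separation: admission answers "could this ever work right now", actuation performs the state change, which matches folding TM feasibility checks into the existing resize admission handler.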
I think @tallclair has a holistic understanding of both sides, so his review will help tremendously.
@dchen1107, @tallclair, @natasha41575, @ffromani, @deads2k, @KevinTMtz, @pravk03: are there any blocking items and/or open action points for this KEP which I might have missed for alpha PRR? If so, please let me know; enhancement freeze is tomorrow, so I would appreciate your feedback. Adding @whtssub (this KEP's wrangler), who has kindly updated the KEP's status #5554 (comment)
Hi @esotsal, thank you for the suggestions from you and the community regarding this issue. Since this feature is quite independent, we plan to address this part as a separate KEP. Thank you.
It does, thanks for clarifying!
+1!!
LGTM from my side!
PRR looks good for alpha. I made a few comments about things we'll need to be sure we refine in beta.
/approve
Thanks, updated this PR to resolve those comments.
/lgtm |
Thanks, updated KEP to resolve |
/lgtm |
/lgtm
The decision to defer to the TopologyManager for which CPUs to downsize LGTM. I'm not sure about the decision to forbid resizing below the initial count, but that is something we can easily revisit at a later date if there's a use case for it.
> To effectively address the needs of both users and Kubernetes components for the realization of this KEP, the proposed implementation involves the following changes:
> 1. Update the `CPUManager` checkpoint file format as stated in the [ContainerCPUs checkpoint](#containercpus-checkpoint) section, which will serve as the single source of truth representing the original and resized exclusive CPUs of an in-place CPU resize of a Guaranteed pod with the static CPU policy.
Will the new format be feature gated?
If the feature gate is not set, "resize" will always be empty and only "original" will be used. I hadn't thought about making the new format feature gated. Do you have any use case in mind? Is it OK to continue the discussion in the implementation PR?
> Will the new format be feature gated?
Such a simple question; I was not aware how important it was, or of the complexity of solving it. Thanks @tallclair for the question. Considering the v1.36-cycle reviews in this KEP's PR, the short answer: it was missed, and yes, it must be feature gated, as must ALL code modifications. Why? To ensure Kubernetes operational activities will not be impacted (rollback, harmonized co-existence with other features touching the checkpoint, ensuring v1.PodReasonInfeasible is returned when needed to reduce the impact of unnecessary resizes on a node, etc.). #5965 was created to update the KEP with these modifications, hoping we reach a consensus and increase confidence that most if not ALL risks have been considered so the KEP can go alpha in v1.37.
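The gating behavior described here can be sketched minimally, assuming a simple boolean gate (names are illustrative, not kubelet code): with the gate disabled, the new field is never populated, so the on-disk checkpoint keeps its old shape and a rolled-back kubelet can still read it.

```go
package main

import "fmt"

// entry mirrors the hypothetical checkpoint entry from the discussion above:
// "Original" is always written; "Resize" only exists under the feature gate.
type entry struct {
	Original string
	Resize   string
}

// recordResize writes the post-resize CPU set only when the gate is on.
// With the gate off, the function is a no-op, preserving the legacy format.
func recordResize(e *entry, newCPUs string, featureGateEnabled bool) {
	if !featureGateEnabled {
		return
	}
	e.Resize = newCPUs
}

func main() {
	e := entry{Original: "0-3"}
	recordResize(&e, "0-5", false)
	fmt.Printf("gate off: %+v\n", e)
	recordResize(&e, "0-5", true)
	fmt.Printf("gate on:  %+v\n", e)
}
```

The same guard would wrap every code path the KEP touches, which is what makes rollback safe: disabling the gate reverts behavior without needing to migrate the checkpoint.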
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: dchen1107, deads2k, esotsal, tallclair. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |