KEP-5963: DRA Device Compatibility Groups#5964
KEP-5963: DRA Device Compatibility Groups#5964omeryahud wants to merge 4 commits intokubernetes:masterfrom
Conversation
|
Welcome @omeryahud! |
|
Hi @omeryahud. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
81a89cb to
9c9c306
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: omeryahud The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
9c9c306 to
a31625a
Compare
rajatchopra
left a comment
There was a problem hiding this comment.
Looks good. Subject to schedulers willing to adopt these suggestions.
| Add a `device.consumesCounters[].compatibilityGroups` field. Devices declare which | ||
| named groups they belong to. For two devices consuming counters from the same | ||
| counter set to be co-allocated, they must share at least one compatibility group. | ||
| Devices without this field are considered compatible with all groups. This |
There was a problem hiding this comment.
Should the default be 'not compatible with any group'? And for compatibility with all (or some) we can use a regex? Like '*'. Regex may have more benefits like 'fft-accelerator-*' to claim fmm-accelerator compatibility with all fft-accelerators. But mutual exclusivity between intra fmm and fft devices.
Then, an older version slice and a newer scheduler will automatically mean mutual exclusivity.
There was a problem hiding this comment.
Hi @rajatchopra, thanks for the input!
My worry is that if the default is anything other than "compatible with all devices", this will introduce API breakage.
Meaning that when upgrading to a k8s version with this change, and the default is not "compatible with all devices", all devices will become bricked until their DRA drivers are upgraded to a version that is aware of this change and define compatibility groups for them.
What I do think I can improve here is to explicitly state that devices without compatibilityGroups are only compatible with other devices without compatibilityGroups (since the constraint is bi-directional and transitive).
I will update this section
|
|
||
| ### Scheduler Changes | ||
|
|
||
| The DRA scheduler plugin is enhanced to: |
There was a problem hiding this comment.
Will it help the scheduler to know upfront the list of compatibility groups? Or is reaping the list from devices in a slice good enough?
We may want a .sharedCounters[].compatibilityGroups field if it makes it easier for the scheduler. Also makes the spec 'compile-correct'.
There was a problem hiding this comment.
I'd have to look more closely on the plugin implementation to understand that.
My understanding is that the scheduler:
- sums up all allocated requests to infer the current resource utilization (counterSets)
- then iterates over all available devices that match the request (by driver name or CEL expression), and attempts to allocate them one-by-one until it finds a valid allocation.
My thinking was to extend that logic to create a set of compatibilityGroups in step 1, and then filter out devices that do not specify a compatibilityGroup within the set.
Do you see how defining compatibilityGroups ahead of time can simplify this process? Or do you have a different process in mind?
|
/assign @alaypatel07 |
8816df4 to
451a468
Compare
Signed-off-by: Omer Yahud <oyahud@nvidia.com>
451a468 to
89aef0f
Compare
Signed-off-by: Omer Yahud <oyahud@nvidia.com>
Co-authored-by: Patrick Ohly <patrick.ohly@intel.com>
Tighten API semantics, production readiness, and version skew coverage ahead of the implementable review: - Resolve missing-field contradiction in favor of the strict "nil only matches nil" rule, stated consistently in Proposal and API. - Correct the transitivity claim to a symmetric-but-not-transitive pairwise group-intersection predicate. - Flesh out Version Skew Strategy with five concrete skew scenarios. - Populate Test Plan, rollback metric, upgrade-testing plan, and Implementation History. - Explain field placement on consumesCounters[] as an intentional hardware-scoped choice; add a multi-request/DeviceConstraints interaction subsection. - Document scheduler complexity as O(N*M*G) with expected bounds. - Expand Non-Goals, improve the scheduling event format to name the conflicting groups, and align Monitoring section with the new metric. - Clarify per-component behavior of the DRADeviceCompatibilityGroups feature gate. - Add Inverted-naming alternative, regenerate ToC.
Uh oh!
There was an error while loading. Please reload this page.