v0.12.2

@github-actions github-actions released this 19 Mar 17:49
· 22 commits to main since this release

This is a new minor release of NRI Reference Plugins. It comes with a few new features, some bugfixes and dependency updates.

What's New

Topology Aware Policy

  • Topology Pools for CPU Clusters by L3 Cache. On hardware where clustering CPUs by shared L3 cache yields a topological entity distinct from clustering by socket, nearest NUMA node, or die, the policy now creates L3 cache pools in the topology tree. In such setups it is now also possible to put an L3 cache level cap on unlimited burstable containers, in which case the policy will try to fit such a container within an L3 cache clustered set of CPUs with enough free resources. This configuration fragment sets the L3 cache burstability cap as the global default:
config:
  reservedResources:
    cpu: 1
  ...
  # Constrain unlimited burstable containers to a single L3 cache cluster.
  unlimitedBurstable: l3cache

You can also annotate an L3 cache burstability cap for containers. For instance, this pod spec tightens the cap for ctr0 to an L3 cache cluster:

apiVersion: v1
kind: Pod
metadata:
  name: burstable
  annotations:
    # constrain unlimited burstable containers to a single socket by default
    unlimited-burstable.resource-policy.nri.io/pod: 
    # but tighten this for ctr0 to an L3 cache cluster
    unlimited-burstable.resource-policy.nri.io/container.ctr0: l3cache
spec:
  containers:
  - name: ctr0
    image: myimage-ctr0
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 2
        memory: 500M
      limits:
        memory: 500M
  - name: ctr1
    image: myimage-ctr1
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 1
        memory: 100M
      limits:
        memory: 100M
...
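
The same cap could presumably be set for every container of a pod at once. The fragment below is a minimal sketch, assuming the pod-scoped form of the annotation key accepts the `l3cache` value just as the container-scoped form above does. It is pieced together from the keys and values shown in these notes, not taken from the policy documentation.

```yaml
metadata:
  annotations:
    # Assumption: the pod-scoped key accepts the same l3cache value
    # as the container-scoped key in the pod spec above.
    unlimited-burstable.resource-policy.nri.io/pod: l3cache
```

A container-scoped annotation, where present, would then be the place to tighten or relax the cap for an individual container, as the pod spec above does for ctr0.
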
  • Strict Topology Hint interpretation can be annotated on pods or containers. When annotated for strict hints, the policy treats the fulfillment of topology hints as a requirement instead of a preference, and will rather fail the allocation of a container than assign it resources that are misaligned with respect to a strict hint. For instance, this pod spec annotates the highprio container within the pod for strict hints:
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    # fail creation of highprio if strict alignment by hints is not possible
    strict.topologyhints.resource-policy.nri.io/container.highprio: 'true'
spec:
  containers:
  - name: highprio
    image: myimage-highprio
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 1
        memory: 500M
      limits:
        cpu: 1
        memory: 500M
  - name: normal
    image: myimage-normal
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 500m
        memory: 100M
      limits:
        cpu: 500m
        memory: 100M

Note that you might need to ensure that any unexpected implicit topology hints are disabled so they don't cause misalignment failures in strict mode. For instance, you might want to disable all implicit hints from mounts and devices if you want to enable strict alignment for pod resource API hints.
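
Since strict hints can be annotated on pods as well as on containers, a pod-wide variant might look like the fragment below. This is a sketch under an assumption: the pod-scoped key is taken to follow the same `<key>/pod` pattern as the unlimited burstable annotation, which these notes do not confirm for this particular key.

```yaml
metadata:
  annotations:
    # Assumption: pod-scoped key formed by analogy with
    # unlimited-burstable.resource-policy.nri.io/pod.
    # Intended effect: fail creation of any container in the pod
    # if strict alignment by hints is not possible.
    strict.topologyhints.resource-policy.nri.io/pod: "true"
```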

  • Strict CPU Isolation can now be annotated on pods or containers. Such an annotation turns the allocation of isolated CPUs into a requirement instead of a preference, failing container creation if isolated CPUs cannot be allocated. For instance, this pod spec requires isolated CPUs for the highprio container of the pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    # fail creation of highprio if we can't allocate an isolated CPU for it.
    require-isolated-cpus.resource-policy.nri.io/container.highprio: "true"
spec:
  containers:
  - name: highprio
    image: myimage-highprio
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 1
        memory: 500M
      limits:
        cpu: 1
        memory: 500M
  - name: normal
    image: myimage-normal
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 500m
        memory: 100M
      limits:
        cpu: 500m
        memory: 100M
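
For a pod-wide requirement, the annotation could presumably be given at pod scope instead. The fragment below is a minimal sketch, assuming the pod-scoped key follows the same `<key>/pod` pattern as the unlimited burstable annotation. The key name is inferred from the container-scoped example above and is not confirmed by these notes.

```yaml
metadata:
  annotations:
    # Assumption: pod-scoped key formed by analogy with the
    # container-scoped key above.
    # Intended effect: fail creation of a container in the pod
    # if isolated CPUs cannot be allocated for it.
    require-isolated-cpus.resource-policy.nri.io/pod: "true"
```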

What's Changed

  • go.{mod,sum}: bump otel deps to latest 1.42.0. by @klihub in #641
  • resmgr: block cache save during initial sync. by @klihub in #640
  • go.{mod,sum}: bump goresctrl deps to v0.12.0. by @klihub in #642
  • topology-aware: add l3Cache topology/pool nodes. by @wongchar in #635
  • cpuallocator, topology-aware: handle weird die setups better by @klihub in #643
  • topology-aware: implement strict hint and CPU isolation preferences. by @klihub in #638

New Contributors

Full Changelog: v0.12.1...v0.12.2