Skip to content

Discussion: CRD refactoring#36

Closed
aliok wants to merge 4 commits intokubernetes-sigs:mainfrom
aliok:2026-03-10-crd-logical-grouping
Closed

Discussion: CRD refactoring#36
aliok wants to merge 4 commits intokubernetes-sigs:mainfrom
aliok:2026-03-10-crd-logical-grouping

Conversation

@aliok
Copy link
Copy Markdown
Member

@aliok aliok commented Mar 11, 2026

Issue: #32

Creating a PR with a paper prototype CRD design, so that we can collaborate.

I will create a separate PR once we decide on the API shape we want.

aliok added 2 commits March 11, 2026 15:17
Signed-off-by: Ali Ok <aliok@redhat.com>
Signed-off-by: Ali Ok <aliok@redhat.com>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 11, 2026
Signed-off-by: Ali Ok <aliok@redhat.com>
Comment thread new_crd_wip.yaml
Comment on lines +148 to +160
# TODO: which one do you think we need from these VolumeMount fields?
# IMO, we should only have `readOnly` as the other fields are more advanced features.
# Alternatively, we can a copy of the https://pkg.go.dev/k8s.io/api@v0.35.0/core/v1#VolumeMount and
# make it a sub-field.
readOnly: ...
# recursiveReadonly: ...
# subPath: ...
# mountPropagation: ...
# subPathExpr: ...
# ... more fields from https://pkg.go.dev/k8s.io/api@v0.35.0/core/v1#VolumeMount
# Since we are not separating the volume source (ConfigMap, Secret, PVC, etc.) from the mount configuration
# (mountPath, subPath, readOnly, etc.) in this CRD design, we can only include these fields here in the mount
# configuration directly
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In native Kubernetes, storage mounting uses two separate concepts:

  1. Volume Source (spec.volumes[]): What to mount (ConfigMap, Secret, PVC, etc.) with a name
  2. Volume Mount (spec.volumeMounts[]): How to mount (mountPath, readOnly, subPath) referencing a volume name

Native Kubernetes Example:

  spec:
    volumes:
      - name: config-vol
        configMap:
          name: app-config
    containers:
      - volumeMounts:
          - name: config-vol          # References volume by name
            mountPath: /etc/config
            readOnly: true

This separation makes things flexible but adds verbosity.

Here, I think we should avoid that separation and use a single struct to specify both the volume source and the mount configuration. Reason is that we're providing an opinionated API and not the full Kubernetes API.

This would mean that we shouldn't have all fields from https://pkg.go.dev/k8s.io/api@v0.35.0/core/v1#VolumeMount, but some.

  spec:
    server:
      storage:
        - mountPath: /etc/config      # VolumeMount field
          readOnly: true                     # VolumeMount field
          type: ConfigMap                 # Type discriminator
          configMap:                          # VolumeSource (all native K8s fields)
            ..

Comment thread new_crd_wip.yaml Outdated
Comment thread new_crd_wip.yaml Outdated
networking:
...
---
apiVersion: mcp.x-k8s.io/v1alpha2
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this line on, full spec

Comment thread new_crd_wip.yaml Outdated
Comment thread new_crd_wip.yaml
# TODO: which one do you think we need from these VolumeMount fields?
# IMO, we should only have `readOnly` as the other fields are more advanced features.
# Alternatively, we can a copy of the https://pkg.go.dev/k8s.io/api@v0.35.0/core/v1#VolumeMount and
# make it a sub-field.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your comment.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would it make sense for an end-user to set read only to false?

Could you reasonably default to readOnly: true on everything but emptyDir and persistentVolumeClaim which I suspect would only be used in cases where the MCP server needs to write data somewhere?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be a reasonable starting point I think.

Comment thread new_crd_wip.yaml
liveness: { httpGet: { path: /healthz, port: 8080 } }
networking:
serviceType: ClusterIP

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also a labeling/annotation section?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could probably also be where resource references go for the BYO use case in the future

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just writing networking and others as examples for having a feeling about future subsections in the spec. When the need for labeling/annotating comes, we can create a field for it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  runtime:
    metadata:
      labels:
        cost-center: ml-team
      annotations:
        prometheus.io/scrape: "true"

I think when needed is there to above works well with the "runtime" section from this proposal

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think @jaideepr97 is right that once we really need it - instead of bloating up ours e.g. with runtime.metadata/annotation it might be worth to check the deploymentRef idea...

@matzew
Copy link
Copy Markdown
Member

matzew commented Mar 11, 2026

/assign @matzew

Signed-off-by: Ali Ok <aliok@redhat.com>
Comment thread new_crd_wip.yaml
name: example
spec:
source:
type: ContainerImage
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the source grouping and the possibilty for doing type in the future.

I think for now we anyways just want to do image? So I'd just do that.

For future, I think we can revisit. However, what I do not like is that the containerImage is not nested in there. Not sure that type is really a good choice. Perhaps something like actual type / strcut might be better

source:
  container:
    image:

that way we do not end up confusing string-based APIs (type: my text) and actual fieds?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this reads more natural/fluent

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spec:
  source:
    type: ContainerImage
    containerImage:
      ref: ghcr.io/example/server:latest

to

spec:
  source:
    containerImage:
      ref: ghcr.io/example/server:latest

So, get rid of type then? And, rename some fields?

Renaming fields is ok overall.
Do we want to remove the type?

I want to hear other people's opinion as well!

Copy link
Copy Markdown
Member

@matzew matzew Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, get rid of type then? And, rename some fields?

yep - because the type is just text, while something like an actual field (e.g. container) gives concrete and type-safe hints:

spec:
  source:
    container:
      image: ghcr.io/example/server:latest

(no need to also say containerImage, since that is now obvious)

and than if we want, we can later have other things... like npm? (just making things up)

spec:
  source:
    npm:
      repo: something

That way image can really only be applied when inside the "container api"

So I'd instead of type: container introduce an actual type/strcut for better grouping

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting from a API reviewer perspective, the type field is useful for version skew situations between the API version and the controller version.

For example, if you are going to support newer versions of the API being used with some skew of the controller - or use cases have more than just your controller reading the API and taking some action - adding a new optional field that can be set instead of say container will mean that the older client does not know how to reconcile that (that field didn't previously exist). Using the discriminated union with type: Container means that the older client can explicitly encode some default behavior if it sees type is set to a value it doesn't know how to handle.

In practice, this may not really be an issue for you folks so you just need to weigh the use cases and what kind of skew you do or do not support here.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I forgot to mention that up until more recently it made enforcing "at most one of" semantics a bit easier.

Instead of checking that every other field isn't set when another field is you can just write a check that is equivalent to type == 'Container' ? has(container) : !has(container) to prevent container from being set if type isn't set to Container.

There was recently the addition of the kubebuilder:validation:AtMostOneOf marker to controller-gen that makes this semantic a bit easier to enforce now though.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AtMostOneOf is a good idea for enfocring this.

Let's do it! 👍

Comment thread new_crd_wip.yaml
Comment on lines +200 to +201
type: PVC
persistentVolumeClaim:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think similar to above the persistentVolumeClaim is already telling this is a PVC, no?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, whatever we decide above, let's also apply that approach here too!

@matzew
Copy link
Copy Markdown
Member

matzew commented Mar 12, 2026

Overall - with some nits - I like the approach. This CRD is focused and opinionated, and serves a good base for the 80% case.

I think with something like deploymentRef on the future, stuff like node affinity etc could be just placed on those resources, instead of more bloat onto the CRD.

But overall good paper prototype, @aliok 👍

Comment thread new_crd_wip.yaml
runtime:

security:
serviceAccountName: example-service-account
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to allow setting a service account name here? Why would a user want to be able to configure this?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For something like the kubernetes mcp server the serviceaccount it runs under will influence the access it has to the kube api.
It is probably a more fringe case, but would be nice to support that case.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you deterministically defaulted this so that end users don't have to create the service account first? What is the expected behavior if this field isn't provided?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the field is not provided then it won't be set on the deployment and the pods will run under the default serviceaccount for that namespace.

Comment thread new_crd_wip.yaml
Comment on lines +227 to +230
# pod-level security context that applies to the entire pod (as opposed to container-level security context)
podSecurityContext:
runAsNonRoot: true
# ... other security context fields from https://pkg.go.dev/k8s.io/api@v0.35.2/core/v1#SecurityContext
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the pod going to have more than a single container for the MCP server?

When might a user want to be able to configure the overall pod security context over you defaulting it to the most secure security context you can?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale for adding that was that we may want a way to specify fsGroup for managing ownership/permissions on mounted volumes (like configmaps and secrets) which is not available in the container level securitycontext struct

Copy link
Copy Markdown
Member

@matzew matzew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@matzew: changing LGTM is restricted to collaborators

Details

In response to this:

/lgtm
/approve

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: aliok, matzew
Once this PR has been reviewed and has the lgtm label, please assign mrunalp for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@aliok
Copy link
Copy Markdown
Member Author

aliok commented Mar 16, 2026

/close

Closing as the real impl is here: #40

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@aliok: Closed this PR.

Details

In response to this:

/close

Closing as the real impl is here: #40

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@aliok aliok deleted the 2026-03-10-crd-logical-grouping branch March 26, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants