Add monitoring controller for metrics #1959
Conversation
|
Hi @mayleighnmyers. Thanks for your PR. I'm waiting for a istio-ecosystem or istio member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Hi @mayleighnmyers, How would users control this if they wanted to opt in or out? What will the API be for this? You can either open an issue and we can discuss it there or better yet create a SEP for this. We need to discuss how this will integrate with the operator before we can move forward with this. |
So this is still in progress. Good call for the opt in/out. Something we can consider is adding a field either in the Istio CR or the istioRevision CR to allow users to opt in since right now it is currently always enabled. Yuanlin and I will be talking about this more in some future meeting when we have all ducks in a row. :) |
| 1. As a user of Istio's metrics monitoring using Prometheus components, I want to automatically configure Prometheus metrics scraping jobs for Istio generated telemetry metrics. | ||
|
|
||
| ### API Changes | ||
| We will add a boolean field `spec.monitoring.enabled` in the `Istio` and `IstioRevision` CRDs. When a user enables it, a monitoring controller will reconcile a `ServiceMonitor` CR for the mesh control plane and `PodMonitor` CRs for each namespace labeled by the Istio sidecar injection label. |
There was a problem hiding this comment.
The implementation is missing some of the relabeling rules that exist in the upstream samples. Do we anticipate users needing to customize those and does that change the API?
There was a problem hiding this comment.
When you are using COO, is this label different for every prom stack: monitored-by: coo-prometheus? Will the API need to include the ability to customize this label?
There was a problem hiding this comment.
Do we anticipate users needing to customize those and does that change the API?
No, we will add a method about appending those relabeling spec as default value in the following commits.
When you are using COO, is this label different for every prom stack: monitored-by: coo-prometheus?
Yes, that label value depends on the prom stack name created by user.
| We will add a boolean field `spec.monitoring.enabled` in the `Istio` and `IstioRevision` CRDs. When a user enables it, a monitoring controller will reconcile a `ServiceMonitor` CR for the mesh control plane and `PodMonitor` CRs for each namespace labeled by the Istio sidecar injection label. | ||
|
|
||
| ### Architecture | ||
| We assume the related `ServiceMonitor` and `PodMonitor` CRDs and the API groups such as "monitoring.coreos.com/v1" , "monitoring.rhobs/v1" are available. We will add required RBAC permissions for creating, deleting and updating those two resources in the Sail Operator ClusterRole. |
There was a problem hiding this comment.
We can't assume this since the Sail Operator is supported on non-openshift clusters that may not have these APIs installed. Even on OpenShift clusters, how will the operator know whether to use monitoring.coreos.com/v1 or monitoring.rhobs/v1? Will we need to support both?
There was a problem hiding this comment.
I will add validation methods. It will query API server and validate if those two CRDs , APIs are installed.
I suggest using and supporting the monitoring.coreos.com/v1 group. The rhobs is specific from COO operator side and no difference in the two CRDs we use.
| 1. As a user of Istio's metrics monitoring using Prometheus components, I want to automatically configure Prometheus metrics scraping jobs for Istio generated telemetry metrics. | ||
|
|
||
| ### API Changes | ||
| We will add a boolean field `spec.monitoring.enabled` in the `Istio` and `IstioRevision` CRDs. When a user enables it, a monitoring controller will reconcile a `ServiceMonitor` CR for the mesh control plane and `PodMonitor` CRs for each namespace labeled by the Istio sidecar injection label. |
There was a problem hiding this comment.
I think we need to mention that it's an openshift specific requirement that we create a PodMonitor in each namespace otherwise non-openshift users would benefit from just using the NamspaceSelector on the PodMonitor. This is how the upstream sample is configured.
|
The implementation of this broadly looks fine but I think the API will need to be larger than a boolean and will require the sail operator to have additional permissions which may not be relevant for non-openshift users. With There could be a standalone Here This could even be its own deployment with its own ServiceAccount so that the Sail Operator wouldn't need to be granted additional permissions. Since a lot of the integrations may be OSSM specific, it may even make more sense to only have this available in Server side apply can ensure that any updates that the integrations controller makes to the The implementation included here would largely stay the same. The main change would be in the API to manage these integrations. |
22566a7 to
c43fabb
Compare
c43fabb to
21661f3
Compare
795caee to
3373e36
Compare
Implements a monitoring controller that creates ServiceMonitor and PodMonitor resources for IstioRevision objects. Key features: - Uses monitoring.rhobs/v1 API group for COO compatibility - Adds 'monitored-by: coo-prometheus' label for COO discovery - Creates ServiceMonitor (default-istiod) for istiod metrics - Creates PodMonitor (default-proxies) for istio-proxy metrics - Sets proper owner references for automatic cleanup Relates to: OSSM-13585 Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Add ClusterRole permissions for ServiceMonitor and PodMonitor resources for both monitoring.coreos.com and monitoring.rhobs API groups to allow the monitoring controller to create and manage these resources. Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com> Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Signed-off-by: Yuanlin Xu <yuanlin.xu@redhat.com> Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Signed-off-by: Yuanlin Xu <yuanlin.xu@redhat.com>
Signed-off-by: Yuanlin Xu <yuanlin.xu@redhat.com>
Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
The gencheck job runs make gen, which updates bundle manifests from chart CRDs. The bundle copy of sailoperator.io_istios.yaml was missing the new spec.monitoring schema added for the monitoring controller. Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com> Co-authored-by: Cursor <cursoragent@cursor.com>
25aa652 to
3c63e82
Compare
Signed-off-by: Mayleigh Tjapkes <mamyers@redhat.com>
Implements a monitoring controller that creates ServiceMonitor and PodMonitor resources for IstioRevision objects. Key features:
Relates to: OSSM-13585
What type of PR is this?
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Related Issue/PR #
Additional information: