HDDS-15395. Grafana dashboard for container balancer #10398
sreejasahithi
left a comment
Thanks @navinko for working on this.
"datasource": {
  "name": "dfm13nk97y9kwf"
Don’t hardcode dfm13nk97y9kwf. Use ${datasource} like the other panels. Remove the hardcoded UID from the variable default too.
Done. I also updated the DatasourceVariable spec to "default".
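A check like the one requested above can be automated. The following is a minimal sketch, assuming the dashboard JSON references datasources through objects keyed `"datasource"` with a `"name"` or `"uid"` field, as in the excerpt under review. The function name `find_hardcoded_datasources`, the `allowed` parameter, and the sample layout are hypothetical and not part of the PR.

```python
import json
import re

# Matches Grafana-style variable references such as ${datasource} or $datasource.
VARIABLE_REF = re.compile(r"^\$\{?\w+\}?$")


def find_hardcoded_datasources(node, path="$", allowed=("default",)):
    """Walk parsed dashboard JSON and return (path, value) pairs for every
    "datasource" object whose "name" or "uid" is a literal string rather
    than a variable reference. Literals listed in `allowed` are skipped
    (an assumption; adjust to the project's conventions)."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}"
            if key == "datasource" and isinstance(value, dict):
                for field in ("name", "uid"):
                    ref = value.get(field)
                    if (isinstance(ref, str)
                            and ref not in allowed
                            and not VARIABLE_REF.match(ref)):
                        hits.append((f"{child_path}.{field}", ref))
            hits.extend(find_hardcoded_datasources(value, child_path, allowed))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(
                find_hardcoded_datasources(item, f"{path}[{index}]", allowed))
    return hits


# Hypothetical, heavily simplified dashboard fragment for illustration only.
sample = json.loads("""
{"panels": [
  {"datasource": {"name": "dfm13nk97y9kwf"}},
  {"datasource": {"name": "${datasource}"}}
]}
""")

for location, value in find_hardcoded_datasources(sample):
    print(f"hardcoded datasource at {location}: {value}")
```

Running this against the real dashboard file would only require replacing `sample` with `json.load(open(...))` on the dashboard path; an empty result would suggest no literal datasource identifiers remain.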
"expr": "sum(increase(container_balancer_metrics_data_size_moved_gb[$__range]))",
"legendFormat": "Moved Data Size (GB)",
"range": true
Should this be sum(container_balancer_metrics_data_size_moved_gb_in_latest_iteration) instead, since the title of this panel is 'Size Moved (Latest)'?
Thanks for catching this!
For the latest-run stats, I was trying to derive a delta from the total data size moved metric, "container_balancer_metrics_data_size_moved_gb".
- I was applying rate together with the aggregate function sum(), which made the query invalid; fixing that required wrapping the metric in the increase function.
I then realised none of this is needed, because we already have a separate metric for the latest iteration:
"container_balancer_metrics_data_size_moved_gb_in_latest_iteration"
This is fixed now.
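The difference between the two queries can be illustrated with a small sketch. It assumes that container_balancer_metrics_data_size_moved_gb behaves as a cumulative total and that the latest-iteration metric reports only the most recent run; both are assumptions about the metrics, and the sample values are invented for illustration. `simple_increase` is a hypothetical helper that approximates PromQL `increase()` while ignoring counter resets and extrapolation.

```python
def simple_increase(samples):
    """Rough stand-in for PromQL increase(): growth of a cumulative series
    across the selected window. Ignores counter resets and the
    extrapolation Prometheus applies at the window edges."""
    if not samples:
        return 0.0
    return samples[-1] - samples[0]


# Invented samples covering three balancer iterations (12 GB, 18 GB, 11 GB).
total_moved_gb = [0.0, 12.0, 30.0, 30.0, 41.0]       # cumulative total
latest_iteration_gb = [0.0, 12.0, 18.0, 18.0, 11.0]  # most recent iteration only

# What a panel built on increase(total[$__range]) would show for this window.
print(simple_increase(total_moved_gb))   # 41.0

# What a panel built on the latest-iteration metric would show.
print(latest_iteration_gb[-1])           # 11.0
```

Under these assumptions, the increase-based query reports everything moved inside the dashboard time range, so its value changes with the selected range, while the latest-iteration metric matches a panel titled 'Size Moved (Latest)'.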
Thanks @sreejasahithi for the review. I have updated the JSON.
What changes were proposed in this pull request?
Add Grafana dashboard for container balancer
Please describe your PR in detail:
Created a Grafana dashboard to display the metrics relevant to balancer operation.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-15395
How was this patch tested?
Simulated an unbalanced cluster, triggered the balancer through the CLI, then captured dashboard screenshots and validated the panels.
The same JSON file, "Ozone - Container Balancer Metrics.json", is used in this PR.
Screenshots for reference: