Log total nodes in the scale-up by pmendelski · Pull Request #9347 · kubernetes/autoscaler

pmendelski · 2026-03-11T13:57:43Z

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

One of the most important logs in this component is the one reporting the final scale-up decision. At the moment it is usually a detailed multiline entry that lists how the scale-up is split across node groups, with each group's actual and target size. That is very helpful when analyzing a single scale-up decision, but too detailed when analyzing a sequence of scale-up decisions over a long time interval.

I propose adding the total number of nodes added by the scale-up to this log.

Current log (for a scale-up split into 3 node groups):

Final scale-up plan: [{<NODE_GROUP_ID> SIZE->TARGET (max: MAX_SIZE)} {<NODE_GROUP_ID> SIZE->TARGET (max: MAX_SIZE)} {<NODE_GROUP_ID> SIZE->TARGET (max: MAX_SIZE)}]

Proposed log:

Final scale-up plan: <SCALE_UP_SIZE> [{<NODE_GROUP_ID> SIZE->TARGET (max: MAX_SIZE)} {<NODE_GROUP_ID> SIZE->TARGET (max: MAX_SIZE)} {<NODE_GROUP_ID> SIZE->TARGET (max: MAX_SIZE)}]
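
To illustrate what a log-side consumer has to do today, here is a minimal Go sketch that recovers the scale-up size from a plan line by parsing each group entry. The regular expression, the function name scaleUpSizeFromLine and the sample group IDs are assumptions made for this example and are not part of the autoscaler; the sketch also assumes group IDs contain no whitespace.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// groupEntry matches one "{ID SIZE->TARGET (max: MAX_SIZE)}" entry.
// Assumption: the node group ID contains no whitespace.
var groupEntry = regexp.MustCompile(`\{(\S+) (\d+)->(\d+) \(max: \d+\)\}`)

// scaleUpSizeFromLine sums TARGET-SIZE over every group entry found in line.
func scaleUpSizeFromLine(line string) (int, error) {
	total := 0
	for _, m := range groupEntry.FindAllStringSubmatch(line, -1) {
		size, err := strconv.Atoi(m[2])
		if err != nil {
			return 0, err
		}
		target, err := strconv.Atoi(m[3])
		if err != nil {
			return 0, err
		}
		total += target - size
	}
	return total, nil
}

func main() {
	// Hypothetical sample line in the current format.
	line := "Final scale-up plan: [{group-a 3->5 (max: 10)} {group-b 2->5 (max: 10)} {group-c 4->5 (max: 10)}]"
	total, err := scaleUpSizeFromLine(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(total)
}
```

Because the pattern matches only the braced group entries, it yields the same value for this sample whether or not a leading summary number is present.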

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2026-03-11T13:57:54Z

Hi @pmendelski. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot · 2026-03-11T13:58:12Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pmendelski
Once this PR has been reviewed and has the lgtm label, please assign feiskyer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

cluster-autoscaler/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

pmendelski · 2026-03-11T14:08:59Z


 	// Execute scale up.
-	klog.V(1).Infof("Final scale-up plan: %v", scaleUpInfos)
+	klog.V(1).Infof("Final scale-up plan: %v %v", totalCapacity, scaleUpInfos)


This PR was extracted from #9315 . Let's continue the discussion here @aleksandra-malinowska @jbtk

Main considerations:

potentially breaking log queries - Fair argument, but I think we should find a middle ground. Otherwise it could lead to treating our logs like an API and versioning them. In this case no information is dropped, scaleUpInfo format is unchanged, only a summary is added. I assume that tooling based on log parsing will need some tweaking from time to time, and that relying on such tooling should be avoided.

why do we need it? - Figuring out the scale-up size in a regional cluster requires doing N subtractions (target size minus current size, once per node group) and summing the results mentally. For a single scale-up that's ok, but when analyzing a long sequence of scale-ups it's suboptimal.
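
The arithmetic described above can be sketched in Go as follows. The groupPlan type and the totalScaleUp function are hypothetical stand-ins; the diff does not show how totalCapacity is computed, so this only assumes it is the sum of target size minus current size over all node groups.

```go
package main

import "fmt"

// groupPlan is a hypothetical stand-in for one node group in the scale-up plan.
type groupPlan struct {
	ID          string
	CurrentSize int
	TargetSize  int
	MaxSize     int
}

// totalScaleUp returns the number of nodes added across all groups,
// i.e. the sum of TargetSize-CurrentSize.
func totalScaleUp(plans []groupPlan) int {
	total := 0
	for _, p := range plans {
		total += p.TargetSize - p.CurrentSize
	}
	return total
}

func main() {
	// Hypothetical regional scale-up split across three zonal node groups.
	plans := []groupPlan{
		{ID: "zone-a", CurrentSize: 3, TargetSize: 5, MaxSize: 10},
		{ID: "zone-b", CurrentSize: 2, TargetSize: 5, MaxSize: 10},
		{ID: "zone-c", CurrentSize: 4, TargetSize: 5, MaxSize: 10},
	}
	fmt.Println(totalScaleUp(plans)) // 2 + 3 + 1 = 6
}
```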

damikag · 2026-03-11

Shall we add something like finalScaleUpSize to give more context in the log line?

jbtk · 2026-03-11

This is not a cleanup of any kind - please fix.

+1 to @damikag - it would be better to introduce some context for the number that is added here. If we are changing the log line, we could also add how many groups the scale-up is spread across (this is also something that is visible in the log line, but requires parsing).

Have you considered another approach - adding a separate log line with this information? I guess there is a trade-off between changing this log line and the performance/log-volume cost of adding another log line - have you considered this?

aleksandra-malinowska · 2026-03-11

> Fair argument, but I think we should find a middle ground. Otherwise it could lead to treating our logs like an API and versioning them.

Like I mentioned previously, this specific log line is fairly unique with regard to this. It's essentially a source-of-truth for the final CA decision, whatever happens later.

I'm not particularly attached to the rest of our logging, and I suspect it could probably use substantial improvements.

> only a summary is added

My main issue is that this new addition is in the middle of the log and has zero context: it's just an integer inserted between "scale-up plan:" and the description of the actual plan.

The former makes it potentially misleading to people who routinely query CA logs for debugging purposes, as discussed before.

The latter makes it unhelpful for people who aren't used to reading CA logs. Depending on the scale-up size, it's just a magic number that can be mistaken for something else (# of scaled-up node groups, # of considered options, # of accommodated pods). You'd have to either read the code or reverse-engineer just to find out what it means, neither of which is ideal when starting basic troubleshooting.

At least let's add a message describing what this number means (ideally something easily understood by new users, like "adding a total of %d nodes across %d groups"). Personally I'd also prefer it to be added at the end or logged separately, but with a descriptive message it's at least less likely to quietly break a debugging flow.
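
One possible shape for such a message, as a minimal Go sketch. The planEntry type, its String method and finalPlanLine are hypothetical; the wording and placement are only one reading of the suggestion above (summary at the end, plan format unchanged). In the autoscaler the resulting string would go through klog rather than fmt.

```go
package main

import "fmt"

// planEntry is a hypothetical stand-in for one node group in the plan.
type planEntry struct {
	ID          string
	CurrentSize int
	TargetSize  int
	MaxSize     int
}

// String renders the entry in the "{ID SIZE->TARGET (max: MAX_SIZE)}"
// shape used by the existing log line.
func (e planEntry) String() string {
	return fmt.Sprintf("{%s %d->%d (max: %d)}", e.ID, e.CurrentSize, e.TargetSize, e.MaxSize)
}

// finalPlanLine keeps the plan where it is today and appends a
// self-describing summary at the end of the line.
func finalPlanLine(plan []planEntry) string {
	total := 0
	for _, e := range plan {
		total += e.TargetSize - e.CurrentSize
	}
	return fmt.Sprintf("Final scale-up plan: %v, adding a total of %d nodes across %d groups",
		plan, total, len(plan))
}

func main() {
	// Hypothetical two-group plan.
	plan := []planEntry{
		{ID: "group-a", CurrentSize: 3, TargetSize: 5, MaxSize: 10},
		{ID: "group-b", CurrentSize: 2, TargetSize: 5, MaxSize: 10},
	}
	fmt.Println(finalPlanLine(plan))
}
```

Keeping the prefix and the bracketed plan in their current positions means a query that matches on "Final scale-up plan: [" would continue to match in this sketch.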

jbtk · 2026-03-12T09:46:46Z

/ok-to-test

Log total nodes in the scale-up

1ee4668

k8s-ci-robot added the size/XS (denotes a PR that changes 0-9 lines, ignoring generated files) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels Mar 11, 2026

k8s-ci-robot requested a review from aleksandra-malinowska March 11, 2026 13:58

k8s-ci-robot added the area/cluster-autoscaler label Mar 11, 2026

k8s-ci-robot requested a review from BigDarkClown March 11, 2026 13:58

k8s-ci-robot removed the do-not-merge/needs-area label Mar 11, 2026

pmendelski mentioned this pull request Mar 11, 2026

Makes GCE manager handle createInstances call of more than 1k instances #9315

Merged

k8s-ci-robot added the release-note-none label (denotes a PR that doesn't merit a release note) and removed the do-not-merge/release-note-label-needed label (indicates that a PR should not merge because it's missing one of the release note labels) Mar 11, 2026

k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) Mar 12, 2026

pmendelski closed this Apr 10, 2026

pmendelski wants to merge 1 commit into kubernetes:master from pmendelski:mendel-scaleup-log-summary

5 participants
