Skip to content

Fix check for draining#458

Draft
jose-caballero wants to merge 2 commits into
mainfrom
fix_check_for_draining
Draft

Fix check for draining#458
jose-caballero wants to merge 2 commits into
mainfrom
fix_check_for_draining

Conversation

@jose-caballero

Copy link
Copy Markdown
Contributor

Description:

Special Notes:


Submitter:

Have you (where applicable):

  • Added unit tests?
  • Checked the latest commit runs on Dev?
  • Updated the example config file(s) and README?

Reviewer

Does this PR:

  • Place non-StackStorm code into the lib directory?
  • Have unit tests for the action/sensor and lib layers?
  • Have clear and obvious action parameter names and descriptions?

Currently, flavors for GPUs and FPGAs, or flavors with more than 60 CPUs
are prevented from migration, unconditionally.

This change adds a new boolean parameter to work like this:
* when the workflow hypervisor.drain is used, the behavior remains
* when the action server.migrate is used, users can uncheck the box and
  allow the migration even for those cases
@jose-caballero jose-caballero marked this pull request as draft June 24, 2026 12:51
@codecov

codecov Bot commented Jun 24, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 99.15%. Comparing base (509c54b) to head (00ae743).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
lib/apis/openstack_api/openstack_server.py 85.71% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #458      +/-   ##
==========================================
- Coverage   99.19%   99.15%   -0.04%     
==========================================
  Files         112      112              
  Lines        2726     2727       +1     
  Branches      338      339       +1     
==========================================
  Hits         2704     2704              
  Misses         19       19              
- Partials        3        4       +1     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jose-caballero jose-caballero force-pushed the fix_check_for_draining branch from 6ac0e72 to 911b032 Compare June 24, 2026 12:59
@jose-caballero jose-caballero force-pushed the fix_check_for_draining branch from 0ed2fee to 00ae743 Compare June 24, 2026 14:29
required: true
default: true
check_flavor:
description: "Check the Flavor of the Server to decide if migration is feasible. When set as True, GPUs, FPGAs, and Flavors with more than 60 CPUs will NOT be migrated."

@DavidFair DavidFair Jun 24, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, I would call it live_migration
Then the tick box (default) would do a live migration on CPU instances <60 cores

Unticking it would do a cold migration which can be done on anything but will shutdown while it happens

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That may be a good idea for the long term, as a separate flag.

Note that the purpose here is slightly different. We are NOT doing live migrations during the outage week in July.

This change is just to allow the migration of VMs with flavors for GPUs/FPGAs and/or more that 60 CPUs, which are currently blocked by st2. This PR allows a way to keep that check for those types of flavors in place, to avoid trying risky migrations by mistake, but at the same time having a mechanism to force the migration this time.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we need to do a "cold migration" for these types of flavors
But we'll need to do this post-outage too (e.g. HW failure) and the steps will be the same

The fact it skips the checks is an implementation detail - we just want a way to force drain these types of flavors which will incur an offline

raise ValueError(
f"Attempted to move GPU or FPGA flavor, {server.flavor.name}, which is not allowed!"
)
def can_be_migrated(server: Server, check_flavor: bool):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put a default value here as check_flavor=false to keep backwards compatible with other workflows using this function

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it was the other way around, before this PR we always did the check as it was always a live migration
Now the check makes it optional for cold migrations - existing workflows will assume a live migration instead

conn: Connection,
server_id: str,
snapshot: bool,
check_flavor: bool,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the same here check_flavor=false

conn=mock_connection,
server_id=mock_server_id,
snapshot=False,
check_flavor=True,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need this anymore if you put a default value

server_id=mock_server_id,
dest_host=dest_host,
snapshot=True,
check_flavor=True,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont need this if you put a default value

server_id=mock_server_id,
dest_host=dest_host,
snapshot=True,
check_flavor=True,

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change this to check_flavor because you're overwriting what it set as an argument

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants