Skip to content

Fixes #460 - Verify VMID existence before creating VM#461

Open
jakduch wants to merge 2 commits intotheforeman:masterfrom
jakduch:fix/vmid-existence-check
Open

Fixes #460 - Verify VMID existence before creating VM#461
jakduch wants to merge 2 commits intotheforeman:masterfrom
jakduch:fix/vmid-existence-check

Conversation

@jakduch
Copy link
Copy Markdown

@jakduch jakduch commented Mar 6, 2026

Problem

When creating a new VM through Foreman, the assign_vmid method blindly trusts the VMID returned by Proxmox's /cluster/nextid endpoint without verifying that no VM with that ID already exists in the cluster.

This can lead to silent overwriting of existing VM configurations when:

  • A VM exists in Proxmox but is not managed by Foreman
  • Proxmox's /etc/pve/nextid file is stale after manual VM creation/deletion
  • A race condition occurs between concurrent VM creation requests

The existing id_valid? check only validates the VMID range (100–999999999), not whether the VMID is actually free.

Solution

Added a vmid_exists? method that checks all cluster nodes for existing VMs and containers with the given VMID. The assign_vmid method now:

  1. Gets a VMID candidate (from next_id or user input)
  2. Checks if a VM/container with that VMID already exists on any node
  3. If it exists, requests a new VMID from Proxmox and retries
  4. Repeats up to 5 times before raising an error

This ensures Foreman never creates or clones a VM with a VMID that is already in use, regardless of whether the existing VM is managed by Foreman or not.

Changes

  • assign_vmid: Added existence check loop with retry logic
  • vmid_exists?: New method that scans all cluster nodes for VMs and containers with the given VMID

Testing

  • If VMID is free → works exactly as before (no behavior change)
  • If VMID is taken → retries with a new VMID from next_id
  • If no free VMID found after 5 retries → raises a clear error
  • Both VMs and LXC containers are checked to prevent cross-type collisions

When creating a new VM, assign_vmid now checks whether the assigned
VMID already belongs to an existing VM or container on any cluster
node. If the VMID is taken, it requests a new one from Proxmox and
retries up to 5 times.

This prevents silent overwriting of unmanaged VMs when Proxmox's
/cluster/nextid returns a VMID that is already in use due to race
conditions or stale nextid state.
@jakduch jakduch force-pushed the fix/vmid-existence-check branch from f207cd1 to a98ddb7 Compare March 9, 2026 15:20
@Manisha15
Copy link
Copy Markdown
Contributor

Thanks for the contribution @jakduch . I tested it and if I create a proxmox host with vmid that already exists (say vmid of some container or another unmanaged proxmox vm), it creates foreman host but is unmanaged. Because it checks if vm exists in foreman and for existing vm it return true but still tries to create a vm ( using new_vm definition instead of create_vm where we use assign_vmid function) but then it's unmanaged. Is this the intended use case or should I test it in another test scenario?

@jakduch jakduch force-pushed the fix/vmid-existence-check branch from a98ddb7 to 5ee711b Compare March 9, 2026 15:24
@jakduch
Copy link
Copy Markdown
Author

jakduch commented Mar 9, 2026

Hi, thanks for testing!

Yes, this is the intended behavior. The PR specifically protects the create_vm path — when Foreman actively creates a new VM in Proxmox, assign_vmid now verifies the VMID isn't already taken and requests a new one if it is. This prevents the VMID collision described in #460.

The scenario you describe (existing Proxmox VM being adopted as an unmanaged Foreman host via new_vm) is a separate flow that doesn't go through create_vm/assign_vmid at all. That behavior is correct — Foreman should recognize the existing VM and associate with it rather than attempting to create a new one on top of it.

Move rescue Fog::Errors::NotFound inside the any? block so that
a NotFound from one node does not short-circuit the search across
remaining nodes.

Update node mock factories to stub servers.get and containers.get
returning nil, matching the vmid_exists? lookup during create_vm.
@jakduch jakduch force-pushed the fix/vmid-existence-check branch from 5ee711b to 8b70de5 Compare March 9, 2026 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants