
fix: do not review - fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet#8809

Draft
SriHarsha001 wants to merge 2 commits into main from sharsha/e2eHTTPSProxyPrivateDNS

Conversation

@SriHarsha001

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Copilot AI review requested due to automatic review settings July 1, 2026 19:01
@SriHarsha001 changed the title from "fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet" to "DO NOT REVIEW - fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet" on Jul 1, 2026

Copilot AI left a comment

Pull request overview

This PR updates the e2e VMSS provisioning harness to mitigate a known transient Linux CSE failure where the outbound connectivity preflight exits with ERR_OUTBOUND_CONN_FAIL (exit code 50). The approach is to detect that specific failure mode in the Azure VMExtensionProvisioningError payload and recreate the VMSS a bounded number of times to reduce PR-gate flakes.

Changes:

  • Add a bounded recreate loop in ConfigureAndCreateVMSS that retries VMSS creation when the failure is classified as transient exit-50.
  • Introduce cseExitCodeOutboundConnFail = "50" and a helper classifier isTransientOutboundCSEFailure(err).
  • Add unit tests to pin the exit code constant and validate the classifier behavior.
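The classifier itself is not shown in this review, so the following is only a minimal sketch of how such a check could look. It assumes the failure surfaces as an error whose text contains `VMExtensionProvisioningError` and an exit code field shaped like `"ExitCode": "50"`; that payload shape, the `exitCodePattern` regular expression, and the `provisioningError` stand-in type are all hypothetical. Only `cseExitCodeOutboundConnFail` and `isTransientOutboundCSEFailure` are names taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// cseExitCodeOutboundConnFail mirrors the constant named in the PR overview.
const cseExitCodeOutboundConnFail = "50"

// exitCodePattern encodes an ASSUMED payload shape such as `"ExitCode": "50"`.
// The real CSE error text may be formatted differently.
var exitCodePattern = regexp.MustCompile(`"?ExitCode"?\s*[:=]\s*"?(\d+)"?`)

// provisioningError is a hypothetical stand-in for the Azure SDK error type.
type provisioningError struct {
	Code    string
	Message string
}

func (e provisioningError) Error() string {
	return fmt.Sprintf("%s: %s", e.Code, e.Message)
}

// isTransientOutboundCSEFailure reports whether err looks like a VM extension
// provisioning failure whose exit code is exactly 50.
func isTransientOutboundCSEFailure(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	if !strings.Contains(msg, "VMExtensionProvisioningError") {
		return false
	}
	m := exitCodePattern.FindStringSubmatch(msg)
	// Compare the whole captured number so that "500" or "150" never match.
	return m != nil && m[1] == cseExitCodeOutboundConnFail
}

func main() {
	flaky := provisioningError{
		Code:    "VMExtensionProvisioningError",
		Message: `{"ExitCode": "50", "Error": "outbound check failed"}`,
	}
	wrapped := fmt.Errorf("create VMSS: %w", flaky)

	fmt.Println(isTransientOutboundCSEFailure(wrapped))
	fmt.Println(isTransientOutboundCSEFailure(errors.New("quota exceeded")))
}
```

Matching the captured digits exactly, rather than searching for the substring `50`, keeps unrelated exit codes from being classified as the transient flake, which matters because a false positive here would hide a real regression behind a retry.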

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| e2e/vmss.go | Adds bounded recreate-on-exit-50 logic, plus helper functions for classification and synchronous VMSS deletion. |
| e2e/vmss_test.go | Adds tests for the exit code constant and the transient-failure classifier. |
| e2e/const.go | Introduces a named constant for the CSE outbound connectivity failure exit code (50). |

Comment thread e2e/vmss.go
Comment on lines 83 to +101
 func ConfigureAndCreateVMSS(ctx context.Context, s *Scenario) (*ScenarioVM, error) {
-	vm, err := CreateVMSSWithRetry(ctx, s)
+	var vm *ScenarioVM
+	var err error
+	for attempt := 0; ; attempt++ {
+		vm, err = CreateVMSSWithRetry(ctx, s)
+		if err == nil {
+			break
+		}
+		// Known transient e2e-infra flake: the CSE outbound connectivity preflight check
+		// (curl mcr.microsoft.com, optionally via the e2e proxy) intermittently fails all
+		// retries and exits ERR_OUTBOUND_CONN_FAIL (50) before kubelet starts. Recreate the
+		// node a bounded number of times to reduce PR-gate noise without masking real
+		// regressions, which fail consistently and survive the retry budget.
+		if attempt >= maxOutboundCSERetries || s.IsWindows() || config.Config.KeepVMSS || !isTransientOutboundCSEFailure(err) {
+			break
+		}
+		toolkit.Logf(ctx, "CSE failed with ERR_OUTBOUND_CONN_FAIL (exit %s) on VMSS %q: known transient e2e outbound flake, recreating node (attempt %d/%d)", cseExitCodeOutboundConnFail, s.Runtime.VMSSName, attempt+1, maxOutboundCSERetries)
+		deleteVMSSAndWait(ctx, s)
+	}
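The control flow of the hunk above can be exercised in isolation. The sketch below is not the PR's code: `createWithRecreate`, `maxRecreates`, `errTransient`, `errPermanent`, and `isTransient` are hypothetical names, and the create and cleanup steps are plain callbacks standing in for `CreateVMSSWithRetry` and `deleteVMSSAndWait`. It keeps the same loop shape: attempt, stop on success, stop when the budget is spent or the error is not retryable, otherwise clean up and try again.

```go
package main

import (
	"errors"
	"fmt"
)

// maxRecreates is a hypothetical budget; the PR's real value is not shown.
const maxRecreates = 2

var (
	errTransient = errors.New("transient outbound failure")
	errPermanent = errors.New("permanent failure")
)

// isTransient is a stand-in classifier for this sketch only.
func isTransient(err error) bool {
	return errors.Is(err, errTransient)
}

// createWithRecreate calls create until it succeeds, the error is not
// retryable, or the recreate budget is spent. cleanup runs before each retry.
// It returns the number of create calls made and the last error.
func createWithRecreate(create func() error, retryable func(error) bool, cleanup func()) (attempts int, err error) {
	for attempt := 0; ; attempt++ {
		err = create()
		attempts = attempt + 1
		if err == nil || attempt >= maxRecreates || !retryable(err) {
			return attempts, err
		}
		cleanup()
	}
}

func main() {
	calls := 0
	create := func() error {
		calls++
		if calls < 3 {
			return errTransient // first two attempts hit the flake
		}
		return nil
	}
	cleanup := func() { fmt.Println("deleting failed node before retry") }

	attempts, err := createWithRecreate(create, isTransient, cleanup)
	fmt.Println(attempts, err)
}
```

With this loop shape the budget counts recreates rather than total attempts, so a budget of N allows up to N+1 create calls. A consistently failing scenario therefore still fails, just after N extra recreations.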
Comment thread e2e/vmss.go Outdated
@SriHarsha001 changed the title from "DO NOT REVIEW - fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet" to "fix: DO NOT REVIEW - fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet" on Jul 1, 2026
@SriHarsha001 changed the title from "fix: DO NOT REVIEW - fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet" to "fix: do not review - fix Ubuntu2204 HTTPSProxy PrivateDNS CSE exit50 kubelet" on Jul 1, 2026