Skip to content

Fix Flaky results on Library and Gateway controller test#2039

Open
fjglira wants to merge 2 commits into
istio-ecosystem:mainfrom
fjglira:fix-flay-e2e
Open

Fix Flaky results on Library and Gateway controller test#2039
fjglira wants to merge 2 commits into
istio-ecosystem:mainfrom
fjglira:fix-flay-e2e

Conversation

@fjglira

@fjglira fjglira commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

We are getting flaky results on this test and adding Eventually wrapper can help us to avoid this situations

What type of PR is this?

  • Enhancement / New Feature
  • Bug Fix
  • Refactor
  • Optimization
  • Test
  • Documentation Update

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #2038

Related Issue/PR #

Additional information:

We are getting flaky results on this test and adding Eventually wrapper can help us to avoid this situations

Signed-off-by: Francisco Herrera <fjglira@gmail.com>
@fjglira fjglira requested a review from a team as a code owner June 26, 2026 16:43
@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.32%. Comparing base (5d72f81) to head (a40edbd).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2039      +/-   ##
==========================================
- Coverage   76.57%   76.32%   -0.26%     
==========================================
  Files          58       58              
  Lines        3181     3181              
==========================================
- Hits         2436     2428       -8     
- Misses        608      613       +5     
- Partials      137      140       +3     
Flag Coverage Δ
integration-tests 69.95% <ø> (-0.34%) ⬇️
unit-tests 52.56% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

g.Expect(ready).To(BeTrue(), "curl-client pod is not ready")
}).Should(Succeed())
Success("All pods in gateway namespace are ready")
Success("Curl client pod is ready")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to check if this work, because now we are only validating that curl pod is running, but the previous step validates that all the gateway deploy are Ready

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses flaky E2E failures in the Sail Operator test suite by making readiness/drift assertions more resilient to eventual consistency and background reconciliation, specifically in the Library reconciliation and Gateway controller E2E tests.

Changes:

  • Wrap the ValidatingWebhookConfiguration drift mutation in an Eventually block to tolerate concurrent reconciler updates.
  • Narrow the Gateway controller “pod ready” check to the specific curl-client pod rather than requiring all pods in the gateway namespace to be ready.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
tests/e2e/library/library_reconcile_test.go Retries webhook drift mutation to reduce flakes from concurrent reconciliation.
tests/e2e/gatewaycontroller/gateway_controller_test.go Makes the curl-client readiness check more targeted to avoid unrelated pod readiness causing flakes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +279 to +285
// The background reconciler may update the webhook concurrently; retry on conflict.
Eventually(func(g Gomega) {
webhook := &admissionv1.ValidatingWebhookConfiguration{}
g.Expect(cl.Get(ctx, webhookKey, webhook)).To(Succeed())
webhook.Webhooks[0].Name = "xyz.xyz.xyz"
webhook.Webhooks[0].FailurePolicy = ptr.Of(admissionv1.Fail)
g.Expect(cl.Update(ctx, webhook)).To(Succeed())
@MaxBab

MaxBab commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator

@fjglira

Looks ok to me.
I've added hold label in case the PR would not get merged after my approval, in case you would like to make more changes.
If, not just remove the hold label.

Signed-off-by: Francisco Herrera <fjglira@gmail.com>
@fjglira

fjglira commented Jun 29, 2026

Copy link
Copy Markdown
Contributor Author

I had to do the change manually

@istio-testing

Copy link
Copy Markdown
Collaborator

@fjglira: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
e2e-kind_sail-operator_main a40edbd link true /test e2e-kind
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix TestLibrary and Gateway test flaky results

4 participants