MWPW-193327 - [infra] make nala test GH runners more stable and robust #771

Merged: afmicka merged 6 commits into main from fix_gh_runner on Apr 29, 2026

@honstar honstar commented Apr 16, 2026

Intermittent nala test failures on self-hosted runners were caused by two apt failure modes: unattended-upgrades holding dpkg/apt locks, and transient Ubuntu mirror sync errors during apt-get update.

Add a composite action (.github/actions/prep-apt) that stops background apt services, waits up to 120s for all locks to clear, then runs npx playwright install-deps with up to 3 retries (15s apart).
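The lock wait and the retry loop described above could look roughly like the sketch below. This is not the content of .github/actions/prep-apt; the function names (`is_locked`, `wait_for_locks`, `retry`), the 5s poll interval, and the lock-file paths in the usage comments are assumptions made for illustration. Only the 120s timeout, the 3 attempts, and the 15s delay come from the description.

```shell
# Hypothetical helpers; names and defaults are illustrative assumptions.

# Succeeds when some process currently has the given file open.
# Assumes fuser (psmisc) is available on the runner.
is_locked() {
  fuser "$1" >/dev/null 2>&1
}

# Wait until none of the given lock files is held.
# Usage: wait_for_locks <timeout_seconds> <poll_seconds> <lock_file>...
# Returns 0 once all locks are free, 1 if the timeout is reached first.
wait_for_locks() {
  local timeout=$1 poll=$2
  shift 2
  local waited=0 busy lock
  while :; do
    busy=0
    for lock in "$@"; do
      if is_locked "$lock"; then
        busy=1
        break
      fi
    done
    if [ "$busy" -eq 0 ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$poll"
    waited=$((waited + poll))
  done
}

# Run a command up to <attempts> times, pausing <delay> seconds between tries.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  local attempts=$1 delay=$2
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Possible use inside a composite-action step (paths are assumptions):
#   wait_for_locks 120 5 /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock /var/lib/apt/lists/lock
#   retry 3 15 npx playwright install-deps
```

Keeping the lock probe in its own function makes the wait loop easy to exercise without real apt locks, and the retry helper stays independent of what it runs, so a transient mirror error in apt-get update simply triggers another attempt.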

Apply the composite action to all three self-hosted workflows (the studio job in run-nala.yml, run-nala-daily.yml, and run-nala-milolibs.yaml), replacing the previous unconditional bare install-deps call. Also add continue-on-error: true to the Playwright cache steps in the daily and milolibs workflows, consistent with the studio job.

The GitHub-hosted docs job in run-nala.yml is unchanged — it does not share the self-hosted runner apt environment.

Resolves https://jira.corp.adobe.com/browse/MWPW-193327
QA Checklist: https://wiki.corp.adobe.com/display/adobedotcom/M@S+Engineering+QA+Use+Cases

Please do the steps below before submitting your PR for a code review or QA

  • C1. Cover code with Unit Tests
  • C2. Add a Nala test (double check with #fishbags if nala test is needed)
  • C3. Verify all Checks are green (unit tests, nala tests)
  • C4. PR description contains working Test Page link where the feature can be tested
  • C5. You are ready to do a demo from the Test Page in the PR (bonus: write a working demo script that you'll use on Thursday; you can eventually put it in your PR)
  • C6. Read your Jira ticket one more time to validate that you've addressed all ACs and nothing is missing

🧪 Nala E2E Tests

Nala tests run automatically when you open this PR.

To run Nala tests again:

  1. Add the run nala label to this PR (in the right sidebar)
  2. Tests will run automatically on the current commit
  3. Any future commits will also trigger tests as long as the label remains

To stop automatic Nala tests:

  • Remove the run nala label

Note: Tests only run on commits if the run nala label is present. Add the label whenever you need tests to run on new changes.

Test URLs:

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@honstar honstar requested a review from afmicka April 16, 2026 13:48
aem-code-sync Bot commented Apr 16, 2026

Hello, I'm the AEM Code Sync Bot and I will run some actions to deploy your branch.
In case there are problems, just click the checkbox below to rerun the respective action.

  • Re-sync branch
codecov Bot commented Apr 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.44%. Comparing base (29dc5f3) to head (d6e8aaa).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #771      +/-   ##
==========================================
- Coverage   87.46%   87.44%   -0.02%     
==========================================
  Files         210      210              
  Lines       63081    63081              
==========================================
- Hits        55172    55164       -8     
- Misses       7909     7917       +8     

see 6 files with indirect coverage changes


@aem-code-sync aem-code-sync Bot temporarily deployed to fix_gh_runner April 16, 2026 14:10 Inactive
@aem-code-sync aem-code-sync Bot temporarily deployed to fix_gh_runner April 16, 2026 16:15 Inactive
@npeltier npeltier left a comment

would you mind creating a JIRA and attaching that PR to it?

@honstar honstar changed the title from "fix(ci): harden Playwright apt install on self-hosted runners" to "MWPW-193327 - [infra] make nala test GH runners more stable and robust" on Apr 21, 2026
@honstar honstar marked this pull request as ready for review April 21, 2026 15:05
honstar commented Apr 21, 2026

> would you mind creating a JIRA and attaching that PR to it?

Done, ticket created: https://jira.corp.adobe.com/browse/MWPW-193327

@honstar honstar dismissed npeltier’s stale review April 28, 2026 08:32

Ticket has been created, dismissing review due to absence.

@afmicka afmicka merged commit ed45926 into main Apr 29, 2026
12 of 13 checks passed
@afmicka afmicka deleted the fix_gh_runner branch April 29, 2026 09:40

4 participants