
[CLI] Add --workers=<n|auto> flag to configure worker thread count#3504

Merged
mho22 merged 3 commits into trunk from add-workers-cli-flag
Apr 22, 2026

Conversation

@pento
Member

@pento pento commented Apr 17, 2026

Motivation for the change, related issues

The Playground CLI hardcodes the number of request-handling worker threads to 6 (chosen to match HTTP/1.1's per-origin browser connection limit). That's fine for a single browser, but it caps multi-client workloads — parallel Playwright contexts, load tests, fan-out e2e suites — at six in-flight requests, which is too few on modern hosts.

The old --experimental-multi-worker flag that let users override this was deprecated; its value is now silently ignored, and the pre-hardcoding default of cpus - 1 was dropped along with it. There's no way to turn the dial back up.

Implementation details

Adds a new --workers=<n|auto> flag:

  • A positive integer sets an explicit worker count.
  • auto uses max(1, os.cpus().length - 1) — restores the old pre-hardcoding default.
  • When the flag is absent, the default is min(6, max(1, cpus - 1)): preserves today's 6-worker behavior on machines with ≥7 cores, shrinks on small hosts so we don't spawn more workers than cores. The Math.max(1, ...) guard covers single-core hosts and restricted environments where os.cpus() returns an empty array.
  • Validation happens in a yargs coerce handler: it rejects 0, negatives, non-integers, and any non-auto string with a clear error message, routed through the existing .fail() handler.

The resolver is a small exported pure function (resolveWorkerCount) so the min/max/cpu logic is unit-testable without spinning up a server.
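A rough sketch of what that resolver might look like (illustrative only; the real implementation lives in packages/playground/cli/src/run-cli.ts, and the parameter names here are assumptions):

```typescript
import * as os from "node:os";

// Illustrative sketch of the resolver described above; not the CLI's
// actual code. The cpuCount parameter is injectable so the
// min/max/empty-array edges are unit-testable without stubbing os.cpus().
function resolveWorkerCount(
    flagValue: string | number | undefined,
    cpuCount: number = os.cpus().length
): number {
    // Guard for single-core hosts and restricted environments where
    // os.cpus() returns an empty array.
    const autoCount = Math.max(1, cpuCount - 1);
    if (flagValue === undefined) {
        // Flag absent: keep the 6-worker behavior on big hosts,
        // shrink on small ones.
        return Math.min(6, autoCount);
    }
    if (flagValue === "auto") {
        return autoCount;
    }
    const n = Number(flagValue);
    if (!Number.isInteger(n) || n < 1) {
        throw new Error(
            `--workers must be a positive integer or "auto", got "${flagValue}"`
        );
    }
    return n;
}
```

In the real CLI the thrown error would be surfaced through yargs' .fail() handler rather than escaping to the caller.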

--experimental-multi-worker stays parsed-but-ignored — existing invocations don't break — but now emits a one-line logger.warn pointing users at --workers.

Docs updated in both packages/playground/cli/README.md and the Docusaurus CLI reference page.

Breaking changes

None. The default worker count is unchanged on typical dev machines (≥7 cores → still 6). On hosts with 1–6 cores the default drops to cpus - 1, which is a behavior change on small hosts but strictly safer (we were previously spawning more workers than cores there).

Testing Instructions (or ideally a Blueprint)

Unit tests (fast, no server spawn):

npx nx run playground-cli:test-playground-cli --testNamePattern="resolveWorkerCount|worker count"

All 14 new tests should pass: default, explicit number, --workers=1 (single-worker bootstrap), --workers=auto, invalid 0, invalid abc, deprecation warning for --experimental-multi-worker, plus 7 unit tests stubbing os.cpus() to cover the min/max/empty-array edges.

Manual smoke:

# Explicit count — server prints "(12 workers)"
npx nx dev playground-cli server --workers=12

# auto — uses cpus-1
npx nx dev playground-cli server --workers=auto

# default — min(6, cpus-1)
npx nx dev playground-cli server

# Deprecated flag still works, now warns
npx nx dev playground-cli server --experimental-multi-worker=4

# Validation
npx nx dev playground-cli server --workers=0      # exits 1 with clear error
npx nx dev playground-cli server --workers=abc    # exits 1 with clear error

Multi-client verification (the motivating use case):

# Start with more than 6 workers
npx nx dev playground-cli server --workers=16 &

# Fire >6 concurrent requests; throughput should scale past the old 6-worker ceiling
seq 1 32 | xargs -P32 -I{} curl -s -o /dev/null -w "%{time_total}\n" http://127.0.0.1:9400/

The worker count was previously hardcoded to 6, capping multi-client
workloads (e2e suites, load testing) at six in-flight requests. The
deprecated --experimental-multi-worker flag accepted a number but its
value was silently discarded.

Add a new --workers flag that accepts a positive integer or "auto"
(= max(1, cpus-1)). The default is min(6, cpus-1), preserving today's
behavior on typical dev machines and shrinking on small hosts.
--experimental-multi-worker still parses but now emits a deprecation
warning pointing at --workers.
@pento pento requested review from a team, Copilot and mho22 April 17, 2026 05:49
Contributor

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new --workers=<n|auto> CLI flag to control the number of request-handling worker threads, restoring an “auto” mode and keeping the existing 6-worker default behavior on larger hosts.

Changes:

  • Introduces --workers parsing/validation in the CLI and a pure resolveWorkerCount() helper.
  • Adds unit/integration-style tests covering defaults, auto, explicit counts, invalid values, and the deprecated flag warning.
  • Updates CLI documentation to describe --workers and deprecate --experimental-multi-worker.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Files reviewed:

  • packages/playground/cli/src/run-cli.ts: adds the --workers option, the deprecation warning, and the resolveWorkerCount() implementation
  • packages/playground/cli/tests/run-cli.spec.ts: adds coverage for the worker-resolution logic and CLI flag behavior
  • packages/playground/cli/README.md: documents the new --workers flag and deprecates --experimental-multi-worker
  • packages/docs/site/docs/developers/05-local-development/04-wp-playground-cli.md: updates the Docusaurus CLI reference with --workers and a deprecation note


@mho22 mho22 self-assigned this Apr 21, 2026
@mho22
Collaborator

mho22 commented Apr 21, 2026

Thanks @pento for this great pull request.

There's still a test that won't stop failing but I am pretty sure this isn't related to this PR. So we're almost good!

I discussed with @brandonpayton and we agreed on warning users when the worker count falls below the minimum certain blueprints need to work. Something like:

```js
const targetWorkerCount = resolveWorkerCount(args.workers);

if (targetWorkerCount < 6) {
    logger.warn(
        `Worker count is below the safe threshold (6). ` +
            `This may increase the likelihood of deadlock ` +
            `due to workers blocking on file locks.`
    );
}
```

What do you think?

@pento
Member Author

pento commented Apr 21, 2026

Thanks for the feedback, @mho22!

The failing test suite seemed to be the same as was failing in other PRs when I created this one, so I didn't investigate. trunk looks to be passing now, so I've merged that to see if it helps. If it's still failing, I'll investigate further.

I think a warning could be useful, though I'm leaning towards only showing it if the user has deliberately set the worker count to less than 6. If the count has been restricted by the number of cores available, there's not really much they can do, making it a warning that can't be actioned.

Some blueprints deadlock when fewer than 6 workers are available,
because workers block on file locks. Surface two distinct warnings so
users can tell a config mistake from a hardware ceiling:

- Explicit --workers=<n|auto> resolving below 6 points them at the
  flag they just set.
- Default path reduced below 6 by cpus-1 names the CPU count so the
  lever is "bigger host," not "different flag."
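The two warning paths described in that commit message could be sketched as follows (the function name, parameters, and `warn` callback are hypothetical; only the threshold and the explicit-vs-default distinction come from the discussion above):

```typescript
// Hypothetical sketch of the two distinct warnings; not the PR's code.
const SAFE_WORKER_THRESHOLD = 6;

function warnIfBelowThreshold(
    resolved: number,
    explicitlyRequested: boolean,
    cpuCount: number,
    warn: (message: string) => void
): void {
    if (resolved >= SAFE_WORKER_THRESHOLD) {
        return;
    }
    if (explicitlyRequested) {
        // The user set --workers themselves: point at the flag they set.
        warn(
            `--workers resolved to ${resolved}, below the safe threshold ` +
                `(${SAFE_WORKER_THRESHOLD}). Some blueprints may deadlock ` +
                `due to workers blocking on file locks.`
        );
    } else {
        // The default was clamped by the CPU count: the only lever is a
        // bigger host, so name the CPU count instead of the flag.
        warn(
            `Only ${cpuCount} CPUs available, so the default worker count ` +
                `was reduced to ${resolved}, below the safe threshold ` +
                `(${SAFE_WORKER_THRESHOLD}). Some blueprints may deadlock ` +
                `due to workers blocking on file locks.`
        );
    }
}
```

Keeping the two messages distinct lets users tell a config mistake apart from a hardware ceiling at a glance.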
@pento
Member Author

pento commented Apr 21, 2026

I've investigated the error further. The macos worker has 3 cores, so this PR creates 2 workers: this is triggering the issue that #3494 addresses. I've been able to reproduce the error locally, and confirm that applying #3494 fixes it.

@chubes4

chubes4 commented Apr 22, 2026

#3494 also fixes: Automattic/studio#3116

@mho22
Collaborator

mho22 commented Apr 22, 2026

@pento Brilliant. Thanks for the contribution.

@mho22 mho22 merged commit 6011949 into trunk Apr 22, 2026
46 of 47 checks passed
@mho22 mho22 deleted the add-workers-cli-flag branch April 22, 2026 06:48
mho22 pushed a commit that referenced this pull request Apr 23, 2026
## Motivation for the change, related issues

Follow-up to #3504.

`test-playground-cli (macos-latest)` started failing consistently after
#3504 merged. Every failing run showed the same five timeouts in
`packages/playground/cli/tests/file-locking.spec.ts`:

- `PHP flock() > should grant multiple shared locks on a file`
- `PHP flock() > should release a shared lock when its associated file
descriptor is closed`
- `PHP flock() > should release an exclusive lock when its associated
file descriptor is closed`
- `PHP flock() > should release a shared lock when the owning process
exits`
- `PHP flock() > should release an exclusive lock when the owning
process exits`

### Root cause

The `multi shared locks` test at `file-locking.spec.ts:1090` fires three
concurrent PHP requests with `Promise.all`. Each script busy-waits on a
coordination file that only the *next* script in the chain can write, so
all three must run concurrently.

Before #3504 the CLI hardcoded 6 workers, which was always enough. After
#3504 the default dropped to `min(6, max(1, cpus - 1))`. GitHub's
`macos-latest` runners report 3 CPUs, resolving to **2 workers** — so
PHP3 is permanently queued and the test deadlocks.
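The clamp arithmetic in that analysis can be checked in a few lines (a sketch mirroring the default described above, not the CLI's code):

```typescript
// Mirrors the default clamp described above: min(6, max(1, cpus - 1)).
function defaultWorkerCount(cpus: number): number {
    return Math.min(6, Math.max(1, cpus - 1));
}

// GitHub's macos-latest runners report 3 CPUs:
console.log(defaultWorkerCount(3)); // 2 workers, one short of the 3 the test needs
```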

### Why four unrelated tests also fail

The `runCLI` server is created once in `beforeAll` and shared across
every test in the `describe`. When the multi-shared test times out, its
two workers are still spinning inside PHP-level `while
(file_get_contents(...) \!== ...)` loops — vitest can abort the JS
`await`, but there's no hook to unblock PHP. The four two-script tests
that follow multi-shared in file order each do
`Promise.all([fetchScript(php1), fetchScript(php2)])`, inherit a pool
with zero free workers, and time out at 60 s each.

Evidence this is the right model:
- The two-script tests that run *before* multi-shared pass
(deny-exclusive-with-shared, deny-shared-with-exclusive). The two-script
tests *after* fail. Failures aren't interleaved among the earlier ones.
- Every failure is a 60 s timeout, never an assertion error. Independent
flock races would surface as `expect(lock_acquired).toBe(false)`-style
failures.
- A 200-run audit of `test-playground-cli (macos-latest)` across all
branches shows no consistent failure pattern before #3504 merged.

## Implementation details

Pass `workers: 3` to `runCLI()` in the suite's `beforeAll`. Three is the
minimum the multi-shared test needs; anything lower re-introduces the
cascade. Explicit integer values bypass the `min(6, cpus-1)` default
clamp (confirmed by the `honors an explicit --workers=3` test at
`run-cli.spec.ts:1886`), so the suite is now robust to host CPU count.

No production code changes. Only the test file is touched.

## Testing Instructions (or ideally a Blueprint)

```bash
# Passes on macOS-like hosts with 3 CPUs:
npx nx test-playground-cli playground-cli --testFile=file-locking.spec.ts

# To reproduce the original failure, temporarily change workers: 3 to workers: 2:
# expect 5 timeouts in the PHP flock() describe.
```

On CI, the `test-playground-cli (macos-latest)` job should turn green.
Ubuntu and Windows runners were already passing (their default worker
count was already ≥3) and should remain so.

## AI disclosure

Per [WordPress AI
Guidelines](https://make.wordpress.org/ai/handbook/ai-guidelines/):

**AI assistance:** Yes
**Tool:** Claude Code (Claude Opus 4.7)
**Used for:** Root-cause analysis of the worker-starvation cascade,
drafting the fix and this PR description. All code was reviewed and
tested by the author.
