[vm] Pin par_exec rayon workers to physical cores #19484

Closed
wqfish wants to merge 1 commit into main from pr19484

@wqfish wqfish commented Apr 17, 2026

When the block executor builds its parallel execution pool and the
process has at least as many physical cores available as worker threads,
pin each worker 1:1 to a distinct physical core. This prevents two
workers from landing on HT siblings of the same core, a placement that
degrades throughput on CPU-bound Block-STM workloads.

## Approach

- Detect the physical cores the process may run on by grouping the CPUs
  from `sched_getaffinity` (via `core_affinity::get_core_ids()`) by
  their `/sys/.../topology/thread_siblings_list`, keeping one
  representative per group. Linux only; other platforms return `None`.
- Pin via a `rayon` `start_handler` that calls
  `core_affinity::set_for_current`, using the worker's thread index to
  select its core.
- Skip pinning (fall back to the OS scheduler) when topology cannot be
  detected (non-Linux, sysfs unreadable, any read fails), the process
  has fewer physical cores than `num_threads`, or `num_threads == 0`.
- Per-thread pin failures are logged at `warn!` and the thread continues
  unpinned; the pool still comes up.
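
For reviewers, here is a stdlib-only sketch of the detection and fallback rules listed above. It is not the code in this PR. The function names (`parse_cpu_list`, `physical_core_representatives`, `pin_targets`) are hypothetical, and the sysfs read is abstracted behind a closure so the grouping can be exercised without touching the filesystem. Picking the lowest allowed CPU id as each group's representative is also an assumption.

```rust
use std::collections::BTreeMap;

/// Parses a kernel CPU list such as "0,4" or "0-1" into sorted, unique CPU ids.
/// Returns `None` on any malformed entry (including an empty string).
fn parse_cpu_list(s: &str) -> Option<Vec<usize>> {
    let mut cpus = Vec::new();
    for part in s.trim().split(',') {
        match part.split_once('-') {
            // Range form, e.g. "0-1".
            Some((lo, hi)) => {
                let lo: usize = lo.trim().parse().ok()?;
                let hi: usize = hi.trim().parse().ok()?;
                if lo > hi {
                    return None;
                }
                cpus.extend(lo..=hi);
            }
            // Single id, e.g. "4".
            None => cpus.push(part.trim().parse().ok()?),
        }
    }
    cpus.sort_unstable();
    cpus.dedup();
    Some(cpus)
}

/// Groups the CPUs the process may run on by their sibling set and keeps one
/// representative per group (assumption: the lowest allowed id in the group).
/// `read_siblings` is a hypothetical stand-in for reading a CPU's
/// `thread_siblings_list`; if any read or parse fails, the result is `None`.
fn physical_core_representatives<F>(allowed: &[usize], read_siblings: F) -> Option<Vec<usize>>
where
    F: Fn(usize) -> Option<String>,
{
    let mut groups: BTreeMap<Vec<usize>, usize> = BTreeMap::new();
    for &cpu in allowed {
        let siblings = parse_cpu_list(&read_siblings(cpu)?)?;
        let rep = groups.entry(siblings).or_insert(cpu);
        *rep = (*rep).min(cpu);
    }
    let mut reps: Vec<usize> = groups.into_values().collect();
    reps.sort_unstable();
    Some(reps)
}

/// Returns the core for each worker, indexed by thread index, or `None` when
/// pinning should be skipped: topology unknown, zero threads, or fewer
/// physical cores than workers.
fn pin_targets(physical: Option<Vec<usize>>, num_threads: usize) -> Option<Vec<usize>> {
    let cores = physical?;
    if num_threads == 0 || cores.len() < num_threads {
        return None;
    }
    Some(cores.into_iter().take(num_threads).collect())
}
```

In a real pool, the vector returned by `pin_targets` would be captured by the `start_handler` closure, which would look up the entry for its thread index and pass it to `core_affinity::set_for_current`. A `None` result would mean building the pool without a pinning handler.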

## Robustness

- `core_affinity::get_core_ids` respects the process's affinity mask, so
  this composes correctly with cgroups, containers, and `taskset`.
- Unit tests cover the sibling-grouping logic (HT on/off, restricted
  affinity set, read-failure path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@wqfish wqfish closed this Apr 24, 2026