
select_macro: Use Backoff snooze #1251

Open
pawurb wants to merge 1 commit into crossbeam-rs:master from pawurb:select-backoff

Conversation


@pawurb pawurb commented Apr 29, 2026

Hi, I'm using crossbeam channels with the select! macro in https://github.com/pawurb/hotpath-rs. I've noticed significant overhead from Thread::unpark calls when sending messages every 1μs, visible in samply traces:

[Screenshot: samply trace, 2026-04-29]

I was able to work around it by batching (see the sketch below). But I noticed that while e.g. the recv method uses a backoff spin, the backoff does not apply when the same channel type is used inside a select! macro.
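
For context, the batching workaround looks roughly like this. This is an illustrative sketch, not the actual hotpath-rs code; `BatchingSender` and `FLUSH_EVERY` are made-up names:

```rust
use crossbeam_channel::Sender;

/// Illustrative sketch of the batching workaround (not the actual
/// hotpath-rs code): buffer messages locally and send them as a Vec,
/// so the receiver is unparked once per batch instead of per message.
struct BatchingSender {
    tx: Sender<Vec<u64>>,
    buf: Vec<u64>,
}

impl BatchingSender {
    // Hypothetical batch size; tune for your message rate.
    const FLUSH_EVERY: usize = 64;

    fn new(tx: Sender<Vec<u64>>) -> Self {
        Self { tx, buf: Vec::with_capacity(Self::FLUSH_EVERY) }
    }

    fn send(&mut self, msg: u64) {
        self.buf.push(msg);
        if self.buf.len() >= Self::FLUSH_EVERY {
            self.flush();
        }
    }

    fn flush(&mut self) {
        if !self.buf.is_empty() {
            // One channel send (and at most one Thread::unpark) per batch.
            let _ = self.tx.send(std::mem::take(&mut self.buf));
        }
    }
}
```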

I used the following example to reproduce and benchmark it with hyperfine:

```rust
use std::sync::LazyLock;
use std::thread;
use std::time::{Duration, Instant};

use crossbeam_channel::{bounded, select, unbounded};

const DEFAULT_MESSAGES: u64 = 1_000_000;
const DEFAULT_SEND_INTERVAL_NS: u64 = 1000;

// Number of messages to send, overridable via MESSAGES_NUM.
static MESSAGES: LazyLock<u64> = LazyLock::new(|| {
    std::env::var("MESSAGES_NUM")
        .ok()
        .and_then(|s| s.parse().ok())
        .unwrap_or(DEFAULT_MESSAGES)
});

// Delay between sends, overridable via SEND_INTERVAL_NS.
static SEND_INTERVAL: LazyLock<Duration> = LazyLock::new(|| {
    Duration::from_nanos(
        std::env::var("SEND_INTERVAL_NS")
            .ok()
            .and_then(|s| s.parse().ok())
            .unwrap_or(DEFAULT_SEND_INTERVAL_NS),
    )
});

// Busy-wait for a precise interval (sleep is too coarse at this scale).
fn spin_for(d: Duration) {
    let start = Instant::now();
    while start.elapsed() < d {}
}

fn main() {
    let (work_tx, work_rx) = unbounded::<u64>();
    let (ctrl_tx, ctrl_rx) = bounded::<()>(1);

    // Consumer: count messages received through select! until either
    // channel signals shutdown.
    let consumer = thread::spawn(move || {
        let mut count: u64 = 0;
        loop {
            select! {
                recv(work_rx) -> msg => match msg {
                    Ok(_) => count += 1,
                    Err(_) => break,
                },
                recv(ctrl_rx) -> _ => break,
            }
        }
        count
    });

    // Producer: send one message per interval, then shut down.
    let start = Instant::now();
    for i in 0..*MESSAGES {
        work_tx.send(i).unwrap();
        spin_for(*SEND_INTERVAL);
    }
    drop(work_tx);
    let _ = ctrl_tx.send(());

    let received = consumer.join().unwrap();
    let elapsed = start.elapsed();

    println!("sent     : {}", *MESSAGES);
    println!("received : {}", received);
    println!("elapsed  : {:?}", elapsed);
    println!(
        "per-msg  : {:.2} ns",
        elapsed.as_nanos() as f64 / *MESSAGES as f64
    );
}
```

Results are similar on Linux and macOS:

| interval | base (ms)    | PR (ms)      | rel Δ  |
|---------:|-------------:|-------------:|-------:|
| 100 ns   | 21.8 ± 0.9   | 17.5 ± 3.3   | -19.7% |
| 250 ns   | 39.5 ± 1.3   | 32.6 ± 3.4   | -17.5% |
| 500 ns   | 74.3 ± 3.8   | 57.4 ± 5.1   | -22.7% |
| 750 ns   | 107.4 ± 2.3  | 84.4 ± 2.9   | -21.4% |
| 1000 ns  | 139.8 ± 4.0  | 106.5 ± 2.2  | -23.8% |
| 1500 ns  | 207.1 ± 1.3  | 157.0 ± 0.7  | -24.2% |
| 2000 ns  | 270.8 ± 1.7  | 225.3 ± 2.0  | -16.8% |
| 2500 ns  | 301.9 ± 3.1  | 293.1 ± 0.7  | -2.9%  |
| 5 µs     | 120.7 ± 9.0  | 115.6 ± 2.6  | -4.2%  |
| 10 µs    | 220.9 ± 3.5  | 229.0 ± 5.9  | +3.7%  |
| 25 µs    | 520.8 ± 4.7  | 524.4 ± 5.0  | +0.7%  |
| 50 µs    | 1022.2 ± 3.2 | 1024.2 ± 4.1 | +0.2%  |
| 100 µs   | 2025.6 ± 4.7 | 2024.9 ± 1.3 | -0.0%  |

There's a measurable ~20% improvement for send intervals up to ~2μs (I suspect that's roughly the max backoff spin duration), and no regression beyond that. Let me know if you would consider this change. I've seen other issues in this repo complaining about too much spinning, but maybe it makes sense for consistency with the receiver implementation.
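
For reference, the pattern here is the snooze-before-park loop from crossbeam_utils::Backoff. A minimal sketch of the idea, not the actual select! internals (`wait_until` and `is_ready` are stand-ins):

```rust
use crossbeam_utils::Backoff;

// Sketch of the snooze-before-park pattern: spin with exponential
// backoff first, and only fall back to parking the thread once the
// backoff reports that blocking is the better strategy.
fn wait_until(is_ready: impl Fn() -> bool) {
    let backoff = Backoff::new();
    while !is_ready() {
        if backoff.is_completed() {
            // Spinning is exhausted; block until a sender unparks us.
            std::thread::park();
        } else {
            // snooze() spins briefly at first, then yields the thread
            // as the backoff counter grows.
            backoff.snooze();
        }
    }
}
```

This avoids the park/unpark round trip entirely when a message arrives within the spin window, which would explain why the improvement tapers off once the send interval exceeds the maximum backoff duration.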

I can prepare other benchmarks, but I'm not sure which scenarios to test.
