fix(tests): stop leaked RabbitMQ test container on process exit#1573

Open
ytallo wants to merge 1 commit into main from fix/rabbitmq-test-container-leak
@ytallo ytallo commented Apr 29, 2026

Summary

  • The shared RabbitMQ test container was kept in a static OnceCell, and Rust does not run destructors on static items at process exit, so ContainerAsync::drop never fired — cargo test would leave a rabbitmq container running after the suite finished.
  • Capture the container id when it starts and register a libc::atexit hook that runs docker rm -f <id> synchronously on exit. libc is already a dev-dependency, so no new deps.
  • Ryuk is the usual fallback for this kind of leak but is unreliable on Docker Desktop, so we don't depend on it.
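
The approach in the summary can be sketched roughly as follows. This is a minimal, Unix-only illustration and not the PR's actual code: `atexit` is declared by hand here so the sketch needs no crates (the PR calls `libc::atexit`), and `removal_command`, `register_cleanup`, `stop_container`, and the placeholder id are hypothetical names chosen for the example.

```rust
use std::process::{Command, Stdio};
use std::sync::Mutex;

// Assumption: declared by hand to keep the sketch dependency-free.
// The PR itself goes through `libc::atexit`.
unsafe extern "C" {
    fn atexit(callback: extern "C" fn()) -> i32;
}

// Id of the shared test container, filled in once the container has started.
static CONTAINER_ID: Mutex<Option<String>> = Mutex::new(None);

/// Builds `docker rm -f <id>` with its output silenced.
fn removal_command(id: &str) -> Command {
    let mut cmd = Command::new("docker");
    cmd.args(["rm", "-f", id])
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    cmd
}

/// Exit hook: takes the stored id (so a second call is a no-op) and
/// removes the container synchronously. It must not panic.
extern "C" fn stop_container() {
    let id = match CONTAINER_ID.lock() {
        Ok(mut guard) => guard.take(),
        Err(_) => return,
    };
    if let Some(id) = id {
        let _ = removal_command(&id).status();
    }
}

/// Stores the id first, then registers the hook.
/// Returns true if `atexit` accepted the registration.
fn register_cleanup(id: &str) -> bool {
    *CONTAINER_ID.lock().unwrap() = Some(id.to_owned());
    // SAFETY: `stop_container` is a plain `extern "C"` function that never unwinds.
    unsafe { atexit(stop_container) == 0 }
}

fn main() {
    // Placeholder id; in the real helper it would come from the started container.
    // On exit the hook attempts `docker rm -f example-container-id` and ignores the result.
    let registered = register_cleanup("example-container-id");
    println!("cleanup hook registered: {registered}");
}
```

Taking the id out of the `Mutex` inside the hook keeps the cleanup idempotent if the hook were ever invoked more than once.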

Test plan

  • cargo test -p engine --test rabbitmq_queue_integration and confirm docker ps shows no leftover rabbitmq container after the run.
  • Re-run twice in a row — second run should also start cleanly without an orphaned container.
  • Ctrl-C mid-run still leaves a container: atexit handlers do not run when the process is terminated by a signal (SIGINT from Ctrl-C, or SIGKILL). Documented as acceptable for now.

Summary by CodeRabbit

  • Tests
    • Enhanced RabbitMQ test helper to ensure proper cleanup of test containers and prevent resource leaks after test execution.

The shared RabbitMQ test container was held in a static OnceCell, but
Rust does not run destructors on static items at process exit, so
ContainerAsync::drop never fired and the container was never stopped.

Record the container id and register a libc::atexit hook that runs
`docker rm -f <id>` synchronously on exit. libc is already a dev-dep,
so no new dependencies.
coderabbitai Bot commented Apr 29, 2026

Walkthrough

The RabbitMQ test helper now stores the container ID in a global Mutex<Option<String>> and registers an atexit handler that force-removes the Docker container on process exit. This is needed because the container is held in a static OnceCell, and Rust never runs destructors for statics, so the container's own Drop cleanup cannot fire.
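
The underlying language behavior can be seen in a small standalone sketch. It uses a plain `static` rather than the `OnceCell` from the helper, and `Guard`, `DROPS`, `SHARED`, and `use_local` are hypothetical names for illustration only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times `Guard::drop` has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Guard(&'static str);

impl Drop for Guard {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
        println!("dropping {}", self.0);
    }
}

// A static with a destructor: Rust never runs `drop` for it,
// not even when the process exits normally.
static SHARED: Guard = Guard("static guard");

/// A local value, by contrast, is dropped when it goes out of scope.
fn use_local() {
    let _local = Guard("local guard");
}

fn main() {
    use_local();
    println!("touching {}", SHARED.0);
    // "dropping local guard" is printed; no matching line ever appears for SHARED.
}
```

Any cleanup that depends on `Drop` therefore has to be triggered some other way when the value lives in a static, which is what the exit hook does here.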

Changes

| Cohort / File(s) | Summary |
|---|---|
| Container Cleanup Infrastructure: `engine/tests/common/rabbitmq_helpers.rs` | Added global Mutex<Option<String>> to record container ID and registered atexit handler that executes docker rm -f to prevent container leaks when test binary exits. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • andersonleal

Poem

🐰 A container that lingers, we could not abide,
So we trap the exit and let cleanup preside,
With mutex and atexit, a safety net spun,
No more Docker ghosts when the test binary's done! 🫐

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description provides detailed context (problem, solution, test plan) but lacks the Apache 2 license checkbox required by the template. | Check the Apache 2 license checkbox at the bottom of the description to confirm licensing rights before merging. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: fixing a leaked RabbitMQ test container cleanup on process exit. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@engine/tests/common/rabbitmq_helpers.rs`:
- Around line 25-35: Ensure the atexit handler is registered only after
CONTAINER_ID is set and assert that libc::atexit(stop_rabbitmq_container)
returns 0 (e.g. assert_eq!(..., 0, "failed to register RabbitMQ cleanup hook"))
so failures surface at init time; inside stop_rabbitmq_container, avoid using
eprintln! (unsafe during shutdown) — either suppress docker rm output or write
diagnostics using a low-level write to fd 2 if you must report errors, and check
the std::process::Command::status() result for non-zero and handle it (e.g. emit
minimal low-level error or ignore on success) instead of silently discarding it;
reference CONTAINER_ID, stop_rabbitmq_container, and libc::atexit when making
these changes.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5aede9a0-750b-48ff-a5c2-69390a64e56a

📥 Commits

Reviewing files that changed from the base of the PR and between fbf0b82 and 0ee290b.

📒 Files selected for processing (1)
  • engine/tests/common/rabbitmq_helpers.rs

Comment on lines +25 to +35
extern "C" fn stop_rabbitmq_container() {
    let id = match CONTAINER_ID.lock() {
        Ok(mut guard) => guard.take(),
        Err(_) => return,
    };
    if let Some(id) = id {
        let _ = std::process::Command::new("docker")
            .args(["rm", "-f", &id])
            .stdout(std::process::Stdio::null())
            .stderr(std::process::Stdio::null())
            .status();
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's see the full file to understand the context
cd engine/tests/common && wc -l rabbitmq_helpers.rs && cat -n rabbitmq_helpers.rs

Repository: iii-hq/iii

Length of output: 5862


🏁 Script executed:

# Also check if there's a git diff available to see what changed
git diff HEAD -- engine/tests/common/rabbitmq_helpers.rs

Repository: iii-hq/iii

Length of output: 36


🏁 Script executed:

# Check what changed in the file around lines 2, 18-37, 60-67
git log --oneline -1 -- engine/tests/common/rabbitmq_helpers.rs

Repository: iii-hq/iii

Length of output: 127


🏁 Script executed:

# Get the actual patch
git show HEAD -- engine/tests/common/rabbitmq_helpers.rs | head -100

Repository: iii-hq/iii

Length of output: 4055


🌐 Web query:

atexit handler safety constraints libc async-safe functions eprintln

💡 Result:

atexit handlers in libc have strict safety constraints: they must terminate normally without calling exit again (undefined behavior), avoid longjmp, and cannot rely on consistent global state since they run during process shutdown after static destructors in some cases (e.g., Rust). They are NOT required to be async-signal-safe by POSIX; only _exit and _Exit are signal-safe among termination functions, and stdio functions like printf are unsafe in signal handlers due to buffering and locks.

Using eprintln! (Rust's stderr println macro) in an atexit handler is unsafe:

  • It accesses global Stderr state via Lazy/OnceCell, which may be deinitialized during shutdown, causing panics like "cannot access stderr during shutdown".
  • println!/eprintln! use thread-local RefCell-wrapped stdout/stderr locks; concurrent calls (e.g., from a signal interrupting a print) panic on BorrowMutError.
  • In Rust+C interop, libc::atexit handlers run after Rust's at_exit queue, during/after TLS/static destruction, making std::io access unreliable.

Safe alternatives in atexit handlers:

  • Use low-level write(2) to fd 2 (stderr), which is async-signal-safe.
  • libc::printf may work if there is no Rust runtime interference, but it is not guaranteed async-safe.
  • Avoid heap allocation, locks, or complex logic.

For Rust programs, prefer std::process::exit over libc::exit to run Rust cleanup; for signal safety, use crates like signal-hook and async-signal-safe primitives. No source confirms eprintln! is safe in atexit; evidence shows failures.


Handle initialization and cleanup errors explicitly, but use safe atexit patterns.

The new exit-cleanup path silently discards both libc::atexit() registration status and docker rm exit codes. While the implementation is new (not a regression), it should surface failures to prevent silent leaks.

However, the suggested fix has two critical issues:

  1. eprintln! is unsafe in atexit handlers. It accesses global Stderr state which may be deinitialized during process shutdown, causing panics. Use low-level write(2) to stderr fd 2 instead, or avoid output altogether.

  2. Suggested ordering is backwards. The proposed diff registers the atexit handler before setting CONTAINER_ID, so the handler would run with no ID to clean up.

Safer approaches:

  • Assert that atexit registration succeeded: assert_eq!(libc::atexit(stop_rabbitmq_container), 0, "failed to register RabbitMQ cleanup hook"); (now runs at init time, fails early and explicitly)
  • For the docker removal, either suppress output if success is expected, or use low-level write(2) if diagnostics are needed.
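
One way the safer approaches above might look in practice is sketched below. It is a Unix-only illustration under stated assumptions, not the PR's code: `atexit` and `write` are declared by hand to avoid crates (the PR uses `libc`), and `write_stderr`, `failure_message`, `register_cleanup`, the message texts, and the placeholder id are all hypothetical.

```rust
use std::ffi::c_void;
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::sync::Mutex;

// Assumption: hand-written declarations to keep the sketch crate-free.
unsafe extern "C" {
    fn atexit(callback: extern "C" fn()) -> i32;
    fn write(fd: i32, buf: *const c_void, count: usize) -> isize;
}

static CONTAINER_ID: Mutex<Option<String>> = Mutex::new(None);

// Fixed diagnostics, so the exit hook never formats or allocates a message.
const NONZERO_EXIT: &[u8] = b"rabbitmq cleanup: docker rm exited with a non-zero status\n";
const SPAWN_FAILED: &[u8] = b"rabbitmq cleanup: could not run docker\n";

/// Writes raw bytes to fd 2, bypassing Rust's stderr handle.
fn write_stderr(msg: &[u8]) -> isize {
    // SAFETY: `msg` points to `msg.len()` readable bytes for the whole call.
    unsafe { write(2, msg.as_ptr().cast::<c_void>(), msg.len()) }
}

/// Maps the result of `docker rm -f` to a diagnostic, or `None` on success.
fn failure_message(result: &io::Result<ExitStatus>) -> Option<&'static [u8]> {
    match result {
        Ok(status) if status.success() => None,
        Ok(_) => Some(NONZERO_EXIT),
        Err(_) => Some(SPAWN_FAILED),
    }
}

extern "C" fn stop_rabbitmq_container() {
    let id = match CONTAINER_ID.lock() {
        Ok(mut guard) => guard.take(),
        Err(_) => return,
    };
    if let Some(id) = id {
        let result = Command::new("docker")
            .args(["rm", "-f", &id])
            .stdout(Stdio::null())
            .stderr(Stdio::null())
            .status();
        if let Some(msg) = failure_message(&result) {
            let _ = write_stderr(msg);
        }
    }
}

/// Sets the id before registering the hook, and fails loudly at init time
/// if registration is rejected.
fn register_cleanup(id: String) {
    *CONTAINER_ID.lock().expect("CONTAINER_ID poisoned") = Some(id);
    // SAFETY: the hook is a plain `extern "C"` function that does not unwind.
    let rc = unsafe { atexit(stop_rabbitmq_container) };
    assert_eq!(rc, 0, "failed to register RabbitMQ cleanup hook");
}

fn main() {
    // Placeholder id: at exit the hook attempts the removal and, since no such
    // container exists, writes one of the fixed diagnostics to fd 2.
    register_cleanup("example-container-id".to_owned());
}
```

The assertion runs during test setup, where panicking is still safe, while the hook itself is limited to a raw `write` so it does not touch Rust's stderr machinery during shutdown.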
