Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
32 changes: 32 additions & 0 deletions engine/tests/common/rabbitmq_helpers.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
use serde_json::{Value, json};
use std::sync::Mutex;
use testcontainers::ContainerAsync;
use testcontainers::runners::AsyncRunner;
use testcontainers_modules::rabbitmq::RabbitMq;
Expand All @@ -14,6 +15,27 @@ pub struct RabbitMqTestContext {

static RABBITMQ: OnceCell<RabbitMqTestContext> = OnceCell::const_new();

// Rust does not run destructors on `static` items at process exit, so
// `ContainerAsync::drop` never fires for the container held in `RABBITMQ`.
// Record the container id and force-remove it via the Docker CLI from an
// `atexit` hook so the rabbit container does not leak after the test binary
// exits. Ryuk is an unreliable safety net on Docker Desktop.
static CONTAINER_ID: Mutex<Option<String>> = Mutex::new(None);

extern "C" fn stop_rabbitmq_container() {
let id = match CONTAINER_ID.lock() {
Ok(mut guard) => guard.take(),
Err(_) => return,
};
if let Some(id) = id {
let _ = std::process::Command::new("docker")
.args(["rm", "-f", &id])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
Comment on lines +25 to +35
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's see the full file to understand the context
cd engine/tests/common && wc -l rabbitmq_helpers.rs && cat -n rabbitmq_helpers.rs

Repository: iii-hq/iii

Length of output: 5862


🏁 Script executed:

# Also check if there's a git diff available to see what changed
git diff HEAD -- engine/tests/common/rabbitmq_helpers.rs

Repository: iii-hq/iii

Length of output: 36


🏁 Script executed:

# Check what changed in the file around lines 2, 18-37, 60-67
git log --oneline -1 -- engine/tests/common/rabbitmq_helpers.rs

Repository: iii-hq/iii

Length of output: 127


🏁 Script executed:

# Get the actual patch
git show HEAD -- engine/tests/common/rabbitmq_helpers.rs | head -100

Repository: iii-hq/iii

Length of output: 4055


🌐 Web query:

atexit handler safety constraints libc async-safe functions eprintln

💡 Result:

atexit handlers in libc have strict safety constraints: they must terminate normally without calling exit again (undefined behavior), avoid longjmp, and cannot rely on consistent global state since they run during process shutdown after static destructors in some cases (e.g., Rust). They are NOT required to be async-signal-safe by POSIX; only _exit and _Exit are signal-safe among termination functions, and stdio functions like printf are unsafe in signal handlers due to buffering and locks. Using eprintln! (Rust's stderr println macro) in an atexit handler is unsafe: - It accesses global Stderr state via Lazy/OnceCell, which may be deinitialized during shutdown, causing panics like "cannot access stderr during shutdown". - println!/eprintln! use thread-local RefCell-wrapped stdout/stderr locks; concurrent calls (e.g., from signal interrupting print) panic on BorrowMutError. - In Rust+C interop, libc::atexit handlers run after Rust's at_exit queue, during/after TLS/static destruction, making std::io access unreliable. Safe alternatives in atexit handlers: - Use low-level write(2) to fd 2 (stderr), which is async-signal-safe. - libc::printf may work if no Rust runtime interference, but not guaranteed async-safe. - Avoid heap allocation, locks, or complex logic. For Rust programs, prefer std::process::exit over libc::exit to run Rust cleanup; for signal safety, use crates like signal-hook and async-signal-safe primitives. No source confirms eprintln! safe in atexit; evidence shows failures.

Citations:


Handle initialization and cleanup errors explicitly, but use safe atexit patterns.

The new exit-cleanup path silently discards both libc::atexit() registration status and docker rm exit codes. While the implementation is new (not a regression), it should surface failures to prevent silent leaks.

However, the suggested fix has two critical issues:

  1. eprintln! is unsafe in atexit handlers. It accesses global Stderr state which may be deinitialized during process shutdown, causing panics. Use low-level write(2) to stderr fd 2 instead, or avoid output altogether.

  2. Suggested ordering is backwards. The proposed diff registers the atexit handler before setting CONTAINER_ID, so the handler would run with no ID to clean up.

Safer approaches:

  • Assert that atexit registration succeeded: assert_eq!(libc::atexit(stop_rabbitmq_container), 0, "failed to register RabbitMQ cleanup hook"); (now runs at init time, fails early and explicitly)
  • For the docker removal, either suppress output if success is expected, or use low-level write(2) if diagnostics are needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@engine/tests/common/rabbitmq_helpers.rs` around lines 25 - 35, Ensure the
atexit handler is registered only after CONTAINER_ID is set and assert that
libc::atexit(stop_rabbitmq_container) returns 0 (e.g. assert_eq!(..., 0, "failed
to register RabbitMQ cleanup hook")) so failures surface at init time; inside
stop_rabbitmq_container, avoid using eprintln! (unsafe during shutdown) — either
suppress docker rm output or write diagnostics using a low-level write to fd 2
if you must report errors, and check the std::process::Command::status() result
for non-zero and handle it (e.g. emit minimal low-level error or ignore on
success) instead of silently discarding it; reference CONTAINER_ID,
stop_rabbitmq_container, and libc::atexit when making these changes.

}
}

/// Returns a shared RabbitMQ test context. The container is started on first call
/// and reused for all subsequent calls within the same test binary.
/// Panics if Docker is not available (by design -- no silent skipping).
Expand All @@ -34,6 +56,16 @@ pub async fn get_rabbitmq() -> &'static RabbitMqTestContext {
.expect("Failed to get RabbitMQ management port");
let amqp_url = format!("amqp://guest:guest@127.0.0.1:{}", port);
let mgmt_url = format!("http://127.0.0.1:{}", mgmt_port);

*CONTAINER_ID
.lock()
.expect("rabbit container id mutex poisoned") = Some(container.id().to_string());
// Safety: `stop_rabbitmq_container` is a plain `extern "C"` fn with no
// captured state; libc::atexit is sound to call here.
unsafe {
libc::atexit(stop_rabbitmq_container);
}

RabbitMqTestContext {
amqp_url,
mgmt_url,
Expand Down
Loading