feat: surface backend tracing logs to the frontend via stderr#6312
feat: surface backend tracing logs to the frontend via stderr#6312wolfv wants to merge 5 commits into
Conversation
The frontend now sets PIXI_BUILD_BACKEND_LOG_FORMAT=json when spawning a
build backend. In that mode the backend installs a custom tracing Layer
that serializes each event to `{sentinel}{json}\n` on stderr; the
frontend's stderr reader recognises the sentinel, parses the envelope
into a BackendLogRecord, and re-emits it through the frontend's own
tracing subscriber. Untagged lines (raw build output) flow to
BackendOutputStream unchanged. Standalone backend invocations keep the
pretty LoggingOutputHandler, so interactive debugging is unaffected.
https://claude.ai/code/session_01BGkocbCqTAjevABSqYxcWR
|
This is fantastic! The only thing that would be nice to have is if we could properly nest spans! Maybe you can ask clanker to do that? Im not sure how that works internally. |
For each BackendLogRecord, open and enter a frontend tracing span per name in record.spans before emitting the event, so the frontend subscriber sees the same hierarchy the backend reported. tracing requires &'static names at every callsite, so we leak one Callsite/Metadata pair per unique span name; the set of names is small and bounded in practice. https://claude.ai/code/session_01BGkocbCqTAjevABSqYxcWR
| pub(super) fn callsite_for(name: &str) -> &'static DynCallsite { | ||
| static INTERN: OnceLock<Mutex<HashMap<String, &'static DynCallsite>>> = OnceLock::new(); | ||
| let intern = INTERN.get_or_init(|| Mutex::new(HashMap::new())); | ||
| let mut guard = intern.lock().expect("dyn_span intern poisoned"); | ||
| if let Some(&cs) = guard.get(name) { | ||
| return cs; | ||
| } | ||
| let leaked_name: &'static str = Box::leak(name.to_owned().into_boxed_str()); | ||
| let cs: &'static DynCallsite = Box::leak(Box::new(DynCallsite { | ||
| name: leaked_name, | ||
| metadata: OnceLock::new(), | ||
| })); | ||
| // Initialise the metadata before publishing the entry so `Callsite::metadata` | ||
| // never observes an uninitialised slot. | ||
| let _ = cs.metadata_ref(); | ||
| guard.insert(name.to_owned(), cs); | ||
| cs | ||
| } | ||
|
|
There was a problem hiding this comment.
All iof this code is quite complex and a bit ugly t.b.h :), could we refactor this?
|
This is now the build output: I think rattler-build also outputs the build output as logs. |
Replace the per-event span name list with a tagged BackendLogRecord enum (Event/SpanOpen/SpanClose). The backend layer now implements on_new_span and on_close in addition to on_event, emitting lifecycle records with stable ids and parent links. The frontend stderr reader maintains a HashMap<u64, Span> mapping backend span ids to real frontend tracing spans created via Span::child_of, so spans live for their actual backend lifetime and events are emitted with the correct parent in scope. The leaked-callsite interner is still required to give the runtime span names a static identity for tracing's dispatcher, but it is now used once per span rather than once per event. https://claude.ai/code/session_01BGkocbCqTAjevABSqYxcWR
|
Thanks for testing! Indeed, we use the info level logs as the plaintext streaming logs in rattler-build and filter out tracing headers, per usual. I think we should maybe only forward WARNING from the backend this way? Or filter all rattler-build logs? |
rattler-build emits its plaintext build-output channel as INFO-level tracing events. Passing those through the structured JSON pipeline turned nicely-rendered output into double-encoded events with field cruft on the frontend. In JSON mode, install LoggingOutputHandler alongside JsonLogLayer with an info-only filter so INFO events still reach the user as formatted text (read by the frontend as untagged stderr), and have JsonLogLayer skip INFO events so they're not also emitted as structured records. Span lifecycle records still flow for all levels so non-INFO events under INFO spans keep their parent context. https://claude.ai/code/session_01BGkocbCqTAjevABSqYxcWR
What I like is that if we just pass all, we can let default log filtering handle that! Would it be possible to not emit the build logs as tracing logs perhaps? |
We can filter logs from Or we could filter all I didn't fully understand what your preference is? |
Description
This PR implements structured logging that allows the backend to emit
tracingevents as JSON-formatted log records that the frontend can parse and re-emit through its owntracingsubscriber. This enables backend logs to be interleaved with frontend logs while maintaining full structured information.Key changes:
New
pixi_build_types::logmodule - Defines the wire format for structured logs:BackendLogRecord: A serializable struct containing level, target, message, fields, timestamp, and span informationBACKEND_LOG_SENTINEL: A sentinel prefix (\u{1f}pixi-log\u{1f}) that marks stderr lines as structured recordsBACKEND_LOG_FORMAT_ENVthat the frontend uses to signal the backend to emit JSON logsNew
JsonLogLayerin backend - Atracinglayer that:tracingevents and converts them toBackendLogRecordJSONEnhanced stderr handling in frontend - Modified
stream_stderrto:BackendLogRecordJSONtracingsubscriber with target rewritten topixi_build_backend::<original-target>on_loghandler unchangedBackend integration - Modified
cli.rsto:PIXI_BUILD_BACKEND_LOG_FORMAT=jsonenvironment variableJsonLogLayerwhen invoked by the frontend, or the human-readableLoggingOutputHandlerfor standalone invocationsFrontend integration - Modified
json_rpc.rsto:PIXI_BUILD_BACKEND_LOG_FORMAT=jsonwhen spawning the backend processThe implementation ensures backward compatibility: raw build output (compiler messages, etc.) continues to flow through unchanged, while structured logs are captured and re-emitted with full context preservation.
How Has This Been Tested?
pixi_build_frontend/src/backend/stderr.rscovering:pixi_build_types/src/log.rscovering:BackendLogRecordround-trips through JSON serializationChecklist:
https://claude.ai/code/session_01BGkocbCqTAjevABSqYxcWR