feat(weave): add OTel tracing primitives to weave.trace_server (PR-7a of ddtrace migration)#7196
Draft
amwarrier wants to merge 1 commit into
Introduces `weave.trace_server.tracing` — the OTel-equivalent of the
ddtrace-flavored helpers in `weave.trace_server.datadog`. Exposes two
public symbols:
- `@traced(name)` — wraps a function in an OTel span; refuses generators
at decoration time with a clear TypeError
- `@traced_generator(name)` — for streaming endpoints; `yield from`s
inside the span body so the span covers the full iteration
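A minimal sketch of how the two decorators described above might be shaped. It is not the PR's actual code: `StubTracer` is a stand-in so the example runs without the OpenTelemetry SDK (the real module would presumably hold the result of `trace.get_tracer(...)` instead), and the span name and `stream_rows` function are hypothetical.

```python
import functools
import inspect
from contextlib import contextmanager


class StubTracer:
    """Stand-in for an OTel tracer; records span open/close order."""

    def __init__(self):
        self.events = []

    @contextmanager
    def start_as_current_span(self, name):
        self.events.append(("start", name))
        try:
            yield
        finally:
            self.events.append(("end", name))


# Assumption: the real module would hold trace.get_tracer(...) here instead.
_tracer = StubTracer()


def traced(name):
    def decorator(fn):
        # Checked once, at decoration time, not on every call.
        if inspect.isgeneratorfunction(fn):
            raise TypeError(
                f"@traced cannot wrap generator function {fn.__qualname__!r}; "
                "use @traced_generator"
            )

        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            with _tracer.start_as_current_span(name):
                return fn(*args, **kwargs)

        return wrapper

    return decorator


def traced_generator(name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # The span stays open until the inner generator is exhausted or closed.
            with _tracer.start_as_current_span(name):
                yield from fn(*args, **kwargs)

        return wrapper

    return decorator


@traced_generator("stream_rows")  # hypothetical span name
def stream_rows(n):
    for i in range(n):
        yield i


rows = stream_rows(2)
first = next(rows)
events_mid_iteration = list(_tracer.events)  # span opened, not yet closed
rest = list(rows)
events_after = list(_tracer.events)
```

Because the `yield from` sits inside the `with` block, the "end" event is only recorded once the caller has consumed or closed the generator. A plain `return fn(...)` would close the span before the first item was produced.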
This is the abstraction layer for the upcoming ddtrace -> OpenTelemetry
migration of `weave/trace_server/`. Call-site migration follows in
subsequent PRs (PR-7b1 for clickhouse_trace_server_batched.py, PR-7b2 for
the remaining 11 files, PR-7c for datadog.py + pyproject removal).
The decorator's body is minimal because OTel's `start_as_current_span`
context manager already records exceptions and sets `Status(ERROR,
"{ExcType}: {msg}")` on Exception raises. BaseException subclasses
(GeneratorExit, CancelledError, KeyboardInterrupt, SystemExit) pass
through without marking the span errored — see
open-telemetry/opentelemetry-python#4484. So
the wrapper does not need its own try/except, and the historical
`generator_trace` wrapper in `datadog.py` (which existed to suppress
ddtrace's auto-mark-error-on-GeneratorExit) becomes deletable in PR-7c.
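The `Exception` versus `BaseException` distinction described above can be illustrated with a stand-in context manager. This sketch only mimics the behavior the paragraph attributes to OTel; `StubSpan` and this `start_as_current_span` are hypothetical and are not the OpenTelemetry implementation.

```python
from contextlib import contextmanager


class StubSpan:
    """Stand-in span that only tracks status and description."""

    def __init__(self, name):
        self.name = name
        self.status = "UNSET"
        self.description = None


@contextmanager
def start_as_current_span(name, finished):
    span = StubSpan(name)
    try:
        yield span
    except Exception as exc:  # deliberately not BaseException
        span.status = "ERROR"
        span.description = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        finished.append(span)


def stream(finished):
    with start_as_current_span("stream", finished):
        yield from range(10)


def fail(finished):
    with start_as_current_span("fail", finished):
        raise ValueError("boom")


finished = []

gen = stream(finished)
next(gen)
gen.close()  # raises GeneratorExit inside the with block

try:
    fail(finished)
except ValueError:
    pass

closed_early, failed = finished
```

Here `closed_early` keeps its unset status because `GeneratorExit` derives from `BaseException`, while `failed` is marked as an error with the `"{ExcType}: {msg}"` description. That is why the wrapper needs no try/except of its own.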
The module-scope `_tracer = trace.get_tracer(...)` is hoisted because
re-resolving inside every wrapper call costs ~7us/span (measured), and
these decorators run on hot CH-query paths. A `reset_tracer_cache()`
helper is exported for tests that swap the global TracerProvider
mid-process.
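A simplified sketch of the hoist-and-reset pattern, assuming a module-level tracer plus a helper that re-resolves it. `get_tracer`, `set_tracer_provider`, and `StubTracer` are stand-ins for the OpenTelemetry API so the example runs without the SDK, and the tracer name is hypothetical. The real caching described for OTel's proxy tracer is more involved than this.

```python
class StubTracer:
    """Stand-in tracer that remembers which provider created it."""

    def __init__(self, provider):
        self.provider = provider


_global_provider = "provider-a"  # stand-in for the global TracerProvider


def get_tracer(name):
    """Stand-in for opentelemetry.trace.get_tracer."""
    return StubTracer(_global_provider)


def set_tracer_provider(provider):
    """Stand-in for swapping the global TracerProvider in a test."""
    global _global_provider
    _global_provider = provider


_TRACER_NAME = "weave.trace_server"  # hypothetical instrumentation name

# Resolved once at import time, so wrappers skip the lookup on every call.
_tracer = get_tracer(_TRACER_NAME)


def reset_tracer_cache():
    """Re-resolve the module-level tracer after a provider swap (tests only)."""
    global _tracer
    _tracer = get_tracer(_TRACER_NAME)


def current_provider():
    return _tracer.provider


before = current_provider()
set_tracer_provider("provider-b")
stale = current_provider()  # hoisted tracer still points at the old provider
reset_tracer_cache()
fresh = current_provider()
```

The cost of hoisting is the stale reference seen in `stale`. It only matters when the provider changes after import, which the description says happens in tests and not in production.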
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=05477a179e24aca42e20be53350cc07d6d7c0794
amwarrier added a commit that referenced this pull request Jun 12, 2026
…ated) Integrated commit for testing in wandb/core's weave-trace image on wandbench-small. Represents the combined effect of upstream PRs:
- #7196 (PR-7a): @traced + @traced_generator in weave/trace_server/tracing.py
- #7201 (PR-7b1): 47 wraps in clickhouse_trace_server_batched.py
- #7202 (PR-7b2): 17 wraps + 2 trace blocks + 3 set_tag + 1 current_root in 10 files
- #7203 (PR-7c): inline DogStatsD in datadog.py, drop ddtrace dep from pyproject.toml
This branch sits on top of the wandb/core-pinned submodule SHA so the weave-trace image build picks up only PR-7 changes, with no unrelated upstream drift.
After this commit, `grep -rn "import ddtrace\|from ddtrace"` returns empty across weave/trace_server/. Only docstring/comment references remain.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
First PR of the `weave/trace_server/` ddtrace → OpenTelemetry migration. Adds the abstraction layer (`@traced`, `@traced_generator`) that subsequent PRs will use to replace `@ddtrace.tracer.wrap` decorators across 12 files (66 wraps total).
This PR is intentionally scope-limited to the decorator surface + tests. No call sites are migrated. Call-site migration follows:
- PR-7b1: `clickhouse_trace_server_batched.py` (47 of 66 wraps); draft opened immediately after this PR.
- PR-7b2: the remaining 11 files (`kafka.py`, `usage_utils.py`, `ttl_settings.py`, etc).
- PR-7c: `datadog.py` rewrite + remove `ddtrace>=2.7.0` from `pyproject.toml`.

Why a new module
`weave.trace_server.datadog` keeps its existing public symbols (`record_db_insert`, `set_root_span_dd_tags`, `db_insert_path`, etc.). Only the tracing surface (`@traced` and the generator variant) moves to the new module. This keeps the PR-7a diff scoped to additions, and the module name is honest: the SDK is OpenTelemetry, DD is just the backend.

Public surface
Why no try/except in the decorator body
OTel's `start_as_current_span` context manager already:
- Records exceptions on `__exit__`
- Sets `Status(ERROR, "{ExcType}: {msg}")` for `Exception` subclasses
- Lets `BaseException` subclasses (GeneratorExit, CancelledError, KeyboardInterrupt, SystemExit) pass through without marking the span errored
Reference: opentelemetry-python's `use_span` helper catches `except Exception`, not `BaseException`, per open-telemetry/opentelemetry-python#4484.
This also means the historical `generator_trace` wrapper in `datadog.py` (which existed to suppress ddtrace's auto-mark-error-on-GeneratorExit) is removable in PR-7c.

Why `_tracer = trace.get_tracer(...)` is hoisted to module scope
Measured: 16.5μs/call with `get_tracer(...)` inside the wrapper vs 9.95μs/call with it hoisted (about 6.5μs, or roughly 40%, of per-call overhead saved). These decorators run on hot CH-query paths with dozens of spans per request, so the hoist matters.
`get_tracer` returns a `ProxyTracer` that caches its real-tracer reference on first span open and never re-resolves. To preserve test isolation when the global `TracerProvider` is swapped between cases, a `reset_tracer_cache()` helper is exported. Production code never needs to call it: there is exactly one `TracerProvider` per process, set during application startup before any span opens.

Test plan
- `functools.wraps` metadata preservation
- `CancelledError` pass-through
- `@traced_generator`: span covers full iteration, `GeneratorExit` pass-through, exception marking
- `tests/trace_server/` suite (no call sites migrated yet)

Out of scope
- `traced_with_service` (per-span service override): none of upstream's existing wraps use `@ddtrace.tracer.wrap(service="X")`, so adding it would be speculative. Will add in a future PR if a call site needs it.
- `services/weave-trace/src/tracing/decorator.py` has its own equivalent decorator. Once this lands and the upstream snapshot advances in wandb/core, that file becomes deletable; tracked separately.

🤖 Generated with Claude Code