Skip to content

chore(weave): validate monitor query fields on save#7191

Open
gtarpenning wants to merge 3 commits into
masterfrom
gtarpenning/monitor-query-validation
Open

chore(weave): validate monitor query fields on save#7191
gtarpenning wants to merge 3 commits into
masterfrom
gtarpenning/monitor-query-validation

Conversation

@gtarpenning

Copy link
Copy Markdown
Member

Summary

  • Validate Monitor.query field refs against ALLOWED_CALL_FIELDS at object-create time; reject unknown fields (e.g. operation_name) with the allowed list in the error.
  • Prevents the silent per-cycle InvalidFieldError storm in the scoring worker (a bad monitor query fails sampling forever today with no user signal).

Testing

functional obj_create tests (both backends via fixture): bad field rejected with allowed list + not stored; static/dynamic/absent queries accepted.

Validate Monitor.query field references against ALLOWED_CALL_FIELDS at
obj_create time so an unknown field (e.g. operation_name) is rejected with
a 4xx listing the allowed fields, instead of being stored and failing
silently on every scoring-worker cycle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@wandbot-3000

wandbot-3000 Bot commented Jun 11, 2026

Copy link
Copy Markdown

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 95.12195% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
..._server/calls_query_builder/calls_query_builder.py 94.73% 1 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

- Create Monitors via the client serialization so the stored query carries
  weave bookkeeping keys, exercising the strip path (proven load-bearing:
  removing the strip now fails the test).
- Assert the complete InvalidFieldError message with ==, not substrings.
- Derive ALLOWED_DYNAMIC_FIELD_PREFIXES from ALLOWED_CALL_FIELDS so they
  can't drift as *_dump dynamic fields change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@gtarpenning gtarpenning changed the title feat(weave): validate monitor query fields on save chore(weave): validate monitor query fields on save Jun 11, 2026
@gtarpenning gtarpenning marked this pull request as ready for review June 11, 2026 23:15
@gtarpenning gtarpenning requested a review from a team as a code owner June 11, 2026 23:15
…ld names

Address review on monitor query validation:
- guard tsi Query.model_validate so an object that merely shares the Monitor
  class name with a non-query payload is left untouched, not 500'd
- map ValueError/TypeError from compilation (e.g. empty $and) to InvalidRequest
  instead of leaking a raw 500
- drop internal `*_dump` column names from the allowed-fields error message
- strip only the three weave bookkeeping keys, not every `_`-prefixed key
- rename validate_query_field_names -> validate_query_compiles

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@mscavezze-cw mscavezze-cw left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks functional, but was difficult to read without claude. I offered some suggestions for this specific PR. There are probably things that could be done to the rest of this file, but that's out of scope for this PR and review.

return ALLOWED_CALL_FIELDS[name]


# Field references get_field_by_name accepts beyond exact ALLOWED_CALL_FIELDS

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to know what these constants mean until reading the rest of the code and using claude to analyze the PR. I think these would be more self-documenting if a couple of changes were made:

  • It looks like they're mainly for creating the error message, but they're up above the validation code. Consider moving them to be closer to the error message and/or giving a little high-level context before launching into the details.
  • Consider moving the error messaging and maybe the validation into a separate file. This file is huge, and adding more new code to it isn't going to help readability or agent context limits.

leaf_object_class: str | None,
val: object,
) -> None:
"""Reject a Monitor whose `query` references a field outside the allowed set."""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment and implementation don't quite match. While it is true that this will detect a query with unacceptable fields, I think it will also detect queries that don't compile. At least that's how the code reads. That also seems reasonable, but the mismatch between the comment and code is confusing.

val: object,
) -> None:
"""Reject a Monitor whose `query` references a field outside the allowed set."""
if (

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Half of this function is answering the question "is this a monitor that we should validate". That's not immediately obvious. Consider ways to make the code more self-documenting.



def validate_query_compiles(query: tsi_query.Query) -> None:
"""Validate that `query` references only allowed call fields and is well-formed."""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment about mismatch between comment and what the code appears to be doing. Consider connecting the dots here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants