feat(RL): forward routed_experts_prompt_start via nvext by biswapanda · Pull Request #10562 · ai-dynamo/dynamo

biswapanda · 2026-06-11T00:05:53Z

Description

Forward the RL routed-experts capture offset (routed_experts_prompt_start) to the vLLM worker via nvext, so the engine trims the leading prompt rows from the returned routing tensor — instead of shipping full-sequence routing across the wire and trimming client-side.

Changes

nvext.rs: add NvExt.routed_experts_prompt_start: Option<u32> (request field, mirroring cache_salt).
preprocessor.rs: forward it in nvext_passthrough_args so it reaches the worker's extra_args.nvext.
handlers.py: in build_sampling_params, apply nvext.routed_experts_prompt_start onto SamplingParams.routed_experts_prompt_start (validated non-negative; vLLM clamps the upper bound) so the engine trims the prompt rows and the worker serializes the trimmed routing with the correct start.
test: build_sampling_params applies + clamps the nvext offset.

Context

Completes the routed-experts-on-dynamo_chat story (companion to #10529 worker serialize and PrimeIntellect-ai/renderers#79). The renderer now sends nvext.routed_experts_prompt_start and demotes its client-side trim to a back-compat fallback (no-op once the worker stamps start > 0). Engine-side trimming avoids the full-sequence routing blob crossing the wire on MoE rollouts.

Type of Change

New feature (non-breaking; field is optional, ignored by backends without routed-experts capture)

Testing

Python unit test for the build_sampling_params apply/clamp path.
Rust frontend build verified via CI; live engine-side-trim e2e pending.

Summary by CodeRabbit

Release Notes

New Features
- Added support for routed_experts_prompt_start parameter to configure mixture-of-experts (MoE) expert replay capture offset behavior in requests.
Tests
- Added unit test coverage for the new routed-experts parameter handling.

… trims routing engine-side

devin-ai-integration

Devin Review found 2 potential issues.

View 3 additional findings in Devin Review.

coderabbitai · 2026-06-11T00:11:44Z

Walkthrough

This PR adds support for a routed_experts_prompt_start parameter that enables MoE expert replay capture offset configuration. The parameter is defined in the OpenAI protocol layer, forwarded through request preprocessing, and consumed by the vLLM handler with validation and test coverage.

Changes

Routed Experts Prompt Start Parameter

Layer / File(s)	Summary
Protocol definition and request passthrough `lib/llm/src/protocols/openai/nvext.rs`, `lib/llm/src/preprocessor.rs`	`NvExt` struct adds `routed_experts_prompt_start: Option<u32>` field; `OpenAIPreprocessor::nvext_passthrough_args` forwards the field to generated `extra_args["nvext"]` JSON.
vLLM handler integration and validation `components/src/dynamo/vllm/handlers.py`, `components/src/dynamo/vllm/tests/test_vllm_unit.py`	`build_sampling_params` reads the parameter from nvext sources and assigns to `SamplingParams.routed_experts_prompt_start`; unit test verifies propagation and confirms negative values are clamped to `0`.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description covers changes, context, and testing, but the Related Issues section requires either a closed issue or explicit confirmation—neither is provided in the required format.	Add either a 'Closes `#XXXX`' or 'Relates to `#XXXX`' statement, or explicitly check the 'Confirmed — no related issue' checkbox as required by the template.
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Title check	✅ Passed	The PR title accurately describes the main change: forwarding the routed_experts_prompt_start field via nvext across multiple components (Rust protocols, preprocessor, and Python handlers).

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (2)

lib/llm/src/preprocessor.rs (1)

316-321: ⚡ Quick win

Add a Rust passthrough assertion for routed_experts_prompt_start.

Line 316-321 correctly forwards the field, but there’s no Rust regression assertion for this new nvext key in backend_extra_args tests. Extending the existing passthrough test would lock the contract and prevent silent drops in future refactors.

Suggested test extension

 #[test]
 fn test_backend_extra_args_preserves_nvext_and_sampling_extensions() {
     let request: NvCreateChatCompletionRequest = serde_json::from_value(serde_json::json!({
         "model": "test-model",
         "messages": [{"role": "user", "content": "hi"}],
         "detokenize": false,
         "allowed_token_ids": [10, 11],
         "bad_words_token_ids": [[12, 13]],
         "nvext": {
             "cache_salt": "step_7",
-            "extra_fields": ["completion_token_ids"]
+            "extra_fields": ["completion_token_ids"],
+            "routed_experts_prompt_start": 5
         }
     }))
     .unwrap();

     let extra_args = OpenAIPreprocessor::backend_extra_args(&request).unwrap();

     assert_eq!(extra_args["nvext"]["cache_salt"], "step_7");
+    assert_eq!(extra_args["nvext"]["routed_experts_prompt_start"], 5);
     assert_eq!(
         extra_args["nvext"]["extra_fields"],
         serde_json::json!(["completion_token_ids"])
     );

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/llm/src/preprocessor.rs` around lines 316 - 321, Extend the existing
backend_extra_args passthrough test to assert that
nvext.routed_experts_prompt_start is forwarded: locate the test that inspects
nvext_passthrough/backend_extra_args (the backend_extra_args tests), add an
assertion that the "routed_experts_prompt_start" key exists in the passthrough
map and that its value matches the expected value you set in the test input,
ensuring the test reads nvext.routed_experts_prompt_start and verifies
nvext_passthrough contains serde_json::json!(expected) (or equivalent value
comparison) to prevent future regressions.

components/src/dynamo/vllm/tests/test_vllm_unit.py (1)

906-910: ⚡ Quick win

Move test imports to module scope.

This test adds in-function imports; keep imports at module top for Python unit test additions.

Suggested diff

 import pytest
+from vllm.sampling_params import SamplingParams
+from dynamo.vllm.handlers import build_sampling_params
@@
 def test_build_sampling_params_applies_nvext_routed_experts_prompt_start():
@@
-    import pytest
-    from vllm.sampling_params import SamplingParams
-
-    from dynamo.vllm.handlers import build_sampling_params

As per coding guidelines, “keep all imports at module top (no imports inside functions)” for Python unit test additions.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/src/dynamo/vllm/tests/test_vllm_unit.py` around lines 906 - 910,
The test contains in-function imports; move the imports for pytest,
SamplingParams (from vllm.sampling_params) and build_sampling_params (from
dynamo.vllm.handlers) to the module top so they are declared at file scope
rather than inside the test function; update any references in the test to use
those top-level imports and remove the redundant local import statements.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@components/src/dynamo/vllm/tests/test_vllm_unit.py`:
- Around line 906-910: The test contains in-function imports; move the imports
for pytest, SamplingParams (from vllm.sampling_params) and build_sampling_params
(from dynamo.vllm.handlers) to the module top so they are declared at file scope
rather than inside the test function; update any references in the test to use
those top-level imports and remove the redundant local import statements.

In `@lib/llm/src/preprocessor.rs`:
- Around line 316-321: Extend the existing backend_extra_args passthrough test
to assert that nvext.routed_experts_prompt_start is forwarded: locate the test
that inspects nvext_passthrough/backend_extra_args (the backend_extra_args
tests), add an assertion that the "routed_experts_prompt_start" key exists in
the passthrough map and that its value matches the expected value you set in the
test input, ensuring the test reads nvext.routed_experts_prompt_start and
verifies nvext_passthrough contains serde_json::json!(expected) (or equivalent
value comparison) to prevent future regressions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5f780311-3af3-44df-a984-ae91e3bb4768

📥 Commits

Reviewing files that changed from the base of the PR and between 722e720 and e23aa10.

📒 Files selected for processing (4)

components/src/dynamo/vllm/handlers.py
components/src/dynamo/vllm/tests/test_vllm_unit.py
lib/llm/src/preprocessor.rs
lib/llm/src/protocols/openai/nvext.rs

…st imports to module scope

… vLLM chat processor path

…r both nvext shapes in tests

…generation

feat(rl): forward routed_experts_prompt_start via nvext so the worker…

e23aa10

… trims routing engine-side

biswapanda requested review from a team as code owners June 11, 2026 00:05

pull-request-size Bot added the size/M label Jun 11, 2026

github-actions Bot added feat backend::vllm Relates to the vllm backend frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` labels Jun 11, 2026

devin-ai-integration Bot reviewed Jun 11, 2026

View reviewed changes

Comment thread components/src/dynamo/vllm/tests/test_vllm_unit.py Outdated

Comment thread components/src/dynamo/vllm/tests/test_vllm_unit.py Outdated

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

biswapanda enabled auto-merge (squash) June 11, 2026 02:35

biswapanda added 3 commits June 11, 2026 00:25

test(rl): assert routed_experts_prompt_start passthrough and hoist te…

088bad1

…st imports to module scope

fix(frontend): forward routed_experts_prompt_start through the native…

17c2fcc

… vLLM chat processor path

fix(vllm): clamp routed_experts_prompt_start above u32 range and cove…

3b534be

…r both nvext shapes in tests

pull-request-size Bot added size/L and removed size/M labels Jun 11, 2026

copy-pr-bot Bot temporarily deployed to GITLAB June 11, 2026 07:44 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 11, 2026 07:46 Inactive

biswapanda added 2 commits June 11, 2026 01:36

feat(rl): add nvext.tokenize_only to return prompt token IDs without …

295f0c8

…generation

style(vllm): black-format routed_experts_prompt_start test

262b2b4

copy-pr-bot Bot temporarily deployed to GITLAB June 11, 2026 09:32 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB June 11, 2026 09:33 Inactive

biswapanda changed the title ~~feat(rl): forward routed_experts_prompt_start via nvext (engine-side routing trim)~~ feat(rl): forward routed_experts_prompt_start via nvext Jun 11, 2026

biswapanda changed the title ~~feat(rl): forward routed_experts_prompt_start via nvext~~ feat(RL): forward routed_experts_prompt_start via nvext Jun 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(RL): forward routed_experts_prompt_start via nvext #10562

feat(RL): forward routed_experts_prompt_start via nvext #10562
biswapanda wants to merge 6 commits into
mainfrom
bis/routed-experts-prompt-start

biswapanda commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

biswapanda commented Jun 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes

Context

Type of Change

Testing

Summary by CodeRabbit

Release Notes

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

biswapanda commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited

Loading