
fix(FR-2582): preserve final TPS value after LLM Playground response ends #6707

Merged

graphite-app[bot] merged 1 commit into main from
04-15-fix_fr-2582_preserve_final_tps_value_after_llm_playground_response_ends
on Apr 23, 2026

Conversation

@yomybaby yomybaby (Member) commented Apr 15, 2026

Resolves #6705 (FR-2582)

Summary

In the LLM Playground, the TPS (tokens per second) indicator dropped to 0 immediately after a streaming response finished because onFinish cleared startTime to null and ChatTokenCounter returned 0 whenever startTime was nullish. This made the final TPS measurement disappear from the UI.

This change:

  • Aligns TPS measurement with the standard LLM inference convention (vLLM, Ollama, NVIDIA GenAI-Perf, Anyscale): start the measurement window when the first output token actually arrives, not when the user presses send. This excludes file upload, network RTT, and prefill time (TTFT) from the TPS denominator, so the displayed value reflects the pure decode rate.
  • Tracks the measurement window as { startTime, endTime } in ChatCard:
    • startTime is set by a useEffect when status transitions to 'streaming' (i.e., the first token has been received).
    • endTime is set by a useEffect when streaming ends — covers normal completion and abort / error paths, so TPS freezes correctly in every case instead of drifting downward indefinitely after stop().
    • handleSendMessage resets both to null on every new send.
  • ChatTokenCounter now computes elapsed as ((endTime ?? Date.now()) - startTime) / 1000 and short-circuits to 0 when elapsed is non-positive, avoiding an Infinity TPS display when the computation runs before the first token chunk has been counted (a sketch of this wiring follows below).
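
A minimal sketch of this wiring in one TypeScript module, assuming AI SDK-style status values and that a token count is maintained elsewhere; only startTime, endTime, status, and handleSendMessage are named in this PR, so every other name here is illustrative:

```ts
import { useEffect, useRef, useState } from 'react';

// Status values are assumed to follow the AI SDK's useChat convention.
type ChatStatus = 'submitted' | 'streaming' | 'ready' | 'error';

// ChatCard side: open the window when the first token arrives, freeze it
// when the stream ends (completion, abort, or error), clear it on new send.
export function useTpsWindow(status: ChatStatus) {
  const [startTime, setStartTime] = useState<number | null>(null);
  const [endTime, setEndTime] = useState<number | null>(null);
  const prevStatus = useRef<ChatStatus>(status);

  useEffect(() => {
    if (status === 'streaming' && prevStatus.current !== 'streaming') {
      setStartTime(Date.now()); // first output token has been received
    }
    if (status !== 'streaming' && prevStatus.current === 'streaming') {
      setEndTime(Date.now()); // stream over: freeze the measurement window
    }
    prevStatus.current = status;
  }, [status]);

  // Called from handleSendMessage so each run starts with a clean window.
  const reset = () => {
    setStartTime(null);
    setEndTime(null);
  };

  return { startTime, endTime, reset };
}

// ChatTokenCounter side: live value while endTime is null, frozen afterwards.
export function computeTps(args: {
  tokenCount: number;
  startTime: number | null;
  endTime: number | null;
}): number {
  const { tokenCount, startTime, endTime } = args;
  if (startTime === null) return 0; // no token has arrived yet
  const elapsed = ((endTime ?? Date.now()) - startTime) / 1000;
  if (elapsed <= 0) return 0; // guard against an Infinity/NaN display
  return tokenCount / elapsed;
}
```

The prevStatus ref is just one way to detect the streaming transitions the bullets describe; the real components may track them differently.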

Files changed

  • react/src/components/Chat/ChatCard.tsx
  • react/src/components/Chat/ChatMessages.tsx
  • react/src/components/Chat/ChatTokenCounter.tsx

Manual test plan

  • Send a prompt to a model and confirm the TPS counter updates while the response streams in.
  • After the response completes, confirm the TPS value remains visible (frozen at the last measurement) instead of resetting to 0.
  • Send a second prompt and confirm the TPS counter resets and starts measuring the new response.
  • Click stop mid-stream: TPS should freeze at the partial value rather than continue drifting downward.
  • Send a prompt with a large file attachment: TPS should reflect only the model's decode rate, not the upload duration. (An automated sketch of a few of these checks follows below.)
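
If part of this plan were ever automated, a jest sketch against the hypothetical computeTps helper from the sketch above could cover the freeze/reset behavior (the module path and helper name are assumptions of this write-up, not the component's actual API):

```ts
import { computeTps } from './chatTps'; // hypothetical module from the sketch above

describe('computeTps', () => {
  beforeEach(() => jest.useFakeTimers());
  afterEach(() => jest.useRealTimers());

  it('returns 0 before the first token arrives', () => {
    expect(computeTps({ tokenCount: 0, startTime: null, endTime: null })).toBe(0);
  });

  it('measures a live stream against the current time while endTime is null', () => {
    jest.setSystemTime(10_000);
    // Window opened 2 seconds ago and is still open: 40 tokens / 2 s = 20 TPS.
    expect(computeTps({ tokenCount: 40, startTime: 8_000, endTime: null })).toBe(20);
  });

  it('freezes the value once endTime is set', () => {
    // 50 tokens over a fixed 2-second window stays 25 TPS, regardless of "now".
    expect(computeTps({ tokenCount: 50, startTime: 1_000, endTime: 3_000 })).toBe(25);
  });
});
```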

Verification

bash scripts/verify.sh -> === ALL PASS === (Relay, Lint, Format, TypeScript)

Copilot AI review requested due to automatic review settings April 15, 2026 05:28
@github-actions github-actions Bot added the size:S 10~30 LoC label Apr 15, 2026


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • flow:merge-queue - adds this PR to the back of the merge queue
  • flow:hotfix - for urgent changes, fast-track this PR to the front of the merge queue

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has required the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.


Copilot AI left a comment


Pull request overview

Preserves the final “tokens per second” (TPS) value in the LLM Playground UI after a streaming response completes by capturing an end timestamp and using it in the TPS calculation.

Changes:

  • Add endTime state in ChatCard and set it when streaming finishes; reset it when a new message send starts.
  • Plumb endTime through ChatMessages into ChatTokenCounter.
  • Update TPS calculation to use (endTime ?? Date.now()) so TPS remains stable after completion (prop plumbing sketched below).
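
Read together with the earlier sketch, the plumbing could be as thin as the following; this is a sketch reusing the hypothetical computeTps helper, and any prop name beyond startTime/endTime is a guess:

```tsx
import React from 'react';
import { computeTps } from './chatTps'; // hypothetical module from the earlier sketch

interface TpsWindowProps {
  startTime: number | null;
  endTime: number | null;
  tokenCount: number;
}

// ChatTokenCounter renders the value; (endTime ?? Date.now()) inside
// computeTps keeps it live mid-stream and frozen after completion.
function ChatTokenCounter(props: TpsWindowProps) {
  return <span>{computeTps(props).toFixed(1)} tok/s</span>;
}

// ChatMessages does not consume the window itself; it only forwards it.
export function ChatMessages(props: TpsWindowProps) {
  return <ChatTokenCounter {...props} />;
}
```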

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

  • react/src/components/Chat/ChatCard.tsx: tracks endTime for each run (set on finish, cleared on new send) and passes it down to the message UI.
  • react/src/components/Chat/ChatMessages.tsx: extends props to accept endTime and forwards it to the token counter.
  • react/src/components/Chat/ChatTokenCounter.tsx: uses endTime (or the current time while streaming) to compute TPS and keeps the final value after completion.

Comment thread react/src/components/Chat/ChatTokenCounter.tsx Outdated
Comment thread react/src/components/Chat/ChatCard.tsx Outdated

github-actions Bot commented Apr 15, 2026

Coverage report for ./react

Category Percentage Covered / Total
🔴 Statements 8.59% 1757/20443
🔴 Branches 7.88% 1131/14351
🔴 Functions 5.14% 285/5544
🔴 Lines 8.31% 1649/19837

Test suite run success

856 tests passing in 39 suites.

Report generated by 🧪 jest coverage report action from 2e0a5e1

@yomybaby yomybaby force-pushed the 04-15-fix_fr-2582_preserve_final_tps_value_after_llm_playground_response_ends branch from 5a82835 to 1ca31e4 on April 15, 2026 05:37
@github-actions github-actions Bot added the bug label Apr 15, 2026
@yomybaby yomybaby force-pushed the 04-15-fix_fr-2582_preserve_final_tps_value_after_llm_playground_response_ends branch from 1ca31e4 to d8d3d02 on April 15, 2026 07:02
@github-actions github-actions Bot added size:M 30~100 LoC and removed size:S 10~30 LoC labels Apr 15, 2026
@yomybaby yomybaby requested a review from agatha197 April 21, 2026 04:05
Comment thread react/src/components/Chat/ChatTokenCounter.tsx Outdated
@yomybaby yomybaby force-pushed the 04-15-fix_fr-2582_preserve_final_tps_value_after_llm_playground_response_ends branch from d8d3d02 to 91ff835 on April 23, 2026 03:38
@yomybaby yomybaby requested a review from agatha197 April 23, 2026 03:38

@agatha197 agatha197 left a comment


LGTM


graphite-app Bot commented Apr 23, 2026

Merge activity

fix(FR-2582): preserve final TPS value after LLM Playground response ends (#6707)

@graphite-app graphite-app Bot force-pushed the 04-15-fix_fr-2582_preserve_final_tps_value_after_llm_playground_response_ends branch from 91ff835 to 2e0a5e1 on April 23, 2026 07:33
@graphite-app graphite-app Bot merged commit 2e0a5e1 into main Apr 23, 2026
11 checks passed
@graphite-app graphite-app Bot deleted the 04-15-fix_fr-2582_preserve_final_tps_value_after_llm_playground_response_ends branch April 23, 2026 07:34

Labels

bug, size:M 30~100 LoC


Development

Successfully merging this pull request may close these issues.

LLM Playground TPS value drops to 0 after response ends

3 participants