perf(ppo): gather response/loss-mask rows before log-prob+entropy CE (supersedes #2011) #2076

Open
Mantissagithub wants to merge 3 commits into THUDM:main from Mantissagithub:perf/logprob-response-only-gather

Merge branch 'main' into perf/logprob-response-only-gather

3dd0969

Annotations

1 warning
agent-test (0, test_agent/test_agent_rollout_cpu.py): succeeded Jun 18, 2026 in 51s