feat(rl): add REINFORCE advantage estimator #2083

Open
EazyReal wants to merge 1 commit into THUDM:main from EazyReal:reinforce-estimator

feat(rl): add REINFORCE advantage estimator

8f1c408

Annotations

1 warning
agent-adapter-test (0, test_agent_adapters.py)
succeeded Jun 15, 2026 in 55s