feat(rl): composable current-policy importance-sampling correction (TIS hook) #2084

Open
EazyReal wants to merge 1 commit into THUDM:main from EazyReal:off-policy-is

feat(rl): composable off-policy importance-sampling correction

1b70bf3