THUDM / slime Public

Fork 896
Star 6.2k

Pull requests: THUDM/slime

162 Open 1,452 Closed

Pull requests list

Add --loss-aggregation for the four ScaleRL pg_loss aggregation modes

#2090 opened Jun 16, 2026 by EazyReal Contributor

Disk-level delta weight sync

#2089 opened Jun 16, 2026 by nanjiangwill Collaborator

fix(opd): score teacher logprobs at rollout temperature, not 0

#2085 opened Jun 15, 2026 by EazyReal Contributor

feat(rl): composable current-policy importance-sampling correction (TIS hook)

#2084 opened Jun 15, 2026 by EazyReal Contributor

feat(rl): add REINFORCE advantage estimator

#2083 opened Jun 15, 2026 by EazyReal Contributor

feat(coding_agent_rl): add SWE-bench harness evaluation path

#2079 opened Jun 15, 2026 by aoshen02 Contributor • Draft

3 tasks

fix(rollout): isolate per-trajectory exceptions in generate_and_rm_group

#2078 opened Jun 15, 2026 by aoshen02 Contributor

fix(script): correct GLM-4.7 expert_model_parallel_size for single-node 8 GPU

#2077 opened Jun 15, 2026 by aoshen02 Contributor

1 task

perf(ppo): gather response/loss-mask rows before log-prob+entropy CE (supersedes #2011)

#2076 opened Jun 14, 2026 by Mantissagithub

Support Qwen3.5-VL (dense + MoE) via Megatron-Bridge

#2075 opened Jun 14, 2026 by demouo Contributor

feat(rollouts) external rollouts endpoint with publish-only weight sync

#2071 opened Jun 12, 2026 by jvmncs

4 tasks done

fix(agent): reuse a pooled SGLang client across turns and retry once on pre-connect connector errors

#2069 opened Jun 12, 2026 by EazyReal Contributor

fix(sglang): authenticate engine control-plane and router calls

#2068 opened Jun 12, 2026 by EazyReal Contributor

[megatron] don't re-assert no_sync_func every step with overlap_grad_reduce

#2066 opened Jun 12, 2026 by HaozheZhang6 • Draft

fix(dp_schedule): drop trailing rollouts when the aligned micro-batch target exceeds the sample count

#2065 opened Jun 12, 2026 by EazyReal Contributor

fix(metrics): make compute_pass_rate ragged-safe for over-sampled batches

#2064 opened Jun 12, 2026 by EazyReal Contributor

fix(agent): render OpenAI tool-call arguments as a mapping for chat templates

#2063 opened Jun 12, 2026 by EazyReal Contributor

fix(grpo): correct reward attribution for fanned rollouts — full reward per segment + count each rollout once

#2062 opened Jun 12, 2026 by EazyReal Contributor

fix(rollout): apply rollout sample filter in the rollout manager

#2061 opened Jun 12, 2026 by EazyReal Contributor

(fix) retry transient Ray ActorUnavailableError during rollout engine bringup

#2059 opened Jun 12, 2026 by EazyReal Contributor

[DON'T MERGE] run CI run-ci-megatron

#2053 opened Jun 11, 2026 by zhuzilin Contributor

[Feature] Mopd (Multi-Teacher On-Policy distillation) supported

#2051 opened Jun 11, 2026 by leoyuppieqnew • Draft

fix(parsing): strip trailing EOS token from body_text after tool/reasoning parsing for code agent rl

#2049 opened Jun 10, 2026 by none0663 Contributor

support --num-workers for dataset parallel loading

#2048 opened Jun 10, 2026 by demouo Contributor

[docs] Fix OPD reverse KL formula in docs

#2039 opened Jun 9, 2026 by zihaocheng-buaa
