fix(finetune): apply train.max_norm gradient clipping in finetune scripts by discobot · Pull Request #2264 · Lightning-AI/litgpt

discobot · 2026-06-13T11:17:28Z

--train.max_norm was rejected by validate_args in all four finetune scripts, even though
the finetune configs in config_hub expose it.

This removes max_norm from the unsupported list in lora.py, full.py, adapter.py, and
adapter_v2.py and applies fabric.clip_gradients(model, optimizer, max_norm=train.max_norm)
at optimizer-step boundaries (inside if not is_accumulating:), mirroring pretrain.py.
Clipping at step boundaries rather than after each fabric.backward() keeps it correct under
gradient accumulation, where per-micro-batch clipping would act on partial gradients and
conflict with no_backward_sync.
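
To make the partial-gradient point concrete, here is a minimal pure-Python sketch (not litgpt or Fabric code). The helpers global_norm, clip_by_norm and accumulate and the toy gradient values are all made up for illustration. The clipping rule is a simplified L2-norm rescale and the usual loss / accumulation_iters scaling is omitted.

```python
import math

def global_norm(grad):
    """L2 norm of a flat gradient vector."""
    return math.sqrt(sum(g * g for g in grad))

def clip_by_norm(grad, max_norm):
    """Simplified norm clipping: rescale only if the norm exceeds max_norm."""
    norm = global_norm(grad)
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

def accumulate(micro_grads):
    """Sum per-micro-batch gradients, as gradient accumulation does."""
    total = [0.0] * len(micro_grads[0])
    for grad in micro_grads:
        total = [t + g for t, g in zip(total, grad)]
    return total

# Toy values (assumed): two micro-batches, one with a large gradient.
micro_grads = [[3.0, 0.0], [0.0, 0.4]]
max_norm = 1.0

# Clip once on the fully accumulated gradient (the step-boundary approach).
step_clipped = clip_by_norm(accumulate(micro_grads), max_norm)

# Clip each micro-batch gradient separately, then accumulate.
per_micro_clipped = accumulate([clip_by_norm(g, max_norm) for g in micro_grads])

print(global_norm(step_clipped))       # bounded by max_norm
print(global_norm(per_micro_clipped))  # about 1.08, above max_norm
```

In this toy case the per-micro-batch variant both exceeds max_norm after accumulation and changes the relative weight of the two micro-batches, which is the behavior the step-boundary placement avoids.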

Unlike pretrain, max_norm remains optional: when unset (the default in every finetune config)
behavior is unchanged, so the existing null max_norm: entries in config_hub/finetune/* now
genuinely mean "no clipping" and need no edits. QLoRA is covered via the same lora.py path.

Adds a regression test per script that runs the tiny-config CPU training loop with
--train.max_norm=1.0 and asserts Fabric.clip_gradients is invoked once per optimizer step
with the configured value (these previously failed with
ValueError: ... doesn't support the 'max_norm' argument).
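
The shape of such a test can be sketched with unittest.mock alone. The finetune_loop function below is a hypothetical stand-in, not the actual loop from lora.py or the other scripts, and the mocks replace Fabric, the model and the optimizer. The real tests in this PR run the scripts' own tiny-config training loop instead.

```python
from unittest import mock

def finetune_loop(fabric, model, optimizer, micro_batches, accumulation_iters, max_norm=None):
    """Hypothetical stand-in for a finetune loop (not litgpt's code)."""
    optimizer_steps = 0
    for i, batch in enumerate(micro_batches, start=1):
        is_accumulating = i % accumulation_iters != 0
        fabric.backward(batch)  # placeholder for the forward/backward pass
        if not is_accumulating:
            # Clip only at optimizer-step boundaries, and only when configured.
            if max_norm is not None:
                fabric.clip_gradients(model, optimizer, max_norm=max_norm)
            optimizer.step()
            optimizer.zero_grad()
            optimizer_steps += 1
    return optimizer_steps

fabric, model, optimizer = mock.MagicMock(), mock.MagicMock(), mock.MagicMock()

# 8 micro-batches with 4 accumulation iterations give 2 optimizer steps.
steps = finetune_loop(fabric, model, optimizer, range(8), 4, max_norm=1.0)

assert steps == 2
assert fabric.clip_gradients.call_count == steps
fabric.clip_gradients.assert_called_with(model, optimizer, max_norm=1.0)

# With max_norm unset, no clipping call is made.
unclipped_fabric = mock.MagicMock()
finetune_loop(unclipped_fabric, model, optimizer, range(8), 4)
assert unclipped_fabric.clip_gradients.call_count == 0
```

Counting calls against optimizer steps rather than micro-batches is what distinguishes step-boundary clipping from clipping after every backward pass.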

…ipts

The four finetune scripts (lora, full, adapter, adapter_v2) listed train.max_norm as unsupported in validate_args even though the finetune configs in config_hub expose it. Remove it from the unsupported lists and clip gradients at optimizer-step boundaries with fabric.clip_gradients, mirroring pretrain.py. The argument remains optional, so behavior is unchanged when it is not set. Add regression tests asserting that clip_gradients is called once per optimizer step with the configured value.

discobot requested review from andyland, k223kim, lianakoleva and t-vi as code owners June 13, 2026 11:17

discobot mentioned this pull request Jun 13, 2026

Gradient Clipping Doesn't Work in Finetuning #2191

Open

discobot wants to merge 1 commit into Lightning-AI:main from discobot:fix/2191-finetune-max-norm
