feat(adapter/nemo_rl): add NeMo RL adapter and wrapper util#78
Conversation
Summary of Changes (Gemini Code Assist): This pull request integrates ML Flashpoint with the NeMo RL framework, allowing users to leverage fast, in-memory checkpointing alongside traditional persistent saves. The new adapter transparently manages checkpointing by intercepting NeMo RL's native save mechanisms, providing a flexible and efficient way to handle model state during reinforcement learning training. This enhancement aims to improve training resilience and recovery without requiring significant modifications to existing NeMo RL workflows.
Code Review
The pull request introduces ML Flashpoint integration for the NeMo RL framework, including a new MLFlashpointRLCheckpointManager for dual checkpointing (frequent MLF saves to tmpfs and infrequent standard saves) and a wrapper_util function to facilitate this integration. The documentation has been updated with usage instructions. Review comments highlight three areas for improvement: adding upfront validation for the save_strategy parameter in wrap_rl_components_with_mlflashpoint to prevent runtime errors, correcting a missing checkpoint_loader argument in the user guide's example code, and fixing an incorrect expected path in a unit test for MLF checkpoint saving.
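The first review point, upfront validation of the `save_strategy` parameter, might look like the following sketch. The function name comes from the review; the accepted strategy values and the signature are illustrative assumptions, not the actual ML Flashpoint API.

```python
# Hypothetical sketch of fail-fast validation for
# wrap_rl_components_with_mlflashpoint. The allowed values below are
# assumptions for illustration only.
VALID_SAVE_STRATEGIES = {"mlf_only", "dual", "standard_only"}


def wrap_rl_components_with_mlflashpoint(policy, checkpointer, save_strategy="dual"):
    # Reject an unknown strategy immediately, instead of failing later
    # at save time mid-training.
    if save_strategy not in VALID_SAVE_STRATEGIES:
        raise ValueError(
            f"save_strategy must be one of {sorted(VALID_SAVE_STRATEGIES)}, "
            f"got {save_strategy!r}"
        )
    # ... proceed to intercept policy.save_checkpoint / load_checkpoint
```

Validating eagerly at wrap time keeps the error close to the misconfiguration, which is what the review comment asks for.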
Force-pushed from d16556e to cd345ff
Force-pushed from 840c26a to 76dabfd
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
… streamline CI pipeline
… and fix unit test expected paths
…lict with nemo_rl
Force-pushed from 402678f to 22c35dc
- Switch build-nemo-rl to use nvcr.io/nvidia/nemo-rl:v0.5.0 as the base environment.
- Use a login shell (bash -l) to ensure container profiles and virtual environments are correctly loaded.
- Unify standard and nemo-rl builds into a single parameterized 'build' job to reduce duplication.
- Empty the 'nemo-rl' optional dependency in pyproject.toml, as dependencies are pre-installed in the container.
- Add explanatory comments for non-obvious environment configurations.
Python Code Coverage Summary
Minimum allowed line rate is |
C++ Code Coverage Summary
Minimum allowed line rate is |
This change adds integration support for checkpointing in NeMo RL.

It introduces an `MLFlashpointRLCheckpointManager`, which subtypes `CheckpointManager` in a bespoke way: it does not re-initialize the parent `CheckpointManager`, since that is already instantiated by the time users create an `MLFlashpointRLCheckpointManager`. Instead, it receives an instance of its parent in its `__init__` and uses composition to re-expose the behavior of that "parent" instance, overriding certain behaviors.

Namely, it intercepts the given policy's `save_checkpoint` and `load_checkpoint` methods with a custom implementation that uses waterfall logic: try ML Flashpoint first, then fall back to the regular checkpointing logic if needed.
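The composition-plus-waterfall pattern described above can be sketched as follows. Everything here is a minimal illustration: the `CheckpointManager` stand-in, the MLF helper methods, and the cadence parameters (`mlf_every_n_steps`, `standard_every_n_steps`) are assumed names, not the actual NeMo RL or ML Flashpoint API.

```python
class CheckpointManager:
    """Stand-in for NeMo RL's already-instantiated checkpoint manager."""

    def save_checkpoint(self, step, state):
        return f"standard-save@{step}"

    def load_checkpoint(self):
        return {"step": 0, "source": "standard"}


class MLFlashpointRLCheckpointManager:
    """Wraps an existing CheckpointManager via composition (no re-init)."""

    def __init__(self, parent, standard_every_n_steps=100):
        self._parent = parent  # received, not re-initialized
        self._standard_every = standard_every_n_steps

    def __getattr__(self, name):
        # Re-expose any behavior we don't override on the "parent" instance.
        return getattr(self._parent, name)

    def _mlf_save(self, step, state):
        # Placeholder for a fast, in-memory (tmpfs) ML Flashpoint save.
        return f"mlf-save@{step}"

    def _mlf_load(self):
        # Placeholder: None means no MLF checkpoint is available.
        return None

    def save_checkpoint(self, step, state):
        # Dual checkpointing: frequent MLF saves, infrequent standard saves.
        results = [self._mlf_save(step, state)]
        if step % self._standard_every == 0:
            results.append(self._parent.save_checkpoint(step, state))
        return results

    def load_checkpoint(self):
        # Waterfall: try ML Flashpoint first, then regular checkpointing.
        state = self._mlf_load()
        return state if state is not None else self._parent.load_checkpoint()
```

The `__getattr__` delegation is what lets the wrapper override only `save_checkpoint`/`load_checkpoint` while everything else falls through to the parent instance.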