Add marimo MNIST + W&B Registry example #621
Open: johndmulhausen wants to merge 17 commits into main from jmulhausen/marimo-mnist-registry-example
Changes from 11 commits

Commits (17), all by johndmulhausen:
- 071913c add marimo MNIST + W&B Registry example
- a690e06 apply Google Developer Style Guide pass to marimo MNIST example
- c4c7288 add workflow posting molab run-links for modified marimo notebooks
- 3d812be use official molab shield badge in PR comment link
- 5d9d4b8 add W&B API key field to marimo notebook
- 360c5b9 clarify Registry view-only permission error in marimo notebook
- da3b0ba use branch ref instead of commit SHA for molab links
- 2d65a7c add concrete Registry access-grant remediation to view-only error
- 5457984 open molab links in /server (hosted runtime) mode
- a5c3933 export marimo notebooks to Markdown peers in CI
- 6690ce4 consolidate marimo notebook into input -> button -> run cells
- d1f04a7 address Copilot review: unused os, accuracy precision, path glob
- fcfef8b move Train button below the training cell; clarify it as the run trigger
- ce18914 clarify W&B entity field label (username or team)
- aa342d1 correct entity guidance: runs require a team, not a username
- 4142351 add a consume step that downloads the model and classifies 10 digits
- 46e0686 fix marimo redefinition error in the consume cell
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| name: marimo export markdown | ||
|
|
||
| # On every push that changes a marimo notebook under examples/marimo/, export | ||
| # each notebook to a peer Markdown file (notebook.py -> notebook.md) and commit | ||
| # the result back to the branch, so the rendered Markdown always tracks the | ||
| # notebook. | ||
| # | ||
| # Loop safety (three independent guards): | ||
| # 1. Pushes made with GITHUB_TOKEN do not trigger new workflow runs — a | ||
| # GitHub Actions built-in, and the primary protection here. | ||
| # 2. The trigger watches only *.py; this job only ever commits *.md. | ||
| # 3. The commit message carries [skip ci]. | ||
| # | ||
| # marimo is pinned so exports are byte-deterministic (the front matter records | ||
| # the marimo version), which means an unchanged notebook never produces a | ||
| # spurious commit. Bump MARIMO_VERSION to refresh all exports on the next push. | ||
|
|
||
| on: | ||
| push: | ||
| paths: | ||
| - 'examples/marimo/**/*.py' | ||
|
|
||
| permissions: | ||
| contents: write | ||
|
|
||
| concurrency: | ||
| group: marimo-export-md-${{ github.ref }} | ||
| cancel-in-progress: true | ||
|
|
||
| env: | ||
| MARIMO_VERSION: "0.23.9" | ||
|
|
||
| jobs: | ||
| export-md: | ||
| # Redundant with the GITHUB_TOKEN protection above, but keeps things safe | ||
| # if someone later swaps in a personal access token. | ||
| if: github.actor != 'github-actions[bot]' | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Checkout branch | ||
| uses: actions/checkout@v4 | ||
| with: | ||
| ref: ${{ github.ref_name }} | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: "3.11" | ||
|
|
||
| - name: Install marimo | ||
| run: python -m pip install --quiet "marimo==${MARIMO_VERSION}" | ||
|
|
||
| - name: Export marimo notebooks to Markdown | ||
| run: | | ||
| shopt -s globstar nullglob | ||
| for nb in examples/marimo/**/*.py; do | ||
| # Only real marimo notebooks construct marimo.App(...). | ||
| if grep -q 'marimo\.App(' "$nb"; then | ||
| echo "Exporting $nb -> ${nb%.py}.md" | ||
| marimo export md "$nb" -o "${nb%.py}.md" -f | ||
| fi | ||
| done | ||
|
|
||
| - name: Commit and push if the Markdown changed | ||
| run: | | ||
| git config user.name 'github-actions[bot]' | ||
| git config user.email '41898282+github-actions[bot]@users.noreply.github.com' | ||
| # Only Markdown peers are generated, so staging the tree captures | ||
| # exactly the exported files (the notebooks themselves are untouched). | ||
| git add -A examples/marimo | ||
| if git diff --cached --quiet; then | ||
| echo "Markdown already up to date." | ||
| else | ||
| git commit -m "docs: export marimo notebook(s) to Markdown [skip ci]" | ||
| git push origin "HEAD:${{ github.ref_name }}" | ||
| fi |
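
The export step above selects notebooks by content rather than by name: a `.py` file counts as a marimo notebook only if it contains `marimo.App(`, and its Markdown peer sits next to it with the suffix swapped. The following is a minimal Python sketch of that selection logic, useful for previewing locally which files the workflow would export. The function names `plan_exports` and `demo` are hypothetical and not part of the PR, and the sketch only plans the exports; it does not call `marimo export md`.

```python
import re
import tempfile
from pathlib import Path

# Same content check as the workflow's `grep -q 'marimo\.App('`.
MARIMO_APP = re.compile(r"marimo\.App\(")


def plan_exports(root: Path) -> dict[Path, Path]:
    """Map each marimo notebook under `root` to its peer Markdown path.

    Hypothetical helper: it mirrors the workflow's shell loop but performs
    no export itself.
    """
    plan: dict[Path, Path] = {}
    for nb in sorted(root.rglob("*.py")):
        text = nb.read_text(encoding="utf-8", errors="replace")
        if MARIMO_APP.search(text):
            plan[nb] = nb.with_suffix(".md")  # notebook.py -> notebook.md
    return plan


def demo() -> list[str]:
    """Build a throwaway tree with one notebook and one plain script."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        (root / "sub" / "notebook.py").write_text(
            "import marimo\napp = marimo.App()\n", encoding="utf-8"
        )
        (root / "helper.py").write_text("print('not a notebook')\n", encoding="utf-8")
        return [
            f"{src.relative_to(root).as_posix()} -> {dst.relative_to(root).as_posix()}"
            for src, dst in plan_exports(root).items()
        ]


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Only `sub/notebook.py` is planned for export in the demo; `helper.py` is skipped because it never constructs `marimo.App(...)`.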
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,144 @@ | ||
| name: marimo molab links | ||
|
|
||
| # Posts — and keeps updated — a PR comment linking each modified marimo | ||
| # notebook to molab (https://molab.marimo.io), which runs any public marimo | ||
| # notebook on GitHub in a hosted environment with no local setup. | ||
| # | ||
| # Security note: this uses `pull_request_target` so the comment can also be | ||
| # posted on PRs from forks (a plain `pull_request` event gives fork PRs a | ||
| # read-only token that cannot comment). The job NEVER checks out or executes | ||
| # PR code — it only reads changed-file metadata and file contents as text | ||
| # through the API, then posts a comment. Do not add a checkout of the PR head | ||
| # or run any PR-provided code in this workflow. | ||
|
|
||
| on: | ||
| pull_request_target: | ||
| types: [opened, synchronize, reopened] | ||
| paths: | ||
| - '**.py' | ||
|
|
||
| permissions: | ||
| contents: read | ||
| pull-requests: write | ||
|
|
||
| jobs: | ||
| molab-links: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Comment molab links for modified marimo notebooks | ||
| uses: actions/github-script@v7 | ||
| with: | ||
| script: | | ||
| const pr = context.payload.pull_request; | ||
| const headOwner = pr.head.repo.owner.login; | ||
| const headRepo = pr.head.repo.name; | ||
| const headSha = pr.head.sha; // pin content detection to this PR revision | ||
| const headRef = pr.head.ref; // branch name for the (auto-tracking) links | ||
| const marker = '<!-- marimo-molab-links -->'; | ||
|
|
||
| // 1. List the files changed in this PR. | ||
| const files = await github.paginate(github.rest.pulls.listFiles, { | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| pull_number: pr.number, | ||
| per_page: 100, | ||
| }); | ||
|
|
||
| // 2. Keep added/modified .py files and decide whether each is a | ||
| // marimo notebook by inspecting its content (never executing it). | ||
| // Every marimo notebook constructs `marimo.App(...)`. | ||
| const isMarimo = /\bmarimo\.App\s*\(/; | ||
| const notebooks = []; | ||
| for (const f of files) { | ||
| if (f.status === 'removed') continue; | ||
| if (!f.filename.endsWith('.py')) continue; | ||
| try { | ||
| const res = await github.rest.repos.getContent({ | ||
| owner: headOwner, | ||
| repo: headRepo, | ||
| path: f.filename, | ||
| ref: headSha, | ||
| }); | ||
| if (!res.data.content) { | ||
| core.warning(`Skipping ${f.filename}: content not inlined (file too large?).`); | ||
| continue; | ||
| } | ||
| const content = Buffer.from(res.data.content, res.data.encoding).toString('utf8'); | ||
| if (isMarimo.test(content)) notebooks.push(f.filename); | ||
| } catch (err) { | ||
| core.warning(`Could not read ${f.filename}: ${err.message}`); | ||
| } | ||
| } | ||
|
|
||
| // 3. Find any prior comment so we update it in place instead of | ||
| // posting a new one on every push. | ||
| const comments = await github.paginate(github.rest.issues.listComments, { | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| issue_number: pr.number, | ||
| per_page: 100, | ||
| }); | ||
| const existing = comments.find(c => c.body && c.body.includes(marker)); | ||
|
|
||
| // 4. No marimo notebooks: clear a stale comment if present, else exit. | ||
| if (notebooks.length === 0) { | ||
| if (existing) { | ||
| await github.rest.issues.updateComment({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| comment_id: existing.id, | ||
| body: `${marker}\n_No marimo notebooks in the current changes._`, | ||
| }); | ||
| } | ||
| core.info('No marimo notebooks found; nothing to link.'); | ||
| return; | ||
| } | ||
|
|
||
| // 5. Build the comment. Links use the branch ref, not a commit | ||
| // SHA, so they always point at the latest revision without the | ||
| // comment needing an update on every push. GitHub resolves | ||
| // multi-segment (slashed) branch names in `blob/<ref>/<path>`, | ||
| // and molab fetches from GitHub, so slashed branches are fine. | ||
| const rows = notebooks.map((path) => { | ||
| // The `/server` suffix opens the notebook in a hosted runtime; | ||
| // without it molab shows a static, non-runnable preview. | ||
| const url = `https://molab.marimo.io/github/${headOwner}/${headRepo}/blob/${headRef}/${path}/server`; | ||
| return `| \`${path}\` | [](${url}) |`; | ||
| }).join('\n'); | ||
|
|
||
| const body = [ | ||
| marker, | ||
| '### ▶️ Run the marimo notebook(s) in this PR', | ||
| '', | ||
| '[molab](https://molab.marimo.io) launches any public marimo notebook on ' | ||
| + 'GitHub in a hosted environment — no local setup required.', | ||
| '', | ||
| '| Notebook | molab |', | ||
| '| --- | --- |', | ||
| rows, | ||
| '', | ||
| `_Links track the head of \`${headRef}\`._`, | ||
| ].join('\n'); | ||
|
|
||
| // 6. Upsert the comment (skip the write when nothing changed, so | ||
| // pushes that add no new notebook don't churn the comment). | ||
| if (existing) { | ||
| if (existing.body === body) { | ||
| core.info('Comment already up to date.'); | ||
| return; | ||
| } | ||
| await github.rest.issues.updateComment({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| comment_id: existing.id, | ||
| body, | ||
| }); | ||
| } else { | ||
| await github.rest.issues.createComment({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| issue_number: pr.number, | ||
| body, | ||
| }); | ||
| } | ||
| core.info(`Linked ${notebooks.length} marimo notebook(s).`); | ||
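
The comment logic in steps 4 to 6 of the script above reduces to a small decision table: create, update, or skip. The Python sketch below restates that table together with the link format so the branches are easy to check by hand. It is an illustration under assumptions, not the workflow's code: `molab_url`, `build_body`, and `decide` are hypothetical names, the comment body is abbreviated, and plain link text stands in for the badge.

```python
from typing import Optional, Sequence

MARKER = "<!-- marimo-molab-links -->"


def molab_url(owner: str, repo: str, ref: str, path: str) -> str:
    """Link format used by the workflow; `/server` requests the hosted runtime."""
    return f"https://molab.marimo.io/github/{owner}/{repo}/blob/{ref}/{path}/server"


def build_body(owner: str, repo: str, ref: str, notebooks: Sequence[str]) -> str:
    """Abbreviated comment body (the real one also has a heading and intro)."""
    if not notebooks:
        return f"{MARKER}\n_No marimo notebooks in the current changes._"
    rows = [
        f"| `{p}` | [Open in molab]({molab_url(owner, repo, ref, p)}) |"
        for p in notebooks
    ]
    lines = [MARKER, "| Notebook | molab |", "| --- | --- |", *rows, "",
             f"_Links track the head of `{ref}`._"]
    return "\n".join(lines)


def decide(existing_body: Optional[str], notebooks: Sequence[str], new_body: str) -> str:
    """Return 'create', 'update', or 'skip', following steps 4-6 of the script."""
    if not notebooks:
        # Step 4: rewrite a stale comment if one exists, otherwise do nothing.
        return "update" if existing_body is not None else "skip"
    if existing_body is None:
        return "create"
    # Step 6: avoid churning the comment when nothing changed.
    return "skip" if existing_body == new_body else "update"


if __name__ == "__main__":
    nbs = ["examples/marimo/a.py"]  # hypothetical path for illustration
    body = build_body("octo", "examples", "feat/nb", nbs)
    print(decide(None, nbs, body))
    print(decide(body, nbs, body))
```

Because the links use the branch ref rather than a commit SHA, a push that changes no notebook paths produces an identical body, and `decide` returns `skip`.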
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,95 @@ | ||
| # MNIST -> W&B Registry (marimo) | ||
|
|
||
| A [marimo](https://marimo.io) notebook that trains a small CNN on MNIST with | ||
| PyTorch, tracks the run in Weights & Biases, saves the trained model as a W&B | ||
| Artifact, and links that Artifact to a collection in the **W&B Registry**. | ||
|
|
||
| The notebook is the first marimo example in this repo and is intentionally | ||
| self-contained: dependencies are declared in a [PEP 723](https://peps.python.org/pep-0723/) | ||
| inline-script block at the top of `mnist_registry.py`, so [`uv`](https://docs.astral.sh/uv/) | ||
| can resolve them automatically. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - Python 3.10 or newer. | ||
| - A W&B account, authenticated one of two ways: run `wandb login` in your | ||
| shell before launching the notebook, or paste your key into the **W&B API | ||
| key** field in the form. Get your key from | ||
| [wandb.ai/authorize](https://wandb.ai/authorize). | ||
| - A W&B **Registry** must exist in your org, and your account needs at least | ||
| the **Member** role on it for the final linking step (linking an artifact is | ||
| a write action). The built-in Model registry is provisioned automatically in | ||
| newer orgs. If linking fails (for example, from a view-only seat), the | ||
| notebook surfaces a remediation message in the last Registry cell instead of | ||
| crashing. See | ||
| [configuring registry access](https://docs.wandb.ai/guides/registry/configure_registry/). | ||
| - GPU is optional. Defaults are tuned to finish in roughly two minutes on CPU. | ||
|
|
||
| ## Run | ||
|
|
||
| Use `uvx` with marimo's sandbox mode — it creates an isolated virtual | ||
| environment from the inline dependencies in the notebook: | ||
|
|
||
| ```bash | ||
| uvx marimo edit mnist_registry.py --sandbox | ||
| ``` | ||
|
|
||
| marimo opens in your browser. Adjust hyperparameters in the form, then click | ||
| **Train model** to start the run. The run URL appears inline as soon as | ||
| training begins. | ||
|
|
||
| If you prefer pip: | ||
|
|
||
| ```bash | ||
| pip install -r requirements.txt | ||
| marimo edit mnist_registry.py | ||
| ``` | ||
|
|
||
| The notebook is interactive-only by design: training is gated by a button | ||
| click, so `marimo run` renders the form but never starts training without | ||
| an explicit click. | ||
|
|
||
| ## What you get | ||
|
|
||
| After a successful run: | ||
|
|
||
| - A W&B run with training and test metrics, gradient histograms (`wandb.watch`), | ||
| and up to 16 example test-set predictions logged as images. | ||
| - A model Artifact named `mnist-cnn-<run-id>` of type `model` with metadata | ||
| for test accuracy, parameter count, dataset sizes, and the full | ||
| hyperparameter dict. The Artifact is tagged with the `latest` alias. | ||
| - A version of that Artifact linked into the configured Registry collection | ||
| (default: `wandb-registry-model/MNIST Classifiers`). | ||
|
|
||
| To consume the registered model from another script or notebook: | ||
|
|
||
| ```python | ||
| import wandb | ||
| api = wandb.Api() | ||
| art = api.artifact("wandb-registry-model/MNIST Classifiers:latest") | ||
| art.download() # writes mnist_cnn.pt under ./artifacts/ | ||
| ``` | ||
|
|
||
| ## Design notes | ||
|
|
||
| - **Training is gated by a button.** marimo cells re-run reactively when their | ||
| inputs change. Before the first click of **Train model**, slider changes do | ||
| not start a run. After a run completes, clicking **Train model** again | ||
| starts a new run with the current form values; the previous run finishes | ||
| cleanly first. | ||
| - **`wandb.run` finishes defensively** at the top of the training cell so | ||
| a second click of **Train model** does not nest runs in the same marimo | ||
| kernel. | ||
| - **`logged.wait()` runs** after `log_artifact` and before `link_artifact` | ||
| to avoid a race where the link tries to resolve a version that has not | ||
| finished committing server-side. | ||
| - **Registry failures soft-fail.** If `link_artifact` raises (usually | ||
| because the Registry does not exist in your org, or because your seat | ||
| is view-only), the notebook surfaces remediation guidance through | ||
| `mo.callout` rather than aborting. | ||
|
|
||
| ## Reference | ||
|
|
||
| The CNN architecture and training loop mirror | ||
| [`examples/pytorch/pytorch-cnn-mnist/main.py`](../../pytorch/pytorch-cnn-mnist/main.py). | ||
| The Registry linking pattern follows | ||
| [`colabs/wandb_registry/zoo_wandb.ipynb`](../../../colabs/wandb_registry/zoo_wandb.ipynb). |