diff --git a/.claude/commands/README.md b/.claude/commands/README.md new file mode 100644 index 00000000000..3767dac987a --- /dev/null +++ b/.claude/commands/README.md @@ -0,0 +1,49 @@ +# Claude Code slash commands for the mcp-server test suite + +Three AI-assisted workflows wrapping `mcp-server/run-tests.sh` and the meshtastic MCP tools. Each one has a twin in `.github/prompts/` for Copilot users. + +| Slash command | What it does | Copilot equivalent | | --------------------- | ------------------------------------------------------------------------- | ---------------------------------------- | | `/test [args]` | Runs the test suite (auto-detects hardware) and interprets failures | `.github/prompts/mcp-test.prompt.md` | | `/diagnose [role]` | Read-only device health report via the meshtastic MCP tools | `.github/prompts/mcp-diagnose.prompt.md` | | `/repro <test> [n=5]` | Re-runs one test N times, diffs firmware logs between passes and failures | `.github/prompts/mcp-repro.prompt.md` | + +## Why two surfaces + +The Claude Code commands and Copilot prompts cover the same three workflows but each speaks its host's idiom: + +- **Claude Code** (`/test`) uses `$ARGUMENTS` for pass-through, has direct access to Bash + all MCP tools registered in the user's settings, and runs in the terminal context. +- **Copilot** (`/mcp-test`) runs in VS Code's agent mode; it has terminal + MCP access too but typically asks the operator to confirm inputs interactively. + +A contributor using either IDE gets equivalent assistance. Keep the two in sync when behavior changes — the diff of intent should be minimal. + +## House rules + +- **No destructive writes without explicit operator approval.** Skills that could reflash, factory-reset, or reboot a device must describe the action and stop — the operator authorizes. 
+- **Interpret failures, don't just echo them.** The skill body should pull firmware log lines from `mcp-server/tests/report.html` (the `Meshtastic debug` section, attached by `tests/conftest.py::pytest_runtest_makereport`) and classify the failure. +- **Keep MCP tool calls sequential per port.** SerialInterface holds an exclusive port lock; two parallel tool calls on the same port deadlock. +- **Never speculate about root cause.** If the evidence doesn't support a classification, say "unknown" and list what you'd need to disambiguate. + +## Adding a new command + +1. Write the Claude Code version at `.claude/commands/<name>.md` with YAML frontmatter: + + ```yaml + --- + description: one-line purpose (used for auto-invocation by the model) + argument-hint: [optional-hint] + --- + ``` + +2. Write the Copilot equivalent at `.github/prompts/mcp-<name>.prompt.md` with: + + ```yaml + --- + mode: agent + description: ... + --- + ``` + +3. Add the row to the table above. Cross-link in both bodies. + +4. Smoke-test on Claude Code first (`/<name>` should appear in autocomplete), then in VS Code Copilot (`/mcp-<name>` in Chat). diff --git a/.claude/commands/diagnose.md b/.claude/commands/diagnose.md new file mode 100644 index 00000000000..45aa937a5b7 --- /dev/null +++ b/.claude/commands/diagnose.md @@ -0,0 +1,55 @@ +--- +description: Produce a device health report using the meshtastic MCP tools (device_info, list_nodes, get_config, short serial log capture) +argument-hint: [role=all|nrf52|esp32s3|<port>] +--- + +# `/diagnose` — device health report + +Call the meshtastic MCP tool bundle and format a structured health report for one or all detected devices. Zero guesswork for the operator. + +## What to do + +1. **Enumerate hardware.** Call `mcp__meshtastic__list_devices(include_unknown=True)`. For each entry where `likely_meshtastic=True`, capture `port`, `vid`, `pid`, `description`. + +2. **Filter by `$ARGUMENTS`**: + - No args, `all` → every likely-meshtastic device. 
+ - `nrf52` → only devices with `vid == 0x239a`. + - `esp32s3` → only devices with `vid == 0x303a` or `vid == 0x10c4`. + - A `/dev/cu.*` path → only that one port. + - Anything else → treat as a substring match against the `port` string. + +3. **For each selected device, in sequence (NOT parallel — SerialInterface holds an exclusive port lock):** + - `mcp__meshtastic__device_info(port=<port>)` — captures `my_node_num`, `long_name`, `short_name`, `firmware_version`, `hw_model`, `region`, `num_nodes`, `primary_channel`. + - `mcp__meshtastic__list_nodes(port=<port>)` — count of peers, which ones have `publicKey` set, SNR/RSSI distribution. + - `mcp__meshtastic__get_config(section="lora", port=<port>)` — region, preset, channel_num, tx_power, hop_limit. + - Optionally, if the device seems unhappy (fails to connect, `num_nodes==1` when ≥2 are plugged in, missing `firmware_version`), open a short firmware log window: `mcp__meshtastic__serial_open(port=<port>, env=<env>)`, wait 3s, `serial_read(session_id=<session_id>, max_lines=100)`, `serial_close(session_id=<session_id>)`. The env should be inferred from the VID map in `mcp-server/run-tests.sh` (nrf52 → rak4631, esp32s3 → heltec-v3) unless `MESHTASTIC_MCP_ENV_<ROLE>` is set. + +4. **Render per-device report** as: + + ```text + [nrf52 @ /dev/cu.usbmodem1101] fw=2.7.23.bce2825, hw=RAK4631 + owner : Meshtastic 40eb / 40eb + region/band : US, channel 88, LONG_FAST + tx_power : 30 dBm, hop_limit=3 + peers : 1 (esp32s3 0x433c2428, pubkey ✓, SNR 6.0 / RSSI -24 dBm) + primary ch : McpTest + firmware : no panics in last 3s; NodeInfoModule emitted 2 broadcasts + ``` + + Keep it scannable. If a field is missing or abnormal (no pubkey for a known peer, region=UNSET, num_nodes inconsistent with the hub), flag it inline with a short `⚠︎` note. + +5. **Cross-device correlation** (only when >1 device is inspected): + - Do both sides see each other in `nodesByNum`? If one does and the other doesn't, that's asymmetric NodeInfo — flag it. + - Do the LoRa configs match? (region, channel_num, modem_preset should all agree; mismatch = no mesh) + - Do the primary channel NAMES match? Mismatch = different PSK = no decode. + +6. **Suggest next actions only for specific, recognisable failure modes**: + - Stale PKI pubkey one-way → "run `/test tests/mesh/test_direct_with_ack.py` — the retry + nodeinfo-ping heals this in the test path." + - Region mismatch → "re-bake one side via `./mcp-server/run-tests.sh --force-bake`." + - Device unreachable → point at touch_1200bps + the CP2102-wedged-driver note in run-tests.sh. + +## What NOT to do + +- No writes. No `set_config`, no `reboot`, no `factory_reset`. This is a read-only diagnostic skill — if the operator wants to change state, they'll ask explicitly. +- No `flash` / `erase_and_flash`. Those are separate escalations. +- No holding SerialInterface across tool calls — open, query, close; next device. The port lock is exclusive. 
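The `$ARGUMENTS` filter in step 2 reduces to a few predicates. A minimal Python sketch — `select_devices` is an illustrative helper, not part of the MCP surface; the dict fields mirror the `list_devices` output named in step 1:

```python
# Illustrative sketch of the /diagnose step-2 filter. select_devices is a
# hypothetical helper; the dict fields mirror list_devices output.
NRF52_VIDS = {0x239A}            # nRF52 boards (e.g. RAK4631)
ESP32S3_VIDS = {0x303A, 0x10C4}  # ESP32-S3 native USB / CP2102 bridge

def select_devices(devices, arg=None):
    """Filter likely-meshtastic devices by role, exact port path, or substring."""
    likely = [d for d in devices if d.get("likely_meshtastic")]
    if arg in (None, "", "all"):
        return likely                                          # no args / all
    if arg == "nrf52":
        return [d for d in likely if d["vid"] in NRF52_VIDS]   # role filter
    if arg == "esp32s3":
        return [d for d in likely if d["vid"] in ESP32S3_VIDS]
    if arg.startswith("/dev/cu."):
        return [d for d in likely if d["port"] == arg]         # exact port
    return [d for d in likely if arg in d["port"]]             # substring
```

The same rules drive the Copilot twin's filter, so a shared helper like this keeps the two surfaces in sync.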
diff --git a/.claude/commands/repro.md b/.claude/commands/repro.md new file mode 100644 index 00000000000..52dcf222b93 --- /dev/null +++ b/.claude/commands/repro.md @@ -0,0 +1,65 @@ +--- +description: Re-run a specific test N times in isolation to triage flakes, diff firmware logs between passes and failures +argument-hint: <nodeid> [count=5] +--- + +# `/repro` — flakiness triage for one test + +Re-run a single pytest node ID N times in isolation, track pass rate, and surface what's _different_ in the firmware logs between the passing attempts and the failing ones. Turns "it's flaky, I guess" into "it fails when X, passes when Y." + +## What to do + +1. **Parse `$ARGUMENTS`**: first token is the pytest node id (e.g. `tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip[nrf52->esp32s3]`); second token is an integer count (default `5`, cap at `20`). If the first token doesn't look like a test path (no `::` and no `tests/` prefix), treat the whole `$ARGUMENTS` as a `-k` filter instead. + +2. **Sanity-check the hub first** (so we're not measuring "nothing plugged in" N times): call `mcp__meshtastic__list_devices`. If the test name contains `nrf52` or `esp32s3` and the matching VID isn't present, stop and report — re-running won't help. + +3. **Loop N times**. For each iteration: + + ```bash + ./mcp-server/run-tests.sh <nodeid> --tb=short -p no:cacheprovider + ``` + + Capture: exit code, duration, and (on failure) the `Meshtastic debug` firmware log section from `mcp-server/tests/report.html`. `-p no:cacheprovider` suppresses pytest's `.pytest_cache` writes so iterations don't influence each other. + +4. **Track a small structured tally**: + + ```text + attempt 1: PASS (42s) + attempt 2: FAIL (128s) ← firmware log 200-line tail captured + attempt 3: PASS (39s) + attempt 4: FAIL (121s) + attempt 5: PASS (41s) + -------------------------------------- + pass rate: 3/5 (60%) | mean duration: 74s + ``` + +5. 
**On mixed outcomes**: diff the firmware log tails between a representative passing attempt and a representative failing attempt. Focus on: + - Error-level lines only present in failures (`PKI_UNKNOWN_PUBKEY`, `Alloc an err=`, `Skip send`, `No suitable channel`) + - Timing around the assertion event — did a broadcast go out, was there an ACK, did NAK fire? + - Device state fields that changed (nodesByNum entries, region/preset, channel_num) + + Surface the top 3 differences as a "passes when / fails when" table. Don't dump full logs — pull specific lines with uptime timestamps. + +6. **Classify the flake** into one of: + - **LoRa airtime collision** → pass rate improves with fewer concurrent transmitters; propose a `time.sleep` gap or retry bump in the test body. + - **PKI key staleness** → fails on first attempt, passes after self-heal; existing retry loop in `test_direct_with_ack.py` handles this. + - **NodeInfo cooldown** → `Skip send NodeInfo since we sent it <600s ago` in fail-only logs; needs `broadcast_nodeinfo_ping()` warmup. + - **Hardware-specific** (one direction fails, other passes; one device's firmware is older; driver wedged) → specific recovery pointer. + - **Genuinely unknown** → say so; don't invent a root cause. + +7. **Report back** with: + - Pass rate and mean duration. + - Classification + evidence (the specific log lines that support it). + - A suggested next step (re-run with specific args, open `/diagnose`, edit a specific test file, nothing). + +## Examples + +- `/repro tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip[esp32s3->nrf52] 10` — runs 10 times, diffs firmware logs. +- `/repro broadcast_delivers` — no `::`, no `tests/`, so interpreted as `-k broadcast_delivers`; runs every matching test the default 5 times. +- `/repro tests/telemetry/test_device_telemetry_broadcast.py 3` — shorter run for a slow test. + +## Constraints + +- Don't exceed `count=20` per invocation — airtime and USB wear add up. 
If the user asks for 50, negotiate down. +- Don't rebuild firmware as part of triage; flakes that only reproduce under different firmware belong in a separate session. +- If the FIRST attempt fails AND the rest all pass, that's a classic "state leak from a prior test" → say so and suggest running with `--force-bake` or starting from a clean state rather than chasing the first failure. diff --git a/.claude/commands/test.md b/.claude/commands/test.md new file mode 100644 index 00000000000..986ee1f31f6 --- /dev/null +++ b/.claude/commands/test.md @@ -0,0 +1,42 @@ +--- +description: Run the mcp-server test suite (auto-detects devices) and interpret the results +argument-hint: [pytest-args] +--- + +# `/test` — mcp-server test runner with interpretation + +Run `mcp-server/run-tests.sh` and make sense of the output so the operator doesn't have to. + +## What to do + +1. **Invoke the wrapper.** From the firmware repo root, run: + + ```bash + ./mcp-server/run-tests.sh $ARGUMENTS + ``` + + The wrapper auto-detects connected Meshtastic devices, maps each to its PlatformIO env, exports the required `MESHTASTIC_MCP_ENV_*` env vars, and invokes pytest. If the user passed no arguments, the wrapper supplies a sensible default set (`tests/ --html=tests/report.html --self-contained-html --junitxml=tests/junit.xml -v --tb=short`). A `--report-log=tests/reportlog.jsonl` arg is always appended (unless the operator passed their own). `--assume-baked` is deliberately NOT in the defaults — `test_00_bake.py` has its own skip-if-already-baked check and runs the ~8 s verification by default. Operators can opt into the fast path with `--assume-baked`, or force a reflash with `--force-bake`. + +2. **Read the pre-flight header.** First ~6 lines print the detected hub (role → port → env). If that line reads `detected hub : (none)`, the wrapper will narrow to `tests/unit` only — say so explicitly in your summary so the operator knows hardware tiers were skipped. + +3. 
**On pass**: one-line summary of the form `N passed, M skipped in <duration>`. Don't enumerate the 52 test names — the user can read those. Do mention if any test was SKIPPED for a NON-placeholder reason (e.g. "role not present on hub" is worth flagging). + +4. **On failure**: for every FAILED test, open `mcp-server/tests/report.html` and extract the `Meshtastic debug` section for that test. pytest-html embeds the firmware log stream + device state dump there; the 200-line firmware log tail is usually enough to explain the failure. Summarise: which test, one-line assertion message, the firmware log lines that matter (things like `PKI_UNKNOWN_PUBKEY`, `Skip send NodeInfo`, `Error=`, `Guru Meditation`, `assertion failed`). + +5. **Classify the failure** as one of: + - **Transient/flake**: LoRa collision, timing-sensitive assertion, first-attempt NAK + successful retry pattern. Propose `/repro <nodeid>` to confirm. + - **Environmental**: device unreachable, port busy, CP2102 driver wedged. Suggest the specific recovery (replug USB, `touch_1200bps`, check `git status userPrefs.jsonc`). + - **Regression**: same assertion fails repeatedly, firmware log shows a new/unusual error. Surface the diff between expected and observed, identify the module likely responsible. + +6. **Never run destructive recovery automatically.** If a failure looks like it needs a reflash, factory_reset, or USB replug, _describe what to do_ — don't execute. The operator decides. + +## Arguments handling + +- No args → wrapper's defaults (full suite). +- `$ARGUMENTS` passed verbatim to the wrapper, which passes them to pytest. +- Common operator invocations: `/test tests/mesh`, `/test tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip`, `/test --force-bake`, `/test -k telemetry`. + +## Side-effects to mention in summary + +- The session fixture snapshots `userPrefs.jsonc` at session start and restores at teardown (plus on `atexit`). After a clean run, `git status userPrefs.jsonc` should be empty. 
If the wrapper's pre-flight printed a warning about a stale sidecar, call that out — means a prior session crashed. +- `mcp-server/tests/report.html` and `junit.xml` are regenerated on every run; the HTML is self-contained (shareable). diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 24e11bd4ddb..d12244229e6 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -429,6 +429,8 @@ Most workflows can be triggered manually via `workflow_dispatch` for testing. ## Testing +### Native unit tests (C++) + Unit tests in `test/` directory with 12 test suites: - `test_crypto/` - Cryptography @@ -446,6 +448,164 @@ Run with: `pio test -e native` Simulation testing: `bin/test-simulator.sh` +### Hardware-in-the-loop tests (`mcp-server/tests/`) + +Separate pytest suite that exercises real USB-connected Meshtastic devices. See the **MCP Server & Hardware Test Harness** section below for invocation, tier layout, and agent usage rules. + +## MCP Server & Hardware Test Harness + +The `mcp-server/` directory houses a firmware-aware [MCP](https://modelcontextprotocol.io/) server plus a pytest-based integration suite. AI agents that speak MCP get a well-defined tool surface for flashing, configuring, and inspecting physical Meshtastic devices — use it instead of hand-rolling `pio` or `meshtastic --port` calls where possible. `mcp-server/README.md` is the operator-facing setup doc; this section is the agent-facing usage contract. + +The repo registers the server via `.mcp.json` at the repo root — Claude Code picks it up automatically once `mcp-server/.venv/` is built (`cd mcp-server && python3 -m venv .venv && .venv/bin/pip install -e '.[test]'`). 
+ +### When to use which surface + +| Goal | Tool | +| ------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------- | +| Find a connected device | `mcp__meshtastic__list_devices` | +| Read a live node's config/state | `mcp__meshtastic__device_info`, `list_nodes`, `get_config` | +| Mutate a device (owner, region, channels, reboot) | `set_owner`, `set_config`, `set_channel_url`, `reboot`, `shutdown`, `factory_reset` — all require `confirm=True` | +| Flash firmware to a variant | `pio_flash` (any arch) or `erase_and_flash` (ESP32 factory install) | +| Stream serial logs while debugging | `serial_open` → `serial_read` loop → `serial_close` | +| Administer `userPrefs.jsonc` build-time constants | `userprefs_get`, `userprefs_set`, `userprefs_reset`, `userprefs_manifest` | +| Run the regression suite | `./mcp-server/run-tests.sh` (or `/test` slash command) | +| Diagnose a specific device | `/diagnose [role]` slash command (read-only) | +| Triage a flaky test | `/repro [count]` slash command | + +**One MCP call per port at a time.** `SerialInterface` holds an exclusive OS-level lock on the serial port for its lifetime. If a `serial_*` session is open on `/dev/cu.usbmodem101`, calling `device_info` on the same port will fail fast pointing at the active session. Sequence calls: open → read/mutate → close, then next device. Never parallelize tool calls on the same port. + +### MCP tool surface (~32 tools) + +Grouped by purpose. Full argument shapes in `mcp-server/README.md`; a few high-value signatures are called out here. 
+ +- **Discovery & metadata**: `list_devices`, `list_boards`, `get_board` +- **Build & flash**: `build`, `clean`, `pio_flash`, `erase_and_flash` (ESP32 only), `update_flash` (ESP32 OTA), `touch_1200bps` +- **Serial sessions** (long-running, 10k-line ring buffer): `serial_open`, `serial_read`, `serial_list`, `serial_close` +- **Device reads**: `device_info`, `list_nodes` +- **Device writes** (all require `confirm=True`): `set_owner`, `get_config`, `set_config`, `get_channel_url`, `set_channel_url`, `send_text`, `reboot`, `shutdown`, `factory_reset`, `set_debug_log_api` +- **userPrefs admin** (build-time constants, not runtime config): `userprefs_get`, `userprefs_set`, `userprefs_reset`, `userprefs_manifest`, `userprefs_testing_profile` +- **Vendor escape hatches**: `esptool_chip_info`, `esptool_erase_flash`, `esptool_raw`, `nrfutil_dfu`, `nrfutil_raw`, `picotool_info`, `picotool_load`, `picotool_raw` + +`confirm=True` is a tool-level gate on top of whatever permission prompt your MCP host shows. **Don't bypass it** by asking the host to auto-approve — it exists specifically because MCP hosts sometimes remember "always allow this tool" and that's dangerous for `factory_reset` and `erase_and_flash`. + +### Hardware test suite (`mcp-server/run-tests.sh`) + +The wrapper auto-detects connected devices (VID → role map: `0x239A` → `nrf52`, `0x303A`/`0x10C4` → `esp32s3`), maps each role to a PlatformIO env (`nrf52` → `rak4631`, `esp32s3` → `heltec-v3`, overridable via `MESHTASTIC_MCP_ENV_<ROLE>`), then invokes pytest. Zero pre-flight config needed from the operator. + +Suite tiers (collected + run in this order via `pytest_collection_modifyitems`): + +1. `tests/unit/` — pure Python (boards parse, pio wrapper, userPrefs parse, testing profile). No hardware. +2. `tests/test_00_bake.py` — flashes each detected device with current `userPrefs.jsonc` merged with the session's test profile. 
Has its own skip-if-already-baked check comparing region + primary channel to the session profile; skips cheaply on warm devices. +3. `tests/mesh/` — multi-device mesh: bidirectional send, broadcast delivery, direct-with-ACK, mesh formation within 60s. Parametrized `[nrf52->esp32s3]` and `[esp32s3->nrf52]`. +4. `tests/telemetry/` — `DEVICE_METRICS_APP` broadcast timing. +5. `tests/monitor/` — boot-log panic check. +6. `tests/fleet/` — PSK seed session isolation. +7. `tests/admin/` — channel URL roundtrip, owner persistence across reboot. +8. `tests/provisioning/` — region + modem + slot bake, admin key presence, `UNSET` region blocks TX, userPrefs survive factory reset. + +Invocation patterns: + +```bash +./mcp-server/run-tests.sh # full suite (auto-bake-if-needed) +./mcp-server/run-tests.sh --force-bake # reflash before testing +./mcp-server/run-tests.sh --assume-baked # skip bake (caller vouches for device state) +./mcp-server/run-tests.sh tests/mesh # one tier +./mcp-server/run-tests.sh tests/mesh/test_direct_with_ack.py # one file +./mcp-server/run-tests.sh -k telemetry # name filter +``` + +**No hardware detected?** The wrapper auto-narrows to `tests/unit/` only and prints `detected hub : (none)` in the pre-flight header. Agents interpreting the output should call this out explicitly — a 52-test green run without hardware is qualitatively different from a 12-unit-test green run. + +**Artifacts every run produces:** + +- `mcp-server/tests/report.html` — self-contained pytest-html. Each test gets a `Meshtastic debug` section with the tail of firmware log + device state dump. **Open this first** on failures; it's the canonical evidence source. +- `mcp-server/tests/junit.xml` — CI-parseable. +- `mcp-server/tests/reportlog.jsonl` — pytest-reportlog stream (`$report_type` keyed JSONL). Consumed by the live TUI. +- `mcp-server/tests/fwlog.jsonl` — firmware log mirror from the `meshtastic.log.line` pubsub topic. 
Populated by the `_firmware_log_stream` autouse session fixture. + +### Live TUI (`meshtastic-mcp-test-tui`) + +A Textual-based live view that wraps `run-tests.sh`. Tails reportlog for per-test state, streams firmware logs, polls device state at startup + post-run (gated out of the active run because `hub_devices` holds exclusive port locks). Key bindings: + +| Key | Action | +| --- | ------------------------------------------------------------------------------------------------------------ | +| `r` | re-run focused test (leaf → that node id; internal node → directory or `-k`) | +| `f` | filter tree by substring | +| `d` | failure detail modal (pulls `longrepr` + captured stdout from the reportlog) | +| `g` | export reproducer bundle (tar.gz with README, test_report.json, time-filtered fwlog, devices.json, env.json) | +| `l` | toggle firmware log pane | +| `x` | tool coverage modal | +| `c` | cross-run history sparkline | +| `q` | quit (SIGINT → SIGTERM → SIGKILL escalation, 5-s windows each) | + +Launch: + +```bash +cd mcp-server +.venv/bin/meshtastic-mcp-test-tui # full suite +.venv/bin/meshtastic-mcp-test-tui tests/mesh # args pass through to pytest +``` + +The plain CLI stays primary; the TUI is for operators who want a live dashboard. Both consume the same `run-tests.sh`. + +### Slash commands (Claude Code + Copilot) + +Three AI-assisted workflows wrap the test harness. Claude Code operators get `/test`, `/diagnose`, `/repro`; Copilot operators get `/mcp-test`, `/mcp-diagnose`, `/mcp-repro`. Bodies: + +- `.claude/commands/{test,diagnose,repro}.md` +- `.github/prompts/mcp-{test,diagnose,repro}.prompt.md` + +`.claude/commands/README.md` is the index. + +House rules for agents running these prompts: + +- **Interpret failures, don't just echo them.** Pull firmware log tails from `report.html` and classify each failure as transient / environmental / regression. Use the exact format in `.claude/commands/test.md`. 
+- **No destructive writes without operator approval.** Any skill that could reflash, factory-reset, or reboot a device must describe the action and stop. The operator authorizes. +- **Sequential MCP calls per port.** See above. +- **"Unknown" is a valid classification.** If evidence doesn't support a root cause, say so and list what would disambiguate. Do not invent. + +### Key fixtures (test authors + agents debugging) + +`mcp-server/tests/conftest.py` provides: + +- **`_session_userprefs`** (autouse session) — snapshots `userPrefs.jsonc` at session start, merges the session test profile via `userprefs.merge_active(test_profile)`, restores at teardown. Four layers of safety: pytest teardown + `atexit` + sidecar file (`userPrefs.jsonc.mcp-session-bak`) + startup self-heal in `run-tests.sh`. **Do not edit `userPrefs.jsonc` from inside a test.** +- **`_firmware_log_stream`** (autouse session) — subscribes to `meshtastic.log.line` pubsub on every connected `SerialInterface` and mirrors lines to `tests/fwlog.jsonl`. Drives the TUI firmware-log pane. +- **`_debug_log_buffer`** (autouse per-test) — captures last 200 firmware log lines + device state for attachment to the pytest-html `Meshtastic debug` section on failure. +- **`hub_devices`** (session) — `dict[role, SerialInterface]` with session-long exclusive port locks. Reason the TUI's device poller is gated to startup + post-run only. +- **`baked_mesh`** — parametrized mesh-pair fixture; depends on `test_00_bake`. `pytest_generate_tests` in `conftest.py` auto-generates `[nrf52->esp32s3]` and `[esp32s3->nrf52]` variants. +- **`test_profile`** — session-scoped dict: region, primary channel, admin key, PSK seed. Derived from `MESHTASTIC_MCP_SEED` (defaults to `mcp--`). + +### Firmware integration points tied to the test harness + +Two firmware changes exist specifically so the test harness works reliably. 
**Keep these in mind when touching related code.** + +- **`src/mesh/StreamAPI.cpp` + `StreamAPI.h`** — `emitLogRecord` uses a dedicated `fromRadioScratchLog` + `txBufLog` pair and a `concurrency::Lock streamLock`. Before this fix, `debug_log_api_enabled=true` would tear `FromRadio` protobufs on the serial transport because `emitTxBuffer` and `emitLogRecord` shared a single scratch buffer. The conftest enables the log stream session-wide; without this fix the device would corrupt its own FromRadio replies mid-session. +- **`src/mesh/PhoneAPI.cpp`** — `ToRadio` `Heartbeat(nonce=1)` triggers `nodeInfoModule->sendOurNodeInfo(NODENUM_BROADCAST, true, 0, true)` for serial clients, mirroring the pre-existing behavior for TCP/UDP clients in `PacketAPI.cpp`. The mesh tests rely on this to force a NodeInfo broadcast right after connect so the peer discovers them before the test's first assertion. + +If you're modifying `StreamAPI`, `PhoneAPI`, `NodeInfoModule`, or `userPrefs` flow, run `./mcp-server/run-tests.sh` at minimum before asking for review. + +### Recovery playbooks + +| Symptom | First check | Fix | +| ---------------------------------------------------------- | ------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `userPrefs.jsonc` dirty after test run | `git status --porcelain userPrefs.jsonc` | If non-empty, re-run `./mcp-server/run-tests.sh` once — the pre-flight self-heal restores from sidecar. If still dirty, `git checkout userPrefs.jsonc`. | +| Port busy / wedged CP2102 on macOS | `lsof /dev/cu.usbserial-0001` | Kill the holder. USB replug if the kernel still reports busy. Often a stale `pio device monitor` or zombie `meshtastic_mcp` process. 
| +| nRF52 appears unresponsive | `list_devices` shows VID `0x239A` but `device_info` times out | `touch_1200bps(port=...)` drops it into the DFU bootloader → `pio_flash` re-installs. | +| Multiple MCP server processes | `ps aux \| grep meshtastic_mcp` shows >1 | Kill all but the one your MCP host spawned. Zombies hold ports and break tests. | +| Mesh formation fails, one side sees peer but other doesn't | `/diagnose` (or `list_nodes` on both sides) | Asymmetric NodeInfo. `test_direct_with_ack` has a heal path; `/repro` it a few times. If persistent, both devices' clocks may be out of sync with their NodeInfo cooldown. | +| "role not present on hub" in skip reasons | `list_devices` | Expected if a device is unplugged. Reconnect before re-running the tier. | +| Tests fail only on first attempt then pass on rerun | — | State leak from a prior session. Run with `--force-bake` to reset to a known state. | + +### Never do these without asking + +- `factory_reset` — wipes node identity; regenerates PKI keypair. Mesh peers will reject old DMs until re-exchange. Legitimate only when the operator explicitly wants it. +- `erase_and_flash` — full chip erase; destroys all on-device state. +- `esptool_erase_flash` / `esptool_raw` write/erase — bypasses pio's safety chain. +- `set_config` on `lora.region` — changes regulatory domain; requires physical-location context the operator has and the agent doesn't. +- `reboot` / `shutdown` mid-test — breaks fixture invariants. +- `push -f`, `rebase -i`, `reset --hard`, or any history-rewriting git operation. +- Clicking computer-use tools on web links in Mail/Messages/PDFs — open URLs via the claude-in-chrome MCP so the extension's link-safety checks apply. 
+ ## Resources - [Documentation](https://meshtastic.org/docs/) diff --git a/.github/prompts/mcp-diagnose.prompt.md b/.github/prompts/mcp-diagnose.prompt.md new file mode 100644 index 00000000000..c86826030d9 --- /dev/null +++ b/.github/prompts/mcp-diagnose.prompt.md @@ -0,0 +1,57 @@ +--- +mode: agent +description: Device health report via the meshtastic MCP tools (Copilot equivalent of the Claude Code /diagnose slash command) +--- + +# `/mcp-diagnose` — device health report + +Equivalent of `.claude/commands/diagnose.md`. Use when the operator asks to "check the devices", "what's the mesh looking like", "is nrf52 alive", etc. + +This prompt assumes the meshtastic MCP server is registered with your VS Code Copilot agent. If it isn't, fall back to running `./mcp-server/run-tests.sh tests/unit` plus a short `device_info` script via the terminal. + +## What to do + +1. **Enumerate hardware** via the `list_devices` MCP tool (with `include_unknown=True`). For each entry where `likely_meshtastic=True`, capture `port`, `vid`, `pid`, `description`. + +2. **Apply the operator's filter** (if any): + - No filter → every likely-meshtastic device. + - `nrf52` → `vid == 0x239a` + - `esp32s3` → `vid == 0x303a` or `vid == 0x10c4` + - A `/dev/cu.*` path → only that port. + - Anything else → substring match on port. + +3. **For each selected device, in sequence (don't parallelize — SerialInterface holds an exclusive port lock):** + - `device_info(port=

<port>)` → `my_node_num`, `long_name`, `short_name`, `firmware_version`, `hw_model`, `region`, `num_nodes`, `primary_channel` + - `list_nodes(port=<port>)` → peer count, which peers have `publicKey`, SNR/RSSI distribution + - `get_config(section="lora", port=<port>)` → region, preset, channel_num, tx_power, hop_limit + - If anything looks off (can't connect, `num_nodes` wrong, missing `firmware_version`), open a short firmware-log window: `serial_open(port=<port>, env=<env>)`, wait 3 seconds, `serial_read(session_id, max_lines=100)`, `serial_close(session_id)`. Infer env from VID (0x239a → `rak4631`, 0x303a/0x10c4 → `heltec-v3`) unless an `MESHTASTIC_MCP_ENV_<ROLE>` env var overrides it. + +4. **Render per-device report** as a compact block: + + ```text + [nrf52 @ /dev/cu.usbmodem1101] fw=2.7.23.bce2825, hw=RAK4631 + owner : Meshtastic 40eb / 40eb + region/band : US, channel 88, LONG_FAST + tx_power : 30 dBm, hop_limit=3 + peers : 1 (esp32s3 0x433c2428, pubkey ✓, SNR 6.0 / RSSI -24 dBm) + primary ch : McpTest + firmware : no panics in last 3s + ``` + + Flag abnormalities inline with `⚠︎` — missing pubkey on a known peer, region UNSET, mismatched channel name, etc. + +5. **Cross-device correlation** (when >1 device selected): + - Do both see each other in `nodesByNum`? + - Do `region`, `channel_num`, `modem_preset` match across devices? + - Do the primary channel names match? (Different name → different PSK → no decode.) + +6. **Suggest next steps only for recognizable failure modes**, never speculatively: + - Stale PKI one-way → "`/mcp-test tests/mesh/test_direct_with_ack.py` — the test's retry+nodeinfo-ping heals this." + - Region mismatch → "re-bake one side via `./mcp-server/run-tests.sh --force-bake`." + - Device unreachable → refer operator to the touch_1200bps + CP2102-wedged-driver notes in `run-tests.sh`. + +## Hard constraints + +- **Read-only.** No `set_config`, no `reboot`, no `factory_reset`, no `flash`. If the operator wants mutation, they'll escalate explicitly. +- **Open/query/close per device.** Never hold multiple SerialInterfaces to the same port. The port lock is exclusive. +- **Don't infer env beyond the VID map** — if the operator has an unusual board, ask them which env to use rather than guessing. 
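The step-5 correlation above is a per-field set check across the inspected devices. A hedged Python sketch — `lora_mismatches` is an illustrative name, and the compared fields are the ones this prompt says must agree:

```python
# Illustrative helper for the step-5 cross-device check: any LoRa field with
# more than one distinct value across devices means the mesh cannot form.
def lora_mismatches(configs):
    """Return the LoRa config fields that differ across the inspected devices."""
    fields = ("region", "channel_num", "modem_preset")
    return [f for f in fields if len({c[f] for c in configs}) > 1]
```

An empty result means the radios at least agree on air parameters; channel-name/PSK agreement still has to be checked separately.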
diff --git a/.github/prompts/mcp-repro.prompt.md b/.github/prompts/mcp-repro.prompt.md new file mode 100644 index 00000000000..be2963c3318 --- /dev/null +++ b/.github/prompts/mcp-repro.prompt.md @@ -0,0 +1,67 @@ +--- +mode: agent +description: Re-run a specific test N times to triage flakes; diff firmware logs between passes and failures (Copilot equivalent of the Claude Code /repro slash command) +--- + +# `/mcp-repro` — flakiness triage for one test + +Equivalent of `.claude/commands/repro.md`. Use when the operator says "that one test is flaky — dig in", "repro the direct_with_ack failure", "why does X sometimes fail?". + +## What to do + +1. **Parse the operator's input** into two pieces: + - **Test identifier** — either a pytest node id (has `::` or starts with `tests/`) or a `-k`-style filter (plain substring like `direct_with_ack`). + - **Count** — integer, default `5`, cap at `20`. If the operator asks for 50, negotiate down and explain (airtime + USB wear). + +2. **Sanity-check the hub** via the `list_devices` MCP tool. If the test name references `nrf52` or `esp32s3` and the matching VID isn't present, stop and report — re-running won't help. + +3. **Loop** N times. Each iteration: + + ```bash + ./mcp-server/run-tests.sh <test-id> --tb=short -p no:cacheprovider + ``` + + `-p no:cacheprovider` keeps pytest from caching anything between iterations. Capture: exit code, duration, and (on failure) the `Meshtastic debug` firmware-log section from `mcp-server/tests/report.html`. + +4. **Tally** results as you go: + + ```text + attempt 1: PASS (42s) + attempt 2: FAIL (128s) ← fw log captured + attempt 3: PASS (39s) + attempt 4: FAIL (121s) + attempt 5: PASS (41s) + -------------------------------------------------- + pass rate: 3/5 (60%) | mean duration: 74s + ``` + +5. **On mixed outcomes, diff the firmware logs** between one representative pass and one representative fail.
Focus on: + - Error-level lines present only in failures (`PKI_UNKNOWN_PUBKEY`, `Alloc an err=`, `Skip send`, `No suitable channel`, `NAK`) + - Timing around the assertion point (broadcast sent? ACK received? retry fired?) + - Device-state fields that changed between attempts + + Surface the top 3 differences as a compact "passes when / fails when" table with uptime timestamps. Don't dump full logs. + +6. **Classify** the flake into one of: + - **LoRa airtime collision** — pass rate improves with fewer concurrent transmitters. Suggest a `time.sleep` gap or retry bump in the test body. + - **PKI key staleness** — first attempt fails, subsequent ones pass; existing retry-loop pattern in `test_direct_with_ack.py` is the fix. + - **NodeInfo cooldown** — `Skip send NodeInfo since we sent it <600s ago` in fail-only logs; needs a `broadcast_nodeinfo_ping()` warmup. + - **Hardware-specific** — one direction consistently fails, firmware versions differ, CP2102 driver wedged, etc. + - **Unknown** — say so. Don't invent a root cause. + +7. **Report back** with: + - Pass rate + mean duration. + - Classification + the specific log evidence for it. + - A concrete next step (tighter assertion, more retries, open `/mcp-diagnose`, file a bug, nothing). + +## Examples + +- `tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip[esp32s3->nrf52] 10` — 10 runs of that parametrized case. +- `broadcast_delivers` — no `::`, no `tests/`; treat as `-k broadcast_delivers`; runs every match 5 times. +- `tests/telemetry/test_device_telemetry_broadcast.py 3` — shorter count for a slow test. + +## Notes + +- If the FIRST attempt fails and the rest pass, that's a state-leak signature — suggest starting from `--force-bake` or a clean device state rather than chasing the first-failure firmware logs. +- If ALL N fail, this isn't a flake — it's a regression. Say so, stop iterating, escalate to `/mcp-test` for full-suite context. +- Don't rebuild firmware during triage. 
Flakes that only reproduce under different firmware belong in a separate session with a plan. diff --git a/.github/prompts/mcp-test.prompt.md b/.github/prompts/mcp-test.prompt.md new file mode 100644 index 00000000000..092ad3d856c --- /dev/null +++ b/.github/prompts/mcp-test.prompt.md @@ -0,0 +1,51 @@ +--- +mode: agent +description: Run the mcp-server test suite and interpret results (Copilot equivalent of the Claude Code /test slash command) +--- + +# `/mcp-test` — mcp-server test runner with interpretation + +Equivalent of the Claude Code `/test` slash command in `.claude/commands/test.md`. Use this when the operator asks you to "run the tests", "check the mcp test suite", "run the mesh tests", etc. + +## What to do + +1. **Invoke the wrapper** from the firmware repo root: + + ```bash + ./mcp-server/run-tests.sh [pytest-args] + ``` + + If the operator specified a subset (e.g. "just the mesh tests"), pass it through as `tests/mesh` or a pytest `-k` filter. If they said nothing, use the wrapper's defaults (full suite with pytest-html report). + + The wrapper auto-detects connected Meshtastic devices, maps each to its PlatformIO env, exports the required env vars, and invokes pytest. Zero pre-flight config needed from the operator. + +2. **Read the pre-flight header** (first few lines of wrapper output). The `detected hub :` line lists role → port → env mappings. If it reads `(none)`, the wrapper narrowed to `tests/unit` only — call that out explicitly so the operator knows hardware tiers were skipped. + +3. **On pass**: one-line summary like `N passed, M skipped in <duration>`. Don't enumerate test names. DO mention any non-placeholder SKIPs (things like "role not present on hub") because they indicate missing hardware or setup issues. + +4. **On failure**: open `mcp-server/tests/report.html` (pytest-html output, self-contained) and extract the `Meshtastic debug` section for each failed test. That section includes a firmware log stream (last 200 lines) and device state dump.
For each failure, summarise: + - test name + - one-line assertion message + - the specific firmware log lines that explain why (look for `PKI_UNKNOWN_PUBKEY`, `Skip send NodeInfo`, `Error=`, `Guru Meditation`, `assertion failed`, `No suitable channel`) + +5. **Classify each failure** as one of: + - **Transient flake** — LoRa collision, first-attempt NAK with self-heal pattern, timing-sensitive assertion. Suggest `/mcp-repro <test>` to confirm. + - **Environmental** — device unreachable, port busy, CP2102 driver wedged on macOS. Suggest specific recovery (USB replug, `touch_1200bps`, `git status userPrefs.jsonc`). + - **Regression** — same assertion fails repeatedly on re-runs, firmware log shows novel errors. Identify the firmware module likely responsible. + +6. **Do NOT run destructive recovery automatically**. If a failure looks like it needs a reflash, factory-reset, or replug — *describe the steps* and let the operator decide. Never burn airtime or flash cycles without approval. + +## Arguments convention + +Operators generally invoke this prompt either with no arguments (full suite) or with a specific subset. Examples: + +- `tests/mesh` — one tier +- `tests/mesh/test_direct_with_ack.py::test_direct_with_ack_roundtrip` — one test +- `--force-bake` — reflash devices first +- `-k telemetry` — name-filter + +## Side-effects to confirm in your summary + +- `userPrefs.jsonc` should be clean after a successful run. The session fixture in `mcp-server/tests/conftest.py` (`_session_userprefs`) snapshots and restores. Check `git status --porcelain userPrefs.jsonc` and report if it's non-empty. +- `mcp-server/tests/report.html` and `junit.xml` regenerate on every run. +- The wrapper prints a warning if a `.mcp-session-bak` sidecar was left over from a crashed prior session and auto-restores from it — mention that if it happened.
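The `userPrefs.jsonc` check in the side-effects list reduces to "any porcelain output means the session fixture failed to restore the file". A minimal sketch — helper names hypothetical:

```python
import subprocess


def userprefs_dirty(porcelain: str) -> bool:
    """Interpret `git status --porcelain userPrefs.jsonc` output: git prints
    nothing for a clean file, so any non-blank line (" M ...", "?? ...")
    means the snapshot/restore fixture did not run to completion."""
    return any(line.strip() for line in porcelain.splitlines())


def check_userprefs(repo_root: str = ".") -> bool:
    """Run the actual git query and report True when a warning is warranted."""
    out = subprocess.run(
        ["git", "status", "--porcelain", "userPrefs.jsonc"],
        cwd=repo_root, capture_output=True, text=True, check=True,
    ).stdout
    return userprefs_dirty(out)
```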
diff --git a/.gitignore b/.gitignore index 43cee78db73..f1eb9d852d7 100644 --- a/.gitignore +++ b/.gitignore @@ -54,3 +54,5 @@ CMakeLists.txt # PYTHONPATH used by the Nix shell .python3 +.claude/scheduled_tasks.lock +userPrefs.jsonc.mcp-session-bak diff --git a/.mcp.json b/.mcp.json new file mode 100644 index 00000000000..c5cf2e55e5a --- /dev/null +++ b/.mcp.json @@ -0,0 +1,11 @@ +{ + "mcpServers": { + "meshtastic": { + "command": "./mcp-server/.venv/bin/python", + "args": ["-m", "meshtastic_mcp"], + "env": { + "MESHTASTIC_FIRMWARE_ROOT": "." + } + } + } +} diff --git a/.trunk/configs/.bandit b/.trunk/configs/.bandit index d286ded8974..c70e7743b67 100644 --- a/.trunk/configs/.bandit +++ b/.trunk/configs/.bandit @@ -1,2 +1,28 @@ [bandit] -skips = B101 \ No newline at end of file +# Rule IDs: https://bandit.readthedocs.io/en/latest/plugins/index.html +# +# B101 assert_used +# pytest assertions + internal invariants; required for pytest. +# B110 try_except_pass +# best-effort cleanup paths (atexit handlers, pubsub unsubscribe, +# session-end file close, socket shutdown). Logging inside the +# except block would be worse than the silent pass — teardown is +# already at end-of-session and the surrounding caller has context. +# B112 try_except_continue +# defensive loops over flaky sources (pubsub handlers, device +# re-enumeration polls). One failed iteration shouldn't abort the loop. +# B404 import_subprocess +# mcp-server wraps PlatformIO, esptool, nrfutil, picotool, and the +# pytest test-runner — subprocess is a load-bearing import here, not +# a smell. The "consider possible security implications" advisory is +# redundant given the file-level review already applied. +# B603 subprocess_without_shell_equals_true +# all subprocess calls use a static argv list; `shell=False` is the +# default and we never string-interpolate user input into the command. 
+# B606 start_process_with_no_shell +# same invariant as B603 — running a binary via argv list (not +# `shell=True`) is the safe pattern bandit is asking for. +# +# Higher-severity checks (B102 exec_used, B301 pickle, B307 eval, +# B602 shell=True, etc.) remain enabled. +skips = B101,B110,B112,B404,B603,B606 \ No newline at end of file diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 00000000000..cd043c08787 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,113 @@ +# Agent instructions + +This repository is the [Meshtastic](https://meshtastic.org) firmware — a C++17 embedded codebase targeting ESP32 / nRF52 / RP2040 / STM32WL / Linux-Portduino LoRa mesh radios — plus a Python MCP server in `mcp-server/` that AI agents use to flash, configure, and test connected devices. + +## Primary instruction file + +**Read `.github/copilot-instructions.md` first.** That file is the canonical agent-facing document for this repo. It covers project layout, coding conventions (naming, module framework, Observer pattern, thread safety), the build system, CI/CD, the native C++ test suite, and — most importantly for automation work — the **MCP Server & Hardware Test Harness** section. Read it top-to-bottom before starting any non-trivial change. + +This file (`AGENTS.md`) is a short pointer + quick reference for agents that don't read `.github/copilot-instructions.md` by default. + +## Quick command reference + +| Action | Command | +| -------------------------------- | ----------------------------------------------------------------------------------- | +| Build a firmware variant | `pio run -e <env>` (e.g.
`pio run -e rak4631`, `pio run -e heltec-v3`) | + | Clean + rebuild | `pio run -e <env> -t clean && pio run -e <env>` | + | Flash a device | `pio run -e <env> -t upload --upload-port <port>` (or use the `pio_flash` MCP tool) | + | Run firmware unit tests (native) | `pio test -e native` | + | Run MCP hardware tests | `./mcp-server/run-tests.sh` | + | Live TUI test runner | `mcp-server/.venv/bin/meshtastic-mcp-test-tui` | + | Format before commit | `trunk fmt` | + | Regenerate protobuf bindings | `bin/regen-protos.sh` | + | Generate CI matrix | `./bin/generate_ci_matrix.py all [--level pr]` | + +## MCP server (device + test automation) + +The `mcp-server/` package exposes 38 MCP tools for device discovery, building, flashing, serial monitoring, and live-node administration. Tools are grouped as: + +- **Discovery**: `list_devices`, `list_boards`, `get_board` +- **Build & flash**: `build`, `clean`, `pio_flash`, `erase_and_flash` (ESP32 factory), `update_flash` (ESP32 OTA), `touch_1200bps` +- **Serial sessions**: `serial_open`, `serial_read`, `serial_list`, `serial_close` +- **Device reads**: `device_info`, `list_nodes` +- **Device writes** (destructive ones require `confirm=True`): `set_owner`, `get_config`, `set_config`, `get_channel_url`, `set_channel_url`, `send_text`, `reboot`, `shutdown`, `factory_reset`, `set_debug_log_api` +- **userPrefs admin**: `userprefs_get`, `userprefs_set`, `userprefs_reset`, `userprefs_manifest`, `userprefs_testing_profile` +- **Vendor escape hatches**: `esptool_*`, `nrfutil_*`, `picotool_*` + +Setup: `cd mcp-server && python3 -m venv .venv && .venv/bin/pip install -e '.[test]'`. The repo registers the server via `.mcp.json` — Claude Code picks it up automatically. + +See `mcp-server/README.md` for argument shapes and the **MCP Server & Hardware Test Harness** section of `.github/copilot-instructions.md` for agent usage rules (tool surface, fixture contract, firmware integration points, recovery playbooks).
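The `confirm=True` gate on destructive tools is a refuse-by-default check inside each tool, independent of any host-side permission prompt. An illustrative sketch — the real registrations live in `mcp-server/src/meshtastic_mcp/` and may differ in detail:

```python
def factory_reset(port: str, confirm: bool = False) -> str:
    """Sketch of the tool-level gate: without confirm=True the tool refuses
    and names the exact flag to pass, so an agent cannot trigger a
    destructive operation by accident or via auto-approve settings."""
    if not confirm:
        return f"refused: factory_reset on {port} requires confirm=True"
    # ... the real tool would call the meshtastic API's factory reset here ...
    return f"factory reset issued on {port}"
```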
+ +## Slash commands (AI-assisted workflows) + +Three test-and-diagnose workflows exist as slash commands: + +- **`/test` (Claude Code) / `/mcp-test` (Copilot)** — run the hardware test suite and interpret failures +- **`/diagnose` / `/mcp-diagnose`** — read-only device health report +- **`/repro` / `/mcp-repro`** — flakiness triage: re-run one test N times, diff firmware logs between passes and failures + +Bodies live in `.claude/commands/` and `.github/prompts/` respectively. `.claude/commands/README.md` is the index. + +## House rules + +- **No destructive device operations without operator approval.** `factory_reset`, `erase_and_flash`, `reboot`, `shutdown`, history-rewriting git ops — describe the action and stop. Operator authorizes. +- **One MCP call per serial port at a time.** The port lock is exclusive; concurrent calls deadlock. Sequence: open → read/mutate → close, then next device. +- **`userPrefs.jsonc` is session state during tests.** The `_session_userprefs` fixture snapshots + restores it; never edit it from inside a test. +- **Don't speculate about firmware root causes.** When evidence doesn't support a classification, say "unknown" and list what would disambiguate. +- **Run `trunk fmt` before proposing a commit.** The `trunk_check` CI gate will reject unformatted code. +- **`confirm=True` on destructive MCP tools is a real gate, not a formality.** Don't bypass it via auto-approve settings. + +## Typical agent workflows + +### Flashing a device + +1. `list_devices` → find the port + likely VID +2. `list_boards` → confirm the env, or use the known default for the hardware +3. `pio_flash(env=..., port=..., confirm=True)` for any arch, or `erase_and_flash(env=..., port=..., confirm=True)` for an ESP32 factory install + +### Inspecting live node state + +1. `device_info(port=...)` — short summary (node num, firmware version, region, peer count) +2. `list_nodes(port=...)` — full peer table (SNR, RSSI, pubkey presence, last_heard) +3. 
`get_config(section="lora", port=...)` — LoRa settings for cross-device comparison + +Sequence these; don't parallelize on the same port. + +### Testing a firmware change + +1. Build locally: `pio run -e <env>` +2. Flash the test device: `pio_flash(env=..., port=..., confirm=True)` +3. Run the suite: `./mcp-server/run-tests.sh tests/<tier>` or `/test tests/<tier>` +4. On failure, open `mcp-server/tests/report.html` → `Meshtastic debug` section for the firmware log tail + device state dump +5. Iterate + +### Debugging a flaky test + +1. `/repro <test> [count]` — re-runs the test N times, diffs firmware logs between passes and failures +2. If the first attempt always fails and the rest pass, that's a state-leak pattern → suggest `--force-bake` or a clean device state, don't chase the first failure +3. If all N fail, this isn't a flake — it's a regression. Stop iterating and escalate to `/test` for full-suite context. + +## Where to look + +| Path | What's there | +| --------------------------------- | ---------------------------------------------------------------------------------------------------- | +| `src/` | Firmware C++ source (`mesh/`, `modules/`, `platform/`, `graphics/`, `gps/`, `motion/`, `mqtt/`, …) | +| `src/mesh/` | Core: NodeDB, Router, Channels, CryptoEngine, radio interfaces, StreamAPI, PhoneAPI | +| `src/modules/` | Feature modules; `Telemetry/Sensor/` has 50+ I2C sensor drivers | +| `variants/` | 200+ hardware variant definitions (`variant.h` + `platformio.ini` per board) | +| `protobufs/` | `.proto` definitions; regenerate with `bin/regen-protos.sh` | +| `test/` | Firmware unit tests (12 suites; `pio test -e native`) | +| `mcp-server/` | Python MCP server + pytest hardware integration tests | +| `mcp-server/tests/` | Tiered pytest suite: `unit/`, `mesh/`, `telemetry/`, `monitor/`, `fleet/`, `admin/`, `provisioning/` | +| `.claude/commands/` | Claude Code slash command bodies | +| `.github/prompts/` | Copilot prompt bodies (mirrors of the Claude Code ones) | +|
`.github/copilot-instructions.md` | **Primary agent instructions — read this** | +| `.github/workflows/` | CI pipelines | +| `.mcp.json` | MCP server registration for Claude Code | + +## Recovery one-liners + +- **`userPrefs.jsonc` dirty after a test run?** Re-run `./mcp-server/run-tests.sh` once (pre-flight self-heals from the sidecar). If still dirty: `git checkout userPrefs.jsonc`. +- **nRF52 not responding?** `mcp__meshtastic__touch_1200bps(port=...)` drops it into the DFU bootloader, then `pio_flash` re-installs. +- **Port busy?** `lsof <port>` to find the holder. Usually a stale `pio device monitor` or zombie `meshtastic_mcp` process. Kill it. +- **Multiple MCP servers running?** `ps aux | grep meshtastic_mcp` — zombies hold ports. Kill all but the one your host spawned. diff --git a/mcp-server/.gitignore b/mcp-server/.gitignore new file mode 100644 index 00000000000..f5180bc71a1 --- /dev/null +++ b/mcp-server/.gitignore @@ -0,0 +1,26 @@ +.venv/ +__pycache__/ +*.py[cod] +*.egg-info/ +.pytest_cache/ +.mypy_cache/ +dist/ +build/ + +# Test harness artifacts +tests/report.html +tests/junit.xml +tests/reportlog.jsonl +tests/fwlog.jsonl +# Subprocess-output tee from pio/esptool/nrfutil/picotool (live flash +# progress for the TUI; also a post-run diagnostic for plain CLI runs). +tests/flash.log +tests/tool_coverage.json +tests/.coverage +htmlcov/ +# Persistent run counter for meshtastic-mcp-test-tui header. +tests/.tui-runs +# Cross-run history (TUI duration sparkline). +tests/.history/ +# Reproducer bundles (TUI `x` export on failed tests). +tests/reproducers/ diff --git a/mcp-server/README.md b/mcp-server/README.md new file mode 100644 index 00000000000..7d5fc551a7b --- /dev/null +++ b/mcp-server/README.md @@ -0,0 +1,270 @@ +# Meshtastic MCP Server + +An [MCP](https://modelcontextprotocol.io) server for working with the Meshtastic firmware repo and connected devices.
Lets Claude Code / Claude Desktop: + +- Discover USB-connected Meshtastic devices +- Enumerate PlatformIO board variants (166+) with Meshtastic metadata +- Build, clean, flash, erase-and-flash (factory), and OTA-update firmware +- Read serial logs via `pio device monitor` (with board-specific exception decoders) +- Trigger 1200bps touch-reset for bootloader entry (nRF52, ESP32-S3, RP2040) +- Query and administer a running node via the [`meshtastic` Python API](https://github.com/meshtastic/python): owner name, config (LocalConfig + ModuleConfig), channels, messaging, reboot/shutdown/factory-reset +- Call `esptool`, `nrfutil`, `picotool` directly when PlatformIO doesn't cover the operation + +## Design principle + +**PlatformIO first.** Its `pio run -t upload` knows the correct protocol, offsets, and post-build chain for every variant in `variants/`. Direct vendor-tool wrappers (`esptool_*`, `nrfutil_*`, `picotool_*`) exist as escape hatches for operations pio doesn't cover (blank-chip erase, DFU `.zip` packages, BOOTSEL-mode inspection). + +## Prerequisites + +- Python ≥ 3.11 +- [PlatformIO Core](https://platformio.org/install/cli) — `pio` on `$PATH` or at `~/.platformio/penv/bin/pio` +- The Meshtastic firmware repo checked out somewhere (set via `MESHTASTIC_FIRMWARE_ROOT`) +- Optional: `esptool`, `nrfutil`, `picotool` on `$PATH` (or under the firmware venv at `.venv/bin/`) if you want to use the direct-tool wrappers + +## Install + +```bash +cd <firmware-root>/mcp-server +python3 -m venv .venv +.venv/bin/pip install -e . +``` + +Verify: + +```bash +MESHTASTIC_FIRMWARE_ROOT=<firmware-root> .venv/bin/python -m meshtastic_mcp +``` + +The server blocks on stdin (that's correct — it speaks MCP over stdio). Ctrl-C to exit.
+ +## Register with Claude Code + +Edit `~/.claude/settings.json` (global) or `<firmware-root>/.claude/settings.local.json` (project-only): + +```json +{ + "mcpServers": { + "meshtastic": { + "command": "<firmware-root>/mcp-server/.venv/bin/python", + "args": ["-m", "meshtastic_mcp"], + "env": { + "MESHTASTIC_FIRMWARE_ROOT": "<firmware-root>" + } + } + } +} +``` + +Replace `<firmware-root>` with the absolute path, e.g. `/Users/you/GitHub/firmware`. Restart Claude Code after editing. + +## Register with Claude Desktop + +Same `mcpServers` block, but in `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows). + +## Tools (38) + +### Discovery & metadata + +| Tool | What it does | +| -------------- | ------------------------------------------------------------------------------------------ | +| `list_devices` | USB/serial port listing, flags likely-Meshtastic candidates | +| `list_boards` | PlatformIO envs with `custom_meshtastic_*` metadata; filters by arch/supported/query/level | +| `get_board` | Full env dict incl. raw pio config | + +### Build & flash + +| Tool | What it does | +| ----------------- | -------------------------------------------------------------------- | +| `build` | `pio run -e <env>` (+ mtjson target) | +| `clean` | `pio run -e <env> -t clean` | +| `pio_flash` | `pio run -e <env> -t upload --upload-port <port>` — any architecture | +| `erase_and_flash` | ESP32 full factory flash via `bin/device-install.sh` | +| `update_flash` | ESP32 OTA app-partition update via `bin/device-update.sh` | +| `touch_1200bps` | 1200-baud open/close to trigger USB CDC bootloader entry | + +### Serial log sessions + +Backed by long-running `pio device monitor` subprocesses with a 10k-line ring buffer per session and board-specific filters (`esp32_exception_decoder` auto-selected when you pass an ESP32 `env`).
+ +| Tool | What it does | +| -------------- | ------------------------------------------------------------------ | +| `serial_open` | Start a monitor session; returns `session_id` | +| `serial_read` | Cursor-based pull; reports `dropped` if lines aged out of the ring | +| `serial_list` | All active sessions | +| `serial_close` | Terminate a session | + +### Device reads + +| Tool | What it does | +| ------------- | --------------------------------------------------------------------------- | +| `device_info` | my_node_num, long/short name, firmware version, region, channel, node count | +| `list_nodes` | Full node database with position, SNR, RSSI, last_heard, battery | + +_The tool tables below document 38 currently registered MCP server tools._ + +### Device writes + +| Tool | What it does | +| ------------------- | -------------------------------------------------------------------------- | +| `set_owner` | Long name + optional short name (≤4 chars) | +| `get_config` | One section or all (LocalConfig + ModuleConfig) | +| `set_config` | Dot-path field write: `lora.region`=`"US"`, `device.role`=`"ROUTER"`, etc. 
| +| `get_channel_url` | Primary-only or include_all=admin URL | +| `set_channel_url` | Import channels from a Meshtastic URL | +| `set_debug_log_api` | Enable or disable debug logging for the Meshtastic Python API client | +| `send_text` | Broadcast or direct text message | +| `reboot` | `localNode.reboot(secs)` — requires `confirm=True` | +| `shutdown` | `localNode.shutdown(secs)` — requires `confirm=True` | +| `factory_reset` | `localNode.factoryReset(full?)` — requires `confirm=True` | + +### Direct hardware tools (escape hatches) + +| Tool | What it does | +| --------------------- | --------------------------------------------------------- | +| `esptool_chip_info` | Read chip, MAC, crystal, flash size | +| `esptool_erase_flash` | Full-chip erase (destructive) | +| `esptool_raw` | Pass-through; confirm=True required for write/erase/merge | +| `nrfutil_dfu` | DFU-flash a `.zip` package | +| `nrfutil_raw` | Pass-through | +| `picotool_info` | Read Pico BOOTSEL-mode info | +| `picotool_load` | Load a UF2 | +| `picotool_raw` | Pass-through | + +## Safety + +- **All destructive flash/admin tools require `confirm=True`** as a tool-level gate, on top of any permission prompt from Claude. +- **Serial port is exclusive.** If a `serial_*` session is active on a port, `device_info`/admin tools on the same port will fail fast with a pointer at the active `session_id`. Close the session first. +- **Flash confirmation by architecture**: `erase_and_flash` / `update_flash` error if the env's architecture isn't ESP32 — use `pio_flash` for nRF52/RP2040/STM32. 
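The ring-buffer contract behind `serial_open`/`serial_read` (10k-line ring per session, cursor-based pulls that report `dropped` lines) can be sketched as follows. Class and field names are illustrative; the real implementation lives in `serial_session.py`:

```python
from collections import deque


class RingLog:
    """Per-session line buffer: absolute sequence numbers let a reader's
    cursor detect lines that aged out of the ring between reads."""

    def __init__(self, maxlen: int = 10_000):
        self.buf: deque[str] = deque(maxlen=maxlen)
        self.total = 0  # sequence number of the next line to be appended

    def append(self, line: str) -> None:
        self.buf.append(line)  # deque silently evicts the oldest line
        self.total += 1

    def read(self, cursor: int, max_lines: int = 100) -> dict:
        oldest = self.total - len(self.buf)  # seq of the oldest retained line
        dropped = max(0, oldest - cursor)    # lines the reader missed
        start = max(cursor, oldest) - oldest
        lines = list(self.buf)[start:start + max_lines]
        return {
            "lines": lines,
            "cursor": max(cursor, oldest) + len(lines),  # pass back next call
            "dropped": dropped,
        }
```

Tracking an absolute sequence number (rather than a buffer index) is what lets `serial_read` report `dropped` honestly after the ring wraps.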
+ +## Environment variables + +| Var | Default | Purpose | +| -------------------------- | ----------------------------------------------------------- | ------------------------------------------------------------------- | +| `MESHTASTIC_FIRMWARE_ROOT` | walks up from cwd for `platformio.ini` | Pin the firmware repo | +| `MESHTASTIC_PIO_BIN` | `~/.platformio/penv/bin/pio` → `$PATH` `pio` → `platformio` | Override `pio` location | +| `MESHTASTIC_ESPTOOL_BIN` | `<firmware-root>/.venv/bin/esptool` → `$PATH` | Override esptool | +| `MESHTASTIC_NRFUTIL_BIN` | `$PATH` | Override nrfutil | +| `MESHTASTIC_PICOTOOL_BIN` | `$PATH` | Override picotool | +| `MESHTASTIC_MCP_SEED` | `mcp--` | PSK seed for test-harness session (CI override) | +| `MESHTASTIC_MCP_FLASH_LOG` | `mcp-server/tests/flash.log` | Tee target for pio/esptool/nrfutil subprocess output (TUI tails it) | + +## Hardware Test Suite + +`mcp-server/tests/` holds a pytest-based integration suite that exercises +real USB-connected Meshtastic devices against the MCP server surface. Separate +from the native C++ unit tests in the firmware repo's top-level `test/` +directory — this one validates the device-facing behavior end-to-end. + +### Invocation + +```bash +./mcp-server/run-tests.sh # full suite (auto-detect + auto-bake-if-needed) +./mcp-server/run-tests.sh --force-bake # reflash devices before testing +./mcp-server/run-tests.sh --assume-baked # skip the bake step (caller vouches for state) +./mcp-server/run-tests.sh tests/mesh # one tier +./mcp-server/run-tests.sh tests/mesh/test_traceroute.py # one file +./mcp-server/run-tests.sh -k telemetry # pytest name filter +``` + +The wrapper auto-detects connected devices (VID `0x239A` → `nrf52` → env +`rak4631`; `0x303A` or `0x10C4` → `esp32s3` → env `heltec-v3`), exports +`MESHTASTIC_MCP_ENV_<ROLE>` env vars, and invokes pytest. Overrides via +per-role env vars: `MESHTASTIC_MCP_ENV_NRF52=heltec-mesh-node-t114 ./run-tests.sh`. + +No hardware connected?
The wrapper narrows to `tests/unit/` only and says so +in the pre-flight header. + +### Tiers (run in this order) + +- **`bake`** (`tests/test_00_bake.py`) — flashes both hub roles with the + session's test profile. Has a skip-if-already-baked check (region + channel + match); `--force-bake` overrides. +- **`unit`** — pure Python, no hardware. boards / PIO wrapper / + userPrefs-parse / testing-profile fixtures. +- **`mesh`** — 2-device mesh: formation, broadcast delivery, direct+ACK, + traceroute, bidirectional. Parametrized over both directions. +- **`telemetry`** — periodic telemetry broadcast + on-demand request/reply + (`TELEMETRY_APP` with `wantResponse=True`). +- **`monitor`** — boot log has no panic markers within 60 s of reboot. +- **`fleet`** — PSK-seed isolation: two labs with different seeds never + overlap. +- **`admin`** — owner persistence across reboot, channel URL round-trip, + `lora.hop_limit` persistence. +- **`provisioning`** — region/channel baking, userPrefs survive + `factory_reset(full=False)`. + +### Artifacts (regenerated every run, under `tests/`) + +- `report.html` — self-contained pytest-html report. Each test gets a + **Meshtastic debug** section attached on failure with a 200-line firmware + log tail + device-state dump. Open this first on failures. +- `junit.xml` — CI-parseable. +- `reportlog.jsonl` — `pytest-reportlog` event stream; consumed by the TUI. +- `fwlog.jsonl` — firmware log mirror (`meshtastic.log.line` pubsub → JSONL). +- `flash.log` — tee of all pio / esptool / nrfutil / picotool subprocess + output during the run (driven by `MESHTASTIC_MCP_FLASH_LOG`). + +### Live TUI + +```bash +.venv/bin/meshtastic-mcp-test-tui +.venv/bin/meshtastic-mcp-test-tui tests/mesh # pytest args pass through +``` + +Textual-based wrapper over `run-tests.sh` with a live test tree, tier +counters, pytest output pane, firmware-log pane, and a device-status strip. 
+Key bindings: `r` re-run focused, `f` filter, `d` failure detail, `g` open +`report.html`, `x` export reproducer bundle, `l` cycle fw-log filter, `q` +quit (SIGINT → SIGTERM → SIGKILL escalation). + +### Slash commands + +Three AI-assisted workflows are wired up for Claude Code operators +(`.claude/commands/`) and Copilot operators (`.github/prompts/`): +`/test` (run + interpret), `/diagnose` (read-only health report), `/repro` +(flake triage, N-times re-run with log diff). + +### House rules (for human + agent contributors) + +- Session-scoped fixtures in `tests/conftest.py` snapshot + restore + `userPrefs.jsonc`; **never edit `userPrefs.jsonc` from inside a test**. + Use the `test_profile` / `no_region_profile` fixtures for ephemeral + overrides. +- `SerialInterface` holds an **exclusive port lock**; sequence calls + open → mutate → close, then next device. No parallel calls to the + same port. +- Directed PKI-encrypted sends need **bilateral NodeInfo warmup** — + both sides must hold the other's current pubkey. See + `tests/mesh/_receive.py::nudge_nodeinfo_port` and the three directed- + send tests (`test_direct_with_ack`, `test_traceroute`, + `test_telemetry_request_reply`) for the canonical pattern. + +## Layout + +```text +mcp-server/ +├── pyproject.toml +├── README.md +└── src/meshtastic_mcp/ + ├── __main__.py # entry: python -m meshtastic_mcp + ├── server.py # FastMCP app + @app.tool() registrations (thin) + ├── config.py # firmware_root, pio_bin, esptool_bin, etc. 
+ ├── pio.py # subprocess wrapper (timeouts, JSON, tail_lines) + ├── devices.py # list_devices (findPorts + comports) + ├── boards.py # list_boards / get_board (pio project config parse + cache) + ├── flash.py # build, clean, flash, erase_and_flash, update_flash, touch_1200bps + ├── serial_session.py # SerialSession + reader thread + ring buffer + ├── registry.py # session registry + per-port locks + ├── connection.py # connect(port) ctx mgr — SerialInterface + port lock + ├── info.py # device_info, list_nodes + ├── admin.py # set_owner, get/set_config, channels, send_text, reboot/shutdown/factory_reset + └── hw_tools.py # esptool / nrfutil / picotool wrappers +``` + +## Troubleshooting + +- **"Could not locate Meshtastic firmware root"** — set `MESHTASTIC_FIRMWARE_ROOT`. +- **"Could not find `pio`"** — install PlatformIO or set `MESHTASTIC_PIO_BIN`. +- **"Port is held by serial session ..."** — call `serial_close(session_id)` or `serial_list` to find it. +- **`factory.bin` not found after build** — the env may not be ESP32; only ESP32 envs produce a `.factory.bin`. +- **`touch_1200bps` reported `new_port: null`** — the device may not have 1200bps-reset stdio, or the bootloader re-uses the same port name. Check `list_devices` manually. diff --git a/mcp-server/pyproject.toml b/mcp-server/pyproject.toml new file mode 100644 index 00000000000..d73bf795f5f --- /dev/null +++ b/mcp-server/pyproject.toml @@ -0,0 +1,39 @@ +[project] +name = "meshtastic-mcp" +version = "0.1.0" +description = "MCP server for Meshtastic firmware development: device discovery, PlatformIO tooling, flashing, serial monitoring, and device administration via the meshtastic Python API." 
+readme = "README.md" +requires-python = ">=3.11" +license = { text = "GPL-3.0-only" } +authors = [{ name = "thebentern" }] +dependencies = ["mcp>=1.2", "pyserial>=3.5", "meshtastic>=2.7.8"] + +[project.optional-dependencies] +dev = ["pytest>=7"] +test = [ + "pytest>=8", + "pytest-html>=4", + "pytest-reportlog>=0.4", + "pytest-timeout>=2.3", + "coverage[toml]>=7", + "pyyaml>=6", + # textual is required by the `meshtastic-mcp-test-tui` script (see + # `src/meshtastic_mcp/cli/test_tui.py`). Bundled into `test` rather than a + # separate `[tui]` extra because v1 expects test operators are the only + # consumers; revisit if install cost pushes back. + "textual>=0.50", +] + +[project.scripts] +meshtastic-mcp = "meshtastic_mcp.__main__:main" +# Live TUI wrapping run-tests.sh — shells out to the same script the plain +# CLI uses, tails pytest-reportlog for per-test state, and polls the device +# list at startup + post-run (port lock forces it to stay idle during the run). +meshtastic-mcp-test-tui = "meshtastic_mcp.cli.test_tui:main" + +[build-system] +requires = ["hatchling"] +build-backend = "hatchling.build" + +[tool.hatch.build.targets.wheel] +packages = ["src/meshtastic_mcp"] diff --git a/mcp-server/run-tests.sh b/mcp-server/run-tests.sh new file mode 100755 index 00000000000..292e6e3a2f7 --- /dev/null +++ b/mcp-server/run-tests.sh @@ -0,0 +1,236 @@ +#!/usr/bin/env bash +# mcp-server hardware test runner. +# +# Auto-detects connected Meshtastic devices, maps each to its PlatformIO env +# via the same role table the pytest fixtures use, exports the right +# MESHTASTIC_MCP_ENV_* env vars, and invokes pytest. 
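The per-test state tailing that `meshtastic-mcp-test-tui` performs (per the `[project.scripts]` comment above) can be sketched as a fold of pytest-reportlog JSONL lines into a nodeid → outcome map. `fold_reportlog` is illustrative, not part of the package; the `$report_type` / `nodeid` / `outcome` keys follow the pytest-reportlog line schema.

```python
import json


def fold_reportlog(lines):
    """Fold pytest-reportlog JSONL lines into {nodeid: worst_outcome}."""
    rank = {"passed": 0, "skipped": 1, "failed": 2}
    state = {}
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate a half-written line while tailing a live file
        if rec.get("$report_type") != "TestReport":
            continue  # skip SessionStart/SessionFinish/CollectReport records
        nodeid, outcome = rec.get("nodeid"), rec.get("outcome")
        if not nodeid or outcome not in rank:
            continue
        # Keep the worst phase outcome: a failed setup/teardown beats a
        # passed call for the same nodeid.
        if nodeid not in state or rank[outcome] > rank[state[nodeid]]:
            state[nodeid] = outcome
    return state
```

A live tailer would feed this one line at a time as `readline()` yields them; batching here just keeps the sketch short.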
+# +# Usage: +# ./run-tests.sh # full suite, default pytest args +# ./run-tests.sh tests/mesh # subset (any pytest args pass through) +# ./run-tests.sh --force-bake # override one default with another +# MESHTASTIC_MCP_ENV_NRF52=foo ./run-tests.sh # override env per role +# MESHTASTIC_MCP_SEED=ci-run-42 ./run-tests.sh # override PSK seed +# +# If zero supported devices are detected, only the unit tier runs. +# +# Also restores `userPrefs.jsonc` from the session-backup sidecar if a prior +# run exited abnormally (belt to conftest.py's atexit suspenders). + +set -euo pipefail + +# cd to the script's directory so relative paths resolve consistently no +# matter where the user invoked from. +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +cd "$SCRIPT_DIR" + +VENV_PY="$SCRIPT_DIR/.venv/bin/python" +if [[ ! -x $VENV_PY ]]; then + echo "error: $VENV_PY not found or not executable." >&2 + echo " Bootstrap the venv first:" >&2 + echo " cd $SCRIPT_DIR && python3 -m venv .venv && .venv/bin/pip install -e '.[test]'" >&2 + exit 2 +fi + +# Resolve firmware root the same way conftest.py does (this script sits in +# mcp-server/, firmware repo root is one level up). +FIRMWARE_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +USERPREFS_PATH="$FIRMWARE_ROOT/userPrefs.jsonc" +USERPREFS_SIDECAR="$USERPREFS_PATH.mcp-session-bak" + +# ---------- Pre-flight: recover stale userPrefs.jsonc from prior crash ---- +# If conftest.py's atexit hook didn't fire (SIGKILL, kernel panic, OS +# restart), the sidecar is the ground truth. Self-heal before running so we +# don't bake the previous run's dirty state into this run's firmware. +if [[ -f $USERPREFS_SIDECAR ]]; then + echo "[pre-flight] found $USERPREFS_SIDECAR from a prior abnormal exit;" >&2 + echo " restoring userPrefs.jsonc before starting." 
>&2 + cp "$USERPREFS_SIDECAR" "$USERPREFS_PATH" + rm -f "$USERPREFS_SIDECAR" +fi + +# If userPrefs.jsonc has uncommitted changes BEFORE the run starts, that's +# worth warning about — tests will snapshot this dirty state and restore to +# it at the end, which may not be what the operator wants. +if command -v git >/dev/null 2>&1; then + cd "$FIRMWARE_ROOT" + # Capture the git status into a local first — SC2312 flags command + # substitution inside `[[ -n ... ]]` because the exit code of `git + # status` is masked. A two-step assignment makes the failure path + # explicit (non-git, missing file) and keeps the bracket test clean. + _git_status_porcelain="$(git status --porcelain userPrefs.jsonc 2>/dev/null || true)" + if [[ -n $_git_status_porcelain ]]; then + echo "[pre-flight] warning: userPrefs.jsonc has uncommitted changes." >&2 + echo " Tests will snapshot THIS state and restore to it" >&2 + echo " at teardown. If that's not intended, run:" >&2 + echo " git checkout userPrefs.jsonc" >&2 + echo " and re-invoke." >&2 + fi + cd "$SCRIPT_DIR" +fi + +# ---------- Seed default -------------------------------------------------- +# Per-machine default so repeated runs from the same operator land on the +# same PSK (makes --assume-baked valid across invocations). Operator can +# override with an explicit env var if they want isolation (e.g. CI). +if [[ -z ${MESHTASTIC_MCP_SEED-} ]]; then + WHO="$(whoami 2>/dev/null || echo anon)" + HOST="$(hostname -s 2>/dev/null || echo host)" + export MESHTASTIC_MCP_SEED="mcp-${WHO}-${HOST}" +fi + +# ---------- Flash progress log -------------------------------------------- +# pio.py / hw_tools.py tee subprocess output (pio run -t upload, esptool, +# nrfutil, picotool) to this file line-by-line as it arrives when this env +# var is set. The TUI tails it so the operator sees live flash progress +# instead of 3 minutes of silence during `test_00_bake.py`. 
Plain CLI users +# also benefit — the log is a post-run diagnostic even without the TUI. +# Truncate at session start so each run gets a clean log. +export MESHTASTIC_MCP_FLASH_LOG="$SCRIPT_DIR/tests/flash.log" +: >"$MESHTASTIC_MCP_FLASH_LOG" + +# ---------- Detect connected hardware ------------------------------------- +# In-process call to the same Python API the test fixtures use, so the +# script never drifts from what pytest sees. Returns a JSON object +# {role: port, ...}. +ROLES_JSON="$( + "$VENV_PY" - <<'PY' +import json +import sys + +sys.path.insert(0, "src") +from meshtastic_mcp import devices + +# Role → canonical VID map. Kept in sync with +# `tests/conftest.py::hub_profile` defaults; if that changes, this must too. +ROLE_BY_VID = { + 0x239A: "nrf52", # Adafruit / RAK nRF52 native USB (app + DFU) + 0x303A: "esp32s3", # Espressif native USB (ESP32-S3) + 0x10C4: "esp32s3", # CP2102 USB-UART (common on Heltec/LilyGO ESP32 boards) +} + +out: dict[str, str] = {} +for dev in devices.list_devices(include_unknown=True): + vid_raw = dev.get("vid") or "" + try: + if isinstance(vid_raw, str) and vid_raw.startswith("0x"): + vid = int(vid_raw, 16) + else: + vid = int(vid_raw) + except (TypeError, ValueError): + continue + role = ROLE_BY_VID.get(vid) + # First port wins per role — matches hub_devices fixture semantics. + if role and role not in out: + out[role] = dev["port"] + +json.dump(out, sys.stdout) +PY +)" + +# ---------- Map role → pio env -------------------------------------------- +# Honor MESHTASTIC_MCP_ENV_ operator overrides; fall back to the +# same defaults hardcoded in tests/conftest.py::_DEFAULT_ROLE_ENVS. 
+resolve_env() {
+    local role="$1"
+    local default="$2"
+    local upper
+    upper="$(echo "$role" | tr '[:lower:]' '[:upper:]')"
+    local var="MESHTASTIC_MCP_ENV_${upper}"
+    # ${!var} indirect expansion reads the dynamically-named variable
+    # without reaching for eval.
+    local override="${!var:-}"
+    if [[ -n $override ]]; then
+        echo "$override"
+    else
+        echo "$default"
+    fi
+}
+
+NRF52_PORT="$(echo "$ROLES_JSON" | "$VENV_PY" -c 'import json,sys; print(json.loads(sys.stdin.read()).get("nrf52", ""))')"
+ESP32S3_PORT="$(echo "$ROLES_JSON" | "$VENV_PY" -c 'import json,sys; print(json.loads(sys.stdin.read()).get("esp32s3", ""))')"
+
+DETECTED=""
+if [[ -n $NRF52_PORT ]]; then
+    NRF52_ENV="$(resolve_env nrf52 rak4631)"
+    export MESHTASTIC_MCP_ENV_NRF52="$NRF52_ENV"
+    DETECTED="${DETECTED}  nrf52   @ ${NRF52_PORT} -> env=${NRF52_ENV}\n"
+fi
+if [[ -n $ESP32S3_PORT ]]; then
+    ESP32S3_ENV="$(resolve_env esp32s3 heltec-v3)"
+    export MESHTASTIC_MCP_ENV_ESP32S3="$ESP32S3_ENV"
+    DETECTED="${DETECTED}  esp32s3 @ ${ESP32S3_PORT} -> env=${ESP32S3_ENV}\n"
+fi
+
+# ---------- Pre-flight summary --------------------------------------------
+# Surface what pytest is about to do with respect to the bake phase: the
+# operator should see "will verify + bake if needed" by default, so a
+# 3-minute flash appearing mid-run isn't a surprise. Detection of the
+# explicit overrides is best-effort — we just scan $@ for the known flags.
+_bake_mode="auto (verify + bake if needed)" +for _arg in "$@"; do + case "$_arg" in + --assume-baked) _bake_mode="skip (--assume-baked)" ;; + --force-bake) _bake_mode="force (--force-bake)" ;; + *) ;; # any other arg: pass-through; bake mode unchanged + esac +done + +echo "mcp-server test runner" +echo " firmware root : $FIRMWARE_ROOT" +echo " seed : $MESHTASTIC_MCP_SEED" +echo " bake : $_bake_mode" +if [[ -n $DETECTED ]]; then + echo " detected hub :" + printf "%b" "$DETECTED" +else + echo " detected hub : (none)" +fi +echo + +# ---------- Invoke pytest ------------------------------------------------- +# If no devices detected, only the unit tier would produce meaningful +# PASS/FAIL — every hardware test would SKIP with "role not present". We +# narrow to tests/unit explicitly so the summary reads as "no hardware, +# unit suite only" instead of "big skip count looks suspicious". +if [[ -z $DETECTED && $# -eq 0 ]]; then + echo "[pre-flight] no supported devices detected; running unit tier only." + echo + exec "$VENV_PY" -m pytest tests/unit -v --report-log=tests/reportlog.jsonl +fi + +# Default pytest args when the user passed none. Power users can invoke +# `./run-tests.sh tests/mesh -v --tb=long` and skip all of these defaults. +# +# NOTE: `--assume-baked` is DELIBERATELY omitted here. `tests/test_00_bake.py` +# has an internal skip-if-already-baked check (`_bake_role`: query device_info, +# compare region + primary_channel to the session profile, skip on match). +# So the fast path is ~8-10 s of verification overhead when the devices are +# already baked — negligible next to the 2-6 min suite runtime. Letting +# test_00_bake.py run means a fresh device, a re-seeded session, or a post- +# factory-reset device gets flashed automatically instead of silently +# skipping half the hardware tests with "not baked with session profile" +# errors. Power users who know their hardware is current and want to shave +# those seconds can pass `--assume-baked` explicitly. 
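The verify-then-skip behavior the NOTE above describes can be sketched as below. The field names (`lora.region`, `primary_channel.name`) and the profile shape are assumptions for illustration — the real check lives in `tests/test_00_bake.py::_bake_role`, which is not part of this diff.

```python
def needs_bake(device_info: dict, profile: dict) -> bool:
    """True when the device's live state diverges from the session profile.

    Hypothetical shapes: `device_info` as returned by a device_info-style
    query, `profile` as the session profile derived from the PSK seed.
    """
    live_region = device_info.get("lora", {}).get("region")
    live_channel = device_info.get("primary_channel", {}).get("name")
    return (
        live_region != profile["region"]
        or live_channel != profile["primary_channel"]
    )


# auto (default): run the check, flash only on mismatch (~8-10 s overhead)
# --assume-baked: skip the check entirely
# --force-bake:   flash regardless of what the check says
```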
+if [[ $# -eq 0 ]]; then + set -- tests/ \ + --html=tests/report.html --self-contained-html \ + --junitxml=tests/junit.xml \ + -v --tb=short +fi + +# Always emit `tests/reportlog.jsonl` (unless the operator explicitly passed +# their own `--report-log=...`). Consumers — notably the +# `meshtastic-mcp-test-tui` TUI — tail the reportlog for live per-test state. +# Appending here means power-user invocations like `./run-tests.sh tests/mesh` +# also produce it, not just the all-defaults invocation. +_has_report_log=0 +for _arg in "$@"; do + case "$_arg" in + --report-log | --report-log=*) _has_report_log=1 ;; + *) ;; # any other arg: no-op; loop continues + esac +done +if [[ $_has_report_log -eq 0 ]]; then + set -- "$@" --report-log=tests/reportlog.jsonl +fi + +exec "$VENV_PY" -m pytest "$@" diff --git a/mcp-server/src/meshtastic_mcp/__init__.py b/mcp-server/src/meshtastic_mcp/__init__.py new file mode 100644 index 00000000000..bd696afe01d --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/__init__.py @@ -0,0 +1,3 @@ +"""Meshtastic MCP server — device discovery, PlatformIO tooling, and device admin.""" + +__version__ = "0.1.0" diff --git a/mcp-server/src/meshtastic_mcp/__main__.py b/mcp-server/src/meshtastic_mcp/__main__.py new file mode 100644 index 00000000000..4ed67db3821 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/__main__.py @@ -0,0 +1,11 @@ +"""Entry point for `python -m meshtastic_mcp`.""" + +from meshtastic_mcp.server import app + + +def main() -> None: + app.run() + + +if __name__ == "__main__": + main() diff --git a/mcp-server/src/meshtastic_mcp/admin.py b/mcp-server/src/meshtastic_mcp/admin.py new file mode 100644 index 00000000000..6da92d860a4 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/admin.py @@ -0,0 +1,377 @@ +"""Device administration: owner, config, channels, messaging, admin actions. + +All operations use the same `connect()` context manager so port selection, +port-busy detection, and cleanup are handled uniformly. 
+ +Config writes use a dot-path: the first segment names a section (e.g. +`"lora"` in LocalConfig or `"mqtt"` in LocalModuleConfig), remaining segments +walk protobuf fields. Enum fields accept their string names (`"US"` for +`lora.region`) so callers don't need to know the numeric values. +""" + +from __future__ import annotations + +from typing import Any + +from google.protobuf import descriptor as pb_descriptor +from google.protobuf import json_format +from meshtastic.protobuf import localonly_pb2 + +from .connection import connect + + +class AdminError(RuntimeError): + pass + + +LOCAL_CONFIG_SECTIONS = {f.name for f in localonly_pb2.LocalConfig.DESCRIPTOR.fields} +MODULE_CONFIG_SECTIONS = { + f.name for f in localonly_pb2.LocalModuleConfig.DESCRIPTOR.fields +} + + +def _require_confirm(confirm: bool, operation: str) -> None: + if not confirm: + raise AdminError(f"{operation} is destructive and requires confirm=True.") + + +def _message_to_dict(msg: Any) -> dict[str, Any]: + # `including_default_value_fields` was renamed to + # `always_print_fields_with_no_presence` in protobuf 5.26+. Pick whichever + # kwarg the installed version accepts so we work against both. 
+ kwargs: dict[str, Any] = {"preserving_proto_field_name": True} + import inspect + + sig = inspect.signature(json_format.MessageToDict) + if "always_print_fields_with_no_presence" in sig.parameters: + kwargs["always_print_fields_with_no_presence"] = False + elif "including_default_value_fields" in sig.parameters: + kwargs["including_default_value_fields"] = False + return json_format.MessageToDict(msg, **kwargs) + + +# ---------- owner ---------------------------------------------------------- + + +def set_owner( + long_name: str, + short_name: str | None = None, + port: str | None = None, +) -> dict[str, Any]: + if short_name is not None and len(short_name) > 4: + raise AdminError("short_name must be 4 characters or fewer") + with connect(port=port) as iface: + iface.localNode.setOwner(long_name=long_name, short_name=short_name) + return { + "ok": True, + "long_name": long_name, + "short_name": short_name, + } + + +# ---------- config reads --------------------------------------------------- + + +def _section_container(node, section: str) -> tuple[Any, str]: + """Return (container_message, parent_name) for a section name. + + Parent is 'localConfig' or 'moduleConfig' so callers know where to call + writeConfig() after mutating. + """ + if section in LOCAL_CONFIG_SECTIONS: + return getattr(node.localConfig, section), "localConfig" + if section in MODULE_CONFIG_SECTIONS: + return getattr(node.moduleConfig, section), "moduleConfig" + raise AdminError( + f"Unknown config section: {section!r}. " + f"Valid sections: {sorted(LOCAL_CONFIG_SECTIONS | MODULE_CONFIG_SECTIONS)}" + ) + + +def get_config(section: str | None = None, port: str | None = None) -> dict[str, Any]: + """Read one or all config sections. + + `section` may be any name in LocalConfig (device, lora, position, power, + network, display, bluetooth, security) or LocalModuleConfig (mqtt, serial, + telemetry, ...). Omit `section` or pass `"all"` for everything. 
+ """ + with connect(port=port) as iface: + node = iface.localNode + if section in (None, "all"): + lc = _message_to_dict(node.localConfig) + mc = _message_to_dict(node.moduleConfig) + return { + "config": { + "localConfig": lc, + "moduleConfig": mc, + } + } + container, _parent = _section_container(node, section) + return {"config": {section: _message_to_dict(container)}} + + +# ---------- config writes -------------------------------------------------- + + +def _coerce_enum(field: pb_descriptor.FieldDescriptor, value: Any) -> int: + """Accept an enum value as either its int or its string name.""" + enum_type = field.enum_type + if isinstance(value, bool): + raise AdminError(f"{field.name}: expected enum {enum_type.name}, got bool") + if isinstance(value, int): + if enum_type.values_by_number.get(value) is None: + raise AdminError( + f"{field.name}: {value} is not a valid {enum_type.name} value" + ) + return value + if isinstance(value, str): + upper = value.upper() + ev = enum_type.values_by_name.get(upper) + if ev is None: + valid = sorted(enum_type.values_by_name.keys()) + raise AdminError( + f"{field.name}: {value!r} is not a valid {enum_type.name}. 
" + f"Valid: {valid}" + ) + return ev.number + raise AdminError( + f"{field.name}: expected enum {enum_type.name}, got {type(value).__name__}" + ) + + +def _coerce_scalar(field: pb_descriptor.FieldDescriptor, value: Any) -> Any: + t = field.type + FT = pb_descriptor.FieldDescriptor + if t == FT.TYPE_ENUM: + return _coerce_enum(field, value) + if t == FT.TYPE_BOOL: + if isinstance(value, bool): + return value + if isinstance(value, str): + return value.strip().lower() in ("true", "yes", "1", "on") + if isinstance(value, int): + return bool(value) + if t in ( + FT.TYPE_INT32, + FT.TYPE_INT64, + FT.TYPE_UINT32, + FT.TYPE_UINT64, + FT.TYPE_SINT32, + FT.TYPE_SINT64, + FT.TYPE_FIXED32, + FT.TYPE_FIXED64, + ): + return int(value) + if t in (FT.TYPE_FLOAT, FT.TYPE_DOUBLE): + return float(value) + if t == FT.TYPE_STRING: + return str(value) + if t == FT.TYPE_BYTES: + if isinstance(value, (bytes, bytearray)): + return bytes(value) + return str(value).encode("utf-8") + raise AdminError( + f"{field.name}: unsupported field type {t}. Use raw protobuf for this field." + ) + + +def _walk_to_field( + root_msg: Any, path_segments: list[str] +) -> tuple[Any, pb_descriptor.FieldDescriptor]: + """Walk `root_msg` by field names until the leaf; return (parent_msg, leaf_field_descriptor).""" + msg = root_msg + for i, name in enumerate(path_segments): + desc = msg.DESCRIPTOR + field = desc.fields_by_name.get(name) + if field is None: + trail = ".".join(path_segments[:i] or [""]) + valid = [f.name for f in desc.fields] + raise AdminError(f"No field {name!r} in {trail}. 
Valid: {valid}") + is_last = i == len(path_segments) - 1 + if is_last: + return msg, field + if field.type != pb_descriptor.FieldDescriptor.TYPE_MESSAGE: + raise AdminError( + f"{'.'.join(path_segments[:i+1])} is a scalar; cannot descend into it" + ) + msg = getattr(msg, name) + # path_segments was empty + raise AdminError("Empty config path") + + +def set_config(path: str, value: Any, port: str | None = None) -> dict[str, Any]: + """Set a single config field by dot-path and write it to the device. + + Examples: + set_config("lora.region", "US") + set_config("lora.modem_preset", "LONG_FAST") + set_config("device.role", "ROUTER") + set_config("mqtt.enabled", True) + set_config("mqtt.address", "mqtt.example.com") + + """ + segments = [s for s in path.split(".") if s] + if not segments: + raise AdminError("path cannot be empty") + section = segments[0] + + with connect(port=port) as iface: + node = iface.localNode + container, parent_name = _section_container(node, section) + + # Treat the section as the root; the rest of the path walks into it. + leaf_parent, field = _walk_to_field(container, segments[1:] or []) + # Use `is_repeated` (modern upb protobuf API) rather than the + # deprecated `label == LABEL_REPEATED` check — the C-extension + # FieldDescriptor in protobuf >= 5.x doesn't expose `.label` at + # all, and `is_repeated` is the supported replacement that works + # across both the pure-python and upb backends. + if field.is_repeated: + raise AdminError( + f"{path!r} is a repeated field; v1 only supports scalar sets. " + "Use the raw meshtastic CLI for now." + ) + old_raw = getattr(leaf_parent, field.name) + coerced = _coerce_scalar(field, value) + try: + setattr(leaf_parent, field.name, coerced) + except (TypeError, ValueError) as exc: + raise AdminError(f"{path}: {exc}") from exc + + node.writeConfig(section) + + # Stringify enums for the response (so the caller can see the change in + # the same vocabulary they used to set it). 
+ if field.type == pb_descriptor.FieldDescriptor.TYPE_ENUM: + try: + old_display = field.enum_type.values_by_number[old_raw].name + new_display = field.enum_type.values_by_number[coerced].name + except Exception: + old_display, new_display = old_raw, coerced + else: + old_display, new_display = old_raw, coerced + + return { + "ok": True, + "path": path, + "section": section, + "parent": parent_name, + "old_value": old_display, + "new_value": new_display, + } + + +# ---------- channels ------------------------------------------------------- + + +def get_channel_url( + include_all: bool = False, port: str | None = None +) -> dict[str, Any]: + with connect(port=port) as iface: + url = iface.localNode.getURL(includeAll=include_all) + return {"url": url} + + +def set_channel_url(url: str, port: str | None = None) -> dict[str, Any]: + with connect(port=port) as iface: + # setURL replaces the channel set from the URL's contents. It does not + # return a count; we infer by counting non-DISABLED channels after. 
+ iface.localNode.setURL(url) + channels = iface.localNode.channels or [] + active = sum(1 for c in channels if getattr(c, "role", 0) != 0) + return {"ok": True, "channels_imported": active} + + +# ---------- messaging ------------------------------------------------------ + + +def send_text( + text: str, + to: str | int | None = None, + channel_index: int = 0, + want_ack: bool = False, + port: str | None = None, +) -> dict[str, Any]: + destination = to if to is not None else "^all" + with connect(port=port) as iface: + packet = iface.sendText( + text, + destinationId=destination, + wantAck=want_ack, + channelIndex=channel_index, + ) + packet_id = getattr(packet, "id", None) + return {"ok": True, "packet_id": packet_id, "destination": destination} + + +# ---------- diagnostics ---------------------------------------------------- + + +def set_debug_log_api(enabled: bool, port: str | None = None) -> dict[str, Any]: + """Toggle `config.security.debug_log_api_enabled` on the local node. + + When enabled, firmware emits log lines as protobuf `LogRecord` messages + over the StreamAPI instead of raw text. meshtastic-python surfaces them + on pubsub topic `meshtastic.log.line`, which flows through the SAME + SerialInterface our tests already hold open — no `pio device monitor` + needed, no port-contention with admin/info calls. + + Firmware gate: `src/SerialConsole.cpp` (`usingProtobufs && + config.security.debug_log_api_enabled`). Setting persists in NVS; it + survives reboot. `factory_reset(full=False)` clears it unless it's + re-applied after reset. + + Previously-documented concurrency hazard (emitLogRecord sharing the + main packet-emission buffers) has been fixed — see `StreamAPI.h` + where the log path now owns dedicated `fromRadioScratchLog` / + `txBufLog` buffers, and `StreamAPI::emitTxBuffer` + + `StreamAPI::emitLogRecord` both serialize their `stream->write` + calls via `streamLock`. Leaving the flag on under traffic is safe. 
+ """ + with connect(port=port) as iface: + sec = iface.localNode.localConfig.security + sec.debug_log_api_enabled = bool(enabled) + iface.localNode.writeConfig("security") + return {"ok": True, "debug_log_api_enabled": bool(enabled)} + + +# ---------- admin actions -------------------------------------------------- + + +def reboot( + port: str | None = None, confirm: bool = False, seconds: int = 10 +) -> dict[str, Any]: + _require_confirm(confirm, "reboot") + with connect(port=port) as iface: + iface.localNode.reboot(secs=seconds) + return {"ok": True, "rebooting_in_s": seconds} + + +def shutdown( + port: str | None = None, confirm: bool = False, seconds: int = 10 +) -> dict[str, Any]: + _require_confirm(confirm, "shutdown") + with connect(port=port) as iface: + iface.localNode.shutdown(secs=seconds) + return {"ok": True, "shutting_down_in_s": seconds} + + +def factory_reset( + port: str | None = None, confirm: bool = False, full: bool = False +) -> dict[str, Any]: + """Tell the node to factory-reset its config. + + Works around a meshtastic-python 2.7.8 bug: `Node.factoryReset(full=True)` + internally does `p.factory_reset_config = True` where the field is + int32. protobuf 5.x rejects bool→int assignment as a TypeError. We build + the AdminMessage directly with int values (1=non-full, 2=full) and call + `_sendAdmin` to sidestep the SDK bug entirely. + """ + _require_confirm(confirm, "factory_reset") + from meshtastic.protobuf import admin_pb2 # type: ignore[import-untyped] + + with connect(port=port) as iface: + msg = admin_pb2.AdminMessage() + msg.factory_reset_config = 2 if full else 1 + iface.localNode._sendAdmin(msg) + return {"ok": True, "full": full} diff --git a/mcp-server/src/meshtastic_mcp/boards.py b/mcp-server/src/meshtastic_mcp/boards.py new file mode 100644 index 00000000000..df5024800a6 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/boards.py @@ -0,0 +1,159 @@ +"""Board / PlatformIO env enumeration. 
+ +Parses `pio project config --json-output` — a nested list of +`[section_name, [[key, value], ...]]` pairs — into a dict keyed by env name, +extracting the `custom_meshtastic_*` metadata the firmware variants expose. + +The parsed config is cached and invalidated when `platformio.ini`'s mtime +changes, so subsequent calls don't pay the 1–2s pio startup cost. +""" + +from __future__ import annotations + +import threading +from typing import Any + +from . import config, pio + +_CACHE_LOCK = threading.Lock() +_CACHE: dict[str, Any] = {"mtime": None, "envs": None} + + +def _parse_bool(value: Any) -> bool: + if isinstance(value, bool): + return value + if isinstance(value, str): + return value.strip().lower() in ("true", "yes", "1", "on") + return bool(value) + + +def _parse_int(value: Any) -> int | None: + try: + return int(value) + except (TypeError, ValueError): + return None + + +def _parse_tags(value: Any) -> list[str]: + if value is None: + return [] + if isinstance(value, list): + return [str(v).strip() for v in value if str(v).strip()] + return [t.strip() for t in str(value).replace(",", " ").split() if t.strip()] + + +def _env_record(env_name: str, items: list[list[Any]]) -> dict[str, Any]: + """Build a normalized dict for one env section.""" + d = dict(items) + return { + "env": env_name, + "architecture": d.get("custom_meshtastic_architecture"), + "hw_model": _parse_int(d.get("custom_meshtastic_hw_model")), + "hw_model_slug": d.get("custom_meshtastic_hw_model_slug"), + "display_name": d.get("custom_meshtastic_display_name"), + "actively_supported": _parse_bool( + d.get("custom_meshtastic_actively_supported") + ), + "support_level": _parse_int(d.get("custom_meshtastic_support_level")), + "board_level": d.get("board_level"), # "pr", "extra", or None + "tags": _parse_tags(d.get("custom_meshtastic_tags")), + "images": _parse_tags(d.get("custom_meshtastic_images")), + "board": d.get("board"), + "upload_speed": _parse_int(d.get("upload_speed")), + 
"upload_protocol": d.get("upload_protocol"), + "monitor_speed": _parse_int(d.get("monitor_speed")), + "monitor_filters": d.get("monitor_filters") or [], + "_raw": d, # Full dict for get_board + } + + +def _load_all() -> dict[str, dict[str, Any]]: + """Parse `pio project config` into `{env_name: record}`.""" + raw = pio.run_json(["project", "config"], timeout=pio.TIMEOUT_PROJECT_CONFIG) + result: dict[str, dict[str, Any]] = {} + for section_name, items in raw: + if not isinstance(section_name, str) or not section_name.startswith("env:"): + continue + env_name = section_name.split(":", 1)[1] + result[env_name] = _env_record(env_name, items) + return result + + +def _get_cached() -> dict[str, dict[str, Any]]: + root = config.firmware_root() + platformio_ini = root / "platformio.ini" + try: + mtime = platformio_ini.stat().st_mtime + except FileNotFoundError: + mtime = None + + with _CACHE_LOCK: + if _CACHE["envs"] is not None and _CACHE["mtime"] == mtime: + return _CACHE["envs"] + envs = _load_all() + _CACHE["envs"] = envs + _CACHE["mtime"] = mtime + return envs + + +def invalidate_cache() -> None: + with _CACHE_LOCK: + _CACHE["envs"] = None + _CACHE["mtime"] = None + + +def _public_record(rec: dict[str, Any]) -> dict[str, Any]: + """Strip the `_raw` field for list outputs.""" + return {k: v for k, v in rec.items() if not k.startswith("_")} + + +def list_boards( + architecture: str | None = None, + actively_supported_only: bool = False, + query: str | None = None, + board_level: str | None = None, # "release" | "pr" | "extra" +) -> list[dict[str, Any]]: + """Enumerate PlatformIO envs with Meshtastic metadata. + + Filters are cumulative (AND). `board_level="release"` means envs with no + explicit `board_level` set (the default release targets). 
+ """ + envs = _get_cached() + q = query.lower().strip() if query else None + + out = [] + for rec in envs.values(): + if architecture and rec.get("architecture") != architecture: + continue + if actively_supported_only and not rec.get("actively_supported"): + continue + if board_level is not None: + rec_level = rec.get("board_level") + if board_level == "release": + if rec_level not in (None, ""): + continue + elif rec_level != board_level: + continue + if q: + display = (rec.get("display_name") or "").lower() + env_name = rec.get("env", "").lower() + slug = (rec.get("hw_model_slug") or "").lower() + if q not in display and q not in env_name and q not in slug: + continue + out.append(_public_record(rec)) + + out.sort(key=lambda r: (r.get("architecture") or "", r.get("env"))) + return out + + +def get_board(env: str) -> dict[str, Any]: + """Full metadata for one env, including the raw pio config dict.""" + envs = _get_cached() + rec = envs.get(env) + if rec is None: + raise KeyError( + f"Unknown env: {env!r}. Use list_boards() to see available envs." + ) + public = _public_record(rec) + public["raw_config"] = rec["_raw"] + return public diff --git a/mcp-server/src/meshtastic_mcp/cli/__init__.py b/mcp-server/src/meshtastic_mcp/cli/__init__.py new file mode 100644 index 00000000000..04729b643e1 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/__init__.py @@ -0,0 +1,6 @@ +"""Command-line entry points that sit alongside the MCP server. + +Modules here are loaded on-demand by `[project.scripts]` entries in +`pyproject.toml`. They are NOT imported by `meshtastic_mcp.server` or the +admin/info tool surface — the MCP server stays pure stdio JSON-RPC. +""" diff --git a/mcp-server/src/meshtastic_mcp/cli/_flashlog.py b/mcp-server/src/meshtastic_mcp/cli/_flashlog.py new file mode 100644 index 00000000000..889183bb30e --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_flashlog.py @@ -0,0 +1,73 @@ +"""Flash progress log tailer for ``meshtastic-mcp-test-tui``. 
+ +``pio.py`` / ``hw_tools.py`` tee subprocess output (``pio run -t upload``, +``esptool erase_flash``, ``nrfutil dfu``, etc.) to ``tests/flash.log`` +line-by-line as it arrives — controlled by the ``MESHTASTIC_MCP_FLASH_LOG`` +env var that ``run-tests.sh`` sets. The TUI tails that file so the operator +sees live flash progress in the pytest pane instead of 3 minutes of silence +during ``test_00_bake``. + +Separate from ``_fwlog.py`` because that one parses JSONL, this one +streams plain text lines. Same daemon-thread + EOF-backoff structure. +""" + +from __future__ import annotations + +import pathlib +import threading +import time +from typing import Callable + + +class FlashLogTailer(threading.Thread): + """Tail a plain-text log file, publish each stripped line via ``post``. + + ``post`` is invoked with a single ``str`` for every new line. Lines are + stripped of trailing newlines; empty lines after stripping are dropped. + + The file may not exist yet when this thread starts — it's truncated by + ``run-tests.sh`` at session start, but if the tailer races the shell, + we tolerate FileNotFoundError for up to ``wait_s`` seconds. + """ + + def __init__( + self, + path: pathlib.Path, + post: Callable[[str], None], + stop: threading.Event, + *, + wait_s: float = 30.0, + ) -> None: + super().__init__(daemon=True, name="flashlog-tail") + self._path = path + self._post = post + self._stop = stop + self._wait_s = wait_s + + def run(self) -> None: + deadline = time.monotonic() + self._wait_s + while not self._path.is_file(): + if self._stop.is_set() or time.monotonic() > deadline: + return + time.sleep(0.1) + try: + fh = self._path.open("r", encoding="utf-8", errors="replace") + except OSError: + return + try: + while not self._stop.is_set(): + line = fh.readline() + if not line: + time.sleep(0.05) + continue + line = line.rstrip("\r\n") + if not line: + continue + try: + self._post(line) + except Exception: + # A post failure (e.g. 
closed app) is terminal for this + # thread but we still want to close the file handle. + return + finally: + fh.close() diff --git a/mcp-server/src/meshtastic_mcp/cli/_fwlog.py b/mcp-server/src/meshtastic_mcp/cli/_fwlog.py new file mode 100644 index 00000000000..7db20f81cc8 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_fwlog.py @@ -0,0 +1,96 @@ +"""Firmware log tail worker for ``meshtastic-mcp-test-tui``. + +Complements v1's reportlog-tail worker. ``tests/conftest.py`` owns a +session-scoped autouse fixture (``_firmware_log_stream``) that mirrors +every ``meshtastic.log.line`` pubsub event to ``tests/fwlog.jsonl`` — +one JSON object per line: + + {"ts": 1729100000.123, "port": "/dev/cu.usbmodem1101", "line": "..."} + +The TUI tails that file from a worker thread; each new line becomes a +:class:`FirmwareLogLine` message posted to the App. Same pattern as the +reportlog tail worker — truncate on launch, tolerate missing file for +30 s, back off at EOF. + +Kept in its own module so the (large) ``test_tui.py`` stays focused on +the Textual App shell. +""" + +from __future__ import annotations + +import json +import pathlib +import threading +import time +from typing import Any, Callable + + +class FirmwareLogTailer(threading.Thread): + """Tail ``tests/fwlog.jsonl``, publish parsed records via ``post``. + + ``post`` is the App's ``post_message`` (or any callable that accepts a + single payload arg). We pass parsed dicts rather than constructing + Textual Message objects here — keeps this module free of the + textual dependency so it's unit-testable in a bare venv. + + Parameters + ---------- + path: + Path to ``tests/fwlog.jsonl``. The file may not exist yet at + startup — pytest only creates it once the session fixture runs. + post: + Callable invoked with a dict ``{"ts", "port", "line"}`` for every + new line parsed from the file. + stop: + An event the App sets to signal shutdown. + wait_s: + How long to poll for the file's creation before giving up. 
Default + 30 s; pytest collection on a cold cache can be slow. + + """ + + def __init__( + self, + path: pathlib.Path, + post: Callable[[dict[str, Any]], None], + stop: threading.Event, + *, + wait_s: float = 30.0, + ) -> None: + super().__init__(daemon=True, name="fwlog-tail") + self._path = path + self._post = post + self._stop = stop + self._wait_s = wait_s + + def run(self) -> None: + deadline = time.monotonic() + self._wait_s + while not self._path.is_file(): + if self._stop.is_set() or time.monotonic() > deadline: + return + time.sleep(0.1) + try: + fh = self._path.open("r", encoding="utf-8") + except OSError: + return + try: + while not self._stop.is_set(): + line = fh.readline() + if not line: + time.sleep(0.05) + continue + line = line.strip() + if not line: + continue + try: + record = json.loads(line) + except json.JSONDecodeError: + continue + # Defensive: require the three fields we rely on. + if not isinstance(record, dict): + continue + if "line" not in record: + continue + self._post(record) + finally: + fh.close() diff --git a/mcp-server/src/meshtastic_mcp/cli/_history.py b/mcp-server/src/meshtastic_mcp/cli/_history.py new file mode 100644 index 00000000000..639dcec5f55 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_history.py @@ -0,0 +1,127 @@ +"""Cross-run history for ``meshtastic-mcp-test-tui``. + +Persists one JSON object per pytest run to +``mcp-server/tests/.history/runs.jsonl``. The TUI reads the last N +entries on launch to render a duration sparkline in the header — a +quick read on whether the suite is slowing down over time. + +Schema (keep small; the file can grow for months): + + {"run": 42, "ts": 1729100000.0, "duration_s": 387.2, + "passed": 52, "failed": 0, "skipped": 23, "exit_code": 0, + "seed": "mcp-user-host"} +""" + +from __future__ import annotations + +import json +import pathlib +import time +from dataclasses import asdict, dataclass +from typing import Iterable + +# Sparkline glyphs, low → high. 
8 levels is the Unicode convention.
+_SPARK_BLOCKS = "▁▂▃▄▅▆▇█"
+
+
+@dataclass
+class RunRecord:
+    run: int
+    ts: float
+    duration_s: float
+    passed: int
+    failed: int
+    skipped: int
+    exit_code: int
+    seed: str
+
+
+class HistoryStore:
+    """Append-only JSONL store with bounded read.
+
+    Writes are flushed after each append (the file is tiny; the cost is
+    negligible), so a crash mid-session loses at most the in-flight record.
+    """
+
+    def __init__(self, path: pathlib.Path, *, keep_last: int = 50) -> None:
+        self._path = path
+        self._keep_last = keep_last
+
+    def append(self, record: RunRecord) -> None:
+        try:
+            self._path.parent.mkdir(parents=True, exist_ok=True)
+            with self._path.open("a", encoding="utf-8") as fh:
+                fh.write(json.dumps(asdict(record)) + "\n")
+                fh.flush()
+        except Exception:
+            # Non-fatal: history is cosmetic.
+            pass
+
+    def read_recent(self) -> list[RunRecord]:
+        """Return the last ``keep_last`` records in chronological order."""
+        if not self._path.is_file():
+            return []
+        try:
+            lines = self._path.read_text(encoding="utf-8").splitlines()
+        except OSError:
+            return []
+        out: list[RunRecord] = []
+        # Only parse the last ``keep_last`` lines so a months-old history
+        # file stays cheap to read.
+        for line in lines[-self._keep_last :]:
+            line = line.strip()
+            if not line:
+                continue
+            try:
+                raw = json.loads(line)
+            except json.JSONDecodeError:
+                continue
+            try:
+                out.append(RunRecord(**raw))
+            except TypeError:
+                # Schema drift; skip the record rather than crash.
+ continue + return out + + def record_run( + self, + *, + run: int, + duration_s: float, + passed: int, + failed: int, + skipped: int, + exit_code: int, + seed: str, + ) -> RunRecord: + rec = RunRecord( + run=run, + ts=time.time(), + duration_s=float(duration_s), + passed=int(passed), + failed=int(failed), + skipped=int(skipped), + exit_code=int(exit_code), + seed=seed, + ) + self.append(rec) + return rec + + +def sparkline(values: Iterable[float], *, width: int = 20) -> str: + """Render a Unicode block-character sparkline from the last ``width`` values. + + Returns an empty string for empty input so the header handles + "no history yet" gracefully. + """ + buf = [v for v in values if v >= 0][-width:] + if not buf: + return "" + lo, hi = min(buf), max(buf) + if hi - lo < 1e-9: + return _SPARK_BLOCKS[len(_SPARK_BLOCKS) // 2] * len(buf) + n = len(_SPARK_BLOCKS) - 1 + out = [] + for v in buf: + idx = int(round((v - lo) / (hi - lo) * n)) + out.append(_SPARK_BLOCKS[max(0, min(n, idx))]) + return "".join(out) diff --git a/mcp-server/src/meshtastic_mcp/cli/_reproducer.py b/mcp-server/src/meshtastic_mcp/cli/_reproducer.py new file mode 100644 index 00000000000..420da3c76a7 --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/_reproducer.py @@ -0,0 +1,214 @@ +"""Reproducer bundle builder for ``meshtastic-mcp-test-tui``. + +When the operator presses ``x`` on a failed test leaf, we package the +minimum viable failure context into a tarball under +``mcp-server/tests/reproducers/``: + +:: + + repro--.tar.gz + ├── README.md human-readable overview + ├── test_report.json the failing TestReport event from reportlog + ├── fwlog.jsonl firmware log filtered to the failure window + ├── devices.json per-device device_info + lora config snapshot + └── env.json seed, run #, pytest version, platform, hostname + +Separate module so the logic can be unit-tested without Textual. 
The +TUI glue is thin — one key binding calls :func:`build_reproducer_bundle` +with the focused test's state and shows the path in a modal. +""" + +from __future__ import annotations + +import io +import json +import pathlib +import platform +import re +import socket +import tarfile +import time +from dataclasses import dataclass +from typing import Any, Iterable + + +@dataclass +class ReproContext: + """Everything :func:`build_reproducer_bundle` needs. Shaped to map + cleanly onto the state the TUI already tracks — no extra data + collection required at export time.""" + + nodeid: str + longrepr: str + sections: list[tuple[str, str]] + start_ts: float | None + stop_ts: float | None + seed: str + run_number: int + exit_code: int | None + fwlog_path: pathlib.Path + output_dir: pathlib.Path + extra_device_rows: list[dict[str, Any]] # [{role, port, info, ...}, ...] + + +def _short_nodeid(nodeid: str) -> str: + """Collapse a pytest nodeid into a filename-safe slug (<= 60 chars).""" + # Drop the file path prefix; keep test name + parametrization. + tail = nodeid.split("::", 1)[-1] if "::" in nodeid else nodeid + slug = re.sub(r"[^A-Za-z0-9_.\-]", "_", tail) + return slug[:60].strip("_.-") or "test" + + +def _filtered_fwlog( + fwlog_path: pathlib.Path, + start_ts: float | None, + stop_ts: float | None, + *, + pad_s: float = 5.0, +) -> bytes: + """Return fwlog.jsonl lines whose ``ts`` lies in [start-pad, stop+pad].""" + if not fwlog_path.is_file(): + return b"" + if start_ts is None or stop_ts is None: + # Without a time window, include the whole file — rare; happens + # when a test fails in setup before pytest emitted a start ts. 
+ try: + return fwlog_path.read_bytes() + except OSError: + return b"" + lo, hi = start_ts - pad_s, stop_ts + pad_s + out = io.BytesIO() + try: + with fwlog_path.open("r", encoding="utf-8") as fh: + for line in fh: + stripped = line.strip() + if not stripped: + continue + try: + record = json.loads(stripped) + except json.JSONDecodeError: + continue + ts = record.get("ts") + if not isinstance(ts, (int, float)): + continue + if lo <= ts <= hi: + out.write(line.encode("utf-8")) + except OSError: + return b"" + return out.getvalue() + + +def _readme(ctx: ReproContext) -> str: + t = time.strftime("%Y-%m-%d %H:%M:%S %Z", time.localtime()) + return f"""# Reproducer bundle + +Exported by `meshtastic-mcp-test-tui` on {t}. + +## Failing test + +- **nodeid:** `{ctx.nodeid}` +- **seed:** `{ctx.seed}` +- **run #:** {ctx.run_number} +- **suite exit code (at export time):** {ctx.exit_code if ctx.exit_code is not None else "in progress"} + +## Files in this archive + +| File | Contents | +|---|---| +| `test_report.json` | The pytest-reportlog `TestReport` event for the failing test — includes `longrepr`, captured `sections` (stdout/stderr/log), `duration`, `location`, `keywords`. | +| `fwlog.jsonl` | Firmware log lines (from `meshtastic.log.line` pubsub) filtered to [start−5s, stop+5s] around the test's run window. Each line is `{{ts, port, line}}`. | +| `devices.json` | Per-device snapshot at export time: `device_info` + `lora` config per detected role. | +| `env.json` | Python version, platform, hostname, seed, run number. | + +## How to triage + +1. Open `test_report.json` and read `longrepr` + `sections` — most failures explain themselves there. +2. If the failure is a mesh/telemetry assertion, `fwlog.jsonl` is where the answer usually lives. Grep for `Error=`, `NAK`, `PKI_UNKNOWN_PUBKEY`, `Skip send`, `Guru Meditation`, or the uptime timestamps around the assertion event. +3. Compare `devices.json` against the expected state (e.g. 
`num_nodes >= 2`, `primary_channel == "McpTest"`, `region == "US"`). If fields disagree with the seed-derived USERPREFS profile, the device probably wasn't baked with this session's profile. + +## Reproducing locally + +```bash +cd mcp-server +MESHTASTIC_MCP_SEED='{ctx.seed}' .venv/bin/pytest '{ctx.nodeid}' --tb=long -v +``` +""" + + +def build_reproducer_bundle(ctx: ReproContext) -> pathlib.Path: + """Build a tarball under ``ctx.output_dir`` and return its path. + + Parent dirs are created as needed. Errors during optional sections + (devices, env) are swallowed — the bundle is still useful without + them; refusing to export because the device poller had a hiccup + would be worse than the export missing a file. + """ + ctx.output_dir.mkdir(parents=True, exist_ok=True) + ts = int(time.time()) + slug = _short_nodeid(ctx.nodeid) + archive_path = ctx.output_dir / f"repro-{ts}-{slug}.tar.gz" + + with tarfile.open(archive_path, "w:gz") as tar: + + def _add(name: str, data: bytes) -> None: + info = tarfile.TarInfo(name=name) + info.size = len(data) + info.mtime = ts + tar.addfile(info, io.BytesIO(data)) + + # README + _add("README.md", _readme(ctx).encode("utf-8")) + + # test_report.json — reconstruct from the fields the TUI stashes. 
+ test_report = { + "nodeid": ctx.nodeid, + "outcome": "failed", + "longrepr": ctx.longrepr, + "sections": [list(s) for s in ctx.sections], + "start": ctx.start_ts, + "stop": ctx.stop_ts, + } + _add( + "test_report.json", + json.dumps(test_report, indent=2, default=str).encode("utf-8"), + ) + + # fwlog.jsonl (filtered) + _add("fwlog.jsonl", _filtered_fwlog(ctx.fwlog_path, ctx.start_ts, ctx.stop_ts)) + + # devices.json + try: + devices_payload = json.dumps( + ctx.extra_device_rows or [], indent=2, default=str + ) + except Exception: + devices_payload = "[]" + _add("devices.json", devices_payload.encode("utf-8")) + + # env.json + try: + from importlib.metadata import version as _pkg_version + + pytest_version = _pkg_version("pytest") + except Exception: + pytest_version = "unknown" + env_payload = { + "seed": ctx.seed, + "run": ctx.run_number, + "exit_code": ctx.exit_code, + "export_ts": ts, + "python": platform.python_version(), + "pytest": pytest_version, + "platform": f"{platform.system()} {platform.release()} {platform.machine()}", + "hostname": socket.gethostname(), + } + _add("env.json", json.dumps(env_payload, indent=2).encode("utf-8")) + + return archive_path + + +def iter_entries(archive_path: pathlib.Path) -> Iterable[str]: + """Yield member names — used by callers that want to confirm the bundle shape.""" + with tarfile.open(archive_path, "r:gz") as tar: + for m in tar.getmembers(): + yield m.name diff --git a/mcp-server/src/meshtastic_mcp/cli/test_tui.py b/mcp-server/src/meshtastic_mcp/cli/test_tui.py new file mode 100644 index 00000000000..33201101b1a --- /dev/null +++ b/mcp-server/src/meshtastic_mcp/cli/test_tui.py @@ -0,0 +1,1782 @@ +"""Textual TUI wrapping `mcp-server/run-tests.sh`. + +Launch: ``meshtastic-mcp-test-tui [pytest-args]`` + +The TUI *wraps* ``run-tests.sh``; it never replaces it. Same script, same +env-var resolution, same ``userPrefs.jsonc`` session fixture. Four data +sources drive live state: + +1. 
``tests/reportlog.jsonl`` — written by ``pytest-reportlog``. Tailed in a + worker thread; each JSON line is published as a :class:`ReportLogEvent` + message. This is the authoritative source for tree population + per-test + outcome. +2. The pytest subprocess ``stdout`` + ``stderr`` streams — line-by-line, + published as :class:`PytestLine` messages and rendered verbatim in the + pytest pane. +3. ``tests/fwlog.jsonl`` — firmware log stream. Written by the + ``_firmware_log_stream`` autouse session fixture in ``conftest.py`` + (mirrors every ``meshtastic.log.line`` pubsub event), tailed by the + :class:`FirmwareLogTailer` worker, displayed in a wrap-enabled + RichLog with cycleable port filter. +4. ``devices.list_devices()`` + ``info.device_info(port)`` — polled only at + startup and again after ``RunFinished``. Device polling while pytest + holds a SerialInterface would deadlock on the exclusive port lock; the + existing ``hub_devices`` fixture is session-scoped so there is no safe + "between tests" window. The header reflects this with a "(stale)" + marker while the run is active. + +Key bindings (see :class:`TestTuiApp.BINDINGS`): + ``r`` re-run focused ``f`` filter tree ``d`` failure detail + ``g`` open report.html ``l`` cycle firmware-log port filter + ``x`` export reproducer bundle ``c`` tool-coverage panel + ``q`` / Ctrl-C graceful quit with SIGINT → SIGTERM → SIGKILL escalation + +Shipped today (v1 + v2 slice): test tree + tier counters with progress bars, +pytest tail, live firmware log with port filter, device strip with +"currently running" status column, failure-detail modal, reproducer bundle +export (filters fwlog by test's start/stop timestamps), tool-coverage +modal, cross-run history sparkline in the header, clean SIGINT +propagation. Still open (see the plan file): mesh topology mini-diagram +and airtime / channel-utilization gauges. 
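+
+For orientation, one line from each JSONL source looks roughly like this
+(the reportlog line is abridged and illustrative — the exact field set is
+whatever ``pytest-reportlog`` serializes for a ``TestReport``; the fwlog
+shape is the one ``conftest.py`` writes)::
+
+    {"$report_type": "TestReport", "nodeid": "tests/mesh/test_x.py::test_y[nrf52]",
+     "when": "call", "outcome": "passed", "start": 1729100000.0, "stop": 1729100001.2}
+
+    {"ts": 1729100000.123, "port": "/dev/cu.usbmodem1101", "line": "..."}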
+""" + +from __future__ import annotations + +import argparse +import json +import os +import pathlib +import signal +import subprocess +import sys +import threading +import time +from dataclasses import dataclass, field +from typing import Any, Iterator + +# --------------------------------------------------------------------------- +# Configuration constants +# --------------------------------------------------------------------------- + +# Tier names that map nodeids like "tests//..." to counter buckets. +# Order here == display order in the tier-counters table. Matches the order +# `pytest_collection_modifyitems` in `conftest.py` uses: +# bake → unit → mesh → telemetry → monitor → fleet → admin → provisioning +# so the counters table reads top-to-bottom in execution order. +# +# "bake" is the synthetic tier for `tests/test_00_bake.py` — the file sits +# at the `tests/` root rather than under a tier subdirectory, so without +# this mapping `_tier_of_nodeid` would return "other" and the bake outcomes +# would be silently dropped from both the tier table and the history +# record (which sums tier counters to compute passed/failed/skipped). +TIERS = ( + "bake", + "unit", + "mesh", + "telemetry", + "monitor", + "fleet", + "admin", + "provisioning", +) + +# Relative paths from the mcp-server root. +_REPORTLOG_RELATIVE = "tests/reportlog.jsonl" +_FWLOG_RELATIVE = "tests/fwlog.jsonl" +# pio / esptool / nrfutil / picotool tee subprocess output here when +# `MESHTASTIC_MCP_FLASH_LOG` is set (see `pio._run_capturing`). run-tests.sh +# sets that env var; the TUI also sets it for direct `_spawn_pytest` calls +# so `r`-key re-runs that skip the wrapper still get tee'd output. 
+_FLASHLOG_RELATIVE = "tests/flash.log" +_REPORT_HTML_RELATIVE = "tests/report.html" +_TOOL_COVERAGE_RELATIVE = "tests/tool_coverage.json" +_HISTORY_RELATIVE = "tests/.history/runs.jsonl" +_REPRODUCERS_RELATIVE = "tests/reproducers" +_RUN_TESTS_RELATIVE = "run-tests.sh" +_RUN_COUNTER_RELATIVE = "tests/.tui-runs" + +# Graceful-shutdown budgets (seconds) for the pytest subprocess when the +# user hits `q`. Matches what the existing CLI's atexit + userprefs sidecar +# self-heal expects. +_SIGINT_GRACE_S = 5.0 +_SIGTERM_GRACE_S = 5.0 + + +# --------------------------------------------------------------------------- +# Path resolution +# --------------------------------------------------------------------------- + + +def _mcp_server_root() -> pathlib.Path: + """Locate the mcp-server directory (the one containing run-tests.sh).""" + here = pathlib.Path(__file__).resolve() + # Walk up until we find pyproject.toml with a matching project name, or + # default to the three-up ancestor (src/meshtastic_mcp/cli/test_tui.py → + # .../mcp-server). The walk-up protects against unusual checkouts. + for parent in (here.parent, *here.parents): + if (parent / "pyproject.toml").is_file() and ( + parent / "run-tests.sh" + ).is_file(): + return parent + return here.parents[3] + + +# --------------------------------------------------------------------------- +# Data classes +# --------------------------------------------------------------------------- + + +@dataclass +class LeafReport: + """Per-test state drawn from reportlog events. + + Outcomes mirror pytest's: "passed" | "failed" | "skipped" | "running". + """ + + nodeid: str + tier: str + outcome: str = "pending" + duration_s: float = 0.0 + longrepr: str = "" + # Captured stdout / stderr / firmware-log sections from the test's + # `TestReport.sections` — shown in the failure-detail modal. + sections: list[tuple[str, str]] = field(default_factory=list) + # Wall-clock start/stop from the TestReport event. 
Used by the + # reproducer exporter (`x`) to filter `tests/fwlog.jsonl` down to + # just the lines around the failure window. + start_ts: float | None = None + stop_ts: float | None = None + + +@dataclass +class TierCounters: + tier: str + passed: int = 0 + failed: int = 0 + skipped: int = 0 + running: int = 0 + remaining: int = 0 + + +@dataclass +class DeviceRow: + role: str | None + port: str + vid: str + pid: str + description: str + # Populated from info.device_info when available; empty dict when we + # haven't queried (or when the poller is paused). + info: dict[str, Any] = field(default_factory=dict) + + +@dataclass +class State: + """Shared state owned by the App; written by workers under `lock`. + + UI code reads via Textual Message handlers which run on the UI thread + in the order workers called `post_message` — so reads don't need the + lock themselves. + """ + + lock: threading.Lock = field(default_factory=threading.Lock) + tiers: dict[str, TierCounters] = field( + default_factory=lambda: {t: TierCounters(tier=t) for t in TIERS} + ) + leaves: dict[str, LeafReport] = field(default_factory=dict) + # Ordered list of nodeids in the order they were first seen — lets us + # rebuild the tree deterministically. + nodeid_order: list[str] = field(default_factory=list) + devices: list[DeviceRow] = field(default_factory=list) + run_active: bool = False + exit_code: int | None = None + # nodeid of the currently-running test. Set on `when="setup"` + + # outcome="passed" (body about to execute); cleared on `when="call"` + # (any outcome) or on `when="setup"` + outcome="failed" (no body + # window). Drives the device-table "Status" column so the operator + # can see which test is touching a given device right now. + running_nodeid: str | None = None + # `time.monotonic()` captured when `running_nodeid` was set. 
Surfaced + # as live-updating elapsed-time ("RUNNING: test_bake_nrf52 (1:23)") so + # an operator staring at a ~3 min `test_00_bake` or a `mesh_formation` + # with a 60 s ceiling has concrete evidence the test isn't stuck. + running_started_at: float | None = None + + +# --------------------------------------------------------------------------- +# Helpers +# --------------------------------------------------------------------------- + + +def _tier_of_nodeid(nodeid: str) -> str: + """Map a pytest nodeid to its tier bucket. Unknown → 'other'. + + `tests/test_00_bake.py::...` is special-cased to the synthetic `bake` + tier — it's a top-level file (no tier subdirectory) so the generic + "second path segment" logic would miss it and route the bake outcomes + into the non-existent `other` bucket. + """ + parts = nodeid.split("/", 2) + if len(parts) >= 2 and parts[0] == "tests": + # Bake file sits at `tests/test_00_bake.py` — dedicated bucket. + if parts[1].startswith("test_00_bake"): + return "bake" + candidate = parts[1] + if candidate in TIERS: + return candidate + return "other" + + +def _file_of_nodeid(nodeid: str) -> str: + """Extract the test file name (e.g. 'test_boards.py') from a nodeid.""" + left = nodeid.split("::", 1)[0] + return left.rsplit("/", 1)[-1] + + +def _testname_of_nodeid(nodeid: str) -> str: + """Extract the 'test_foo[param]' suffix from a nodeid, or the full thing.""" + if "::" in nodeid: + return nodeid.split("::", 1)[1] + return nodeid + + +def _roles_from_nodeid(nodeid: str) -> set[str]: + """Infer which device roles a parametrized test touches. 
+ + Patterns we recognize (from the existing ``conftest.py`` parametrization + in ``pytest_generate_tests``): + + - ``test_foo[nrf52]`` → {"nrf52"} (baked_single) + - ``test_foo[nrf52->esp32s3]`` → {"nrf52", "esp32s3"} (mesh_pair) + + Unparametrized tests (no bracket) return an empty set — the caller + should fall back to "this test involves ALL detected devices" rather + than pretending it touches none. + """ + if "[" not in nodeid or not nodeid.endswith("]"): + return set() + try: + inner = nodeid.rsplit("[", 1)[1][:-1] + except Exception: + return set() + # Split on "->" for directed mesh pairs; otherwise treat as single role. + parts = [p.strip() for p in inner.split("->")] if "->" in inner else [inner.strip()] + return {p for p in parts if p} + + +def _parse_events(path: pathlib.Path) -> Iterator[dict[str, Any]]: + """Yield parsed JSON dicts from a reportlog file, skipping malformed lines. + + Used for smoke-testing the parser against a finished file; the live + worker has its own tail loop. + """ + if not path.is_file(): + return + with path.open("r", encoding="utf-8") as fh: + for line in fh: + line = line.strip() + if not line: + continue + try: + yield json.loads(line) + except json.JSONDecodeError: + continue + + +def _load_run_number(counter_path: pathlib.Path) -> int: + """Bump + persist a monotonic run counter used in the TUI header.""" + try: + n = int(counter_path.read_text().strip()) + except Exception: + n = 0 + n += 1 + try: + counter_path.parent.mkdir(parents=True, exist_ok=True) + counter_path.write_text(str(n)) + except Exception: + # Non-fatal: the counter is cosmetic. + pass + return n + + +def _resolve_seed() -> str: + """Mirror the default-seed resolution from run-tests.sh. + + Operator can override via MESHTASTIC_MCP_SEED. Matches the + per-user/per-host default so repeated invocations land on the same PSK + (makes --assume-baked valid across invocations). 
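+
+    Example (illustrative values): ``USER=alice`` on host ``mako.local``
+    with no ``MESHTASTIC_MCP_SEED`` override resolves to ``mcp-alice-mako``
+    (the hostname is truncated at its first dot).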
+ """ + if explicit := os.environ.get("MESHTASTIC_MCP_SEED"): + return explicit + try: + who = os.environ.get("USER") or os.environ.get("LOGNAME") or "anon" + except Exception: + who = "anon" + try: + import socket + + host = socket.gethostname().split(".", 1)[0] + except Exception: + host = "host" + return f"mcp-{who}-{host}" + + +def _format_duration(seconds: float) -> str: + if seconds < 60: + return f"{seconds:5.1f}s" + m, s = divmod(int(seconds), 60) + return f"{m:d}:{s:02d}" + + +# --------------------------------------------------------------------------- +# Textual imports (lazy — only when main() runs, so `_parse_events` can be +# imported by smoke tests without requiring textual installed in every env) +# --------------------------------------------------------------------------- + + +def _import_textual() -> Any: + """Return a namespace carrying every Textual class we use. + + Deferred import keeps `_parse_events` + `_tier_of_nodeid` importable + from tests / smoke scripts without pulling in the UI stack. + """ + import textual + from textual.app import App, ComposeResult + from textual.binding import Binding + from textual.containers import Horizontal, Vertical + from textual.message import Message + from textual.screen import ModalScreen + from textual.widgets import DataTable, Footer, Input, RichLog, Static, Tree + + ns = argparse.Namespace() + ns.App = App + ns.Binding = Binding + ns.ComposeResult = ComposeResult + ns.DataTable = DataTable + ns.Footer = Footer + ns.Horizontal = Horizontal + ns.Input = Input + ns.Message = Message + ns.ModalScreen = ModalScreen + ns.RichLog = RichLog + ns.Static = Static + ns.Tree = Tree + ns.Vertical = Vertical + ns.textual = textual + return ns + + +# --------------------------------------------------------------------------- +# main() — the important scaffolding lives here so that when we bail out +# before entering the Textual event loop (missing terminal, --help, etc.) +# nothing has grabbed the screen yet. 
+# --------------------------------------------------------------------------- + + +def main(argv: list[str] | None = None) -> int: + """Entry point for `meshtastic-mcp-test-tui`.""" + argv = list(argv if argv is not None else sys.argv[1:]) + + parser = argparse.ArgumentParser( + prog="meshtastic-mcp-test-tui", + description=( + "Live Textual TUI wrapping mcp-server/run-tests.sh. " + "Passes any unrecognized arguments through to pytest." + ), + allow_abbrev=False, + ) + parser.add_argument( + "--no-tui", + action="store_true", + help=( + "Skip the TUI and exec run-tests.sh directly. Useful as a health " + "check that the wrapper argv+env resolution is working." + ), + ) + args, pytest_args = parser.parse_known_args(argv) + + root = _mcp_server_root() + run_tests = root / _RUN_TESTS_RELATIVE + reportlog = root / _REPORTLOG_RELATIVE + fwlog = root / _FWLOG_RELATIVE + flashlog = root / _FLASHLOG_RELATIVE + counter = root / _RUN_COUNTER_RELATIVE + + if not run_tests.is_file(): + print( + f"error: could not locate {_RUN_TESTS_RELATIVE} relative to " + f"{root}. Is this the mcp-server checkout?", + file=sys.stderr, + ) + return 2 + + # Always clear stale log files before launching pytest. The TUI's tail + # workers race pytest file-creation; starting from a known-empty state + # avoids mid-line-decode confusion from the prior run. The fwlog session + # fixture also truncates on its end, and run-tests.sh truncates the + # flashlog — triple-truncate is deliberate (whichever side creates the + # file first, it starts empty). + for p in (reportlog, fwlog, flashlog): + try: + p.unlink(missing_ok=True) + except Exception: + pass + + # Compute + persist the run counter for the header (cosmetic). + run_number = _load_run_number(counter) + seed = _resolve_seed() + # Export the seed so the subprocess inherits the SAME value the TUI + # displays. run-tests.sh computes its own fallback if unset, and we'd + # end up with a header / wrapper-header mismatch if we let that happen. 
+ os.environ.setdefault("MESHTASTIC_MCP_SEED", seed) + # Turn on subprocess-output tee'ing so `pio._run_capturing` writes each + # line of pio / esptool / nrfutil / picotool output to `tests/flash.log` + # as it arrives. The TUI tails that file and routes each line to the + # pytest pane so the operator sees live flash progress during long + # `pio run -t upload` / `esptool erase_flash` operations. run-tests.sh + # also sets this when invoked directly — `setdefault` so the wrapper's + # value wins when present. + os.environ.setdefault("MESHTASTIC_MCP_FLASH_LOG", str(flashlog)) + + # --no-tui: exec run-tests.sh directly. Useful for diagnosing wrapper + # env / argv handling without getting into Textual's alternate screen. + if args.no_tui: + cmd = [str(run_tests), *pytest_args] + os.execv(str(run_tests), cmd) # noqa: S606 — intentional + + # Textual UI import is deferred so `--help` and `--no-tui` do not pay + # the ~40 MB startup cost. + try: + tx = _import_textual() + except ImportError as exc: + print( + f"error: textual is not installed ({exc}). Install with: " + f"pip install -e '.[test]'", + file=sys.stderr, + ) + return 2 + + # Narrow-terminal warning (see plan §8 risk 2). Textual itself degrades, + # but a heads-up helps a first-time user. + term = os.environ.get("TERM", "") + if term in ("", "dumb", "screen") and not os.environ.get("TEXTUAL_NO_TERM_HINT"): + print( + f"[hint] TERM={term!r} may render poorly. Try " + f"`TERM=xterm-256color meshtastic-mcp-test-tui ...` if the layout " + f"looks broken.", + file=sys.stderr, + ) + + app = _build_app( + tx=tx, + root=root, + run_tests=run_tests, + reportlog=reportlog, + fwlog=fwlog, + flashlog=flashlog, + seed=seed, + run_number=run_number, + pytest_args=pytest_args, + ) + + # App.run() returns the subprocess exit code via `app.exit(returncode)`. 
+    return_value = app.run()
+    if isinstance(return_value, int):
+        return return_value
+    return 0
+
+
+# ---------------------------------------------------------------------------
+# Everything below is only reachable once Textual is importable. `tx` is
+# the namespace returned by `_import_textual()` so we don't scatter `from
+# textual import ...` across the file.
+# ---------------------------------------------------------------------------
+
+
+def _build_app(
+    *,
+    tx: Any,
+    root: pathlib.Path,
+    run_tests: pathlib.Path,
+    reportlog: pathlib.Path,
+    fwlog: pathlib.Path,
+    flashlog: pathlib.Path,
+    seed: str,
+    run_number: int,
+    pytest_args: list[str],
+) -> Any:
+    """Assemble TestTuiApp with its Textual-dependent inner classes.
+
+    Keeping the class definitions inside a factory means `main()` can
+    short-circuit (--no-tui, terminal-check, argparse error) before we
+    force Textual's import cost.
+    """
+
+    # Helper modules — lazy-imported here so the top-of-file import cost
+    # only kicks in when main() has decided to run the TUI.
+    from . import _flashlog as _flashlog_mod
+    from . import _fwlog as _fwlog_mod
+    from . import _history as _history_mod
+    from . import _reproducer as _reproducer_mod
+
+    # ---------------- Messages ----------------
+
+    class ReportLogEvent(tx.Message):
+        def __init__(self, event: dict[str, Any]) -> None:
+            self.event = event
+            super().__init__()
+
+    class PytestLine(tx.Message):
+        def __init__(self, source: str, line: str) -> None:
+            self.source = source  # "stdout" | "stderr"
+            self.line = line
+            super().__init__()
+
+    class FirmwareLogLine(tx.Message):
+        def __init__(self, record: dict[str, Any]) -> None:
+            # {"ts": float, "port": str | None, "line": str}
+            self.record = record
+            super().__init__()
+
+    class FlashLogLine(tx.Message):
+        """Plain-text line from `tests/flash.log` — pio / esptool / nrfutil /
+        picotool output tee'd by `pio._run_capturing`. Routed to the pytest
+        pane so the operator sees live flash progress during `test_00_bake`
+        instead of 3 minutes of pytest-captured silence."""
+
+        def __init__(self, line: str) -> None:
+            self.line = line
+            super().__init__()
+
+    class DeviceSnapshot(tx.Message):
+        def __init__(self, rows: list[DeviceRow]) -> None:
+            self.rows = rows
+            super().__init__()
+
+    class RunFinished(tx.Message):
+        def __init__(self, returncode: int) -> None:
+            self.returncode = returncode
+            super().__init__()
+
+    # ---------------- Workers ----------------
+
+    class ReportlogWorker(threading.Thread):
+        """Tail `reportlog.jsonl`, publish each event."""
+
+        def __init__(self, app: Any, path: pathlib.Path, stop: threading.Event) -> None:
+            super().__init__(daemon=True, name="reportlog-tail")
+            self._app = app
+            self._path = path
+            self._stop = stop
+
+        def run(self) -> None:
+            # Wait up to 30 s for pytest to create the file (first call on
+            # a cold cache can be slow).
+            wait_deadline = time.monotonic() + 30.0
+            while not self._path.is_file():
+                if self._stop.is_set() or time.monotonic() > wait_deadline:
+                    return
+                time.sleep(0.1)
+            try:
+                fh = self._path.open("r", encoding="utf-8")
+            except OSError:
+                return
+            try:
+                while not self._stop.is_set():
+                    line = fh.readline()
+                    if not line:
+                        time.sleep(0.05)
+                        continue
+                    line = line.strip()
+                    if not line:
+                        continue
+                    try:
+                        event = json.loads(line)
+                    except json.JSONDecodeError:
+                        continue
+                    self._app.post_message(ReportLogEvent(event))
+            finally:
+                fh.close()
+
+    class SubprocessReaderWorker(threading.Thread):
+        """Read one stream line-by-line and publish PytestLine messages."""
+
+        def __init__(
+            self,
+            app: Any,
+            stream: Any,
+            source: str,
+            stop: threading.Event,
+        ) -> None:
+            super().__init__(daemon=True, name=f"subprocess-{source}")
+            self._app = app
+            self._stream = stream
+            self._source = source
+            self._stop = stop
+
+        def run(self) -> None:
+            try:
+                for line in iter(self._stream.readline, ""):
+                    if self._stop.is_set():
+                        break
+                    self._app.post_message(
+                        PytestLine(source=self._source, line=line.rstrip("\n"))
+                    )
+            except Exception:
+                # stream closed / subprocess died; not fatal.
+                pass
+
+    class DevicePollerWorker(threading.Thread):
+        """Poll list_devices() + device_info() at startup and after RunFinished.
+
+        Deliberately NOT polling during the run — `hub_devices` is a
+        session-scoped fixture holding SerialInterfaces across the whole
+        session, and device_info() would deadlock on the exclusive port
+        lock. Header shows "(stale)" during the gap.
+        """
+
+        def __init__(self, app: Any, state: State, stop: threading.Event) -> None:
+            super().__init__(daemon=True, name="device-poller")
+            self._app = app
+            self._state = state
+            self._stop = stop
+            self._trigger = threading.Event()
+
+        def trigger(self) -> None:
+            self._trigger.set()
+
+        def run(self) -> None:
+            # Perform one poll at startup; then wait for explicit triggers.
+            self._poll_once()
+            while not self._stop.is_set():
+                if self._trigger.wait(timeout=0.5):
+                    self._trigger.clear()
+                    if self._stop.is_set():
+                        break
+                    with self._state.lock:
+                        active = self._state.run_active
+                    if active:
+                        continue
+                    self._poll_once()
+
+        def _poll_once(self) -> None:
+            try:
+                from meshtastic_mcp import devices as devices_mod
+                from meshtastic_mcp import info as info_mod
+            except Exception as exc:  # pragma: no cover
+                self._app.post_message(
+                    PytestLine(
+                        source="stderr", line=f"[tui] device import failed: {exc!r}"
+                    )
+                )
+                return
+            rows: list[DeviceRow] = []
+            try:
+                raw = devices_mod.list_devices(include_unknown=True)
+            except Exception as exc:
+                self._app.post_message(
+                    PytestLine(
+                        source="stderr", line=f"[tui] list_devices failed: {exc!r}"
+                    )
+                )
+                return
+            for d in raw:
+                vid_raw = d.get("vid") or ""
+                try:
+                    vid_i = (
+                        int(vid_raw, 16)
+                        if isinstance(vid_raw, str) and vid_raw.startswith("0x")
+                        else int(vid_raw)
+                    )
+                except (TypeError, ValueError):
+                    vid_i = 0
+                role = None
+                if vid_i == 0x239A:
+                    role = "nrf52"
+                elif vid_i in (0x303A, 0x10C4):
+                    role = "esp32s3"
+                if not role and not d.get("likely_meshtastic"):
+                    continue
+                row = DeviceRow(
+                    role=role,
+                    port=d.get("port", ""),
+                    vid=str(vid_raw),
+                    pid=str(d.get("pid") or ""),
+                    description=d.get("description", "") or "",
+                )
+                if role:
+                    try:
+                        row.info = info_mod.device_info(port=row.port, timeout_s=6.0)
+                    except Exception as exc:
+                        row.info = {"error": repr(exc)}
+                rows.append(row)
+            self._app.post_message(DeviceSnapshot(rows=rows))
+
+    # ---------------- Modals ----------------
+
+    class FailureDetailScreen(tx.ModalScreen):
+        """Show a failed test's longrepr + captured sections."""
+
+        BINDINGS = [tx.Binding("escape,q", "dismiss", "close")]
+
+        def __init__(self, leaf: LeafReport, report_html: pathlib.Path) -> None:
+            self._leaf = leaf
+            self._report_html = report_html
+            super().__init__()
+
+        def compose(self) -> Any:
+            yield tx.Static(
+                f"[bold]{self._leaf.nodeid}[/bold] "
+                f"outcome=[red]{self._leaf.outcome}[/red] "
+                f"duration={_format_duration(self._leaf.duration_s)}",
+                id="failure-detail-header",
+            )
+            log = tx.RichLog(
+                highlight=False, markup=False, wrap=False, id="failure-detail-log"
+            )
+            yield log
+            yield tx.Static(
+                f"[dim]Full HTML report: {self._report_html}[/dim] [esc] close",
+                id="failure-detail-footer",
+            )
+
+        def on_mount(self) -> None:
+            log = self.query_one("#failure-detail-log", tx.RichLog)
+            if self._leaf.longrepr:
+                log.write(self._leaf.longrepr)
+                log.write("")
+            for section_name, section_text in self._leaf.sections:
+                log.write(f"--- {section_name} ---")
+                log.write(section_text)
+                log.write("")
+            if not self._leaf.longrepr and not self._leaf.sections:
+                log.write("(no longrepr or captured sections in reportlog event)")
+
+        def action_dismiss(self, _result: Any = None) -> None:
+            self.dismiss()
+
+    class FilterInputScreen(tx.ModalScreen[str]):
+        """Prompt the user for a tree filter substring (empty clears)."""
+
+        BINDINGS = [tx.Binding("escape", "cancel", "cancel")]
+
+        def compose(self) -> Any:
+            yield tx.Static("filter test tree (substring, empty = clear):")
+            yield tx.Input(placeholder="nodeid substring", id="filter-input")
+
+        def on_input_submitted(self, event: Any) -> None:
+            self.dismiss(event.value.strip())
+
+        def action_cancel(self) -> None:
+            self.dismiss(None)
+
+    class CoverageModal(tx.ModalScreen):
+        """Read `tests/tool_coverage.json` (written by `tests/tool_coverage.py`
+        at `pytest_sessionfinish`) and render a two-column summary of which
+        MCP tools got exercised by the run. `(no coverage data yet)` while
+        the run is in flight."""
+
+        BINDINGS = [tx.Binding("escape,q,c", "dismiss", "close")]
+
+        def __init__(self, coverage_path: pathlib.Path) -> None:
+            self._path = coverage_path
+            super().__init__()
+
+        def compose(self) -> Any:
+            yield tx.Static("[bold]MCP tool coverage[/bold]", id="coverage-header")
+            yield tx.RichLog(
+                highlight=False, markup=True, wrap=False, id="coverage-log"
+            )
+            yield tx.Static(
+                f"[dim]{self._path}[/dim] [esc] close",
+                id="coverage-footer",
+            )
+
+        def on_mount(self) -> None:
+            log = self.query_one("#coverage-log", tx.RichLog)
+            if not self._path.is_file():
+                log.write("(no coverage data — tool_coverage.json not written yet)")
+                log.write("")
+                log.write("Coverage is emitted at pytest_sessionfinish; this")
+                log.write("file appears after the suite completes.")
+                return
+            try:
+                data = json.loads(self._path.read_text(encoding="utf-8"))
+            except Exception as exc:
+                log.write(f"[red]failed to read {self._path}:[/red] {exc!r}")
+                return
+            calls = data.get("calls") or {}
+            if not calls:
+                log.write("(tool_coverage.json present but no calls recorded)")
+                return
+            exercised = sorted(
+                ((n, c) for n, c in calls.items() if c > 0), key=lambda x: -x[1]
+            )
+            unexercised = sorted(n for n, c in calls.items() if c == 0)
+            log.write(f"[b]{len(exercised)} / {len(calls)} MCP tools exercised[/b]")
+            log.write("")
+            log.write("[green]exercised[/green] (count):")
+            for name, count in exercised:
+                log.write(f" {count:>4} {name}")
+            if unexercised:
+                log.write("")
+                log.write("[dim]not exercised:[/dim]")
+                for name in unexercised:
+                    log.write(f" {name}")
+
+        def action_dismiss(self, _result: Any = None) -> None:
+            self.dismiss()
+
+    class ReproducerResultModal(tx.ModalScreen):
+        """Show the exported reproducer tarball path with a short instruction."""
+
+        BINDINGS = [tx.Binding("escape,q,enter", "dismiss", "close")]
+
+        def __init__(
+            self, archive_path: pathlib.Path, error: str | None = None
+        ) -> None:
+            self._archive = archive_path
+            self._error = error
+            super().__init__()
+
+        def compose(self) -> Any:
+            if self._error:
+                yield tx.Static(f"[red]Reproducer export failed:[/red] {self._error}")
+            else:
+                yield tx.Static("[bold green]Reproducer bundle written[/bold green]")
+                yield tx.Static(f"[cyan]{self._archive}[/cyan]")
+                yield tx.Static("")
+                yield tx.Static(
+                    "Contains: README.md, test_report.json, fwlog.jsonl (time-filtered),"
+                )
+                yield tx.Static(
+                    "devices.json, env.json. Attach to an issue / paste the path in chat."
+                )
+            yield tx.Static("")
+            yield tx.Static("[dim][esc] close[/dim]")
+
+        def action_dismiss(self, _result: Any = None) -> None:
+            self.dismiss()
+
+    # ---------------- App ----------------
+
+    class TestTuiApp(tx.App):
+        CSS = """
+        Screen { layout: vertical; }
+        #header-bar { height: 2; padding: 0 1; background: $panel; }
+        #tier-table { height: auto; max-height: 11; }
+        #body { height: 1fr; }
+        #tree-pane { width: 50%; border-right: solid $primary-background; }
+        #right-pane { width: 50%; layout: vertical; }
+        #pytest-pane { height: 50%; border-bottom: solid $primary-background; }
+        #fwlog-header { height: 1; padding: 0 1; background: $panel; }
+        #fwlog-pane { height: 1fr; }
+        Tree { height: 100%; }
+        RichLog { height: 100%; }
+        #device-table { height: auto; max-height: 6; }
+        """
+
+        TITLE = "mcp-server test runner"
+
+        BINDINGS = [
+            tx.Binding("r", "rerun_focused", "re-run focused"),
+            tx.Binding("f", "filter_tree", "filter"),
+            tx.Binding("d", "failure_detail", "failure detail"),
+            tx.Binding("g", "open_html_report", "open report.html"),
+            tx.Binding("x", "export_reproducer", "export reproducer"),
+            tx.Binding("c", "coverage_panel", "coverage"),
+            tx.Binding("l", "cycle_fwlog_filter", "fw log filter"),
+            tx.Binding("q,ctrl+c", "quit_app", "quit"),
+        ]
+
+        def __init__(self) -> None:
+            super().__init__()
+            self._state = State()
+            self._root = root
+            self._run_tests = run_tests
+            self._reportlog = reportlog
+            self._fwlog = fwlog
+            self._flashlog = flashlog
+            self._report_html = root / _REPORT_HTML_RELATIVE
+            self._tool_coverage = root / _TOOL_COVERAGE_RELATIVE
+            self._repro_dir = root / _REPRODUCERS_RELATIVE
+            self._seed = seed
+            self._run_number = run_number
+            self._pytest_args = pytest_args
+            self._start_time = time.monotonic()
+            self._proc: subprocess.Popen[str] | None = None
+            self._stop = threading.Event()
+            self._reportlog_worker: ReportlogWorker | None = None
+            self._stdout_worker: SubprocessReaderWorker | None = None
+            self._stderr_worker: SubprocessReaderWorker | None = None
+            self._device_worker: DevicePollerWorker | None = None
+            self._fwlog_worker: _fwlog_mod.FirmwareLogTailer | None = None
+            self._flashlog_worker: _flashlog_mod.FlashLogTailer | None = None
+            self._tree_filter: str = ""
+            self._sigint_count = 0
+            # Firmware-log port filter: None = all, else exact port match.
+            self._fwlog_filter: str | None = None
+            # Ordered set of distinct ports we've seen firmware log lines
+            # from — the `l` key cycles through these.
+            self._fwlog_ports: list[str] = []
+            # Cross-run history.
+            self._history_store = _history_mod.HistoryStore(
+                root / _HISTORY_RELATIVE, keep_last=40
+            )
+            self._history_cache = self._history_store.read_recent()
+
+        # -------- composition / mount --------
+
+        def compose(self) -> Any:
+            yield tx.Static(self._header_text(), id="header-bar")
+            tier_table = tx.DataTable(id="tier-table", show_cursor=False)
+            yield tier_table
+            with tx.Horizontal(id="body"):
+                with tx.Vertical(id="tree-pane"):
+                    yield tx.Tree("tests", id="test-tree")
+                with tx.Vertical(id="right-pane"):
+                    with tx.Vertical(id="pytest-pane"):
+                        yield tx.RichLog(
+                            id="pytest-log",
+                            highlight=False,
+                            markup=False,
+                            wrap=False,
+                            max_lines=5000,
+                        )
+                    yield tx.Static(self._fwlog_header_text(), id="fwlog-header")
+                    with tx.Vertical(id="fwlog-pane"):
+                        yield tx.RichLog(
+                            id="fwlog-log",
+                            highlight=False,
+                            markup=False,
+                            # `wrap=True` so long firmware log lines (some
+                            # hit ~200 chars — full packet hex dumps plus
+                            # source tags) don't get truncated at the
+                            # right edge. The right pane is ~50% of the
+                            # terminal so even a wide terminal has a
+                            # ~90-char cap; plain truncation dropped the
+                            # uptime counter or packet id off the end.
+                            wrap=True,
+                            max_lines=5000,
+                        )
+            yield tx.DataTable(id="device-table", show_cursor=False)
+            yield tx.Footer()
+
+        def _fwlog_header_text(self) -> str:
+            filt = self._fwlog_filter or "(all ports)"
+            return f"firmware log filter: [b]{filt}[/b] [l] cycle"
+
+        def on_mount(self) -> None:
+            # Tier-counters table. `add_column` (singular) lets us pick
+            # the key explicitly — `add_columns` (plural) in textual 8.x
+            # returns auto-generated keys that are tedious to track
+            # separately, and update_cell(column_key=