81 changes: 81 additions & 0 deletions .github/workflows/rust-data-daemon.yaml
@@ -0,0 +1,81 @@
name: Rust Data Daemon

on:
  pull_request:
    branches:
      - main
    paths:
      - 'rust/**'
      - '.github/workflows/rust-data-daemon.yaml'
  push:
    branches:
      - main
    paths:
      - 'rust/**'
      - '.github/workflows/rust-data-daemon.yaml'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build-and-test:
    runs-on: ubuntu-22.04
    defaults:
      run:
        working-directory: rust
    steps:
      - name: Checkout code
        uses: actions/checkout@v6

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable
        with:
          components: rustfmt, clippy

      - name: Cache cargo build
        uses: Swatinem/rust-cache@v2
        with:
          workspaces: rust

      # PyO3 builds need a discoverable Python interpreter; the runner's default
      # /usr/bin/python is fine, but actions/setup-python pins the version we
      # test the cdylib against.
      - name: Set up Python
        id: setup-python
        uses: actions/setup-python@v6
        with:
          python-version: "3.11"

      - name: Check formatting
        run: cargo fmt --check

      - name: Lint with clippy (workspace)
        env:
          PYO3_PYTHON: ${{ steps.setup-python.outputs.python-path }}
        run: cargo clippy --workspace --all-targets -- -D warnings

      # The video-encoder tests early-return when ffmpeg is absent, so without
      # this the encode/preflight path (and the ffmpeg-4.4.2 `-vsync` vs
      # `-fps_mode` compatibility the daemon defends against) is never exercised
      # in CI. ubuntu-22.04 ships ffmpeg 4.4.x, matching the target host.
      - name: Install ffmpeg
        run: |
          sudo apt-get update
          sudo apt-get install -y ffmpeg

      - name: Test (workspace)
        env:
          PYO3_PYTHON: ${{ steps.setup-python.outputs.python-path }}
        run: cargo test --workspace --verbose

      - name: Build release (workspace)
        env:
          PYO3_PYTHON: ${{ steps.setup-python.outputs.python-path }}
        run: cargo build --release --workspace --verbose

      - name: Build documentation
        env:
          RUSTDOCFLAGS: -D warnings
          PYO3_PYTHON: ${{ steps.setup-python.outputs.python-path }}
        run: cargo doc --no-deps --document-private-items
7 changes: 7 additions & 0 deletions .gitignore
@@ -121,3 +121,10 @@ examples/logs/
examples/test_streaming/

.data_daemon_test_state/

# Bundled rust artefacts built into the package tree by
# `rust/scripts/build_wheel_artefacts.sh`. Both are per-machine and per-build,
# so we never commit them; the wheel-build CI job recreates them on every run.
# See docs/rust_data_daemon_development.md#packaging-the-wheel.
neuracore/data_daemon/bin/
neuracore/data_daemon/_native_producer*.so
21 changes: 21 additions & 0 deletions .pre-commit-config.yaml
@@ -69,3 +69,24 @@ repos:
- .git/COMMIT_EDITMSG
stages: [commit-msg]
always_run: true

# Rust formatting and linting for the data-daemon rewrite crate.
# Uses the system cargo toolchain (rustup) so the rustfmt / clippy version
# matches the developer's local install and the version CI uses.
- repo: local
hooks:
- id: cargo-fmt
name: cargo fmt
description: Format Rust sources in the data-daemon crate.
entry: cargo fmt --manifest-path rust/data_daemon/Cargo.toml --
language: system
files: ^rust/data_daemon/.*\.rs$
pass_filenames: false

- id: cargo-clippy
name: cargo clippy
description: Lint the data-daemon crate with clippy (warnings denied).
entry: cargo clippy --manifest-path rust/data_daemon/Cargo.toml --all-targets -- -D warnings
language: system
files: ^rust/data_daemon/.*\.(rs|toml|lock)$
pass_filenames: false
1 change: 1 addition & 0 deletions README.md
@@ -152,6 +152,7 @@ predictions = policy.predict(timeout=5)
- [Environment Variables](./docs/environment_variable.md)
- [Contribution Guide](./docs/contribution_guide.md)
- [Data Daemon](./docs/data_daemon.md)
- [Rust Data Daemon — Developer Guide](./docs/rust_data_daemon_development.md) — building the [rust/](./rust/) workspace that ships inside the wheel as the data-daemon binary + `neuracore.data_daemon._native_producer` cdylib.

# 💬 Community

3 changes: 1 addition & 2 deletions cSpell.json
@@ -18,6 +18,5 @@
"language": "en,en-GB",
"dictionaries": [
"neuracore-dictionary"
],
"words": []
]
}
26 changes: 26 additions & 0 deletions docs/contribution_guide.md
@@ -323,6 +323,32 @@ If you encounter issues with your algorithm:
- When uploading as a ZIP, make sure your module imports are correctly structured


## Development environment

### Python

```bash
git clone https://github.com/neuracoreai/neuracore
cd neuracore
pip install -e .[dev,ml]
pre-commit install
```

### Rust toolchain (data daemon)

The data daemon is being rewritten in Rust under [rust/](../rust/). The pre-commit configuration runs `cargo fmt` and `cargo clippy` against the `data_daemon` crate, and CI ([.github/workflows/rust-data-daemon.yaml](../.github/workflows/rust-data-daemon.yaml)) builds and tests the Rust workspace on every PR to `main` that touches `rust/**` or the workflow file itself.

The hooks invoke `cargo` from your `PATH` (`language: system`), so you need a working Rust toolchain locally to commit changes that touch the crate. Install one via [rustup](https://rustup.rs/):

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup component add rustfmt clippy
```

If you do not touch any file under `rust/data_daemon/`, the cargo hooks are skipped and rustup is not required.
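
To see which staged paths would trigger the cargo hooks, the sketch below applies the two `files:` patterns from `.pre-commit-config.yaml` the way pre-commit does, as Python regular expressions searched against each repository-relative path. The helper name `hooks_triggered` is hypothetical and exists only for illustration.

```python
import re

# Patterns copied from the `files:` keys of the two cargo hooks.
CARGO_FMT_FILES = re.compile(r"^rust/data_daemon/.*\.rs$")
CARGO_CLIPPY_FILES = re.compile(r"^rust/data_daemon/.*\.(rs|toml|lock)$")


def hooks_triggered(staged_paths):
    """Return the ids of the cargo hooks whose `files` pattern matches a path."""
    triggered = []
    if any(CARGO_FMT_FILES.search(path) for path in staged_paths):
        triggered.append("cargo-fmt")
    if any(CARGO_CLIPPY_FILES.search(path) for path in staged_paths):
        triggered.append("cargo-clippy")
    return triggered


if __name__ == "__main__":
    print(hooks_triggered(["docs/data_daemon.md"]))
    print(hooks_triggered(["rust/data_daemon/Cargo.toml"]))
    print(hooks_triggered(["rust/data_daemon/src/main.rs"]))
```

A docs-only commit matches neither pattern, so no Rust toolchain is needed for it. A change to `Cargo.toml` inside the crate triggers only the clippy hook.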

For the full developer workflow on the Rust crates — workspace layout, build/test/lint commands, running the daemon locally, the PyO3 producer cdylib, and SQLite state inspection — see [rust_data_daemon_development.md](rust_data_daemon_development.md).

## Release Process

### Branch Strategy
88 changes: 56 additions & 32 deletions docs/data_daemon.md
@@ -15,7 +15,7 @@ Profiles are optional. If you do not use a named profile, the daemon uses the de
- How to run the daemon (CLI or from a script)
- How profiles work (optional) and where they are stored
- The configuration fields you can set
- Environment variables that control DB path, recordings root, and upload concurrency
- Environment variables that control DB path, recordings root, and other runtime settings
- The order of precedence (defaults, profile, environment variables, CLI)
- What happens to old daemon databases at startup (automatic schema migration)
- A full CLI reference for the commands currently in use
@@ -32,13 +32,21 @@ It does not explain internal implementation details.
pip install -e .
```

Optional, but recommended for video performance:
Recommended for video recording, and **required** when running the Rust daemon
(`NCD_RUST_DAEMON=1`):

```bash
sudo apt-get update && sudo apt-get install -y ffmpeg
```

The data daemon prefers the `ffmpeg` CLI encoder for recording. If the binary is not installed or encoder init fails, it automatically falls back to PyAV.
Both daemons encode video with the `ffmpeg` CLI, but they differ when `ffmpeg`
is missing or fails to initialise:
- The **Rust daemon** shells out to `ffmpeg` and runs a preflight at startup; if
the binary is missing or the build is incompatible it fails fast with a clear
message rather than starting and dropping every video recording. Install
`ffmpeg` before launching.
- The **legacy Python daemon** prefers the `ffmpeg` CLI but automatically falls
back to PyAV when `ffmpeg` is unavailable or the encoder fails to initialise.

### 2) Launch the daemon

@@ -115,13 +123,18 @@ When you run:
neuracore data-daemon launch
```

the CLI starts the daemon as a separate Python process by running:
the CLI launches the daemon as a separate background process. There are two
daemon implementations and the launcher picks one based on the `NCD_RUST_DAEMON`
flag (see [rust_data_daemon_development.md](rust_data_daemon_development.md)):

```text
python -m neuracore.data_daemon.runner_entry
```
- **Rust daemon** — when `NCD_RUST_DAEMON` is truthy,
the launcher `exec`s the bundled native binary shipped in the wheel at
`neuracore/data_daemon/bin/data-daemon`. This is the implementation described
throughout this guide.
- **Legacy Python daemon (default)** — when `NCD_RUST_DAEMON` is unset or not
truthy, the launcher runs the Python implementation instead.
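
The selection described above can be sketched roughly as follows. This is an illustration, not the launcher's actual code: the set of values treated as "truthy" is an assumption (this guide does not list them), the function name `select_daemon_command` is hypothetical, and the Python entry point is the `neuracore.data_daemon.runner_entry` module named in the earlier version of this section.

```python
import sys

# Assumption: the guide does not say which values count as truthy.
TRUTHY_VALUES = {"1", "true", "yes", "on"}

# Path of the bundled binary inside the wheel, as stated in this guide.
RUST_DAEMON_BINARY = "neuracore/data_daemon/bin/data-daemon"


def select_daemon_command(env):
    """Return the argv the launcher would start for the given environment."""
    flag = env.get("NCD_RUST_DAEMON", "").strip().lower()
    if flag in TRUTHY_VALUES:
        # Rust daemon: run the native binary shipped in the wheel.
        return [RUST_DAEMON_BINARY]
    # Legacy Python daemon (default): run the Python module instead.
    return [sys.executable, "-m", "neuracore.data_daemon.runner_entry"]


if __name__ == "__main__":
    import os

    print(select_daemon_command(dict(os.environ)))
```

Passing the environment in as a plain mapping keeps the decision easy to test without touching the real process environment.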

That daemon process:
Either daemon process:
- boots the internal components it needs
- starts its main loop
- stays running until you stop it (or the machine shuts down)
@@ -132,20 +145,16 @@ You may see simple messages when it stops:

### Startup and schema migration

On startup, the daemon initializes the SQLite store and ensures schema compatibility.

If an older single-table schema is detected (legacy `traces.status` format), the daemon
automatically migrates data to the current schema:
On startup the daemon opens its SQLite store (WAL mode) and applies any pending
schema migrations before serving requests.

- `traces` rows are transformed into lifecycle fields:
- `write_status`
- `registration_status`
- `upload_status`
- `recordings` rows are generated per unique `recording_id`
- Existing trace metadata/bytes/error fields are preserved
- Migration runs before normal startup reconciliation

Migration runs once per DB file. After a successful migration, startup continues normally.
The Rust daemon's schema is defined by the SQL migrations under
[rust/data_daemon/migrations/](../rust/data_daemon/migrations/) and applied
automatically with `sqlx::migrate!`. A fresh database is created and migrated on
first launch; an existing one has only the not-yet-applied migrations run. There
is no legacy single-table conversion — the migrations are the single source of
truth for the schema. To inspect the live database see
[rust_data_daemon_development.md#sqlite-state-inspection](rust_data_daemon_development.md#sqlite-state-inspection).
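
The "only the not-yet-applied migrations run" behaviour can be illustrated with a small Python sketch. It is a conceptual model only, not how `sqlx::migrate!` is implemented: the bookkeeping table `applied_migrations`, the function name, and the example SQL are all hypothetical.

```python
import sqlite3


def apply_pending_migrations(conn, migrations):
    """Apply (version, sql) pairs not yet recorded; return the versions applied."""
    # Hypothetical bookkeeping table recording which versions have run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (version INTEGER PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT version FROM applied_migrations")}
    applied = []
    for version, sql in sorted(migrations):
        if version in done:
            continue  # already applied on an earlier launch
        # `with conn` commits on success and rolls back a pending
        # transaction if an exception is raised.
        with conn:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO applied_migrations (version) VALUES (?)", (version,)
            )
        applied.append(version)
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Illustrative migrations; the real ones live under rust/data_daemon/migrations/.
    migrations = [
        (1, "CREATE TABLE recordings (recording_id TEXT PRIMARY KEY)"),
        (2, "CREATE TABLE traces (trace_id TEXT PRIMARY KEY)"),
    ]
    print(apply_pending_migrations(conn, migrations))
    print(apply_pending_migrations(conn, migrations))
```

Running the function twice against the same connection applies every migration the first time and none the second, which mirrors the fresh-database and existing-database cases described above.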

---

@@ -290,11 +299,6 @@ export NEURACORE_DAEMON_DB_PATH=/workspaces/neuracore/data_daemon_state.db
export NEURACORE_DAEMON_RECORDINGS_ROOT=/workspaces/neuracore/recordings
```

Recommended upload concurrency:
- Most machines: `5-10`
- Start at `5`, increase only if CPU/network/disk are stable
- Very high values can increase retries, memory pressure, and shutdown latency

---

## CLI reference
@@ -366,9 +370,11 @@ Notes:
### `neuracore data-daemon launch`

```bash
neuracore data-daemon launch [--profile <name>] [--background]
neuracore data-daemon launch [--profile <name>] [--background] [--debug]
```

`--debug` raises the log level (equivalent to setting `NDD_DEBUG=1`).

Examples:

```bash
@@ -395,6 +401,18 @@ neuracore data-daemon status
neuracore data-daemon stop
```

### `neuracore data-daemon reset`

Stops the daemon (if running) and removes **all** of its local state: the
recordings tree, the SQLite database, the PID file, and the shared-memory
artefacts. This is destructive and cannot be undone — use it to return a wedged
host to a clean slate.

```bash
neuracore data-daemon reset # prompts for confirmation
neuracore data-daemon reset --yes # skip the prompt (for scripts)
```

---

## Offline Recordings
@@ -463,17 +481,23 @@ neuracore data-daemon launch

### Which video encoder backend is being used

The recording encoder selects backend at runtime:
- Uses `ffmpeg` CLI when `ffmpeg` is available on `PATH`
- Falls back to PyAV when `ffmpeg` is unavailable or fails to initialize
Both daemons encode video with `ffmpeg`, but they handle a missing or broken
`ffmpeg` differently:
- **Rust daemon** (`NCD_RUST_DAEMON=1`) — verifies `ffmpeg` at startup. If the
binary is missing from `PATH`, or the local build cannot run the encode the
daemon needs, the preflight fails and the daemon refuses to start (rather than
starting and silently dropping every video recording).
- **Legacy Python daemon** (default) — uses the `ffmpeg` CLI when it is on
`PATH` and falls back to PyAV when `ffmpeg` is unavailable or fails to
initialise.

Quick check:
Confirm `ffmpeg` is installed and runnable:

```bash
ffmpeg -version
```

If this command succeeds, the daemon will use the FFmpeg backend for new recordings.
If that command fails, install `ffmpeg` (see [Quick start](#1-install-from-repo-root)) for the Rust daemon, or rely on the PyAV fallback under the Python daemon.
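
For scripts that want to run the same quick check before launching, a minimal sketch follows. It only confirms that `ffmpeg` is on `PATH` and that `ffmpeg -version` exits cleanly. The Rust daemon's own preflight also verifies that the build can run the encode it needs, which this sketch does not attempt. The function name `ffmpeg_preflight` and its injectable `which` and `run` parameters are hypothetical.

```python
import shutil
import subprocess


def ffmpeg_preflight(which=shutil.which, run=subprocess.run):
    """Return None if `ffmpeg -version` runs cleanly, else an error message."""
    path = which("ffmpeg")
    if path is None:
        return "ffmpeg not found on PATH"
    try:
        result = run([path, "-version"], capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"ffmpeg could not be executed: {exc}"
    if result.returncode != 0:
        return f"ffmpeg -version exited with status {result.returncode}"
    return None


if __name__ == "__main__":
    problem = ffmpeg_preflight()
    print("ffmpeg OK" if problem is None else problem)
```

The lookup and the process runner are parameters so the check can be exercised without `ffmpeg` installed.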

### Migration issues on startup
