Skip to content

AG-81 - Enhance Command Registry with Extensible Dispatch System#91

Open
JeffMboya wants to merge 4 commits into
absmach:mainfrom
JeffMboya:AG-81
Open

AG-81 - Enhance Command Registry with Extensible Dispatch System#91
JeffMboya wants to merge 4 commits into
absmach:mainfrom
JeffMboya:AG-81

Conversation

@JeffMboya

@JeffMboya JeffMboya commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

What does this do?

Adds an extensible command registry with metadata, a help command, per-command auth, plus route, control lifecycle, and ota status commands.

Which issue(s) does this PR fix/relate to?

Resolves #81

List any changes that modify/break current functionality

None breaking. Authorization moves from a global gate to per-command; all built-ins still require a token.

Have you included tests for your changes?

Yes — registry, per-command auth, help, ota status, route validation, and all control subcommands are covered.

Did you document any new/modified functionality?

Yes — README command table and usage examples updated for help, control, ota status, and route.

Notes

New commands are MQTT-only (SenML), matching the issue format; no HTTP endpoints added.

@rodneyosodo rodneyosodo left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a few minor things

Comment thread service.go
if len(args) < 2 || args[0] == "" || args[1] == "" {
return errors.Wrap(errRouteFailed, errInvalidCommand)
}
deviceID, hexPayload := args[0], args[1]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue 1: Error Handling in Route

The Route command publishes errors to the "control" topic but doesn't validate device existence before attempting operations. This could lead to confusing error messages when a device doesn't exist vs when there's an actual interface error.

Recommendation: Consider adding device existence validation before opening interfaces to provide clearer error messages for operators to diagnose issues.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in f5bf596Route now resolves the device with devices.Get(deviceID) up front, so a missing device returns a clear not-found error before any interface open is attempted.

Comment thread service.go
if a.store == nil {
return notConfigured
}
for key, val := range a.store.All() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue 2: Configuration Reload Validation

The controlReload function applies persisted values without validating they match expected types/ranges. If invalid values were persisted (e.g., through manual file editing), this could cause runtime issues.

Recommendation: Add validation before applying values and return which keys were updated for auditability.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in f5bf596 — each persisted value is now validated via the existing validateSettableValue before being applied; invalid entries are skipped and logged, and the response lists the applied keys for auditability (e.g. reloaded:log_level).

Comment thread service.go Outdated
// per-device scheduler using the agent's run context.
func (a *agent) controlStart() string {
a.paused.Store(false)
if a.sched != nil && a.runCtx != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue 3: Nil runCtx Risk in controlStart

The runCtx field is set to the ctx parameter in New(), but if this context is canceled before controlStart is called, restarting the scheduler will fail. This could leave the agent in an inconsistent state where stop works but start fails.

Recommendation: Add a nil check and consider using a separate lifecycle context that isn't tied to the initialization context.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in f5bf596controlStart now skips the scheduler restart when runCtx is nil or already cancelled and logs the reason instead of failing silently. A fully independent lifecycle context would leak device goroutines past process shutdown (the scheduler ctx must descend from the process ctx), so the explicit cancelled-context guard is the safer fix here.

Enhance the command dispatch system per issue absmach#81:

- Add command metadata (name, description, usage) to the registry and a
  discoverable `help` command that reports it.
- Enforce authorization per command via a RequiresAuth flag instead of a
  single global token gate.
- Add `route` command to forward a hex payload to a downstream device
  interface and return its response.
- Extend `control` with stop/start (pause/resume heartbeat, telemetry, and
  the device scheduler), reload (re-apply persisted config overrides), and
  status, keeping the existing nodered-* passthrough.
- Add `ota,status` to query OTA state over MQTT.
- Wire Route through the logging/metrics middleware and regenerate mocks.
- Cover the registry, per-command auth, help, route, and control subcommands
  with tests; document the commands in the README.

Signed-off-by: JeffMboya <jangina.mboya@gmail.com>
- Route: resolve the device up front so a missing device returns a clear
  error instead of an opaque interface-open failure.
- controlReload: validate each persisted value before applying, skip and
  log invalid entries, and report the applied keys for auditability.
- controlStart: skip restarting the scheduler when the run context is nil
  or already cancelled, logging the reason instead of failing silently.
- Extend tests for the new reload response and validation behaviour.

Signed-off-by: JeffMboya <jangina.mboya@gmail.com>
Document the commands added in this branch now that command docs live under
docs/ rather than the README:

- control.md: extensible registry + per-command auth note, route and help in
  the subsystem table, and sections for control lifecycle (stop/start/reload/
  status), route, and help.
- ota.md: query OTA status via the commands channel (ota,status).

Signed-off-by: JeffMboya <jangina.mboya@gmail.com>
- devices.md: note the route command as a one-shot write-then-read to a
  device interface, linking to its control.md section.
- docs/README.md: update the control.md index entry to mention agent
  lifecycle, route, and help.

Signed-off-by: JeffMboya <jangina.mboya@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhance Command Registry with Extensible Dispatch System

2 participants