feat(gnmi-discovery): event-driven gNMI discovery backend #436
leoparente wants to merge 98 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iasing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…fake Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements GnmicDialer + gnmicSession satisfying the Session/Dialer interfaces using github.com/openconfig/gnmic/pkg/api. All gnmi proto types are confined to gnmi/gnmic.go; build/vet/test green; no leakage. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…de Decimal64 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sh-in-name paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ncile Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rrors, active-mode Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Expand _base.yaml with the H-2 comment explaining why speed and lag_member are intentionally excluded from the interface keys (subscribed-but-untranslated → churn; deferred pending Interface.Speed / Interface.Lag mapping in translate.go). Add arista_eos.yaml and nokia_sros.yaml vendor overlays that extend _base. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add TestDryRunEOSGolden which drives the real Runner with a FakeSession replaying a canned EOS notification stream (testdata/eos_stream.json) and asserts the ingested entity set: exactly 1 Device, 1 Interface, and 1 Module (ModuleBay is emitted separately). The test pins profile: arista_eos and uses the existing recordingClient/FakeDialer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…; robust vendor detection
…ics shutdown, content-type/timeout hardening Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ction (real-transport path) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ic); robust idempotent metrics shutdown Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codex review
…oned IPv6
- server: getStatus computes uptime from the copied st.StartTime (not s.stat), consistent with the race-avoidance copy.
- policy: ensurePort validates a zone-qualified IPv6 literal (fe80::1%eth0) by ParseIP-ing the zone-stripped host, then brackets the original (with zone) and appends the default port, instead of treating it as malformed. Added regression cases (manager + ensurePort table tests).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
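A minimal sketch of the zone-aware port defaulting this commit describes. The signature (target plus default port), the error text, and the placeholder port 57400 are assumptions for illustration; the PR's actual ensurePort may differ.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// ensurePort is a hypothetical reconstruction: it returns target unchanged
// when it already carries a port, and otherwise appends defaultPort.
func ensurePort(target, defaultPort string) (string, error) {
	// Already "host:port" or "[v6]:port": keep as-is.
	if host, port, err := net.SplitHostPort(target); err == nil && host != "" && port != "" {
		return target, nil
	}
	// Accept a bare bracketed literal such as "[::1]" by dropping the brackets.
	host := strings.TrimSuffix(strings.TrimPrefix(target, "["), "]")
	if strings.Contains(host, ":") {
		// Looks like an IPv6 literal. net.ParseIP rejects zones, so validate
		// the zone-stripped address but keep the zone in the final result.
		bare := host
		if i := strings.IndexByte(host, '%'); i >= 0 {
			bare = host[:i]
		}
		if net.ParseIP(bare) == nil {
			return "", fmt.Errorf("malformed target %q", target)
		}
	}
	// JoinHostPort adds brackets whenever host contains ':' or '%'.
	return net.JoinHostPort(host, defaultPort), nil
}

func main() {
	for _, t := range []string{"fe80::1%eth0", "10.0.0.1", "[2001:db8::1]:6030", "fe80::zz%eth0"} {
		addr, err := ensurePort(t, "57400")
		fmt.Println(t, "->", addr, err)
	}
}
```

Validating the zone-stripped host while joining the original keeps the scope identifier, which a link-local address needs in order to be dialable.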
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e469f4ea0e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…p link label
- policy: validateInterfaceRegexes rejects an interface_patterns entry with an empty type (would set Interface.Type to "", unresolvable by NetBox/Diode, and short-circuit the OC/name/speed/default fallback). resolveInterfaceType also defensively skips empty-typed patterns.
- docs: fix the top-level README label "snmp-discover" -> "snmp-discovery".
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
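The two guards named in this commit could look roughly like the sketch below. The InterfacePattern struct, its field names, and the fallback callback are assumptions made for illustration; only the function names come from the commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// InterfacePattern is a hypothetical shape for one interface_patterns entry.
type InterfacePattern struct {
	Match string // regular expression tested against the interface name
	Type  string // interface type assigned on a match
}

// validateInterfaceRegexes rejects entries that would assign an empty type
// or that carry a regex which does not compile.
func validateInterfaceRegexes(patterns []InterfacePattern) error {
	for i, p := range patterns {
		if strings.TrimSpace(p.Type) == "" {
			return fmt.Errorf("interface_patterns[%d]: type must not be empty", i)
		}
		if _, err := regexp.Compile(p.Match); err != nil {
			return fmt.Errorf("interface_patterns[%d]: %w", i, err)
		}
	}
	return nil
}

// resolveInterfaceType returns the first matching pattern's type. Empty-typed
// or uncompilable patterns are skipped so that the fallback still runs.
func resolveInterfaceType(name string, patterns []InterfacePattern, fallback func(string) string) string {
	for _, p := range patterns {
		if p.Type == "" {
			continue // defensive: never short-circuit with an empty type
		}
		re, err := regexp.Compile(p.Match)
		if err != nil {
			continue
		}
		if re.MatchString(name) {
			return p.Type
		}
	}
	return fallback(name)
}

func main() {
	bad := []InterfacePattern{{Match: "^Ethernet", Type: ""}}
	fmt.Println(validateInterfaceRegexes(bad))
	fmt.Println(resolveInterfaceType("Ethernet1", bad, func(string) string { return "other" }))
}
```

Checking in both places is deliberate: validation catches the mistake at policy load, and the resolver stays safe if a pattern list ever reaches it unvalidated.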
@codex review
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
@codex review
Codex Review: Didn't find any major issues. Already looking forward to the next diff.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8f3ba86b03
…anslation An interface excluded via interface_exclude_patterns was skipped by translateInterfaces but its subinterface IPs were still emitted (assigned to a stub interface), and translatePrefixes then derived prefixes from them — so a "skipped entirely" interface leaked back in via its addresses. Extracted the exclude compile/match into shared helpers (compileInterfaceExcludes/nameExcluded) and applied them in translateIPs too, dropping excluded interfaces' addresses (and their derived prefixes). Added a regression test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
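A sketch of how shared exclude helpers can keep an excluded interface's addresses, and the prefixes derived from them, out of the output. The interfaceIP type and the simplified translateIPs/translatePrefixes signatures are assumptions for illustration; only the helper names come from the commit message.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// interfaceIP is a hypothetical record: one address on a named interface.
type interfaceIP struct {
	Interface string
	Address   string // CIDR notation, e.g. "192.0.2.1/24"
}

// compileInterfaceExcludes compiles interface_exclude_patterns once so every
// translator can share the result.
func compileInterfaceExcludes(patterns []string) ([]*regexp.Regexp, error) {
	out := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, fmt.Errorf("interface_exclude_patterns %q: %w", p, err)
		}
		out = append(out, re)
	}
	return out, nil
}

// nameExcluded reports whether any exclude pattern matches the name.
func nameExcluded(name string, excludes []*regexp.Regexp) bool {
	for _, re := range excludes {
		if re.MatchString(name) {
			return true
		}
	}
	return false
}

// translateIPs drops addresses that belong to excluded interfaces.
func translateIPs(ips []interfaceIP, excludes []*regexp.Regexp) []interfaceIP {
	var kept []interfaceIP
	for _, ip := range ips {
		if !nameExcluded(ip.Interface, excludes) {
			kept = append(kept, ip)
		}
	}
	return kept
}

// translatePrefixes derives unique prefixes from the already-filtered
// addresses, so excluded interfaces cannot leak back in.
func translatePrefixes(ips []interfaceIP) []string {
	seen := map[string]bool{}
	var prefixes []string
	for _, ip := range ips {
		_, network, err := net.ParseCIDR(ip.Address)
		if err != nil || seen[network.String()] {
			continue
		}
		seen[network.String()] = true
		prefixes = append(prefixes, network.String())
	}
	return prefixes
}

func main() {
	excludes, err := compileInterfaceExcludes([]string{"^Management"})
	if err != nil {
		panic(err)
	}
	ips := translateIPs([]interfaceIP{
		{"Management1", "192.0.2.1/24"},
		{"Ethernet1", "198.51.100.1/31"},
	}, excludes)
	fmt.Println(ips, translatePrefixes(ips))
}
```

Because prefixes are computed from the filtered address list, the exclusion only has to be enforced in one place upstream of prefix derivation.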
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6aab305cf1
		return fmt.Errorf("parse profile %s: %w", filename, err)
	}
	p.Name = strings.TrimSuffix(filename, ".yaml")
	into[p.Name] = &p
Preserve bundled profile when override inheritance is invalid
When an override reuses a bundled filename but has a bad extends chain or inheritance cycle, this assignment replaces the bundled raw profile before semantic validation. The later resolve error skips that name entirely, so a typo in something like /profiles/arista_eos.yaml removes the built-in Arista profile instead of skipping the bad override and falling back to the bundled profile; keep the bundled entry until the override resolves successfully.
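One way to get the suggested behavior is to treat each override as provisional and revert it to the bundled entry when its extends chain fails to resolve. This is only a sketch: rawProfile, resolveChain, and applyOverrides are hypothetical names, and the PR's loader may be structured differently.

```go
package main

import "fmt"

// rawProfile is a hypothetical, pared-down profile: a name plus its parent.
type rawProfile struct {
	Name    string
	Extends string
}

// resolveChain walks the extends chain and fails on an unknown parent or a cycle.
func resolveChain(name string, profiles map[string]*rawProfile) error {
	seen := map[string]bool{}
	for cur := name; cur != ""; {
		if seen[cur] {
			return fmt.Errorf("profile %s: inheritance cycle at %s", name, cur)
		}
		seen[cur] = true
		p, ok := profiles[cur]
		if !ok {
			return fmt.Errorf("profile %s: unknown profile %s in extends chain", name, cur)
		}
		cur = p.Extends
	}
	return nil
}

// applyOverrides layers overrides on top of bundled profiles, then reverts any
// override whose chain does not resolve. A reverted name falls back to the
// bundled profile when one exists. Bundled profiles are assumed to be valid.
func applyOverrides(bundled, overrides map[string]*rawProfile) (map[string]*rawProfile, []error) {
	merged := make(map[string]*rawProfile, len(bundled)+len(overrides))
	for name, p := range bundled {
		merged[name] = p
	}
	for name, p := range overrides {
		merged[name] = p
	}
	var skipped []error
	// Repeat until stable: reverting one override can invalidate another.
	for changed := true; changed; {
		changed = false
		for name, ov := range overrides {
			if merged[name] != ov {
				continue // already reverted
			}
			err := resolveChain(name, merged)
			if err == nil {
				continue
			}
			if b, ok := bundled[name]; ok {
				merged[name] = b
			} else {
				delete(merged, name)
			}
			skipped = append(skipped, err)
			changed = true
		}
	}
	return merged, skipped
}

func main() {
	bundled := map[string]*rawProfile{
		"_base":      {Name: "_base"},
		"arista_eos": {Name: "arista_eos", Extends: "_base"},
	}
	// The override has a typo in its parent name.
	overrides := map[string]*rawProfile{
		"arista_eos": {Name: "arista_eos", Extends: "_bsae"},
	}
	merged, skipped := applyOverrides(bundled, overrides)
	fmt.Println(merged["arista_eos"].Extends, skipped)
}
```

With this ordering, a typo in an override costs the operator only the override, not the built-in profile it was meant to refine.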
Summary
Adds gnmi-discovery, a new event-driven network-discovery backend (sibling to snmp-discovery/network-discovery) that maintains long-lived gNMI subscriptions and keeps NetBox current, via Diode, within seconds of an inventory/config change.
It reuses the existing backend contract (HTTP policy API, Diode ingestion, OTEL metrics, dry-run, ${ENV} interpolation) and is implemented in Go, on default port 8074.
How it works
- auto → on_change → sample → get. on_change is preferred; sample/get are used (and surfaced in status) when a device can't stream. Curated, inventory-relevant OpenConfig paths only (no volatile telemetry: NetBox stays a source of truth, not a metrics store).
- … _base + thin vendor overlays). Unknown vendors automatically fall back to _base (zero-config). Operators can drop in overrides via profiles_dir with no rebuild.
- … run_id/policy_name Diode metadata) surface in /api/v1/status, mirroring the other backends.
Supported platform types
Vendor is auto-detected from the gNMI Capabilities Organization token; the matched overlay only adds a match alias on top of _base (and documents that vendor's known-weak OpenConfig leaves). Real per-vendor leaf-path overrides are deferred until device captures exist, so the overlays bet on standard OpenConfig and degrade gracefully on absent leaves. Interface name→type recognition for each vendor's naming format lives in a shared cross-vendor table.
- arista_eos
- nokia_sros
- nvidia_cumulus
- cisco
- juniper
- huawei: 10GE/25GE/40GE/100GE/Eth-Trunk/Vlanif name typing
- dell_os10
- sonic: sonic overlay; requires the device's OpenConfig/Translib framework (SONiC's default sonic-db origin is out of scope)
Any vendor not listed still discovers via _base (standard OpenConfig) with zero configuration.
What it discovers (dcim/ipam, no custom fields; mirrors snmp/device-discovery)
- … part-no), software version (folded into Platform name).
- … state/type → name-pattern → speed ladder), speed, MAC, LAG membership, duplex.
Testing & review
- go build ./..., go vet ./..., go test ./... -race green; golangci-lint clean; per-package coverage ≥90%. CI workflows (lint/tests/release) added.
- … gnmi.Session interface (fake-driven unit tests); the real openconfig/gnmic client is pinned and confined to one file.
- … getStatus data race, negative/overflowing-interval crash guards, OpenConfig identityref normalization, and decoupling network-OS profile selection from hardware-vendor manufacturer attribution.
Notes
🤖 Generated with Claude Code