feat: retain working Lantern servers across config updates#534
feat: retain working Lantern servers across config updates#534garmr-ulfr wants to merge 5 commits into
Conversation
Replace setServers with updateServers, which keeps working Lantern servers from prior config batches instead of replacing the full set. Servers present in the incoming config are pruned from the add list so their selection history and connections survive; retained servers are capped at 60 and evicted oldest-first by selection age, with hard-demoted servers evicted first. Sort 'servers list' by latest success delay and add a --limit flag. Include the event value in the panic-in-callback log. Bumps lantern-box to v0.0.95.
There was a problem hiding this comment.
Pull request overview
This PR changes how Lantern server configs are applied so previously working servers can be retained across config updates (with a cap), and it improves the lantern servers list UX while updating the lantern-box dependency.
Changes:
- Replace full Lantern server set replacement with an “update + evict” strategy that retains prior working servers up to a fixed limit.
- Add
--limittolantern servers listand sort output by lowest observed latency. - Bump
github.com/getlantern/lantern-boxtov0.0.95.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/radiance.go | Adds updateServers + eviction helpers to retain servers across config updates. |
| backend/radiance_test.go | Adds/relocates backend tests, including eviction tests for retained Lantern servers. |
| backend/exhaustion_test.go | Removes the old exhaustion gate test file (moved into backend/radiance_test.go). |
| cmd/lantern/servers.go | Adds --limit, sorts by latency, and updates list output behavior. |
| cmd/lanternd/lanternd.go | Adjusts how the parent process derives the logger writer when draining child output. |
| events/events.go | Enhances panic logging in event callbacks to include event context. |
| go.mod | Updates github.com/getlantern/lantern-box version. |
| go.sum | Updates checksums for the lantern-box version bump. |
Comments suppressed due to low confidence (1)
cmd/lantern/servers.go:137
- The new --limit flag is ignored when --json is used, so the CLI behaves inconsistently across output modes.
func serversList(ctx context.Context, c *ipc.Client, showLatency, asJSON bool, limit int) error {
srvs, err := c.Servers(ctx)
if err != nil {
return err
}
if asJSON {
out := make([]ServerListEntry, 0, len(srvs))
for _, s := range srvs {
out = append(out, ServerListEntry{
Tag: s.Tag,
Type: s.Type,
Location: s.Location,
SelectionHistory: s.SelectionHistory,
})
}
return printJSON(out)
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Prune servers already present in the manager from the incoming list before computing evictions, and evict hard-demoted Lantern servers even when they reappear in the new config batch. Drops the now-unused incomingTags filtering from lanternServersToEvict. Guard against a nil SelectionHistory when sorting 'servers list', log only the event type on a panic-in-callback, and write child process output to stdout only to avoid duplicate log lines.
Sort and truncate the server list before branching on output format so 'servers list --json --limit N' returns the N fastest instead of all servers. Guard SelectionHistory before dereferencing it in printServerEntry, and correct the config-update error log to say 'updating servers in manager'.
myleshorton
left a comment
There was a problem hiding this comment.
Took a thorough pass across both repos. The core idea — stop discarding selection history on every config refresh — is a real improvement, and Copilot's surface comments are all addressed. One significant correctness issue plus a few design tradeoffs to weigh before merge.
🔴 The prune-by-tag assumption doesn't hold — in-place server updates get silently dropped
updateServers prunes incoming servers whose tag already exists, justified (in the radiance.go:656 thread) by "the tag will change if any of the options change." That isn't true: the client outbound tag is route-ID-derived, not content-derived.
- Tag =
<type>-out-<track>-<routeID>— lantern-cloudcmd/api/config.go:420/:426(proxy.Name=trackName-routeID,config.go:390). - lantern-cloud mutates a route's client-visible options in place under a stable route ID:
proxy_infrastructure.sql:957UpdateVPSRouteRunning … SET address=@address, port=@port, launch_cfg=@launch_cfg WHERE id=@id(+route_update.go,POST /proxy/routes/{id}).
→ When address/port/launch_cfg change but the route ID (tag) stays the same — in-place reprovision, key/credential rotation, launch_cfg edits — the prune drops the update and the client keeps stale config until that server fails, hard-demotes, is evicted, and is re-added fresh on a later cycle. Full-replace SetServers applied it immediately, so this is a regression. Details in the radiance.go:656 thread. If reprovision/rotation always issues a new route ID this is safe — but that invariant is unenforced/undocumented and load-bearing; please confirm + document, and ideally update-in-place when a tag matches but options differ (preserving SelectionHistory).
🟡 Design tradeoffs worth a decision
- Config loses authority to force-remove a server. A server dropped from the config is retained until it hard-demotes or ages out — there's no longer a config-driven way to pull a bad/compromised/decommissioned server quickly. Confirm that's acceptable and document it.
- Hard-demoted-but-still-offered servers flap. They're evicted unconditionally and pruned from the incoming batch, so they're removed this cycle and re-added fresh next cycle (re-probed, demotion lost), repeating every refresh. If the reset is intentional, a comment noting the per-cycle reprobe churn would help.
- Retaining up to 60 servers means more outbounds + more probing. Worth confirming the auto-select/URL-test path doesn't aggressively probe all 60 retained servers (resource cost).
maxRetainedLanternServerscaps retained, not total.retentionBudget = max(limit-incomingCount, 0), so a single config batch > 60 still exceeds the cap. Unlikely, but the name implies a hard limit it doesn't enforce.AddServersno longer setsIsLantern. It works only because the caller setsIsLantern: true(radiance.go:238) andClone()preserves it; the oldSetServersenforced it. Eviction silently keys onIsLantern, so a future caller that forgets breaks the cap invisibly — consider keeping that in the manager or asserting it.
🟢 Minor
RemoveServers→AddServersisn't atomic: if the add fails after the remove, you net-lose servers (low odds given the prune, but silent).- Test gap:
lanternServersToEvictis well covered, but there's no test forupdateServersitself (prune + evict + add + outbounds) — which is where the tag issue lives — nor forincomingCount > limit. cmd/lanternd/lanternd.go: the commented-out handler + TODO is unrelated scope creep for this PR.
👍 Good
Preserving SelectionHistory for working servers across refreshes is the right call and a clear win over reset-everything; eviction unit tests are clean; events.go now names the panicking event type; CI green.
Remove Manager.SetServers, which is now dead code. Also clarify that hard-demoted Lantern servers are always evicted so a later re-offer is re-probed as a fresh candidate.
|
On "🔴 The prune-by-tag assumption doesn't hold"
This is true. Existing routes will not pick up changes to The cited On "🟡 Design tradeoffs worth a decision"
Overstated. There is no force-remove channel in the config or in lantern-cloud; removal-by-omission was a side effect of full-replace, not a control surface. When lantern-cloud deprecates a route, it tears the box down promptly, so the server goes unreachable; client hard-demotes within ~5–15 min and evicts it on the next config fetch (≤15 min). Bounded and self-healing; that trade is the point of the PR. Not a blocker.
The "demotion lost" flagged as a downside is the intended behavior. If a server is re-offered, it's healthy ASN-wide, so re-adding it fresh and discarding
Confirmed: the background cycle does probe all retained members, but it's concurrency-capped and only while the tunnel is active. That's by design — retention is pointless without it. It's bounded by the cap + interval; if it's too much, we can lower it — it's a tuning knob, not a correctness issue.
True and irrelevant.
Wrong in both directions. |
This pull request updates the logic for handling config responses to keep working Lantern servers from previous configs instead of replacing the full set entirely. Servers are retained until they stop working or the max limit is reached (60).
Lantern server management improvements
LocalBackend: replacedsetServerswith a newupdateServersmethod that intelligently evicts Lantern servers when the configured limit is exceeded, prioritizing removal of hard-demoted and oldest servers, and avoids disrupting refreshed or active servers. Added helper functions for server tag management and eviction logic. (backend/radiance.go) [1] [2] [3]maxRetainedLanternServers) to prevent excessive resource usage. (backend/radiance.go)CLI and usability improvements
lantern servers listcommand with a--limitflag to restrict the number of servers shown, and added sorting by lowest latency. The output now also prints the total number of servers available. (cmd/lantern/servers.go) [1] [2] [3] [4] [5]Dependency updates
lantern-boxdependency to versionv0.0.95ingo.mod.