RST Auto Synchronization #318
Open
iamjoemccormick wants to merge 10 commits into
@claude review once
5e16b67 to 72360dc
72360dc to 43d6ea9
While integrating Watch with Remote, a few updates were needed:
- Detect the meta node ID and include it on the gRPC stream context.
- Default to waiting for subscribers to ack the last event received before streaming events.
Bugs in the RST push+stub flow could cause state corruption or local data loss when a job is cancelled under certain race conditions:
1. Remote's UpdateWork() had no terminal state guard. A late-arriving COMPLETED work result (e.g. from Sync journal replay after restart or a gRPC context cancellation race) could trigger job.Complete() on an already-cancelled job, overwriting the CANCELLED state and violating the user's cancel intent. For multipart uploads this typically results in a FAILED job (the multipart was already aborted so finishUpload fails), but the state corruption makes job history confusing and difficult to reason about. For non-multipart uploads this could create a stub file after the job was cancelled, though data loss should not occur since the contents were already synced to the bucket. Fixed by checking job.InTerminalState() before processing. The work result is still persisted for inspection, but no completion logic runs.
2. Sync's gRPC server discarded work results when the work manager returned both a result and an error. This happens when Remote tries to cancel already-COMPLETED work: the manager returns the COMPLETED result alongside an error, but the server returned only the gRPC error. Remote never learned the work was COMPLETED and set the state to UNKNOWN. Fixed by returning the work result without a gRPC error when the manager provides one, so Remote sees the actual state.
Assisted-by: Claude:claude-opus-4-6
3. updateRstCfg wrapped the sentinel with %s and only the inner error with %w, so errors.Is(err, ErrJobAlreadyOffloaded) returned false when updateRstConfig failed. That promoted the error to ErrJobFailedPrecondition downstream and tripped the GenerateWorkRequests lock-clear defer, leaving a stubbed-but-unlocked file. Wrap both with %w so the sentinel stays in the unwrap chain and the defer correctly skips this case.
Assisted-by: Claude:claude-opus-4-7
Assisted-by: Claude:claude-sonnet-4-6
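The terminal state guard described in item 1 of the commit message above can be sketched as follows. This is an illustration under stated assumptions rather than Remote's implementation: the Job and State types, the set of states treated as terminal, and the boolean return value are all hypothetical; only the names UpdateWork and InTerminalState and the rule "persist the result, skip completion logic" come from the commit message.

```go
package main

import "fmt"

// State is a hypothetical job/work state enum for this sketch.
type State int

const (
	Running State = iota
	Completed
	Cancelled
	Failed
)

// Job is a hypothetical stand-in for a Remote job.
type Job struct {
	State   State
	Results []State // work results kept for later inspection
}

// InTerminalState reports whether the job can no longer change state.
// Which states count as terminal is an assumption here.
func (j *Job) InTerminalState() bool {
	return j.State == Completed || j.State == Cancelled || j.State == Failed
}

// UpdateWork always persists the work result, but runs completion logic only
// when the job is not already terminal. It returns true if completion ran.
func (j *Job) UpdateWork(result State) bool {
	j.Results = append(j.Results, result)
	if j.InTerminalState() {
		return false // a late result must not overwrite e.g. Cancelled
	}
	if result == Completed {
		j.State = Completed
		return true
	}
	return false
}

func main() {
	job := &Job{State: Cancelled}
	ran := job.UpdateWork(Completed) // late-arriving result after cancel
	fmt.Println(ran, job.State == Cancelled, len(job.Results)) // false true 1
}
```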
The subscriber service is split from the gRPC server so the service can optionally be reused with an existing gRPC server.
Wraps the subscriber service, adding the ability to dispatch default or event-specific functions as events are received. Can be wired to an existing gRPC server or used to set up a new one. Optional rate limits can be defined for all users to limit what event types are dispatched within a configurable time window. These limits can be overridden for specific user IDs or ranges of user IDs, for all or a subset of event types. By default no events for any user are dispatched.
Wire the Watch event dispatcher + subscriber service into Remote, and define a dispatch function.
Squash into: feat(rst): support specifying restore policy with push and pull
Note these changes are also needed for the later eventFilter commit, so this could also be made a standalone commit.
43d6ea9 to 62e1222
753f810 to 39dd103
What does this PR do / why do we need it?
Required for all PRs.
At a high level, this adds support for integrating Watch into Remote so file system modification events can trigger Remote jobs. To achieve this, a number of improvements were made to Watch itself, including the addition of common/reusable subscriber and dispatch packages.
Currently only the ability to automatically restore offloaded files is implemented.
Automatically syncing files when they are closed will be a fairly trivial addition once #312 is merged, as that (amongst other things) adds the ability to set a delay_execution when submitting job requests. The plan is to only auto sync files where the cooldown >0 and set the cooldown as the delay_execution. This allows us to avoid having to add a separate journal/mechanism to keep track of files as we wait for their cooldown to expire: if the file is reopened we can just cancel the job. As those changes are fairly self-contained and independent of everything else in this PR, the rest is ready for review.
Related Issue(s)
Required when applicable.
Closes https://github.com/ThinkParQ/bee-remote/issues/18
Where should the reviewer(s) start reviewing this?
Only required for larger PRs when this may not be immediately obvious.
The changes are split into standalone commits which are intended to be reviewed oldest->newest as they build on each other.
Are there any specific topics we should discuss before merging?
Not required.
What are the next steps after this PR?
Not required.
Checklist before merging:
Required for all PRs.
When creating a PR these are items to keep in mind that cannot be checked by GitHub actions:
For more details refer to the Go coding standards and the pull request process.