fix(master): add --no-master flag and prevent --no-api nodes from self-electing #2135
Open
5F0jd2vLq54RerYW wants to merge 3 commits into
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…f-electing

Adds --no-master CLI flag that skips Master instantiation entirely, and couples election candidacy to the CLI flags to prevent worker-only nodes from winning elections during solo partitions.

Changes:
- src/exo/main.py: Add --no-master flag; skip Master() when set. Set is_candidate = args.spawn_api and not args.no_master so that --no-api nodes (spawn_api=False) and --no-master nodes never self-elect.
- src/exo/shared/election.py: Store is_candidate on the Election instance. In _election_status(), non-candidate nodes re-propose the last known master instead of winning by default during solo partitions.

Flag precedence (per spec):
- --no-master wins over --force-master: no Master is instantiated, no election.
- --no-api wins over --force-master: is_candidate=False, seniority=-1.

Also adds exo_rs stub to src/exo/shared/tests/conftest.py so election tests run without a compiled Rust extension binary.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
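The election guard described in this commit message could look roughly like the sketch below. It is not the code from src/exo/shared/election.py: the `Proposal` type, the `election_status` signature, and the tie-break on node id are all hypothetical, chosen only to illustrate the rule that a non-candidate never wins and falls back to the last known master when it is alone.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# From the PR text: non-candidate nodes carry a seniority of -1.
NON_CANDIDATE_SENIORITY = -1


@dataclass(frozen=True)
class Proposal:
    """Hypothetical stand-in for one node's election message."""

    node_id: str
    seniority: int


def election_status(
    self_id: str,
    is_candidate: bool,
    own_seniority: int,
    peers: list[Proposal],
    last_known_master: Optional[str],
) -> Optional[str]:
    """Return the node id this node should propose as master.

    Assumption: highest seniority wins, ties broken by node id.
    """
    # Peers advertising the non-candidate seniority can never win.
    eligible = [p for p in peers if p.seniority > NON_CANDIDATE_SENIORITY]

    # This node only enters the race if it is a candidate.
    if is_candidate:
        eligible.append(Proposal(self_id, own_seniority))

    # Solo partition (or only non-candidates visible): re-propose the
    # last known master instead of winning by default. May be None if
    # no master was ever observed.
    if not eligible:
        return last_known_master

    return max(eligible, key=lambda p: (p.seniority, p.node_id)).node_id
```

For example, `election_status("worker", False, 0, [], "api-node")` returns `"api-node"`, whereas the same call with `is_candidate=True` returns `"worker"`.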
Author
Note: nix is not available in this dev environment, so I ran
Summary
Two related fixes for node role management in multi-node clusters:

1. --no-master flag

New CLI flag that skips Master instantiation entirely. Previously every node always started a Master even when it was intended to run as a pure worker (e.g. a GPU worker with no API surface). With --no-master, the Master event loop, command processor, and download coordinator are not started.

2. --no-api nodes must not self-elect

When a node is started with --no-api (spawn_api=False), it should not be eligible to become master: an API-less node winning an election means no API is reachable in the cluster. Previously, a --no-api node in a solo partition would self-elect (the bully algorithm has no other candidates). This PR sets is_candidate = args.spawn_api and not args.no_master, which gives non-candidate nodes a seniority of -1 so any API-bearing node beats them.

A complementary guard in election.py ensures non-candidate nodes re-propose the last known master during solo partitions instead of winning by default.

Flag precedence
- --no-master
- --no-api
- --force-master
- --no-master --force-master
- --no-api --force-master

Tests
- Election tests (src/exo/shared/tests/test_election.py) all pass
- Added exo_rs stub to src/exo/shared/tests/conftest.py so tests run without a compiled Rust binary

🤖 Generated with Claude Code
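As a rough illustration of the flag precedence, the sketch below derives a node's role from the three flags with argparse. Only the flag names and the expression `is_candidate = args.spawn_api and not args.no_master` come from this PR. The `NodeRole` type, the `build_parser` and `resolve_role` helpers, and the mapping of --no-api onto `spawn_api` via `store_false` are assumptions, not the actual contents of src/exo/main.py.

```python
import argparse
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeRole:
    """Hypothetical summary of what a node will do at startup."""

    start_master: bool   # whether Master() would be instantiated
    is_candidate: bool   # whether the node may win an election


def build_parser() -> argparse.ArgumentParser:
    # Assumed wiring: --no-api clears spawn_api (which defaults to True).
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-master", action="store_true")
    parser.add_argument("--no-api", dest="spawn_api", action="store_false")
    parser.add_argument("--force-master", action="store_true")
    return parser


def resolve_role(args: argparse.Namespace) -> NodeRole:
    # Expression taken from the PR description. --force-master is
    # deliberately not consulted here, so it cannot override either
    # --no-master or --no-api.
    is_candidate = args.spawn_api and not args.no_master
    return NodeRole(start_master=not args.no_master, is_candidate=is_candidate)


if __name__ == "__main__":
    for argv in ([], ["--no-api"], ["--no-master", "--force-master"]):
        print(argv, resolve_role(build_parser().parse_args(argv)))
```

Keeping the decision in one pure function makes each flag combination easy to check in a unit test without starting a node.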