Skip to content

3.x: Route SELECT at SERIAL/LOCAL_SERIAL consistency like LWT#891

Open
nikagra wants to merge 2 commits into
scylladb:scylla-3.xfrom
nikagra:fix-serial-consistency-lwt-routing
Open

3.x: Route SELECT at SERIAL/LOCAL_SERIAL consistency like LWT#891
nikagra wants to merge 2 commits into
scylladb:scylla-3.xfrom
nikagra:fix-serial-consistency-lwt-routing

Conversation

@nikagra
Copy link
Copy Markdown

@nikagra nikagra commented May 16, 2026

Note

Fixes DRIVER-616. Related to: #886

Problem

Statements with SERIAL or LOCAL_SERIAL set as their main consistency level (via setConsistencyLevel) are serialised through the Paxos/serial path on the server side — the same path used by LWT writes. However, the 3.x driver only applied LWT load-balancing routing (e.g. PRESERVE_REPLICA_ORDER) to statements where isLWT() returned true, which is driven exclusively by Scylla's server-side prepare metadata.

A normal SELECT prepared or executed with LOCAL_SERIAL consistency:

SimpleStatement simpleSelect =
    new SimpleStatement("SELECT * FROM t WHERE pk=?")
        .setConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);
PreparedStatement ps = session.prepare(simpleSelect);
BoundStatement bs = ps.bind(1);
// bs.isLWT() == false  → routed as REGULAR, not PRESERVE_REPLICA_ORDER

was routed as REGULAR, ignoring the configured LWT routing method entirely.

Changes

  • TokenAwarePolicy: getRequestRouting() now treats statements with SERIAL or LOCAL_SERIAL main consistency as LWT for routing method selection (same as isLWT()). In PreserveReplicaOrderIterator, remote-DC replicas are excluded when the consistency is LOCAL_SERIAL (preserving local-DC semantics).
  • LatencyAwarePolicy: The latency-reordering bypass that already applied to isLWT() statements is extended to cover SERIAL/LOCAL_SERIAL consistency statements.
  • RackAwareRoundRobinPolicy: The rack-prioritization bypass already in place for isLWT() is extended to cover SERIAL/LOCAL_SERIAL consistency statements.

What is NOT changed

  • setSerialConsistencyLevel() (the paxos-phase serial CL option on LWT writes) is intentionally not affected — only the main setConsistencyLevel() triggers LWT routing.
  • isLWT() semantics are unchanged; this is purely additive.

Tests

  • TokenAwarePolicyTest: 7 new unit tests — serial consistency routed as LWT, LOCAL_SERIAL excludes remote replicas, plain SERIAL keeps remote replicas, serial-CL-option does not trigger LWT routing, null routing-method falls back to regular.
  • LatencyAwarePolicyTest: 1 new unit test — serial consistency statements bypass latency reordering.
  • RackAwareRoundRobinPolicyTest: 1 new unit test — serial consistency statements skip rack prioritization.
  • LWTLoadBalancingIT: New CCM integration test — prepares a SimpleStatement with LOCAL_SERIAL/SERIAL consistency, binds and executes it, verifies coordinator is non-null.

All 67 unit tests pass (mvn test -pl driver-core -Dgroups=unit).

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Extends LWT load-balancing behavior to non-LWT statements that use SERIAL/LOCAL_SERIAL as their main consistency level, so that they are routed through the Paxos-aware path (e.g. PRESERVE_REPLICA_ORDER) instead of as REGULAR queries. The change touches three load-balancing policies and adds unit + CCM integration coverage.

Changes:

  • TokenAwarePolicy.getRequestRouting treats serial-CL statements as LWT for routing; PreserveReplicaOrderIterator filters out remote replicas for LOCAL_SERIAL.
  • LatencyAwarePolicy and RackAwareRoundRobinPolicy extend their existing isLWT() bypasses to cover serial-CL statements via a duplicated hasSerialConsistency helper.
  • New unit tests across the three policies and a new LWTLoadBalancingIT CCM integration test.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
driver-core/src/main/java/com/datastax/driver/core/policies/TokenAwarePolicy.java Route serial-CL statements as LWT; restrict to local replicas only for LOCAL_SERIAL.
driver-core/src/main/java/com/datastax/driver/core/policies/LatencyAwarePolicy.java Bypass latency reordering for serial-CL statements.
driver-core/src/main/java/com/datastax/driver/core/policies/RackAwareRoundRobinPolicy.java Skip rack prioritization for serial-CL statements.
driver-core/src/test/java/com/datastax/driver/core/policies/TokenAwarePolicyTest.java New unit tests covering serial routing, LOCAL_SERIAL local-only filtering, and null routing-method fallback.
driver-core/src/test/java/com/datastax/driver/core/policies/LatencyAwarePolicyTest.java New test verifying serial-CL statements are not latency-reordered.
driver-core/src/test/java/com/datastax/driver/core/policies/RackAwareRoundRobinPolicyTest.java New test verifying serial-CL statements skip rack prioritization.
driver-core/src/test/java/com/datastax/driver/core/policies/LWTLoadBalancingIT.java New CCM integration test for SERIAL/LOCAL_SERIAL SELECT routing.

Comment thread driver-core/src/main/java/com/datastax/driver/core/policies/TokenAwarePolicy.java Outdated
Comment thread driver-core/src/main/java/com/datastax/driver/core/policies/TokenAwarePolicy.java Outdated
Fixes: https://scylladb.atlassian.net/browse/DRIVER-616

Statements with SERIAL or LOCAL_SERIAL as their main consistency level
should be routed through the LWT path (PRESERVE_REPLICA_ORDER) in the
same way as statements where isLWT() returns true.

Changes:
- TokenAwarePolicy: treat statements with SERIAL/LOCAL_SERIAL consistency
  as LWT for routing method selection; exclude remote DC replicas from
  PRESERVE_REPLICA_ORDER plans when consistency is LOCAL_SERIAL
- LatencyAwarePolicy: bypass latency-based reordering for SERIAL/LOCAL_SERIAL
  consistency statements, same as for isLWT() statements
- RackAwareRoundRobinPolicy: skip rack prioritization for SERIAL/LOCAL_SERIAL
  consistency statements, same as for isLWT() statements
- Add unit tests in TokenAwarePolicyTest (7 new), LatencyAwarePolicyTest (1),
  RackAwareRoundRobinPolicyTest (1), and integration test LWTLoadBalancingIT
@nikagra nikagra force-pushed the fix-serial-consistency-lwt-routing branch from 78c7343 to 115a445 Compare May 17, 2026 21:48
@nikagra nikagra requested a review from dkropachev May 17, 2026 23:27
Comment on lines +106 to +113
public static boolean hasSerialConsistency(
Statement statement, ConsistencyLevel defaultConsistencyLevel) {
ConsistencyLevel cl = statement.getConsistencyLevel();
if (cl == null) {
cl = defaultConsistencyLevel;
}
return cl != null && cl.isSerial();
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no good reason for this API, it is too specific

Comment on lines 478 to 487
private QueryOptions.RequestRoutingMethod getRequestRouting(Statement statement) {
if (!statement.isLWT() || defaultLwtRequestRoutingMethod == null) {
if (defaultLwtRequestRoutingMethod == null) {
return QueryOptions.RequestRoutingMethod.REGULAR;
}
return defaultLwtRequestRoutingMethod;
if (statement.isLWT()
|| Statement.hasSerialConsistency(statement, queryOptions.getConsistencyLevel())) {
return defaultLwtRequestRoutingMethod;
}
return QueryOptions.RequestRoutingMethod.REGULAR;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's don't optimize it for a wrong configuration:

Suggested change
private QueryOptions.RequestRoutingMethod getRequestRouting(Statement statement) {
if (!statement.isLWT() || defaultLwtRequestRoutingMethod == null) {
if (defaultLwtRequestRoutingMethod == null) {
return QueryOptions.RequestRoutingMethod.REGULAR;
}
return defaultLwtRequestRoutingMethod;
if (statement.isLWT()
|| Statement.hasSerialConsistency(statement, queryOptions.getConsistencyLevel())) {
return defaultLwtRequestRoutingMethod;
}
return QueryOptions.RequestRoutingMethod.REGULAR;
}
private QueryOptions.RequestRoutingMethod getRequestRouting(Statement statement) {
if (!statement.isLWT()) {
return QueryOptions.RequestRoutingMethod.REGULAR;
}
ConsistencyLevel cl = statement.getConsistencyLevel();
if (cl == null) {
cl = queryOptions.getConsistencyLevel();
}
if (cl != null && cl.isSerial()) {
return defaultLwtRequestRoutingMethod;
}
return QueryOptions.RequestRoutingMethod.REGULAR;
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I'm getting your point. The problem we are solving is that server is not able to set LWT flag on PREPARE request. Regular consistency SERIAL or LOCAL_SERIAL are set during EXECUTE frame. But we still want to route such requests an LWT route. Or in pseudocode:

if (isLWT(statement) OR hasSerialConsistency(statement)) {
  return userPreferredLWTRoutingMethod
} else {
  return REGULAR
}

with just added extra treatment for null value of userPreferredLWTRoutingMethod (defaultLwtRequestRoutingMethod in our code) because it can only become null by user explicit setting. Your code on the other hand seems to be a regression to the proposed and even old implementation:

if (NOT isLWT(statement)) {
  return REGULAR // Here goes the case we want to fix: non-LWT but having SERIAL or LOCAL_SERIAL CL
} 
if (hasSerialConsistency(statement)) {
  return userPreferredLWTRoutingMethod // Here is regression: returning LWT-preferred routing for requests marked as LWT _AND_ havind serial consistency.
}
return REGULAR

in other words we return PRESERVE_REPLICA_ORDER as preferred LWT routing method can only be returned for requests marked as both LWT by server's PREPARE response AND having serial consistency. Following cases will not return PRESERVE_REPLICA_ORDER:

  1. Non-LWT select with serial main CL (what DRIVER-611 is trying to solve)
  2. LWT write with non-serial main CL (e.g. LOCAL_QUORUM).


// Second pass: return remote replicas that are UP and not IGNORED
if (nonLocalReplicas != null) {
if (nonLocalReplicas != null && !localOnly) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't just ignore non-local replicas, you need to put them into the tail of the query plan.
Let's don't touch this part for now, we have a task to aling drivers behavior in all these cases, let's address it there.

}
cluster.register(latencyTracker);
metrics = cluster.getMetrics();
queryOptions = cluster.getConfiguration().getQueryOptions();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you use it only for defaultConsistency level, probably it would be better to extract just that, instead of whole QueryOptions

- Remove Statement.hasSerialConsistency() (too specific for public API);
  inline the serial-CL detection in each policy
- Revert LOCAL_SERIAL local-only filtering in PreserveReplicaOrderIterator;
  remote replicas now appear in the tail for both SERIAL and LOCAL_SERIAL
  (alignment with other drivers deferred to a separate task per reviewer)
- Store ConsistencyLevel (not QueryOptions) in LatencyAwarePolicy;
  expose via private isEffectiveConsistencySerial() helper
- Update TokenAwarePolicyTest: replace two separate LOCAL_SERIAL/SERIAL
  remote-replica tests with a single parameterised test covering both CLs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants