27 changes: 27 additions & 0 deletions REMOVED_CONTENT_CLEANUP.md
@@ -0,0 +1,27 @@
# Removed content — cleanup tracking

This file tracks documentation content that describes **removed** Weaviate features,
environment variables, or configuration fields. Rather than deleting such content the
moment a feature is removed, we keep it in place with a short "Removed in `vX.Y`" note so
that users upgrading from an older version can still find the entry and understand what
happened to it. Once enough releases have passed that few users are upgrading across the
removal boundary, the noted content should be deleted.

## Policy

- When a feature/env var/config field is removed from Weaviate, **mark** the corresponding
docs entry as `Removed in vX.Y` instead of deleting it immediately.
- Add a row to the table below so the kept-but-stale content can be found and cleaned later.
- **Suggested cleanup window:** keep the note for roughly three minor releases after the
removal, then delete the entry and remove its row here. (Adjust per the supported-version
policy at cleanup time — these are guidelines, not hard commitments.)
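
The suggested cleanup window above can be sketched as a small helper. This is only an illustration of the "three minor releases" guideline: the function name `suggested_cleanup_version` and the `window` parameter are hypothetical, and it assumes the major version does not change within the window.

```python
import re


def suggested_cleanup_version(removed_in: str, window: int = 3) -> str:
    """Return the earliest minor release at which a "Removed in" note may be deleted.

    Assumption: versions look like "v1.38" and the major version stays the same.
    """
    match = re.fullmatch(r"v(\d+)\.(\d+)", removed_in)
    if match is None:
        raise ValueError(f"expected a version like 'v1.38', got {removed_in!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Keep the note for `window` minor releases after the removal.
    return f"v{major}.{minor + window}"
```

For an item removed in `v1.38`, this yields `v1.41`, which matches the "Suggested cleanup" column in the table below.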

## Tracked entries

| Page / file | Removed item | Removed in | Suggested cleanup | Notes |
| --- | --- | --- | --- | --- |
| `docs/deploy/configuration/env-vars/index.md` | `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS` (table row) | `v1.38` | `v1.41`+ | Replaced by `ASYNC_REPLICATION_SCHEDULER_WORKERS`. |
| `docs/deploy/configuration/env-vars/index.md` | `ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY` (table row) | `v1.38` | `v1.41`+ | Scheduler no longer polls alive nodes separately. |
| `docs/deploy/configuration/env-vars/runtime-config.md` | `async_replication_cluster_max_workers` (override mapping row) | `v1.38` | `v1.41`+ | Runtime override removed alongside the env var. |
| `docs/deploy/configuration/async-rep.md` | "Removed environment variables (v1.38)" `<details>` block (both vars above) | `v1.38` | `v1.41`+ | Delete the whole block at cleanup. |
| `docs/deploy/configuration/replication.md` | "Removed in `v1.38`" `:::note` admonition | `v1.38` | `v1.41`+ | Mentions both removed env vars. |
2 changes: 1 addition & 1 deletion _includes/async-replication-per-collection-config.mdx
@@ -1,3 +1,3 @@
:::info Collection-level configuration — Added in `v1.36`
Async replication parameters can also be set per-collection via the `asyncConfig` object in `replicationConfig`. Per-collection settings override the cluster-wide environment variable defaults. See [Collection `asyncConfig` parameters](/weaviate/config-refs/collections#async-config) for details.
Async replication runs by default for any collection with a replication factor greater than `1` (as of `v1.38`). To fine-tune its behavior for a specific collection, set the `asyncConfig` object in `replicationConfig`. Per-collection settings override the cluster-wide environment variable defaults. See [Collection `asyncConfig` parameters](/weaviate/config-refs/collections#async-config) for details.
:::
3 changes: 0 additions & 3 deletions _includes/code/config-refs/reference.collections.py
Expand Up @@ -57,7 +57,6 @@
),
replication_config=Configure.replication(
factor=1,
async_enabled=False,
deletion_strategy=ReplicationDeletionStrategy.TIME_BASED_RESOLUTION,
),
)
Expand Down Expand Up @@ -733,7 +732,6 @@
# highlight-start
replication_config=Configure.replication(
factor=3,
async_enabled=True,
deletion_strategy=ReplicationDeletionStrategy.TIME_BASED_RESOLUTION,
),
# highlight-end
Expand All @@ -743,7 +741,6 @@
# Test
collection = client.collections.use("Article")
config = collection.config.get()
assert config.replication_config.async_enabled == True
assert (
config.replication_config.deletion_strategy
== ReplicationDeletionStrategy.TIME_BASED_RESOLUTION
Expand Down
3 changes: 1 addition & 2 deletions _includes/code/configuration/replication-consistency.mdx
Expand Up @@ -43,8 +43,7 @@ curl \
}
],
"replicationConfig": {
"factor": 3,
"asyncEnabled": true
"factor": 3
}
}' \
http://localhost:8080/v1/schema
Expand Down
2 changes: 1 addition & 1 deletion _includes/code/howto/go/docs/manage-data.classes_test.go
Expand Up @@ -585,8 +585,8 @@ func Test_ManageDataClasses(t *testing.T) {
articleClass := &models.Class{
Class: "Article",
Description: "Collection of articles",
// Async replication runs by default when the replication factor is greater than 1
ReplicationConfig: &models.ReplicationConfig{
AsyncEnabled: true,
Factor: 3,
DeletionStrategy: models.ReplicationConfigDeletionStrategyTimeBasedResolution,
},
Expand Down
6 changes: 1 addition & 5 deletions _includes/code/howto/manage-data.collections.py
Expand Up @@ -786,9 +786,9 @@
client.collections.create(
"Article",
# highlight-start
# Async replication runs by default when the replication factor is greater than 1
replication_config=Configure.replication(
factor=3,
async_enabled=True,
),
# highlight-end
)
Expand Down Expand Up @@ -819,10 +819,8 @@
# highlight-start
replication_config=Configure.replication(
factor=3,
async_enabled=True,
deletion_strategy=ReplicationDeletionStrategy.TIME_BASED_RESOLUTION,
async_config=Configure.Replication.async_config(
max_workers=5,
hashtree_height=16,
frequency=30,
),
Expand All @@ -840,7 +838,6 @@
collection.config.update(
replication_config=Reconfigure.replication(
async_config=Reconfigure.Replication.async_config(
max_workers=10,
frequency=60,
),
),
Expand All @@ -851,7 +848,6 @@
# Test
collection = client.collections.use("Article")
config = collection.config.get()
assert config.replication_config.async_enabled == True
assert (
config.replication_config.deletion_strategy
== ReplicationDeletionStrategy.TIME_BASED_RESOLUTION
Expand Down
5 changes: 1 addition & 4 deletions _includes/code/howto/manage-data.collections.ts
Expand Up @@ -677,9 +677,9 @@ import { configure } from 'weaviate-client';
await client.collections.create({
name: 'Article',
// highlight-start
// Async replication runs by default when the replication factor is greater than 1
replication: configure.replication({
factor: 1,
asyncEnabled: true,
}),
// highlight-end
})
Expand Down Expand Up @@ -710,10 +710,8 @@ await replicationClient.collections.create({
// highlight-start
replication: configure.replication({
factor: 3,
asyncEnabled: true,
deletionStrategy: 'TimeBasedResolution',
asyncConfig: {
maxWorkers: 5,
hashtreeHeight: 16,
frequency: 30,
},
Expand All @@ -729,7 +727,6 @@ const articleReplication = replicationClient.collections.use('Article')
await articleReplication.config.update({
replication: reconfigure.replication({
asyncConfig: {
maxWorkers: 10,
frequency: 60,
},
}),
Expand Down
Expand Up @@ -408,20 +408,20 @@ void testReplicationSettings() throws IOException {
@Test
void testAsyncRepair() throws IOException {
// START AsyncRepair
// Async replication runs by default when the replication factor is greater than 1
client.collections.create("Article", col -> col.replication(
Replication.of(rep -> rep.replicationFactor(1).asyncEnabled(true))));
Replication.of(rep -> rep.replicationFactor(1))));
// END AsyncRepair

var config = client.collections.getConfig("Article").get();
assertThat(config.replication().asyncEnabled()).isTrue();
assertThat(config.replication().replicationFactor()).isEqualTo(1);
}

@Test
void testAllReplicationSettings() throws IOException {
// START AllReplicationSettings
threeNodeClient.collections.create("Article",
col -> col.replication(Replication.of(rep -> rep.replicationFactor(3)
.asyncEnabled(true)
.deletionStrategy(DeletionStrategy.TIME_BASED_RESOLUTION)
.asyncReplication(AsyncReplicationConfig.of(async -> async
.propagationConcurrency(5)
Expand All @@ -431,7 +431,6 @@ void testAllReplicationSettings() throws IOException {

var config = threeNodeClient.collections.getConfig("Article").get();
assertThat(config.replication().replicationFactor()).isEqualTo(3);
assertThat(config.replication().asyncEnabled()).isTrue();

// START UpdateReplicationSettings
var collection = threeNodeClient.collections.use("Article");
Expand Down
1 change: 0 additions & 1 deletion _includes/collection-mutable-parameters.mdx
Expand Up @@ -14,7 +14,6 @@
- `autoTenantCreation` (introduced in `v1.25.0`)
- `autoTenantActivation` (introduced in `v1.25.2`)
- `replicationConfig`
- `asyncEnabled` (introduced in `v1.26.0`)
- `factor` (not mutable in `v1.25` or higher)
- `deletionStrategy` (introduced in `v1.27.0`)
- `vectorIndexConfig`
Expand Down
46 changes: 30 additions & 16 deletions docs/deploy/configuration/async-rep.md
Expand Up @@ -16,7 +16,8 @@ This applies solely to data objects, as metadata consistency is treated differen
### Under the Hood

- Async replication operates as a background process either per tenant (in a multi-tenant collection) or per shard (in a non-multi-tenant collection).
- It is disabled by default but can be enabled through collection configuration changes, similar to setting the replication factor.
- As of Weaviate `v1.38`, async replication is **enabled by default** for any collection with a replication factor greater than `1`. There is no longer a per-collection toggle to enable or disable it; the per-collection `asyncConfig` object is used only to fine-tune the behavior of an already-running process.
- To turn async replication off across the entire cluster, set the [`ASYNC_REPLICATION_DISABLED`](#async_replication_disabled) environment variable to `true`.
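
The rule in the bullets above can be summarized as a small predicate. This is a sketch of the documented behavior, not Weaviate's implementation: the function name `async_replication_runs` is hypothetical, and treating only the literal string `true` as "disabled" is an assumption about how the variable is parsed.

```python
import os
from typing import Mapping, Optional


def async_replication_runs(
    replication_factor: int, env: Optional[Mapping[str, str]] = None
) -> bool:
    """Return True if async replication would run for a collection (v1.38+ rule)."""
    env = os.environ if env is None else env
    # Assumption: only "true" (case-insensitive) disables the feature cluster-wide.
    disabled = env.get("ASYNC_REPLICATION_DISABLED", "false").strip().lower() == "true"
    # Enabled by default for any collection with a replication factor greater than 1.
    return replication_factor > 1 and not disabled
```

Note that a collection with a replication factor of `1` has no other replica to compare against, so the predicate is `False` regardless of the environment variable.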

## Environment Variable Deep Dive

Expand Down Expand Up @@ -48,13 +49,36 @@ Globally disables the entire async replication feature.
<details>
<summary> Cluster Worker Limits </summary>

#### `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS`
Sets the maximum number of concurrent async replication workers across the entire cluster.
#### `ASYNC_REPLICATION_SCHEDULER_WORKERS`
Sets the number of workers in the cluster-wide pool that the async replication scheduler uses to run hashbeat work across all shards and tenants.

- Its default value is `30`.
- **Use case**: Limits the total number of concurrent replication workers to prevent resource exhaustion in large clusters with many collections or tenants.
- Its default value is `10`. The maximum is `100`.
- **Use case**: A single bounded worker pool replaces the previous per-shard goroutines, so this is the main lever for capping async replication's total concurrency and preventing resource exhaustion on clusters with many collections or tenants.
- **Special Considerations**:
- This is a cluster-wide cap. Individual collections can set their own `maxWorkers` via the per-collection [`asyncConfig`](/weaviate/config-refs/collections#async-config), but the total across all collections will not exceed this cluster limit.
- This is a cluster-wide setting. There is no per-collection worker count; collections share this single pool.

:::note Changed in `v1.38`
`ASYNC_REPLICATION_SCHEDULER_WORKERS` replaces the removed `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS` environment variable, and the per-collection `maxWorkers` option has been removed.
:::

#### `ASYNC_REPLICATION_HASHTREE_INIT_CONCURRENCY`
Sets how many shards may initialize (build) their hash tree concurrently when async replication starts up.

- Its default value is `100`.
- **Use case**: Bounds the burst of work when many shards begin async replication at once, for example after a node restart or when many replicated collections exist.

</details>
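
As a rough illustration of the two settings described above, the following sketch reads them from an environment mapping and applies the documented defaults (`10` and `100`) and the documented maximum of `100` for the scheduler workers. The helper names are hypothetical, the minimum of `1` is an assumption (no minimum is stated above), and rejecting out-of-range values is a choice made for this example; Weaviate itself may handle them differently.

```python
from typing import Mapping

SCHEDULER_WORKERS_DEFAULT = 10   # documented default
SCHEDULER_WORKERS_MAX = 100      # documented maximum
HASHTREE_INIT_DEFAULT = 100      # documented default


def scheduler_workers(env: Mapping[str, str]) -> int:
    """Resolve ASYNC_REPLICATION_SCHEDULER_WORKERS with default and bounds check."""
    raw = env.get("ASYNC_REPLICATION_SCHEDULER_WORKERS")
    if raw is None:
        return SCHEDULER_WORKERS_DEFAULT
    value = int(raw)
    # Assumption: values below 1 are invalid; the upper bound is documented.
    if not 1 <= value <= SCHEDULER_WORKERS_MAX:
        raise ValueError(f"scheduler workers must be in 1..{SCHEDULER_WORKERS_MAX}")
    return value


def hashtree_init_concurrency(env: Mapping[str, str]) -> int:
    """Resolve ASYNC_REPLICATION_HASHTREE_INIT_CONCURRENCY with its default."""
    raw = env.get("ASYNC_REPLICATION_HASHTREE_INIT_CONCURRENCY")
    return HASHTREE_INIT_DEFAULT if raw is None else int(raw)
```

A pre-deployment check could call both helpers against the intended environment to catch an out-of-range worker count before the cluster starts.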

<details>
<summary> Removed environment variables (v1.38) </summary>

These variables were removed when async replication moved to a centralized scheduler in `v1.38`. They are listed here for reference and are no longer read by Weaviate.

#### `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS`
**Removed in `v1.38`.** Previously set the maximum number of concurrent async replication workers across the cluster (default `30`). Replaced by [`ASYNC_REPLICATION_SCHEDULER_WORKERS`](#async_replication_scheduler_workers).

#### `ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY`
**Removed in `v1.38`.** Previously defined how often the background process checked for changes in node availability (default `5s`). The scheduler no longer uses a separate alive-nodes polling mechanism.

</details>

Expand Down Expand Up @@ -153,16 +177,6 @@ Defines a shorter frequency for subsequent comparison and propagation attempts w

</details>

<details>
<summary> Node Status Monitoring </summary>

#### `ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY`
Defines the frequency at which the system checks for changes in the availability of nodes within the cluster.
- Its default value is `5s`. The value requires a time unit suffix (e.g. `5s`, `1m`).
- **Use Case(s)**: When a node rejoins the cluster after a period of downtime, it is highly likely to be out of sync. This setting ensures that the replication process is initiated promptly.

</details>

<details>
<summary>Timeout Management </summary>

Expand Down
10 changes: 5 additions & 5 deletions docs/deploy/configuration/env-vars/index.md
Expand Up @@ -225,9 +225,7 @@ For more information on authentication and authorization, see the [Authenticatio
| `RAFT_BOOTSTRAP_EXPECT` | The number of voter nodes at bootstrapping time | `string - number` | `1` |
| `RAFT_BOOTSTRAP_TIMEOUT` | The time in seconds to wait for the cluster to bootstrap | `string - number` | `90` |
| `RAFT_DRAIN_SLEEP` | Grace period before shutdown to allow ongoing operations to complete. (Default: `200ms`) | `string - number` | `2s` |
| `RAFT_ENABLE_FQDN_RESOLVER` | If `true`, use DNS lookup instead of memberlist lookup for Raft. Removed in `v1.30`. ([Read more](/weaviate/concepts/cluster.md#node-discovery)) | `boolean` | `true` |
| `RAFT_ENABLE_ONE_NODE_RECOVERY` | Enable running the single node recovery routine on restart. This is useful if the default hostname has changed and a single node cluster believes there are supposed to be two nodes. | `boolean` | `false` |
| `RAFT_FQDN_RESOLVER_TLD` | The top-level domain to use for DNS lookup, in `[node-id].[tld]` format. Removed in `v1.30`. ([Read more](/weaviate/concepts/cluster.md#node-discovery)) | `string` | `example.com` |
| `RAFT_GRPC_MESSAGE_MAX_SIZE` | The maximum internal raft gRPC message size in bytes. Defaults to 1073741824 | `string - number` | `1073741824` |
| `RAFT_JOIN` | Manually set Raft voter nodes. If set, RAFT_BOOTSTRAP_EXPECT needs to be adjusted manually to match the number of Raft voters. | `string` | `weaviate-0,weaviate-1` |
| `RAFT_METADATA_ONLY_VOTERS` | If `true`, voter nodes only handle the schema. They do not accept any data. | `boolean` | `false` |
Expand All @@ -250,12 +248,14 @@ For more information on authentication and authorization, see the [Authenticatio

| Variable | Description | Type | Example Value |
| --- | --- | --- | --- |
| `ASYNC_REPLICATION_DISABLED` | Disable async replication. Default: `false` | `boolean` | `false` |
| `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS` | Maximum concurrent async replication workers across the cluster. Default: `30` | `string - number` | `10` |
| `ASYNC_REPLICATION_DISABLED` | Disable async replication cluster-wide. When `false` (default), async replication runs automatically for any collection with a replication factor greater than `1`. Default: `false` | `boolean` | `false` |
| `ASYNC_REPLICATION_SCHEDULER_WORKERS` | Number of workers in the cluster-wide pool that run async replication work across all shards and tenants. Added in `v1.38`, replacing `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS`. Default: `10`, Max: `100`<br/> [Read more.](/deploy/configuration/async-rep.md#async_replication_scheduler_workers) | `string - number` | `10` |
| `ASYNC_REPLICATION_HASHTREE_INIT_CONCURRENCY` | Number of shards that may build their hash tree concurrently when async replication starts up. Added in `v1.38`. Default: `100`<br/> [Read more.](/deploy/configuration/async-rep.md#async_replication_hashtree_init_concurrency) | `string - number` | `100` |
| `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS` | **Removed in `v1.38`.** Previously set the maximum number of concurrent async replication workers across the cluster. Replaced by `ASYNC_REPLICATION_SCHEDULER_WORKERS`. | `string - number` | `30` |
| `ASYNC_REPLICATION_HASHTREE_HEIGHT` | Height of the hash tree used for data comparison between nodes. If the height is `0` each node will store just one digest per shard. Default: `16` (single-tenant) / `10` (multi-tenant), Min: `0`, Max: `20`<br/> [Read more about potentially increased memory consumption.](/weaviate/concepts/replication-architecture/consistency#memory-and-performance-considerations-for-async-replication) | `string - number` | `10` |
| `ASYNC_REPLICATION_FREQUENCY` | Frequency of periodic data comparison between nodes. Default: `30s` | `string - duration` | `60s` |
| `ASYNC_REPLICATION_FREQUENCY_WHILE_PROPAGATING` | Frequency of data comparison between nodes while propagation is active. Default: `3s` | `string - duration` | `5s` |
| `ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY` | Frequency of how often the background process checks for changes in the availability of nodes. Default: `5s` | `string - duration` | `20s` |
| `ASYNC_REPLICATION_ALIVE_NODES_CHECKING_FREQUENCY` | **Removed in `v1.38`.** Previously set how often the background process checked for changes in node availability. No longer used by the async replication scheduler. | `string - duration` | `5s` |
| `ASYNC_REPLICATION_LOGGING_FREQUENCY` | Frequency of how often the background process logs any events. Default: `60s` | `string - duration` | `7s` |
| `ASYNC_REPLICATION_DIFF_BATCH_SIZE` | Specifies the batch size for comparing digest information between nodes. Default: `1000`, Min: `1`, Max: `10000` | `string - number` | `2000` |
| `ASYNC_REPLICATION_DIFF_PER_NODE_TIMEOUT` | Defines the time limit a node has to provide a comparison response. Default: `10s` | `string - duration` | `30s` |
Expand Down
4 changes: 3 additions & 1 deletion docs/deploy/configuration/env-vars/runtime-config.md
Expand Up @@ -57,7 +57,9 @@ The following overrides are currently supported:
| Runtime override name | Environment variable name |
| :----------------------------------------------- | :------------------------------------------- |
| `async_replication_disabled` | `ASYNC_REPLICATION_DISABLED` |
| `async_replication_cluster_max_workers` | `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS` |
| `async_replication_scheduler_workers` | `ASYNC_REPLICATION_SCHEDULER_WORKERS` |
| `async_replication_hashtree_init_concurrency` | `ASYNC_REPLICATION_HASHTREE_INIT_CONCURRENCY`|
| `async_replication_cluster_max_workers` _(removed in `v1.38`)_ | `ASYNC_REPLICATION_CLUSTER_MAX_WORKERS` _(removed in `v1.38`)_ |
| `autoschema_enabled` | `AUTOSCHEMA_ENABLED` |
| `default_quantization` | `DEFAULT_QUANTIZATION` |
| `default_sharding_count` | `DEFAULT_SHARDING_COUNT` |
Expand Down
1 change: 0 additions & 1 deletion docs/deploy/configuration/horizontal-scaling.mdx
Expand Up @@ -338,7 +338,6 @@ curl \
],
"replicationConfig": {
"factor": 3,
"asyncEnabled": true,
"deletionStrategy": "TimeBasedResolution"
}
}' \
Expand Down