5 changes: 5 additions & 0 deletions _includes/feature-notes/runtime-reindex.mdx
@@ -0,0 +1,5 @@
:::caution Preview — added in `v1.38`

This is a preview feature. The REST shape and behavior may change before GA. Do not rely on backup/restore while a reindex is in flight or recently completed on a v1.38 Preview cluster — wait for all tasks to reach `ready` / `failed` / `cancelled` first.

:::
2 changes: 1 addition & 1 deletion docs/weaviate/concepts/indexing/inverted-index.md
@@ -50,7 +50,7 @@ import BlockmaxWand from '/_includes/feature-notes/blockmax-wand.mdx';

The BlockMax WAND algorithm is a variant of the WAND algorithm that is used to speed up BM25 and hybrid searches. It organizes the inverted index in blocks to enable skipping over blocks that are not relevant to the query. This can significantly reduce the number of documents that need to be scored, improving search performance.

If you are experiencing slow BM25 (or hybrid) searches and use a Weaviate version prior to `v1.30`, try migrating to a newer version that uses the BlockMax WAND algorithm to see if it improves performance. If you need to migrate existing data from a previous version of Weaviate, follow the [v1.30 migration guide](/deploy/migration/weaviate-1-30.md).
If you are experiencing slow BM25 (or hybrid) searches and use a Weaviate version prior to `v1.30`, try migrating to a newer version that uses the BlockMax WAND algorithm to see if it improves performance. If you need to migrate existing data from a previous version of Weaviate, follow the [v1.30 migration guide](/deploy/migration/weaviate-1-30.md). On `v1.38+`, you can instead use the live [Reindex a property](/weaviate/manage-collections/reindex-property.mdx#migrate-bm25-from-wand-to-blockmax) endpoint to migrate without a restart.

:::note Scoring changes with BlockMax WAND

14 changes: 14 additions & 0 deletions docs/weaviate/config-refs/indexing/inverted-index.mdx
@@ -257,8 +257,22 @@ You can drop (delete) an inverted index from a property. This is a destructive o

The following index types can be dropped: `searchable`, `filterable`, `rangeFilters`.

REST: `DELETE /v1/schema/{className}/properties/{propertyName}/index/{indexName}`. From `v1.38`, the drop is gated by the [runtime-reindex](../../manage-collections/reindex-property.mdx) MutationGuard — rejected while a reindex is in flight on the same property.

See [How-to: Drop an inverted index](../../manage-collections/inverted-index.mdx#drop-an-inverted-index) for code examples.

## Runtime reindex (v1.38 Preview)

From `v1.38`, three REST endpoints let you change and inspect a property's inverted-index configuration on a live collection without a restart:

| Method | Path | Purpose |
|---|---|---|
| `PUT` | `/v1/schema/{class}/indexes/{property}` | Add an inverted index, change tokenization, migrate BM25 WAND → BlockMax, rebuild a bucket, or cancel an in-flight task. The body shape selects the migration type. |
| `DELETE` | `/v1/schema/{class}/properties/{property}/index/{indexName}` | Drop a configured index. `indexName` ∈ `{filterable, searchable, rangeFilters}`. |
| `GET` | `/v1/schema/{class}/indexes` | Read per-property index status. |

`PUT` / `DELETE` require `UPDATE` on `Collections`; `GET` requires `READ` on `CollectionsMetadata`. See [How-to: Reindex a property](../../manage-collections/reindex-property.mdx) for request bodies, response shapes, concurrency rules, and worked examples.

## How Weaviate creates inverted indexes

Weaviate creates **separate inverted indexes for each property and each index type**. For example, if you have a `title` property that is both searchable and filterable,
5 changes: 2 additions & 3 deletions docs/weaviate/manage-collections/collection-operations.mdx
@@ -595,9 +595,8 @@ Property indexes are built at import time. If you add a new property after impor
To create an index that includes all of the objects in a collection, do one of the following:

- New collections: Add all of the collection's properties before importing objects.
- Existing collections: Export the existing data from the collection. Re-create it with the new property. Import the data into the updated collection.

We are working on a re-indexing API to allow you to re-index the data after adding a property. This will be available in a future release.
- Existing collections (v1.38+): Use the runtime [Reindex a property](./reindex-property.mdx) endpoints to add or change inverted indexes on a live collection without restart.
- Existing collections (pre-v1.38): Export the existing data from the collection, recreate it with the new property, and re-import the data into the updated collection.

</details>

12 changes: 12 additions & 0 deletions docs/weaviate/manage-collections/inverted-index.mdx
@@ -16,6 +16,12 @@ import GoCode from "!!raw-loader!/_includes/code/howto/go/docs/manage-data.class

An **inverted index** is a data structure in Weaviate that enables efficient text search and filtering operations.

:::tip Change inverted indexes on a live collection (v1.38+)

You can add, change, rebuild, or drop inverted indexes on a populated collection without restart. See [Reindex a property](./reindex-property.mdx).

:::

<details>
<summary>Additional information</summary>

@@ -155,6 +161,12 @@ Drop (delete) an inverted index from a property. This is a destructive operation

The following index types can be dropped: `searchable`, `filterable`, `rangeFilters`.

:::note v1.38+ MutationGuard

From `v1.38`, the drop is rejected with `409` while a [runtime reindex](./reindex-property.mdx) is in flight on the same property. Cancel the in-flight task first, or wait for it to reach `ready` / `failed` / `cancelled`.

:::

<Tabs className="code" groupId="languages">
<TabItem value="py" label="Python">
<FilteredTextBlock
260 changes: 260 additions & 0 deletions docs/weaviate/manage-collections/reindex-property.mdx
@@ -0,0 +1,260 @@
---
title: Reindex a property
sidebar_label: Reindex a property
sidebar_position: 50
image: og/docs/configuration.jpg
description: Change a property's inverted-index configuration on a live Weaviate collection without restart — add or drop filterable / searchable / range indexes, change tokenization, or migrate BM25 from WAND to BlockMax. Reads stay available throughout.
# tags: ['configuration', 'reindex', 'tokenization', 'bm25']
---

import RuntimeReindexPreview from '/_includes/feature-notes/runtime-reindex.mdx';

<RuntimeReindexPreview/>

Change a property's **inverted-index configuration** on a live collection without restarting the cluster and without losing writes. Reads stay available throughout the migration. This replaces the previous workaround of exporting, recreating the collection, and re-importing.

This page covers **inverted indexes only** (`IndexFilterable`, `IndexSearchable`, `IndexRangeFilters`, BM25 algorithm, tokenization). Vector indexes are configured separately — see [Vector configuration](./vector-config.mdx).

## What you can do

| Operation | Use when |
|---|---|
| **Add** a missing inverted index | The property was created without `IndexFilterable` / `IndexSearchable` / `IndexRangeFilters` and you now need to filter or search on it. |
| **Change** a property's tokenization | You picked `word` and want `trigram` (or vice versa) on a live `text` / `text[]` property. |
| **Migrate BM25** from WAND to BlockMax | One-way upgrade for the searchable bucket on existing collections. |
| **Rebuild / repair** a bucket | Refresh a `RoaringSet` / `RoaringSetRange` / BlockMax index after suspected corruption or heavy data churn. |
| **Cancel** an in-flight migration | You realised the call was wrong, or want to free the slot for another migration. |

To **drop** a configured inverted index, use the existing [Drop an inverted index](./inverted-index.mdx#drop-an-inverted-index) flow — it has client-library support for Python, TypeScript, Go, and Java. From v1.38, the drop operation is gated by the same MutationGuard described on this page.

## REST endpoints

The feature is REST-only for v1.38 Preview. It adds two new endpoints (`PUT`, `GET`) and a new MutationGuard contract on the existing drop endpoint:

| Method | Path | Purpose |
|---|---|---|
| `PUT` | `/v1/schema/{class}/indexes/{property}` | **New in v1.38.** Submit a migration. The shape of the request body selects the migration type. |
| `GET` | `/v1/schema/{class}/indexes` | **New in v1.38.** Read per-property index status. |
| `DELETE` | `/v1/schema/{class}/properties/{property}/index/{indexName}` | Drop a configured index. This endpoint existed before v1.38 (see [Drop an inverted index](./inverted-index.mdx#drop-an-inverted-index)); from v1.38, the MutationGuard rejects the request while a reindex on the same property is in flight. |

All examples below assume `$WEAVIATE` is the cluster URL and `$TOKEN` is an API key with `UPDATE` on `Collections` (`GET` only needs `READ` on `CollectionsMetadata`).

## Add a missing inverted index

To enable filtering on a property that was created without a filterable index:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"filterable":{"enabled":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/category"
```

Body shapes per index type:

| Body | Effect |
|---|---|
| `{"filterable":{"enabled":true}}` | Creates a `RoaringSet` bucket and flips `IndexFilterable=true`. |
| `{"searchable":{"enabled":true,"tokenization":"word"}}` | Creates a BlockMax searchable bucket, sets `Tokenization`, flips `IndexSearchable=true`. Requires `text` / `text[]`. |
| `{"rangeable":{"enabled":true}}` | Creates a `RoaringSetRange` bucket and flips `IndexRangeFilters=true`. Numeric types only (`int`, `number`, `date`). |

The server responds `202 Accepted` with a task ID:

```json
{
"status": "STARTED",
"taskId": "reindex/Article/category/enable-filterable/1700000000"
}
```
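
The body shapes in this section can be generated with a small shell helper instead of being typed by hand. The sketch below is only an illustration: `enable_index_body` is a hypothetical function name (not part of Weaviate or any client library), and defaulting the tokenization to `word` is an assumption made for convenience.

```bash
# Hypothetical helper (not part of Weaviate): prints the PUT body that enables
# one index type, following the body shapes listed in this section.
# Usage: enable_index_body <filterable|searchable|rangeable> [tokenization]
enable_index_body() {
  local index_type="$1" tokenization="${2:-word}"
  case "$index_type" in
    filterable|rangeable)
      printf '{"%s":{"enabled":true}}\n' "$index_type" ;;
    searchable)
      # Searchable bodies also carry a tokenization; "word" is an assumed default.
      printf '{"searchable":{"enabled":true,"tokenization":"%s"}}\n' "$tokenization" ;;
    *)
      echo "unknown index type: $index_type" >&2
      return 1 ;;
  esac
}
```

The output can be passed straight to the request, for example `-d "$(enable_index_body searchable trigram)"` in place of the literal `-d '...'` argument used in the `curl` call.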

## Change tokenization

To retokenize a populated `text` property:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"tokenization":"trigram"}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"
```

If the property has **both** a searchable and a filterable index, they are retokenized together in one coordinated migration. To retokenize only the filterable bucket:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"filterable":{"tokenization":"word"}}' \
"$WEAVIATE/v1/schema/Article/indexes/category"
```

For the canonical list of tokenization values, see [Tokenization options](../config-refs/collections.mdx#tokenization).

## Migrate BM25 from WAND to BlockMax

[BlockMax WAND](../concepts/indexing/inverted-index.md) is a faster scoring algorithm that replaced the original WAND implementation. On collections created before BlockMax was the default, you can migrate every searchable property in one call:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"algorithm":"blockmax"}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"
```

The migration touches every searchable property on the class. The class-level `UsingBlockMaxWAND` flag flips to `true` after every property has been rebuilt.

This is **one-way** — a request to flip `searchable.algorithm` back from `blockmax` to `wand` is rejected. WAND is deprecated.

## Rebuild or repair an index

If you suspect corruption or want to refresh an index after heavy data churn:

```bash
# Rebuild the BlockMax searchable bucket from object storage
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"

# Repair a filterable bucket
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"filterable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/category"

# Repair a rangeable bucket
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"rangeable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/price"
```

`searchable.rebuild` is for BlockMax buckets only. To rebuild a property still on WAND, first migrate it to BlockMax (above) — `rebuild:true` on a WAND bucket is rejected.

## Cancel an in-flight migration

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"cancel":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"
```

The response is one of:

- `{"status": "CANCELLED", "taskId": "<id>"}` — a `STARTED` task was found and cancelled.
- `{"status": "NO_OP"}` — nothing matched (already finished, never submitted, or already cancelled). Idempotent.

Cancelling wipes partial on-disk state, so the next submit starts from a clean slate.

## Check migration status

```bash
curl -fsS -H "Authorization: Bearer $TOKEN" \
"$WEAVIATE/v1/schema/Article/indexes"
```

```json
{
"collection": "Article",
"properties": [{
"name": "body",
"dataType": "text",
"indexes": [
{ "type": "filterable", "status": "ready", "tokenization": "word" },
{ "type": "searchable", "status": "indexing", "progress": 0.42,
"tokenization": "word", "targetTokenization": "trigram",
"algorithm": "blockmax" }
]
}]
}
```

| `status` | Meaning |
|---|---|
| `ready` | Index is live and serving. No migration in flight. |
| `pending` | A task has been accepted; per-shard work has not started yet. |
| `indexing` | Per-shard work is running. `progress` is a 0..1 estimate. |
| `failed` | At least one node reported `Success=false`. The schema flag was **not** flipped — the property is still in its pre-migration state. Submit a new task to retry. |
| `cancelled` | Operator cancelled. Partial state has been scrubbed. |

Readback is throttled (3 s), so polling in a tight loop is cheap:

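```bash
# Poll until the searchable index on `body` reports `ready`.
# Note: this loop does not exit on `failed` or `cancelled`; check for those separately.
until curl -fsS -H "Authorization: Bearer $TOKEN" \
    "$WEAVIATE/v1/schema/Article/indexes" \
  | jq -e '.properties[] | select(.name=="body")
           | .indexes[] | select(.type=="searchable")
           | .status == "ready"' > /dev/null
do sleep 2; done
```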
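
A polling loop that waits only for `ready` keeps running if the task ends in `failed` or `cancelled`. The sketch below shows one way to stop on any of the three terminal statuses from the table above. Both function names and the `POLL_INTERVAL` variable are hypothetical; the status source is passed in as a command so the loop itself makes no assumptions about how the status is fetched.

```bash
# Hypothetical helper: succeeds (exit 0) for statuses after which no more
# reindex work is pending, per the status table above.
is_terminal_status() {
  case "$1" in
    ready|failed|cancelled) return 0 ;;
    *) return 1 ;;   # pending, indexing, or anything unrecognized
  esac
}

# Hypothetical helper: runs the given command repeatedly until it prints a
# terminal status, then prints that status. POLL_INTERVAL (seconds) is an
# assumed knob, defaulting to 3.
wait_for_terminal() {
  local interval="${POLL_INTERVAL:-3}" status
  while :; do
    status="$("$@")" || status=""   # treat a failed fetch as "not yet terminal"
    if is_terminal_status "$status"; then
      printf '%s\n' "$status"
      return 0
    fi
    sleep "$interval"
  done
}

# Example (requires curl, jq, and a reachable cluster):
#   body_status() {
#     curl -fsS -H "Authorization: Bearer $TOKEN" "$WEAVIATE/v1/schema/Article/indexes" \
#       | jq -r '.properties[] | select(.name=="body")
#                | .indexes[] | select(.type=="searchable") | .status'
#   }
#   wait_for_terminal body_status
```

Because the final status is printed, the caller can branch on it, for example resubmitting the task when the result is `failed`.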

## Multi-tenancy

Scope a task to specific tenants on a multi-tenant class with the `?tenants=` query parameter:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"filterable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/category?tenants=customerA,customerB"
```

Rules:

| Class | `?tenants=` | Migration type | Result |
|---|---|---|---|
| Single-tenant | provided | any | `400` |
| Multi-tenant | omitted | format-only (`rebuild`, `repair`, `enable-rangeable`) | targets all tenants |
| Multi-tenant | omitted | semantic (`change-tokenization*`, `enable-*`) | targets all tenants (the schema flip is cluster-wide) |
| Multi-tenant | provided | format-only | targets the named subset |
| Multi-tenant | provided | semantic | `400` — semantic migrations cannot be sub-scoped |
| any | tenant in `OFFLOADED` / `FROZEN` | any | `400` — the offending tenant is named in the error |

Each tenant's replicas form an independent barrier group: tenant A starts serving the new bucket as soon as its own replicas finish, even if tenant B is still reindexing.
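
The scoping rules in the table above can be written down as a pre-flight check that runs before a request is sent. This is a sketch only: `tenant_scope_outcome` and its argument vocabulary are hypothetical, the caller must already know whether the migration is format-only or semantic, and the `OFFLOADED` / `FROZEN` row is not modeled because it depends on live tenant state.

```bash
# Hypothetical pre-flight check mirroring the rules table above.
# Arguments:
#   $1  class kind:      single | multi
#   $2  ?tenants= param: provided | omitted
#   $3  migration kind:  format-only | semantic
# Prints the expected outcome; it does not contact the server.
tenant_scope_outcome() {
  case "$1/$2/$3" in
    single/provided/*)          echo "400" ;;
    single/omitted/*)           echo "no-tenant-scope" ;;  # not a row in the table
    multi/omitted/*)            echo "all-tenants" ;;
    multi/provided/format-only) echo "named-subset" ;;
    multi/provided/semantic)    echo "400" ;;
    *) echo "unrecognized arguments: $*" >&2; return 2 ;;
  esac
}
```

For example, `tenant_scope_outcome multi provided semantic` prints `400`, which signals that the `?tenants=` parameter should be dropped before submitting a tokenization change.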

## Concurrency limits

- **Per `(class, property)` exclusivity**: only one migration can be in flight on a given `(class, property)` pair. A second submit on the same pair returns `409` with the offending task ID.
- **Per-class cap**: up to 32 concurrent migrations per class. Once the cap is reached, further submits return `503` until in-flight migrations drain.
- **Different properties on the same class run in parallel.** Different classes are fully independent.

## What's blocked during a reindex

While a reindex is in flight on `(class, property)`, the schema FSM rejects:

- `UpdateProperty` on the same property.
- `DeleteClass` on the affected class.
- `DeleteTenants` / `UpdateTenants` on the targeted tenants when the transition makes shards locally unavailable (HOT → COLD / FROZEN / OFFLOADED). Transitions toward available (e.g. UNFREEZING) are not blocked.
- `DELETE /v1/schema/{class}/properties/{property}/index/{indexName}` on the same property.

The rejection message names the in-flight task. Submitting a reindex on a **different** property of the same class is **not** blocked.

## Errors and recovery

| Code | When | Resolution |
|---|---|---|
| `400` | Malformed body, wrong property type, missing prerequisite (e.g. `searchable.tokenization` on a property with no searchable index), `?tenants=` on a single-tenant class, `?tenants=` on a semantic migration, target tenant in `OFFLOADED` / `FROZEN`. | Error responses carry next-step hints — read them. |
| `404` | Class or property doesn't exist. | Verify the class and property names. |
| `409` | An in-flight task overlaps this `(collection, property)`. The error names the offending task ID and migration type. | Wait, or cancel the existing task first. |
| `429` / `503` | Per-class in-flight cap reached, or cluster temporarily unavailable. | Retry once in-flight migrations drain. |

The schema flag is the source of truth: if a task ends in `failed`, the flag was not flipped and the property is still in its pre-migration state. A reindex is restart-safe at every phase — in-flight migrations are picked up automatically after a node restart.
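
In a script, the table above amounts to a mapping from HTTP status code to next step. The following is a minimal sketch of that mapping; `reindex_next_step` and the labels it prints are hypothetical names chosen for illustration, and codes that do not appear on this page are reported as `unlisted` rather than guessed at.

```bash
# Hypothetical helper: maps the HTTP status of a reindex request to the
# resolution suggested in the errors table above.
reindex_next_step() {
  case "$1" in
    202)     echo "poll-status" ;;     # task accepted
    400)     echo "fix-request" ;;     # read the hint in the error body
    404)     echo "check-names" ;;     # class or property not found
    409)     echo "wait-or-cancel" ;;  # overlapping in-flight task
    429|503) echo "retry-later" ;;     # cap reached or cluster unavailable
    *)       echo "unlisted" ;;        # not covered by the table
  esac
}
```

The status code can be captured with curl's `-w '%{http_code}'` option together with `-s -o /dev/null`, then passed to the function to decide whether to retry, cancel, or fix the request.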

## Authentication and authorization

| Endpoint | Required permission |
|---|---|
| `GET /v1/schema/{class}/indexes` | `READ` on `CollectionsMetadata` |
| `PUT /v1/schema/{class}/indexes/{property}` | `UPDATE` on `Collections` |
| `DELETE /v1/schema/{class}/properties/{property}/index/{indexName}` | `UPDATE` on `Collections` |

`UPDATE` on `Collections` is the same permission that gates `UpdateClass` and replication-factor changes — there is no dedicated `reindex` role today.

## Related pages

- [Inverted index configuration](./inverted-index.mdx) — how to set inverted indexes at collection creation, and how to drop them (multi-language client support).
- [Tokenization options](../config-refs/collections.mdx#tokenization) — canonical list of tokenization values.
- [Vector configuration](./vector-config.mdx) — sibling page covering vector indexes.
- [BlockMax WAND](../concepts/indexing/inverted-index.md) — background on the BM25 algorithm migration.

## Questions and feedback

import DocsFeedback from '/_includes/docs-feedback.mdx';

<DocsFeedback/>
1 change: 1 addition & 0 deletions sidebars.js
@@ -580,6 +580,7 @@ const sidebars = {
"weaviate/manage-collections/vector-config",
"weaviate/manage-collections/generative-reranker-models",
"weaviate/manage-collections/inverted-index",
"weaviate/manage-collections/reindex-property",
{
type: "category",
label: "Multi-tenancy",