5 changes: 5 additions & 0 deletions _includes/feature-notes/runtime-reindex.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
:::caution Preview — added in `v1.38`

This is a preview feature. The REST shape and behavior may change before GA. Do not rely on backup/restore while a reindex is in flight or recently completed on a v1.38 Preview cluster — wait for all tasks to reach `ready` / `failed` / `cancelled` first.

:::
2 changes: 1 addition & 1 deletion docs/weaviate/concepts/filtering.md
@@ -80,7 +80,7 @@ The `indexRangeFilters` index is a range-based index for filtering by numerical

Internally, rangeable indexes are implemented as roaring bitmap slices. This data structure limits the index to values that can be stored as 64 bit integers.

`indexRangeFilters` is only available for new properties. Existing properties cannot be converted to use the rangeable index.
Before `v1.38`, `indexRangeFilters` was only available for new properties — existing properties could not be converted to use the rangeable index. From `v1.38`, you can add a rangeable index to an existing property on a populated collection without restart using the [runtime reindex](../manage-collections/inverted-index.mdx#reindex-a-property-on-a-collection) endpoints.

## Recall on Pre-Filtered Searches

10 changes: 9 additions & 1 deletion docs/weaviate/concepts/indexing/inverted-index.md
@@ -50,7 +50,7 @@ import BlockmaxWand from '/_includes/feature-notes/blockmax-wand.mdx';

The BlockMax WAND algorithm is a variant of the WAND algorithm that is used to speed up BM25 and hybrid searches. It organizes the inverted index in blocks to enable skipping over blocks that are not relevant to the query. This can significantly reduce the number of documents that need to be scored, improving search performance.

If you are experiencing slow BM25 (or hybrid) searches and use a Weaviate version prior to `v1.30`, try migrating to a newer version that uses the BlockMax WAND algorithm to see if it improves performance. If you need to migrate existing data from a previous version of Weaviate, follow the [v1.30 migration guide](/deploy/migration/weaviate-1-30.md).
If you are experiencing slow BM25 (or hybrid) searches and use a Weaviate version prior to `v1.30`, try migrating to a newer version that uses the BlockMax WAND algorithm to see if it improves performance. If you need to migrate existing data from a previous version of Weaviate, follow the [v1.30 migration guide](/deploy/migration/weaviate-1-30.md) — or on `v1.38+`, use the live [Reindex a property](/weaviate/manage-collections/inverted-index.mdx#migrate-bm25-from-wand-to-blockmax) endpoint to migrate without restart.

:::note Scoring changes with BlockMax WAND

@@ -168,6 +168,14 @@ An example of a complete collection object without inverted indexes:

</details>

### Changing an index after import

Because an inverted index is built at import time, a property created without one (or with the "wrong" tokenization or BM25 algorithm) historically required exporting the data, recreating the collection, and re-importing — an expensive, downtime-prone operation.

From `v1.38`, Weaviate can **reindex a property on a collection** instead. A reindex builds the new index in the background from the stored objects while the existing index keeps serving reads, then switches over once the rebuild is complete. If the rebuild fails, the property is left in its pre-migration state — nothing is partially applied — and an interrupted reindex resumes automatically after a node restart. This makes adding a missing index, changing tokenization, or migrating BM25 from WAND to BlockMax a non-destructive, restart-safe operation.

For the operational steps, see [How-to: Reindex a property on a collection](/weaviate/manage-collections/inverted-index.mdx#reindex-a-property-on-a-collection); for the endpoint reference, see [References: Runtime reindex](/weaviate/config-refs/indexing/inverted-index.mdx#runtime-reindex).

## Tokenization

Tokenization is the process of breaking text into smaller units called tokens. This process is fundamental to how inverted indexes work - the tokens produced determine what can be searched and how matching occurs.
84 changes: 81 additions & 3 deletions docs/weaviate/config-refs/indexing/inverted-index.mdx
@@ -257,8 +257,87 @@ You can drop (delete) an inverted index from a property. This is a destructive o

The following index types can be dropped: `searchable`, `filterable`, `rangeFilters`.

REST: `DELETE /v1/schema/{className}/properties/{propertyName}/index/{indexName}`. From `v1.38`, the drop is rejected while a [runtime reindex](#runtime-reindex) is in progress on the same property.

See [How-to: Drop an inverted index](../../manage-collections/inverted-index.mdx#drop-an-inverted-index) for code examples.

## Runtime reindex

import RuntimeReindexPreview from "/_includes/feature-notes/runtime-reindex.mdx";

<RuntimeReindexPreview/>

From `v1.38`, three REST endpoints let you alter a property's inverted-index configuration on a collection without restart. For task-oriented walkthroughs with `curl` examples, see [How-to: Reindex a property on a collection](../../manage-collections/inverted-index.mdx#reindex-a-property-on-a-collection). This section is the endpoint and behavior reference.

### Endpoints

| Method | Path | Purpose |
|---|---|---|
| `PUT` | `/v1/schema/{class}/indexes/{property}` | Add an inverted index, change tokenization, migrate BM25 WAND → BlockMax, rebuild an index, or cancel an in-progress migration. The body shape selects the migration type. |
| `DELETE` | `/v1/schema/{class}/properties/{property}/index/{indexName}` | Drop a configured index. `indexName` ∈ `{filterable, searchable, rangeFilters}`. |
| `GET` | `/v1/schema/{class}/indexes` | Read per-property index status. |

For the `PUT` request body shapes and response formats, see the <SkipLink href="/weaviate/api/rest#tag/schema">REST API reference</SkipLink>. For worked examples of each migration type, see the [how-to guide](../../manage-collections/inverted-index.mdx#reindex-a-property-on-a-collection).

### Status values (`GET`)

`GET /v1/schema/{class}/indexes` returns per-property index status. Each index reports one of:

| `status` | Meaning |
|---|---|
| `ready` | Index is live and serving. No migration in progress. |
| `pending` | A migration has been accepted but has not started yet. |
| `indexing` | Migration is running. `progress` is an estimate between 0 and 1. |
| `failed` | The migration did not complete. The property is left in its pre-migration state — submit it again to retry. |
| `cancelled` | The migration was cancelled. The property is left in its pre-migration state. |

The status stays at `indexing` until the change is fully applied, so it never briefly reports the pre-migration shape after the data is already rebuilt.
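A client usually only needs to know whether a status can still change. The following is a minimal sketch of that check, assuming the five status values in the table above are the complete set; the helper name `is_settled` is hypothetical and not part of any Weaviate client.

```python
# Status values copied from the table above.
IN_PROGRESS = {"pending", "indexing"}
SETTLED = {"ready", "failed", "cancelled"}


def is_settled(status: str) -> bool:
    """Return True once an index status will no longer change on its own."""
    if status in SETTLED:
        return True
    if status in IN_PROGRESS:
        return False
    # Preview feature: surface unexpected values instead of guessing.
    raise ValueError(f"unknown index status: {status!r}")


print(is_settled("indexing"))
print(is_settled("failed"))
```

Raising on an unknown value is a deliberate choice here: because the REST shape may change before GA, failing loudly is safer than treating a new status as either finished or running.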

### Concurrency

- **One migration per property** — only one migration can be in progress on a given property at a time. A second submit on the same property returns `409`.
- **Per-collection cap** — up to 32 migrations can run concurrently on a collection. Further submits return `429 Too Many Requests` until in-progress migrations finish.
- **Different properties on the same collection run in parallel.** Different collections are fully independent.

While a reindex is in progress on a property, Weaviate rejects updates to that property, deletion of the collection, dropping an index on that property, and tenant-status changes that would take the targeted tenants offline (for example moving them to `COLD`, `FROZEN`, or `OFFLOADED`). A reindex on a **different** property of the same collection is not blocked.

### Multi-tenancy

Scope a task to specific tenants on a multi-tenant class with the `?tenants=` query parameter (comma-separated). The rules:

| Class | `?tenants=` | Migration type | Result |
|---|---|---|---|
| Single-tenant | provided | any | `400` |
| Multi-tenant | omitted | format-only (`rebuild`, `repair`, `enable-rangeable`) | targets all tenants |
| Multi-tenant | omitted | semantic (`change-tokenization*`, `enable-filterable`, `enable-searchable`, `change-algorithm` (BM25 WAND → BlockMax)) | targets all tenants (the change is collection-wide) |
| Multi-tenant | provided | format-only | targets the named subset |
| Multi-tenant | provided | semantic | `400` — semantic migrations cannot be sub-scoped |
| any | tenant in `OFFLOADED` / `FROZEN` | any | `400` — the offending tenant is named in the error |

Each tenant is reindexed independently: a tenant starts serving its new index as soon as its own reindex finishes, even if other tenants are still in progress.
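The scoping rules can be checked on the client before a request is sent. This is a rough sketch under stated assumptions: the function `reindex_url` and its parameters are hypothetical, the migration-type labels are taken from the table above, and any label not listed as format-only is treated as semantic. The server remains the authority; this only avoids round trips for combinations the table maps to `400`.

```python
from urllib.parse import quote

# Format-only migration types, as named in the table above.
FORMAT_ONLY = {"rebuild", "repair", "enable-rangeable"}


def reindex_url(base, collection, prop, *, multi_tenant, migration, tenants=()):
    """Build the PUT URL, rejecting combinations the table maps to 400."""
    url = f"{base.rstrip('/')}/v1/schema/{quote(collection)}/indexes/{quote(prop)}"
    if not tenants:
        # Omitting ?tenants= targets all tenants (or the single-tenant class).
        return url
    if not multi_tenant:
        raise ValueError("?tenants= is not allowed on a single-tenant class")
    if migration not in FORMAT_ONLY:
        # Assumption: everything outside FORMAT_ONLY is a semantic migration.
        raise ValueError("semantic migrations cannot be scoped to a tenant subset")
    # Comma-separated list; each tenant name is percent-encoded on its own.
    return url + "?tenants=" + ",".join(quote(t, safe="") for t in tenants)


print(reindex_url("http://localhost:8080", "Article", "price",
                  multi_tenant=True, migration="rebuild",
                  tenants=["tenantA", "tenantB"]))
```

The check does not cover the last table row: whether a tenant is `OFFLOADED` or `FROZEN` is only known to the server, so that `400` still has to be handled at request time.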

### Errors and recovery

| Code | When | Resolution |
|---|---|---|
| `400` | Malformed body, wrong property type, missing prerequisite (e.g. `searchable.tokenization` on a property with no searchable index), `?tenants=` on a single-tenant class, `?tenants=` on a semantic migration, target tenant in `OFFLOADED` / `FROZEN`. | Read the error message: responses include next-step hints. |
| `404` | Class or property doesn't exist. | Verify the class and property names. |
| `409` | A migration is already in progress on this property. | Wait, or cancel the existing migration first. |
| `429` | Per-collection concurrency cap reached (32 migrations). | Retry once in-progress migrations finish. |
| `503` | Service temporarily unavailable. | Retry. |

If a migration ends in `failed`, the property is left in its pre-migration state — nothing is partially applied. Reindexing is also restart-safe: an interrupted migration resumes automatically after a node restart.
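One way to act on these codes in a client is a small dispatch function plus a capped backoff for the retryable cases. The sketch below is illustrative only: `next_action`, `backoff_seconds`, the action labels, and the backoff parameters are all assumptions, and `202` is the success code shown in the how-to guide for an accepted submission.

```python
def next_action(status_code: int) -> str:
    """Map a reindex response code to a coarse client action."""
    if status_code == 202:
        return "poll"    # accepted; watch GET /v1/schema/{class}/indexes
    if status_code in (429, 503):
        return "retry"   # concurrency cap reached or temporarily unavailable
    if status_code == 409:
        return "wait"    # a migration is already in progress on this property
    if status_code in (400, 404):
        return "fix"     # request problem; resending it unchanged will not help
    return "unknown"


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with an upper bound (values are arbitrary defaults)."""
    return min(cap, base * (2 ** attempt))


for code in (202, 409, 429):
    print(code, next_action(code))
```

Separating `wait` from `retry` reflects the table: a `409` clears only when the existing migration finishes or is cancelled, whereas `429` and `503` are worth retrying on a timer.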

### Required permissions

| Endpoint | Required permission |
|---|---|
| `GET /v1/schema/{class}/indexes` | `READ` on `CollectionsMetadata` |
| `PUT /v1/schema/{class}/indexes/{property}` | `UPDATE` on `Collections` |
| `DELETE /v1/schema/{class}/properties/{property}/index/{indexName}` | `UPDATE` on `CollectionsMetadata` |

`PUT` is intentionally stricter than the other endpoints: submitting a reindex rebuilds the index and changes the collection definition, so it requires `UPDATE` on `Collections` — the same permission that gates other collection-definition changes. `DELETE` shares the existing schema-metadata permission (`UPDATE` on `CollectionsMetadata`) used by the other property-management endpoints. There is no dedicated `reindex` role today.

## How Weaviate creates inverted indexes

Weaviate creates **separate inverted indexes for each property and each index type**. For example, if you have a `title` property that is both searchable and filterable,
@@ -274,9 +353,8 @@ This is caused by the inverted index being built at import time. If you add a pr
To avoid this, you can either:

- Add the property before importing objects.
- Delete the collection, re-create it with the new property and then re-import the data.

We are working on a re-indexing API to allow you to re-index the data after adding a property. This will be available in a future release.
- From `v1.38`, use the [runtime reindex](#runtime-reindex) endpoints to add or change an inverted index on the collection without restart.
- On versions before `v1.38`, delete the collection, re-create it with the new property, and then re-import the data.

## How tokenization affects inverted indexing

5 changes: 2 additions & 3 deletions docs/weaviate/manage-collections/collection-operations.mdx
@@ -595,9 +595,8 @@ Property indexes are built at import time. If you add a new property after impor
To create an index that includes all of the objects in a collection, do one of the following:

- New collections: Add all of the collection's properties before importing objects.
- Existing collections: Export the existing data from the collection. Re-create it with the new property. Import the data into the updated collection.

We are working on a re-indexing API to allow you to re-index the data after adding a property. This will be available in a future release.
- Existing collections (v1.38+): Use the runtime [Reindex a property](./inverted-index.mdx#reindex-a-property-on-a-collection) endpoints to add or change inverted indexes on a collection without restart.
- Existing collections (pre-v1.38): Export the existing data from the collection, recreate it with the new property, and re-import the data into the updated collection.

</details>

147 changes: 146 additions & 1 deletion docs/weaviate/manage-collections/inverted-index.mdx
@@ -13,9 +13,16 @@ import TSCode from "!!raw-loader!/_includes/code/howto/manage-data.collections.t
import JavaV6Code from "!!raw-loader!/_includes/code/java-v6/src/test/java/ManageCollectionsTest.java";
import CSharpCode from "!!raw-loader!/_includes/code/csharp/ManageCollectionsTest.cs";
import GoCode from "!!raw-loader!/_includes/code/howto/go/docs/manage-data.classes_test.go";
import RuntimeReindexPreview from "/_includes/feature-notes/runtime-reindex.mdx";

An **inverted index** is a data structure in Weaviate that enables efficient text search and filtering operations.

:::tip Change inverted indexes on a collection (v1.38+)

You can add, change, rebuild, or drop inverted indexes on a populated collection without restart. See [Reindex a property on a collection](#reindex-a-property-on-a-collection).

:::

<details>
<summary>Additional information</summary>

@@ -155,6 +162,12 @@ Drop (delete) an inverted index from a property. This is a destructive operation

The following index types can be dropped: `searchable`, `filterable`, `rangeFilters`.

:::note v1.38+

From `v1.38`, the drop is rejected with `409` while a [runtime reindex](#reindex-a-property-on-a-collection) is in progress on the same property. Cancel the in-progress migration first, or wait for it to finish.

:::

<Tabs className="code" groupId="languages">
<TabItem value="py" label="Python">
<FilteredTextBlock
@@ -258,10 +271,142 @@ For the full list of supported tokenizers — including `kagome_ja`, `kagome_kr`
</TabItem>
</Tabs>

## Reindex a property on a collection

<RuntimeReindexPreview/>

From `v1.38`, you can change a property's inverted-index configuration on a **populated collection without restarting and without losing writes**. Reads stay available throughout the migration. This replaces the previous workaround of exporting the data, recreating the collection, and re-importing.

These operations cover **inverted indexes only** (filterable, searchable, and range indexes, the BM25 algorithm, and tokenization). For the endpoint list, request-body reference, status values, concurrency rules, error codes, and required permissions, see [References: Runtime reindex](../config-refs/indexing/inverted-index.mdx#runtime-reindex).

### Add an inverted index to a collection

To enable filtering on a property that was created without an index:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"filterable":{"enabled":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/category"
```

The body shape selects the index type — `filterable`, `searchable` (requires `text` / `text[]`), or `rangeable` (numeric types only). For the full request and response formats, see the <SkipLink href="/weaviate/api/rest#tag/schema">REST API reference</SkipLink>.

The server responds with `202 Accepted` and a task ID, which you can treat as an opaque identifier:

```json
{
"status": "STARTED",
"taskId": "<id>"
}
```
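The same request can be issued from Python without a client library. The following is a minimal sketch using only the standard library; the helper names `build_reindex_request` and `submit_reindex` are hypothetical, and `http://localhost:8080` stands in for `$WEAVIATE`. Building and sending are kept separate so that the request can be inspected before anything goes over the network.

```python
import json
import urllib.request


def build_reindex_request(base_url, collection, prop, body, token=None):
    """Build (but do not send) the PUT request from the curl example above."""
    url = f"{base_url.rstrip('/')}/v1/schema/{collection}/indexes/{prop}"
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="PUT"
    )


def submit_reindex(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Assumed local endpoint; nothing is sent until submit_reindex(req) is called.
req = build_reindex_request(
    "http://localhost:8080", "Article", "category", {"filterable": {"enabled": True}}
)
print(req.get_method(), req.full_url)
```

`urllib.request.urlopen` raises `urllib.error.HTTPError` for `4xx` and `5xx` responses, so the error codes listed in the reference surface as exceptions rather than return values.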

### Change tokenization on a collection

To retokenize a populated `text` property:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"tokenization":"trigram"}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"
```

If the property has **both** a searchable and a filterable index, they are retokenized together in one coordinated migration. To retokenize only the filterable index, send `{"filterable":{"tokenization":"word"}}` instead. For the canonical list of tokenization values, see [Tokenization options](../config-refs/collections.mdx#tokenization).

### Migrate BM25 from WAND to BlockMax

[BlockMax WAND](../concepts/indexing/inverted-index.md#blockmax-wand-algorithm) is a faster scoring algorithm that replaced the original WAND implementation. On collections created before BlockMax was the default, you can migrate every searchable property in one call:

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"algorithm":"blockmax"}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"
```

The migration covers every searchable property on the collection, which then uses BlockMax once every property has been rebuilt. This is **one-way** — switching `searchable.algorithm` back from `blockmax` to `wand` is rejected, because WAND is deprecated.

### Rebuild or repair an index

If you suspect corruption or want to refresh an index after heavy data churn, send `"rebuild": true` for the index type:

```bash
# Rebuild the searchable index
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"

# Rebuild the filterable index
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"filterable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/category"

# Rebuild the range index
curl -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"rangeable":{"rebuild":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/price"
```

Each index is rebuilt from the stored objects. `searchable.rebuild` requires the property to already use BlockMax — to rebuild a property still on WAND, first migrate it to BlockMax (above).

### Cancel an in-progress migration

```bash
curl -X PUT \
-H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
-d '{"searchable":{"cancel":true}}' \
"$WEAVIATE/v1/schema/Article/indexes/body"
```

The response is one of:

- `{"status": "CANCELLED", "taskId": "<id>"}` — an in-progress migration was found and cancelled.
- `{"status": "NO_OP"}` — nothing matched (already finished, never submitted, or already cancelled). Idempotent.

Cancelling clears any partial state, so the next submit starts fresh.
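Because both outcomes are successful responses, a script has to branch on the body rather than on the HTTP status. A small sketch of that branch, assuming the two response shapes listed above are the only ones; `describe_cancel` is a hypothetical helper name.

```python
def describe_cancel(response: dict) -> str:
    """Summarize a cancel response using the two shapes documented above."""
    status = response.get("status")
    if status == "CANCELLED":
        return f"cancelled task {response.get('taskId')}"
    if status == "NO_OP":
        return "nothing to cancel"
    # Preview API: fail loudly on shapes this sketch does not know about.
    raise ValueError(f"unexpected cancel response: {response!r}")


print(describe_cancel({"status": "NO_OP"}))
```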

### Check migration status

```bash
curl -fsS -H "Authorization: Bearer $TOKEN" \
"$WEAVIATE/v1/schema/Article/indexes"
```

```json
{
"collection": "Article",
"properties": [{
"name": "body",
"dataType": "text",
"indexes": [
{ "type": "filterable", "status": "ready", "tokenization": "word" },
{ "type": "searchable", "status": "indexing", "progress": 0.42,
"tokenization": "word", "targetTokenization": "trigram",
"algorithm": "blockmax" }
]
}]
}
```
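To read one index out of this response in a script, walk `properties` and then `indexes`. The sketch below assumes the response shape shown above; `index_entry` is a hypothetical helper, and the sample payload is a trimmed copy of that example.

```python
def index_entry(payload, prop, index_type):
    """Return the index object for (property, type), or None if absent."""
    for prop_info in payload.get("properties", []):
        if prop_info.get("name") != prop:
            continue
        for index in prop_info.get("indexes", []):
            if index.get("type") == index_type:
                return index
    return None


sample = {
    "collection": "Article",
    "properties": [{
        "name": "body",
        "dataType": "text",
        "indexes": [
            {"type": "filterable", "status": "ready", "tokenization": "word"},
            {"type": "searchable", "status": "indexing", "progress": 0.42},
        ],
    }],
}

entry = index_entry(sample, "body", "searchable")
print(entry["status"], entry.get("progress"))
```

Returning `None` for a missing index keeps "no such index on this property" distinct from any of the status values.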

Polling this endpoint is cheap, so you can poll in a loop while a migration runs. For the meaning of each `status` value, see [References: Runtime reindex](../config-refs/indexing/inverted-index.mdx#runtime-reindex). To poll until an index is `ready`:

```bash
until curl -fsS -H "Authorization: Bearer $TOKEN" "$WEAVIATE/v1/schema/Article/indexes" \
| jq -e '.properties[] | select(.name=="body")
| .indexes[] | select(.type=="searchable")
| .status == "ready"' > /dev/null
do sleep 2; done
```
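The shell loop above waits only for `ready`, so it keeps polling if the migration ends in `failed` or `cancelled`. A script that must terminate can treat those as errors and add a timeout. This is a sketch under assumptions: `wait_for_index` and its parameters are hypothetical, and `fetch` is any callable that returns the decoded JSON from `GET /v1/schema/{class}/indexes`.

```python
import time


def wait_for_index(fetch, prop, index_type, *, interval=2.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the index is ready; raise on failed, cancelled, or timeout."""
    deadline = clock() + timeout
    while True:
        payload = fetch()
        status = None
        for prop_info in payload.get("properties", []):
            if prop_info.get("name") != prop:
                continue
            for index in prop_info.get("indexes", []):
                if index.get("type") == index_type:
                    status = index.get("status")
        if status == "ready":
            return status
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"{index_type} index on {prop!r} ended in {status}")
        if clock() >= deadline:
            raise TimeoutError(f"{index_type} index on {prop!r} not ready in time")
        sleep(interval)


# Stand-in for the HTTP call: two canned responses, no network access.
canned = iter([
    {"properties": [{"name": "body", "indexes": [
        {"type": "searchable", "status": "indexing", "progress": 0.42}]}]},
    {"properties": [{"name": "body", "indexes": [
        {"type": "searchable", "status": "ready"}]}]},
])
print(wait_for_index(lambda: next(canned), "body", "searchable", interval=0.0))
```

Passing `fetch`, `sleep`, and `clock` as parameters keeps the loop independent of any HTTP library and makes it possible to exercise it with canned responses, as in the last lines.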

To scope a migration to specific tenants on a multi-tenant class, add the `?tenants=` query parameter — see the [multi-tenancy rules](../config-refs/indexing/inverted-index.mdx#runtime-reindex) in the reference.

## Further resources

- [References: Collection definition](/weaviate/config-refs/collections.mdx)
- [Concepts: Inverted index](../config-refs/indexing/inverted-index.mdx)
- [References: Inverted index](../config-refs/indexing/inverted-index.mdx)
- [References: Runtime reindex](../config-refs/indexing/inverted-index.mdx#runtime-reindex)
- [Concepts: Inverted index](../concepts/indexing/inverted-index.md)

## Questions and feedback
