Version: 6
Design document for adding rectilinear (variable) chunk grid support to zarr-python, conforming to the rectilinear chunk grid extension spec.
Related:
- #3750 (single ChunkGrid proposal)
- #3534 (rectilinear implementation)
- #3735 (chunk grid module/registry)
- ZEP0003 (variable chunking spec)
- zarr-specs#370 (sharding v1.1: non-divisible subchunks)
- zarr-extensions#25 (rectilinear extension)
- zarr-extensions#34 (sharding + rectilinear)
Chunk grids form a hierarchy — the rectilinear grid is strictly more general than the regular grid. Any regular grid is expressible as a rectilinear grid. There is no known chunk grid that is both (a) more general than rectilinear and (b) retains the axis-aligned tessellation properties Zarr assumes. All known grids are special cases:
| Grid type | Description | Example |
|---|---|---|
| Regular | Uniform chunk size, boundary chunks padded with fill_value | [10, 10, 10, 10] |
| Regular-bounded (zarrs) | Uniform chunk size, boundary chunks trimmed to array extent | [10, 10, 10, 5] |
| HPC boundary-padded | Regular interior, larger boundary chunks (VirtualiZarr#217) | [10, 8, 8, 8, 10] |
| Fully variable | Arbitrary per-chunk sizes | [5, 12, 3, 20] |
Prior iterations on the chunk grid design were based on the Zarr V3 spec's definition of chunk grids as an extension point alongside codecs, dtypes, etc. We therefore started with a similar registry-based approach. In practice, however, chunk grids are fundamentally different from codecs. Codecs are independent; supporting zstd tells you nothing about gzip. Chunk grids are not: every regular grid is a valid rectilinear grid. A registry-based plugin system makes sense for codecs but adds complexity without clear benefit for chunk grids. Here we start from some basic goals and propose a more fitting design for supporting different chunk grids in zarr-python.
- Follow the zarr extension proposal. The implementation should conform to the rectilinear chunk grid spec, not innovate on the metadata format.
- Minimize changes to the public API. Users creating regular arrays should see no difference. Rectilinear is additive.
- Maintain backwards compatibility. Existing code using `.chunks`, `isinstance` checks, or importing `RegularChunkGrid`/`RectilinearChunkGrid` from `zarr.core.chunk_grids` should continue to work where practical (with deprecation warnings where appropriate). Internal code paths/imports may be broken with justification.
- Design for future iteration. The internal architecture should allow refactoring (e.g., metadata/array separation, new dimension types) without breaking the public API.
- Minimize downstream changes. xarray, VirtualiZarr, Icechunk, Cubed, etc. should need minimal updates.
- Minimize time to stable release. Ship behind a feature flag, stabilize through real-world usage, promote to stable API.
- The new API should be useful. `read_chunk_sizes`/`write_chunk_sizes`, `ChunkGrid.__getitem__`, `is_regular` — these should solve real problems, not just expose internals.
- Extensible for other serialization structures. The per-dimension design should support future encodings (tile, temporal) without changes to indexing or codecs.
- A chunk grid is a concrete arrangement of chunks, not an abstract tiling pattern. The chunk grid is bound to specific array dimensions, which enables it to answer any question about any chunk (offset, size, count) without external parameters.
- One implementation, multiple serialization forms. A single `ChunkGrid` class handles all chunking logic. The serialization format (`"regular"` vs `"rectilinear"`) is chosen by the metadata layer, not the grid.
- No chunk grid registry. Simple name-based dispatch in the metadata layer's `parse_chunk_grid()`.
- Fixed vs Varying per dimension. `FixedDimension(size, extent)` for uniform chunks; `VaryingDimension(edges, extent)` for per-chunk edge lengths with precomputed prefix sums. Avoids expanding regular dimensions into lists of identical values.
- Transparent transitions. Operations like `resize()` can move an array from regular to rectilinear chunking.
```python
@dataclass(frozen=True)
class FixedDimension:
    """Uniform chunk size. Boundary chunks contain less data but are
    encoded at full size by the codec pipeline."""
    size: int    # chunk edge length (>= 0)
    extent: int  # array dimension length

    def __post_init__(self) -> None:
        ...  # validates size >= 0 and extent >= 0

    @property
    def nchunks(self) -> int:
        if self.size == 0:
            return 0
        return ceildiv(self.extent, self.size)

    def index_to_chunk(self, idx: int) -> int:
        return idx // self.size  # raises IndexError if OOB

    def chunk_offset(self, chunk_ix: int) -> int:
        return chunk_ix * self.size  # raises IndexError if OOB

    def chunk_size(self, chunk_ix: int) -> int:
        return self.size  # always uniform; raises IndexError if OOB

    def data_size(self, chunk_ix: int) -> int:
        # raises IndexError if OOB
        return max(0, min(self.size, self.extent - chunk_ix * self.size))

    @property
    def unique_edge_lengths(self) -> Iterable[int]:
        return (self.size,)  # O(1)

    def indices_to_chunks(self, indices: NDArray) -> NDArray:
        return indices // self.size

    def with_extent(self, new_extent: int) -> FixedDimension:
        return FixedDimension(size=self.size, extent=new_extent)

    def resize(self, new_extent: int) -> FixedDimension:
        return FixedDimension(size=self.size, extent=new_extent)


@dataclass(frozen=True)
class VaryingDimension:
    """Explicit per-chunk sizes. The last chunk may extend past the array
    extent (extent < sum(edges)), in which case data_size clips to the
    valid region while chunk_size returns the full edge length for codec
    processing. This underflow is allowed to match how regular grids
    handle boundary chunks, and to support shrinking an array without
    rewriting chunk edges (the spec allows trailing edges beyond the extent)."""
    edges: tuple[int, ...]       # per-chunk edge lengths (all > 0)
    cumulative: tuple[int, ...]  # prefix sums for O(log n) lookup
    extent: int                  # array dimension length (may be < sum(edges))

    def __init__(self, edges: Sequence[int], extent: int) -> None:
        ...  # validates edges non-empty, all > 0, extent >= 0, extent <= sum(edges)
        # computes cumulative via itertools.accumulate
        # uses object.__setattr__ for frozen dataclass

    @property
    def nchunks(self) -> int:
        # number of chunks that overlap [0, extent)
        if self.extent == 0:
            return 0
        return bisect.bisect_left(self.cumulative, self.extent) + 1

    @property
    def ngridcells(self) -> int:
        return len(self.edges)

    def index_to_chunk(self, idx: int) -> int:
        return bisect.bisect_right(self.cumulative, idx)  # raises IndexError if OOB

    def chunk_offset(self, chunk_ix: int) -> int:
        # raises IndexError if OOB
        return self.cumulative[chunk_ix - 1] if chunk_ix > 0 else 0

    def chunk_size(self, chunk_ix: int) -> int:
        return self.edges[chunk_ix]  # raises IndexError if OOB

    def data_size(self, chunk_ix: int) -> int:
        # raises IndexError if OOB
        offset = self.chunk_offset(chunk_ix)
        return max(0, min(self.edges[chunk_ix], self.extent - offset))

    @property
    def unique_edge_lengths(self) -> Iterable[int]:
        ...  # lazy generator: yields unseen values, short-circuits deduplication

    def indices_to_chunks(self, indices: NDArray) -> NDArray:
        return np.searchsorted(self.cumulative, indices, side='right')

    def with_extent(self, new_extent: int) -> VaryingDimension:
        # validates cumulative[-1] >= new_extent (O(1)), re-binds extent
        return VaryingDimension(self.edges, extent=new_extent)

    def resize(self, new_extent: int) -> VaryingDimension:
        ...  # grow past edge sum: append chunk of size (new_extent - sum(edges))
        # shrink or grow within edge sum: preserve all edges, re-bind extent
```

Both types implement the `DimensionGrid` protocol: `nchunks`, `extent`, `index_to_chunk`, `chunk_offset`, `chunk_size`, `data_size`, `indices_to_chunks`, `unique_edge_lengths`, `with_extent`, `resize`. Memory usage scales with the number of varying dimensions, not total chunks.
All per-chunk methods (chunk_offset, chunk_size, data_size) raise IndexError for out-of-bounds chunk indices, providing consistent fail-fast behavior across both dimension types.
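The prefix-sum lookup at the heart of the varying dimension can be sketched as a small standalone class. This is a simplified stand-in, not the actual implementation: validation and the frozen-dataclass plumbing are omitted, and `VaryingDim` is an illustrative name.

```python
import bisect
from itertools import accumulate

class VaryingDim:
    """Simplified stand-in for VaryingDimension: prefix sums + binary search."""

    def __init__(self, edges, extent):
        self.edges = tuple(edges)
        self.cumulative = tuple(accumulate(self.edges))  # e.g. (10, 30, 60)
        self.extent = extent

    def index_to_chunk(self, idx: int) -> int:
        # first chunk whose cumulative end exceeds idx
        if not 0 <= idx < self.extent:
            raise IndexError(idx)
        return bisect.bisect_right(self.cumulative, idx)

    def chunk_offset(self, chunk_ix: int) -> int:
        return self.cumulative[chunk_ix - 1] if chunk_ix > 0 else 0

    def data_size(self, chunk_ix: int) -> int:
        # clip the last chunk to the array extent
        return max(0, min(self.edges[chunk_ix], self.extent - self.chunk_offset(chunk_ix)))

dim = VaryingDim([10, 20, 30], extent=55)
```

Here `dim.index_to_chunk(35)` lands in the third chunk (index 2), whose `data_size` is 25 because the 30-wide edge is clipped at extent 55.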
The two size methods serve different consumers:
| Method | Returns | Consumer |
|---|---|---|
| `chunk_size` | Buffer size for codec processing | Codec pipeline (`ArraySpec.shape`) |
| `data_size` | Valid data region within the buffer | Indexing pipeline (`chunk_selection` slicing) |
For FixedDimension, these differ only at the boundary. For VaryingDimension, these differ only when the last chunk extends past the extent (i.e., extent < sum(edges)). This matches current zarr-python behavior: get_chunk_spec passes the full chunk_shape to the codec for all chunks, and the indexer generates a chunk_selection that clips the decoded buffer.
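For a fixed dimension the two sizes can be computed directly. A minimal numeric sketch of the boundary case (`fixed_chunk_size`/`fixed_data_size` are hypothetical helper names, not part of the proposed API):

```python
def fixed_chunk_size(size: int, extent: int, chunk_ix: int) -> int:
    # codec buffer: always the declared chunk size, even at the boundary
    return size

def fixed_data_size(size: int, extent: int, chunk_ix: int) -> int:
    # valid data region: clipped to what remains before the array extent
    return max(0, min(size, extent - chunk_ix * size))

# With extent=95 and size=10, the last chunk (index 9) holds only 5 valid
# elements, but the codec still processes a 10-wide buffer padded with
# fill_value.
```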
```python
@runtime_checkable
class DimensionGrid(Protocol):
    """Structural interface shared by FixedDimension and VaryingDimension."""
    @property
    def nchunks(self) -> int: ...
    @property
    def ngridcells(self) -> int: ...
    @property
    def extent(self) -> int: ...
    def index_to_chunk(self, idx: int) -> int: ...
    def chunk_offset(self, chunk_ix: int) -> int: ...  # raises IndexError if OOB
    def chunk_size(self, chunk_ix: int) -> int: ...    # raises IndexError if OOB
    def data_size(self, chunk_ix: int) -> int: ...     # raises IndexError if OOB
    def indices_to_chunks(self, indices: NDArray[np.intp]) -> NDArray[np.intp]: ...
    @property
    def unique_edge_lengths(self) -> Iterable[int]: ...
    def with_extent(self, new_extent: int) -> DimensionGrid: ...
    def resize(self, new_extent: int) -> DimensionGrid: ...
```

The protocol is `@runtime_checkable`, enabling polymorphic handling of both dimension types without concrete-class `isinstance` checks.
nchunks and ngridcells differ when extent < sum(edges): nchunks counts only chunks that overlap [0, extent), while ngridcells counts total defined grid cells (i.e., len(edges)). For FixedDimension, both are equal. For VaryingDimension, they differ after a resize that shrinks the extent below the edge sum.
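A small numeric sketch of the distinction, using the bisect-based count from the design (`nchunks_of`/`ngridcells_of` are illustrative free-function names):

```python
import bisect
from itertools import accumulate

def nchunks_of(edges, extent):
    # chunks overlapping [0, extent)
    if extent == 0:
        return 0
    cumulative = list(accumulate(edges))
    return bisect.bisect_left(cumulative, extent) + 1

def ngridcells_of(edges):
    # total defined grid cells
    return len(edges)

# After shrinking the extent of edges (10, 20, 30) from 60 to 25, only two
# chunks overlap the data, but all three grid cells remain defined.
```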
```python
@dataclass(frozen=True)
class ChunkSpec:
    slices: tuple[slice, ...]     # valid data region in array coordinates
    codec_shape: tuple[int, ...]  # buffer shape for codec processing

    @property
    def shape(self) -> tuple[int, ...]:
        return tuple(s.stop - s.start for s in self.slices)

    @property
    def is_boundary(self) -> bool:
        return self.shape != self.codec_shape
```

For interior chunks, shape == codec_shape. For boundary chunks of a regular grid, codec_shape is the full declared chunk size while shape is clipped. For rectilinear grids, shape == codec_shape unless the last chunk extends past the extent.
```python
# Creating arrays
arr = zarr.create_array(shape=(100, 200), chunks=(10, 20))                           # regular
rect = zarr.create_array(shape=(60, 100), chunks=[[10, 20, 30], [25, 25, 25, 25]])  # rectilinear

# ChunkGrid as a collection
grid = arr._chunk_grid  # ChunkGrid (bound to array shape)
grid.grid_shape         # (10, 10) — number of chunks per dimension
grid.ndim               # 2
grid.is_regular         # True if all dimensions are Fixed

spec = grid[0, 1]       # ChunkSpec for chunk at grid position (0, 1)
spec.slices             # (slice(0, 10), slice(20, 40))
spec.shape              # (10, 20) — data shape
spec.codec_shape        # (10, 20) — same for interior chunks

last = grid[9, 0]       # last chunk along dim 0 (extent=100 divides evenly by size=10)
last.shape              # (10, 20) — data shape
last.codec_shape        # (10, 20) — codec sees full buffer

grid[99, 99]            # None — out of bounds

for spec in grid:       # iterate all chunks
    ...

# .chunks property: retained for regular grids, raises NotImplementedError for rectilinear
arr.chunks              # (10, 20)

# .read_chunk_sizes / .write_chunk_sizes: works for all grids (dask-style)
arr.write_chunk_sizes   # ((10, 10, ..., 10), (20, 20, ..., 20))
```

`ChunkGrid.__getitem__` constructs `ChunkSpec` using `chunk_size` for `codec_shape` and `data_size` for `slices`:
```python
def __getitem__(self, coords: int | tuple[int, ...]) -> ChunkSpec | None:
    if isinstance(coords, int):
        coords = (coords,)
    slices = []
    codec_shape = []
    for dim, ix in zip(self.dimensions, coords):
        if ix < 0 or ix >= dim.nchunks:
            return None
        offset = dim.chunk_offset(ix)
        slices.append(slice(offset, offset + dim.data_size(ix)))
        codec_shape.append(dim.chunk_size(ix))
    return ChunkSpec(tuple(slices), tuple(codec_shape))
```

`from_sizes` requires `array_shape`, binding the extent per dimension at construction time. This is a core design choice: a chunk grid is a concrete arrangement for a specific array, not an abstract tiling pattern.
```python
# Regular grid — all FixedDimension
grid = ChunkGrid.from_sizes(array_shape=(100, 200), chunk_sizes=(10, 20))

# Rectilinear grid — extent = sum(edges) when shape matches
grid = ChunkGrid.from_sizes(array_shape=(60, 100), chunk_sizes=[[10, 20, 30], [25, 25, 25, 25]])

# Rectilinear grid with boundary clipping — last chunk extends past array extent
# e.g., shape=(55, 90) but edges sum to (60, 100): data_size clips at extent
grid = ChunkGrid.from_sizes(array_shape=(55, 90), chunk_sizes=[[10, 20, 30], [25, 25, 25, 25]])

# Direct construction
grid = ChunkGrid(dimensions=(FixedDimension(10, 100), VaryingDimension([10, 20, 30], 55)))
```

When extent < sum(edges), the dimension is always stored as `VaryingDimension` (even if all edges are identical) to preserve the explicit edge count. The last chunk's `chunk_size` returns the full declared edge (codec buffer) while `data_size` clips to the extent. This mirrors how `FixedDimension` handles boundary chunks in regular grids.
```python
# Regular grid:
{"name": "regular", "configuration": {"chunk_shape": [10, 20]}}

# Rectilinear grid (with RLE compression and "kind" field):
{"name": "rectilinear", "configuration": {"kind": "inline", "chunk_shapes": [[10, 20, 30], [[25, 4]]]}}
```

Both names deserialize to the same `ChunkGrid` class. The serialized form does not include the array extent — that comes from `shape` in array metadata and is combined with the chunk grid when constructing a `ChunkGrid` via `ChunkGrid.from_metadata()`.
The ChunkGrid does not serialize itself. The format choice ("regular" vs "rectilinear") belongs to ArrayV3Metadata. Serialization and deserialization are handled by the metadata-layer chunk grid classes (RegularChunkGridMetadata and RectilinearChunkGridMetadata in metadata/v3.py), which provide to_dict() and from_dict() methods.
For create_array, the format is inferred from the chunks argument: a flat tuple produces "regular", a nested list produces "rectilinear". The _is_rectilinear_chunks() helper detects nested sequences like [[10, 20], [5, 5]].
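The detection can be sketched as a nested-sequence check. This is an illustrative version under the assumption that a flat tuple of ints means regular; the actual `_is_rectilinear_chunks()` helper may differ in details.

```python
from collections.abc import Sequence

def is_rectilinear_chunks(chunks) -> bool:
    # nested sequences like [[10, 20], [5, 5]] -> rectilinear;
    # flat tuples of ints like (10, 20) -> regular
    return (
        isinstance(chunks, Sequence)
        and not isinstance(chunks, (str, bytes))
        and len(chunks) > 0
        and all(
            isinstance(c, Sequence) and not isinstance(c, (str, bytes))
            for c in chunks
        )
    )
```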
The rectilinear format requires "kind": "inline" (validated by validate_rectilinear_kind()). Per the spec, each element of chunk_shapes can be:
- A bare integer `m`: repeated until `sum >= array_extent`
- A list of bare integers: explicit per-chunk sizes
- A mixed array of bare integers and `[value, count]` RLE pairs
RLE compression is used when serializing: runs of identical sizes become [value, count] pairs, singletons stay as bare integers.
```python
# compress_rle([10, 10, 10, 5]) -> [[10, 3], 5]
# expand_rle([[10, 3], 5])      -> [10, 10, 10, 5]
```

For a single-element `chunk_shapes` tuple like `(10,)`, `RectilinearChunkGridMetadata.to_dict()` serializes it as a bare integer `10`. Per the rectilinear spec, a bare integer is repeated until the sum >= extent, preserving the full codec buffer size for boundary chunks.
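A sketch of RLE helpers matching the `compress_rle`/`expand_rle` behavior shown above (the actual implementation in the branch may differ):

```python
from itertools import groupby

def compress_rle(sizes):
    # runs of identical sizes become [value, count]; singletons stay bare ints
    out = []
    for value, run in groupby(sizes):
        count = sum(1 for _ in run)
        out.append([value, count] if count > 1 else value)
    return out

def expand_rle(encoded):
    # inverse: expand [value, count] pairs, pass bare ints through
    out = []
    for item in encoded:
        if isinstance(item, list):
            value, count = item
            out.extend([value] * count)
        else:
            out.append(item)
    return out
```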
Zero-extent handling: Regular grids serialize zero-extent dimensions without issue (the format encodes only chunk_shape, no edges). Rectilinear grids cannot represent zero-extent dimensions because the spec requires at least one positive-integer edge length per axis.
The read_chunk_sizes and write_chunk_sizes properties provide universal access to per-dimension chunk data sizes, matching the dask Array.chunks convention. They work for both regular and rectilinear grids:
- `write_chunk_sizes`: always returns outer (storage) chunk sizes
- `read_chunk_sizes`: returns inner chunk sizes when sharding is used, otherwise same as `write_chunk_sizes`
```python
>>> arr = zarr.create_array(store, shape=(100, 80), chunks=(30, 40))
>>> arr.write_chunk_sizes
((30, 30, 30, 10), (40, 40))
>>> arr = zarr.create_array(store, shape=(60, 100), chunks=[[10, 20, 30], [50, 50]])
>>> arr.write_chunk_sizes
((10, 20, 30), (50, 50))
```

The underlying `ChunkGrid.chunk_sizes` property (on the grid, not the array) returns the same as `write_chunk_sizes`.
```python
arr.resize((80, 100))   # re-binds extent; FixedDimension stays fixed
arr.resize((200, 100))  # VaryingDimension grows by appending a new chunk
arr.resize((30, 100))   # VaryingDimension shrinks: preserves all edges, re-binds extent
```

Resize uses `ChunkGrid.update_shape(new_shape)`, which delegates to each dimension's `.resize()` method:
- `FixedDimension.resize()`: simply re-binds the extent (identical to `with_extent`)
- `VaryingDimension.resize()`: grow past `sum(edges)` appends a chunk covering the gap; shrink or grow within `sum(edges)` preserves all edges and re-binds the extent (the spec allows trailing edges beyond the array extent)
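The varying-dimension rules can be sketched as a free function (a hypothetical mirror of `VaryingDimension.resize`, returning the new edges and extent rather than a new object):

```python
def resize_varying(edges: tuple[int, ...], new_extent: int) -> tuple[tuple[int, ...], int]:
    """Return (new_edges, new_extent) following the resize rules above."""
    total = sum(edges)
    if new_extent > total:
        # grow past the edge sum: append one chunk covering the gap
        return edges + (new_extent - total,), new_extent
    # shrink or grow within the edge sum: keep all edges, re-bind the extent
    return edges, new_extent
```

For example, `resize_varying((10, 10, 10), 45)` yields `((10, 10, 10, 15), 45)`, exactly the single-appended-chunk behavior discussed as a known limitation next.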
Known limitation (deferred): When growing a VaryingDimension, the current implementation always appends a single chunk covering the new region. For example, [10, 10, 10] resized from 30 to 45 produces [10, 10, 10, 15] instead of the more natural [10, 10, 10, 10, 10]. A future improvement should add an optional chunks parameter to resize() that controls how the new region is partitioned, with a sane default (e.g., repeating the last chunk size). This is safely deferrable because:
- `FixedDimension` already handles resize correctly (regular grids stay regular)
- The single-chunk default produces valid state, just suboptimal chunk layout
- Rectilinear arrays are behind an experimental feature flag
- Adding an optional parameter is backwards-compatible
Open design questions for the chunks parameter:
- Does it describe the new region only, or the entire post-resize array?
- Must the overlapping portion agree with existing chunks (no rechunking)?
- What is the type? Same as `chunks` in `create_array`?
The from_array() function handles both regular and rectilinear source arrays:
```python
src = zarr.create_array(store, shape=(60, 100), chunks=[[10, 20, 30], [50, 50]])
new = zarr.from_array(data=src, store=new_store, chunks="keep")
# Preserves rectilinear structure: new.write_chunk_sizes == ((10, 20, 30), (50, 50))
```

When `chunks="keep"`, the logic checks `data._chunk_grid.is_regular`:
- Regular: extracts `data.chunks` (flat tuple) and preserves shards
- Rectilinear: extracts `data.write_chunk_sizes` (nested tuples) and forces shards to None
The indexing pipeline is coupled to regular grid assumptions — every per-dimension indexer takes a scalar dim_chunk_len: int and uses // and *:
```python
dim_chunk_ix = self.dim_sel // self.dim_chunk_len  # IntDimIndexer
dim_offset = dim_chunk_ix * self.dim_chunk_len     # SliceDimIndexer
```

Replace `dim_chunk_len: int` with the dimension object (`FixedDimension | VaryingDimension`). The shared interface means the indexer code structure stays the same — `dim_sel // dim_chunk_len` becomes `dim_grid.index_to_chunk(dim_sel)`. O(1) for regular, binary search for varying.
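As an illustration, a per-dimension helper that previously used `//` directly can be written against the dimension object and then works unchanged for both grid types. A self-contained sketch with a minimal fixed-dimension stand-in (`FixedDim` and `chunks_overlapping` are illustrative names, not the proposed API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FixedDim:
    # minimal stand-in exposing the piece of the protocol the indexer needs
    size: int
    extent: int

    def index_to_chunk(self, idx: int) -> int:
        return idx // self.size

def chunks_overlapping(dim, start: int, stop: int) -> range:
    # grid indices of all chunks overlapping the selection [start, stop)
    if stop <= start:
        return range(0)
    return range(dim.index_to_chunk(start), dim.index_to_chunk(stop - 1) + 1)
```

`chunks_overlapping(FixedDim(10, 100), 25, 45)` covers chunks 2 through 4; a varying dimension with a binary-search `index_to_chunk` plugs into the same function unchanged.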
Today, get_chunk_spec() returns the same ArraySpec(shape=chunk_grid.chunk_shape) for every chunk. For rectilinear grids, each chunk has a different codec shape:
```python
def get_chunk_spec(self, chunk_coords, array_config, prototype) -> ArraySpec:
    spec = self._chunk_grid[chunk_coords]
    return ArraySpec(shape=spec.codec_shape, ...)
```

Note `spec.codec_shape`, not `spec.shape`. For regular grids, codec_shape is uniform (preserving current behavior). The boundary clipping flow is unchanged:
Write: user data → pad to codec_shape with fill_value → encode → store
Read: store → decode to codec_shape → slice via chunk_selection → user data
The ShardingCodec constructs a ChunkGrid per shard using the shard shape as extent and the subchunk shape as FixedDimension. Each shard is self-contained — it doesn't need to know whether the outer grid is regular or rectilinear. Validation checks that every unique edge length per dimension is divisible by the inner chunk size, using dim.unique_edge_lengths for efficient polymorphic iteration (O(1) for fixed dimensions, lazy-deduplicated for varying).
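The validation can be sketched as follows (`check_subchunk_divisibility` is an illustrative name; the real check lives inside the sharding codec):

```python
def check_subchunk_divisibility(unique_edge_lengths, subchunk_size: int) -> None:
    # every distinct outer edge length must be a multiple of the inner chunk size
    for edge in unique_edge_lengths:
        if edge % subchunk_size != 0:
            raise ValueError(
                f"shard edge {edge} is not divisible by subchunk size {subchunk_size}"
            )
```

For a `FixedDimension`, `unique_edge_lengths` yields a single value, so the loop is O(1); for a varying dimension it visits each distinct edge once thanks to the lazy deduplication.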
Level 1 — Outer chunk grid (shard boundaries): regular or rectilinear
Level 2 — Inner subchunk grid (within each shard): always regular
Level 3 — Shard index: ceil(shard_dim / subchunk_dim) entries per dimension
zarr-specs#370 lifts the requirement that subchunk shapes evenly divide the shard shape. With the proposed ChunkGrid, this just means removing the shard_shape % subchunk_shape == 0 validation — FixedDimension already handles boundary clipping via data_size.
| Outer grid | Subchunk divisibility | Required change |
|---|---|---|
| Regular | Evenly divides (v1.0) | None |
| Regular | Non-divisible (v1.1) | Remove divisibility validation |
| Rectilinear | Evenly divides | Remove "sharding incompatible" guard |
| Rectilinear | Non-divisible | Both changes |
| Current | Proposed |
|---|---|
| `ChunkGrid` ABC + `RegularChunkGrid` subclass | Single concrete `ChunkGrid` with `is_regular` |
| `RectilinearChunkGrid` (#3534) | Same `ChunkGrid` class |
| Chunk grid registry + entrypoints (#3735) | Direct name dispatch |
| `arr.chunks` | Retained for regular; `arr.read_chunk_sizes`/`arr.write_chunk_sizes` for general use |
| `get_chunk_shape(shape, coord)` | `grid[coord].codec_shape` or `grid[coord].shape` |
The chunk grid is a concrete arrangement, not an abstract tiling pattern. A finite collection naturally has an extent. Storing it enables __getitem__, eliminates dim_len parameters from every method, and makes the grid self-describing.
This does not mean ArrayV3Metadata.shape should delegate to the grid. The array shape remains an independent field in metadata. The extent is passed into the grid at construction time so it can answer boundary questions without external parameters. It is not serialized as part of the chunk grid JSON — it comes from the shape field in array metadata and is combined with the chunk grid configuration in ChunkGrid.from_metadata().
A chunk in a regular grid has two sizes. chunk_size is the buffer size the codec processes — always size for FixedDimension, even at the boundary (padded with fill_value). data_size is the valid data region — clipped to extent % size at the boundary. The indexing layer uses data_size to generate chunk_selection slices.
This matches current zarr-python behavior and matters for:
- Backward compatibility. Existing stores have boundary chunks encoded at full `chunk_shape`.
- Codec simplicity. Codecs assume uniform input shapes for regular grids.
- Shard index correctness. The index assumes `subchunk_dim`-sized entries.
For VaryingDimension, chunk_size == data_size when extent == sum(edges). When extent < sum(edges) (e.g., after a resize that keeps the last chunk oversized), data_size clips the last chunk. This is the fundamental difference: FixedDimension has a declared size plus an extent that clips data; VaryingDimension has explicit sizes that normally are the extent but can also extend past it.
There is no known chunk grid outside the rectilinear family that retains the tessellation properties zarr-python assumes. A match on the grid name is sufficient.
Discussed in #3534. @d-v-b argued that RegularChunkGrid is unnecessary since rectilinear is more general; @dcherian argued that downstream libraries need a fast way to detect regular grids without inspecting potentially millions of chunk edges (see xarray#9808).
The resolution: a single ChunkGrid class with an is_regular property (O(1), cached at construction). This gives downstream code the fast-path detection @dcherian needed without the class hierarchy complexity @d-v-b wanted to avoid. The metadata document's name field ("regular" vs "rectilinear") is also available for clients who inspect JSON directly.
A backwards-compatibility shim in chunk_grids.py preserves the old RegularChunkGrid / RectilinearChunkGrid import paths with deprecation warnings — see Backwards compatibility.
The old design had ChunkGrid as an ABC with RegularChunkGrid as its only subclass. #3534 added RectilinearChunkGrid as a second subclass. This branch makes ChunkGrid a single concrete class instead, with separate metadata DTOs (RegularChunkGridMetadata and RectilinearChunkGridMetadata in metadata/v3.py) for serialization.
All known grids are special cases of rectilinear, so there's no need for a class hierarchy at the grid level. A ChunkGrid Protocol/ABC would mean every caller programs against an abstract interface and adding a grid type requires implementing ~15 methods. A single class is simpler.
Note: the dimension types (FixedDimension, VaryingDimension) do use a DimensionGrid Protocol — that's where the polymorphism lives. The grid-level class is concrete; the dimension-level types are polymorphic. If a genuinely novel grid type emerges that can't be expressed as a combination of per-dimension types, a grid-level Protocol can be extracted.
Debated in #3534. @d-v-b suggested making .chunks return tuple[tuple[int, ...], ...] (dask-style) for all grids. @dcherian strongly objected: every downstream consumer expects tuple[int, ...], and silently returning a different type would be worse than raising. Materializing O(10M) chunk edges into a Python tuple is also a real performance risk (xarray#8902).
The resolution:
- `.chunks` is retained for regular grids (returns `tuple[int, ...]` as before)
- `.chunks` raises `NotImplementedError` for rectilinear grids with a message pointing to `.read_chunk_sizes`/`.write_chunk_sizes`
- `.read_chunk_sizes` and `.write_chunk_sizes` return `tuple[tuple[int, ...], ...]` (dask convention) for all grids
@maxrjones noted in review that deprecating .chunks for regular grids was not desirable. The current branch does not deprecate it.
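The resolution can be sketched as a small function over the grid (an illustrative free function; in the branch this logic would live on the array's `.chunks` property, and the attribute names here are assumptions):

```python
from types import SimpleNamespace

def chunks_of(grid) -> tuple[int, ...]:
    # flat tuple for regular grids; explicit error pointing at the new API otherwise
    if grid.is_regular:
        return tuple(dim.size for dim in grid.dimensions)
    raise NotImplementedError(
        "rectilinear chunk grid: use .read_chunk_sizes / .write_chunk_sizes"
    )

# minimal stand-ins for a regular and a rectilinear grid
regular = SimpleNamespace(
    is_regular=True,
    dimensions=(SimpleNamespace(size=10), SimpleNamespace(size=20)),
)
rectilinear = SimpleNamespace(is_regular=False, dimensions=())
```

The key point is that no chunk edges are materialized on the error path, avoiding the O(10M)-edge risk noted above.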
@d-v-b raised in #3534 that users need a way to say "these chunks are regular, but serialize as rectilinear" (e.g., to allow future append/extend workflows without format changes). @jhamman initially made nested-list input always produce RectilinearChunkGridMetadata.
The current branch resolves this via the metadata-layer chunk grid classes. When metadata is deserialized, the original name (from {"name": "regular"} or {"name": "rectilinear"}) determines which metadata class is instantiated (RegularChunkGridMetadata or RectilinearChunkGridMetadata), and that class handles serialization via to_dict(). Current inference behavior for create_array:
- `chunks=(10, 20)` (flat tuple) → infers `"regular"`
- `chunks=[[10, 20], [5, 5]]` (nested lists with varying sizes) → infers `"rectilinear"`
- `chunks=[[10, 10], [20, 20]]` (nested lists with uniform sizes) → `from_sizes` collapses to `FixedDimension`, so `is_regular=True` and infers `"regular"`
Open question: Should uniform nested lists preserve "rectilinear" to support future append workflows without a format change? This could be addressed by checking the input form before collapsing, or by allowing users to pass chunk_grid_name explicitly through the create_array API.
#3750 discussion identified periodic chunk patterns as a use case not efficiently served by RLE alone. RLE compresses runs of identical values (np.repeat), but periodic patterns like days-per-month ([31, 28, 31, 30, ...] repeated 30 years) need a tile encoding (np.tile). Real-world examples include:
- Oceanographic models (ROMS): HPC boundary-padded chunks like `[10, 8, 8, 8, 10]` — handled by RLE
- Temporal axes: days-per-month, hours-per-day — need tile encoding for compact metadata
- Temporal-aware grids: date/time-aware chunk grids that layer over other axes (raised by @LDeakin)
A TiledDimension prototype was built (commit 9c0f582) demonstrating that the per-dimension design supports this without changes to indexing or the codec pipeline. However, it was intentionally excluded from this release because:
- Metadata format must come first. Tile encoding requires a new `kind` value in the rectilinear spec (currently only `"inline"` is defined). This should go through zarr-extensions#25, not zarr-python unilaterally.
- The per-dimension architecture doesn't preclude it. A future `TiledDimension` can implement the `DimensionGrid` protocol alongside `FixedDimension` and `VaryingDimension` with no changes to indexing, codecs, or the `ChunkGrid` class.
- RLE covers the MVP. Most real-world variable chunk patterns (HPC boundaries, irregular partitions) are efficiently encoded with RLE. Tile encoding is an optimization for a specific (temporal) subset.
An earlier design doc proposed decoupling ChunkGrid (runtime) from ArrayV3Metadata (serialization), so that metadata would store only a plain dict and the array layer would construct the ChunkGrid.
The current implementation partially realizes this separation:
- Metadata DTOs (`RegularChunkGridMetadata`, `RectilinearChunkGridMetadata` in `metadata/v3.py`): Pure data, frozen dataclasses, no array shape. These live on `ArrayV3Metadata.chunk_grid` and represent only what goes into `zarr.json`.
- `ChunkGrid` (`chunk_grids.py`): Shape-bound, supports indexing, iteration, and chunk specs. Lives on `AsyncArray._chunk_grid`, constructed from metadata + `shape` via `ChunkGrid.from_metadata()`.
This means ArrayV3Metadata.chunk_grid is now a ChunkGridMetadata (the DTO union type), not the runtime ChunkGrid. Code that previously accessed runtime methods on metadata.chunk_grid (e.g., all_chunk_coords(), __getitem__) must now use the grid from the array layer instead.
The name controls serialization format; each metadata DTO class provides its own to_dict() method for serialization. The ChunkGrid handles all runtime queries.
zarrs (Rust): Three independent grid types behind a ChunkGridTraits trait. Key patterns adopted: Fixed vs Varying per dimension, prefix sums + binary search, Option<T> for out-of-bounds, NonZeroU64 for chunk dimensions, separate subchunk grid per shard, array shape at construction.
TensorStore (C++): Stores only chunk_shape — boundary clipping via valid_data_bounds at query time. Both RegularGridRef and IrregularGrid internally. No registry.
The user-facing API is fully backward-compatible. Existing code that creates, opens, reads, and writes zarr arrays continues to work without changes:
- `zarr.create_array`, `zarr.open`, `zarr.open_array`, `zarr.open_group` -- unchanged signatures. The `chunks` parameter type is widened (now also accepts nested sequences for rectilinear grids), but all existing call patterns still work.
- `arr.chunks` -- returns `tuple[int, ...]` for regular arrays, same as before.
- `arr.shape`, `arr.dtype`, `arr.ndim`, `arr.shards` -- unchanged.
- Top-level `zarr` exports -- unchanged.
- Rectilinear chunks are gated behind `zarr.config.set({'array.rectilinear_chunks': True})`, so they cannot be created accidentally.
New additions (purely additive): arr.read_chunk_sizes, arr.write_chunk_sizes, zarr.experimental.ChunkGrid, zarr.experimental.ChunkSpec.
The breaking changes discussed below are confined to internal modules (zarr.core.chunk_grids, zarr.core.metadata.v3, zarr.core.indexing) that downstream libraries like cubed and VirtualiZarr access directly.
This section analyzes the internal breaking changes from the metadata/array separation and evaluates two strategies: (A) add backward-compatibility shims in zarr-python, vs. (B) require downstream packages to update. The baseline is no shims at all.
Three API changes affect downstream code:
- `RegularChunkGrid` class removed from `zarr.core.chunk_grids`. On `main`, `RegularChunkGrid` is defined in `chunk_grids.py` as a `Metadata` subclass. This branch replaces it with `RegularChunkGridMetadata` in `metadata/v3.py`. Without a shim, `from zarr.core.chunk_grids import RegularChunkGrid` raises `ImportError`.
- `RegularChunkGrid` no longer available from `zarr.core.metadata.v3`. On `main`, `v3.py` imports `RegularChunkGrid` from `chunk_grids.py` for internal use. VirtualiZarr imports it from this location (`from zarr.core.metadata.v3 import RegularChunkGrid`). Without the internal import, this raises `ImportError`.
- `OrthogonalIndexer` constructor expects `ChunkGrid`, not `RegularChunkGrid`/`RegularChunkGridMetadata`. Even if the import shims above resolve to `RegularChunkGridMetadata`, the indexer constructors access `chunk_grid._dimensions`, which only exists on the runtime `ChunkGrid` class. Cubed constructs `OrthogonalIndexer(selection, shape, RegularChunkGrid(chunk_shape=chunks))` directly.
VirtualiZarr (5 line changes across 2 files):
```diff
# manifests/array.py (line 6): import
- from zarr.core.metadata.v3 import ArrayV3Metadata, RegularChunkGrid
+ from zarr.core.metadata.v3 import ArrayV3Metadata, RegularChunkGridMetadata

# manifests/array.py (line 53): isinstance check
- if not isinstance(_metadata.chunk_grid, RegularChunkGrid):
+ if not isinstance(_metadata.chunk_grid, RegularChunkGridMetadata):

# parsers/zarr.py (line 16): import
- from zarr.core.chunk_grids import RegularChunkGrid
+ from zarr.core.metadata.v3 import RegularChunkGridMetadata

# parsers/zarr.py (line 270): isinstance check
- if not isinstance(array_v3_metadata.chunk_grid, RegularChunkGrid):
+ if not isinstance(array_v3_metadata.chunk_grid, RegularChunkGridMetadata):

# parsers/zarr.py (line 390): cast
- cast(RegularChunkGrid, metadata.chunk_grid).chunk_shape
+ cast(RegularChunkGridMetadata, metadata.chunk_grid).chunk_shape
```

The `manifests/array.py` import is from `zarr.core.metadata.v3` (never a documented export; VirtualiZarr relied on a transitive import). The `parsers/zarr.py` import is from `zarr.core.chunk_grids` (the canonical location on `main`). Both are straightforward renames. The `.chunk_shape` attribute is unchanged on the new class.
If VirtualiZarr needs to support both old and new zarr-python, a version-conditional import adds ~5 more lines.
Cubed (3 line changes in 1 file):
```diff
# core/ops.py (lines 626-631)
def _create_zarr_indexer(selection, shape, chunks):
    if zarr.__version__[0] == "3":
-       from zarr.core.chunk_grids import RegularChunkGrid
+       from zarr.core.chunk_grids import ChunkGrid
        from zarr.core.indexing import OrthogonalIndexer
-       return OrthogonalIndexer(selection, shape, RegularChunkGrid(chunk_shape=chunks))
+       return OrthogonalIndexer(selection, shape, ChunkGrid.from_sizes(shape, chunks))
```

Note that `ChunkGrid` is not a renamed class. `RegularChunkGrid(chunk_shape=chunks)` took only chunk sizes; `ChunkGrid.from_sizes(shape, chunks)` also requires the array shape. The `shape` parameter is already available at this call site.
If cubed needs to support both old and new zarr-python:
```python
def _create_zarr_indexer(selection, shape, chunks):
    if zarr.__version__[0] == "3":
        from zarr.core.indexing import OrthogonalIndexer
        from zarr.core.chunk_grids import ChunkGrid

        # On current main, ChunkGrid exists but is an abstract base class
        # without from_sizes, so feature-detect rather than catching
        # ImportError (which would never fire on old zarr 3.x).
        if hasattr(ChunkGrid, "from_sizes"):
            return OrthogonalIndexer(selection, shape, ChunkGrid.from_sizes(shape, chunks))
        from zarr.core.chunk_grids import RegularChunkGrid

        return OrthogonalIndexer(selection, shape, RegularChunkGrid(chunk_shape=chunks))
    else:
        from zarr.indexing import OrthogonalIndexer

        return OrthogonalIndexer(selection, ZarrArrayIndexingAdaptor(shape, chunks))
```

**Shim 1: `__getattr__` in `chunk_grids.py` (~15 lines)**
Maps `RegularChunkGrid` to `RegularChunkGridMetadata` with a deprecation warning. Covers:

- The `from zarr.core.chunk_grids import RegularChunkGrid` import pattern (used by cubed and VirtualiZarr's `parsers/zarr.py`)
- `isinstance(x, RegularChunkGrid)` checks (because the name resolves to the actual class)
- `RegularChunkGrid(chunk_shape=(...))` construction (because `RegularChunkGridMetadata` accepts the same arguments)

Does not cover: passing the result to `OrthogonalIndexer`, because `RegularChunkGridMetadata` lacks `._dimensions`.
**Shim 2: `__getattr__` in `metadata/v3.py` (~12 lines)**
Same pattern, covers VirtualiZarr's import from `zarr.core.metadata.v3`. Mirrors Shim 1 for a different import path.
**Shim 3: Auto-coerce `ChunkGridMetadata` in indexer constructors (~30 lines)**

A helper function + 1-line insertion in each of `BasicIndexer`, `OrthogonalIndexer`, `CoordinateIndexer`, and `MaskIndexer`:
```python
import warnings


def _resolve_chunk_grid(chunk_grid, shape):
    """Coerce ChunkGridMetadata to runtime ChunkGrid if needed."""
    from zarr.core.chunk_grids import ChunkGrid as _ChunkGrid
    from zarr.core.metadata.v3 import ChunkGridMetadata

    if isinstance(chunk_grid, _ChunkGrid):
        return chunk_grid
    if isinstance(chunk_grid, ChunkGridMetadata):
        warnings.warn(
            "Passing ChunkGridMetadata to indexers is deprecated. "
            "Use ChunkGrid.from_sizes() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        if hasattr(chunk_grid, "chunk_shape"):
            return _ChunkGrid.from_sizes(shape, tuple(chunk_grid.chunk_shape))
        return _ChunkGrid.from_sizes(shape, chunk_grid.chunk_shapes)
    raise TypeError(f"Expected ChunkGrid or ChunkGridMetadata, got {type(chunk_grid)}")
```

This covers cubed's `OrthogonalIndexer(selection, shape, RegularChunkGrid(...))` pattern end-to-end (combined with Shim 1).
| | No shims | Shims 1+2 only | Shims 1+2+3 |
|---|---|---|---|
| zarr-python additions | 0 lines | ~27 lines | ~57 lines |
| VirtualiZarr changes | 5 lines | 0 lines | 0 lines |
| Cubed changes | 3 lines | 3 lines | 0 lines |
| Maintenance burden | None | Low (deprecation shims are well-understood) | Medium (indexer coercion blurs metadata/runtime boundary) |
| API clarity | Clean (metadata DTOs and runtime types are distinct) | Good (old names redirect to new names) | Weaker (indexers implicitly accept two type families) |
With Shims 1+2 only, VirtualiZarr's `manifests/array.py` import from `zarr.core.metadata.v3` is covered by Shim 2, and the `parsers/zarr.py` import from `zarr.core.chunk_grids` is covered by Shim 1. The `isinstance` checks work because both shims resolve to `RegularChunkGridMetadata`. The cast works because `.chunk_shape` is unchanged. So VirtualiZarr needs 0 changes with Shims 1+2. The 3 lines for cubed remain because Shim 1 resolves the import but `OrthogonalIndexer` still needs a runtime `ChunkGrid`.
Migration from `main` (where only `RegularChunkGrid` and the abstract `ChunkGrid` ABC exist):

| Old pattern (on `main`) | New pattern |
|---|---|
| `from zarr.core.chunk_grids import RegularChunkGrid` | `from zarr.core.metadata.v3 import RegularChunkGridMetadata` |
| `from zarr.core.chunk_grids import ChunkGrid` (ABC) | `from zarr.core.chunk_grids import ChunkGrid` (concrete class, different API) |
| `isinstance(cg, RegularChunkGrid)` | `isinstance(cg, RegularChunkGridMetadata)` or `grid.is_regular` on the runtime `ChunkGrid` |
| `cg.chunk_shape` on `RegularChunkGrid` | `cg.chunk_shape` on `RegularChunkGridMetadata` (unchanged) |
| `ChunkGrid.from_dict(data)` | `parse_chunk_grid(data)` from `zarr.core.metadata.v3` |
| `chunk_grid.all_chunk_coords(array_shape)` | `chunk_grid.all_chunk_coords()` (shape now stored in grid) |
| `chunk_grid.get_nchunks(array_shape)` | `chunk_grid.get_nchunks()` (shape now stored in grid) |
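The `from_sizes` rows hint at why the array shape moved into the grid: expanding each dimension's chunk spec into an explicit size list requires the extent. A toy sketch of that normalization (a hypothetical `normalize_chunks` helper, not zarr-python's implementation; trimming the final chunk to the extent follows the regular-bounded convention rather than fill-value padding):

```python
def normalize_chunks(shape, chunks):
    """Expand per-dimension chunk specs into explicit size lists.

    Toy sketch only: an int spec yields a uniform dimension whose final
    chunk is trimmed to the extent; a sequence spec is taken verbatim and
    must tile the extent exactly.
    """
    out = []
    for extent, spec in zip(shape, chunks):
        if isinstance(spec, int):
            full, rem = divmod(extent, spec)
            sizes = [spec] * full + ([rem] if rem else [])
        else:
            sizes = list(spec)
            if sum(sizes) != extent:
                raise ValueError(f"chunk sizes {sizes} do not sum to extent {extent}")
        out.append(sizes)
    return out


print(normalize_chunks((25, 40), (10, [5, 12, 3, 20])))
# [[10, 10, 5], [5, 12, 3, 20]]
```

Storing the shape in the grid means this expansion happens once at construction, so `all_chunk_coords()` and `get_nchunks()` no longer need the shape passed in.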
During the earlier #3534 effort (which used separate RegularChunkGrid/RectilinearChunkGrid classes), downstream PRs and issues were opened to explore compatibility:
These target #3534's API, not this branch's unified `ChunkGrid` design. New downstream POC branches for this design are linked in Proofs of concepts.
This implementation builds on prior work:
- #3534 (@jhamman) — RLE helpers, validation logic, test cases, and the review discussion that shaped the architecture.
- #3737 — extent-in-grid idea (adopted per-dimension).
- #1483 — original variable chunking POC.
- #3736 — resolved by storing extent per-dimension.
- Resize defaults (deferred): When growing a rectilinear array, should `resize()` accept an optional `chunks` parameter? See the Resize section for details and open design questions. Regular arrays already stay regular on resize.
- `ChunkSpec` complexity: `ChunkSpec` carries both `slices` and `codec_shape`. Should the grid expose separate methods for codec vs data queries instead?
- `__getitem__` with slices: Should `grid[0, :]` or `grid[0:3, :]` return a sub-grid or an iterator of `ChunkSpec`s?
- Uniform nested lists: Should `chunks=[[10, 10], [20, 20]]` serialize as `"rectilinear"` (preserving user intent for future append) or `"regular"` (current behavior, collapses uniform edges)? See User control over grid serialization format.
- `zarr.open` with rectilinear: @tomwhite noted in #3534 that `zarr.open(mode="w")` doesn't support rectilinear chunks directly. This could be addressed in a follow-up.
- Zarr-Python:
- Xarray:
- VirtualiZarr:
- Virtual TIFF:
- Cubed:
- Microbenchmarks: