Merged
Changes from 13 commits
103 changes: 103 additions & 0 deletions docs/source/format/CanonicalExtensions.rst
@@ -148,6 +148,109 @@ Fixed shape tensor
by this specification. Instead, this extension type lets one use fixed shape tensors
as elements in a field of a RecordBatch or a Table.

.. _variable_shape_tensor_extension:

Variable shape tensor
=====================

* Extension name: `arrow.variable_shape_tensor`.

* The storage type of the extension is a ``StructArray`` whose struct
is composed of **data** and **shape** fields describing a single
tensor per row:

* **data** is a ``List`` holding tensor elements of a single tensor.
Data type of the list elements is uniform across the entire column.
* **shape** is a ``FixedSizeList<int32>[ndim]`` of the tensor shape where
the size of the list ``ndim`` is equal to the number of dimensions of the
tensor.

* Extension type parameters:

* **value_type** = the Arrow data type of individual tensor elements.

Optional parameters describing the logical layout:

* **dim_names** = explicit names of the tensor dimensions,
given as an array. Its length should equal the length of the
shape, that is, the number of dimensions.

``dim_names`` can be used if the dimensions have well-known
names and they map to the physical layout (row-major).

* **permutation** = indices of the desired ordering of the
original dimensions, defined as an array.

The indices contain a permutation of the values [0, 1, .., N-1] where
N is the number of dimensions. The permutation indicates which
dimension of the logical layout corresponds to which dimension of the
physical tensor (the i-th dimension of the logical view corresponds
to the dimension with number ``permutation[i]`` of the physical tensor).

Permutation can be useful in case the logical order of
the tensor is a permutation of the physical order (row-major).

When logical and physical layout are equal, the permutation will always
be ([0, 1, .., N-1]) and can therefore be left out.

* **uniform_shape** = sizes of the dimensions of individual tensors are
guaranteed to stay constant in uniform dimensions and can vary in
non-uniform dimensions. This holds over all tensors in the array.
Sizes in uniform dimensions are represented with int32 values, while
sizes of the non-uniform dimensions are not known in advance and are
represented with 0s. If ``uniform_shape`` is not provided it is assumed
Member: Should we rather take "-1" instead of "0"? We have some other places where we use -1 for "unknown" (e.g. null counts).

Member: Or JSON supports null, which maps quite naturally to what we are trying to express, IMHO.

rok (Member Author), Oct 5, 2023: Switched language to null. This is similar to what PyTorch and TensorFlow do (they use None in Python).

that all dimensions are non-uniform.
An array containing a tensor with shape (2, 3, 4) and whose first and
last dimensions are uniform would have ``uniform_shape`` (2, 0, 4).
This allows for interpreting the tensor correctly without accounting for
uniform dimensions while still permitting optional optimizations that
take advantage of the uniformity.
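
As a rough illustration of the ``uniform_shape`` rule, the following Python sketch checks whether a set of tensor shapes is consistent with a given ``uniform_shape``. The helper name ``conforms_to_uniform_shape`` is hypothetical and not part of any Arrow API. It follows the wording above in treating 0 as the marker for a non-uniform dimension.

```python
def conforms_to_uniform_shape(shapes, uniform_shape=None):
    """Return True if every shape matches uniform_shape in its uniform dimensions.

    Hypothetical helper, not an Arrow API. A 0 in uniform_shape marks a
    non-uniform dimension whose size may vary from tensor to tensor.
    """
    if uniform_shape is None:
        # No uniform_shape given: all dimensions are assumed non-uniform.
        return True
    for shape in shapes:
        if len(shape) != len(uniform_shape):
            return False
        for size, uniform in zip(shape, uniform_shape):
            if uniform != 0 and size != uniform:
                return False
    return True


# Three tensors whose first and last dimensions are uniform.
shapes = [(2, 3, 4), (2, 5, 4), (2, 1, 4)]
ok = conforms_to_uniform_shape(shapes, uniform_shape=(2, 0, 4))

# The second tensor breaks the uniform first dimension.
bad = conforms_to_uniform_shape([(2, 3, 4), (3, 3, 4)], uniform_shape=(2, 0, 4))
```

A reader that trusts ``uniform_shape`` could skip such a check. It is mainly useful when validating data produced elsewhere.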

* Description of the serialization:

The metadata must be a valid JSON object that optionally includes
dimension names with key **"dim_names"** and the ordering of dimensions
with key **"permutation"**.
Shapes of tensors can be defined in a subset of dimensions by providing
key **"uniform_shape"**.
Minimal metadata is an empty JSON object.

- Example of minimal metadata is:

``{}``
Member: Sorry, one more small nitpick: the minimal metadata is actually no metadata, which is typically represented as an empty string (I am actually not fully sure whether in this case the metadata key could also just not be present in the field metadata), instead of an empty JSON dict (I don't think we should necessarily recommend using an empty dict).

Member Author: That's a fair point! Empty string feels like the safer choice here. See my suggested change below.


- Example with ``dim_names`` metadata for NCHW ordered data, where ``N`` is
the row dimension of the array and each row holds one CHW tensor:

``{ "dim_names": ["C", "H", "W"] }``

- Example with ``uniform_shape`` metadata for a set of color images
with variable width:

``{ "dim_names": ["H", "W", "C"], "uniform_shape": [400, 0, 3] }``

- Example of permuted 3-dimensional tensor:

``{ "permutation": [2, 0, 1] }``

The permutation is applied to the physical layout: given an individual
tensor with physical shape ``[100, 200, 500]``, the shape of its logical
layout would be ``[500, 100, 200]``.
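
The serialization rules above can be sketched with Python's standard ``json`` module. The function ``parse_tensor_metadata`` is a hypothetical helper, not part of any Arrow API. Requiring ``uniform_shape`` to have one entry per dimension is an assumption drawn from the examples rather than an explicit rule in the text.

```python
import json


def parse_tensor_metadata(serialized, ndim):
    """Parse and sanity-check extension metadata for tensors with ndim dimensions.

    Hypothetical helper, not an Arrow API.
    """
    metadata = json.loads(serialized)
    if not isinstance(metadata, dict):
        raise ValueError("metadata must be a JSON object")

    dim_names = metadata.get("dim_names")
    if dim_names is not None and len(dim_names) != ndim:
        raise ValueError("dim_names must have one entry per dimension")

    permutation = metadata.get("permutation")
    if permutation is not None and sorted(permutation) != list(range(ndim)):
        raise ValueError("permutation must be a permutation of [0, 1, .., N-1]")

    # Assumption: uniform_shape also carries one entry per dimension.
    uniform_shape = metadata.get("uniform_shape")
    if uniform_shape is not None and len(uniform_shape) != ndim:
        raise ValueError("uniform_shape must have one entry per dimension")

    return metadata


# The color image example from above, and the empty JSON object.
images = parse_tensor_metadata(
    '{ "dim_names": ["H", "W", "C"], "uniform_shape": [400, 0, 3] }', ndim=3
)
minimal = parse_tensor_metadata("{}", ndim=3)
```

All three keys are optional, so the helper only validates the ones that are present.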

.. note::

With the exception of ``permutation``, the parameters and storage
of VariableShapeTensor relate to the *physical* storage of the tensor.

For example, consider a tensor with::

  shape = [10, 20, 30]
  dim_names = [x, y, z]
  permutation = [2, 0, 1]

This means the logical tensor has names [z, x, y] and shape [30, 10, 20].

Elements in a variable shape tensor extension array are stored
in row-major/C-contiguous order.
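
The relationship between physical storage and the logical view can be sketched in plain Python. Both functions below are hypothetical helpers, not part of any Arrow API. The sketch assumes that the **data** list of a row holds exactly as many elements as the product of its **shape**, which the text implies but does not state outright.

```python
from math import prod


def logical_view(shape, permutation=None, dim_names=None):
    """Return (logical_shape, logical_dim_names) for one physical tensor."""
    if permutation is None:
        # Logical and physical layouts are equal.
        permutation = list(range(len(shape)))
    logical_shape = [shape[p] for p in permutation]
    logical_names = None
    if dim_names is not None:
        logical_names = [dim_names[p] for p in permutation]
    return logical_shape, logical_names


def row_major_offset(index, shape):
    """Offset into the flat data list of a physical index, in row-major order."""
    offset = 0
    for i, size in zip(index, shape):
        offset = offset * size + i
    return offset


# Values taken from the example in the note above.
shape = [10, 20, 30]
logical_shape, logical_names = logical_view(shape, [2, 0, 1], ["x", "y", "z"])

# Assumed length of the data list for this tensor, and the offset of its last element.
data_length = prod(shape)
last_offset = row_major_offset([9, 19, 29], shape)
```

Here ``logical_shape`` and ``logical_names`` reproduce the shape ``[30, 10, 20]`` and the names ``[z, x, y]`` given in the note.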

=========================
Community Extension Types
=========================