Skip to content

key space for chunk manifest #71

@d-v-b

Description

@d-v-b

if I understand correctly, the keys of the chunk manifest are chunk ids, which are strings like "0.0.0". If a zarr array has a chunk size of [10, 10, 10], then we can think of "0.0.0" as denoting the cube of indices [[0...9], [0...9], [0...9]], which in turn denotes the region in the virtual array associated with the data in a chunk.

If a Zarr array is sliced arbitrarily, then some of its chunks might be sliced before inserting into the Zarr array, in which case the region associated with that (chunk , slice) combination is not the full cube of indices associated with the string chunk id, but a subse of those indices (and the space of the subsets is determined by the slicing operations that are allowed).

So I wonder what would break in virtualizarr if the name of a chunk was changed from the chunk ID string to a sliceable expression of the output region associated with that chunk (and if the slicing operation was also expressible as the value associated with that key). This would define an abstract representation of the output region : chunk relationship that could be used internally in Zarr for lazy slicing, which I would like a lot :) If it's not workable to make this change, it would be great to know why so we can avoid it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    zarr-pythonRelevant to zarr-python upstream

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions