[Python] support for complex64 and complex128 as primitive types for zero-copy interop with numpy #39753

@maupardh1

Description

Describe the enhancement requested

Hello Arrow team,

First, I love Arrow - thank you so much for making this great project.

I am manipulating multi-dimensional, time-series-like array data produced by sensors as numpy arrays of type complex64. I would like to manipulate them in Arrow for recording (Feather and/or Parquet formats) and, in the future, for distributed computing (cuDF, Dask, Spark - frameworks most likely built on top of Arrow - but also CuPy/SciPy). This would also let me attach column names and other schema metadata, and I think it could be superior to (and faster than) manipulating raw numpy arrays.

It would be great to just call pa.array(np.array([1 + 2 * 1j, 3 + 4 * 1j], dtype=np.complex64), type=pa.complex64()), but that type does not exist in Arrow. I have not found a way to move a complex64 numpy array into a pyarrow array with zero copies (my understanding is that only primitive types support zero-copy between Arrow and numpy, and my attempts with pa.binary(8) or struct types on top of numpy views have so far resulted in copies). I would also need to read the data back from Feather/Parquet and, when needed, convert it back to a numpy array of dtype np.complex64.

I think this has come up once or twice already: I found this mailing-list thread: https://www.mail-archive.com/[email protected]/msg23352.html and this PR: #10452, and thought I would +1 this request, just in case.

If there is no first-class support, do you see an alternative way to get zero-copy behavior?

Thanks!

Component(s)

Python
