Closed
Description
Describe the bug, including details regarding any error messages, version, and platform.
Consider the following path:
In [1]: import pyarrow as pa
   ...: import uuid
   ...:
   ...: schema = pa.schema(
   ...:     [
   ...:         pa.field("uuid", pa.uuid(), nullable=False),
   ...:     ]
   ...: )
   ...:
   ...: arr_table = pa.Table.from_pydict(
   ...:     {
   ...:         "uuid": [
   ...:             uuid.UUID("00000000-0000-0000-0000-000000000000").bytes,
   ...:             uuid.UUID("11111111-1111-1111-1111-111111111111").bytes,
   ...:         ],
   ...:     },
   ...:     schema=schema,
   ...: )
   ...:
   ...: import pyarrow.parquet as pq
   ...:
   ...: with pq.ParquetWriter("/tmp/some-parquet-with-uuid.parquet", schema=schema) as writer:
   ...:     writer.write(arr_table)

> parq /tmp/some-parquet-with-uuid.parquet -s
# Schema
<pyarrow._parquet.ParquetSchema object at 0x105cd0480>
required group field_id=-1 schema {
  required fixed_len_byte_array(16) field_id=-1 uuid;
}
Example of one that was created by Iceberg (Java):
parq /var/folders/h0/wqtwn1ks0m3bksc8n7mp4_lr0000gp/T/hive12405183144450566152/table/data/uuid_bucket=7/00000-1-daaf21e3-fbee-4954-88d8-ea0371f62a6a-0-00001.parquet -s
# Schema
<pyarrow._parquet.ParquetSchema object at 0x1058ce840>
required group field_id=-1 table {
  required fixed_len_byte_array(16) field_id=1 uuid (UUID);
}
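The (UUID) suffix is the Parquet UUID logical type annotation. The Iceberg-written file carries it, while the file written by PyArrow above exposes the column only as fixed_len_byte_array(16), so readers that rely on the annotation may not recognize the column as a UUID.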
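Without the annotation, a reader gets back raw 16-byte values and has to know out of band that they are UUIDs. The following is a minimal sketch of that manual conversion using only the standard library. `raw_values` and `to_uuids` are hypothetical names introduced for illustration, with `raw_values` standing in for whatever a reader returns for the unannotated column:

```python
import uuid

# Hypothetical stand-in for values read back from the unannotated
# fixed_len_byte_array(16) column: plain bytes, not UUID objects.
raw_values = [
    uuid.UUID("00000000-0000-0000-0000-000000000000").bytes,
    uuid.UUID("11111111-1111-1111-1111-111111111111").bytes,
]


def to_uuids(values):
    """Convert 16-byte big-endian values back into uuid.UUID objects."""
    out = []
    for value in values:
        # A UUID column is exactly 16 bytes wide; reject anything else.
        if len(value) != 16:
            raise ValueError(f"expected 16 bytes, got {len(value)}")
        out.append(uuid.UUID(bytes=value))
    return out


decoded = to_uuids(raw_values)
print([str(u) for u in decoded])
```

With the (UUID) annotation present in the file, a reader can in principle do this conversion itself instead of leaving it to every consumer.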
Component(s)
Parquet
torchss