GH-48455: [Python] Handle nested field names when sanitizing table at ParquetWriter (flavor='spark') #48456

HyukjinKwon · 2025-12-11T08:16:54Z

Rationale for this change

arrow/python/pyarrow/parquet/core.py

Line 743 in 7a36fcc

# TODO: This will not handle prohibited characters in nested field names

The sanitization was first introduced in 4a6a6cb, but that change did not implement the logic to handle nested types.

What changes are included in this PR?

This PR implements nested handling when sanitizing field names in ParquetWriter (flavor='spark').
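To illustrate what nested handling means here, the following is a minimal standard-library sketch, not the code from this PR. It models a schema as plain tuples instead of pyarrow types, and it assumes the disallowed character set is the one Spark rejects in Parquet column names. The names `sanitize_name`, `sanitize_type` and `sanitize_schema` are hypothetical.

```python
import re

# Assumption: characters that Spark rejects in Parquet column names.
_DISALLOWED = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_name(name):
    """Replace each disallowed character with an underscore."""
    return _DISALLOWED.sub("_", name)


def sanitize_type(dtype):
    """Recursively sanitize field names inside a toy type description.

    A type is a string (primitive), ("list", element_type),
    or ("struct", [(field_name, field_type), ...]).
    """
    if isinstance(dtype, str):
        return dtype
    kind, payload = dtype
    if kind == "list":
        return ("list", sanitize_type(payload))
    if kind == "struct":
        return ("struct",
                [(sanitize_name(n), sanitize_type(t)) for n, t in payload])
    raise ValueError(f"unknown type kind: {kind!r}")


def sanitize_schema(fields):
    """Sanitize top-level names and descend into every nested type."""
    return [(sanitize_name(n), sanitize_type(t)) for n, t in fields]


# Hypothetical schema with prohibited characters at several depths.
schema = [
    ("user id", "int64"),
    ("profile", ("struct", [
        ("first name", "string"),
        ("tags;raw", ("list", ("struct", [("k=v", "string")]))),
    ])),
]
sanitized = sanitize_schema(schema)
```

A sanitizer that only rewrote top-level names would turn "user id" into "user_id" but leave "first name", "tags;raw" and "k=v" untouched, which is the gap described by the TODO above.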

Are these changes tested?

Unit tests were added, and the change was tested manually via:

pytest python/pyarrow/tests/parquet/test_basic.py -k 'sanitized_spark' -v

Are there any user-facing changes?

Yes. Nested field names are now sanitized as well when ParquetWriter is used with flavor set to 'spark'.

GitHub Issue: [Python] Handle nested field names when sanitizing table at ParquetWriter when flavor='spark' #48455


github-actions · 2025-12-11T08:17:22Z

⚠️ GitHub issue #48455 has been automatically assigned in GitHub to PR creator.

Commit f8d38e9: Handle nested field names when sanitizing table at ParquetWriter when… flavor='spark'

HyukjinKwon requested review from AlenkaF, raulcd and rok as code owners December 11, 2025 08:16

github-actions bot added the "Component: Python" and "awaiting review" labels Dec 11, 2025
