Skip to content

Conversation

@tustvold
Copy link
Contributor

@tustvold tustvold commented Nov 1, 2023

Which issue does this PR close?

Closes apache/arrow-rs-object-store#113

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

@tustvold
Copy link
Contributor Author

tustvold commented Nov 1, 2023

@roeap perhaps you might be able to take a look at this one?

@roeap
Copy link
Contributor

roeap commented Nov 2, 2023

@tustvold - sorry was a bit to late here, but generally always happy to review :).

The more general question around which characters are encoded or not actually came up a little while ago on delta-rs. Essentially Spark can write certain file path (that would be illegal on windows, but are valid on other os i think) that we cannot read as we consider these characters illegal. Haven't seen that edge case in the wild yet, just discovered it trying to do roundtrip tests that contain a bunch of special characters in a partition value and thus show up in the path ...

I dismissed that however as something we have to live with, as concistency across platforms and providers is a major aim for object store, right?

@tustvold
Copy link
Contributor Author

tustvold commented Nov 2, 2023

I dismissed that however as something we have to live with, as concistency across platforms and providers is a major aim for object store, right?

This PR is simply fixing a bug related to interpreting URLs as well URL-encoded URLs 😅

The broader issue of path safety, and has been relaxed by #5020 with more context on the rationale in apache/arrow-rs-object-store#112. It can broadly be summarized as "the current restrictions were insufficient and added friction, and whilst everything would be better if people didn't have special characters in paths, we have to meet our users in the middle by encouraging against it but not forbidding it". I'd be interested to hear your thoughts on this, the RC has been cut, but it isn't too late if you feel very strongly.

alamb pushed a commit to alamb/arrow-rs that referenced this pull request Mar 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ObjectStore parse_url Incorrectly Handles URLs with Spaces

3 participants