Skip to content

Conversation

@pitrou
Copy link
Member

@pitrou pitrou commented Dec 20, 2021

An occasional misunderstanding is to pass a URI to filesystem methods, where a regular path is expected.

Make these situations easier to diagnose by raising a specific error.

An occasional misunderstanding is to pass a URI to filesystem methods, where a regular path is expected.

Make these situations easier to diagnose by raising a specific error.
@github-actions
Copy link

@pitrou
Copy link
Member Author

pitrou commented Dec 20, 2021

@coryan Would you like to take a look at the GCS changes here?

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a minor nit in the tests.


TEST_F(GcsIntegrationTest, OpenInputStreamUri) {
auto fs = internal::MakeGcsFileSystemForTest(TestGcsOptions());
ASSERT_RAISES(Invalid, fs->OpenInputStream("gcs:" + PreexistingObjectPath()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just for testing purposes, but the conventional prefix for GCS is gs://, at least that is what gsutil uses:

https://cloud.google.com/storage/docs/gsutil#syntax

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thank you.
By the way, it seems other packages have adopted the gcs:// convention: https://gcsfs.readthedocs.io/en/latest/#integration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh... My guess (and let me stress that this would be just a guess) is that gs:// is more familiar to GCS users. I can live with gcs:// too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a reasonable change to me. Maybe drop a note that colons are not supported in filenames here (https://github.com/apache/arrow/blob/master/docs/source/cpp/io.rst#filesystems) where we mention . and .. are not supported?

static Result<S3Path> FromString(const std::string& s) {
if (internal::IsLikelyUri(s)) {
return Status::Invalid(
"Expected a S3 object path of the form 'bucket/key...', got a URI: '", s, "'");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Expected a S3 object path of the form 'bucket/key...', got a URI: '", s, "'");
"Expected an S3 object path of the form 'bucket/key...', got a URI: '", s, "'");

@pitrou
Copy link
Member Author

pitrou commented Dec 20, 2021

Maybe drop a note that colons are not supported in filenames here (https://github.com/apache/arrow/blob/master/docs/source/cpp/io.rst#filesystems) where we mention . and .. are not supported?

Hmm, they should be more or less supported except in the first segment of a path.

@pitrou pitrou closed this in 238b363 Dec 20, 2021
@pitrou pitrou deleted the ARROW-10998-uri-path-mismatch branch December 20, 2021 19:55
@ursabot
Copy link

ursabot commented Dec 20, 2021

Benchmark runs are scheduled for baseline = cfcce5a and contender = 238b363. 238b363 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️2.24% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️1.24% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@jorisvandenbossche
Copy link
Member

Cool, this is a nice usability improvement!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants