@raulcd
Member

@raulcd raulcd commented Apr 20, 2023

Rationale for this change

GCS could be enabled on Windows wheels.

What changes are included in this PR?

Enabling GCS on the Windows wheels.

Are these changes tested?

Crossbow jobs for wheels run tests for GCS now.
I have tested locally that I can install the built wheel and I can import GcsFileSystem:

Python 3.9.12 (main, Apr  4 2022, 05:22:27) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32

Warning:
This Python interpreter is in a conda environment, but the environment has
not been activated.  Libraries may fail to load.  To activate this environment
please see https://conda.io/activation

Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> from pyarrow.fs import GcsFileSystem
>>> fs = GcsFileSystem(access_token='abc', target_service_account='service_account@apache', credential_token_expiration=datetime.now(), default_bucket_location='us-west2', scheme='https', endpoint_override='localhost:8999')
>>> fs.default_bucket_location
'us-west2'
>>> fs.create_dir('hello')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow\_fs.pyx", line 593, in pyarrow._fs.FileSystem.create_dir
  File "pyarrow\error.pxi", line 113, in pyarrow.lib.check_status
PermissionError: [Errno 13] google::cloud::Status(UNAUTHENTICATED: Permanent error GetBucketMetadata: Could not create a OAuth2 access token to authenticate the request. The request was not sent, as such an access token is required to complete the request successfully. Learn more about Google Cloud authentication at https://cloud.google.com/docs/authentication. The underlying error message was: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.). Detail: [errno 13] Permission denied
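
The import check from the session above can also be scripted, so that it can be run against any installed wheel. This is a minimal sketch: `has_gcs_support` is a hypothetical helper name, not part of pyarrow, and it only verifies that `GcsFileSystem` can be imported, not that any credentials work.

```python
def has_gcs_support():
    """Return True if GcsFileSystem can be imported from pyarrow.fs.

    Hypothetical helper for illustration only. It returns False both when
    pyarrow is not installed and when the installed build does not provide
    GcsFileSystem, since either case surfaces as an ImportError here.
    """
    try:
        from pyarrow.fs import GcsFileSystem  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("GCS support available:", has_gcs_support())
```

The `PermissionError` in the traceback is consistent with the dummy `access_token='abc'` being rejected, so the import and the constructor call are the parts this session actually demonstrates.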

Are there any user-facing changes?

No, but Windows wheels should now be built with ARROW_GCS enabled.

@github-actions

⚠️ GitHub issue #35193 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting committer review label Apr 20, 2023
@raulcd
Member Author

raulcd commented Apr 20, 2023

@github-actions crossbow submit wheel-windows-cp311-amd64

@github-actions

Revision: 5246fc5be2ed942a5688b29a05700c0673d820d2

Submitted crossbow builds: ursacomputing/crossbow @ actions-43da90de90

Task Status
wheel-windows-cp311-amd64 Github Actions

@raulcd
Member Author

raulcd commented Apr 20, 2023

@github-actions crossbow submit wheel-windows-cp311-amd64

@github-actions

Command '['git', 'clone', 'https://github.com/ursacomputing/crossbow', '/tmp/tmptzeyqnko/crossbow']' returned non-zero exit status 128.
The Archery job run can be found at: https://github.com/apache/arrow/actions/runs/4756107770

@raulcd
Member Author

raulcd commented Apr 20, 2023

@github-actions crossbow submit wheel-windows-cp311-amd64

@github-actions

Revision: a68e3737990ccd7b7bc3d41236f5156c139c1759

Submitted crossbow builds: ursacomputing/crossbow @ actions-e67a78785a

Task Status
wheel-windows-cp311-amd64 Github Actions

@raulcd
Member Author

raulcd commented Apr 27, 2023

@github-actions crossbow submit wheel-windows-cp39-amd64

@github-actions

Revision: a68e3737990ccd7b7bc3d41236f5156c139c1759

Submitted crossbow builds: ursacomputing/crossbow @ actions-cd5876c7ff

Task Status
wheel-windows-cp39-amd64 Github Actions

@raulcd
Member Author

raulcd commented Apr 27, 2023

@github-actions crossbow submit wheel-windows-cp39-amd64

@github-actions

Revision: cb95927568e66d1be2044ff73e00fbe98ddf0318

Submitted crossbow builds: ursacomputing/crossbow @ actions-bb930e9684

Task Status
wheel-windows-cp39-amd64 Github Actions

@raulcd
Member Author

raulcd commented Apr 27, 2023

@github-actions crossbow submit wheel-windows-*

@raulcd raulcd marked this pull request as ready for review April 27, 2023 14:44
@raulcd raulcd requested review from assignUser and kou as code owners April 27, 2023 14:44
@raulcd raulcd changed the title from "WIP: GH-35193: [Python][Packaging] Enable GCS on Windows wheels" to "GH-35193: [Python][Packaging] Enable GCS on Windows wheels" Apr 27, 2023
@github-actions

Revision: 8b0ecc226649fb165e89e39b81a859290f223af7

Submitted crossbow builds: ursacomputing/crossbow @ actions-af464cb5b5

Task Status
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions
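
The `wheel-windows-*` pattern in the submit command matched the five tasks listed above. As an illustration only, shell-style pattern matching over task names can be sketched with Python's `fnmatch` module. This is not Crossbow's implementation; `expand_task_pattern` and the non-Windows entry in `KNOWN_TASKS` are hypothetical.

```python
from fnmatch import fnmatchcase

# The five Windows wheel tasks reported in this thread, plus one made-up
# entry that exists only to show a non-matching name.
KNOWN_TASKS = [
    "wheel-windows-cp37-amd64",
    "wheel-windows-cp38-amd64",
    "wheel-windows-cp39-amd64",
    "wheel-windows-cp310-amd64",
    "wheel-windows-cp311-amd64",
    "example-other-task",  # hypothetical, for illustration
]


def expand_task_pattern(pattern, tasks):
    """Return the task names matching a shell-style pattern, sorted."""
    return sorted(name for name in tasks if fnmatchcase(name, pattern))


if __name__ == "__main__":
    for task in expand_task_pattern("wheel-windows-*", KNOWN_TASKS):
        print(task)
```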

@raulcd
Member Author

raulcd commented May 10, 2023

@github-actions crossbow submit wheel-windows-*

@github-actions

Revision: b2b891a

Submitted crossbow builds: ursacomputing/crossbow @ actions-5e23d788f1

Task Status
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@raulcd
Member Author

raulcd commented May 11, 2023

@github-actions crossbow submit wheel-windows-*

@github-actions

Revision: 50efe4e

Submitted crossbow builds: ursacomputing/crossbow @ actions-20194149b9

Task Status
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@raulcd
Member Author

raulcd commented May 11, 2023

@westonpace, while enabling GCS for the Windows wheels I realised that Appveyor wasn't actually setting ARROW_GCS=ON, and arrow-gcsfs-test fails on Appveyor if I enable it. For example:

[  FAILED  ] GcsIntegrationTest.CreateDirRecursiveFolderOnly (0 ms)
[ RUN      ] GcsIntegrationTest.CreateDirRecursiveBucketAndFolder
C:/projects/arrow/cpp/src/arrow/filesystem/gcsfs_test.cc(170): error: Expected equality of these values:
  Testbench()->error()
    Which is: "Could not start GCS emulator. Used the following list of python interpreter names: 3.10 (exe not found)"
  ""
[  FAILED  ] GcsIntegrationTest.CreateDirRecursiveBucketAndFolder (0 ms)
[ RUN      ] GcsIntegrationTest.CreateDirUri

I can move fixing Appveyor to its own issue/PR, as it's slightly different from enabling GCS for the Windows wheels, which, as the wheel jobs show, is successful. Do you think Appveyor should be fixed here? Any idea what the issue/fix is?

@westonpace
Member

> Do you think Appveyor should be fixed here? Any idea what the issue/fix is?

It would be nice to have at least one Windows CI job that uses GCS. I don't know if the others do or not.

> Any idea what the issue/fix is?

Testbench is a standalone Python project that emulates GCS, much as MinIO emulates S3. The C++ unit tests launch Testbench using boost::process. Since Testbench is a Python project, the C++ tests need to be able to locate and run Python. On Windows, this appears not to be working.

Here is how the code currently looks for Python:

    std::vector<std::string> names{"python3", "python"};
    // If the build script or application developer provides a value in the PYTHON
    // environment variable, then just use that.
    if (const auto* env = std::getenv("PYTHON")) {
      names = {env};
    }
    auto error = std::string(
        "Could not start GCS emulator."
        " Used the following list of python interpreter names:");
    for (const auto& interpreter : names) {
      auto exe_path = bp::search_path(interpreter);
      error += " " + interpreter;
      if (exe_path.empty()) {
        error += " (exe not found)";
        continue;
      }

So maybe it's as easy as setting the PYTHON environment variable to point to a valid python executable?
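
To make the lookup easier to experiment with outside the C++ test binary, the excerpt above can be mirrored in Python. This is a sketch under assumptions: `find_python_interpreter` is a hypothetical name, and `shutil.which` stands in for `bp::search_path`, which may resolve names differently (for example, executable extensions on Windows).

```python
import os
import shutil


def find_python_interpreter(environ=None):
    """Mirror the interpreter lookup from the C++ excerpt (illustration only).

    Returns (path, None) on success or (None, error_message) on failure.
    """
    environ = os.environ if environ is None else environ
    names = ["python3", "python"]
    # If PYTHON is set, it replaces the default candidate list entirely.
    override = environ.get("PYTHON")
    if override is not None:
        names = [override]
    error = ("Could not start GCS emulator."
             " Used the following list of python interpreter names:")
    for name in names:
        exe_path = shutil.which(name)  # stand-in for bp::search_path
        error += " " + name
        if exe_path is None:
            error += " (exe not found)"
            continue
        return exe_path, None
    return None, error


if __name__ == "__main__":
    print(find_python_interpreter())
    # A version string instead of an executable name or path:
    print(find_python_interpreter({"PYTHON": "3.10"}))
```

The Appveyor error lists only `3.10` as a candidate, which suggests `PYTHON` was already set there, but to a version string rather than to an executable. If so, pointing `PYTHON` at an actual interpreter path would be the change to try.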

@raulcd
Member Author

raulcd commented May 12, 2023

@github-actions crossbow submit wheel-windows-*

@github-actions

Revision: e1d51a8

Submitted crossbow builds: ursacomputing/crossbow @ actions-20cbf52fd6

Task Status
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

@raulcd
Member Author

raulcd commented May 12, 2023

Thanks @westonpace, this should be ready for review now

@raulcd raulcd requested a review from westonpace May 12, 2023 11:19
@github-actions github-actions bot added awaiting changes and removed awaiting committer review labels May 12, 2023
@github-actions github-actions bot added awaiting change review, awaiting changes and removed awaiting changes, awaiting change review labels May 12, 2023
@kou
Member

kou commented May 12, 2023

@github-actions crossbow submit wheel-windows-*

@github-actions

Revision: cf41c23

Submitted crossbow builds: ursacomputing/crossbow @ actions-fe9442d7db

Task Status
wheel-windows-cp310-amd64 Github Actions
wheel-windows-cp311-amd64 Github Actions
wheel-windows-cp37-amd64 Github Actions
wheel-windows-cp38-amd64 Github Actions
wheel-windows-cp39-amd64 Github Actions

Member

@kou kou left a comment

+1

@github-actions github-actions bot added awaiting merge and removed awaiting changes labels May 12, 2023
@assignUser assignUser merged commit 8be70c1 into apache:main May 16, 2023
@ursabot

ursabot commented May 16, 2023

Benchmark runs are scheduled for baseline = 2d76d9a and contender = 8be70c1. 8be70c1 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.81% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.15% ⬆️0.03%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 8be70c13 ec2-t3-xlarge-us-east-2
[Finished] 8be70c13 test-mac-arm
[Finished] 8be70c13 ursa-i9-9960x
[Finished] 8be70c13 ursa-thinkcentre-m75q
[Finished] 2d76d9a5 ec2-t3-xlarge-us-east-2
[Finished] 2d76d9a5 test-mac-arm
[Finished] 2d76d9a5 ursa-i9-9960x
[Finished] 2d76d9a5 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@westonpace
Member

Thanks @raulcd for enabling this!

Labels

awaiting merge Awaiting merge

Development

Successfully merging this pull request may close these issues.

[Python] Windows wheel is built without GCS support

5 participants