@kszucs
Member

@kszucs kszucs commented Jan 4, 2021

Also resolves:

  • ARROW-11213: [Packaging][Python] Dockerize wheel building on windows
  • ARROW-11215: [CI] Use named volumes by default for caching in docker-compose
  • ARROW-11231: [Python][Packaging] Enable mimalloc in wheels

Main features:

  • dockerize windows builds
  • use vcpkg as the dependency source, where we can explicitly pin a working version without worrying about the build environment drifting over time
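
As a rough illustration of what pinning vcpkg could look like, the sketch below prints the commands for checking out a fixed vcpkg revision. The helper name `vcpkg_pin_commands`, the destination directory, and the commit placeholder are assumptions made for this example and are not taken from this PR.

```shell
# Hypothetical helper: print (do not run) the commands that would pin vcpkg
# to a given git ref. Nothing here touches the network or the filesystem.
vcpkg_pin_commands() {
  ref="$1"              # commit or tag known to produce a working build
  dest="${2:-vcpkg}"    # checkout directory (assumed default)
  printf '%s\n' \
    "git clone https://github.com/microsoft/vcpkg ${dest}" \
    "git -C ${dest} checkout ${ref}" \
    "${dest}/bootstrap-vcpkg.sh -disableMetrics"
}

# Placeholder ref; substitute a revision that is known to work.
vcpkg_pin_commands "<known-good-commit>"
```

Printing the commands keeps the sketch side-effect free; on Windows the bootstrap script would be `bootstrap-vcpkg.bat` instead.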

Potential follow-up:

  • use vcpkg on macos as well

cc @kou @xhochy

Manylinux testing

Should be straightforward (I'm going to share pre-built images for quicker testing).

Windows testing

Only a windows host is able to run windows containers. I'm virtualizing windows on macOS (this should work on linux as well) using virtualbox:

git clone https://github.com/StefanScherer/windows-docker-machine
cd windows-docker-machine
# grant more resources in the vagrantfile under the virtualbox section (I use 8 cores and 16GiB of ram)
vagrant up --provider virtualbox 2019-box
docker context ls
docker context use 2019-box
docker image ls

Now docker should use the windows docker daemon.
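
One way to double-check that the active context really points at a windows daemon is to inspect the OS type reported by `docker info`. The helper below is a minimal sketch; the function name `require_windows_daemon` is hypothetical, and the `docker info` call is left in a comment so the snippet itself does not need a running daemon.

```shell
# Hypothetical helper: succeed only if the reported daemon OS type is "windows".
require_windows_daemon() {
  os_type="$1"
  if [ "$os_type" = "windows" ]; then
    echo "OK: docker is talking to a windows daemon"
  else
    echo "Unexpected daemon OS type: ${os_type:-unknown}" >&2
    return 1
  fi
}

# Typical use, once the 2019-box context is active:
#   require_windows_daemon "$(docker info --format '{{.OSType}}')"
```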

pip install -e dev/archery[docker]
# comment out the volumes defined in the docker compose file and pass arrow's source directory explicitly instead;
# this is required because compose is executed on a unix host and the directory is mounted through a virtual machine
PYTHON=3.6 archery docker run --no-pull --using-docker-cli -v C:$(pwd):C:/arrow python-wheel-windows-vs2017 cmd.exe
# then within the container execute a build or do other interactive things
arrow\ci\scripts\python_wheel_windows_build.bat

@github-actions

github-actions bot commented Jan 4, 2021

Thanks for opening a pull request!

Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

See also:

@kszucs
Member Author

kszucs commented Jan 5, 2021

Currently integrating with crossbow using pre-built images. The manylinux build already works, except for the artifact uploading.
Pull & build took ~13m: https://github.com/ursa-labs/crossbow/runs/1652174767

Passing build in ~19m, including testing: https://github.com/ursa-labs/crossbow/runs/1652276423

@kszucs
Member Author

kszucs commented Jan 5, 2021

@github-actions crossbow submit wheel-manylinux*

@github-actions

github-actions bot commented Jan 5, 2021

Revision: aeb902f4f50e70fb6be844ecaf1283820bee3c64

Submitted crossbow builds: ursa-labs/crossbow @ actions-825

Task Status
wheel-manylinux2010-cp36m Github Actions
wheel-manylinux2010-cp37m Github Actions
wheel-manylinux2010-cp38 Github Actions
wheel-manylinux2010-cp39 Github Actions
wheel-manylinux2014-cp36m Github Actions
wheel-manylinux2014-cp37m Github Actions
wheel-manylinux2014-cp38 Github Actions
wheel-manylinux2014-cp39 Github Actions
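
The `wheel-manylinux*` argument above is a glob that expands to every matching task name. The sketch below mimics that selection with plain shell pattern matching. It only illustrates the idea and is not crossbow's actual implementation; the `match_tasks` helper is hypothetical and the task list is a small sample of names from this thread.

```shell
# Hypothetical helper: print the task names read from stdin that match a glob.
match_tasks() {
  pattern="$1"
  while IFS= read -r task; do
    case "$task" in
      # Left unquoted on purpose so the shell treats it as a pattern.
      $pattern) printf '%s\n' "$task" ;;
    esac
  done
}

tasks="wheel-manylinux2010-cp36m
wheel-manylinux2014-cp39
wheel-windows-cp36m
wheel-osx-mavericks-cp38"

# Selects only the two manylinux tasks from the sample list.
printf '%s\n' "$tasks" | match_tasks 'wheel-manylinux*'
```

Passing `'wheel-*'` instead would select all four sample tasks, which mirrors the broader submissions later in this thread.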

@kszucs
Member Author

kszucs commented Jan 6, 2021

@xhochy we have a running build for windows on github actions now. I'll do a couple of things to try to improve the build times a bit (make the image smaller, because pulling takes ~15m, and enable unity builds).

Since we don't bundle msvcp140.dll, these wheels don't work in the official windows python docker containers, so I'm using the same image for testing as for building. I may try to bundle it (I guess it should just be a matter of copying it into the archive) to ensure that it works in the base images.

@kszucs
Member Author

kszucs commented Jan 7, 2021

@github-actions crossbow submit wheel-manylinux2010-cp36m wheel-manylinux2014-cp36m wheel-windows-cp36m

@github-actions

github-actions bot commented Jan 7, 2021

Revision: 233aabae02f16ef8edf55b5470cb7ab12c58c920

Submitted crossbow builds: ursa-labs/crossbow @ actions-838

Task Status
wheel-manylinux2010-cp36m Github Actions
wheel-manylinux2014-cp36m Github Actions
wheel-windows-cp36m Github Actions

@kszucs
Member Author

kszucs commented Jan 7, 2021

@kou @xhochy I bundled the msvc runtime. I tested it locally with the stock windows python images and it seems to work fine. Hopefully crossbow will confirm it.

I'm going to write a more detailed description of the changes tomorrow. Any reviews are welcome!

@kou
Member

kou commented Jan 7, 2021

It seems that https://github.com/ursa-labs/crossbow/releases/tag/actions-838-github-wheel-windows-cp36m doesn't have artifacts...
Where did you download the built wheels from?

@kszucs
Member Author

kszucs commented Jan 7, 2021

@github-actions crossbow submit wheel-windows-cp36m

@kszucs
Member Author

kszucs commented Jan 7, 2021

The build also executes the tests in a different docker container, see the "Test Wheel" step in the build.

The following build should properly upload the windows wheel to crossbow releases now.

@github-actions

github-actions bot commented Jan 7, 2021

Revision: 009d9d049ef48e3cd6e4e17126d6093b25331312

Submitted crossbow builds: ursa-labs/crossbow @ actions-842

Task Status
wheel-windows-cp36m Github Actions

Member

@kou kou left a comment

+1

@kszucs
Member Author

kszucs commented Jan 8, 2021

@github-actions crossbow submit wheel-manylinux2010-cp36m wheel-manylinux2014-cp36m wheel-windows-cp36m

@github-actions

github-actions bot commented Jan 8, 2021

Revision: b47406064789942bacaea9d3b1d62d5778796d37

Submitted crossbow builds: ursa-labs/crossbow @ actions-844

Task Status
wheel-manylinux2010-cp36m Github Actions
wheel-manylinux2014-cp36m Github Actions
wheel-windows-cp36m Github Actions

@kszucs
Member Author

kszucs commented Jan 8, 2021

@github-actions crossbow submit wheel-*

@github-actions

github-actions bot commented Jan 8, 2021

Revision: 8081526709fc1642d0de3c305e4bb7cfa8af6912

Submitted crossbow builds: ursa-labs/crossbow @ actions-845

Task Status
wheel-manylinux2010-cp36m Github Actions
wheel-manylinux2010-cp37m Github Actions
wheel-manylinux2010-cp38 Github Actions
wheel-manylinux2010-cp39 Github Actions
wheel-manylinux2014-cp36m Github Actions
wheel-manylinux2014-cp37m Github Actions
wheel-manylinux2014-cp38 Github Actions
wheel-manylinux2014-cp39 Github Actions
wheel-osx-high-sierra-cp36m Github Actions
wheel-osx-high-sierra-cp37m Github Actions
wheel-osx-high-sierra-cp38 Github Actions
wheel-osx-high-sierra-cp39 Github Actions
wheel-osx-mavericks-cp36m Github Actions
wheel-osx-mavericks-cp37m Github Actions
wheel-osx-mavericks-cp38 Github Actions
wheel-osx-mavericks-cp39 Github Actions
wheel-windows-cp36m Github Actions
wheel-windows-cp37m Github Actions
wheel-windows-cp38m Github Actions
wheel-windows-cp39m Github Actions

@kszucs
Member Author

kszucs commented Jan 8, 2021

The osx builds shouldn't be failing, but that's out of the scope of this PR.

@xhochy
Member

xhochy commented Jan 8, 2021

The OSX issue is that protobuf is installed through brew, which doesn't install it using cmake. You could conditionally add the following line

-DgRPC_PROTOBUF_PACKAGE_TYPE=CONFIG
only when protobuf is also built from the bundled source.
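
A minimal sketch of that conditional, assuming the build script knows how protobuf is sourced: the helper name `cmake_protobuf_args` and the use of `-DProtobuf_SOURCE` to carry that choice are assumptions for this example, not part of the PR.

```shell
# Hypothetical helper: print the protobuf-related cmake flags, one per line.
# The gRPC flag is emitted only when protobuf is built as a bundled dependency.
cmake_protobuf_args() {
  protobuf_source="${1:-SYSTEM}"
  printf '%s\n' "-DProtobuf_SOURCE=${protobuf_source}"
  if [ "${protobuf_source}" = "BUNDLED" ]; then
    printf '%s\n' "-DgRPC_PROTOBUF_PACKAGE_TYPE=CONFIG"
  fi
}

# With a brew-installed protobuf (SYSTEM) the extra flag is left out.
cmake_protobuf_args SYSTEM
cmake_protobuf_args BUNDLED
```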

@kszucs kszucs changed the title [Python][Packaging] Refactor manylinux and windows wheel building [WIP] [Python][Packaging] Refactor manylinux and windows wheel building Jan 11, 2021
@kszucs
Member Author

kszucs commented Jan 14, 2021

I submitted the crossbow tasks using the CLI; the in-progress results are promising:
https://github.com/ursacomputing/crossbow/branches/all?query=build-17

@kszucs
Member Author

kszucs commented Jan 14, 2021

@kou this should be ready to merge. One of the arm linux builds is failing with a timeout error in addition to the arm conda drone builds, but these shouldn't block this PR. The latest crossbow submission is here https://github.com/ursacomputing/crossbow/branches/all?query=build-18

@kou
Member

kou commented Jan 14, 2021

Thanks!

I'll merge this. We need to work on some follow-up tasks:

  • Fix .deb/.rpm builds on Travis CI
  • Fix Python version (3.8) in dev/tasks/conda-recipes/.ci_support/osx_python3.7.____cpython.yaml

@kou kou closed this in ddd9c6d Jan 14, 2021
kszucs pushed a commit that referenced this pull request Feb 17, 2021
@kszucs could you review this please? My main purpose in adding this is to improve the experience for Arrow C++ devs using Windows, but I noticed it also relates to your [TODO in #9096](https://github.com/apache/arrow/pull/9096/files#diff-990134cce6657dbbcf95457cf1a56810a7efa1f6cd58ecc27557c7d6ff45b533R67-R68). vcpkg does not have any `requirements.txt`-style package enumeration mechanism, but it supports this JSON manifest as a mechanism of defining dependencies.

In the `vcpkg install` command, you can specify the path to the directory containing this manifest file with `--x-manifest-root` which later will change to `--manifest-root`. See details at https://vcpkg.readthedocs.io/en/stable/specifications/manifests/.

There are some differences between the packages listed in this manifest versus the packages you listed in the `vcpkg install` commands in #9096:
- This installs `gtest` and `benchmark`
- This installs `boost` instead of separate `boost-filesystem`, `boost-regex`, etc.
- This does not explicitly include the `core` feature of `aws-sdk-cpp` because explicitly including it causes an error, and it gets installed anyway

Closes #9287 from ianmcook/ARROW-11340

Lead-authored-by: Ian Cook <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Krisztián Szűcs <[email protected]>
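
For readers unfamiliar with vcpkg manifests, the sketch below writes a small illustrative `vcpkg.json` reflecting the differences listed in the commit message above. The manifest name, version string, output directory, and the `aws-sdk-cpp` feature shown are assumptions for this example; the real manifest in the referenced commit may differ.

```shell
# Write a minimal, illustrative vcpkg manifest into a scratch directory.
manifest_dir="${MANIFEST_DIR:-./vcpkg-manifest-example}"
mkdir -p "$manifest_dir"

cat > "$manifest_dir/vcpkg.json" <<'EOF'
{
  "name": "arrow-example",
  "version-string": "0.0.0",
  "dependencies": [
    "benchmark",
    "boost",
    "gtest",
    {
      "name": "aws-sdk-cpp",
      "default-features": false,
      "features": ["s3"]
    }
  ]
}
EOF

# With vcpkg available, the manifest directory would then be passed as:
#   vcpkg install --x-manifest-root="$manifest_dir"
```

The `core` feature of `aws-sdk-cpp` is deliberately not listed, in line with the note in the commit message that naming it explicitly causes an error.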
5 participants