GH-35245: [Java][Dataset][Linux] Enable GCS #35246
ci/scripts/java_jni_macos_build.sh
Could you keep this list in alphabetical order?
Done
Also I just noticed that this is out of order (before my change):
: ${ARROW_RPATH_ORIGIN:=ON}
: ${ARROW_ORC:=ON}
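For readers unfamiliar with the idiom in these lines, here is a minimal sketch of how `: ${VAR:=value}` defaults behave and how the alphabetical ordering could be checked. Only the two variable names come from the script; the override of `ARROW_ORC` and the `sort -c` check are illustrative assumptions, not part of the PR.

```shell
# Simulate a caller that overrides one option and leaves the other unset.
ARROW_ORC=OFF
unset ARROW_RPATH_ORIGIN

# ":" is a no-op command; the ${VAR:=value} expansion assigns the default
# only when VAR is unset or empty. Kept in alphabetical order here.
: "${ARROW_ORC:=ON}"
: "${ARROW_RPATH_ORIGIN:=ON}"

echo "ARROW_ORC=${ARROW_ORC}"
echo "ARROW_RPATH_ORIGIN=${ARROW_RPATH_ORIGIN}"

# sort -c exits non-zero if its input is not already sorted.
if printf '%s\n' ARROW_ORC ARROW_RPATH_ORIGIN | LC_ALL=C sort -c; then
  ORDER_OK=yes
else
  ORDER_OK=no
fi
echo "alphabetical: ${ORDER_OK}"
```

The caller's `OFF` survives while the unset variable picks up `ON`, which is why the order of these lines affects readability only, not behavior.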
@github-actions crossbow submit java-jars
Revision: 9f4d80c8b0239aeee6a85234832b0b2d29a5dc46 Submitted crossbow builds: ursacomputing/crossbow @ actions-f4323dd042
Looks like the macOS builds are failing (which I kind of expected).
lidavidm
left a comment
Thanks. The Java changes look fine. We can punt on macOS for now. @davisusanibar might you be able to follow up there?
@github-actions crossbow submit java-jars
Revision: 98d47c5 Submitted crossbow builds: ursacomputing/crossbow @ actions-b74b79c10c
It looks like there are two unrelated test failures.
Yes. They are unrelated. Could you check the built artifacts at https://github.com/ursacomputing/crossbow/releases/tag/actions-b74b79c10c-github-java-jars ?
Sure. Let me also consider the changes needed for Windows.
Works fine. For context, I replaced our existing custom 10.0.1 build with the 12.0.1-SNAPSHOT from the built artifacts link that you sent and verified it with a simple test against a public GCS repo.
kou
left a comment
+1
Thanks.
Thanks @kou, @lidavidm, and @davisusanibar for the very fast turnaround.
### Rationale for this change
Enables GCS when building the Arrow Dataset for Java and also fixes various Java build failures.
Currently we are using our own custom Arrow Dataset build with GCS turned on, but we would rather this be enabled in the official releases from Arrow.
GCS support is already enabled for C++, Python, Ruby, and R, so there should be no reason not to enable it for Java as well.
### What changes are included in this PR?
- Changes to enable GCS for Java Arrow Dataset on just Linux for now.
- Fixes to flight-sql-jdbc-driver/pom.xml. Without these fixes the flight-sql-jdbc-driver build will fail with the following errors:
```
[WARNING] Used undeclared dependencies found:
[WARNING] org.bouncycastle:bcpkix-jdk15on:jar:1.61:runtime
[WARNING] org.apache.arrow:arrow-memory-core:jar:12.0.0-SNAPSHOT:runtime
[WARNING] org.hamcrest:hamcrest:jar:2.2:runtime
[WARNING] org.apache.arrow:flight-sql:jar:12.0.0-SNAPSHOT:runtime
[WARNING] org.mockito:mockito-core:jar:2.25.1:test
[WARNING] org.apache.arrow:flight-core:jar:12.0.0-SNAPSHOT:runtime
[WARNING] org.slf4j:slf4j-api:jar:1.7.25:runtime
[WARNING] io.netty:netty-common:jar:4.1.82.Final:runtime
[WARNING] joda-time:joda-time:jar:2.10.14:runtime
[WARNING] org.apache.calcite.avatica:avatica:jar:1.18.0:runtime
[WARNING] com.google.protobuf:protobuf-java:jar:3.21.6:runtime
[WARNING] org.apache.arrow:arrow-vector:jar:12.0.0-SNAPSHOT:runtime
[WARNING] com.google.guava:guava:jar:31.1-jre:runtime
[...]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.0.1:analyze-only (analyze) on project flight-sql-jdbc-driver: Dependency problems found -> [Help 1]
```
```
Caused by: java.lang.NullPointerException: Could not find test data path. Set the environment variable ARROW_TEST_DATA or the JVM property arrow.test.dataRoot.
at java.util.Objects.requireNonNull(Objects.java:228)
at org.apache.arrow.driver.jdbc.utils.FlightSqlTestCertificates.getTestDataRoot(FlightSqlTestCertificates.java:40)
at org.apache.arrow.driver.jdbc.utils.FlightSqlTestCertificates.getFlightTestDataRoot(FlightSqlTestCertificates.java:51)
at org.apache.arrow.driver.jdbc.utils.FlightSqlTestCertificates.exampleTlsCerts(FlightSqlTestCertificates.java:60)
at org.apache.arrow.driver.jdbc.ConnectionTlsTest.<clinit>(ConnectionTlsTest.java:59)
```
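The exception message names two ways to supply the test data location: the `ARROW_TEST_DATA` environment variable or the `arrow.test.dataRoot` JVM property. Separately from the pom.xml change in this PR, a local run could set them roughly as sketched below. The `testing/data` path relative to the checkout and the `JVM_TEST_DATA_ARG` variable name are assumptions for illustration, and how the property reaches the test JVM depends on the build configuration.

```shell
# Assumption: the test data lives under ./testing/data of the checkout.
# An already-exported ARROW_TEST_DATA takes precedence over this default.
ARROW_TEST_DATA="${ARROW_TEST_DATA:-$PWD/testing/data}"
export ARROW_TEST_DATA

# Warn early instead of waiting for the NullPointerException in the tests.
if [ ! -d "$ARROW_TEST_DATA" ]; then
  echo "warning: $ARROW_TEST_DATA is not a directory" >&2
fi

# The same location as a JVM system property, in the standard -Dname=value form.
JVM_TEST_DATA_ARG="-Darrow.test.dataRoot=${ARROW_TEST_DATA}"
echo "$JVM_TEST_DATA_ARG"
```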
### Are these changes tested?
I've tested the build by running:
```
$HOME/.local/bin/archery docker run java-jni-manylinux-2014
```
I've also tested the resulting `./java/dataset/target/arrow-dataset-12.0.0-SNAPSHOT.jar` from running the command and have verified that GCS support is enabled.
### Are there any user-facing changes?
Yes, Java Arrow Dataset will now work with GCS.
* Closes: apache#35245
Authored-by: Henry Mai <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>