Skip to content

Conversation

@davisusanibar
Copy link
Contributor

@davisusanibar davisusanibar commented May 5, 2022

Proposal to upgrade Arrow Java to Module System.

Please note: This PR is not consider to be part of next release 10.0.0, the plan is to test this on 11.0.0 version.

Current status

  1. Apache Arrow Java is compiled and tested by JDK 8, 11, 17, 18 but packaged with source/target JDK8

Changes implemented by this PR

  1. Apache Arrow Java project are continuing compiled and tested by JDK 8, 11, 17, 18 but packaged with source/target JDK8
  2. If Apache Arrow Java project is building with JDK11+ validate the project and test it against the module-info.java definitions
  3. There are not CI implementation to deliver Jar library with JDK8 & JDK11+ (we need to define what will be our next step for JDK versions)
  4. We probably are breaking APIs because we are changing package name to ensure unique package names on the project required by module-info.java
  5. We are added a new project that contains Apache Arrow Netty custom implementation needed for
  6. There are remaining modules pending to implement modules with module-info.java files. This PR contains modules-info.java for: Format / Memory / Vector
  7. Is pending to review errors at Integration testing

Next steps

  1. Implement cross-compilation with JDK 11+ and packaging with release=8 (continue offering support for JDK8):
    That is not posible, if we compile with --release=8 we will have problems to access sun.misc.Unsafe that is not as exposed/public/documented API for N >= 9: -source N -target N --system . (https://openjdk.org/jeps/247)

  2. We have these options for our next steps:

  • Option 1.- Maintain all as it is now, continue packaging with source/target JDK8

  • Option 2.- Change/Update/Modify use of sun.misc.* classes dependencies to be able to support cross-compilation (--release 8) to packaging with JDK11+ and continue offering support for JDK8 also

jdeps --jdk-internals arrow-memory-unsafe-10.0.0-SNAPSHOT.jar
arrow-memory-unsafe-10.0.0-SNAPSHOT.jar -> jdk.unsupported
   org.apache.arrow.memory.unsafe.UnsafeAllocationManager -> sun.misc.Unsafe                                    JDK internal API (jdk.unsupported)

Warning: JDK internal APIs are unsupported and private to JDK implementation that are
subject to be removed or changed incompatibly and could break your application.
Please modify your code to eliminate dependence on any JDK internal APIs.
For the most recent update on JDK internal API replacements, please check:
https://wiki.openjdk.java.net/display/JDK8/Java+Dependency+Analysis+Tool

JDK Internal API                         Suggested Replacement
----------------                         ---------------------
sun.misc.Unsafe                          See http://openjdk.java.net/jeps/260

If you decided to push some of this options please use that ML to move on next steps.


Detailed steps for current implementation

JPMS Planning:

  • Migrate Arrow Java Format

  • Migrate Arrow Java Memory Core

  • Migrate Arrow Java Memory Netty
    Not possible to apply module system for the package io.netty.buffer, it was needed to create another module memory-netty-buffer-patch and patch this package with our arrow java io.netty.buffer code.

  • Migrate Arrow Java Memory Unsafe
    TODO: Need to refactor protected access methods from unit test.

  • Migrate Arrow Java Vector
    TODO: Need to refactor protected access methods from unit test.

  • Migrate Arrow Java Flight
    Blocker: Current grpc-api and grpc-context are not modularized and cannot be used as a module because both of them has the same package named io.grpc

  • JPMS Java Platform Module System

    Java modules enforced new restrictions on existing code:

    • Package need to be uniqueness

      • There are some modules with the same package name (i.e. org.apache.arrow.memory / io.netty.buffer) that is consumed by another module with the same package name to access protected methods.
        • Java Memory module problems:
          • Core / Netty / Unsafe has the same package name org.apache.arrow.memory and there are not problems to consume protected methods, if we change the name of the package we need to define a way about how Netty module could consume for example Core protected methods.
        • Java Netty extends io.netty.buffer capabilities:
          • Is needed to create a new module arrow-memory-netty-buffer-patch to add new Java functionalities added and this need to be consumed by Arrow Memory Netty.
    • Dependencies also need to be modularized:

      • Current jar dependencies need to be modularized to be consumed as a module if not there are no way to convert our project to module if any dependencies are not modularized before.
        • Arrow Format: [x]
        • Arrow Memory: [x]
        • Arrow Vector: [x]
        • Arrow Flight:
          • Blocker: Current grpc-api and grpc-context are not modularized and cannot be used as a module because both of them has the same package named io.grpc.

    Summarize changes and benefits to migrate Arrow Java with Module System:

    • JPMS offer Strong encapsulation, Well-defined interfaces, Explicit dependencies. (3)(4)
    • JPMS offer reliable configuration and security to hide platform internals.
    • JPMS offer a partial solution to solve problema about read (80%) /write (20%) code.
    • JPMS offer optimization for readability about read/write ratio (90/10) thru module-info.java.
    • Consistency logs, JPMS implement consistency logs to really use that to solve the current problem.
    • Be able to customize JRE needed with only modules needed (not java.desktop for example and others) thru JLink.
    • Modules is also been implemented by other language such as Javascript (ES2015), C++(C++20), Net (Nuget/NetCore) and this is some bus that at some time we could onboard.
    • Consider tho take a lokk at this discussion about pro/cons (5).

(3): https://nipafx.dev/java-modules-reflection-vs-encapsulation/
(4): https://github.com/nipafx/demo-jigsaw-reflection
(5): https://www.reddit.com/r/java/comments/okt3j3/do_you_use_jigsaw_modules_in_your_java_projects/

@github-actions
Copy link

github-actions bot commented May 5, 2022

@github-actions
Copy link

github-actions bot commented May 5, 2022

⚠️ Ticket has no components in JIRA, make sure you assign one.

@davisusanibar davisusanibar changed the title ARROW-16328: [Java] Initial draft to experiment with java module system ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JEP 261 Module System May 10, 2022
@davisusanibar davisusanibar changed the title ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JEP 261 Module System ARROW-16328: [Java] Proposal to upgrade Arrow Java project to JPMS Java Platform Module System May 11, 2022
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<compilerArgs combine.children="append">
<arg>--patch-module=io.netty.buffer=/Users/dsusanibar/.m2/repository/org/apache/arrow/memory-netty-buffer-patch/8.0.0-SNAPSHOT/memory-netty-buffer-patch-8.0.0-SNAPSHOT.jar</arg>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ${settings.localRepository} instead of hardcoding the user

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated

java/pom.xml Outdated
<dep.avro.version>1.10.0</dep.avro.version>
<arrow.vector.classifier />
<forkCount>2</forkCount>
<forkCount>0</forkCount>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to JPMS ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be reviewed. Value more than "0" produce: NoClassDefFound Could not initialize class org.apache.arrow.memory.RootAllocator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to 2

@jduo
Copy link
Member

jduo commented Nov 24, 2023

This work has been migrated to #38876.

@jduo jduo closed this Nov 24, 2023
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<compilerArgs combine.children="append">
<arg>--patch-module=io.netty.buffer=${project.basedir}/../memory-netty-buffer-patch/target/arrow-memory-netty-buffer-patch-${project.version}.jar</arg>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davisusanibar , I'm having problems getting this step to run in the new PR and am not entirely sure if it is necessary.

The new PR just does a pure syntax compilation of module-info.java so it doesn't verify the dependencies listed and just uses the classpath to look for classes from arrow-memory-netty-buffer-patch.

Was this step added so that this module would correctly find classes in io.netty.buffer defined in arrow-memory-netty-buffer-patch at build-time only, or does this have an effect at run time? My reading online is that even if we used patch-module to compile, a user utilizing the module would still need to use patch-module to specify module extensions to netty.

If patch-module is only used here to help get this JAR to compile, it seems like it isn't necessary in the new PR since the plugin is compiling module-info.java without verifying its imports.

jduo added a commit to Bit-Quill/arrow that referenced this pull request Nov 29, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Patch classes into netty's io.netty.buffer package (specifically use
the JDK9 compiler to do this instead of inheriting JDK8).
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Dec 1, 2023
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
jduo added a commit to Bit-Quill/arrow that referenced this pull request Jan 10, 2024
Based on apache#13072
- Avoid having multiple memory modules contribute to the same package
- Introduce memory-netty-buffer-patch module for patching classes
into Netty modules
- Avoid using BaseAllocator in tests and use RootAllocator instead
- Move TestBaseAllocator#testMemoryUsage() to a new test in
memory-netty TestNettyAllocator because TestBaseAllocator is now in
memory-core, but that specific test has Netty dependencies.
@jayantsing-db
Copy link

Hey @mmendonca2 , I added an update to the issue you linked. In summary, the Arrow community decided not to deprecate Java 8 support for the foreseeable future. The plan is to revisit that discussion around 2024-06.

Hi @danepitkin, it looks like the PR to drop Java 8 support has been merged. Could you update the status on this comment regarding Arrow's compatibility with JDK 16+? Currently, users still need to specify JVM parameters to open the required packages for reflection.

@lidavidm
Copy link
Member

Unfortunately JVM parameters will be required for the foreseeable future. It may be possible to avoid this if someone can contribute support for FFM and work out how to support both Java >=21 and Java <21 from the same codebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.