@danepitkin
Member

@danepitkin danepitkin commented Oct 4, 2023

WIP

Rationale for this change

The Foreign Function & Memory (FFM) APIs had their third preview release in Java 21. Let's start integrating them with Arrow to see how effective they are. The FFM APIs will presumably be finalized by the next Java LTS release (25).

JEP 442: https://openjdk.org/jeps/442
java.lang.foreign: https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/foreign/package-summary.html

What changes are included in this PR?

  • Add a new memory module based on java.lang.foreign.

Are these changes tested?

  • Unit tests added

Are there any user-facing changes?

Yes.

@danepitkin danepitkin marked this pull request as draft October 4, 2023 16:16
@github-actions

github-actions bot commented Oct 4, 2023

⚠️ GitHub issue apache/arrow-java#163 has been automatically assigned in GitHub to PR creator.

Member Author

An Arena controls the lifecycle of native (off-heap) memory segments. The Shared Arena has a bounded lifetime, is explicitly closeable, and is accessible by multiple threads.[1]

[1] https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/foreign/Arena.html
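
For readers new to the API, here is a minimal sketch of the shared-arena lifecycle described above. It assumes JDK 22 or newer, where java.lang.foreign is final (on Java 21 it would need --enable-preview at compile time and run time). The class and method names are hypothetical and are not taken from this PR.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

/** Hypothetical demo class, not part of Arrow. */
public final class SharedArenaDemo {
  private SharedArenaDemo() {}

  /** Writes a value from a second thread, then reads it back on the calling thread. */
  static long writeFromOtherThread(long value) {
    // try-with-resources bounds the lifetime: the memory is freed when the arena closes.
    try (Arena arena = Arena.ofShared()) {
      MemorySegment segment = arena.allocate(Long.BYTES, Long.BYTES);
      // A shared arena allows access from threads other than the one that created it.
      Thread writer = new Thread(() -> segment.set(ValueLayout.JAVA_LONG, 0, value));
      writer.start();
      try {
        writer.join(); // join() makes the other thread's write visible here
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
      return segment.get(ValueLayout.JAVA_LONG, 0);
    }
  }

  /** Returns true if reading a segment after its arena was closed is rejected. */
  static boolean accessAfterCloseFails() {
    Arena arena = Arena.ofShared();
    MemorySegment segment = arena.allocate(Long.BYTES, Long.BYTES);
    arena.close(); // explicit close invalidates every segment allocated from the arena
    try {
      segment.get(ValueLayout.JAVA_LONG, 0);
      return false;
    } catch (IllegalStateException expected) {
      return true;
    }
  }
}
```

Arena.ofConfined() would be the single-threaded counterpart; the shared variant is shown because it is the one this comment refers to.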

@github-actions github-actions bot added the awaiting committer review label and removed the awaiting review label Oct 4, 2023
@danepitkin danepitkin force-pushed the danepitkin/arrow-memory-ffm branch from bb22994 to c8b78bc Compare October 19, 2023 02:03
@pitrou
Member

pitrou commented Oct 19, 2023

Java newbie here, can you give a pointer to the FFM docs?

@danepitkin
Member Author

danepitkin commented Oct 19, 2023

Java newbie here, can you give a pointer to the FFM docs?

Added the JEP and docs to the description!

@pitrou
Member

pitrou commented Oct 19, 2023

Added the JEP and docs to the description!

Wow, that looks really promising.

What are the implications of using a preview API? Can the API change in later versions? Does it impact downstream users?

@danepitkin
Member Author

danepitkin commented Oct 19, 2023

Wow, that looks really promising.

What are the implications of using a preview API? Can the API change in later versions? Does it impact downstream users?

Yes, the APIs can still change, but they are approaching stability now that they have been in development for a few years. I plan to mark this module as experimental and enable it only for Java 21+, but I expect the JEP to be finalized by the next Java LTS (Long Term Support) release, Java 25. The benefit of adding it now is that we can trial it and potentially provide late feedback on the JEP implementation if needed.

Adding the module itself won't impact users unless they choose to use it. Arrow Java provides an abstract interface to off-heap memory called arrow-memory-core. Users must then select exactly one implementation: sun.misc.Unsafe (arrow-memory-unsafe), Netty (arrow-memory-netty), or now the experimental FFM (for Java 21+ only). The arrow-memory-core interface will eventually need to be refactored to take better advantage of FFM, but I don't plan to do that in this PR.
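
The "one abstraction, exactly one implementation" arrangement can be pictured with a small sketch. Everything in it is hypothetical: MemoryBackend stands in for the abstraction that arrow-memory-core provides, and the selection rule only illustrates the constraint stated above. It is not how Arrow actually discovers its allocation manager.

```java
import java.util.List;

/** Hypothetical illustration, not Arrow code. */
public final class BackendSelection {
  private BackendSelection() {}

  /** Stand-in for the off-heap memory abstraction; real implementations would allocate memory. */
  interface MemoryBackend {
    String name();
  }

  /** Convenience factory for a backend that only carries a name. */
  static MemoryBackend named(String name) {
    return () -> name;
  }

  /** Returns the single backend that was found, or fails if there are zero or several. */
  static MemoryBackend selectExactlyOne(List<MemoryBackend> found) {
    if (found.size() != 1) {
      throw new IllegalStateException(
          "expected exactly one memory backend but found " + found.size());
    }
    return found.get(0);
  }
}
```

With this shape, adding an FFM-backed implementation changes nothing for users who keep their existing choice, which matches the point made above.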

@pitrou
Member

pitrou commented Oct 19, 2023

Do you think we'll also ditch our JNI interfaces at some point? I suppose we must keep them until FFM becomes mainstream, but would it be beneficial to use FFM as an alternative on supported setups?

@danepitkin
Member Author

Do you think we'll also ditch our JNI interfaces at some point? I suppose we must keep them until FFM becomes mainstream, but would it be beneficial to use FFM as an alternative on supported setups?

Eventually, yes! I think there is a lot that needs to happen before then. Besides the Java support fiasco holding many users back (Java 8 has extended support until 2030) and needing to wait for the next Java LTS release, the JIT implementations will most likely also need work to optimize the FFM APIs well enough to match the performance of the existing alternatives. Long term, this definitely looks like the preferred approach over JNI to me.

@danepitkin danepitkin force-pushed the danepitkin/arrow-memory-ffm branch from c8b78bc to 609e7ef Compare November 13, 2023 20:00
FfmAllocationManager(BufferAllocator accountingAllocator, long requestedSize) {
  super(accountingAllocator);
  arena = Arena.ofShared();
  allocatedMemorySegment = arena.allocate(requestedSize, /*byteAlignment*/ 8);
Member Author

Note that FFM requires byteAlignment to be specified within the allocation manager, while netty/unsafe do not.
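
A short sketch of what the alignment argument does, again assuming JDK 22+ (or Java 21 with --enable-preview). The class name and the sizes are hypothetical. The value 8 mirrors the excerpt above; this sketch takes no position on whether 8 is the right choice for Arrow buffers.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

/** Hypothetical demo class, not part of Arrow. */
public final class AlignmentDemo {
  private AlignmentDemo() {}

  /** Allocates a segment and checks that its base address honors the requested alignment. */
  static boolean isAligned(long requestedSize, long byteAlignment) {
    try (Arena arena = Arena.ofShared()) {
      MemorySegment segment = arena.allocate(requestedSize, byteAlignment);
      return segment.byteSize() == requestedSize && segment.address() % byteAlignment == 0;
    }
  }

  /** The alignment must be a power of two; other values are rejected at allocation time. */
  static boolean rejectsNonPowerOfTwo() {
    try (Arena arena = Arena.ofShared()) {
      arena.allocate(64, 3);
      return false;
    } catch (IllegalArgumentException expected) {
      return true;
    }
  }
}
```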

java/pom.xml Outdated
<jdk>[21,]</jdk>
</activation>
<modules>
<module>memory/memory-ffm</module>
Member Author

@danepitkin danepitkin Nov 13, 2023

Is this the best way to enable developers to build arrow-memory-ffm? I haven't come up with a better alternative, but I'm also not a POM expert.

CI does not build this module, but a developer can build from source with mvn clean install -pl :arrow-memory-ffm --am

@danepitkin
Member Author

@github-actions crossbow submit -g java

@github-actions

Revision: 0cd7939

Submitted crossbow builds: ursacomputing/crossbow @ actions-0fc32fc5ee

Task Status
java-jars Github Actions
verify-rc-source-java-linux-almalinux-8-amd64 Github Actions
verify-rc-source-java-linux-conda-latest-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-20.04-amd64 Github Actions
verify-rc-source-java-linux-ubuntu-22.04-amd64 Github Actions
verify-rc-source-java-macos-amd64 Github Actions

@danepitkin danepitkin marked this pull request as ready for review November 13, 2023 21:27
@adamkennedy

One tidbit passed on from the Trino team was that once we have FFI then we'd also want to figure out how to move past any uses of ByteBuffer in the rest of the code to the FFI classes, as (reportedly) some mistakes were made with ByteBuffer which means it isn't very friendly to inlining and the new stuff is much much faster once inlining has time to kick in.

try {
  return getFactory("org.apache.arrow.memory.FfmAllocationManager");
} catch (RuntimeException e) {
  if (majorVersion < 21) {

Would we also have to detect that preview features are actually turned on and the underlying FFM libraries are available in that specific Java 21+ VM instance?

Member Author

Yes, good catch. More helpful checks would be nice.
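
One possible shape for such checks, as a sketch only. FfmSupportCheck is a hypothetical helper and not code from this PR. It assumes that FFM is a preview API before JDK 22 and final from JDK 22 on, and it treats the presence of --enable-preview among the JVM input arguments as a heuristic rather than a guarantee.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper (not part of Arrow) for explaining why FFM may be unusable. */
public final class FfmSupportCheck {
  private FfmSupportCheck() {}

  /** Feature release of the running JVM, for example 21 or 22. */
  static int featureVersion() {
    return Runtime.version().feature();
  }

  /** True if the java.lang.foreign.Arena class can be loaded in this JVM. */
  static boolean arenaClassPresent() {
    try {
      Class.forName("java.lang.foreign.Arena");
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  /** Heuristic: true if the JVM was started with --enable-preview. */
  static boolean previewFlagSet() {
    List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
    return jvmArgs.contains("--enable-preview");
  }

  /** Builds a human-readable diagnosis that an error message could include. */
  static String describe() {
    int version = featureVersion();
    if (!arenaClassPresent()) {
      return "Java " + version + ": java.lang.foreign.Arena not found, FFM is unavailable";
    }
    if (version < 22 && !previewFlagSet()) {
      return "Java " + version + ": FFM is a preview API here, start the JVM with --enable-preview";
    }
    return "Java " + version + ": FFM looks usable";
  }
}
```

A message like the one from describe() could be attached to the RuntimeException handled in the excerpt above, so that users see the likely cause instead of a bare class-loading failure.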

@danepitkin
Member Author

One tidbit passed on from the Trino team was that once we have FFI then we'd also want to figure out how to move past any uses of ByteBuffer in the rest of the code to the FFI classes, as (reportedly) some mistakes were made with ByteBuffer which means it isn't very friendly to inlining and the new stuff is much much faster once inlining has time to kick in.

Thank you! I envision this refactor, plus supporting long memory addresses, as follow-up tasks. I can file those if this PR merges without those features.
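
To make the two follow-ups concrete, here is a small sketch contrasting MemorySegment, which is indexed with long offsets, with a ByteBuffer view of the same memory, which is indexed with int. It assumes JDK 22+ (or Java 21 with --enable-preview); the class name and sizes are hypothetical, and it makes no performance claim.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical demo class, not part of Arrow. */
public final class SegmentVersusByteBuffer {
  private SegmentVersusByteBuffer() {}

  /** Writes through the segment API and reads the same bytes back through a ByteBuffer view. */
  static long roundTrip(long value) {
    try (Arena arena = Arena.ofConfined()) {
      MemorySegment segment = arena.allocate(16, Long.BYTES);
      long offset = 8L; // segment offsets and sizes are long, so they are not capped at 2 GiB
      segment.set(ValueLayout.JAVA_LONG, offset, value);

      // The view is big-endian by default, while JAVA_LONG uses the native order.
      ByteBuffer view = segment.asByteBuffer().order(ByteOrder.nativeOrder());
      return view.getLong((int) offset); // ByteBuffer indices are int
    }
  }
}
```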

@danepitkin danepitkin marked this pull request as draft November 14, 2023 23:06
@danepitkin
Member Author

I'm going to put this on hold for now. The next step is to refactor out the usage of UNSAFE since it's not accessible in Java 16+. This is a huge task, but it would allow us to actually test the memory-core module with the FFM APIs. This PR remains an experiment and is not in any mergeable form.

@@ -0,0 +1,77 @@
/*
Member Author

@danepitkin danepitkin Nov 14, 2023

This file was added as a hack to test the memory-core module with the FFM APIs. It will be useful for testing later, but can be ignored otherwise.

@wendigo

wendigo commented Apr 16, 2024

Now that FFM has been GA since JDK 22, is there a plan to move forward with an FFM-based allocator for Arrow?

@vibhatha
Collaborator

@wendigo Yes, we are planning to.

@danepitkin
Member Author

Closing for now due to staleness.

@danepitkin danepitkin closed this Sep 9, 2024
@github-actions

⚠️ GitHub issue #37739 has no components, please add labels for components.

Development

Successfully merging this pull request may close these issues.

[Java] Implement arrow-memory-ffm

5 participants