ARROW-7494: [Java] Remove reader index and writer index from ArrowBuf #6133

tianchen92 · 2020-01-07T12:49:12Z

Reader and writer indexes and their functionality don't belong on a chunk of memory; they exist only because of the inheritance from ByteBuf. As part of removing the ByteBuf inheritance, we should also remove the reader and writer indexes from ArrowBuf. They waste heap memory for a rarely used utility. In general, a slice can be used instead of a reader/writer index pattern.
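
As a rough illustration of that last point, the sketch below uses `java.nio.ByteBuffer` as a stand-in. It is not ArrowBuf, and the class and method names are hypothetical. Instead of storing a cursor on the buffer to mark the valid region, the caller hands out a slice that covers exactly that region.

```java
import java.nio.ByteBuffer;

public class SliceInsteadOfIndex {

  /** Returns a view over [start, start + length) without touching the source's own cursor. */
  public static ByteBuffer view(ByteBuffer source, int start, int length) {
    ByteBuffer dup = source.duplicate(); // shares memory, has its own position/limit
    dup.limit(start + length);
    dup.position(start);
    return dup.slice();                  // index 0 of the result maps to 'start'
  }

  public static void main(String[] args) {
    ByteBuffer memory = ByteBuffer.allocate(16);
    for (int i = 0; i < 16; i++) {
      memory.put(i, (byte) i);           // absolute writes, no cursor involved
    }
    ByteBuffer valid = view(memory, 4, 8);
    System.out.println(valid.capacity() + " bytes, first = " + valid.get(0));
  }
}
```

The region of interest travels with the slice, so the underlying buffer needs no per-instance index fields.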

github-actions · 2020-01-07T13:01:39Z

https://issues.apache.org/jira/browse/ARROW-7494

java/memory/src/main/java/io/netty/buffer/ArrowBuf.java

tianchen92 · 2020-01-08T12:31:55Z

A test failed in the Gandiva Java tests, and I don't know how to run them locally. Can someone help or give some guidance? Thanks! @pravindra @praveenbingo

jacques-n · 2020-01-09T18:51:50Z

java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java

+    if (valueCount == 0) {
+      return valueBuffer.slice(0, 0);
    }
+    return valueBuffer.slice(0, typeWidth == 0 ?


We should look at whether we're generating extra objects in the case where the slice is the same as the original. In that case, we should probably just return the original rather than generate extra objects.
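
A minimal sketch of that suggestion, again using `java.nio.ByteBuffer` as a stand-in for ArrowBuf (the helper name `sliceOrSelf` is hypothetical). It returns the original object when the requested range covers the whole buffer and allocates a view only otherwise.

```java
import java.nio.ByteBuffer;

public class SliceOrSelf {

  /**
   * Returns 'buf' itself when [offset, offset + length) is the whole buffer,
   * and creates a new view object only when the range is narrower.
   */
  public static ByteBuffer sliceOrSelf(ByteBuffer buf, int offset, int length) {
    if (offset == 0 && length == buf.capacity()) {
      return buf; // same range as the original: no extra object
    }
    ByteBuffer dup = buf.duplicate();
    dup.limit(offset + length);
    dup.position(offset);
    return dup.slice();
  }
}
```

Whether handing back the original is safe for ArrowBuf would depend on ownership expectations that this sketch does not model.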

jacques-n · 2020-01-09T18:53:19Z

java/vector/src/main/codegen/templates/UnionVector.java

    List<ArrowBuf> result = new ArrayList<>(1);
-    setReaderAndWriterIndex();
-    result.add(typeBuffer);
+    result.add(sliceTypeBuffer());


Calling getFieldBuffers should not create new ArrowBuf objects. If we're relying on the reader/writer index for this, we should stop.

jacques-n · 2020-01-09T18:55:21Z

java/vector/src/main/java/org/apache/arrow/vector/BaseFixedWidthVector.java

   * Get the buffers belonging to this vector
   * @return the inner buffers.
   */
  public List<ArrowBuf> getFieldBuffers() {


As above. Field buffers should not be causing slices. The fact that we have reader/writer settings here is wrong, and we should figure out why it was added. To clarify, getFieldBuffers() is distinct from getBuffers(). The former is for getting access to the underlying data for higher-performance algorithms. The latter is for sending the data over the wire. I think we've mixed up the use of both. (Maybe you need to address that in a patch first?)

tianchen92 · 2020-01-10

Agreed. The main reason is that VectorUnloader uses getFieldBuffers and needs the reader/writer settings; it should actually use getBuffers() instead.
I opened a PR for this: #6156
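
To make the distinction in this thread concrete, here is a hedged sketch with a hypothetical `ToyVector` class backed by `java.nio.ByteBuffer`. It is not Arrow's FieldVector, and trimming the buffers to the populated region in `getBuffers()` is an assumption made for illustration. `getFieldBuffers()` hands back the underlying buffer objects untouched, while `getBuffers()` produces views sized for serialization.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FieldBuffersSketch {

  /** Hypothetical two-buffer vector (validity bitmap + fixed-width values). */
  public static class ToyVector {
    private final ByteBuffer validity;
    private final ByteBuffer values;
    private final int valueCount;
    private final int typeWidth;

    public ToyVector(int capacity, int valueCount, int typeWidth) {
      this.validity = ByteBuffer.allocate((capacity + 7) / 8);
      this.values = ByteBuffer.allocate(capacity * typeWidth);
      this.valueCount = valueCount;
      this.typeWidth = typeWidth;
    }

    /** Direct access to the underlying buffers: no slicing, no new buffer objects. */
    public List<ByteBuffer> getFieldBuffers() {
      return Arrays.asList(validity, values);
    }

    /** Views trimmed to the populated region, e.g. for writing to the wire. */
    public List<ByteBuffer> getBuffers() {
      List<ByteBuffer> out = new ArrayList<>(2);
      out.add(trim(validity, (valueCount + 7) / 8));
      out.add(trim(values, valueCount * typeWidth));
      return out;
    }

    private static ByteBuffer trim(ByteBuffer buf, int length) {
      ByteBuffer dup = buf.duplicate();
      dup.limit(length);
      dup.position(0);
      return dup.slice();
    }
  }
}
```

With this split, a serializer in the role of VectorUnloader would call `getBuffers()`, and compute kernels would call `getFieldBuffers()`.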

emkornfield · 2020-01-11T03:50:29Z

Sorry I merged the int64 address space change, this will need to be rebased.

tianchen92 · 2020-01-11T06:14:38Z

> Sorry I merged the int64 address space change, this will need to be rebased.

No problem. I'll rebase this after #6156 is merged.
Also, the Gandiva Java tests are failing for me. If you know how, could you please give some guidance on running them locally? Thanks :)

emkornfield · 2020-02-24T06:05:11Z

> No problem. I'll rebase this after #6156 is merged.
> Also, the Gandiva Java tests are failing for me. If you know how, could you please give some guidance on running them locally? Thanks :)

You need to build the C++ libraries for Gandiva and add them to your linked library path; then you should be able to run the Java unit tests as normal.

emkornfield · 2020-04-29T05:43:05Z

Just checking @tianchen92 what is the status of this?

tianchen92 · 2020-04-29T06:12:49Z

> Just checking @tianchen92 what is the status of this?

It was blocked by #6156.

emkornfield · 2021-09-12T21:30:01Z

Closing as stale.

tianchen92 added 3 commits January 7, 2020 15:46

remove reader/writer indices usages in vectors

5a3d16b

remove reader/writer indices usages in channels/JsonFileReader/GetRea…

4d8acc5

…dableBuffer

remove reader/writer indices and API from ArrowBuf

be08a0f

jacques-n reviewed Jan 7, 2020

java/memory/src/main/java/io/netty/buffer/ArrowBuf.java (outdated, resolved)

remove write API from ArrowBuf and fix gandiva

e622b21

tianchen92 force-pushed the ARROW-7494 branch from bfb90e6 to e622b21 Compare January 9, 2020 05:33

try to fix gandiva test

e8f221f

jacques-n reviewed Jan 9, 2020

tianchen92 mentioned this pull request Jan 10, 2020

ARROW-7539: [Java] FieldVector getFieldBuffers API should not set reader/writer indices #6156

Closed

fsaintjacques added the Component: Java label Jan 16, 2020

wesm force-pushed the master branch from 5fe5b88 to aa55967 Compare April 19, 2020 22:47

kszucs force-pushed the master branch from 1b71ca7 to 5093b80 Compare April 20, 2020 19:21

github-actions bot added the needs-rebase label (a PR that needs to be rebased by the author) Nov 25, 2020

jorgecarleitao force-pushed the master branch from d4608a9 to 356c300 Compare February 14, 2021 12:09

emkornfield closed this Sep 12, 2021

This was referenced Nov 26, 2024

[Java] FieldVector getFieldBuffers API should not set reader/writer indices apache/arrow-java#270

Open

[Java] Remove reader index and writer index from ArrowBuf apache/arrow-java#272

Open
