Skip to content

Conversation

@tianchen92
Copy link
Contributor

Related to ARROW-7494.

Reader and writer index and functionality doesn't belong on a chunk of memory and is due to inheritance from ByteBuf. As part of removing ByteBuf inheritance, we should also remove reader and writer indexes from ArrowBuf functionality. It wastes heap memory for rare utility. In general, a slice can be used instead of a reader/writer index pattern.

@github-actions
Copy link

github-actions bot commented Jan 7, 2020

@tianchen92
Copy link
Contributor Author

tianchen92 commented Jan 8, 2020

Test failed in Gandiva java test and I don't know how to run Gandiva java test locally, can someone help or give some guidance? thanks! @pravindra @praveenbingo

if (valueCount == 0) {
return valueBuffer.slice(0, 0);
}
return valueBuffer.slice(0, typeWidth == 0 ?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should look at whether we're generating extra object in the case that the slice is the same as the original. In that case, we should probably just return the original rather than generate extra objects.

List<ArrowBuf> result = new ArrayList<>(1);
setReaderAndWriterIndex();
result.add(typeBuffer);
result.add(sliceTypeBuffer());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling getFieldBuffers should not create new arrow buf objects. If we're relying on reader/writer index on this we should stop.

* Get the buffers belonging to this vector
* @return the inner buffers.
*/
public List<ArrowBuf> getFieldBuffers() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above. fieldbuffers should not be causing slices. The fact that we have reader/writer settings here is wrong and we should figure out why it was added. To clarify, getFieldBuffers() is distinct from getBuffers(). The former should be for getting access to underlying data for higher-performance algorithms. The latter is for sending the data over the wire. I think we've mixed up use of both. (Maybe you need to address that in a patch first?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the main reason is that it uses getFieldBuffers in VectorUnloader which need writer/reader settings, and actually it should be replaced with getBuffers().
I opened a PR for this: #6156

@emkornfield
Copy link
Contributor

Sorry I merged the int64 address space change, this will need to be rebased.

@tianchen92
Copy link
Contributor Author

Sorry I merged the int64 address space change, this will need to be rebased.

It doesn't matter., I'll rebase this after this PR merged #6156.
Besides, I met gandiva java test fail problem, could you please give some guidance how to run gandiva java test locally if you know, thanks :)

@emkornfield
Copy link
Contributor

It doesn't matter., I'll rebase this after this PR merged #6156.
Besides, I met gandiva java test fail problem, could you please give some guidance how to run gandiva java test locally if you know, thanks :)

You need to build the C++ libraries for gandiva and add them to your linked library path (then you should be able to run java unit tests as normal.).

@emkornfield
Copy link
Contributor

Just checking @tianchen92 what is the status of this?

@tianchen92
Copy link
Contributor Author

Just checking @tianchen92 what is the status of this?

It was blocked by #6156

@github-actions github-actions bot added the needs-rebase A PR that needs to be rebased by the author label Nov 25, 2020
@emkornfield
Copy link
Contributor

Closing as stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Component: Java needs-rebase A PR that needs to be rebased by the author

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants