ARROW-1373: Implement getBuffer() methods for ValueVector #976
cc @jacques-n , @StevenMPhillips
Patch Summary:
As part of ARROW-801, we recently added the getValidityBufferAddress(), getOffsetBufferAddress(), and getDataBufferAddress() methods to get the virtual address of each ArrowBuf.
This patch adds the following new methods to get the corresponding ArrowBuf itself:
getValidityBuffer()
getDataBuffer()
getOffsetBuffer()
Background:
Currently, getBuffer() is implemented in the abstract class BaseDataValueVector. Since the patch for ARROW-276, NullableValueVectors no longer extend BaseDataValueVector; they do not have to, because they do not need its underlying data buffer (the ArrowBuf data field).
A call to getBuffer() on a NullableValueVector simply delegates to getBuffer() of the underlying data/value vector.
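The delegation described above can be sketched roughly as follows. This is a simplified model, not the Arrow source: SketchBuf, SketchDataVector, NullableSketchVector and getValuesVector() are hypothetical stand-ins for ArrowBuf, a BaseDataValueVector subclass, and a nullable vector.

```java
// Hypothetical stand-in for ArrowBuf: an opaque handle with a label.
final class SketchBuf {
    final String label;
    SketchBuf(String label) { this.label = label; }
}

// Stand-in for a vector derived from BaseDataValueVector: it owns the data buffer.
class SketchDataVector {
    private final SketchBuf data = new SketchBuf("data");

    SketchBuf getBuffer() {
        return data;
    }
}

// Stand-in for a nullable vector: it has no data field of its own and
// forwards getBuffer() to the inner value vector.
class NullableSketchVector {
    private final SketchDataVector values = new SketchDataVector();

    // Hypothetical accessor, used here only to inspect the inner vector.
    SketchDataVector getValuesVector() {
        return values;
    }

    SketchBuf getBuffer() {
        return values.getBuffer();
    }
}
```

In this model both calls return the same buffer object, which is why the nullable vector does not need to inherit the data field.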
Problem:
If a piece of code works against the ValueVector abstraction and the expected runtime type is NullableVector, the compiler rejects
(v of type ValueVector).getBuffer(), because getBuffer() is not declared on ValueVector.
Until now this worked because we kept the compiler happy by casting the ValueVector to BaseDataValueVector and then calling ((BaseDataValueVector)(v of type ValueVector)).getBuffer(). This code broke once NullableValueVectors were no longer a subtype of BaseDataValueVector; the inheritance hierarchy was changed as part of ARROW-276.
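To make the breakage concrete, here is a small self-contained model of the hierarchy after ARROW-276. All type names (Vec, BaseDataVec, IntVec, NullableIntVec, RawBuf, CastDemo) are hypothetical stand-ins, and the assumption is that the old call sites used a plain downcast as described above. In this model the cast still compiles but fails at run time for the nullable vector.

```java
// Hypothetical stand-in for ArrowBuf.
final class RawBuf { }

// Stand-in for ValueVector before this patch: no single-buffer getter.
interface Vec { }

// Stand-in for BaseDataValueVector: owns the data buffer and exposes getBuffer().
abstract class BaseDataVec implements Vec {
    protected final RawBuf data = new RawBuf();

    RawBuf getBuffer() {
        return data;
    }
}

// A non-nullable vector still extends the base class.
class IntVec extends BaseDataVec { }

// After ARROW-276 (in this model): implements Vec directly, not via BaseDataVec.
class NullableIntVec implements Vec {
    private final IntVec values = new IntVec();

    RawBuf getBuffer() {
        return values.getBuffer();
    }
}

final class CastDemo {
    // The old calling pattern: downcast to the base class to reach getBuffer().
    static RawBuf bufferViaCast(Vec v) {
        return ((BaseDataVec) v).getBuffer();
    }

    public static void main(String[] args) {
        // Works: IntVec is a BaseDataVec.
        RawBuf ok = bufferViaCast(new IntVec());
        System.out.println("non-nullable buffer present: " + (ok != null));

        // Fails: NullableIntVec is no longer a BaseDataVec.
        try {
            bufferViaCast(new NullableIntVec());
        } catch (ClassCastException e) {
            System.out.println("cast failed for nullable vector");
        }
    }
}
```

Declaring the getter on the common interface removes the need for the downcast altogether.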
Solution:
Similar to what was done in ARROW-801, we add new methods to the ValueVector interface to get the underlying buffers. ValueVector has always had the methods getBuffers(), getBufferSizeFor(), and getBufferSize(), so it makes sense to augment the interface with the new APIs.
New unit tests do not appear to be needed: the unit tests added for ARROW-801 exercise the new APIs as well, because getDataBufferAddress() internally invokes getDataBuffer() to get the memory address of the ArrowBuf.
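A minimal sketch of why the ARROW-801 tests also cover the new getters, assuming the address methods are written in terms of the buffer methods as described above. AddrBuf, BufferVector and FixedAddressVector are hypothetical stand-ins, and the default methods and fixed addresses are illustrative only; the actual patch may place these implementations in the concrete vector classes.

```java
// Hypothetical stand-in for ArrowBuf: only the memory address is modelled.
final class AddrBuf {
    private final long address;

    AddrBuf(long address) {
        this.address = address;
    }

    long memoryAddress() {
        return address;
    }
}

// Stand-in for the augmented ValueVector interface.
interface BufferVector {
    AddrBuf getValidityBuffer();
    AddrBuf getDataBuffer();
    AddrBuf getOffsetBuffer();

    // Each address getter goes through the matching buffer getter, so a test
    // that checks the address also exercises the buffer getter.
    default long getValidityBufferAddress() {
        return getValidityBuffer().memoryAddress();
    }

    default long getDataBufferAddress() {
        return getDataBuffer().memoryAddress();
    }

    default long getOffsetBufferAddress() {
        return getOffsetBuffer().memoryAddress();
    }
}

// Illustrative implementation with made-up addresses.
final class FixedAddressVector implements BufferVector {
    private final AddrBuf validity = new AddrBuf(0x1000L);
    private final AddrBuf offsets = new AddrBuf(0x2000L);
    private final AddrBuf data = new AddrBuf(0x3000L);

    @Override public AddrBuf getValidityBuffer() { return validity; }
    @Override public AddrBuf getDataBuffer() { return data; }
    @Override public AddrBuf getOffsetBuffer() { return offsets; }
}
```

Code holding only a BufferVector reference can now reach any of the three buffers without a cast.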