Skip to content

Conversation

@siddharthteotia
Copy link
Contributor

cc @jacques-n , @StevenMPhillips

Patch Summary:

As part of ARROW-801, we recently added getValidityBufferAddress(), getOffsetBufferAddress(), getDataBufferAddress() interfaces to get the virtual address of the ArrowBuf.

We now have the following new interfaces to get the corresponding ArrowBuf:

getValidityBuffer()
getDataBuffer()
getOffsetBuffer()

Background:

Currently we have getBuffer() method implemented as part of BaseDataValueVector abstract class. As part of patch for ARROW-276, NullableValueVectors no longer extends BaseDataValueVector -- they don't have to since they don't need the underlying data buffer (ArrowBuf data field) of BaseDataValueVector.

The call to getBuffer() on NullableValueVectors simply delegates the operation to getBuffer() of underlying data/value vector.

Problem:

If a piece of code is working with ValueVector abstraction and the expected runtime type is NullableVector, the compiler obviously complains about doing
(v of type ValueVector).getBuffer().

Until now this worked as we kept the compiler happy by casting the ValueVector to BaseDataValueVector and then do ((BaseDataValueVector)(v of type ValueVector)).getBuffer(). This code broke since NullableValueVectors are no longer a subtype of BaseDataValueVector -- the inheritance hierarchy was changed as part of ARROW-276.

Solution:

Similar to what was done in ARROW-801, we have new methods at ValueVector interface to get the underlying buffer. ValueVector has always had the methods getBuffers(), getBufferSizeFor(), getBufferSize(), so it makes sense to augment the ValueVector interface with new APIs.

It looks like new unit tests are not needed since the unit tests added for ARROW-801 test the new APIs as well --> getDataBufferAddress() underneath invokes getDataBuffer() to get the memory address of ArrowBuf so we are good.

@jacques-n
Copy link
Contributor

lgtm +1

@siddharthteotia
Copy link
Contributor Author

@wesm , can this be merged?

Just looked at the log for AppVeyor build and the errors seem to be unrelated to my code changes.

C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy.cc(98): error C2061: syntax error: identifier 'ssize_t' [C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy_static.vcxproj] [C:\projects\arrow\cpp\build\snappy_ep.vcxproj]
C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy.cc(102): error C2065: 'len': undeclared identifier [C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy_static.vcxproj] [C:\projects\arrow\cpp\build\snappy_ep.vcxproj]
C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy.cc(140): error C2061: syntax error: identifier 'ssize_t' [C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy_static.vcxproj] [C:\projects\arrow\cpp\build\snappy_ep.vcxproj]
C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy.cc(143): error C2065: 'len': undeclared identifier [C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy_static.vcxproj] [C:\projects\arrow\cpp\build\snappy_ep.vcxproj]
C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy.cc(146): error C2065: 'len': undeclared identifier [C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy_static.vcxproj] [C:\projects\arrow\cpp\build\snappy_ep.vcxproj]
C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy.cc(150): error C2065: 'len': undeclared identifier [C:\projects\arrow\cpp\build\snappy_ep-prefix\src\snappy_ep\snappy_static.vcxproj] [C:\projects\arrow\cpp\build\snappy_ep.vcxproj]

@wesm
Copy link
Member

wesm commented Aug 19, 2017

Yes, I will merge. Hopefully the Windows error is transient

@asfgit asfgit closed this in c9805d6 Aug 19, 2017
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
cc @jacques-n , @StevenMPhillips

Patch Summary:

As part of ARROW-801, we recently added getValidityBufferAddress(), getOffsetBufferAddress(), getDataBufferAddress() interfaces to get the virtual address of the ArrowBuf.

We now have the following new interfaces to get the corresponding ArrowBuf:

getValidityBuffer()
getDataBuffer()
getOffsetBuffer()

Background:

Currently we have getBuffer() method implemented as part of BaseDataValueVector abstract class. As part of patch for ARROW-276, NullableValueVectors no longer extends BaseDataValueVector -- they don't have to since they don't need the underlying data buffer  (ArrowBuf data field) of BaseDataValueVector.

The call to getBuffer() on NullableValueVectors simply delegates the operation to getBuffer() of underlying data/value vector.

Problem:

If a piece of code is working with ValueVector abstraction and the expected runtime type is Nullable<something>Vector, the compiler obviously complains about doing
(v of type ValueVector).getBuffer().

Until now this worked as we kept the compiler happy by casting the ValueVector to BaseDataValueVector and then do ((BaseDataValueVector)(v of type ValueVector)).getBuffer(). This code broke since NullableValueVectors are no longer a subtype of BaseDataValueVector -- the inheritance hierarchy was changed as part of ARROW-276.

Solution:

Similar to what was done in ARROW-801, we have new methods at ValueVector interface to get the underlying buffer. ValueVector has always had the methods getBuffers(), getBufferSizeFor(), getBufferSize(), so it makes sense to augment the ValueVector interface with new APIs.

It looks like new unit tests are not needed since the unit tests added for ARROW-801 test the new APIs as well --> getDataBufferAddress() underneath invokes getDataBuffer() to get the memory address of ArrowBuf so we are good.

Author: siddharth <[email protected]>

Closes apache#976 from siddharthteotia/ARROW-1373 and squashes the following commits:

1ef2022 [siddharth] Fixed failures and added javadocs
e5ff023 [siddharth] ARROW-1373: Implement getBuffer() methods for ValueVector
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants