Skip to content

Conversation

@liyafan82
Copy link
Contributor

We need a way to copy vector values in batch. Currently, we have copyFrom and copyFromSafe APIs. However, they are not enough, as copying values individually is not performant.

@github-actions
Copy link

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this method should be eliminated and a method should be added to the test data populators that can take literal values. It would make the tests easier to read and understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. I have rewritten the test cases with the test data populators.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment below on taking literal values instead of created values in loops. having a seperate method here make it harded to understand the actual test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have revised the test cases accordingly. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't use loops. Instead use or add methods to the population helper methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised. Thank you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use helper methods to populate values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, populating vectors with literals make the code more readable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this class should probably not be public, instead there should be a public static method that uses this implementation and takes a variable length list of vectors to append. I think this would be much easier to understand. What do youthink?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I will make it package private.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I saw we have an assertVectorsEquals someplace in the code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I did not find one. So I have created one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use a ValueVectorDataPopulator here as well (maybe make the inputs above slightly shorter?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised. Please take a look. Thank you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I think this looks reasonable in general. One last question, do you think some null values should be added into the vectors to test proper appending of the bitmaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I think adding null values should be beneficial.
I will revise the test later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are revised to include null values. Fortunately, a bug was found.

@emkornfield
Copy link
Contributor

+1 thanks @liyafan82

@emkornfield
Copy link
Contributor

@liyafan82 could you rebase I just wanted to make sure visitor code isn't affected by DenseUnionVector

@liyafan82
Copy link
Contributor Author

@liyafan82 could you rebase I just wanted to make sure visitor code isn't affected by DenseUnionVector

@emkornfield Thanks a lot for your effort.
I am on vocation now, and will do the rebase whenever I find a time.

BTW, today is the first day of the Chinese new year, a very special day for us.
So now I want to thank you for all the kind help you have given to me in the past year, and wish you all the best in the new year!

@emkornfield
Copy link
Contributor

I am on vocation now, and will do the rebase whenever I find a time.

BTW, today is the first day of the Chinese new year, a very special day for us.
So now I want to thank you for all the kind help you have given to me in the past year, and wish you all the best in the new year!

Enjoy your vacation and same to you. Happy new year!

@liyafan82
Copy link
Contributor Author

@liyafan82 could you rebase I just wanted to make sure visitor code isn't affected by DenseUnionVector

@emkornfield I have rebased the PR.

We need to support batch append for dense union vectors, and I am working on it.
We can support it in this PR or in a separate PR.

If we do it in another PR, may be we can finish this one now?
Please make a decision based on your preference. Thank you.

@emkornfield
Copy link
Contributor

I think another PR is fine for the dense vectors.

@liyafan82
Copy link
Contributor Author

I think another PR is fine for the dense vectors.

Thanks a lot for your feedback. I will open an issue to track it.

@emkornfield
Copy link
Contributor

+1, thank you @liyafan82

@emkornfield
Copy link
Contributor

@liyafan82 sorry I think another change was merged that conflicts with this, could you rebase?

@liyafan82
Copy link
Contributor Author

@liyafan82 sorry I think another change was merged that conflicts with this, could you rebase?

@emkornfield Rebased. Please take a look. Thanks a lot for your effort.

kszucs pushed a commit that referenced this pull request Feb 7, 2020
We need a way to copy vector values in batch. Currently, we have copyFrom and copyFromSafe APIs. However, they are not enough, as copying values individually is not performant.

Closes #5916 from liyafan82/fly_1125_veccat and squashes the following commits:

94b407c <liyafan82>  Support dense union vector
ee49dc6 <liyafan82>  Add tests with null values
ad33e23 <liyafan82>  Rewrite tests with vector populator for result verification
c89211a <liyafan82>  Rewrite tests with vector populator and provide static utility
7c13ede <liyafan82>  Support concating vectors values in batch

Authored-by: liyafan82 <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
We need a way to copy vector values in batch. Currently, we have copyFrom and copyFromSafe APIs. However, they are not enough, as copying values individually is not performant.

Closes apache#5916 from liyafan82/fly_1125_veccat and squashes the following commits:

94b407c <liyafan82>  Support dense union vector
ee49dc6 <liyafan82>  Add tests with null values
ad33e23 <liyafan82>  Rewrite tests with vector populator for result verification
c89211a <liyafan82>  Rewrite tests with vector populator and provide static utility
7c13ede <liyafan82>  Support concating vectors values in batch

Authored-by: liyafan82 <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants