ARROW-82: Initial IPC support for ListArray #59
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -25,6 +25,25 @@ | |
|
|
||
| namespace arrow { | ||
|
|
||
| Status ArrayBuilder::AppendToBitmap(bool is_null) { | ||
| if (length_ == capacity_) { | ||
| // If the capacity is not already a power of 2, round it up to one here | ||
| // TODO(emkornfield) doubling isn't great default allocation practice | ||
| // see https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md | ||
| // for discussion | ||
| RETURN_NOT_OK(Resize(util::next_power2(capacity_ + 1))); | ||
| } | ||
| UnsafeAppendToBitmap(is_null); | ||
| return Status::OK(); | ||
| } | ||
|
|
||
| Status ArrayBuilder::AppendToBitmap(const uint8_t* valid_bytes, int32_t length) { | ||
| RETURN_NOT_OK(Reserve(length)); | ||
|
||
|
|
||
| UnsafeAppendToBitmap(valid_bytes, length); | ||
| return Status::OK(); | ||
| } | ||
|
|
||
| Status ArrayBuilder::Init(int32_t capacity) { | ||
| capacity_ = capacity; | ||
| int32_t to_alloc = util::ceil_byte(capacity) / 8; | ||
|
|
@@ -36,6 +55,7 @@ Status ArrayBuilder::Init(int32_t capacity) { | |
| } | ||
|
|
||
| Status ArrayBuilder::Resize(int32_t new_bits) { | ||
|
Contributor
Author
By the way, I was thinking it would be nice to make Resize and Init virtual methods. I think it would reduce code repetition and potential bugs for classes that don't have any reason to override Reserve. Thoughts?
Member
Sounds good to me, go ahead. |
||
| if (!null_bitmap_) { return Init(new_bits); } | ||
| int32_t new_bytes = util::ceil_byte(new_bits) / 8; | ||
| int32_t old_bytes = null_bitmap_->size(); | ||
| RETURN_NOT_OK(null_bitmap_->Resize(new_bytes)); | ||
|
|
@@ -56,10 +76,46 @@ Status ArrayBuilder::Advance(int32_t elements) { | |
|
|
||
| Status ArrayBuilder::Reserve(int32_t elements) { | ||
| if (length_ + elements > capacity_) { | ||
| // TODO(emkornfield) power of 2 growth is potentially suboptimal | ||
|
Member
Aside: should we do 1.5x growth everywhere? (This is the folly approach, IIRC. Is there more research on this subject?)
Contributor
Author
Yes, this is what folly uses. It seems like 1.5 edges out 2 in the very light survey done here: https://en.wikipedia.org/wiki/Dynamic_array#Growth_factor Ideally, we would benchmark once we have some real data in place. In the absence of that, 1.5 seems like a good default. |
||
| int32_t new_capacity = util::next_power2(length_ + elements); | ||
| return Resize(new_capacity); | ||
| } | ||
| return Status::OK(); | ||
| } | ||
|
|
||
| Status ArrayBuilder::SetNotNull(int32_t length) { | ||
| RETURN_NOT_OK(Reserve(length)); | ||
| UnsafeSetNotNull(length); | ||
| return Status::OK(); | ||
| } | ||
|
|
||
| void ArrayBuilder::UnsafeAppendToBitmap(bool is_null) { | ||
|
||
| if (is_null) { | ||
| ++null_count_; | ||
| } else { | ||
| util::set_bit(null_bitmap_data_, length_); | ||
| } | ||
| ++length_; | ||
| } | ||
|
|
||
| void ArrayBuilder::UnsafeAppendToBitmap(const uint8_t* valid_bytes, int32_t length) { | ||
| if (valid_bytes == nullptr) { | ||
| UnsafeSetNotNull(length); | ||
| return; | ||
| } | ||
| for (int32_t i = 0; i < length; ++i) { | ||
| // TODO(emkornfield) Optimize for large values of length? | ||
| UnsafeAppendToBitmap(valid_bytes[i] == 0); | ||
|
||
| } | ||
| } | ||
|
|
||
| void ArrayBuilder::UnsafeSetNotNull(int32_t length) { | ||
| const int32_t new_length = length + length_; | ||
| // TODO(emkornfield) Optimize for large values of length? | ||
| for (int32_t i = length_; i < new_length; ++i) { | ||
| util::set_bit(null_bitmap_data_, i); | ||
| } | ||
| length_ = new_length; | ||
| } | ||
|
|
||
| } // namespace arrow | ||
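The review thread on `Reserve` weighs power-of-two growth against a 1.5x factor. The two policies can be compared with a minimal standalone sketch. `NextPower2`, `GrowBy1_5`, and `CountReallocations` are hypothetical helpers written for illustration. They are not Arrow's `util::next_power2` or any code from this PR, and the reallocation count is only a rough proxy, not the benchmark the thread asks for.

```cpp
#include <cstdint>

// Hypothetical stand-in for a next-power-of-two helper:
// returns the smallest power of two that is >= n (and at least 1).
inline int64_t NextPower2(int64_t n) {
  int64_t result = 1;
  while (result < n) { result *= 2; }
  return result;
}

// Hypothetical 1.5x policy: grow the current capacity by half,
// but never return less than the requested minimum.
inline int64_t GrowBy1_5(int64_t capacity, int64_t min_required) {
  int64_t grown = capacity + capacity / 2;
  return grown > min_required ? grown : min_required;
}

// Counts how many times the capacity must grow when appending n
// elements one at a time, starting from an empty buffer.
// next_capacity(current_capacity, min_required) picks the new capacity.
template <typename Policy>
int CountReallocations(int64_t n, Policy next_capacity) {
  int64_t capacity = 0;
  int reallocations = 0;
  for (int64_t length = 0; length < n; ++length) {
    if (length == capacity) {
      capacity = next_capacity(capacity, length + 1);
      ++reallocations;
    }
  }
  return reallocations;
}
```

Under these assumptions the 1.5x policy reallocates more often but leaves less unused capacity after each step, which is the trade-off a real benchmark would need to measure.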
As a matter of consistency, should this be is_valid?
Makes sense. I did it this way because it was closer to how the code previously worked, but I will change it.
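To illustrate the `is_valid` convention suggested in the review, here is a minimal sketch of a validity bitmap whose append methods take `is_valid` rather than `is_null`. `ValidityBitmapSketch` is a hypothetical class for illustration only, not the `ArrayBuilder` from this PR. It assumes a set bit means "valid", that bits are packed least-significant-bit first within each byte, and it uses `std::vector` in place of the builder's buffer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a builder's null bitmap handling.
// Assumption: a set bit marks a valid (non-null) slot, LSB first per byte.
class ValidityBitmapSketch {
 public:
  // Appends one slot; is_valid == false records a null.
  void Append(bool is_valid) {
    if (length_ % 8 == 0) { bytes_.push_back(0); }  // start a new zeroed byte
    if (is_valid) {
      bytes_[length_ / 8] |= static_cast<uint8_t>(1u << (length_ % 8));
    } else {
      ++null_count_;
    }
    ++length_;
  }

  // Appends `length` slots from a byte-per-value array; nonzero means valid.
  // A null pointer means every appended slot is valid.
  void Append(const uint8_t* valid_bytes, int32_t length) {
    for (int32_t i = 0; i < length; ++i) {
      Append(valid_bytes == nullptr || valid_bytes[i] != 0);
    }
  }

  bool IsValid(int32_t i) const { return ((bytes_[i / 8] >> (i % 8)) & 1) != 0; }
  int32_t length() const { return length_; }
  int32_t null_count() const { return null_count_; }

 private:
  std::vector<uint8_t> bytes_;
  int32_t length_ = 0;
  int32_t null_count_ = 0;
};
```

With this naming, the flag passed by the caller matches the meaning of a set bit, so no negation is needed between the byte array and the bitmap.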