Skip to content

Conversation

@tianchen92
Copy link
Contributor

Related to ARROW-6194.
As discussed in #4994.
Current static DictionaryEncoder has some limitation for extension and reuse.
Slightly change the APIs and migrate static method to object based approach.

@codecov-io
Copy link

codecov-io commented Aug 10, 2019

Codecov Report

Merging #5055 into master will increase coverage by 2.12%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5055      +/-   ##
==========================================
+ Coverage   87.59%   89.71%   +2.12%     
==========================================
  Files        1009      670     -339     
  Lines      143898    99567   -44331     
  Branches     1418        0    -1418     
==========================================
- Hits       126049    89329   -36720     
+ Misses      17487    10238    -7249     
+ Partials      362        0     -362
Impacted Files Coverage Δ
python/pyarrow/lib.pyx 97.95% <0%> (-2.05%) ⬇️
cpp/src/arrow/util/thread-pool-test.cc 97.66% <0%> (-0.94%) ⬇️
cpp/src/arrow/table.cc 95.66% <0%> (-0.81%) ⬇️
cpp/src/arrow/csv/column-builder.cc 97.02% <0%> (-0.03%) ⬇️
cpp/src/arrow/util/utf8.h 93.33% <0%> (ø) ⬆️
cpp/src/arrow/array/builder_binary.cc 90% <0%> (ø) ⬆️
cpp/src/arrow/csv/column-builder-test.cc 100% <0%> (ø) ⬆️
cpp/src/arrow/csv/column-builder.h 100% <0%> (ø) ⬆️
cpp/src/arrow/csv/parser.cc 97.36% <0%> (ø) ⬆️
cpp/src/arrow/util/utf8.cc 100% <0%> (ø) ⬆️
... and 347 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 167cea0...2354c48. Read the comment docs.

@tianchen92 tianchen92 changed the title ARROW-6194: [Java] Make DictionaryEncoder non-static making it easy to extend and reuse ARROW-6194: [Java] Add non-static approach in DictionaryEncoder making it easy to extend and reuse Aug 11, 2019
@tianchen92
Copy link
Contributor Author

@emkornfield PR has updated, would appreciate if you could take another look, thanks :)

/**
* Build hash table with dictionary vector.
*/
public void ensureBuildTable() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not put this in the constructor either in this class or in DictionaryEncoder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, fixe now, thanks.

@emkornfield
Copy link
Contributor

+1, thank you. As noted on the mailing list please try to work with @liyafan82 to figure out the path forward on encoders in general.

@tianchen92
Copy link
Contributor Author

+1, thank you. As noted on the mailing list please try to work with @liyafan82 to figure out the path forward on encoders in general.

ok

pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
…g it easy to extend and reuse

Related to [ARROW-6194](https://issues.apache.org/jira/browse/ARROW-6194).
As discussed in apache#4994.
Current static DictionaryEncoder has some limitation for extension and reuse.
Slightly change the APIs and migrate static method to object based approach.

Closes apache#5055 from tianchen92/ARROW-6194 and squashes the following commits:

2354c48 <tianchen> move build table logic to constructor
7b76cba <tianchen> add test
513f417 <tianchen> add static method back
fb2d293 <tianchen> ARROW-6194:  Make DictionaryEncoder non-static making it easy to extend and reuse

Authored-by: tianchen <[email protected]>
Signed-off-by: Micah Kornfield <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants