num_values() in statistics seems to capture only the number of encoded (non-null) values. This is misleading, because everywhere else in Parquet num_values() counts all values, null and non-null alike (i.e. the number of levels).
We should likely remove this field, rename it, or at the very least update the documentation.
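To make the two meanings concrete, here is a minimal toy sketch in Python. It does not use the real Parquet API: count_levels and count_encoded are hypothetical helpers, and the column is assumed to be a flat (non-repeated) optional column, so there is exactly one definition level per slot.

```python
from typing import Optional, Sequence


def count_levels(column: Sequence[Optional[int]]) -> int:
    """Hypothetical helper: one level per slot, whether it is null or not.

    This is the meaning num_values() has elsewhere in Parquet
    (assuming a flat optional column).
    """
    return len(column)


def count_encoded(column: Sequence[Optional[int]]) -> int:
    """Hypothetical helper: only non-null values are actually encoded.

    This is what num_values() in statistics appears to report.
    """
    return sum(1 for value in column if value is not None)


# Toy column with two nulls.
column = [1, None, 3, None, 5]

levels = count_levels(column)    # all slots, null and non-null
encoded = count_encoded(column)  # non-null slots only
null_count = levels - encoded    # the gap between the two meanings

print(f"levels={levels} encoded={encoded} null_count={null_count}")
```

Under these assumptions the two counts differ by exactly the null count, so a caller who reads the statistics field as the number of levels would be off by that amount whenever the column contains nulls.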
CC @zeroshade
Reporter: Micah Kornfield / @emkornfield
Note: This issue was originally created as PARQUET-2099. Please see the migration documentation for further details.