Skip to content

roundDecimal() fails for null FLOAT values #15377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Aug 22, 2025

Conversation

mghildiy
Copy link
Contributor

@mghildiy mghildiy commented Mar 26, 2025

This change includes:

  1. For null value for FLOAT column, use NaN as transformation value
  • bugfix.

@mghildiy mghildiy changed the title This commit includes: roundDecimal() fails for null FLOAT values Mar 27, 2025
@mghildiy
Copy link
Contributor Author

mghildiy commented Mar 27, 2025

I woud work on unit test once code approach is approved.

_doubleValuesSV[i] = BigDecimal.valueOf(leftValues[i])
.setScale(_scale, RoundingMode.HALF_UP).doubleValue();
if (Double.NEGATIVE_INFINITY == leftValues[i]) {
_doubleValuesSV[i] = Double.NaN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check the standard SQL behavior when rounding on -Inf? Does it throw exception, return -Inf or NaN?
We should put the same check in other if branches

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would take a look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is DB specific behaviour. Some throw errors, some result to -infinity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about PostgreSQL? We usually use it as a reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL returns -Infinity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to match that behavior. We want to handle all special values, including Inf, -Inf, NaN

Copy link
Contributor Author

@mghildiy mghildiy Apr 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I undrstand that currently pinot stores null values as -Inf only. Do we need to handle other two?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which layer eventually writes to response to a SQL query?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pinot does use -Inf as the placeholder for null value, but Inf is also valid value and we should be able to handle it without throwing exception. NaN is debatable because they are converted into null within SpecialValueTransformer

@codecov-commenter
Copy link

codecov-commenter commented Mar 28, 2025

Codecov Report

❌ Patch coverage is 33.33333% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.31%. Comparing base (59551e4) to head (9b41e37).
⚠️ Report is 2807 commits behind head on master.

Files with missing lines Patch % Lines
...nsform/function/RoundDecimalTransformFunction.java 33.33% 9 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15377      +/-   ##
============================================
+ Coverage     61.75%   63.31%   +1.56%     
- Complexity      207     1379    +1172     
============================================
  Files          2436     3035     +599     
  Lines        133233   176732   +43499     
  Branches      20636    27121    +6485     
============================================
+ Hits          82274   111898   +29624     
- Misses        44911    56255   +11344     
- Partials       6048     8579    +2531     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.28% <33.33%> (+1.57%) ⬆️
java-21 63.27% <33.33%> (+1.65%) ⬆️
skip-bytebuffers-false ?
skip-bytebuffers-true ?
temurin 63.31% <33.33%> (+1.56%) ⬆️
unittests 63.31% <33.33%> (+1.56%) ⬆️
unittests1 56.45% <33.33%> (+9.56%) ⬆️
unittests2 33.11% <0.00%> (+5.38%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@@ -90,18 +90,30 @@ public double[] transformToDoubleValuesSV(ValueBlock valueBlock) {
double[] leftValues = _leftTransformFunction.transformToDoubleValuesSV(valueBlock);
if (_fixedScale) {
for (int i = 0; i < length; i++) {
_doubleValuesSV[i] = BigDecimal.valueOf(leftValues[i])
.setScale(_scale, RoundingMode.HALF_UP).doubleValue();
if (Double.NEGATIVE_INFINITY == leftValues[i]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) We usually put constant on the right hand side

_doubleValuesSV[i] = BigDecimal.valueOf(leftValues[i])
.setScale(_scale, RoundingMode.HALF_UP).doubleValue();
if (Double.NEGATIVE_INFINITY == leftValues[i]) {
_doubleValuesSV[i] = Double.NaN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to match that behavior. We want to handle all special values, including Inf, -Inf, NaN

@mghildiy
Copy link
Contributor Author

Please review the commit. I would add unit tests accordingly.

Comment on lines 135 to 139
if (values[colId] instanceof Double) {
values[colId] = Double.NEGATIVE_INFINITY;
} else {
values[colId] = null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. Do we need to change this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want this behaviour for decimals only. For other flows, behaviour must be as it is currently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When null is enabled, we should be able to handle null value properly. Do you observe issues if reverting this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. This change is not needed.

_doubleValuesSV[i] = BigDecimal.valueOf(leftValues[i])
.setScale(_scale, RoundingMode.HALF_UP).doubleValue();
if (leftValues[i] == Double.NEGATIVE_INFINITY || leftValues[i] == Double.POSITIVE_INFINITY ||
leftValues[i] == Double.NaN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NaN is not comparable. We can probably use a try-catch over result computation, and set the result to be left value if exception is caught

Copy link
Contributor Author

@mghildiy mghildiy Apr 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we can check with Double.isNaN. It's documentation says:
'This method corresponds to the isNaN operation defined in IEEE 754.'

So if its NaN, what value we return? From postgres documentation:

In addition to ordinary numeric values, the numeric type allows the special value NaN, meaning "not-a-number". Any operation on NaN yields another NaN.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let's follow the same behavior then. You may use a try-catch to avoid the overhead of value check for happy path

Copy link
Contributor Author

@mghildiy mghildiy May 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently Double.NaN, Double.NEGATIVE_INFINITY, Double.POSITIVE_INFINITY are mapped to null(in null bit map). I am debugging code to ensure that these 3 values are not part of null bit map.

1. For round function for decimal values, if value is infinity or NaN, assign as such with no rouding operation
2. If null bit map for column contains docid and it's a Double, set value to negative infinity
.setScale(_scale, RoundingMode.HALF_UP).doubleValue();
try {
_doubleValuesSV[i] = BigDecimal.valueOf(leftValues[i])
.setScale(_scale, RoundingMode.HALF_UP).doubleValue();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please apply Pinot Style and reformat the changes

}
} else {
for (int i = 0; i < length; i++) {
_doubleValuesSV[i] = (double) Math.round(leftValues[i]);
if (leftValues[i] == Double.NEGATIVE_INFINITY || leftValues[i] == Double.POSITIVE_INFINITY ||
leftValues[i] == Double.NaN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
leftValues[i] == Double.NaN) {
Double.isNaN(leftValues[i])) {

@Jackie-Jiang Jackie-Jiang merged commit d3c322d into apache:master Aug 22, 2025
30 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants