Skip to content

Skip incomplete parts of numbers #226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 6, 2025
Merged

Conversation

sebastienros
Copy link
Owner

@sebastienros sebastienros commented Jun 6, 2025

Applying this idea

@sebastienros sebastienros merged commit 36ddc8f into main Jun 6, 2025
1 check passed
@sebastienros sebastienros deleted the sebros/consumenumber branch June 6, 2025 18:15
@emayevski
Copy link
Contributor

emayevski commented Jun 7, 2025

@sebastienros
There's a problem with parsing numbers in regard to decimal separators now.

A number like "123." should be treated as "123". It sort of is, however, parsing fails. Why? Let's look at this code in the ReadDecimal method:

        if (allowDecimalSeparator && Cursor.Current == decimalSeparator)
        {
            Cursor.AdvanceNoNewLines(1);

            var numberIsEmpty = number.IsEmpty;

            if (!ReadInteger(out number))
            {
                Cursor.ResetPosition(beforeDecimalSeparator);

                // A decimal separator must be followed by a number if there is no integral part, e.g. `[NaN].[NaN]`
                if (numberIsEmpty)
                {
                    return false;
                }

                number = Cursor.Buffer.AsSpan(start.Offset, Cursor.Offset - start.Offset);
                return true;
            }
        }

The parser has read "123" and got into this block. The number is not empty, but there is no number after a decimal separator. So, the reader resets the cursor position to before the decimal separator and returns "123". The separator in this situation must be consumed, so that the consequent scanning works properly.

With current code, multiple NCalc tests fail.

This is for this reason, I did not include any checks to the decimal separator part in my pull request (I only modified the part related to group separators).

The following change fixes the problem:

        if (allowDecimalSeparator && Cursor.Current == decimalSeparator)
        {
            Cursor.AdvanceNoNewLines(1);

            var numberIsEmpty = number.IsEmpty;

            if (!ReadInteger(out number))
            {
                //Cursor.ResetPosition(beforeDecimalSeparator); // do not reset the position

                // A decimal separator must be followed by a number if there is no integral part, e.g. `[NaN].[NaN]`
                if (numberIsEmpty)
                {
                    return false;
                }

                number = Cursor.Buffer.AsSpan(start.Offset, Cursor.Offset - start.Offset -1); // subtract one
                return true;
            }
        }

emayevski added a commit to Allied-Bits-Ltd/parlot that referenced this pull request Jun 9, 2025
@sebastienros
Copy link
Owner Author

Ok, so we need to assume that when AllowDecimalSeparator is set, we need to consume it even if the next part is empty or not a valid?

And 123.e2 should not be valid either, only consume 123. and return 123 ?

@emayevski
Copy link
Contributor

Ok, so we need to assume that when AllowDecimalSeparator is set, we need to consume it even if the next part is empty or not a valid?

Yes. This is because a number cannot end with a group separator, but it can end with a decimal separator (in some countries, it is a popular form).

And 123.e2 should not be valid either, only consume 123. and return 123 ?

I am not sure about this one. As I mentioned elsewhere, I have not found such a notation for scientific numbers. So I assume that you nailed it - consume 123 with a dot and return just a number 123, leaving "e" for another parser. But maybe someone comes out and says "in my university we write 123.e2 all the time" :).

@sebastienros
Copy link
Owner Author

Then let's say that if the exponent is defined in options 123.e2 will be valid. double.Parse handles it correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants