-
Notifications
You must be signed in to change notification settings - Fork 902
Replace DtoA code with a newer algorithm #2117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Number-to-string conversions in Rhino have historically been made using the DtoA class, which is based on 20-year-old reference C code. A while back we added support for a faster algorithm (the Grisu algorithm). This change switches to the Schubfach algorithm, which is a relatively new algorithm used in many standard libraries including Java. The resulting code is much simpler, while still matching the output of the original code and also the output of V8. In a few cases, it is also a bit faster.
Nice! Where does this leave #254? |
I'm a little nervous of doing this, partly because this is replacing one implementation which we likely will not maintain with another that , and partly because introducing an additional licence means there will be some pain in adopting these changes in forks. Is there a reason we cannot use a Java |
Those signed integers in Java will get you every time! Thanks for the pointer to the Nashorn test data. That data (which originated in V8) helped find some bugs. I'm going to switch to an implementation much closer to the one from the inventor of Schubfach, which will take me a few days, so I'm going to close this PR and open a new one that's better. Thanks! |
I get the concerns about maintenance, although I do say that in the end we're going to end up with quite a bit less code in this area, and in a way I kind of think the concerns about maintenance apply to the whole project... As for the license -- the code I'm currently looking at has an MIT license, and we already have MIT-license code in this project from V8. It would require people who adopt Rhino to update their NOTICE file, yes, but TBH mature companies shipping commercial software that relies on Rhino ought to be doing that. Or, if maintaining a fork is a big burden, they can join those of you already on this thread by helping or funding people to work on the project. |
And I haven't been able to crack the code on using DecimalFormat instead of all this -- JavaScript has very specific formatting rules that require different formats depending on the exponent and I'm not sure how simple they are -- I'll keep experimenting, though. Plus DecimalFormat has other difficulties like it's unusual relationship with threads. After this particular PR, however, in whatever form it eventually takes, the implementation of toExponential, toDecimal, and toFixed is basically a few lines of code that creates a BigDecimal instance, rounds the number, and from then on it's purely string manipulation to create the output, so again a lot of complexity will go away. |
I think associating a formatter with context would be sufficient, we already make lots of assumptions about a context being associated with a single thread even when we relax those assumptions around individual objects. having more test vectors is good. Most of the bugs I’ve found in the old code are to do with sign extensions, so I recommend adding extra cases round MSBs and similar. Happy to help in generating additional test data if you need it. |
I will keep experimenting with both the Schubfach code (which when all set
up right is super fast) and DecimalFormatters (we will require at least
two) and get back!
…On Sat, Oct 11, 2025 at 12:12 PM Duncan MacGregor ***@***.***> wrote:
*aardvark179* left a comment (mozilla/rhino#2117)
<#2117 (comment)>
I think associating a formatter with context would be sufficient, we
already make lots of assumptions about a context being associated with a
single thread even when we relax those assumptions around individual
objects.
having more test vectors is good. Most of the bugs I’ve found in the old
code are to do with sign extensions, so I recommend adding extra cases
round MSBs and similar. Happy to help in generating additional test data if
you need it.
—
Reply to this email directly, view it on GitHub
<#2117 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAD7I24S2L4UROFZXCV3QML3XFJAVAVCNFSM6AAAAACIZR5GU6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTGOJTGU4TSNZXGI>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
Our existing code to convert numbers to decimal is based on a very old
set of C code from 2000. It was enhanced a while back with some code from
V8 that is faster but still must fall back to the old code. All of this code
is very complicated.
Recent developments in computer science have resulted in a new generation
of algorithms for this task. The "Schubfach" algorithm, for example, is
fast and does not take much code. Many libraries, including the JDK, use it
already.
We have to have our own floating-point conversion code because JavaScript
specifies slighty different behavior than Java, but we can use the same
algorithm.
This PR replaces the two generations of DtoA code with an implementation of
Schubfach for the general-purpose case of converting doubles to strings.
The new code is much shorter than the old code. It is faster for most, but
not all, use cases, although this does not affect our standard benchmark
results unfortunately.
Subsequent PRs will also replace the Number functions toExponential,
toPrecision, and toFixed, with code based on this same base, so we'll
be able to retire yet more of the old stuff. Finally, we'll see if there
are other options for simplifying the case of converting numbers to
strings with non-base-10 representations.
I've verified this code with a pretty good set of test cases and also
compared intermedate steps in the output with the C++ code that it's based
on and it has been correct so far.