Separate immutable code and descriptors from function objects. #2091
@gbrail, let me know if you'd like this split into smaller PRs. The first 3 commits are all laying the groundwork for the changes to the interpreter and class compilation, and the tidy-up and test properties update could reasonably be split out as well. |
Sigh, that will teach me to test locally on all Java versions. |
The SecurityControllerTest, if I remember correctly, is a problematic test that happens to be working for current Rhino. At least 2 PRs tried to fix it, but none were merged. For example: https://github.com/FOCONIS/rhino/blob/better-type-support/tests%2Fsrc%2Ftest%2Fjava%2Forg%2Fmozilla%2Fjavascript%2Ftests%2FSecurityControllerTest.java#L91-L91 |
In this case it's just that I wasn't setting the security domain stuff on descriptors, and JDK21 really doesn't care. I'll fix it tomorrow. |
Also, we do appear to have at least a smoke test for the debugger in our fork that should be easy to upstream. I’ll add that to this PR tomorrow as well. |
If this means that you got rid of the giant switch table in every compiled class, and it looks like you did, I'm quite excited! I will spend some time reading over the next few days, thanks for this so far and let me try to understand it better! |
Yes, the giant switch tables are all gone. |
On this I have a rough idea. We can make type signatures auto-generated from the actual class/method on class init, so that they use runtime info and always reflect the actual class/method signature. Some sort of mark-scan approach, like the one used in #1950, should be capable of doing this. Another thing is matching actual object types with static arg types. I believe this will require ClassWriter to have the ability to track on-stack object types and method param/return types; some high-level bytecode manipulation framework might be required. |
Yeah, I was planning on resolving the actual methods so that a type mismatch would fail as early as possible. Trying to be too clever with types on the stack can be actively unhelpful in my experience. |
I also almost have a nice refactoring of our compiler API to make it type safe. I'm not quite happy with that yet, so I think it should wait to go into a follow-up PR. We’ve survived with casting from object up till now. |
I really appreciate this and I'm especially excited about cleaning up the bytecode. You've also added a bunch of new interfaces which are pretty fundamental to how the new structure works. For the sake of future maintainers, can you please add some docs? We don't need anything crazy, but top-level comments on "JSCode" and "JSScript" and other new interfaces like that, along with a bit on what the methods mean, is going to help a lot of people (and me!). I won't be offended if some of the docs are written by AI either. Thanks! |
One other thing in the interest of future maintenance -- the JSDescriptor constructor, with its long long list of parameters, seems like a great candidate for a "Builder" pattern. With that said, I'll look more at who has to actually use it, but if it's humans then that might be a more flexible long-term pattern. |
Heh @gbrail, I meant to reply to this earlier. The descriptors do have a builder object, with mutable fields, but I had not created it as classic builder with I realised while tidying up other things on my scope branch was that I needed to touch the I've now added a smoke test and support for class compilation. One thing I'm not sure about is the expected interface provided here. As far as I can tell it's simply that one of the classes generated by the compiler will contain a I'll separate the debug and compiler smoke tests into a separate PR because they are likely useful to others. While writing the smoke test I also discovered that we had one bug in the bytecode compilation that prevented arrow functions being compiled in one edge case, and this wasn't showing up as a test failure because we fall back silently to the interpreter. I tried disabling the fallback mode after fixing that and found 18 failing Mozilla suite tests (failing due to large script size). The main branch has the same failing tests, along with 4 test262 cases which also failed due to large method size. Should we make a separate change to disable fallback in test262 tests to ensure we do not regress on this? |
The occasional build failure seems to be a bug in JaCoCo when dealing with large numbers of generated classes and hash collisions. I'll see if I can turn off coverage for the generated |
Right, I've excluded the small generated accessors from coverage. That roughly halves the size of the generated coverage file and will mean we are much less likely to hit a collision. |
I like the idea of finding a way to test when we regress due to large script size -- maybe a way to mark tests that we know will cause a fallback to interpreted mode and disable it for others. (Perhaps the RhinoConfig mechanism could be used to set a flag we enable when running the tests?) |
OK, I get it on the big constructor thing, this is making sense and I like how it's finding limitations in other areas. I'd still like JavaDoc for classes like JSCode, JSFunction, JSScript so that future generations will understand what they're doing when they try to maintain this! |
Sure, that seems entirely reasonable. |
Awesome this is great -- Thanks! |
Separate function objects and the code they execute.
What's the problem we're trying to solve here?
The way we have traditionally represented JavaScript functions has a few drawbacks which we've encountered:
`LambdaFunction` and similar)). This class hierarchy starts to become especially strained if we consider what would be needed to support `class` natively (where the constructor carries extra information about fields and methods), or start to support additional compilation levels or new interpreters.
The overall solution
This change introduces `JSCode` objects which have an execute and a resume method. These objects are then used to represent the call or construct actions on an object, and are held as part of a `JSDescriptor` which carries all the metadata related to `JSFunction`s or `JSScript`s created from these descriptors.
The `JSFunction` then only needs to inherit from `BaseFunction` and hold a home object and a lexically bound `this` if it represents an arrow function.
`InterpreterData` has been refactored to be a subclass of `JSCode`, and its metadata moved to a descriptor, and the class compiler now produces static methods which are invoked by instances of `OptJSCode` objects.
The construction of the descriptors has been factored out into common code used by both the interpreter and the class compiler, and the now unnecessary classes (`ArrowFunction`, `InterpretedFunction`, and `NativeFunction`) removed.
Direct calls and function creation within compiled classes
Direct calls within a compiled script were done using an `instanceof` check (to ensure the callee was compiled from this class) and a comparison of its function index. This has been changed to a check that the descriptor of the callee matches the descriptors injected into the class as part of its initialisation. These descriptors are also used for the creation of function objects within the class.
Creating the `OptJSCode` instances
I tried multiple techniques for creating instances of `OptJSCode` objects to invoke the static compiled methods (reflection, method handle invokes, method handle proxies, and horrible abuses of `LambdaMetafactory`) and found the best to be the compilation of subclasses separate from the main script compilation. It might be possible to ease metaspace pressure in large systems by using hidden classes, but these are only supported on Java 15+ so I haven't done that as part of this change.
Performance and scalability
Most benchmarks remain largely unchanged, though `earley` (which creates an extremely large number of activation frames) shows a 35% to 40% speed-up, as the compiled overrides of `NativeFunction` methods averaged over 2000 bytes and posed a substantial optimisation barrier.
With this change we are able to compile considerably larger scripts which previously generated `NativeFunction` overrides exceeding the 64K method bytecode limit. We do still see issues with some large scripts due to bugs and limitations in our constant pool handling.
Loose ends and future work
There are a few loose ends which I have not tackled as part of this PR, but will tackle in follow-up work:
- The `Evaluator` interface is poorly designed internally and passes opaque objects about which can then be turned into either `Script`s or `Function`. I'd like to refactor that, and this would also help to remove the remaining suppression of unchecked casts we need in this area.
- `Signatures.java` in Compute (instead of define) method descriptor in `.optimizer.Signatures` #1948, but that would still push errors to run time. I think we can actually do this at class initialisation time and catch errors much earlier, and simplify work on class compilation in the process.
- Debugger interface testing. I am far from sure the debug interface is still working, but I don't appear to have broken any tests. Smoke test added.
Future enhancements that might be made based on these changes
- `invokeDynamic` based solution that would work across compiled classes. We would need to expand `OptJSCode` to provide a mechanism for getting the right method handle, but this would not be too hard to implement, with the immutable descriptors providing a simple check that the function being invoked is the expected one.
- `LambdaFunction` and `LambdaConstructor` for built in functions. I've tried converting a small number of native functions over as proof of concept and the idea appears to work well. Future work might allow us to combine the replacement of direct call linking with the movement of built in functions to `JSFunction`s to allow optimised linking of built in functions.
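For readers who want a concrete picture of the split described in "The overall solution" and "Direct calls and function creation within compiled classes", here is a minimal sketch. It is not Rhino's actual API: `Code`, `Descriptor`, `FunctionObject`, `AddCode`, `compiledAdd` and `isDirectCallTarget` are hypothetical, heavily simplified stand-ins for `JSCode`, `JSDescriptor`, `JSFunction`, an `OptJSCode` subclass, a compiler-generated static method and the direct-call guard. The real interfaces also carry a resume method, scopes and security domains, which are omitted here.

```java
// Hypothetical, simplified stand-ins for the types discussed in this PR.
// None of these names or signatures are Rhino's real API.
final class DescriptorSketch {

    /** Executable code, independent of any function object (cf. JSCode). */
    interface Code {
        Object execute(FunctionObject callee, Object thisObj, Object[] args);
    }

    /** Immutable metadata shared by every function created from it (cf. JSDescriptor). */
    static final class Descriptor {
        final String name;
        final int paramCount;
        final Code code;

        Descriptor(String name, int paramCount, Code code) {
            this.name = name;
            this.paramCount = paramCount;
            this.code = code;
        }
    }

    /** Holds only per-instance state and delegates execution (cf. JSFunction). */
    static final class FunctionObject {
        final Descriptor descriptor;
        final Object homeObject;

        FunctionObject(Descriptor descriptor, Object homeObject) {
            this.descriptor = descriptor;
            this.homeObject = homeObject;
        }

        Object call(Object thisObj, Object... args) {
            return descriptor.code.execute(this, thisObj, args);
        }
    }

    /** Stand-in for a static method emitted by the class compiler. */
    static Object compiledAdd(FunctionObject callee, Object thisObj, Object[] args) {
        return (Integer) args[0] + (Integer) args[1];
    }

    /** Small, separately defined class that invokes the static method (cf. OptJSCode). */
    static final class AddCode implements Code {
        @Override
        public Object execute(FunctionObject callee, Object thisObj, Object[] args) {
            return compiledAdd(callee, thisObj, args);
        }
    }

    /** Direct-call guard: compare descriptors by identity rather than instanceof plus index. */
    static boolean isDirectCallTarget(Object callee, Descriptor expected) {
        return callee instanceof FunctionObject
                && ((FunctionObject) callee).descriptor == expected;
    }

    public static void main(String[] args) {
        Descriptor add = new Descriptor("add", 2, new AddCode());
        // Two function objects share one immutable descriptor and one Code instance.
        FunctionObject first = new FunctionObject(add, null);
        FunctionObject second = new FunctionObject(add, null);
        System.out.println(first.call(null, 1, 2));
        System.out.println(isDirectCallTarget(second, add));
    }
}
```

The point of the sketch is that any number of function objects can share one immutable descriptor, so a caller that was handed that descriptor at class initialisation can verify its callee with a single identity comparison.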