Skip to content

Conversation

am11
Copy link
Member

@am11 am11 commented Apr 28, 2025

No description provided.

@ghost ghost added the area-VM-coreclr label Apr 28, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 28, 2025
Copy link
Contributor

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

@am11
Copy link
Member Author

am11 commented Apr 29, 2025

@EgorBot -amd -arm

using System.Reflection;
using BenchmarkDotNet.Attributes;

public class Bench
{
    public readonly int AReadonlyIntField = 42;

    [Benchmark]
    public int ReflectionFieldBoxing()
    {
        FieldInfo fieldInfo = typeof(Bench).GetTypeInfo().GetDeclaredField(nameof(AReadonlyIntField));
        Bench myInstance = new();

        object current = fieldInfo.GetValue(myInstance);

        fieldInfo.SetValue(myInstance, int.MinValue);

        return (int)current + (int)fieldInfo.GetValue(myInstance);
    }
}

@am11
Copy link
Member Author

am11 commented Apr 29, 2025

@jkotas, @EgorBo, WDYT? We can get rid of this HMF and later remove the managed helpers when JIT starts inlining boxes (in Tier 0 as it does in Tier 1).

@am11 am11 marked this pull request as ready for review April 29, 2025 08:45
@am11 am11 force-pushed the feature/HMF-removal/JIT_Box branch from aaadc26 to bf23c6d Compare April 29, 2025 11:19
@EgorBo
Copy link
Member

EgorBo commented Apr 29, 2025

@EgorBot -windows_intel -amd -arm

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;

public class Bench
{
    [Benchmark] public object Box8() => Box(new Buffer8());
    [Benchmark] public object Box16() => Box(new Buffer16());
    [Benchmark] public object Box32() => Box(new Buffer32());
    [Benchmark] public object Box256() => Box(new Buffer256());


    [MethodImpl(MethodImplOptions.NoInlining | 
                MethodImplOptions.NoOptimization)]
    static object Box<T>(T value) => (object)value;
}

[InlineArray(1)] public struct Buffer8 {
    long _element0;
}
[InlineArray(2)] public struct Buffer16 {
    long _element0;
}
[InlineArray(4)] public struct Buffer32 {
    long _element0;
}
[InlineArray(32)] public struct Buffer256 {
    long _element0;
}

@am11
Copy link
Member Author

am11 commented Apr 29, 2025

@EgorBot -windows_intel -amd -arm

using System.Runtime.CompilerServices;
using BenchmarkDotNet.Attributes;

public class Bench
{
    [Benchmark] public object Box8() => Box(new Buffer8());
    [Benchmark] public object Box16() => Box(new Buffer16());
    [Benchmark] public object Box32() => Box(new Buffer32());
    [Benchmark] public object Box256() => Box(new Buffer256());


    [MethodImpl(MethodImplOptions.NoInlining | 
                MethodImplOptions.NoOptimization)]
    static object Box<T>(T value) => (object)value;
}

[InlineArray(1)] public struct Buffer8 {
    long _element0;
}
[InlineArray(2)] public struct Buffer16 {
    long _element0;
}
[InlineArray(4)] public struct Buffer32 {
    long _element0;
}
[InlineArray(32)] public struct Buffer256 {
    long _element0;
}

@am11
Copy link
Member Author

am11 commented Apr 29, 2025

arm32 doesn't seem to be happy with the TLV commit despite src/coreclr/vm/arm/stubs.cpp was using JIT_Box_MP_FastPortable. 🤔

@filipnavara
Copy link
Member

filipnavara commented Apr 29, 2025

arm32 doesn't seem to be happy with the TLV commit

arm32 had the special alignment code path (RequiresAlign8)

@am11 am11 force-pushed the feature/HMF-removal/JIT_Box branch from 10ef59b to 841c89e Compare April 29, 2025 17:52
@am11 am11 force-pushed the feature/HMF-removal/JIT_Box branch from 0b8d8f9 to 0f4cd04 Compare April 29, 2025 18:09
@am11 am11 force-pushed the feature/HMF-removal/JIT_Box branch from 0f4cd04 to 7468c56 Compare April 29, 2025 18:49
@am11
Copy link
Member Author

am11 commented May 1, 2025

NativeAOT's AllocFast mechanism is much efficient, maybe coreclr can reuse it via asm sharing being introduced in #114982?

Alternatively, JIT can optimize/inline InternalAlloc{NoChecks}. It will not only speed up the existing helpers, but also allow us to complete all remaining HMFs; basically all the remaining ones are due to Alloc GC API call Allocate{Object,String,Array} end up calling GCHeapUtilities::GetGCHeap()->Alloc() and therefore require HELPER_METHOD_FRAME. If we move them to C# today as-is, they will show mixed regressions, just likt JIT_Box which also requrie HMF due to AllocateOjbect call in the fallback path.

@huoyaoyuan
Copy link
Member

NativeAOT's AllocFast mechanism is much efficient, maybe coreclr can reuse it via asm sharing being introduced in #114982?

I'm unsure if there is behavioral difference: #105949
But sharing allocation stubs between coreclr and nativeaot should be very appealing.

@am11 am11 force-pushed the feature/HMF-removal/JIT_Box branch from 100647b to 9dbafa3 Compare May 2, 2025 15:08
@am11 am11 force-pushed the feature/HMF-removal/JIT_Box branch from 9d00363 to 113056f Compare May 2, 2025 18:51
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@jkotas jkotas merged commit 9e490ef into dotnet:main May 2, 2025
93 of 95 checks passed
@am11 am11 deleted the feature/HMF-removal/JIT_Box branch May 2, 2025 21:19
@am11
Copy link
Member Author

am11 commented May 2, 2025

Thanks @jkotas, @EgorBo and @filipnavara for the help!

Do you think we can bring the rest of these to the same plan:

SetJitHelperFunction(CORINFO_HELP_NEWSFAST, JIT_TrialAllocSFastSP);
SetJitHelperFunction(CORINFO_HELP_NEWSFAST_ALIGN8, JIT_TrialAllocSFastSP);
SetJitHelperFunction(CORINFO_HELP_NEWARR_1_VC, JIT_NewArr1VC_UP);
SetJitHelperFunction(CORINFO_HELP_NEWARR_1_OBJ, JIT_NewArr1OBJ_UP);
or would they (some of them?) require different approach?

@filipnavara
Copy link
Member

or would they (some of them?) require different approach?

I'd think that they are good candidate for implementing in assembly and sharing with NativeAOT (as per #115134 (comment)). For some of them it looks like a quite straightforward mapping but perhaps I was just lucky to look at the ones that match up closely...

@am11
Copy link
Member Author

am11 commented May 2, 2025

Sounds good. Then we can follow @davidwrighton's footsteps once #114982 is completed. :)

@filipnavara
Copy link
Member

Then we can follow @davidwrighton's footsteps once #114982 is completed. :)

JFYI I have a prototype on top of #114982 that moves AllocFast.S/asm to the shared code base and replaces most of the helpers. Still needs a bit of love but wanted to let you know to avoid working on the same thing...

@am11
Copy link
Member Author

am11 commented May 5, 2025

Thanks! Can't wait to see class HelperMethodFrame cleanup all the way up to LazyMachState::unwindLazyState, which has one of 13 calls to PAL_VirtualUnwind -> unw_step. 😁

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-VM-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants