Skip to content

Conversation

kunalspathak
Copy link
Contributor

In #53214, I included the information of which EFlags gets updated per instruction. But for few instructions like imul and div, the flags were wrong. Fix those entries. I also went through each instruction in Intel's manual "Appendix A" "EFlags Cross-Reference" . I also made sure that I capture all possible state of EFlags per instruction e.g. undefined or restore so in future we can mitigate such issue.

A request to reviewer to please cross check from the Intel manual to make sure if I didn't miss anything.

No asmdiffs in libraries/benchmarks. Only asmdiff in coreclr_tests.pmi is the test case that was failing.

Summary of Code Size diffs:
(Lower is better)

Total bytes of base: 25772
Total bytes of diff: 25777
Total bytes of delta: 5 (0.02% of base)
Total bytes of relative delta: 0.016923668075580132
    diff is a regression.
    relative diff is a regression.


Top file regressions (bytes):
           3 : 228538.dasm (0.01% of base)
           2 : 86265.dasm (1.68% of base)

2 total files with Code Size differences (0 improved, 2 regressed), 0 unchanged.

Top method regressions (bytes):
           3 ( 0.01% of base) : 228538.dasm - ILGEN_0x372a9ae6:Method_0xdc6ff1a4(byte,byte,int,long,ushort,double,long,long):int
           2 ( 1.68% of base) : 86265.dasm - Test:A(int):int

Top method regressions (percentages):
           2 ( 1.68% of base) : 86265.dasm - Test:A(int):int
           3 ( 0.01% of base) : 228538.dasm - ILGEN_0x372a9ae6:Method_0xdc6ff1a4(byte,byte,int,long,ushort,double,long,long):int

2 total methods with Code Size differences (0 improved, 2 regressed), 0 unchanged.

Fixes: #53781

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 7, 2021
@kunalspathak
Copy link
Contributor Author

@dotnet/jit-contrib, @tannergooding

// Note that emitter has only partial support for BT. It can only emit the reg,reg form
// and the registers need to be reversed to get the correct encoding.
INST3(bt, "bt", IUM_RD, 0x0F00A3, BAD_CODE, 0x0F00A3, INS_FLAGS_WritesAllFlags) // PF = ?, AF = ?, ZF = ?, SF = ?, OF = ?
INST3(bt, "bt", IUM_RD, 0x0F00A3, BAD_CODE, 0x0F00A3, Undefined_OF | Undefined_SF | Undefined_AF | Undefined_PF | Writes_CF )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is interesting and looks to possibly be an AMD vs Intel difference.

Intel specifies ZF is unaffected while AMD specifies it is undefined

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, AMD had a different implementation of this instruction (and all the other BTx instruction) decades ago prior to/around the Athlon64. Since then the documentation declares these flags as being Undefined. All recent CPUs behave like Intel's. There were some issues with the gcc optimizer that assumed these flags didn't change, causing branching errors. That's why I remember.

On a side note, the BTx instructions have a nasty behavior when applied on memory and the index-register (or immediate value) is larger than the size of the destination. Then the destination address is taken as an array, and you can get out of bounds unaware. bts qword [rax], 0xC0 behaves like bts qword [rax+24], 0 -- I just mention it for safety reasons... I'm sure the JIT team is aware of it. Or as the code comment suggests, isn't using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep it as Undefined_ZF in that case, to be conservative. No optimization will take benefit from flags that are marked as Undefined for a given instruction.

@kunalspathak
Copy link
Contributor Author

@BruceForstall or @AndyAyersMS - Can you take a look. This fixes a blocking outerloop issue.

Copy link
Contributor

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question

The table looks fine, but I didn't review it thoroughly

@ghost
Copy link

ghost commented Jun 8, 2021

Hello @kunalspathak!

Because this pull request has the auto-merge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Test failure in JIT/Regression/JitBlue/GitHub_13822

4 participants