
@BruceForstall
Contributor

Improve recognized loop invariants

When a loop has a single exit, the loop table stores a pointer to
the loop exit block. Ideally, we would have the property that the
loop entry block dominates the single exit block, and you could
thus walk up the IDom list from the exit to the entry block.

A peculiar loop structure on x86 only was preventing this: an "infinite"
loop with a "try/catch" where the only "exit" was from the "catch"
handler. For non-x86, the handler would have been moved out-of-line
as a funclet. But for x86, the handler is still in-line with the
loop blocks. This structure is peculiar because the catch handler
has no predecessors and doesn't participate "normally" in the dominator
tree.

Prevent handler blocks like this from being considered loop exits.
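
The dominator property described above can be illustrated with a minimal sketch. `Block` and `EntryDominatesExit` are hypothetical names for illustration only, not the JIT's `BasicBlock` or any of its APIs, and the sketch assumes a handler with no predecessors hangs directly off the method's first block in the dominator tree.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a flow graph block (not the JIT's BasicBlock).
struct Block
{
    int          num;  // block number, e.g. 2 for BB02
    const Block* idom; // immediate dominator; nullptr for the root

    explicit Block(int n, const Block* d = nullptr) : num(n), idom(d) {}
};

// Walks up the immediate-dominator chain starting at `exit`.
// Returns true if the walk reaches `entry`, i.e. `entry` dominates `exit`.
bool EntryDominatesExit(const Block* entry, const Block* exit)
{
    assert(entry != nullptr && exit != nullptr);
    for (const Block* b = exit; b != nullptr; b = b->idom)
    {
        if (b == entry)
        {
            return true;
        }
    }
    return false;
}
```

With BB01 as the root, BB02 as the loop entry, and BB04 dominated by BB02, the walk from BB04 reaches BB02. Under the assumption above, an in-line catch handler such as BB03 has BB01 as its immediate dominator, so a walk from it never passes through the loop entry, and the property fails if that handler is recorded as the loop exit.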

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Apr 7, 2023
@ghost ghost assigned BruceForstall Apr 7, 2023
@ghost

ghost commented Apr 7, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.


Author: BruceForstall
Assignees: BruceForstall
Labels: area-CodeGen-coreclr

Milestone: -

@BruceForstall
Contributor Author

There's a linux-arm Alpine System.Threading.Tasks.Dataflow.Tests failure, but it appears to be an infra failure, and there is no log file:

This is not a helix test, for this reason there are no artifacts available in this tab

@BruceForstall
Contributor Author

BruceForstall commented Apr 7, 2023

Diffs

No asm diffs.

TP improvement on linux-arm in MinOpts contexts in coreclr_tests.run.linux.arm.checked.mch, which doesn't make much sense since the only non-x86, non-DEBUG change was to move one Release check to DEBUG in hoisting, which shouldn't affect MinOpts.

@BruceForstall
Contributor Author

@AndyAyersMS PTAL
cc @dotnet/jit-contrib

Member

@AndyAyersMS AndyAyersMS left a comment

I'm puzzled how we would ever consider a BBJ_ALWAYS or BBJ_CATCHRET block whose sole successor is outside the loop to be inside the loop.

@BruceForstall
Contributor Author

Here's one example (partial block list):

***************  Natural loop table
L00, from BB02 to BB04 (Head=BB01, Entry=BB02, Exit=BB03) prehead

-----------------------------------------------------------------------------------------------------------------------------------------
BBnum BBid ref try hnd preds           weight    IBC  lp [IL range]     [jump]      [EH region]         [flags]
-----------------------------------------------------------------------------------------------------------------------------------------
BB01 [0000]  1                             1       4    [000..00F)                                     i hascall newobj LoopPH IBC
BB02 [0002]  2  0    BB01,BB04            1472. 5888  0 [00F..01E)-> BB04 (always) T0      try { }     keep i try Loop hascall gcsafe idxlen bwd IBC align
BB03 [0003]  1     0                       0       0  0 [01E..021)-> BB05 (always)    H0   catch { }   keep i rare hascall bwd IBC
BB04 [0004]  1       BB02                  0       0  0 [021..02F)-> BB02 (always)                     i rare hascall gcsafe bwd bwd-src IBC
BB05 [0005]  1       BB03                  0       0    [02F..038)-> BB07 ( cond )                     i rare IBC

Note how BB03 is within the loop, and is the only branch out of the loop (x86 uses BBJ_ALWAYS to exit catches).

(It's from this code:

private char[] Read_byte_int_int(SerialPort com)
{
    var receivedBytes = new List<byte>();
    var buffer = new byte[DEFAULT_READ_BYTE_ARRAY_SIZE];
    int totalBytesRead = 0;
    int numBytes;
    while (true)
    {
        try
        {
            numBytes = com.Read(buffer, 0, buffer.Length);
        }
        catch (TimeoutException)
        {
            break;
        }
        receivedBytes.InsertRange(totalBytesRead, buffer);
        totalBytesRead += numBytes;
    }
    if (totalBytesRead < receivedBytes.Count)
        receivedBytes.RemoveRange(totalBytesRead, receivedBytes.Count - totalBytesRead);
    return com.Encoding.GetChars(receivedBytes.ToArray());
}
)

@AndyAyersMS
Member

Still puzzled. Since your fix is in CheckForExit, which is only called for blocks that are in the loopBlocks set, I assume BB03 ended up in this set.

But how? The set contents are computed by HasSingleEntryCycle where we reverse walk from back edges (here BB04->BB02) to find the loop blocks. Seemingly this should just find BB04 and BB02 and stop, and during compaction we should try and move BB03 out from in between as it is a non-loop block. But evidently that's not what happens.

And more generally, how can any BBJ_ALWAYS block end up as an exit? If the successor is outside the loop then we can't reverse walk from within the loop to that block.
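
The reverse walk from a back edge mentioned above can be sketched roughly as follows. This is a simplified model, not the JIT's HasSingleEntryCycle: `PredMap` and `CollectLoopBlocks` are hypothetical names, blocks are plain integers, and the predecessor lists are the ones shown in the block dump earlier in the thread.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical predecessor map keyed by block number (not a JIT data structure).
using PredMap = std::map<int, std::vector<int>>;

// Collects the blocks reached by walking predecessors backward from `tail`
// (the source of the back edge), stopping at `head` (the loop entry).
std::set<int> CollectLoopBlocks(const PredMap& preds, int head, int tail)
{
    std::set<int>    loopBlocks{head}; // seeding with head stops the walk there
    std::vector<int> worklist;

    if (loopBlocks.insert(tail).second)
    {
        worklist.push_back(tail);
    }

    while (!worklist.empty())
    {
        int block = worklist.back();
        worklist.pop_back();

        auto it = preds.find(block);
        if (it == preds.end())
        {
            continue; // no recorded predecessors, e.g. a catch handler
        }
        for (int pred : it->second)
        {
            if (loopBlocks.insert(pred).second)
            {
                worklist.push_back(pred);
            }
        }
    }

    assert(loopBlocks.count(head) == 1);
    return loopBlocks;
}
```

For the back edge BB04->BB02 with predecessors BB02: {BB01, BB04}, BB04: {BB02}, and BB05: {BB03}, the walk yields only BB02 and BB04. BB03 has no predecessors and is never reached, which is why its presence in the loop block set needs a separate explanation.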

@BruceForstall
Contributor Author

The BB03, above, is initially not part of loopBlocks. During MakeCompactAndFindExits it is added by CanTreatAsLoopBlocks after it is determined that it can't be removed from the loop because "EH regions would be ill-formed if we moved these blocks out.":

if (!BasicBlock::sameEHRegion(previous, nextLoopBlock) || !BasicBlock::sameEHRegion(previous, moveAfter))
{
    // EH regions would be ill-formed if we moved these blocks out.
    // See if we can consider them loop blocks without introducing
    // a side-entry.
    if (CanTreatAsLoopBlocks(block, lastNonLoopBlock))
    {
        // The call to `canTreatAsLoop` marked these blocks as part of the loop;
        // iterate without updating `previous` so that we'll analyze them as part
        // of the loop.
        continue;
    }
    else
    {
        // We can't move these out of the loop or leave them in, so just give
        // up on this loop.
        return false;
    }
}

After it gets added to loopBlocks, it gets analyzed by CheckForExit.
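
As a rough illustration of the effect of the fix, here is a sketch of an exit scan that can skip handler blocks. It is not the actual CheckForExit code: `BlockInfo`, `FindLoopExits`, and the `inHandler` flag are hypothetical, and the exact condition the PR uses may differ.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical per-block summary (not the JIT's BasicBlock).
struct BlockInfo
{
    int              num;       // block number
    std::vector<int> succs;     // explicit flow successors
    bool             inHandler; // assumption: block lies in an in-line EH handler
};

// Returns the numbers of loop blocks that have a successor outside the loop.
// When `skipHandlerBlocks` is true, handler blocks are never reported as exits.
std::vector<int> FindLoopExits(const std::vector<BlockInfo>& blocks,
                               const std::set<int>&          loopBlocks,
                               bool                          skipHandlerBlocks)
{
    std::vector<int> exits;
    for (const BlockInfo& block : blocks)
    {
        if (loopBlocks.count(block.num) == 0)
        {
            continue; // not a loop block
        }
        if (skipHandlerBlocks && block.inHandler)
        {
            continue; // do not treat handler blocks as loop exits
        }
        for (int succ : block.succs)
        {
            if (loopBlocks.count(succ) == 0)
            {
                exits.push_back(block.num);
                break; // record each exiting block once
            }
        }
    }
    assert(exits.size() <= loopBlocks.size());
    return exits;
}
```

With loop blocks {BB02, BB03, BB04} and BB03 branching to BB05, the scan reports BB03 as the single exit unless handler blocks are skipped, in which case the loop has no recorded exit.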

@BruceForstall
Contributor Author

It's interesting how the condition

if (!BasicBlock::sameEHRegion(previous, nextLoopBlock) || !BasicBlock::sameEHRegion(previous, moveAfter))

doesn't check the [block, lastNonLoopBlock] range at all (which is the range being moved). It's not clear that's a precisely correct condition.

@AndyAyersMS
Member

I find the logic here a bit questionable too. At any rate that explains how we end up deciding that a non loop block is a loop block.

Not sure what to recommend at this point -- maybe check and see how often we end up deciding a non-loop block has to remain in the loop, and if it's rare, we just take this fix and move on, and if it's not rare, we decide if we want to pursue something more aggressive?

@BruceForstall
Contributor Author

Looks like 894 cases on win-x64 spmi replay of not moving out blocks due to EH, 883 on win-x86.

One case I looked at seemed bogus: we refused to move a throw exit out of the loop range because the next block was a try block; an easy flow modification would work fine.

I saw a legitimate case where we wouldn't move a throw (exit) in a try body to a location after the loop.

Another case that's somewhat bogus: we wouldn't move an ALWAYS that precedes a try. (In this particular case, if we did move it, it would have created odd flow, as the ALWAYS branched to a CALLFINALLY for the region.)

So, I think there are opportunities to move more blocks out of the loop. I'm not sure about the bug fixed by this PR, though: I can't remember the EH rules for x86 region placement.

I think it's worthwhile fixing this one case to get to a place where we can assert on the dominator relationship between the loop entry and exit, and the IDom tree. I don't think fixing this and improving the non-loop block range need to be related.

@AndyAyersMS
Member

I think it's worthwhile fixing this one case to get to a place where we can assert on the dominator relationship between the loop entry and exit, and the IDom tree. I don't think fixing this and improving the non-loop block range need to be related.

Ok by me.

@BruceForstall BruceForstall merged commit 082c5b7 into dotnet:main Apr 9, 2023
@BruceForstall BruceForstall deleted the EnsureLoopsRespectDominators branch April 9, 2023 17:39
@ghost ghost locked as resolved and limited conversation to collaborators May 9, 2023