Skip to content

Conversation

@pcw109550
Copy link
Contributor

@pcw109550 pcw109550 commented Oct 23, 2025

Description

Adds acceptance tests to lock down the edge cases with engine API interactions, between CL and the EL. All are related with the removal of RR Sync.

Tests

  1. TestELP2PFCUInvalidHash
// TestELP2PFCUInvalidHash verifies that when an Execution Layer (EL) client
// receives a Forkchoice Update (FCU) with an unknown head hash (invalid or
// non-existent) during EL syncing, it remains in the "SYNCING" state and does
// not advance its canonical chain.

No action items.

  1. TestSafeDoesNotAdvanceWhenUnsafeIsSyncing_NoELP2P
// TestSafeDoesNotAdvanceWhenUnsafeIsSyncing_NoELP2P verifies Engine API semantics
// where ForkchoiceUpdate (FCU) validates the unsafe target first and, if the unsafe
// head is not directly appendable (e.g., there is a gap), FCU returns SYNCING and
// exits early without updating the safe head—even when the provided safe hash is
// independently appendable.

Action item: We may still want to advance the safe head while we are EL Syncing, and this test shows op-node might not advance unsafe heads due to FCU behavior. Check op-node behavior.

  1. TestInvalidPayloadThroughCLP2P
// TestInvalidPayloadThroughCLP2P verifies that invalid L2 payloads propagated via
// CL P2P (simulated with admin_postUnsafePayload) do not advance either the CL or EL.
//
// The test first confirms normal progress on a valid target, then exercises three
// invalid cases and asserts no advancement on both sides (unsafe head remains at
// startNum+1):
//
//  1. CL-detectable invalidity (bad block hash):
//     The payload is mutated (e.g., StateRoot) without updating BlockHash.
//     The CL rejects it immediately (hash mismatch) and does not relay it to the EL.
//
//  2. EL-only invalidity (bad state root):
//     The payload's BlockHash is recomputed so the CL relays it via engine_newPayload,
//     but the EL rejects it during execution due to an invalid state root.
//
//  3. EL-only invalidity via invalid parent:
//     A new payload builds on a previously rejected block (an invalid parent),
//     causing the EL to reject it as referencing an invalid ancestor.

The CL and the EL will not advance its unsafe heads if the payload is INVALID. More precisely, if the engine_newPayload returned INVALID, there will be no further engine_forkchoiceUpdate call, as well as NOT bumping the unsafe head. So we never update the CL's unsafe head if the result is INVALID. In other words, unsafe head which can be queried from CL's engine API responses were not INVALID, but rather VALID or SYNCING.

Related: #17627 (comment)

  1. TestCLUnsafeNotRewoundOnInvalidDuringELSync
// TestCLUnsafeNotRewoundOnInvalidDuringELSync verifies that the CL's unsafe head
// is not rewound when the EL returns INVALID for a payload during EL sync.
//
// When the EL is still syncing and cannot append new blocks, ForkchoiceUpdate
// returns SYNCING. In this state, the CL may continue to advance its unsafe head
// as it processes new targets, creating temporary divergence from the EL.
//
// The test then crafts a payload that the EL can still validate—even though it is
// not appendable to the EL's current head—by introducing a detectable fault in the
// payload itself (e.g., malformed ExtraData). The CL relays this payload through
// engine_newPayload, and the EL immediately responds INVALID based on intrinsic
// payload validation. The EL does not advance or trigger sync for this payload,
// and the CL's unsafe head remains unchanged, without rewinding.
//
// This confirms that an INVALID response during EL sync halts advancement but does
// not cause the CL's unsafe head to regress, preserving the last known valid head
// while maintaining correct Engine API semantics.

op-node does not rewind unsafe head which was advanced while EL Syncing when engine_newPayload response is INVALID. For example consider the following scenario:

  1. op-node / op-geth synced to block N
    • op-node unsafe: N, op-geth unsafe: N
  2. op-node receives unsafe payload N + 2 and injects to EL. op-node EL syncs. op-geth returns SYNCING.
    • op-node unsafe: N + 2, op-geth unsafe: N
  3. op-node receives unsafe payload N + 4 and injects to EL. op-node EL syncs. op-geth returns SYNCING.
    • op-node unsafe: N + 4, op-geth unsafe: N
  4. op-node receives unsafe payload N + 5, and injects to EL.
    • The payload was INVALID, EL was able to determine the payload state
    • op-node unsafe is still N + 4, op-geth unsafe: N
      So when INVALID is observed to the op-node while EL Syncing(VALID -> SYNCING -> ... -> SYNCING -> INVALID), we may need to rewind the op-node's unsafe label to N, not staying at N + 4.

Action item: We may patch op-node to rewind the unsafe head when INVALID encountered while EL Syncing

Related: #17627 (comment)

We may choose one of

  1. Reset unsafe head to the last unsafe head we got a VALID response for (not SYNCING!)
    • If we do nothing, the unsafe head will stay at SYNCING
  2. Reset unsafe head back to the current unsafe head according to the EL
  3. Reset unsafe head back to the safe head we're sending
  4. Do a full reset of the derivation pipeline

Additional context

These tests were added to lock in the behavior while discussing the edge cases of EL Syncing. Tracked at #17694

@pcw109550 pcw109550 requested a review from nonsense October 23, 2025 15:14
@pcw109550 pcw109550 changed the title op-acceptance-tests: Edge cases while EL Syncing op-acceptance-tests: Edge cases with engine APIs Oct 24, 2025
@pcw109550 pcw109550 marked this pull request as ready for review October 24, 2025 14:52
@pcw109550 pcw109550 requested review from a team as code owners October 24, 2025 14:52
@pcw109550 pcw109550 requested review from axelKingsley and removed request for axelKingsley October 24, 2025 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant