Commit ea4097e

michaelsproul authored and Woodpile37 committed
Add --light-client-server flag and state cache utils (sigp#3714)
## Issue Addressed

Part of sigp#3651.

## Proposed Changes

Add a flag for enabling the light client server, which should be checked before gossip/RPC traffic is processed (e.g. sigp#3693, sigp#3711). The flag is available at runtime from `beacon_chain.config.enable_light_client_server`.

Additionally, a new method `BeaconChain::with_mutable_state_for_block` is added which I envisage being used for computing light client updates. Unfortunately its performance will be quite poor on average because it will only run quickly with access to the tree hash cache. Each slot the tree hash cache is only available for a brief window of time between the head block being processed and the state advance at 9s in the slot. When the state advance happens the cache is moved and mutated to get ready for the next slot, which makes it no longer useful for merkle proofs related to the head block. Rather than spend more time trying to optimise this I think we should continue prototyping with this code, and I'll make sure `tree-states` is ready to ship before we enable the light client server in prod (cf. sigp#3206).

## Additional Info

I also fixed a bug in the implementation of `BeaconState::compute_merkle_proof` whereby the tree hash cache was moved with `.take()` but never put back with `.restore()`.
1 parent f6c37d2 commit ea4097e

9 files changed: +99 -8 lines changed

beacon_node/beacon_chain/src/beacon_chain.rs

Lines changed: 40 additions & 0 deletions
@@ -998,6 +998,46 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
         Ok(self.store.get_state(state_root, slot)?)
     }
 
+    /// Run a function with mutable access to a state for `block_root`.
+    ///
+    /// The primary purpose of this function is to borrow a state with its tree hash cache
+    /// from the snapshot cache *without moving it*. This means that calls to this function should
+    /// be kept to an absolute minimum, because holding the snapshot cache lock has the ability
+    /// to delay block import.
+    ///
+    /// If there is no appropriate state in the snapshot cache then one will be loaded from disk.
+    /// If no state is found on disk then `Ok(None)` will be returned.
+    ///
+    /// The 2nd parameter to the closure is a bool indicating whether the snapshot cache was used,
+    /// which can inform logging/metrics.
+    ///
+    /// NOTE: the medium-term plan is to delete this function and the snapshot cache in favour
+    /// of `tree-states`, where all caches are CoW and everything is good in the world.
+    pub fn with_mutable_state_for_block<F, V, Payload: ExecPayload<T::EthSpec>>(
+        &self,
+        block: &SignedBeaconBlock<T::EthSpec, Payload>,
+        block_root: Hash256,
+        f: F,
+    ) -> Result<Option<V>, Error>
+    where
+        F: FnOnce(&mut BeaconState<T::EthSpec>, bool) -> Result<V, Error>,
+    {
+        if let Some(state) = self
+            .snapshot_cache
+            .try_write_for(BLOCK_PROCESSING_CACHE_LOCK_TIMEOUT)
+            .ok_or(Error::SnapshotCacheLockTimeout)?
+            .borrow_unadvanced_state_mut(block_root)
+        {
+            let cache_hit = true;
+            f(state, cache_hit).map(Some)
+        } else if let Some(mut state) = self.get_state(&block.state_root(), Some(block.slot()))? {
+            let cache_hit = false;
+            f(&mut state, cache_hit).map(Some)
+        } else {
+            Ok(None)
+        }
+    }
+
     /// Return the sync committee at `slot + 1` from the canonical chain.
     ///
     /// This is useful when dealing with sync committee messages, because messages are signed
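
The control flow of `with_mutable_state_for_block` (try the in-memory cache first, fall back to disk, and tell the closure which path was taken) can be sketched on its own. This is a minimal illustration, not Lighthouse code: `State`, `Chain`, the `u64` block roots and the `String` error type are all hypothetical stand-ins, and the snapshot cache lock is left out by taking `&mut self`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a beacon state (assumption, not the real type).
#[derive(Debug, Clone, PartialEq)]
struct State {
    slot: u64,
}

/// Hypothetical chain with an in-memory cache and a "disk" store,
/// both keyed by a `u64` standing in for a block root.
struct Chain {
    cache: HashMap<u64, State>,
    disk: HashMap<u64, State>,
}

impl Chain {
    /// Run `f` with mutable access to the state for `root`.
    ///
    /// The `bool` passed to `f` is `true` when the state was borrowed from
    /// the cache and `false` when a copy was loaded from the disk store.
    /// Returns `Ok(None)` when the state is in neither place.
    fn with_mutable_state<F, V>(&mut self, root: u64, f: F) -> Result<Option<V>, String>
    where
        F: FnOnce(&mut State, bool) -> Result<V, String>,
    {
        if let Some(state) = self.cache.get_mut(&root) {
            // Cache hit: the state is borrowed in place, not moved out.
            f(state, true).map(Some)
        } else if let Some(mut state) = self.disk.get(&root).cloned() {
            // Cache miss: operate on an owned copy loaded from "disk".
            f(&mut state, false).map(Some)
        } else {
            Ok(None)
        }
    }
}
```

A caller could pass a closure such as `|state, cache_hit| Ok((state.slot, cache_hit))` and use the flag for logging or metrics, as the doc comment in the diff suggests.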

beacon_node/beacon_chain/src/chain_config.rs

Lines changed: 3 additions & 0 deletions
@@ -47,6 +47,8 @@ pub struct ChainConfig {
     pub count_unrealized_full: CountUnrealizedFull,
     /// Optionally set timeout for calls to checkpoint sync endpoint.
     pub checkpoint_sync_url_timeout: u64,
+    /// Whether to enable the light client server protocol.
+    pub enable_light_client_server: bool,
 }
 
 impl Default for ChainConfig {
@@ -68,6 +70,7 @@ impl Default for ChainConfig {
             paranoid_block_proposal: false,
             count_unrealized_full: CountUnrealizedFull::default(),
             checkpoint_sync_url_timeout: 60,
+            enable_light_client_server: false,
         }
     }
 }

beacon_node/beacon_chain/src/snapshot_cache.rs

Lines changed: 21 additions & 0 deletions
@@ -298,6 +298,27 @@ impl<T: EthSpec> SnapshotCache<T> {
         })
     }
 
+    /// Borrow the state corresponding to `block_root` if it exists in the cache *unadvanced*.
+    ///
+    /// Care must be taken not to mutate the state in an invalid way. This function should only
+    /// be used to mutate the *caches* of the state, for example the tree hash cache when
+    /// calculating a light client merkle proof.
+    pub fn borrow_unadvanced_state_mut(
+        &mut self,
+        block_root: Hash256,
+    ) -> Option<&mut BeaconState<T>> {
+        self.snapshots
+            .iter_mut()
+            .find(|snapshot| {
+                // If the pre-state exists then state advance has already taken the state for
+                // `block_root` and mutated its tree hash cache. Rather than re-building it while
+                // holding the snapshot cache lock (>1 second), prefer to return `None` from this
+                // function and force the caller to load it from disk.
+                snapshot.beacon_block_root == block_root && snapshot.pre_state.is_none()
+            })
+            .map(|snapshot| &mut snapshot.beacon_state)
+    }
+
     /// If there is a snapshot with `block_root`, clone it and return the clone.
     pub fn get_cloned(
         &self,
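
The lookup rule in `borrow_unadvanced_state_mut` is easier to see in isolation: a snapshot only qualifies if its root matches and no pre-state has been recorded, since a pre-state means the state advance has already touched it. The sketch below assumes simplified, hypothetical `State`, `Snapshot` and `SnapshotCache` types with `u64` roots; only the `iter_mut().find(..).map(..)` shape is taken from the diff.

```rust
/// Hypothetical stand-in for a beacon state.
#[derive(Debug, PartialEq)]
struct State {
    slot: u64,
}

/// Hypothetical snapshot: `pre_state` is `Some` once the state has been advanced.
struct Snapshot {
    block_root: u64,
    state: State,
    pre_state: Option<State>,
}

struct SnapshotCache {
    snapshots: Vec<Snapshot>,
}

impl SnapshotCache {
    /// Borrow the state for `block_root` only if it is still unadvanced.
    fn borrow_unadvanced_state_mut(&mut self, block_root: u64) -> Option<&mut State> {
        self.snapshots
            .iter_mut()
            // An advanced snapshot is skipped so the caller falls back to
            // another source instead of using a mutated cache.
            .find(|snapshot| snapshot.block_root == block_root && snapshot.pre_state.is_none())
            .map(|snapshot| &mut snapshot.state)
    }
}
```

Returning `None` for an advanced snapshot keeps the decision about the slower fallback with the caller, which matches the comment in the diff about not rebuilding the cache while the lock is held.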

beacon_node/src/cli.rs

Lines changed: 7 additions & 0 deletions
@@ -879,4 +879,11 @@ pub fn cli_app<'a, 'b>() -> App<'a, 'b> {
                       Useful if you intend to run a non-validating beacon node.")
                 .takes_value(false)
         )
+        .arg(
+            Arg::with_name("light-client-server")
+                .long("light-client-server")
+                .help("Act as a full node supporting light clients on the p2p network \
+                       [experimental]")
+                .takes_value(false)
+        )
 }

beacon_node/src/config.rs

Lines changed: 3 additions & 0 deletions
@@ -710,6 +710,9 @@ pub fn get_config<E: EthSpec>(
     client_config.chain.builder_fallback_disable_checks =
         cli_args.is_present("builder-fallback-disable-checks");
 
+    // Light client server config.
+    client_config.chain.enable_light_client_server = cli_args.is_present("light-client-server");
+
     Ok(client_config)
 }
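
The commit message says the flag should be checked before light client gossip/RPC traffic is processed. A rough sketch of that gating follows, using only the standard library rather than the CLI parser from the diff. The `ChainConfig` here is a one-field stand-in, and `config_from_args` and `handle_light_client_request` are hypothetical helpers invented for illustration; neither exists in Lighthouse.

```rust
/// One-field stand-in for the real `ChainConfig` (assumption).
#[derive(Debug, Default, PartialEq)]
struct ChainConfig {
    enable_light_client_server: bool,
}

/// Hypothetical helper: set the flag if `--light-client-server` is present.
/// The flag takes no value, so presence alone enables it.
fn config_from_args<I, S>(args: I) -> ChainConfig
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut config = ChainConfig::default();
    for arg in args {
        if arg.as_ref() == "--light-client-server" {
            config.enable_light_client_server = true;
        }
    }
    config
}

/// Hypothetical request handler: reject early when the server is disabled,
/// before any expensive work is done.
fn handle_light_client_request(config: &ChainConfig) -> Result<&'static str, &'static str> {
    if !config.enable_light_client_server {
        return Err("light client server disabled");
    }
    Ok("served")
}
```

Because the default is `false`, a node started without the flag rejects such requests in this sketch, mirroring the disabled-by-default setting in `ChainConfig::default()`.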

consensus/types/src/beacon_state.rs

Lines changed: 6 additions & 6 deletions
@@ -1708,12 +1708,12 @@ impl<T: EthSpec> BeaconState<T> {
         };
 
         // 2. Get all `BeaconState` leaves.
-        let cache = self.tree_hash_cache_mut().take();
-        let leaves = if let Some(mut cache) = cache {
-            cache.recalculate_tree_hash_leaves(self)?
-        } else {
-            return Err(Error::TreeHashCacheNotInitialized);
-        };
+        let mut cache = self
+            .tree_hash_cache_mut()
+            .take()
+            .ok_or(Error::TreeHashCacheNotInitialized)?;
+        let leaves = cache.recalculate_tree_hash_leaves(self)?;
+        self.tree_hash_cache_mut().restore(cache);
 
         // 3. Make deposit tree.
         // Use the depth of the `BeaconState` fields (i.e. `log2(32) = 5`).
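
The bug fixed in this hunk is a take-without-restore: the cache was moved out of the state so it could be updated while the state was borrowed, but it was never put back. A minimal sketch of the corrected pattern is below. `Cache`, `State`, `Error` and the doubling "hash" are hypothetical stand-ins and not the real tree hash cache API; only the `take()` / `ok_or(..)?` / `restore(..)` sequence mirrors the diff.

```rust
/// Hypothetical cache slot that can be emptied and refilled.
#[derive(Debug, Default)]
struct Cache {
    inner: Option<Vec<u64>>,
}

impl Cache {
    /// Move the contents out, leaving the slot uninitialized.
    fn take(&mut self) -> Option<Vec<u64>> {
        self.inner.take()
    }

    /// Put previously taken contents back.
    fn restore(&mut self, contents: Vec<u64>) {
        self.inner = Some(contents);
    }

    fn is_initialized(&self) -> bool {
        self.inner.is_some()
    }
}

#[derive(Debug, PartialEq)]
enum Error {
    CacheNotInitialized,
}

struct State {
    fields: Vec<u64>,
    cache: Cache,
}

impl State {
    /// Recompute the "leaves" using the cache, then restore the cache.
    fn leaves(&mut self) -> Result<Vec<u64>, Error> {
        // Take the cache out so it can be mutated while `self` is read.
        let mut contents = self.cache.take().ok_or(Error::CacheNotInitialized)?;
        contents.clear();
        // Placeholder for real hashing: double each field.
        contents.extend(self.fields.iter().map(|f| f.wrapping_mul(2)));
        let leaves = contents.clone();
        // Without this line the cache would stay empty after the call.
        self.cache.restore(contents);
        Ok(leaves)
    }
}
```

In this sketch the recomputation cannot fail. If it could, an early return between `take` and `restore` would still leave the slot empty, which is worth keeping in mind when adapting the pattern.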

lighthouse/tests/beacon_node.rs

Lines changed: 15 additions & 0 deletions
@@ -1580,3 +1580,18 @@ fn sync_eth1_chain_disable_deposit_contract_sync_flag() {
         .run_with_zero_port()
         .with_config(|config| assert_eq!(config.sync_eth1_chain, false));
 }
+
+#[test]
+fn light_client_server_default() {
+    CommandLineTest::new()
+        .run_with_zero_port()
+        .with_config(|config| assert_eq!(config.chain.enable_light_client_server, false));
+}
+
+#[test]
+fn light_client_server_enabled() {
+    CommandLineTest::new()
+        .flag("light-client-server", None)
+        .run_with_zero_port()
+        .with_config(|config| assert_eq!(config.chain.enable_light_client_server, true));
+}

testing/ef_tests/check_all_files_accessed.py

Lines changed: 0 additions & 2 deletions
@@ -39,8 +39,6 @@
     "tests/.*/.*/ssz_static/LightClientOptimistic",
     # LightClientFinalityUpdate
     "tests/.*/.*/ssz_static/LightClientFinalityUpdate",
-    # Merkle-proof tests for light clients
-    "tests/.*/.*/merkle/single_proof",
     # Capella tests are disabled for now.
     "tests/.*/capella",
     # One of the EF researchers likes to pack the tarballs on a Mac

testing/ef_tests/src/cases/merkle_proof_validity.rs

Lines changed: 4 additions & 0 deletions
@@ -78,6 +78,10 @@ impl<E: EthSpec> Case for MerkleProofValidity<E> {
                 )));
             }
         }
+
+        // Tree hash cache should still be initialized (not dropped).
+        assert!(state.tree_hash_cache().is_initialized());
+
         Ok(())
     }
 }
