Home Lab
One of the downsides of moving on from working on the Ethereum consensus layer is that you often need a real execution node sync’d, and execution clients don’t have the near-instantaneous checkpoint sync that consensus clients do. So recently I bit the bullet and custom built a PC to run a whole bunch of different Ethereum chains on. I’m really quite happy with the result.
There’s actually a really good variety of public endpoints available for loads of Ethereum-based chains these days, so while running your own node is maximally decentralised, it’s no longer just a choice between Infura or your own node. Public Node provides very good free JSON-RPC and consensus APIs, and Alchemy and QuickNode both have quite usable free tiers. The downside with all of them though is that their servers are in the Americas or Europe, and that’s a whole lot of latency away from Australia. When you’re syncing L2 nodes, or particularly running fault proof systems, you wind up making a lot of requests and that latency becomes very painful very quickly. More than anything, it was wanting to avoid that latency that drove me to run my own nodes locally.
To be useful though, I really want it to run quite a few different chains. Currently it’s running:
- Ethereum MainNet
- Ethereum Sepolia
- OP Mainnet
- OP Sepolia
- Base Mainnet
- Base Sepolia
I’m quite tempted to add a Holesky node just so I can run some validators again - it’s a shame most of the L2 stacks and apps use Sepolia given it has a locked down validator set.
Hardware-wise, running this many nodes is primarily about disk space, so I wound up with an MSI Pro Z790-P motherboard which has a rather ridiculous number of ports you can plug SSDs into - not all at full speed, but plenty at fast enough speeds. It’s been nearly 20 years since I built a custom PC so there are likely a bunch of things that aren’t the perfect trade-offs, but I’m quite happy with the overall result. One of the mistakes I’m actually happy about was that I mistook the case size names and wound up with a much larger case than I expected. That does give it capacity to shove a heap of spinning rust drives into it and leverage that for things like historic data that doesn’t need the fast disk. It’s got an Intel Core i7 CPU which is barely being used. I had wanted 128GB of RAM since Ethereum nodes do like to cache stuff, but apparently using 4 sticks of RAM can cause instability so I’ve stuck to just 64GB for now. It seems to be plenty so far, but RAM is probably the main limiting factor at the moment. For disk it currently has two 4TB NVMe drives.
For software, the L1 consensus nodes are obviously all Teku and they’re doing great. The team has done a great job continuing to improve things since I left, so even with the significant growth in the validator set, it’s running very happily with less memory and CPU than it had been “back in my day”. The L1 Mainnet execution client is a reth archive node which has been quite successful. I did try a reth node for Sepolia but hit a few issues (which I think have now been fixed), so I’ve wound up running executionbackup with both geth and reth for Sepolia.
The L2 nodes are all op-node and op-geth - always good to actually run the software I’m helping build. For OP Sepolia, I’m also running op-dispute-mon and op-challenger to both monitor the fault proof system and participate in games to ensure correct outcomes. I really do like the fact that OP fault proofs are fully permissionless so anyone can participate in the process just like my home lab now does.
For coordination, everything is running in Docker via docker-compose, which made it much easier to avoid all the port conflicts that would otherwise occur. Each network has its own docker-compose file, though there are a bunch of Docker networks shared between chains so the L2s can connect to the L1s and everything can connect to metrics. All the compose files and other config are in a local git repo with a hook set up to automatically apply any changes, so I’ve wound up with a home-grown GitOps kind of setup. I did try using k8s with ArgoCD to “do it properly” at one point, but it just made everything far more complex and less reliable so I switched back to simple docker compose.
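To give an idea of what that hook does, here’s a minimal sketch of the kind of thing involved, assuming a bare config repo with a post-receive hook that checks the config out and re-applies each compose file - the paths and network directory names here are just hypothetical placeholders:

```bash
#!/bin/bash
# Hypothetical post-receive hook in the bare config repo.
set -euo pipefail

CONFIG_DIR=/srv/homelab   # hypothetical checkout location for the compose files

# Check out the latest config into the working directory
git --work-tree="$CONFIG_DIR" --git-dir="$PWD" checkout -f main

# Re-apply each network's compose file; containers whose config
# hasn't changed are left alone by 'up -d'
for network in eth-mainnet eth-sepolia op-mainnet op-sepolia base-mainnet base-sepolia
do
  docker compose --project-directory "$CONFIG_DIR/$network" up -d --remove-orphans
done
```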
For monitoring, I’ve got Victoria Metrics capturing metrics and Loki capturing logs - both automatically pick up any new hosts. Then there’s a Grafana instance to visualise it all. I even went as far as running ethereum-metrics-exporter to give a unified view of metrics when using different clients.
The final piece is an nginx instance that exposes all the different RPC endpoints at easy to remember URLs, ie `/eth/mainnet/el`, `/eth/mainnet/cl`, `/op/mainnet/el` etc. All the web UIs for the other services like Grafana are exposed through the same nginx instance. My initial build exposed all the RPCs on different ports and it was a nightmare trying to remember which chain was on which port, so the friendly URLs have been a big win.
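To give a feel for how those friendly URLs get used, here are a couple of example requests - `homelab` is just a placeholder hostname, and the exact paths depend on how the nginx proxying is set up:

```bash
# Latest block number from the MainNet execution client (standard JSON-RPC)
curl -s http://homelab/eth/mainnet/el \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"eth_blockNumber","params":[]}'

# Sync status from the MainNet consensus client (standard beacon REST API)
curl -s http://homelab/eth/mainnet/cl/eth/v1/node/syncing
```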
Overall I’m really very happy with the setup and it is lightning fast even to perform quite expensive queries like listing every dispute game ever created. Plus it was fun to play with some “from scratch” system admin again instead of doing everything in the cloud with already existing templates and services setup.
Moving On From ConsenSys
After nearly 5 years working with the ConsenSys protocols group, I’ll be finishing up at the end of January.
So what happens with Teku? It will carry on as usual and keep going from strength to strength. There’s an amazing team of people building Teku and I have complete confidence in their ability to continue building Teku and contributing to the future of the Ethereum protocol. Teku started well before I was involved with it and has always been the work of an amazing team of people. I just wound up doing a lot of the more visible stuff - answering discord questions and reacting to the ad-hoc stuff that popped up.
My time at ConsenSys actually started by working on Besu, back before its initial release when it was called Pantheon. I was part of the team adding the initial support for private networks and then later moved over to join the team focussed on MainNet compatibility, working on things like fast sync, the core EVM and all that kind of fun. After that I got to help build a new team to focus on setting up tooling to make development and testing easier - modernising build and release systems, automated deployment and monitoring of test nodes and so on.
Then this “Ethereum 2.0” thing seemed like it might actually be ready to move out of the research phase and move towards production. So I joined the research team that was building “Artemis” to start bringing it out of research and to a real production-ready client. Most of the research team moved on to other research topics and we built a mostly new team around what we then called Teku. And so began one heck of a journey leading to the beacon chain launch, Altair and then The Merge. Hearing the crowd cheering in support of the merge at DevCon this year is one of the great highlights of my career.
I’m so lucky to have gotten to work with some truly amazing people. The folks who have been part of the Teku team along our journey share a truly special place in my heart though, and I will always be grateful for the shared knowledge, persistence and dedication they have all contributed, but even more so the caring, friendly way they contributed it. And it’s not just the teams in ConsenSys but right across the Ethereum eco-system. The way the different consensus client teams have come together to push Ethereum forward is particularly amazing. These are ostensibly teams that are competing with each other and yet actively share knowledge to improve both the protocol and other teams’ clients.
As I leave ConsenSys, I do so knowing that there are teams of incredible people who will carry on with the work I’m so privileged to have been able to contribute to.
So why the change? Mostly because this is a good time for me personally. As I mentioned, I started working on Teku to bring it out of research and into production. Getting The Merge done is a natural endpoint of that mission and a natural place to start looking for new challenges and opportunities. Obviously there are plenty of remaining things to improve in the Ethereum protocol and clients like Teku, but I’m keen to get a bit further out of my comfort zone.
So what’s next? I’ll be taking up a role as Staff Protocol Engineer with OP Labs to work on Optimism. I started looking at opportunities at Optimism because I’ve seen some of the great work they’ve been doing and I really like their retroactive public goods funding - it shows they’re investing in Ethereum, not just taking what they can get from it. Primarily though for me, finding a great place to work is about finding a great team of people doing interesting work. As I talked with various people from the Optimism team, I found them to be smart, curious, welcoming people who not only wanted to build great software but also wanted to keep improving the way they went about that. Plus I’ll be staying in the Ethereum eco-system so still get to work with all those amazing people. I can already see there’s a ton of stuff I can learn from the Optimism team and I think there’s places where I can bring some useful skills and experience beyond just writing some code.
In fact, given they mostly use Go and I have no real Go experience, “just writing some code” will be one of the first fun challenges. Java has kind of followed me for my career, not entirely deliberately though I do like it as a language, so I’m actually excited to really dig into writing production grade Go code.
Philosophically, one of the things I dislike about Ethereum (and blockchains in general) is that the high cost of transactions means it often becomes a rich person’s game and it often feels like people just throwing play money around. L2 solutions like Optimism are a big part of solving that by scaling blockchains and dramatically reducing fees. It feels good to me to be contributing to that. So much of the potential of Ethereum is waiting to be unlocked once it really scales. Besides, having worked on execution and consensus layers so far, moving to Layer 2 seems like an obvious next step.
Overall, I’m excited about the future of Teku and will be cheering the team on, and excited about the future of Ethereum and look forward to being part of delivering The Surge.
DevCon VI Talks
Mostly just so that I can find them more easily later, here are the recordings of the DevCon VI talks I gave in Bogotá.
Firstly, Post-Merge Ethereum Client Architecture:
And a panel, It’s 10pm, do you know where your mnemonic is?
Understanding Attestation Misses
The process of producing attestations and getting them included into the chain has become more complex post-merge. Combined with a few client issues causing more missed attestations than normal, there are lots of people struggling to understand what’s causing those misses. So let’s dig into the process involved and how to identify where problems are occurring.
Attestation Life-Cycle
There are a number of steps required to get an attestation included on the chain. My old video explaining this still covers the details of the journey well - the various optimisations I talk about there have long since been implemented but the process is still the same. In short, the validator needs to produce the attestation and publish it to the attestation subnet gossip channel, an aggregator needs to include it in an aggregation and publish it to the aggregates gossip channel, and then it needs to be picked up by a block producer and packed into a block.
Attestations are more likely to be included in aggregates and blocks if they are published on time and match the majority of other attestations. Attestations that are different can’t be aggregated, so they’re much less likely to be included in aggregates (the aggregator would have to produce an attestation that matches yours), and they take up one of the 128 attestation slots available in a block while paying less than better-aggregated attestations.
Since attestations attest to the current state of the chain, the way to ensure your attestation matches the majority is to ensure you’re following the chain well. That’s where most of the post-merge issues have been - blocks taking too long to import, causing less accurate attestations which are then more likely to not get included. So let’s look at some metrics to follow so we can work out what’s happening.
Key Indicators of Attestation Performance
Often people just look at the “Attestation Effectiveness” metric reported by beaconcha.in, but that’s not a great metric to use. Firstly, it tries to bundle every possible measure of attestations, some within your control and some not, into a single metric. Secondly, it tends to be far too volatile, with a single delayed attestation causing a very large drop in the effectiveness metric and distorting the result. As a result, it tends to make your validator performance look worse than it is and doesn’t give you any useful information to act on.
So let’s look at some more specific and informative metrics we can use instead.
Firstly, for the overall view, look at the percentage of attestation rewards earned. While that write up is pre-Altair, the metrics on the Teku Dashboard have been updated to show the right values even with the new Altair rules. Look at the “Attestation Rewards Earned” line on the “Attestation Performance” graph in the top left of the dashboard. This will tell you quite accurately how well you’re doing in terms of total rewards, but it still includes factors outside of your control and won’t help identify where problems are occurring.
To identify where problems are occurring we need to dig a bit deeper. Each epoch, Teku prints a summary of attestation performance to the logs like:
Attestation performance: epoch 148933, expected 16, produced 16, included 16 (100%), distance 1 / 1.00 / 1, correct target 16 (100%), correct head 16 (100%)
This is an example of perfect attestation performance - we expected 16 attestations, 16 were included, the distance had a minimum of 1, average of 1.00 and maximum of 1 (the distance numbers are min / avg / max in the output) and 100% of attestations had the correct target and head. One thing to note is that attestation performance is reported 2 epochs after the attestations are produced to give them time to actually be included on chain. The `epoch` reported in this line tells you which epoch the attestations being reported on are from.
Each of these values are also available as metrics and the Teku Dashboard uses them to create the “Attestation Performance” graph. That provides a good way to quickly see how your validators have performed over time and get a better overview rather than fixating on a single epoch that wasn’t ideal.
Attestations Expected
Each active validator should produce one attestation per epoch. So the `expected` value reported should be the same as the number of active validators you’re running. If it’s less than that, you probably haven’t loaded some of your validator keys and they’ll likely be missing all attestations. It’s pretty rare that `expected` isn’t what we expect though.
Attestations Produced
If the `produced` value is less than the `expected` then something prevented your node from producing attestations at all. To find out what, you’ll need to scroll back up in your validator client logs to the epoch this performance report is for - remember that it will be 2 epochs ago. We’re looking for a log that shows the result of the attestation duty. When the attestation is published successfully it will show something like:
Validator *** Published attestation Count: 176, Slot: 3963003, Root: b4ca6d61be7f54f7ccc6055d0f37f122943e8313dbcfe49513c9d4ef50bbc870
The `Count` field is the number of local validators that produced this attestation (this example is from our Görli testnet node - sadly we don’t have that many real-money validators).
When an attestation fails to be produced the log will show something like:
Validator *** Failed to produce attestation Slot: 4726848 Validator: d278fc2
java.lang.IllegalArgumentException: Cannot create attestation for future slot. Requested 4726848 but current slot is 4726847
at tech.pegasys.teku.validator.coordinator.ValidatorApiHandler.createAttestationData(ValidatorApiHandler.java:324)
at jdk.internal.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at tech.pegasys.teku.infrastructure.events.DirectEventDeliverer.executeMethod(DirectEventDeliverer.java:74)
at tech.pegasys.teku.infrastructure.events.DirectEventDeliverer.deliverToWithResponse(DirectEventDeliverer.java:67)
at tech.pegasys.teku.infrastructure.events.AsyncEventDeliverer.lambda$deliverToWithResponse$1(AsyncEventDeliverer.java:80)
at tech.pegasys.teku.infrastructure.events.AsyncEventDeliverer$QueueReader.deliverNextEvent(AsyncEventDeliverer.java:125)
at tech.pegasys.teku.infrastructure.events.AsyncEventDeliverer$QueueReader.run(AsyncEventDeliverer.java:116)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
The specific reason the attestation failed can vary a lot. In this case the beacon node wasn’t keeping up for some reason, which would require further investigation into Teku and its performance. One common source of failures is the beacon node or execution client not being in sync at the time, which appears as a `503` response code from the beacon node when using the external validator client.
We can look at the “Produced” line on the “Attestation Performance” graph of the standard Teku dashboard to see the percentage of expected attestations that were produced over time.
Attestation Timing
If the attestation was produced, the next thing to check is that it was actually produced on time. If you find the `Published attestation` log line, you can compare the timestamp of that log message to the time the attestation’s slot started. You can use Slot Finder to find the start time of the slot. Attestations are due to be published 4 seconds into the slot. Anywhere from the start of the slot up to about 4.5 seconds after is fine.
You can also use the `validator_attestation_publication_delay` metric to track publication times. The Teku Detailed dashboard includes graphs of this under the Validator Timings section.
Remember that neither logs nor metrics can identify when your system clock is incorrect, because the timings they’re using are from the system clock too. Make sure you’re running ntpd or chrony and that they report the clock as in sync.
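A quick way to sanity check that, assuming a reasonably standard Linux host:

```bash
# systemd-based hosts report whether the clock is being NTP synchronised
timedatectl | grep -iE 'synchronized|ntp'

# or, if you're running chrony, check the measured offset directly
chronyc tracking
```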
Correct Head Vote
If the attestation was published on time, we need to start checking if it matched what the majority of other nodes produced. There isn’t a simple way to do this directly, but generally if the head block our attestation votes for turns out to be correct, we will almost certainly have agreed with the majority of other validators. The `correct head 16 (100%)` part of the attestation performance line shows how many attestations produced had the right head block. If that’s at 100% and the attestations were all published on time, there isn’t really much more your node can do.
Having some attestations with incorrect head votes may mean your node is too slow importing blocks. Note though that block producers are sometimes slow in publishing a block. These late blocks sometimes mean that the majority of validators get the head vote “wrong”, so it’s not necessarily a problem with your node when head votes aren’t at 100%. Even if it is your node that’s slow, we need to work out if the problem is in the beacon node or the execution client. Block timing logs can help us with that.
Block Timings
To dig deeper we need to enable some extra timing metrics in Teku by adding the `--Xmetrics-block-timing-tracking-enabled` option. This does two things. Firstly, when a block finishes importing more than 4 seconds into a slot (after attestations are due), Teku will now log a `Late Block Import` line which includes a breakdown of the time taken at each stage of processing the block (albeit very Teku-developer oriented). Secondly, it enables the `beacon_block_import_delay_counter` metric which exposes that breakdown as metrics. Generally, for any slot where the head vote is incorrect, there will be a late block import that caused it. We just need to work out what caused the delay.
An example late block log looks like:
Late Block Import *** Block: c2b911533a8f8d5e699d1a334e0576d2b9aa4caa726bde8b827548b579b47c68 (4765916) proposer 6230 arrival 3475ms, pre-state_retrieved +5ms, processed +185ms, execution_payload_result_received +1436ms, begin_importing +0ms, transaction_prepared +0ms, transaction_committed +0ms, completed +21ms
Arrival
The first potential source of delay is that the block just didn’t get to us in time. The `arrival` timing shows how much time after the start of the slot the block was first received by your node. In the example above, that was 3475ms which is quite slow, but did get to us before we needed to create an attestation 4 seconds into the slot. Delays in arrival are almost always caused by the block producer being slow to produce the block. It is however possible that the block was published on time but took a long time to be gossiped to your node. If you’re seeing late arrival for most blocks, there’s likely an issue with your node - either the system clock is wrong, your network is having issues or you may have reduced the number of peers too far.
Execution Client Processing
Post-merge, importing a block involves both the consensus and execution clients. The time Teku spends waiting for the execution client to finish processing the block is reported in the `execution_payload_result_received` value. In this case that was 1436ms, which would have been ok if the block hadn’t been received so late, but isn’t ideal. Under 2 seconds is probably ok most of the time, but under 1 second would be better. Execution clients will keep working on optimisations to reduce this time so it’s worth keeping up to date with the latest version of your client.
Note that prior to Teku 22.9.1 this entry didn’t exist and the execution client time was just counted as part of `transaction_prepared`.
Teku Processing
The other values are all various aspects of the processing Teku needs to do. `pre-state_retrieved` and `processed` are part of applying the state transition when processing the block. `begin_importing`, `transaction_prepared` and `transaction_committed` record the time taken in various parts of storing the new block to disk. Finally `completed` reports the final details of things like updating the fork choice records and so on.
Prior to Teku 22.9.1, `transaction_committed` was a common source of delays when updating the actual LevelDB database on disk. The disk update is now asynchronous so unless the disk is really exceptionally slow this value is generally only 0 or 1ms.
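If you prefer to track this via metrics rather than chasing log lines, the same breakdown is available from the `beacon_block_import_delay_counter` metric mentioned above. A quick way to eyeball it, assuming Teku’s metrics are enabled on the default port of 8008:

```bash
# Per-stage block import delays, as exposed to Prometheus/Victoria Metrics
curl -s http://localhost:8008/metrics | grep beacon_block_import_delay_counter
```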
Next Steps
All these metrics let us get an understanding of where time was spent or where failures occurred. If your node is processing blocks quickly, publishing attestations on time and the system clock is accurate, there’s probably very little you can do to improve things - having the occasional delayed or missed attestation isn’t unheard of or really worth worrying about.
Otherwise these metrics and logs should give a fairly clear indication of which component is causing problems, so you can focus investigations there and get help as needed.
Beacon REST API - Fetching Blocks on a Fork
When debugging issues on the beacon chain, it can be useful to download all blocks on a particular, potentially non-canonical fork. This script will do just that.
The script should work with any client that supports the standard REST API. Execute it with fetch.sh <BLOCK_ROOT> <NUMBER_OF_BLOCKS_TO_DOWNLOAD>
#!/bin/bash
set -euo pipefail
ROOT=${1:?Must specify a starting block root}
COUNT=${2:?Must specify number of blocks to fetch}
for i in $(seq 1 $COUNT)
do
  # Fetch the block as JSON and pull out its slot and parent root
  curl -s http://localhost:5051/eth/v2/beacon/blocks/${ROOT} | jq . > tmp.json
  SLOT=$(jq -r .data.message.slot tmp.json)
  PARENT=$(jq -r .data.message.parent_root tmp.json)
  mv tmp.json ${SLOT}.json
  # Fetch the same block again, this time in SSZ format
  curl -s -H 'Accept: application/octet-stream' http://localhost:5051/eth/v2/beacon/blocks/${ROOT} > ${SLOT}.ssz
  echo "$SLOT ($ROOT)"
  # Follow the parent root to walk backwards down the fork
  ROOT=$PARENT
done
Blocks are downloaded in both JSON and SSZ format. As it downloads it prints the slot and block root for each block it downloads.
This is particularly useful when combined with Teku’s `data-storage-non-canonical-blocks-enabled` option which makes it store all blocks it receives, even if they don’t wind up on the finalized chain.
Aggregators and DVT
Obol Network are doing a bunch of work on distributed validator technology and have hit some challenges with the way the beacon REST API determines if validators are scheduled to be aggregators.
Oisín Kyne has written up a detailed explanation of the problem with some proposed changes. Mostly noting it here so I can find the post again later.
Personally I’d like to avoid adding the new `/eth/v1/validator/is_aggregator` endpoint and just have that information returned from the existing `/eth/v1/validator/beacon_committee_subscriptions` endpoint, given it has to be changed anyway and the beacon node will have to check aggregator status as part of handling that call regardless. Otherwise it seems simple enough to implement and is worth it to enable DVT to be delivered as middleware rather than having to replace the whole validator client.
Exploring Eth2: Previous Attesters
In the beacon chain spec, the chain is justified when at least 2/3rds of the active validating balance attests to the same target epoch. Simple enough, but there are a couple of little quirks that are easy to miss.
The relevant part of the spec is:
def weigh_justification_and_finalization(state: BeaconState,
total_active_balance: Gwei,
previous_epoch_target_balance: Gwei,
current_epoch_target_balance: Gwei) -> None:
previous_epoch = get_previous_epoch(state)
current_epoch = get_current_epoch(state)
old_previous_justified_checkpoint = state.previous_justified_checkpoint
old_current_justified_checkpoint = state.current_justified_checkpoint
# Process justifications
state.previous_justified_checkpoint = state.current_justified_checkpoint
state.justification_bits[1:] = state.justification_bits[:JUSTIFICATION_BITS_LENGTH - 1]
state.justification_bits[0] = 0b0
if previous_epoch_target_balance * 3 >= total_active_balance * 2:
state.current_justified_checkpoint = Checkpoint(epoch=previous_epoch,
root=get_block_root(state, previous_epoch))
state.justification_bits[1] = 0b1
if current_epoch_target_balance * 3 >= total_active_balance * 2:
state.current_justified_checkpoint = Checkpoint(epoch=current_epoch,
root=get_block_root(state, current_epoch))
state.justification_bits[0] = 0b1
It then goes on to check if finalization should be updated. From this we can see there is already one quirk - both the `previous_epoch_target_balance` and the `current_epoch_target_balance` are compared to the same `total_active_balance`, yet the total effective balance of all active validators can change between epochs.
The second quirk is similar but can’t be seen from this code itself. It’s a little hard to summarize where the `previous_epoch_target_balance` value comes from by quoting the spec code as we have to follow the flow through a number of different functions. So let’s take a look at the Teku implementation which, for performance reasons, is a lot more direct:
UInt64 currentEpochActiveValidators = UInt64.ZERO;
UInt64 previousEpochActiveValidators = UInt64.ZERO;
UInt64 currentEpochSourceAttesters = UInt64.ZERO;
UInt64 currentEpochTargetAttesters = UInt64.ZERO;
UInt64 previousEpochSourceAttesters = UInt64.ZERO;
UInt64 previousEpochTargetAttesters = UInt64.ZERO;
UInt64 previousEpochHeadAttesters = UInt64.ZERO;
for (ValidatorStatus status : statuses) {
final UInt64 balance = status.getCurrentEpochEffectiveBalance();
if (status.isActiveInCurrentEpoch()) {
currentEpochActiveValidators = currentEpochActiveValidators.plus(balance);
}
if (status.isActiveInPreviousEpoch()) {
previousEpochActiveValidators = previousEpochActiveValidators.plus(balance);
}
if (status.isSlashed()) {
continue;
}
if (status.isCurrentEpochSourceAttester()) {
currentEpochSourceAttesters = currentEpochSourceAttesters.plus(balance);
}
if (status.isCurrentEpochTargetAttester()) {
currentEpochTargetAttesters = currentEpochTargetAttesters.plus(balance);
}
if (status.isPreviousEpochSourceAttester()) {
previousEpochSourceAttesters = previousEpochSourceAttesters.plus(balance);
}
if (status.isPreviousEpochTargetAttester()) {
previousEpochTargetAttesters = previousEpochTargetAttesters.plus(balance);
}
if (status.isPreviousEpochHeadAttester()) {
previousEpochHeadAttesters = previousEpochHeadAttesters.plus(balance);
}
}
Here we’re iterating through the `ValidatorStatus` info which roughly maps to the `Validator` object from the state, but with some handy abstractions to make it easier to support both Phase0 and Altair with less duplication. The thing to notice here is that regardless of whether we’re adding to the current or previous epoch balances, we’re using the same `balance` that we got from `getCurrentEpochEffectiveBalance`. Part of the epoch transition involves adjusting effective balances though, so the effective balance of a validator might have been different in the previous epoch.
Why is it like this? Primarily because the state only maintains the current effective balance for validators. To get the effective balance for a previous epoch you’d need a state from that epoch, but the state transition is designed to only need a single state and the block to apply (if any - you could just process empty slots). You could potentially make an argument that using the validator’s latest effective balance is better anyway since that’s what they actually have at stake now. In fact, any validators that are slashed in the current epoch are entirely excluded from the current and previous epoch attesting totals, which makes sense - we know they’re unreliable so we ignore their attestations.
What impact does this have? Essentially none. The amount that effective balances change is generally pretty limited and there are limits on the number of validators that can activate or exit each epoch. So the difference between the numbers you might expect and what you actually get is quite small, and you’d have to be right on the edge of justification for this to make any difference. In theory though, it is possible for an epoch to be balanced just right so that it doesn’t justify immediately, but does at the next epoch transition without including any new attestations. The opposite is also possible, where the epoch justifies but then effective balances change such that it doesn’t meet the threshold to justify as the previous epoch - that would just leave `state.current_justified_checkpoint` unchanged though, which means the original justification stands.
But it may make for a very niche trivia question one day, and now you’re prepared with the answer…
Checkpoint Sync - What If Infura Is Hacked?
One of the common concerns people raise about checkpoint sync is the risk that someone might hack Infura and return malicious initial states causing nodes to sync and be stuck on the wrong chain. Given users usually don’t verify the initial state and Infura is currently the only publicly available service supplying initial states, there is certainly some risk there but how concerned should we really be?
The initial state you use for checkpoint sync is important because it tells the beacon node which chain it should sync and that it should reject all others. So if the initial state is from the wrong chain, your node will sync that chain and any information you get from your node is likely to be wrong. That could result in your attesting to an incorrect chain and getting inactivity penalties on the real one or post-merge being fooled into buying or selling at a bad price because your node gave you an incorrect view of the market.
Definitely sounds scary but let’s work it through.
Firstly, all the big players - staking services, exchanges etc, wouldn’t use Infura, or any other public service, to get their initial states because they can get the initial state from one of their other nodes. So immediately nearly all the high value targets are safe.
Secondly, the attack only works on nodes syncing from scratch and the attacker can’t force people to resync their nodes [1]. There’s also a fairly limited amount of time before someone notices their node got an invalid initial state and blows the whistle and the attack would be stopped.
So this attack doesn’t look like a very good way to make money. What about just causing chaos? There’s a relatively small number of nodes syncing the chain at any time and only one of them needs to notice the problem to raise the alarm. So the potential for causing chaos is also quite limited.
Ultimately, the idea that an attacker who is able to compromise Infura would use that access to mess with checkpoint sync seems pretty unlikely. They could just mess with the data Infura returns to DApps and directly misrepresent the world state - a much more direct and likely more profitable way of achieving the same result. Or, most likely, they could just snoop on the incoming stream of transactions and keep all the best MEV opportunities to themselves.
Does that mean you shouldn’t verify your initial state? Absolutely not. While there’s little reason for an attacker to hack Infura for this purpose, that doesn’t mean it won’t ever happen. And more likely Infura might have a bug which causes it to follow the wrong chain by accident. There’s a lot of room between panicking and claiming checkpoint sync is unsafe (it’s not) and saying that it’s fine to not verify anything (it’s not).
Which is to say…
[1] And if an attacker could force you to resync, that would be a much bigger problem. ↩︎
Checkpoint Sync Safety
Apart from being awesomely fast, checkpoint sync also exists to ensure that you can safely sync despite the limitations of weak subjectivity. The initial state you use is considered trusted - you are telling your beacon node that this state is the canonical chain and it should ignore all others. So it’s important to ensure you get the right state.
Get It From Somewhere You Trust
The simplest and best way to ensure the state is right is to get it from somewhere trusted. There are a few options.
The best source is one of your own nodes. For setups that run multiple nodes that’s easy, as they can get an initial state for a new node from any of their existing nodes. Even people running single nodes may be able to use a state from their own node in some cases. For example, if you need to re-sync your node or are switching clients, you can store the current finalized state from your node before stopping it, then use that as the initial state for your new sync. In both these cases the solution is completely trustless - you’re using data from your own node, so there’s no need to trust anyone else and no need for further verification.
If you can’t get the state from your own node you’ll have to get it from someone else. A friend or family member you trust that runs their own node would be an excellent source. This isn’t trustless, but will usually still have a very high level of trust even without any further verification.
Otherwise you’ll have to get the state from a public provider. Currently that’s just Infura, but ideally more options will be available in the future. We’re beyond personal trust circles but Infura is certainly still a very reputable provider so most people would still have a reasonable level of trust in them.
The final option is to get the state from some random person on the internet. Seems crazy and is definitely not something to be trusted, but it is still an option if you verify the state against more trusted sources. Early on, before Infura supported the API to download the state, this was actually the most common way people used checkpoint sync - I would just periodically put a state up in a GitHub repo so they could access it.
Verify The State
If you get the state from a source you don’t fully trust, you’ll need to verify it. You can do this by calculating the hash tree root of the state, then checking that against one or more block explorers. In essence you’re aggregating your trust from multiple services until you (hopefully) reach a level you’re comfortable with.
The main problem with this is that you’ll need a tool to calculate the hash tree root of the state itself. It’s simpler to just use the state to sync your node then confirm that the block roots your node reports match block explorers. If you wind up at the right chain head, you must have started from a canonical state. You will likely want to disable any validators you run until you’ve verified the blocks - otherwise you may attest to something you don’t actually trust.
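As a sketch of what that comparison looks like in practice, assuming a locally running beacon node with the standard REST API on port 5051, you can pull the roots your node believes in and check them against a couple of explorers:

```bash
# The finalized checkpoint root and current head root according to your node -
# compare these against beaconcha.in, beaconscan or another explorer you trust
curl -s http://localhost:5051/eth/v1/beacon/headers/finalized | jq -r '.data.root'
curl -s http://localhost:5051/eth/v1/beacon/headers/head | jq -r '.data.root'
```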
Why Can’t The Beacon Node Do It For Me?
There have been some proposals for beacon nodes to automatically verify the state using heuristics like whether the state matches what the majority of your peers have. I’m personally very skeptical of such ideas because if you could trust the information from the network, you wouldn’t need checkpoint sync in the first place. The fundamental challenge introduced by weak subjectivity is that your node simply can’t determine what the canonical chain is (if it’s been offline for too long) and so has to be told. Your node’s peers are essentially the adversary we’re trying to defend against with checkpoint sync so we want to avoid using any information from them to second guess the initial state.
The beacon node could automate the process of checking against multiple block explorers but there are two problems with that.
Firstly, there isn’t an agreed API to perform that check so we’d need to either design that and get block explorers to support it or write custom code in clients to support each block explorer.
Secondly, and more problematically, client developers would have to decide which block explorers are trustworthy and embed the list into their clients. There’s already a lot of responsibility in being a client dev and a lot of trust the community puts in us - we really don’t want to expand that by also being responsible for deciding which services users should trust. There could just be a config option so users could specify their own list of explorers to verify against, but that’s a pretty clunky UX and it’s very unlikely users would go to the effort of finding suitable URLs and specifying them. Besides, it’s probably more work for them than just verifying the block roots manually.
Weak Subjectivity Checkpoints Have Failed
Let’s just admit it, weak subjectivity checkpoints have failed…
The beacon chain brings the concept of weak subjectivity to Ethereum, so prior to the launch of the beacon chain MainNet it was seen as important to have an agreed solution that would allow users starting a new node to verify that the chain they sync’d was in fact the canonical chain.
This was before any client had implemented checkpoint sync, so the proposal was to publish the epoch and block root from a point close enough to the chain head that it was within the weak subjectivity period, but far enough back that it wouldn’t need to change often (e.g. updating only every 256 epochs). Since it’s small and doesn’t change often, people could easily publish it via all kinds of methods - block explorers could show it, twitter bots could post it, it could even be printed in newspapers. Thus it would be easy for users to find it and they’d have lots of different sources to verify against.
Clients would then sync from genesis and when they got to the epoch in the weak subjectivity checkpoint, they’d check that the block root matches the block they have, thus confirming they sync’d the right chain. If it didn’t match the client would crash and the user would have to either correct the weak subjectivity checkpoint they were providing or delete the current sync and try again, hoping the client wound up on the right chain this time.
Of course, that’s the first major problem with weak subjectivity checkpoints - you can waste days syncing before realising you’re on the wrong chain. Then once you discover that, you have to start again without really having any way to avoid the problem happening again.
Somewhat surprisingly though, that’s not the biggest problem with weak subjectivity checkpoints. The biggest problem is that they’re actually extremely hard to find. For a while beaconscan provided an endpoint that would give back the current weak subjectivity checkpoint - no explanation of what it was or how to use it, but at least you could get the value. It appears they silently removed it at some point though. I don’t recall seeing weak subjectivity checkpoints published anywhere else.
In fact, I’m actually having trouble turning up a clear reference for how to calculate the weak subjectivity checkpoint. There is this doc in the specs which says how to calculate the weak subjectivity period, which amusingly basically has a TODO for how to distribute the checkpoints. There’s also this doc from Aditya which gives a lot more detail on weak subjectivity periods. I had thought there was some spec for ensuring everyone selected the same epoch to use for the weak subjectivity checkpoint - I thought it was a simple “every X epochs” - but I can’t actually find that documented anywhere.
We could fix this - a clearer and/or simpler spec on how to get the weak subjectivity checkpoint could be created (and there is a suggestion of adding it to the standard REST API), and we could put pressure on providers to publish those checkpoints, set up twitter bots and so on so it was readily available. Even then, we’d still have a truly lousy user experience, with slow sync times that get longer as more blocks are piled on to the beacon chain and that can only retrospectively tell you that you’ve been wasting your time.
So let’s just admit it, weak subjectivity checkpoints have failed. The future is in using checkpoint sync to start from the current finalized state. That minimises sync time and users can realistically take advantage of it today. It’s not perfect, we need more places to provide that state and better ways of validating that the state is in fact the right one, but it’s already more accessible than weak subjectivity checkpoints and has infinitely more users actually using it (ie more than zero). If we can move on from the idea of weak subjectivity checkpoints, we can focus on getting the most out of checkpoint sync in terms of usability and security.