Home Lab
One of the downsides of moving on from working on the Ethereum consensus layer is that you often need a real execution node sync’d, and execution clients don’t have the near-instantaneous checkpoint sync that consensus clients do. So recently I bit the bullet and custom built a PC to run a whole bunch of different Ethereum chains on. I’m really quite happy with the result.
There’s actually a really good variety of public endpoints available for loads of Ethereum-based chains these days, so while running your own node is maximally decentralised, it’s no longer just a choice between Infura or your own node. Public Node provides very good free JSON-RPC and consensus APIs, and Alchemy and QuickNode both have quite usable free tiers. The downside with all of them though is that their servers are in the Americas or Europe, and that’s a whole lot of latency away from Australia. When you’re syncing L2 nodes, or particularly running fault proof systems, you wind up making a lot of requests and that latency becomes very painful very quickly. More than anything, it was wanting to avoid that latency that drove me to run my own nodes locally.
To be useful though, I really want it to run quite a few different chains. Currently it’s running:
- Ethereum MainNet
- Ethereum Sepolia
- OP Mainnet
- OP Sepolia
- Base Mainnet
- Base Sepolia
I’m quite tempted to add a Holesky node just so I can run some validators again - it’s a shame most of the L2 stacks and apps use Sepolia given it has a locked down validator set.
Hardware-wise, running this many nodes is primarily about disk space, so I wound up with an MSI Pro Z790-P motherboard which has a rather ridiculous number of ports you can plug SSDs into - not all at full speed, but plenty at fast enough speeds. It’s been nearly 20 years since I built a custom PC so there are likely a bunch of things that aren’t the perfect trade-offs, but I’m quite happy with the overall result. One of the mistakes I’m actually happy about was that I mistook the case size names and wound up with a much larger case than I expected. That does give it capacity to shove a heap of spinning rust drives into it and leverage that for things like historic data that doesn’t need the fast disk. It’s got an Intel Core i7 CPU which is barely being used. I had wanted 128GB of RAM since Ethereum nodes do like to cache stuff, but apparently using 4 sticks of RAM can cause instability so I’ve stuck to just 64GB for now. It seems to be plenty so far, but RAM is probably the main limiting factor at the moment. For disk it currently has two 4TB NVMe drives.
For software, the L1 consensus nodes are obviously all Teku and they’re doing great. The team has done a great job continuing to improve things since I left, so even with the significant growth in the validator set, it’s running very happily with less memory and CPU than it had been “back in my day”. The L1 Mainnet execution client is a reth archive node which has been quite successful. I did try a reth node for Sepolia but hit a few issues (which I think have now been fixed), so I’ve wound up running executionbackup with both geth and reth for Sepolia.
The L2 nodes are all op-node and op-geth - always good to actually run the software I’m helping build. For OP Sepolia, I’m also running op-dispute-mon and op-challenger to both monitor the fault proof system and participate in games to ensure correct outcomes. I really do like the fact that OP fault proofs are fully permissionless so anyone can participate in the process just like my home lab now does.
For coordination, everything is running in Docker via docker-compose, which made it much easier to avoid all the port conflicts that would otherwise occur. Each network has its own docker-compose file, though there are a bunch of Docker networks shared between chains so the L2s can connect to the L1s and everything can connect to metrics. All the compose files and other config are in a local git repo with a hook set up to automatically apply any changes, so I’ve wound up with a home-grown GitOps kind of setup. I did try using k8s with ArgoCD to “do it properly” at one point, but it just made everything far more complex and less reliable so I switched back to simple docker compose.
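To give an idea of what that hook does, here’s a minimal sketch of the kind of thing involved, assuming a bare config repo with a post-receive hook that checks the config out and re-applies each compose file - the paths and network directory names here are just hypothetical placeholders:

```bash
#!/bin/bash
# Hypothetical post-receive hook in the bare config repo.
set -euo pipefail

CONFIG_DIR=/srv/homelab   # hypothetical checkout location for the compose files

# Check out the latest config into the working directory
git --work-tree="$CONFIG_DIR" --git-dir="$PWD" checkout -f main

# Re-apply each network's compose file; containers whose config
# hasn't changed are left alone by 'up -d'
for network in eth-mainnet eth-sepolia op-mainnet op-sepolia base-mainnet base-sepolia
do
  docker compose --project-directory "$CONFIG_DIR/$network" up -d --remove-orphans
done
```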
For monitoring, I’ve got Victoria Metrics capturing metrics and Loki capturing logs - both automatically pick up any new hosts. Then there’s a Grafana instance to visualise it all. I even went as far as running ethereum-metrics-exporter to give a unified view of metrics when using different clients.
The final piece is an nginx instance that exposes all the different RPC endpoints at easy to remember URLs, ie `/eth/mainnet/el`, `/eth/mainnet/cl`, `/op/mainnet/el` etc. All the web UIs for the other services like Grafana are exposed through the same nginx instance. My initial build exposed all the RPCs on different ports and it was a nightmare trying to remember which chain was on which port, so the friendly URLs have been a big win.
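To give a feel for how those friendly URLs get used, here are a couple of example requests - `homelab` is just a placeholder hostname, and the exact paths depend on how the nginx proxying is set up:

```bash
# Latest block number from the MainNet execution client (standard JSON-RPC)
curl -s http://homelab/eth/mainnet/el \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"eth_blockNumber","params":[]}'

# Sync status from the MainNet consensus client (standard beacon REST API)
curl -s http://homelab/eth/mainnet/cl/eth/v1/node/syncing
```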
Overall I’m really very happy with the setup and it is lightning fast even to perform quite expensive queries like listing every dispute game ever created. Plus it was fun to play with some “from scratch” system admin again instead of doing everything in the cloud with already existing templates and services setup.
Moving On From ConsenSys
After nearly 5 years working with the ConsenSys protocols group, I’ll be finishing up at the end of January.
So what happens with Teku? It will carry on as usual and keep going from strength to strength. There’s an amazing team of people building Teku and I have complete confidence in their ability to continue building Teku and contributing to the future of the Ethereum protocol. Teku started well before I was involved with it and has always been the work of an amazing team of people. I just wound up doing a lot of the more visible stuff - answering discord questions and reacting to the ad-hoc stuff that popped up.
My time at ConsenSys actually started by working on Besu, back before its initial release when it was called Pantheon. I was part of the team adding the initial support for private networks and then later moved over to join the team focussed on MainNet compatibility, working on things like fast sync, the core EVM and all that kind of fun. After that I got to help build a new team to focus on setting up tooling to make development and testing easier - modernising build and release systems, automated deployment and monitoring of test nodes and so on.
Then this “Ethereum 2.0” thing seemed like it might actually be ready to move out of the research phase and move towards production. So I joined the research team that was building “Artemis” to start bringing it out of research and to a real production-ready client. Most of the research team moved on to other research topics and we built a mostly new team around what we then called Teku. And so began one heck of a journey leading to the beacon chain launch, Altair and then The Merge. Hearing the crowd cheering in support of the merge at DevCon this year is one of the great highlights of my career.
I’m so lucky to have gotten to work with some truly amazing people. The folks who have been part of the Teku team along our journey share a truly special place in my heart though, and I will always be grateful for the shared knowledge, persistence and dedication they have all contributed, but even more so the caring, friendly way they contributed it. And it’s not just the teams in ConsenSys but right across the Ethereum eco-system. The way the different consensus client teams have come together to push Ethereum forward is particularly amazing. These are ostensibly teams that are competing with each other and yet actively share knowledge to improve both the protocol and other teams’ clients.
As I leave ConsenSys, I do so knowing that there are teams of incredible people who will carry on with the work I’m so privileged to have been able to contribute to.
So why the change? Mostly because this is a good time for me personally. As I mentioned, I started working on Teku to bring it out of research and into production. Getting The Merge done is a natural endpoint of that mission and a natural place to start looking for new challenges and opportunities. Obviously there are plenty of remaining things to improve in the Ethereum protocol and clients like Teku, but I’m keen to get a bit further out of my comfort zone.
So what’s next? I’ll be taking up a role as Staff Protocol Engineer with OP Labs to work on Optimism. I started looking at opportunities at Optimism because I’ve seen some of the great work they’ve been doing and I really like their retroactive public goods funding - it shows they’re investing in Ethereum, not just taking what they can get from it. Primarily though for me, finding a great place to work is about finding a great team of people doing interesting work. As I talked with various people from the Optimism team, I found them to be smart, curious, welcoming people who not only wanted to build great software but also wanted to keep improving the way they went about that. Plus I’ll be staying in the Ethereum eco-system so still get to work with all those amazing people. I can already see there’s a ton of stuff I can learn from the Optimism team and I think there’s places where I can bring some useful skills and experience beyond just writing some code.
In fact, given they mostly use Go and I have no real Go experience, “just writing some code” will be one of the first fun challenges. Java has kind of followed me for my career, not entirely deliberately though I do like it as a language, so I’m actually excited to really dig into writing production grade Go code.
Philosophically, one of the things I dislike about Ethereum (and blockchains in general) is that the high cost of transactions means it often becomes a rich person’s game and it often feels like people just throwing play money around. L2 solutions like Optimism are a big part of solving that by scaling blockchains and dramatically reducing fees. It feels good to me to be contributing to that. So much of the potential of Ethereum is waiting to be unlocked once it really scales. Besides, having worked on execution and consensus layers so far, moving to Layer 2 seems like an obvious next step.
Overall, I’m excited about the future of Teku and will be cheering the team on, and excited about the future of Ethereum and look forward to being part of delivering The Surge.
DevCon VI Talks
Mostly just so that I can find them more easily later, here are the recordings of the DevCon VI talks I gave in Bogotá.
Firstly, Post-Merge Ethereum Client Architecture:
And a panel, It’s 10pm, do you know where your mnemonic is?
Understanding Attestation Misses
The process of producing attestations and getting them included into the chain has become more complex post-merge. Combined with a few client issues causing more missed attestations than normal, there are lots of people struggling to understand what’s causing those misses. So let’s dig into the process involved and how to identify where problems are occurring.
Attestation Life-Cycle
There are a number of steps required to get an attestation included on the chain. My old video explaining this still covers the details of the journey well - the various optimisations I talk about there have long since been implemented but the process is still the same. In short, the validator needs to produce the attestation and publish it to the attestation subnet gossip channel, an aggregator needs to include it in an aggregation and publish it to the aggregates gossip channel, and then it needs to be picked up by a block producer and packed into a block.
Attestations are more likely to be included in aggregates and blocks if they are published on time and match the majority of other attestations. Attestations that are different can’t be aggregated, so they’re much less likely to be included in aggregates (the aggregator would have to produce an attestation that matches yours), and they take up one of the 128 attestation slots available in a block while paying less than better-aggregated attestations.
Since attestations attest to the current state of the chain, the way to ensure your attestation matches the majority is to ensure you’re following the chain well. That’s where most of the post-merge issues have been - blocks taking too long to import, causing less accurate attestations which are then more likely to not get included. So let’s look at some metrics to follow so we can work out what’s happening.
Key Indicators of Attestation Performance
Often people just look at the “Attestation Effectiveness” metric reported by beaconcha.in, but that’s not a great metric to use. Firstly, it tries to bundle every possible measure of attestations, some within your control and some not, into a single metric. Secondly, it tends to be far too volatile, with a single delayed attestation causing a very large drop in the effectiveness metric and distorting the result. As a result, it tends to make your validator performance look worse than it is and doesn’t give you any useful information to act on.
So let’s look at some more specific and informative metrics we can use instead.
Firstly, for the overall view, look at the percentage of attestation rewards earned. While that write up is pre-Altair, the metrics on the Teku Dashboard have been updated to show the right values even with the new Altair rules. Look at the “Attestation Rewards Earned” line on the “Attestation Performance” graph in the top left of the dashboard. This will tell you quite accurately how well you’re doing in terms of total rewards, but it still includes factors outside of your control and won’t help identify where problems are occurring.
To identify where problems are occurring we need to dig a bit deeper. Each epoch, Teku prints a summary of attestation performance to the logs like:
Attestation performance: epoch 148933, expected 16, produced 16, included 16 (100%), distance 1 / 1.00 / 1, correct target 16 (100%), correct head 16 (100%)
This is an example of perfect attestation performance - we expected 16 attestations, 16 were included, the distance had a minimum of 1, average of 1.00 and maximum of 1 (the distance numbers are min / avg / max in the output) and 100% of attestations had the correct target and head. One thing to note is that attestation performance is reported 2 epochs after the attestations are produced to give them time to actually be included on chain. The `epoch` reported in this line tells you which epoch the attestations being reported on are from.
Each of these values are also available as metrics and the Teku Dashboard uses them to create the “Attestation Performance” graph. That provides a good way to quickly see how your validators have performed over time and get a better overview rather than fixating on a single epoch that wasn’t ideal.
Attestations Expected
Each active validator should produce one attestation per epoch. So the `expected` value reported should be the same as the number of active validators you’re running. If it’s less than that, you probably haven’t loaded some of your validator keys and they’ll likely be missing all attestations. It’s pretty rare that `expected` isn’t what we expect though.
Attestations Produced
If the `produced` value is less than the `expected` then something prevented your node from producing attestations at all. To find out what, you’ll need to scroll back up in your validator client logs to the epoch this performance report is for - remember that it will be 2 epochs ago. We’re looking for a log that shows the result of the attestation duty. When the attestation is published successfully it will show something like:
Validator *** Published attestation Count: 176, Slot: 3963003, Root: b4ca6d61be7f54f7ccc6055d0f37f122943e8313dbcfe49513c9d4ef50bbc870
The `Count` field is the number of local validators that produced this attestation (this example is from our Görli testnet node - sadly we don’t have that many real-money validators).
When an attestation fails to be produced the log will show something like:
Validator *** Failed to produce attestation Slot: 4726848 Validator: d278fc2
java.lang.IllegalArgumentException: Cannot create attestation for future slot. Requested 4726848 but current slot is 4726847
at tech.pegasys.teku.validator.coordinator.ValidatorApiHandler.createAttestationData(ValidatorApiHandler.java:324)
at jdk.internal.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at tech.pegasys.teku.infrastructure.events.DirectEventDeliverer.executeMethod(DirectEventDeliverer.java:74)
at tech.pegasys.teku.infrastructure.events.DirectEventDeliverer.deliverToWithResponse(DirectEventDeliverer.java:67)
at tech.pegasys.teku.infrastructure.events.AsyncEventDeliverer.lambda$deliverToWithResponse$1(AsyncEventDeliverer.java:80)
at tech.pegasys.teku.infrastructure.events.AsyncEventDeliverer$QueueReader.deliverNextEvent(AsyncEventDeliverer.java:125)
at tech.pegasys.teku.infrastructure.events.AsyncEventDeliverer$QueueReader.run(AsyncEventDeliverer.java:116)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
The specific reason the attestation failed can vary a lot. In this case the beacon node wasn’t keeping up for some reason, which would require further investigation into Teku and its performance. One common source of failures is the beacon node or execution client not being in sync at the time, which appears as a `503` response code from the beacon node when using the external validator client.
We can look at the “Produced” line on the “Attestation Performance” graph of the standard Teku dashboard to see the percentage of expected attestations that were produced over time.
Attestation Timing
If the attestation was produced, the next thing to check is that it was actually produced on time. If you find the `Published attestation` log line, you can compare the timestamp of that log message to the time the attestation’s slot started. You can use Slot Finder to find the start time of the slot. Attestations are due to be published 4 seconds into the slot. Anywhere from the start of the slot up to about 4.5 seconds after is fine.
You can also use the `validator_attestation_publication_delay` metric to track publication times. The Teku Detailed dashboard includes graphs of this under the Validator Timings section.
Remember that neither logs nor metrics can identify when your system clock is incorrect, because the timings they’re using are from the system clock too. Make sure you’re running ntpd or chrony and that they report the clock as in sync.
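A quick way to sanity check that, assuming a reasonably standard Linux host:

```bash
# systemd-based hosts report whether the clock is being NTP synchronised
timedatectl | grep -iE 'synchronized|ntp'

# or, if you're running chrony, check the measured offset directly
chronyc tracking
```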
Correct Head Vote
If the attestation was published on time, we need to start checking if it matched what the majority of other nodes produced. There isn’t a simple way to do this directly, but generally if the head block our attestation votes for turns out to be correct, we will almost certainly have agreed with the majority of other validators. The `correct head 16 (100%)` part of the attestation performance line shows how many attestations produced had the right head block. If that’s at 100% and the attestations were all published on time, there isn’t really much more your node can do.
Having some attestations with incorrect head votes may mean your node is too slow importing blocks. Note though that block producers are sometimes slow in publishing a block. These late blocks sometimes mean that the majority of validators get the head vote “wrong”, so it’s not necessarily a problem with your node when head votes aren’t at 100%. Even if it is your node that’s slow, we need to work out if the problem is in the beacon node or the execution client. Block timing logs can help us with that.
Block Timings
To dig deeper we need to enable some extra timing metrics in Teku by adding the `--Xmetrics-block-timing-tracking-enabled` option. This does two things. Firstly, when a block finishes importing more than 4 seconds into a slot (after attestations are due), Teku will now log a `Late Block Import` line which includes a breakdown of the time taken at each stage of processing the block (albeit very Teku-developer oriented). Secondly, it enables the `beacon_block_import_delay_counter` metric which exposes that breakdown as metrics. Generally, for any slot where the head vote is incorrect, there will be a late block import that caused it. We just need to work out what caused the delay.
An example late block log looks like:
Late Block Import *** Block: c2b911533a8f8d5e699d1a334e0576d2b9aa4caa726bde8b827548b579b47c68 (4765916) proposer 6230 arrival 3475ms, pre-state_retrieved +5ms, processed +185ms, execution_payload_result_received +1436ms, begin_importing +0ms, transaction_prepared +0ms, transaction_committed +0ms, completed +21ms
Arrival
The first potential source of delay is that the block just didn’t get to us in time. The `arrival` timing shows how much time after the start of the slot the block was first received by your node. In the example above, that was 3475ms which is quite slow, but did get to us before we needed to create an attestation 4 seconds into the slot. Delays in arrival are almost always caused by the block producer being slow to produce the block. It is however possible that the block was published on time but took a long time to be gossiped to your node. If you’re seeing late arrival for most blocks, there’s likely an issue with your node - either the system clock is wrong, your network is having issues or you may have reduced the number of peers too far.
Execution Client Processing
Post-merge, importing a block involves both the consensus and execution clients. The time Teku spends waiting for the execution client to finish processing the block is reported in the `execution_payload_result_received` value. In this case that was 1436ms, which would have been ok if the block hadn’t been received so late, but isn’t ideal. Under 2 seconds is probably ok most of the time, but under 1 second would be better. Execution clients will keep working on optimisations to reduce this time so it’s worth keeping up to date with the latest version of your client.
Note that prior to Teku 22.9.1 this entry didn’t exist and the execution client time was just counted as part of `transaction_prepared`.
Teku Processing
The other values are all various aspects of the processing Teku needs to do. `pre-state_retrieved` and `processed` are part of applying the state transition when processing the block. `begin_importing`, `transaction_prepared` and `transaction_committed` record the time taken in various parts of storing the new block to disk. Finally `completed` reports the final details of things like updating the fork choice records and so on.
Prior to Teku 22.9.1, `transaction_committed` was a common source of delays when updating the actual LevelDB database on disk. The disk update is now asynchronous so unless the disk is really exceptionally slow this value is generally only 0 or 1ms.
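If you prefer to track this via metrics rather than chasing log lines, the same breakdown is available from the `beacon_block_import_delay_counter` metric mentioned above. A quick way to eyeball it, assuming Teku’s metrics are enabled on the default port of 8008:

```bash
# Per-stage block import delays, as exposed to Prometheus/Victoria Metrics
curl -s http://localhost:8008/metrics | grep beacon_block_import_delay_counter
```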
Next Steps
All these metrics let us get an understanding of where time was spent or where failures occurred. If your node is processing blocks quickly, publishing attestations on time and the system clock is accurate, there’s probably very little you can do to improve things - having the occasional delayed or missed attestation isn’t unheard of or really worth worrying about.
Otherwise these metrics and logs should give a fairly clear indication of which component is causing problems, so you can focus investigations there and get help as needed.
Beacon REST API - Fetching Blocks on a Fork
When debugging issues on the beacon chain, it can be useful to download all blocks on a particular, potentially non-canonical fork. This script will do just that.
The script should work with any client that supports the standard REST API. Execute it with fetch.sh <BLOCK_ROOT> <NUMBER_OF_BLOCKS_TO_DOWNLOAD>
#!/bin/bash
set -euo pipefail
ROOT=${1:?Must specify a starting block root}
COUNT=${2:?Must specify number of blocks to fetch}
for i in $(seq 1 $COUNT)
do
  # Fetch the block as JSON and pull out its slot and parent root
  curl -s http://localhost:5051/eth/v2/beacon/blocks/${ROOT} | jq . > tmp.json
  SLOT=$(jq -r .data.message.slot tmp.json)
  PARENT=$(jq -r .data.message.parent_root tmp.json)
  mv tmp.json ${SLOT}.json
  # Fetch the same block again, this time in SSZ format
  curl -s -H 'Accept: application/octet-stream' http://localhost:5051/eth/v2/beacon/blocks/${ROOT} > ${SLOT}.ssz
  echo "$SLOT ($ROOT)"
  # Follow the parent root to walk backwards down the fork
  ROOT=$PARENT
done
Blocks are downloaded in both JSON and SSZ format. As it downloads it prints the slot and block root for each block it downloads.
This is particularly useful when combined with Teku’s `data-storage-non-canonical-blocks-enabled` option which makes it store all blocks it receives, even if they don’t wind up on the finalized chain.
Aggregators and DVT
Obol Network are doing a bunch of work on distributed validator technology and have hit some challenges with the way the beacon REST API determines if validators are scheduled to be aggregators.
Oisín Kyne has written up a detailed explanation of the problem with some proposed changes. Mostly noting it here so I can find the post again later.
Personally I’d like to avoid adding the new `/eth/v1/validator/is_aggregator` endpoint and just have that information returned from the existing `/eth/v1/validator/beacon_committee_subscriptions` endpoint, given it has to be changed anyway and the beacon node will have to check aggregator status as part of handling that call regardless. Otherwise it seems simple enough to implement and is worth it to enable DVT to be delivered as middleware rather than having to replace the whole validator client.
Exploring Eth2: Previous Attesters
In the beacon chain spec, the chain is justified when at least 2/3rds of the active validating balance attests to the same target epoch. Simple enough, but there are a couple of little quirks that are easy to miss.
The relevant part of the spec is:
def weigh_justification_and_finalization(state: BeaconState,
total_active_balance: Gwei,
previous_epoch_target_balance: Gwei,
current_epoch_target_balance: Gwei) -> None:
previous_epoch = get_previous_epoch(state)
current_epoch = get_current_epoch(state)
old_previous_justified_checkpoint = state.previous_justified_checkpoint
old_current_justified_checkpoint = state.current_justified_checkpoint
# Process justifications
state.previous_justified_checkpoint = state.current_justified_checkpoint
state.justification_bits[1:] = state.justification_bits[:JUSTIFICATION_BITS_LENGTH - 1]
state.justification_bits[0] = 0b0
if previous_epoch_target_balance * 3 >= total_active_balance * 2:
state.current_justified_checkpoint = Checkpoint(epoch=previous_epoch,
root=get_block_root(state, previous_epoch))
state.justification_bits[1] = 0b1
if current_epoch_target_balance * 3 >= total_active_balance * 2:
state.current_justified_checkpoint = Checkpoint(epoch=current_epoch,
root=get_block_root(state, current_epoch))
state.justification_bits[0] = 0b1
It then goes on to check if finalization should be updated. From this we can see there is already one quirk - both the `previous_epoch_target_balance` and the `current_epoch_target_balance` are compared to the same `total_active_balance`, yet the total effective balance of all active validators can change between epochs.
The second quirk is similar but can’t be seen from this code itself. It’s a little hard to summarize where the `previous_epoch_target_balance` value comes from by quoting the spec code as we have to follow the flow through a number of different functions. So let’s take a look at the Teku implementation which, for performance reasons, is a lot more direct:
UInt64 currentEpochActiveValidators = UInt64.ZERO;
UInt64 previousEpochActiveValidators = UInt64.ZERO;
UInt64 currentEpochSourceAttesters = UInt64.ZERO;
UInt64 currentEpochTargetAttesters = UInt64.ZERO;
UInt64 previousEpochSourceAttesters = UInt64.ZERO;
UInt64 previousEpochTargetAttesters = UInt64.ZERO;
UInt64 previousEpochHeadAttesters = UInt64.ZERO;
for (ValidatorStatus status : statuses) {
final UInt64 balance = status.getCurrentEpochEffectiveBalance();
if (status.isActiveInCurrentEpoch()) {
currentEpochActiveValidators = currentEpochActiveValidators.plus(balance);
}
if (status.isActiveInPreviousEpoch()) {
previousEpochActiveValidators = previousEpochActiveValidators.plus(balance);
}
if (status.isSlashed()) {
continue;
}
if (status.isCurrentEpochSourceAttester()) {
currentEpochSourceAttesters = currentEpochSourceAttesters.plus(balance);
}
if (status.isCurrentEpochTargetAttester()) {
currentEpochTargetAttesters = currentEpochTargetAttesters.plus(balance);
}
if (status.isPreviousEpochSourceAttester()) {
previousEpochSourceAttesters = previousEpochSourceAttesters.plus(balance);
}
if (status.isPreviousEpochTargetAttester()) {
previousEpochTargetAttesters = previousEpochTargetAttesters.plus(balance);
}
if (status.isPreviousEpochHeadAttester()) {
previousEpochHeadAttesters = previousEpochHeadAttesters.plus(balance);
}
}
Here we’re iterating through the `ValidatorStatus` info which roughly maps to the `Validator` object from the state, but with some handy abstractions to make it easier to support both Phase0 and Altair with less duplication. The thing to notice here is that regardless of whether we’re adding to the current or previous epoch balances, we’re using the same `balance` that we got from `getCurrentEpochEffectiveBalance`. Part of the epoch transition involves adjusting effective balances though, so the effective balance of a validator might have been different in the previous epoch.
Why is it like this? Primarily because the state only maintains the current effective balance for validators. To get the effective balance for a previous epoch you’d need a state from that epoch, but the state transition is designed to only need a single state and the block to apply (if any - you could just process empty slots). You could potentially make an argument that using the validator’s latest effective balance is better anyway since that’s what they actually have at stake now. In fact, any validators that are slashed in the current epoch are entirely excluded from the current and previous epoch attesting totals, which makes sense - we know they’re unreliable so we ignore their attestations.
What impact does this have? Essentially none. The amount that effective balances change is generally pretty limited and there are limits on the number of validators that can activate or exit each epoch. So the difference between the numbers you might expect and what you actually get is quite small, and you’d have to be right on the edge of justification for this to make any difference. In theory though, it is possible for an epoch to be balanced just right so that it doesn’t justify immediately, but does at the next epoch transition without including any new attestations. The opposite is also possible, where the epoch justifies but then effective balances change such that it doesn’t meet the threshold to justify as the previous epoch - that would just leave `state.current_justified_checkpoint` unchanged though, which means the original justification stands.
But it may make for a very niche trivia question one day, and now you’re prepared with the answer…
Checkpoint Sync - What If Infura Is Hacked?
One of the common concerns people raise about checkpoint sync is the risk that someone might hack Infura and return malicious initial states causing nodes to sync and be stuck on the wrong chain. Given users usually don’t verify the initial state and Infura is currently the only publicly available service supplying initial states, there is certainly some risk there but how concerned should we really be?
The initial state you use for checkpoint sync is important because it tells the beacon node which chain it should sync and that it should reject all others. So if the initial state is from the wrong chain, your node will sync that chain and any information you get from your node is likely to be wrong. That could result in your attesting to an incorrect chain and getting inactivity penalties on the real one or post-merge being fooled into buying or selling at a bad price because your node gave you an incorrect view of the market.
Definitely sounds scary but let’s work it through.
Firstly, all the big players - staking services, exchanges etc, wouldn’t use Infura, or any other public service, to get their initial states because they can get the initial state from one of their other nodes. So immediately nearly all the high value targets are safe.
Secondly, the attack only works on nodes syncing from scratch and the attacker can’t force people to resync their nodes [1]. There’s also a fairly limited amount of time before someone notices their node got an invalid initial state and blows the whistle and the attack would be stopped.
So this attack doesn’t look like a very good way to make money. What about just causing chaos? There’s a relatively small number of nodes syncing the chain at any time and only one of them needs to notice the problem to raise the alarm. So the potential for causing chaos is also quite limited.
Ultimately, the idea that an attacker who is able to compromise Infura would use that access to mess with checkpoint sync seems pretty unlikely. They could just mess with the data Infura returns to DApps and directly misrepresent the world state - a much more direct and likely more profitable way of achieving the same result. Or, most likely, they could just snoop on the incoming stream of transactions and keep all the best MEV opportunities to themselves.
Does that mean you shouldn’t verify your initial state? Absolutely not. While there’s little reason for an attacker to hack Infura for this purpose, that doesn’t mean it won’t ever happen. And more likely Infura might have a bug which causes it to follow the wrong chain by accident. There’s a lot of room between panicking and claiming checkpoint sync is unsafe (it’s not) and saying that it’s fine to not verify anything (it’s not).
Which is to say…
[1] And if an attacker could force you to resync, that would be a much bigger problem. ↩︎
Checkpoint Sync Safety
Apart from being awesomely fast, checkpoint sync also exists to ensure that you can safely sync despite the limitations of weak subjectivity. The initial state you use is considered trusted - you are telling your beacon node that this state is the canonical chain and it should ignore all others. So it’s important to ensure you get the right state.
Get It From Somewhere You Trust
The simplest and best way to ensure the state is right is to get it from somewhere trusted. There are a few options.
The best source is one of your own nodes. For setups that run multiple nodes that’s easy, as they can get an initial state for a new node from any of their existing nodes. Even people running single nodes may be able to use a state from their own node in some cases. For example, if you need to re-sync your node or are switching clients, you can store the current finalized state from your node before stopping it, then use that as the initial state for your new sync. In both these cases the solution is completely trustless - you’re using data from your own node, so there’s no need to trust anyone else and no need for further verification.
If you can’t get the state from your own node you’ll have to get it from someone else. A friend or family member you trust that runs their own node would be an excellent source. This isn’t trustless, but will usually still have a very high level of trust even without any further verification.
Otherwise you’ll have to get the state from a public provider. Currently that’s just Infura, but ideally more options will be available in the future. We’re beyond personal trust circles but Infura is certainly still a very reputable provider so most people would still have a reasonable level of trust in them.
The final option is to get the state from some random person on the internet. Seems crazy and is definitely not something to be trusted, but it is still an option if you verify the state against more trusted sources. Early on, before Infura supported the API to download the state, this was actually the most common way people used checkpoint sync - I would just periodically put a state up in a GitHub repo so they could access it.
Verify The State
If you get the state from a source you don’t fully trust, you’ll need to verify it. You can do this by calculating the hash tree root of the state, then checking that against one or more block explorers. In essence you’re aggregating your trust from multiple services until you (hopefully) reach a level you’re comfortable with.
The main problem with this is that you’ll need a tool to calculate the hash tree root of the state itself. It’s simpler to just use the state to sync your node then confirm that the block roots your node reports match block explorers. If you wind up at the right chain head, you must have started from a canonical state. You will likely want to disable any validators you run until you’ve verified the blocks - otherwise you may attest to something you don’t actually trust.
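As a sketch of what that comparison looks like in practice, assuming a locally running beacon node with the standard REST API on port 5051, you can pull the roots your node believes in and check them against a couple of explorers:

```bash
# The finalized checkpoint root and current head root according to your node -
# compare these against beaconcha.in, beaconscan or another explorer you trust
curl -s http://localhost:5051/eth/v1/beacon/headers/finalized | jq -r '.data.root'
curl -s http://localhost:5051/eth/v1/beacon/headers/head | jq -r '.data.root'
```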
Why Can’t The Beacon Node Do It For Me?
There have been some proposals for beacon nodes to automatically verify the state using heuristics like whether the state matches what the majority of your peers have. I’m personally very skeptical of such ideas because if you could trust the information from the network, you wouldn’t need checkpoint sync in the first place. The fundamental challenge introduced by weak subjectivity is that your node simply can’t determine what the canonical chain is (if it’s been offline for too long) and so has to be told. Your node’s peers are essentially the adversary we’re trying to defend against with checkpoint sync so we want to avoid using any information from them to second guess the initial state.
The beacon node could automate the process of checking against multiple block explorers but there are two problems with that.
Firstly, there isn’t an agreed API to perform that check so we’d need to either design that and get block explorers to support it or write custom code in clients to support each block explorer.
Secondly, and more problematically, client developers would have to decide which block explorers are trustworthy and embed the list into their clients. There’s already a lot of responsibility in being a client dev and a lot of trust the community puts in us - we really don’t want to expand that by also being responsible for deciding which services users should trust. There could just be a config option so users could specify their own list of explorers to verify against, but that’s a pretty clunky UX and it’s very unlikely users would go to the effort of finding suitable URLs and specifying them. Besides, it’s probably more work for them than just verifying the block roots manually.
Weak Subjectivity Checkpoints Have Failed
Let’s just admit it, weak subjectivity checkpoints have failed…
The beacon chain brings the concept of weak subjectivity to Ethereum, so prior to the launch of the beacon chain MainNet it was seen as important to have an agreed solution that would allow users starting a new node to verify that the chain they sync’d was in fact the canonical chain.
This was before any client had implemented checkpoint sync, so the proposal was to publish the epoch and block root from a point close enough to the chain head that it was within the weak subjectivity period, but far enough back that it wouldn’t need to change often (e.g. updating only every 256 epochs). Since it’s small and doesn’t change often, people could easily publish it via all kinds of methods - block explorers could show it, twitter bots could post it, it could even be printed in newspapers. Thus it would be easy for users to find it and they’d have lots of different sources to verify against.
Clients would then sync from genesis and when they got to the epoch in the weak subjectivity checkpoint, they’d check that the block root matches the block they have, thus confirming they sync’d the right chain. If it didn’t match the client would crash and the user would have to either correct the weak subjectivity checkpoint they were providing or delete the current sync and try again, hoping the client wound up on the right chain this time.
Of course, that’s the first major problem with weak subjectivity checkpoints - you can waste days syncing before realising you’re on the wrong chain. Then once you discover that, you have to start again without really having any way to avoid the problem happening again.
Somewhat surprisingly though, that’s not the biggest problem with weak subjectivity checkpoints. The biggest problem is that they’re actually extremely hard to find. For a while beaconscan provided an endpoint that would give back the current weak subjectivity checkpoint - no explanation of what it was or how to use it, but at least you could get the value. It appears they silently removed it at some point though. I don’t recall seeing weak subjectivity checkpoints published anywhere else.
In fact, I’m actually having trouble turning up a clear reference for how to calculate the weak subjectivity checkpoint. There is this doc in the specs which says how to calculate the weak subjectivity period, which amusingly basically has a TODO for how to distribute the checkpoints. There’s also this doc from Aditya which gives a lot more detail on weak subjectivity periods. I had thought there was some spec for ensuring everyone selected the same epoch to use for the weak subjectivity checkpoint - I thought it was a simple “every X epochs” - but I can’t actually find that documented anywhere.
We could fix this - a clearer and/or simpler spec on how to get the weak subjectivity checkpoint could be created (and there is a suggestion of adding it to the standard REST API), and we could put pressure on providers to publish those checkpoints, set up twitter bots and so on so it was readily available. Even then, we’d still have a truly lousy user experience, with slow sync times that get longer as more blocks are piled on to the beacon chain and that can only retrospectively tell you that you’ve been wasting your time.
So let’s just admit it, weak subjectivity checkpoints have failed. The future is in using checkpoint sync to start from the current finalized state. That minimises sync time and users can realistically take advantage of it today. It’s not perfect, we need more places to provide that state and better ways of validating that the state is in fact the right one, but it’s already more accessible than weak subjectivity checkpoints and has infinitely more users actually using it (ie more than zero). If we can move on from the idea of weak subjectivity checkpoints, we can focus on getting the most out of checkpoint sync in terms of usability and security.