Answering: "What happens today when we run nimbus --mainnet?"
#863
Labels
Networking
Security
Security vulnerability or security related
Sync
Prevents or affects sync with Ethereum network
Comments
jlokier added the Networking, Security, and Sync labels on Oct 19, 2021.
jlokier added a commit that referenced this issue on Feb 16, 2022:
A saving of about 24,000 GB over the current Mainnet disk space. At last, it's feasible to work with all Mainnet states up to the current head block.

With this patch, Nimbus-eth1 has access to the entire Mainnet state history, by reading from a specially-constructed database file of size 167.47 GiB which contains all the states.

For the first time ever, it's possible to run Nimbus-eth1 on high numbered Mainnet blocks! Validate the processing, and run things like current-day transactions on current-day states.

It's read-only at the moment. The format is not a fixed read-only format. It's actually designed to be part of a writable database, but it's been kept simple to ship something and be a proof of concept emphasising size.

Using this ability, Nimbus-eth1 can validate blocks throughout the whole history, and a number of blocks have been tried.

A smaller 25.03 GiB file is available for 90k blocks pruned state. Files for Goerli are also available. These files are available on request to test this code if someone wants to. (They can also be regenerated, but doing so requires a big machine and a synced Erigon instance. You may prefer to just get the files).

This frees people up to work on other areas with _full_ access to the Mainnet states, all the way from block 0 to near today's head block.

Each value is looked up, and compared with the value stored in RocksDB. Ultimately RocksDB can be dropped, but this is meant to be a proof of concept so for now it just compares values.

Some blocks fail. Close investigations with Etherscan's help indicate the data in the database file is fine, and it is the comparison function that misses some balance updates, for example when a transaction involves the same account as the miner.

**Size figures**

**Mainnet Ethereum "archive mode" state history in 167.47 GiB**. (Blocks 0 to 13818907. The final block is dated 2021-12-16 22:38:47 UTC).

This compares extremely favourably\* with [8,646 GiB and 8,719 GiB (charts)](https://etherscan.io/chartsync/chainarchive) used by popular implementations Geth and OpenEthereum respectively at the same time frame. It's a profound improvement over [22,911 GiB for Nimbus-eth1](#863) (= 24.6 TB), which this approach to storage was designed to address.

\* Note that those Etherscan charts show space used by other things than just state-history, but state-history accounts for almost all of that space.

To finish the comparison, minimum required Merkle hashes, block bodies, block headers, contract code and receipts must be added. Some more space on top is required in an actively updated database. Some experiments have been done and there are good reasons to believe all those things can be fitted in less than 420 GiB more "estimated worst case".

**Pruned size**

"Pruned mode" state history comes to **25.03 GiB**. (Blocks 13728908-13818907, 90k history).

This also compares favourably\* with [pruned mode charts](https://etherscan.io/chartsync/chaindefault), but the picture is more complicated with pruned state, as the other things contribute more significantly to the size in those charts. Even so, the pruned state size is promising.

**Lookup performance**

Any account or storage can be looked up at any point in block time in O(log N) time using these files.

This proof of concept is focused more on demonstrating small size than time, so the constant factor of the big-O notation is quite high, but when fully optimised the constant factor will have low IOPS, and reasonable for CPU and memory.

**Space first, speed second**

This is a proof of concept designed to highlight _space used_, rather than time.

The compact database is part of an implementation in progress of an on-disk data structure designed to be fast as well, for Ethereum use cases. Specifically, fast at random-access writes for EVM execution, and fast with low write-amplification for network state synchronisation. It is neither a B-tree nor an LSM-tree but has elements of both.

However the current implementation, although O(log N), has a high constant time and I/O factor. The number of I/O operations (IOPS) is significantly higher than necessary.

Speed will improve greatly when index blocks and structures inside each block are added to improve the performance. With those in place, the IOPS will drop to _less than 1 IOPS_ average per account/storage query during EVM executions, even at Mainnet archive scale. The structure is also designed to support fast network sync, and to store the received data efficiently without write-amplification.

The ad-hoc encoding of individual values has been through many iterations to optimise the assignment of bits and opcodes to different purposes, but a number of improvements are still known that would reduce size further.

Signed-off-by: Jamie Lokier <jamie@shareable.org>
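The size figures quoted in the commit message mix binary (GiB) and decimal (GB, TB) units. As a rough check of how they relate, here is a minimal sketch that converts the 22,911 GiB figure to TB and computes the saving against the 167.47 GiB file. The variable names are illustrative only and do not come from the Nimbus code.

```python
# Unit sizes in bytes: GiB is binary, GB and TB are decimal.
GIB = 2**30
GB = 10**9
TB = 10**12

old_gib = 22_911   # Nimbus-eth1 estimate quoted in the commit message
new_gib = 167.47   # compact "archive mode" state history file

# 22,911 GiB expressed in decimal terabytes.
old_tb = old_gib * GIB / TB

# Difference between the two sizes, expressed in decimal gigabytes.
saving_gb = (old_gib - new_gib) * GIB / GB

print(f"{old_tb:.1f} TB")               # 24.6 TB
print(f"{saving_gb:,.0f} GB saved")     # a little over 24,000 GB
```

This is consistent with both the "= 24.6 TB" equivalence and the "about 24,000 GB" saving stated in the message.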
Obsolete after significant improvements and optimizations, and references to various Goerli blocks are not useful anymore.
This meta issue is to write down a few observations when we run `nimbus --mainnet` (now changed to `nimbus --network=mainnet`).

Nobody tested `nimbus --mainnet` with the serious goal of completing sync to head yet. It was known to take a lot of space and be I/O intensive. People said it would take "about 2 TB". So I thought, these days we can afford that and decided to try it out properly, to see what happens for real so we're not guessing. Boy was 2 TB an underestimate. At block 6000000 (44.69%) and 4.1 TB storage used, I stopped. Total estimated space is 9-15 TB. (A similar order of magnitude as Geth or OpenEthereum in archive mode.)

Most issues are more easily found and fixed by syncing to Goerli first. We won't note those here. For those, see instead the related issue #862 'Answering: "What happens today when we run `nimbus --goerli`?"' which has a detailed list of issues, all of which affect Mainnet too. (See also related issues #688 "Sync to Mainnet" and #687 "Sync to Goerli".)

Issues which are specific to Mainnet should be filed individually and fixed one by one, outside this meta issue. Ideally we will file those issues and fixes, and update this meta issue to point to them.
Time and space required
It has proven useful to know a guideline for how much time and storage to expect, so a Mainnet sync can be replicated without going through the tedium of trial and error, disk full recovery efforts, etc.
(Later commits are required to complete, see issues in #862)
Projected to be 24.6 TB at head on 2022-01-13 (using Etherscan curve)
To reach block 6000000 in a similar time, you will need to run `nimbus --mainnet` in a loop to auto-restart it when it crashes, and with enough storage space. The time shown above does not count stops during the test where Nimbus crashed and was later restarted after analysis, time to recover from disk full conditions, time spent syncing which was reverted to a clean storage snapshot, or time when sync progress was stopped due to one of the bugs affecting progress. (True calendar time for this test was 28 days 18 hours).

Total storage estimate
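The auto-restart loop mentioned for reaching block 6000000 could look something like the sketch below. This is not the script used for the test. The helper name `run_with_restarts`, the restart limit and the delay are all assumptions for illustration, and the only Nimbus command lines used are the ones quoted in this issue.

```python
import subprocess
import time


def run_with_restarts(cmd, max_restarts=1000, delay_s=5.0,
                      run=subprocess.call, sleep=time.sleep):
    """Run cmd repeatedly until it exits with status 0.

    Returns the number of restarts that were needed. Raises RuntimeError
    if the process is still failing after max_restarts restarts.
    `run` and `sleep` are injectable so the loop can be tested without
    launching a real process.
    """
    restarts = 0
    while True:
        status = run(cmd)  # blocks until the process exits
        if status == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(f"{cmd!r} still failing after {restarts} restarts")
        restarts += 1
        sleep(delay_s)  # brief pause so a crash loop does not spin the CPU


# Hypothetical usage (assumes a `nimbus` binary on PATH):
# run_with_restarts(["nimbus", "--network=mainnet"])
```

A loop like this only handles crashes. It does nothing about disk full conditions or bugs that stop sync progress without exiting, both of which needed manual attention during the test.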
UPDATE 2022-01-13: The estimated total storage to reach Mainnet head at block 13993867 is 24.6 TB. This turned out to be considerably larger than the 9-15 TB initial estimate below. It was found by using the Geth and OpenEthereum growth charts at Etherscan to estimate the space ratio growing from block 6000000 to the head block.
Estimated total storage to reach Mainnet head at block 13425180 is 9-15 TB. I didn't have enough spare SSD to run Mainnet that far. The estimate is from extrapolating block 6000000 / 4.1 TB to block 13425180 which gives 9.2 TB, and then adding more because experience with the smaller networks suggests the growth rate increases later in the chain. (Goerli grew from 396 GB to 805 GB between blocks 4792321 and 5631351.)
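The arithmetic behind the initial estimate can be reproduced with a short sketch. It performs the plain linear extrapolation described above and compares Goerli's per-block growth before and during the quoted block range. The function name is illustrative, and the Goerli comparison assumes the database started empty at block 0.

```python
def extrapolate_storage_tb(used_tb, at_block, head_block):
    """Linear extrapolation: assume storage grows in proportion to block number."""
    return used_tb * head_block / at_block


# Mainnet: 4.1 TB used at block 6000000, head at block 13425180.
progress = 6_000_000 / 13_425_180
estimate_tb = extrapolate_storage_tb(4.1, 6_000_000, 13_425_180)
print(f"{progress:.2%}")         # 44.69%
print(f"{estimate_tb:.1f} TB")   # 9.2 TB

# Goerli: average GB per block up to block 4792321 (assuming an empty
# database at block 0) versus the average over blocks 4792321..5631351.
early_rate = 396 / 4_792_321
late_rate = (805 - 396) / (5_631_351 - 4_792_321)
growth_ratio = late_rate / early_rate
print(f"later blocks grow about {growth_ratio:.1f}x faster per block")
```

The Goerli ratio illustrates why the linear 9.2 TB figure was treated as a lower bound and widened to 9-15 TB.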
About the database size
The default prune mode is in operation, which is `--prune:full`, and in fact many state pruning events are performed. It's not possible to recover full state history from this database. This is not an "archive node", despite what the size may suggest.

Note: Especially to readers outside the core team, it's worth mentioning the database and sync method are being replaced by an Exciting New Design™🏝 that is much faster and smaller. This test and #862 (Goerli test) were done to examine the current status, systematically track every issue that shows up so we can address them, and get handy baseline measurements to compare against.
Issues specific to Mainnet
Issues listed in #862 'Answering: "What happens today when we run nimbus --goerli?"' that affect both Goerli and Mainnet aren't duplicated here.
Most differences are general things about scale and adversity, but these are not specific bugs:
Below is a consensus bug which was seen only on Mainnet, and prevented sync from progressing. Because testing only went a little past 6000000 (44.69%), there may be other logic issues that we have not detected at higher block numbers.
Points where bulk sync stopped (only seen on Mainnet)
- Progress stopped at block 6000961. This block number was due to a consensus bug at block 6001128 (see next), and the batching logic in `blockchain_sync` which does 192 blocks at a time and aborts the whole batch when any block fails.
- Consensus bug at block 6001128. This occurs on a `CREATE` or `CREATE2` operation, but it is not the same bug as the one at Goerli block 5080941 (see Answering: "What happens today when we run `nimbus --goerli`?" #862).
  - [...] `writeContract` logic which occurs on a `CREATE` or `CREATE2` operation, but it is not the same bug as the one affecting Goerli block 5080941 (see Answering: "What happens today when we run `nimbus --goerli`?" #862) which is also linked to `writeContract` logic.
  - [...] `CREATE`/`CREATE2`'s returndata bug", which changes handling of received `returnData` from calling a nested contract.
  - [...] `writeContract` instead of handling of received `returnData`. Because of this overlap, the Mainnet consensus bug at 6001128 and Goerli consensus bug at 5080941 were thought to be the same bug at first.
  - [...] `SELFDESTRUCT` interaction with `CREATE` or `CREATE2`.
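The way a single bad block at 6001128 leaves progress stuck at 6000961 follows from the all-or-nothing batching described for `blockchain_sync`. The sketch below is a simplified model of that behaviour, not the actual Nimbus code. The function `sync_batches`, the `is_valid` callback, and the assumption that each batch starts immediately after the last persisted block are all hypothetical.

```python
BATCH_SIZE = 192  # blocks per batch, as stated in the issue


def sync_batches(head, target, is_valid, batch_size=BATCH_SIZE):
    """Advance `head` towards `target` one batch at a time.

    If any block in a batch fails validation, the whole batch is
    discarded and sync stops at the last block of the previous batch.
    """
    while head < target:
        batch = range(head + 1, min(head + batch_size, target) + 1)
        if not all(is_valid(n) for n in batch):
            return head  # nothing from the failing batch is kept
        head = batch[-1]
    return head


# Model the Mainnet case: only block 6001128 fails validation.
stuck_at = sync_batches(
    head=6_000_961 - 3 * BATCH_SIZE,
    target=6_002_000,
    is_valid=lambda n: n != 6_001_128,
)
print(stuck_at)  # 6000961
```

In this model the batch covering blocks 6000962 to 6001153 contains the failing block, so the 166 valid blocks before it in that batch are also lost and the reported stopping point is 167 blocks earlier than the actual bug.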