This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Storage chains part 1 #7868

Merged
11 commits merged into master from a-tx-storage
Jan 14, 2021

Conversation

@arkpar (Member) commented Jan 11, 2021

Initial groundwork towards supporting storage chains in substrate. This includes transactions addressable in the database and block/transaction pruning.

Future PRs will include:

  • Serving transactions over IPFS bitswap
  • Ref-counted transactions and renewals.
  • Offchain worker API
  • Runtime module

polkadot companion: paritytech/polkadot#2253

@arkpar arkpar added the A3-in_progress Pull request is in progress. No review needed at this stage. label Jan 11, 2021
@burdges commented Jan 11, 2021

Can you explain what "storage chain" means? Why is IPFS involved? What purpose it serves, aka what is the threat model?

@arkpar (Member, Author) commented Jan 11, 2021

The idea is to maximize on-chain storage efficiency. Here's Gav's writeup on the feature:

v1.0: Finite history direct storage chain:

  • 6 second blocks (10/min), 4 MB max extrinsics/block
  • IPFS-addressable extrinsics (new feature)
  • This chain doesn't have its own internal economy but rather uses a DOT derivative.
  • State only handles the data market and maintains upcoming renewals. Each block has an array of sizes, one for each extrinsic, rounded up to 256 bytes. Assuming max 16 MB extrinsics (a 24-bit size), we can drop the lower 8 bits to get a Vec<u16>. Assuming 40 KB average item size and 100 items per block, that is 200 bytes of state per block, ~100,000 state entries over the 7-day period, and ~20 MB of state.
  • Each block from the last 7 days also has a 32-byte Merkle root stored in state. The trie is formed (in-memory) from the 32-byte chunked extrinsics (or perhaps use some other chunking means), indexed by (extrinsic_index, chunk_index). This should be checked as part of block verification.
  • Creating a new block requires solving a short proof-of-work. The seed for the PoW is derived from several randomly selected bytes from the block 7 days prior (and which is now going out of scope); the proof of these bytes as well as the PoW solution is included on-chain. The risk of short-term forks may first be reduced through a proof-of-stake mechanism which applies different difficulty requirements to each potential block author.
  • Block authors are paid out 50% of the fees immediately, and 50% is placed in a pot for them to split pro-rata weekly.
  • New data extrinsics charged per-byte (to pay for network storing a byte for 7 days); renewals charged per-byte on their original data (the size is in state). A single fee for the book-keeping of the size payable, too. Since everything is fixed-timespan, there's no issue with fee-charged (as opposed to deposit-taking).
  • Network maintainers (i.e. full nodes) are required to retain
  • Assume only 7-day block history (and all current state) needed for sync of full nodes.
  • All data (extrinsics) kept for minimum 7 days. Data can be renewed with a simple on-chain tx (no need to resend data).
  • Nodes throw out old (> 7 days) blocks and their data.
  • Total max block data needed to be stored by full-nodes: 7*24*60*10*4MB =~ 400GB
  • Needs fast-sync (warp sync not so important) along with only partial block history download. Syncing beyond 7 days of history is designed to be needless for running a block-authoring full-node.
  • An optimised publication of blocks allows extrinsics whose hash already appears in chain history to be omitted. For normal sync behaviour, the full block (all extrinsics) remains available (to ensure that nodes which are not yet synced and don't have full history can still sync properly).
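As a rough illustration of the size arithmetic in the bullets above (a hypothetical sketch, not code from this PR; `encode_size` is an invented helper name):

```rust
/// Hypothetical helper: sizes are rounded up to 256-byte granularity and the
/// lower 8 bits dropped, so a ~24-bit size fits in a u16 as described above.
fn encode_size(len: u32) -> u16 {
    assert!(len <= u16::MAX as u32 * 256, "max ~16 MB extrinsics");
    ((len + 255) / 256) as u16
}

fn main() {
    // A 40 KB average item occupies 160 chunks of 256 bytes.
    assert_eq!(encode_size(40 * 1024), 160);
    // 100 items per block stored as a Vec<u16> -> 200 bytes of state per block.
    assert_eq!(100 * std::mem::size_of::<u16>(), 200);
    // 7 days of 6-second blocks (10/min): ~100k blocks, hence ~100k state entries.
    assert_eq!(7 * 24 * 60 * 10, 100_800);
    // Max block data kept by full nodes at 4 MB per block: ~400 GB.
    assert_eq!(7 * 24 * 60 * 10 * 4, 403_200); // in MB
}
```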

@arkpar arkpar marked this pull request as ready for review January 12, 2021 11:43
@arkpar arkpar added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. and removed A3-in_progress Pull request is in progress. No review needed at this stage. labels Jan 12, 2021
@arkpar arkpar requested a review from bkchr January 13, 2021 09:07
@cheme (Contributor) left a comment

So, to be sure, the PR introduces:

  • an alternate block storage mode config (not switchable) that allows addressing extrinsics directly by their hash.
  • pruning for block bodies (in both modes).

Wondering a bit about addressing by something like 'BlockID ++ Extrinsic index' instead of extrinsic hash.
But I guess it doesn't make sense if extrinsic/transaction hash is going to be replaced with ipfs address and support ref counting?

'Transaction' usage in naming and description instead of block extrinsic confused me a little, but it is probably related to the targeted functionalities (which I am unaware of).

/// State pruning settings.
pub state_pruning: PruningMode,
/// Block pruning settings.
pub keep_blocks: KeepBlocks,
Contributor:

block_pruning ? body_pruning ? for name consistency.

Member (Author):

Unlike state pruning, the only setting here is the number of blocks to keep. So I went with more straightforward name.

let mut hashes = Vec::with_capacity(body.len());
for extrinsic in body {
let extrinsic = extrinsic.encode();
let hash = HashFor::<Block>::hash(&extrinsic);
Contributor:

Is there some collision possible here? two identically encoded extrinsic between two different blocks.
Edit: just viewed the 'ref counted' comment in the pr description, just not 100% sure it is related.

Member (Author):

Collisions will be handled by reference counting, indeed. ParityDb will use its internal reference counting mechanism, and for RocksDb there will be an explicit counter.
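The explicit counter mentioned here was deferred to a later PR; a minimal sketch of what such reference counting could look like (all names invented, with an in-memory map standing in for the database column):

```rust
use std::collections::HashMap;

/// Hypothetical sketch of an explicit reference counter for a
/// content-addressed transaction store: identical extrinsics hash to the
/// same key, so each insert bumps a counter and the payload is only
/// deleted once the counter drops to zero.
struct RefCountedStore {
    entries: HashMap<[u8; 32], (u32, Vec<u8>)>, // hash -> (refcount, payload)
}

impl RefCountedStore {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    fn insert(&mut self, hash: [u8; 32], payload: Vec<u8>) {
        self.entries
            .entry(hash)
            .and_modify(|(count, _)| *count += 1)
            .or_insert((1, payload));
    }

    /// Returns true if the payload was actually removed from storage.
    fn remove(&mut self, hash: &[u8; 32]) -> bool {
        let delete = match self.entries.get_mut(hash) {
            Some((count, _)) => {
                *count -= 1;
                *count == 0
            }
            None => false,
        };
        if delete {
            self.entries.remove(hash);
        }
        delete
    }
}

fn main() {
    let mut store = RefCountedStore::new();
    let hash = [1u8; 32];
    store.insert(hash, b"payload".to_vec()); // original transaction
    store.insert(hash, b"payload".to_vec()); // renewal of the same content
    assert!(!store.remove(&hash)); // one block pruned: data retained
    assert!(store.remove(&hash)); // last reference dropped: data deleted
}
```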

Member:

Where is this explicit counter?

Member:

Or do you mean this counter is added automatically by kvdb?

Member (Author):

Not implemented yet. This will require a new extrinsic format. Will be added in a future PR.

Member:

Why does this require a new extrinsic format?

Member (Author):

The native code will have to distinguish between extrinsics that actually contain new data and the ones that reference a previous extrinsic. This will probably require some special version of OpaqueExtrinsic.
Quoting Gav:

so one way would be introducing a new kind of extrinsic which is like a "by-reference" extrinsic.
and assumes that the node either already stores the preimage or that it can easily find it from other nodes.
this might also be useful for block propagation and alleviate the need to gossip the entire block if you know most of your peers already have the transactions in it

Also according to Gav, this should not be handled by the runtime, but rather by the database/networking layer.
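A sketch of what such a distinction might look like (hypothetical; the actual format was left for a future PR, and `StorageExtrinsic`/`payload_len` are invented names):

```rust
/// Hypothetical "by-reference" extrinsic variant as discussed above: the
/// node-side code needs to tell payload-carrying extrinsics apart from
/// ones that merely renew previously stored data by content hash.
enum StorageExtrinsic {
    /// Carries the full payload; stored and addressable by its hash.
    Data(Vec<u8>),
    /// References an already-stored payload by content hash; bumps its
    /// reference count instead of re-sending the data.
    ByReference([u8; 32]),
}

impl StorageExtrinsic {
    /// Size of the new data carried, if any.
    fn payload_len(&self) -> Option<usize> {
        match self {
            StorageExtrinsic::Data(bytes) => Some(bytes.len()),
            StorageExtrinsic::ByReference(_) => None,
        }
    }
}

fn main() {
    let data = StorageExtrinsic::Data(vec![0u8; 1024]);
    let renewal = StorageExtrinsic::ByReference([0u8; 32]);
    assert_eq!(data.payload_len(), Some(1024));
    assert_eq!(renewal.payload_len(), None);
}
```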

Member:

Ahh okay, makes sense.

fn new(db: Arc<dyn Database<DbHash>>) -> ClientResult<Self> {
fn new(
db: Arc<dyn Database<DbHash>>,
transaction_storage: TransactionStorage
Contributor:

Suggested change
transaction_storage: TransactionStorage
transaction_storage: TransactionStorage,

Contributor:

'TransactionStorage' name confused me a little, my first thought was that it was a specific storage instance.

Maybe 'Mode' or 'Config' naming variant or 'StoreTransaction' (similar to 'KeepBlock') could be alternative naming.

if finalized >= keep.into() {
let number = finalized.saturating_sub(keep.into());
match read_db(&*self.storage.db, columns::KEY_LOOKUP, columns::BODY, BlockId::<Block>::number(number))? {
Some(body) => {
Contributor:

Is there a use case where we want to remove the transactions but not the body reference? (from description not really)

Member (Author):

Don't think so

@arkpar (Member, Author) commented Jan 13, 2021

Wondering a bit about addressing by something like 'BlockID ++ Extrinsic index' instead of extrinsic hash.
But I guess it doesn't make sense if extrinsic/transaction hash is going to be replaced with ipfs address and support ref counting?

Right, IPFS content is addressable by content hash. And the idea with renewal transactions is that they keep the same content on the chain by increasing the reference counter.

'Transaction' usage in naming and description instead of block extrinsic confused me a little, but it is probably related to the targeted functionalities (which I am unaware of).

My understanding is that we use the term "transaction" for user-generated pieces of the block body, and "extrinsic" should not be user facing. So I've used "transaction" in the database layer to underline the fact that this is intended for storing blocks of user data, addressable by data hash.

@cheme (Contributor) left a comment

My understanding is that we use the term "transaction" for user-generated pieces of the block body, and "extrinsic" should not be user facing. So I've used "transaction" in the database layer to underline the fact that this is intended for storing blocks of user data, addressable by data hash.

Makes me a bit curious about this 'Finite history direct storage chain' purpose.

I am not really sold to using a different name (transaction) in the context of this storage mode.

Anyway, code looks good to me.

@burdges commented Jan 13, 2021

The idea is to maximize on-chain storage efficiency. Here's Gav's writeup on the feature:

What does it store? And why?

Ignoring parachains' internal architecture, there are several storage security models proposed for Polkadot:

  • Relay chain blocks and state known by all validators.
  • Availability store data guaranteed available by validators, but efficiently erasure coded across all validators. It expires data after 24 hours currently.
  • XCMP messages provided by a sending block's approval checkers.
  • Archival long-term storage where we make zero security guarantees and data could simply vanish forever, but it theoretically exists for curious people.

If I understand, you've designed an archival-ish storage system with zero security guarantees, meaning PVFs cannot ever depend upon data stored here, yes? In other words, anytime a PVF wants data stored here it must slurp said data back into its own block, which then makes the data available for validators. If so, why should this require Polkadot integration?

Also, you expire data after 7 days. Why archive such a short time?

You're doing so as a parachain. Why? As a parachain, this pushes its data into the availability store, meaning this costs a parachain exactly what placing data into its own blocks costs.

* IPFS-addressable extrinsics (new feature)

What does this mean?

* State only handles data market and maintains upcoming renewals. 

So there is zero user data in these blocks? Only IPFS addresses for which storage nodes store the data? If so, yes this makes sense as a storage model. :)

Why would this require any special polkadot integration though? We cannot use this data from polkadot directly so why not do everything inside the parachain's runtime?

Is the 7 days just an accounting feature?

* Each block from the last 7 days also has a 32-byte Merkle root stored in state. The trie is formed (in-memory) from the 32-byte chunked extrinsics (or perhaps use some other chunking means), indexed by `(extrinsic_index, chunk_index)`. This should be checked as part of block verification.

What does this mean?

* Creating a new block requires solving a short proof-of-work. The seed for the PoW is derived from several randomly selected bytes from the block 7 days prior (and which is now going out of scope); the proof of these bytes as well as the PoW solution is included on-chain. The risk of short-term forks may first be reduced through a proof-of-stake mechanism which applies different difficulty requirements to each potential block author.

I'm confused: A priori, I'd envision this being some parathread, so surely paying the parathread fee mostly suffices, no? Are we talking huge amounts of data here? Like what?

* Block authors are paid out 50% of the fees immediately, and 50% is placed in a pot for them to split pro-rata weekly.

Is the block author hosting all the user data? If not, they're not doing much work, so no reason to pay them. We've no mechanism to enforce they serve the data obviously, but that's fine since we'll never depend upon the data.

* Network maintainers (i.e. full nodes) are required to retain

Who are these? Not Polkadot full nodes obviously. It's whoever joins this parachain?

* Assume only 7-day block history (and all current state) needed for sync of full nodes.
* All data (extrinsics) kept for minimum 7 days. Data can be renewed with a simple on-chain tx (no need to resend data).

That's useful.


Sorry that got long..

In short, I'm unsure why this requires any polkadot integration, given that polkadot cannot depend upon archival-like guarantees.

Just fyi, we'll later expand availability store usage for required data: At present, anytime a parachain calls set_code we require all polkadot nodes see its code immediately. In theory, parachains could post_code with their own parachain candidate block, which then stores the new code into the erasure coded availability store. After we finalize the relay chain block including post_code parachain candidate, then the parachain could switch_code into the new code. At this point, we've perhaps only one polkadot validator who possesses the code. Yet, any who require this code could fetch it like approval checkers fetch the blocks they check, thanks to the availability store. We'll require this for hierarchical/multiple relay chain operation, and maybe it'd make polkadot more efficient even now, but currently we're hoping code churns slowly enough to make it worth storing parachain code on the relay chain.

@arkpar (Member, Author) commented Jan 13, 2021

@burdges This PR does not mention polkadot or parachains anywhere. This is to add generic support for such chains in substrate. The idea is to store user data not in the state merkle tree, but rather as block bodies. Runtime only keeps track of data hashes but has no access to the data itself. This way storage would be way less expensive. @gavofyork can probably explain rationale better than myself.

Users send data as transaction payload, which is added to the block database (hosted by all full nodes) and may be retrieved by CID (data hash) over IPFS.
The payload is removed from the database after 7 days unless there's another renewal transaction that references the existing data hash. The runtime only validates the Merkle root of all the added/renewed data hashes in the block.
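As a rough sketch of the verification step described above (hypothetical code, not from this PR; it uses a non-cryptographic stand-in hash and a flat binary Merkle reduction instead of the trie keyed by (extrinsic_index, chunk_index)):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in 64-bit hash for illustration only; a real chain would use a
// cryptographic hash.
fn h(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical sketch: chunk each extrinsic into 32-byte pieces, hash the
/// chunks, and fold them into a single root so block verification can
/// recompute and compare it against the root stored in state.
fn block_data_root(extrinsics: &[&[u8]]) -> u64 {
    let mut level: Vec<u64> = extrinsics
        .iter()
        .flat_map(|xt| xt.chunks(32))
        .map(h)
        .collect();
    if level.is_empty() {
        return 0;
    }
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let mut buf = pair[0].to_le_bytes().to_vec();
                // Duplicate the last hash when the level has odd length.
                buf.extend_from_slice(&pair[pair.len() - 1].to_le_bytes());
                h(&buf)
            })
            .collect();
    }
    level[0]
}

fn main() {
    let xt1: &[u8] = b"some user data payload larger than one chunk....";
    let xt2: &[u8] = b"another payload";
    let a = block_data_root(&[xt1, xt2]);
    let b = block_data_root(&[xt1, xt2]);
    assert_eq!(a, b); // verification recomputes the same root
    assert_ne!(a, block_data_root(&[xt1])); // different content, different root
}
```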

@burdges commented Jan 13, 2021

Runtime only keeps track of data hashes but has no access to the data itself.

Alright sounds fine then. I'm still unclear on why this needs special support, but anyways it breaks nothing afaik. :)

@burdges commented Jan 14, 2021

What purpose does the PoW serve here? Any block production constraint suffices, no?

@arkpar (Member, Author) commented Jan 14, 2021

@burdges right. Initial implementation will use existing consensus at least. Not sure why @gavofyork mentioned PoW there, but it is out of the scope for now.

Alright sounds fine then. I'm still unclear on why this needs special support, but anyways it breaks nothing afaik. :)

Special support would be in the database/networking layer to enable querying and reference-counting individual transactions.

@bkchr (Member) left a comment

Looks good to me. Mostly nitpicks.

We should make sure that this appears in the next polkadot release as information for the users that the db format changes.

client/db/src/lib.rs (outdated, resolved)
client/db/src/lib.rs (outdated, resolved)
client/db/src/upgrade.rs (outdated, resolved)
client/db/src/utils.rs (outdated, resolved)
@@ -327,6 +327,23 @@ pub fn read_db<Block>(
})
}

/// Remove database column entry for the given block.
pub fn remove_db<Block>(
Member:

I see that this naming follows the read_db function above, but it is really confusing to me IMHO.

Isn't this more a remove_block_from_db or similar?

If I understand this correctly, this will remove a block entry from the db?

Member (Author):

It is generic and may remove anything that is referenced with col_index. E.g. justifications. I'll rename it to remove_from_db

let mut hashes = Vec::with_capacity(body.len());
for extrinsic in body {
let extrinsic = extrinsic.encode();
let hash = HashFor::<Block>::hash(&extrinsic);
Member:

Or do you mean this counter is added automatically by kvdb?

if let KeepBlocks::Some(keep_blocks) = self.keep_blocks {
// Always keep the last finalized block
let keep = std::cmp::max(keep_blocks, 1);
if finalized >= keep.into() {
Member:

Suggested change
if finalized >= keep.into() {
if finalized < keep.into() {
	return Ok(())
}

Then we don't need that much indentation :D

client/db/src/lib.rs (outdated, resolved)
columns::BODY,
BlockId::<Block>::number(number)
)?;
debug!(target: "db", "Removing block #{}", number);
Member:

Should be moved above the remove_db call, because that one could fail and we would not see this log message.

Comment on lines 1522 to 1536
match self.transaction_storage {
TransactionStorageMode::BlockBody => {},
TransactionStorageMode::StorageChain => {
match Vec::<Block::Hash>::decode(&mut &body[..]) {
Ok(hashes) => {
for h in hashes {
transaction.remove(columns::TRANSACTION, h.as_ref());
}
}
Err(err) => return Err(sp_blockchain::Error::Backend(
format!("Error decoding body list: {}", err)
)),
}
}
}
Member:

Suggested change
match self.transaction_storage {
TransactionStorageMode::BlockBody => {},
TransactionStorageMode::StorageChain => {
match Vec::<Block::Hash>::decode(&mut &body[..]) {
Ok(hashes) => {
for h in hashes {
transaction.remove(columns::TRANSACTION, h.as_ref());
}
}
Err(err) => return Err(sp_blockchain::Error::Backend(
format!("Error decoding body list: {}", err)
)),
}
}
}
if let TransactionStorageMode::StorageChain = self.transaction_storage {
match Vec::<Block::Hash>::decode(&mut &body[..]) {
Ok(hashes) => {
for h in hashes {
transaction.remove(columns::TRANSACTION, h.as_ref());
}
}
Err(err) => return Err(sp_blockchain::Error::Backend(
format!("Error decoding body list: {}", err)
)),
}
}

Member (Author):

Would really prefer to keep match here, for the same reason match is required to be exhaustive in Rust: a code update will be required in case a new variant is added to the enum.
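A minimal illustration of the point (the enum variants mirror the PR's `TransactionStorageMode`; the function name is invented):

```rust
/// With an exhaustive `match` (an explicit empty arm instead of `if let`),
/// adding a third variant to this enum later becomes a compile error at
/// this site, forcing the pruning code to be updated.
enum TransactionStorageMode {
    BlockBody,
    StorageChain,
}

fn removes_individual_transactions(mode: &TransactionStorageMode) -> bool {
    match mode {
        TransactionStorageMode::BlockBody => false,
        TransactionStorageMode::StorageChain => true,
    }
}

fn main() {
    assert!(!removes_individual_transactions(&TransactionStorageMode::BlockBody));
    assert!(removes_individual_transactions(&TransactionStorageMode::StorageChain));
}
```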

Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
arkpar and others added 2 commits January 14, 2021 17:38
Co-authored-by: Bastian Köcher <bkchr@users.noreply.github.com>
@arkpar arkpar requested a review from bkchr January 14, 2021 14:46
let mut hashes = Vec::with_capacity(body.len());
for extrinsic in body {
let extrinsic = extrinsic.encode();
let hash = HashFor::<Block>::hash(&extrinsic);
Member:

Why does this require a new extrinsic format?

@bkchr (Member) commented Jan 14, 2021

bot merge

@ghost commented Jan 14, 2021

Trying merge.

@ghost ghost merged commit 8ee55dd into master Jan 14, 2021
@ghost ghost deleted the a-tx-storage branch January 14, 2021 18:55
@h4x3rotab (Contributor)

Is there a general tracking issue for the storage chain? I think it's very helpful for our use case: we need to store a series of "snapshots" of the encrypted confidential contract states and do re-encryption periodically. Since they are just snapshots, only the recent ones are actually useful, so we would like to prune the older snapshots by default on all full nodes. That would make our blockchain much lighter and more efficient.

This pull request was closed.

5 participants