As Ethereum grows, the question of on-chain data availability has come up more and more frequently.
As they attempt to resolve the so-called blockchain trilemma, which refers to the tradeoffs between security, scalability, and decentralization, Ethereum developers are now considering where and how data should be kept on blockchain networks.
Data availability, in a blockchain context, refers to the idea that the data stored on a network is accessible and retrievable by all network participants.
On Ethereum layer-1, it is difficult to push through invalid transactions because the network’s nodes download all of the data in every block.
While this can provide security, it can also be a very wasteful process since it forces each network node to validate and store every piece of data in a block, which hinders blockchain scalability.
This issue is addressed by layer-2 scaling solutions for Ethereum.
Optimistic rollups, such as Arbitrum and Optimism, are a common modern approach. They are called “optimistic” because they presume transactions are valid unless the contrary is established.
According to Anurag Arjun, co-founder of the modular blockchain Avail, the majority of rollups today only have one sequencer, creating a danger of centralization.
This is not a significant issue at present, Arjun points out, because rollup solutions must store their raw transaction data on Ethereum using calldata, currently the cheapest form of storage on Ethereum.
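For intuition on why calldata is the cheap option, the sketch below estimates the gas a rollup pays to publish a batch, using Ethereum’s EIP-2028 calldata pricing of 16 gas per nonzero byte and 4 gas per zero byte. The sample batch is invented for illustration, not taken from any real rollup.

```python
# Back-of-the-envelope calldata cost under EIP-2028 pricing:
# 16 gas per nonzero byte, 4 gas per zero byte.

def calldata_gas(data: bytes) -> int:
    """Gas charged for publishing `data` as transaction calldata."""
    return sum(4 if b == 0 else 16 for b in data)

# A hypothetical 200-byte batch: 100 zero bytes + 100 nonzero bytes.
batch = bytes(100) + bytes(range(1, 101))
print(calldata_gas(batch))  # 100*4 + 100*16 = 2000 gas
```

Because zero bytes are discounted, rollups typically compress their batches before posting them, which keeps the calldata bill down.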
According to Neel Somani, the creator of the blockchain scaling solution Eclipse, once a rollup’s calldata is published to the Ethereum mainnet, anybody can dispute its correctness within a predetermined time frame. If no one objects before the allotted time has passed, the rollup is deemed valid on Ethereum.
Somani points out that without the underlying data, it becomes difficult for anyone to demonstrate that a transaction was carried out incorrectly.
“You need to know precisely what I did in order to repair it,” Somani explained, “because if I don’t tell you what I did, you won’t be able to show that it was incorrect. Therefore, all blockchains must, in some manner, shape or form, demonstrate data availability.”
Data availability sampling
Downloading an entire block onto a network can be inefficient, which brings back the original data availability issue: every blockchain must still demonstrate that its data is available.
Somani said, “As someone who doesn’t want to download the entire block, I still want the assurance that the data on the block is not being withheld.”
Somani suggests data availability sampling as a means of establishing that a block’s data is actually available.
Data availability sampling entails downloading random portions of the block, Somani explains, in order to gain arbitrarily high confidence that the data is there.
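Somani’s point about confidence can be made concrete with a toy model. Assuming the block is erasure-coded so that an adversary must withhold at least half of its chunks to make it unrecoverable, each uniformly random sample catches the withholding with probability at least 1/2, so the chance of missing it shrinks exponentially with the number of samples. The function below is a sketch of that arithmetic, not any client’s actual logic.

```python
# Toy model of data availability sampling confidence.
# Assumption (from erasure coding): an adversary must withhold at least
# `withheld_fraction` of chunks to make the block unrecoverable, so each
# random sample detects the attack with at least that probability.

def confidence_after(samples: int, withheld_fraction: float = 0.5) -> float:
    """Probability that at least one of `samples` random queries
    lands on a withheld chunk and exposes the attack."""
    return 1 - (1 - withheld_fraction) ** samples

# Confidence grows exponentially in the number of samples:
for s in (1, 5, 10, 20):
    print(s, confidence_after(s))
```

After just 20 samples the chance of being fooled is under one in a million, which is why light samplers can skip downloading the full block.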
With this approach, the data in a block is modeled using polynomials (mathematical expressions built from variables, coefficients, and exponents).
According to Somani, a widespread misconception about data availability sampling is that sampling half of a block yields only 50% confidence in its data. That is untrue, he argues: samplers simply need enough points to reconstruct the original polynomial.
Solutions for data availability sampling are presently being developed by projects like Celestia and Avail.
Arjun told Blockworks, “What we really feel is that every base layer is going to be a data availability layer. The primary battle we are engaged in is over how to grow data availability at the base layer while maintaining execution on rollups at the second layer.”