In practice, we have no interest in verifying the availability of data that has never been published. This has an impact on the design of the data availability system, because we need to be able to efficiently publish proofs of availability without violating the data's privacy. There are reasonably good systems for doing this. Erasure-code chunks that were never published are not truly available; an erasure-code block can be proven available only because at least one node in the network has published it. At the end of this post, I describe a system that can be used to teach peer-to-peer erasure codes, which is an important step in understanding them.
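To make the erasure-code idea concrete, here is a minimal sketch of polynomial-based erasure coding over a prime field (the construction behind Reed-Solomon codes): any k of the n published chunks suffice to recover the data, so a block is recoverable as long as enough nodes publish their chunks. All names and the choice of modulus are illustrative, not any production implementation.

```python
P = 2**31 - 1  # a prime modulus (hypothetical choice for illustration)

def _lagrange_interp(points, x, p=P):
    """Evaluate at x the unique polynomial through the given (xi, yi) points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * ((x - xj) % p) % p
                den = den * ((xi - xj) % p) % p
        # Divide by den using Fermat's little theorem (p is prime).
        total = (total + yi * num * pow(den, p - 2, p)) % p
    return total

def encode(data, n, p=P):
    """Extend k data symbols (k = len(data)) into n >= k coded chunks."""
    points = list(zip(range(len(data)), data))
    return [_lagrange_interp(points, x, p) for x in range(n)]

def decode(available, k, p=P):
    """Recover the k original symbols from ANY k (index, value) pairs."""
    points = available[:k]
    return [_lagrange_interp(points, x, p) for x in range(k)]
```

For example, encoding `[1, 2, 3]` into six chunks and then losing the first three still lets `decode` reconstruct the original three symbols from the survivors.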
We want to ensure that we can verify, across all of history, that data A is available. This means we must be able to scan every block that contains it. With current implementations, we cannot verify that it is available for every single block; we can only check the blocks whose one-time blocks have not yet expired. To verify availability across all blocks, we would need to be able to scan the full history of the data before its one-time blocks expire. And to expose the data at all, we need to be able to publish it in the first place and verify it via an oracle.
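One way to check a block without downloading it in full is to sample random chunk indices and only accept the block if every sample is answered. The sketch below assumes a hypothetical `fetch_chunk` network call and models a block as a dict of published chunks; it is an illustration of the sampling idea, not a real client.

```python
import random

def fetch_chunk(block, index):
    """Hypothetical network fetch; returns the chunk or None if withheld."""
    return block.get(index)

def probably_available(block, n_chunks, samples=8, rng=random):
    """Sample `samples` distinct chunk indices; fail on any missing chunk.

    A block that withholds a large fraction of its chunks will almost
    certainly fail at least one of the random samples.
    """
    for index in rng.sample(range(n_chunks), samples):
        if fetch_chunk(block, index) is None:
            return False
    return True
```

A fully published block passes every sample, while a block that published nothing fails on the first one; the interesting (probabilistic) cases fall in between, which is why the function is named `probably_available`.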
A simple fraud-proof solution does not work, because it relies on knowledge of the data set (or on randomly choosing a block for each data point). The data set remains largely the same over time, so there is not enough entropy to make random sampling sufficiently difficult to defeat. To make the scheme robust, it has to use actual data from the data set (for example, data X is available because the original record for data X was submitted; data Y never existed before data X, so if data Y is available, that proves data X is available).
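The X-before-Y argument above can be sketched with a hash chain: if each record embeds the hash of its predecessor, then possession of record Y, which commits to X's hash, demonstrates that X was created before Y. The record layout (a 32-byte predecessor hash followed by the payload) is a hypothetical format chosen for illustration.

```python
import hashlib

def make_record(payload: bytes, prev_hash: bytes) -> bytes:
    """Prefix the payload with its predecessor's 32-byte SHA-256 hash."""
    return prev_hash + payload

def record_hash(record: bytes) -> bytes:
    return hashlib.sha256(record).digest()

def proves_predecessor(record_y: bytes, record_x: bytes) -> bool:
    """Does record Y commit to record X as its predecessor?"""
    return record_y[:32] == record_hash(record_x)
```

Under this scheme, anyone holding Y can check that it commits to X, so Y's availability carries evidence about X's prior existence without re-sampling X directly.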