Codex update 07/12/2023 to 07/21/2023

Overall, we continue working in various directions: distributed testing, the marketplace, the p2p client, research, and more.

Our main milestone is to have a fully functional testnet, with the marketplace and durability guarantees, deployed by the end of the year. A lot of grunt work is being done to make that possible. Progress is steady, but there is a lot of stabilization, testing, and infrastructure work going on.

We’re also onboarding several new members to the team (4, to be precise). This will ultimately accelerate our progress, but it requires some upfront investment from the more experienced team members.

DevOps/Infrastructure:

  • Adopted nim-codex Docker builds for Dist Tests.
  • Ordered a dedicated node on Hetzner.
  • Configured Hetzner StorageBox for local backup on Dedicated server.
  • Configured new Logs shipper and Grafana in Dist-Tests cluster.
  • Created Geth and Prometheus Docker images for Dist-Tests.
  • Created a separate codex-contracts-eth Docker image for Dist-Tests.
  • Set up Ingress Controller in Dist-Tests cluster.

Testing:

  • Set up deployer to gather metrics.
  • Debugged and identified a potential deadlock in the Codex client.
  • Added metrics, built image, and ran tests.
  • Updated the dist-test log format for Kibana compatibility.
  • Ran dist-tests on a new master image.
  • Debugged continuous tests.

Development:

  • Worked on codex-dht Nimble updates and on fixing a key-format issue.
  • Updated CI and split Windows CI tests to run on two CI machines.
  • Continued updating dependencies in codex-dht.
  • Fixed decoding of large manifests (PR #479).
  • Explored the existing implementation of NAT Traversal techniques in nim-libp2p.

Cross-functional (Combination of DevOps/Testing/Development):

  • Fixed discovery-related issues.
  • Planned the Codex demo update for the Logos event and prepared the environment for the demo.
  • Described requirements for the Dist-Tests log format and updated it for Kibana compatibility.
  • Configured the Hetzner dedicated server.

Conversations

  1. zk_id 07/24/2023 11:59 AM

We’ve explored VID for rollups ourselves in the last week; curious to know your thoughts.

July 25, 2023

  1. dryajov 07/25/2023 1:28 PM

It depends on what you mean. From a high level, (A)VID is probably the closest thing to DAS in academic research; in fact, DAS is probably either a subset or a superset of VID, so it’s definitely worth digging into. But I’m not sure what exactly you’re interested in, in the context of rollups…

  2. zk_id 07/25/2023 3:28 PM

    The rollup part seems to be the basis for choosing proofs that scale linearly with the number of nodes (which makes the scheme impractical for large numbers of nodes). The protocol is very simple, and would only need to instead provide constant-size proofs with Kate (KZG) commitments (at the cost of large computational resources, as I understand it). This was at least the rationale I got from reading the paper and from a conversation with Bünz, one of the founders of the Espresso shared sequencer (which is where I found the first reference to this paper). I guess my main open question is why you would do the sampling if you can do VID in the context of blockchains as well. With the proofs of dispersal on-chain, you wouldn’t need sampling to agree that the dispersal happened. You would still need the sampling for the light clients, though, of course.
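
As a rough illustration of the “constant-size proofs with Kate (KZG) commitments” mentioned above: with KZG, an opening proof is a single group element and verification is one pairing check, regardless of the polynomial’s degree. Below is a toy sketch (our illustration, not Codex or Nomos code), assuming the py_ecc library; the hard-coded TAU makes the setup insecure by construction, and all function names are made up for the example:

```python
# Toy KZG commitment with constant-size opening proofs.
# Illustration only: hard-coded tau (INSECURE), unoptimized py_ecc arithmetic.
from py_ecc.bls12_381 import G1, G2, add, multiply, neg, curve_order, pairing

TAU = 123456789  # demo "trusted setup" secret; a real ceremony discards this

def setup(max_degree):
    """SRS: powers of tau in G1, plus [tau]_2 in G2."""
    g1_powers = [multiply(G1, pow(TAU, i, curve_order)) for i in range(max_degree + 1)]
    return g1_powers, multiply(G2, TAU)

def commit(g1_powers, coeffs):
    """C = sum_i coeffs[i] * [tau^i]_1 -- one group element for any degree."""
    acc = None  # None is py_ecc's point at infinity
    for c, p in zip(coeffs, g1_powers):
        acc = add(acc, multiply(p, c % curve_order))
    return acc

def poly_eval(coeffs, x):
    return sum(c * pow(x, i, curve_order) for i, c in enumerate(coeffs)) % curve_order

def open_at(g1_powers, coeffs, z):
    """Constant-size proof for p(z): commit to q(X) = (p(X) - p(z)) / (X - z)."""
    q, carry = [0] * (len(coeffs) - 1), 0
    for i in range(len(coeffs) - 1, 0, -1):  # synthetic division by (X - z)
        carry = (coeffs[i] + carry * z) % curve_order
        q[i - 1] = carry
    return poly_eval(coeffs, z), commit(g1_powers, q)

def verify(commitment, tau_g2, z, y, proof):
    """One pairing check: e(C - [y]_1, G2) == e(proof, [tau - z]_2)."""
    lhs = pairing(G2, add(commitment, neg(multiply(G1, y))))
    rhs = pairing(add(tau_g2, neg(multiply(G2, z))), proof)
    return lhs == rhs

g1_powers, tau_g2 = setup(3)
coeffs = [3, 1, 4, 1]                      # p(X) = 3 + X + 4X^2 + X^3
C = commit(g1_powers, coeffs)
y, proof = open_at(g1_powers, coeffs, z=5)
assert verify(C, tau_g2, 5, y, proof)      # proof is a single G1 point
```

Proof size and verification cost stay constant as the committed data grows; the expensive part is producing the commitment itself, which is a multi-scalar multiplication over all coefficients.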

  3. dryajov 07/25/2023 8:31 PM

    I guess my main open question is why would you do the sampling if you can do VID in the context of blockchains as well. With the proofs of dispersal on-chain, you wouldn’t need to do that for the agreement of the dispersal.

    Yeah, great question. What follows is strictly IMO, as I haven’t seen this formally contrasted anywhere, so my reasoning can be wrong in subtle ways.

    • (A)VID - dispersing and storing data in a verifiable manner
    • Sampling - verifying already dispersed data

    tl;dr Sampling allows light nodes to protect against dishonest-majority attacks. In other words, a light node cannot be tricked into following an incorrect chain by a dishonest validator majority that withholds data. More details are here - https://dankradfeist.de/ethereum/2019/12/20/data-availability-checks.html

    First, DAS implies (A)VID, as there is an initial phase where data is distributed to some subset of nodes. Moreover, these nodes, usually the validators, attest that they received the data and that it is correct. If a majority of validators accepts, the block is considered correct; otherwise it is rejected. This is the verifiable-dispersal part. But what if the majority of validators is dishonest? Can you prevent them from tricking the rest of the network into following the chain?

  4. [8:31 PM]

    Dealing with dishonest majorities

    This is easy if all the data is downloaded by all nodes all the time, but we’re trying to avoid exactly that. But let’s assume, for the sake of argument, that there are full nodes in the network that download all the data and are able to construct fraud proofs for missing data - can this mitigate the problem? It turns out that it can’t, because data (un)availability isn’t a directly attributable fault - in other words, you can observe/detect it, but there is no way to prove it to the rest of the network reliably. More details here: https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding So, if there isn’t much that can be done by detecting that a block isn’t available, what good is detection? Well, nodes can still avoid following the unavailable chain, and thus avoid being tricked by a dishonest majority. However, simply attesting that data has been published is not enough to prevent a dishonest majority from attacking the network.

  5. dryajov 07/25/2023 9:06 PM

    To complement, the relevant quote from https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding is:

    Here, fraud proofs are not a solution, because not publishing data is not a uniquely attributable fault - in any scheme where a node (“fisherman”) has the ability to “raise the alarm” about some piece of data not being available, if the publisher then publishes the remaining data, all nodes who were not paying attention to that specific piece of data at that exact time cannot determine whether it was the publisher that was maliciously withholding data or whether it was the fisherman that was maliciously making a false alarm.

    The relevant quote from https://dankradfeist.de/ethereum/2019/12/20/data-availability-checks.html is:

    There is one gap in the solution of using fraud proofs to protect light clients from incorrect state transitions: What if a consensus supermajority has signed a block header, but will not publish some of the data (in particular, it could be fraudulent transactions that they will publish later to trick someone into accepting printed/stolen money)? Honest full nodes, obviously, will not follow this chain, as they can’t download the data. But light clients will not know that the data is not available since they don’t try to download the data, only the header. So we are in a situation where the honest full nodes know that something fishy is going on, but they have no means of alerting the light clients, as they are missing the piece of data that might be needed to create a fraud proof.

    Both articles are a bit old, but the intuitions still hold.
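
To put a number on the sampling argument: with rate-1/2 erasure coding, a block that cannot be reconstructed must be missing more than half of the extended chunks, so each uniformly random sample fails with probability at least 1/2, and k independent samples expose withholding with probability at least 1 - 2^-k. A minimal sketch of that arithmetic (our illustration, not taken from the articles above):

```python
# Detection probability for data-availability sampling.
# With rate-1/2 erasure coding, an unrecoverable block is missing > 50%
# of the extended chunks, so each uniform random sample fails to be
# served with probability >= 0.5.

def detection_probability(samples: int, withheld_fraction: float = 0.5) -> float:
    """P(at least one of `samples` independent queries hits a withheld chunk)."""
    return 1.0 - (1.0 - withheld_fraction) ** samples

for k in (10, 20, 30):
    print(f"{k:2d} samples -> detection prob >= {detection_probability(k):.10f}")
# A few dozen samples give each light client near-certainty on its own,
# without downloading the block.
```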

July 26, 2023

  1. zk_id 07/26/2023 10:42 AM

    Thanks a ton @dryajov ! We are on the same page. TBH it took me a while to get to this point, as it’s not an intuitive problem at first. The relationship between VID and DAS, and what each is for, is crucial for us, btw. Your writing here and your references give us the confidence that we understand the problem and are equipped to evaluate the different solutions. I deeply appreciate that you took the time to write this; it is very valuable.

  2. [10:45 AM]

    The dishonest majority is a critical scenario for Nomos (an essential part of the whole sovereignty narrative), and generally not considered by most blockchain designs.

  3. zk_id

    Thanks a ton @dryajov ! We are on the same page. TBH it took me a while to get to this point, as it’s not an intuitive problem at first. The relationship between VID and DAS, and what each is for, is crucial for us, btw. Your writing here and your references give us the confidence that we understand the problem and are equipped to evaluate the different solutions. I deeply appreciate that you took the time to write this; it is very valuable.

    dryajov 07/26/2023 4:42 PM

    Great! Glad to help anytime

  4. zk_id

    The dishonest majority is a critical scenario for Nomos (an essential part of the whole sovereignty narrative), and generally not considered by most blockchain designs.

    dryajov 07/26/2023 4:43 PM

    Yes, I’d argue it is crucial in a network with distributed validation, where all nodes are either fully light or partially light nodes.

  5. [4:46 PM]

    Btw, there is probably more we can share/compare notes on in this problem space. We’re looking at similar things, perhaps from a slightly different perspective in Codex’s case, but the work done on DAS directly with the EF is probably very relevant for you as well.

July 27, 2023

  1. zk_id 07/27/2023 3:05 AM

    I would love to. Do you have those notes somewhere?

  2. zk_id 07/27/2023 4:01 AM

    all the links you have, anything, would be useful

  3. zk_id

    I would love to. Do you have those notes somewhere?

    dryajov 07/27/2023 4:50 PM

    A bit scattered all over the place, mainly from @Leobago and @cskiraly - @cskiraly has a draft paper somewhere.

July 28, 2023

  1. zk_id 07/28/2023 5:47 AM

    Would love to see anything that is possible

  2. [5:47 AM]

    Our setting is much simpler, but any progress that you make (specifically in the computational cost of the polynomial commitments or alternative proofs) would be really useful for us

  3. zk_id

    Our setting is much simpler, but any progress that you make (specifically in the computational cost of the polynomial commitments or alternative proofs) would be really useful for us

    dryajov 07/28/2023 4:07 PM

    Yes, we’re also working in this direction, as it’s crucial for us as well. There should be some results coming soon(tm), now that @bkomuves is helping us with this part.

  4. zk_id

    Our setting is much simpler, but any progress that you make (specifically in the computational cost of the polynomial commitments or alternative proofs) would be really useful for us

    bkomuves 07/28/2023 4:44 PM

    my current view (it’s changing pretty often :) is that there is tension between:

    • commitment cost
    • proof cost
    • and verification cost

    The holy grail that is best at all three doesn’t seem to exist. Hence, you have to make tradeoffs, and it depends on your specific use case what you should optimize for, or what balance you aim for. We plan to find some points in this 3-dimensional space that are hopefully close to the optimal surface, figure out in parallel what balance to aim for, and then choose a solution based on that (and also based on what’s possible; there are external restrictions).

July 29, 2023

  1. bkomuves

    my current view (it’s changing pretty often :) is that there is tension between: 

    • commitment cost
    • proof cost
    • and verification cost

     The holy grail that is best at all three doesn’t seem to exist. Hence, you have to make tradeoffs, and it depends on your specific use case what you should optimize for, or what balance you aim for. We plan to find some points in this 3-dimensional space that are hopefully close to the optimal surface, figure out in parallel what balance to aim for, and then choose a solution based on that (and also based on what’s possible; there are external restrictions).

    zk_id 07/29/2023 4:23 AM

    I agree. That’s also my understanding (though mine is surely more superficial).

  2. [4:24 AM]

    There is also the dimension of computation vs size cost

  3. [4:25 AM]

    i.e., the VID scheme (from the paper that kickstarted this conversation) has all the properties we need, but it scales as n^2 in message complexity, which makes it lose the properties we are looking for beyond ~1k nodes. We need to scale comfortably to 10k nodes.

  4. [4:29 AM]

    So at the moment we are most likely to use KZG commitments with a 2D RS-extended polynomial. Basically, just copy Ethereum. The reasons are:

    • Our rollup/EZ leader will generate this, and those are beefier machines than the Base Layer nodes. The Base Layer nodes just need to verify and sign the EC fragments and return them to complete the VID protocol (and then run consensus on the aggregated signed proofs).
    • If we ever decide to change the design so that the VID dispersal is done by Base Layer leaders (in a multi-leader fashion), it can be distributed (rows/columns can be reconstructed and proven separately; see the sketch after this list). I don’t think we will pursue this, but we will have to if this scheme doesn’t scale with the first option.
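
A toy sketch of the row/column property mentioned in the last bullet: in a 2D RS extension, every row and every column of the extended square is itself a Reed-Solomon codeword, so each line can be reconstructed independently from enough points on that line. This is our illustration over a small prime field with made-up sizes, not Nomos or Ethereum code:

```python
# Toy 2D Reed-Solomon extension over a prime field. Rows are extended
# first, then columns, turning a K x K data square into an N x N square.
# Any row/column is a standalone codeword recoverable from any K points.

P = 2**31 - 1  # small prime field, demo only

def lagrange_eval(xs, ys, x):
    """Evaluate the unique degree < len(xs) interpolant through (xs, ys) at x."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if i != j:
                num = num * ((x - xj) % P) % P
                den = den * ((xi - xj) % P) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def extend(values, n):
    """RS-extend k evaluations (at x = 0..k-1) to n evaluations (x = 0..n-1)."""
    xs = list(range(len(values)))
    return [lagrange_eval(xs, values, x) for x in range(n)]

K, N = 2, 4                                        # toy sizes; think 256 -> 512
data = [[11 * (i * K + j + 1) for j in range(K)] for i in range(K)]  # K x K square
rows = [extend(r, N) for r in data]                # K x N: extend each row
cols = [extend(list(c), N) for c in zip(*rows)]    # N columns, each of length N

# Reconstruct column 0 from just K of its N points (e.g. x = 1 and x = 3):
col0 = cols[0]
recovered = [lagrange_eval([1, 3], [col0[1], col0[3]], x) for x in range(N)]
assert recovered == col0  # a single column suffices for its own repair
```

Combined with per-line commitments (as in Ethereum’s 2D proposals), a repaired row or column can also be proven correct on its own, without redistributing the whole square.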

August 1, 2023

  1. dryajov

    A bit scattered all over the place, mainly from @Leobago and @cskiraly - @cskiraly has a draft paper somewhere.

    Leobago 08/01/2023 1:13 PM

    Not many public write-ups yet. You can find some content here:

    • Codex Storage Blog - "Data Availability Sampling"
    • GitHub - codex-storage/das-research: hosts the research on DAS from the collaboration between Codex and the EF

    We also have a few Jupyter notebooks, but they are not public yet. As soon as that content is out we can let you know 🙂

  2. zk_id

    So at the moment we are most likely to use KZG commitments with a 2D RS-extended polynomial. Basically, just copy Ethereum. The reasons are:

    • Our rollup/EZ leader will generate this, and those are beefier machines than the Base Layer nodes. The Base Layer nodes just need to verify and sign the EC fragments and return them to complete the VID protocol (and then run consensus on the aggregated signed proofs).
    • If we ever decide to change the design so that the VID dispersal is done by Base Layer leaders (in a multi-leader fashion), it can be distributed (rows/columns can be reconstructed and proven separately). I don’t think we will pursue this, but we will have to if this scheme doesn’t scale with the first option.

    dryajov 08/01/2023 1:55 PM

    This might interest you as well - https://blog.subspace.network/combining-kzg-and-erasure-coding-fc903dc78f1a

    (Medium: "Combining KZG and erasure coding" - The Hitchhiker’s Guide to Subspace, Episode II)

  3. [1:56 PM]

    This is a great analysis of the current state of the art in the structure of data + commitments and their interplay. I would also recommend reading the first article of the series, which it also links to.

  4. zk_id 08/01/2023 3:04 PM

    Thanks @dryajov @Leobago ! Much appreciated!

  5. [3:05 PM]

    Very glad that we can discuss these things with you. Maybe I’ll have some specific questions once I finish reading the huge pile of pending docs that I’m tackling starting today…

  6. zk_id 08/01/2023 6:34 PM

    @Leobago @dryajov I was playing with the DAS simulator. It seems the results are a bunch of XML. Is there a way for me to visualize the results?

  7. zk_id

    @Leobago @dryajov I was playing with the DAS simulator. It seems the results are a bunch of XML. Is there a way for me to visualize the results?

    Leobago 08/01/2023 6:36 PM

    Yes, check out the visual branch and make sure to enable plotting in the config file; it should produce a bunch of figures 🙂

  8. [6:37 PM]

    You might find also some bugs here and there on that branch 😅

  9. zk_id 08/01/2023 7:44 PM

    Thanks!