Proposed change in process for protocol upgrades

Zcash has become increasingly decentralized as more parties propose and implement changes to the protocol. As a result, coordinating work and timelines among parties for potential Network Upgrades has become more challenging. Each network upgrade requires changes to many components of the ecosystem to be successful, and different organizations with varying priorities can mean that one entity in the ecosystem ends up “blocked” on needing another entity to make changes. As the ecosystem grows, the slowdowns caused by such bottlenecks become more problematic.

In March, I mentioned that ECC intends to eventually fork the zebrad repo so that we can work independently, and suggested that others do the same. This proposal is an extension of that intention and recommendation.

Previously, the Electric Coin Company had taken primary responsibility for coordinating network upgrades within the Zcash ecosystem. To promote further decentralization, we are now proposing a change to how new Zcash features are readied for inclusion in Network Upgrades, to take effect after the conclusion of the current Network Upgrade (NU7) development efforts.

In this new model, the party implementing a protocol change will have individual responsibility to drive the full lifecycle of the intended change. This includes ownership of relevant ecosystem participant coordination, and the software development, testing, and auditing of their proposed features within their own forks of the relevant GitHub repositories. This may include development or developer coordination for changes required for commonly used wallet software and supporting services. We encourage them to do this development in public and to solicit early feedback from node implementors and protocol experts. When they are ready, they will then advocate for inclusion of their feature in the next Network Upgrade and for the other repository owners to merge their changes.

If the other repo owners are satisfied that there is clear community consensus for the upgrade and that it is ready for activation, they will merge the changes. If not, the proposer would need to address their concerns, such as security vulnerabilities or a lack of community support.

Alternatively, the proposer could attempt to push their changes to consensus nodes, potentially causing a chain fork. While this would be an extreme action, it is a necessary part of decentralized protocol governance.

As an example, Shielded Labs has indicated that Crosslink should be considered for Network Upgrade 8 (NU8), which may be activated as early as Q4 2026. In this model, Shielded Labs is responsible for building, testing, auditing, and advocating for its inclusion.

At the same time, Qedit may be ready to advocate for adding ZSA swaps to the protocol. If Shielded Labs is delayed, Qedit could still advocate for ZSA swaps to be included in NU8, since its work is decoupled from Shielded Labs’ work. NU8 could then activate on time, and Shielded Labs would need to ready its work for inclusion in NU9, likely six months later.

We believe this model simplifies coordination, enhances decentralization, and decouples the Network Upgrade schedule from specific features. We welcome your feedback, questions, and concerns.

Thanks to @nuttycom for input and feedback on this post.

12 Likes

Will each fork cut their own releases, with differing user agent strings?

Will one of the releases be presented as the “official” one? If not, how will end users figure out which zebra fork to run?

With multiple forks, the codebases can drift significantly, which will greatly increase the number of merge conflicts and the engineering time needed to resolve them. Is that productivity loss worth the additional decentralization? I don’t see how this “simplifies coordination” - any major refactoring would be basically impossible in this scenario unless carefully coordinated across all forks.

What if someone proposes a protocol upgrade that is approved by the community but does not have the engineering resources to implement it?

5 Likes

Is there a risk that CEXs won’t be willing to choose which version to use and will just delist? I’m not saying I’m against it, but has this been thought through? :eyes:

Curious how folks really feel, tbh, and thanks for sharing your perspective Josh :student:

Yes, I think that having differing user agents is appropriate.
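To make this concrete: Bitcoin-derived networks advertise a BIP 14-style user agent string in the version handshake, and a fork can layer its own identifier on top of the base implementation’s. Here’s a minimal sketch; the fork names and version numbers below are purely illustrative, not strings any actual release uses.

```rust
// Illustrative sketch of BIP 14-style user agent strings; the names and
// versions used here are hypothetical, not what any real fork advertises.
fn user_agent(stack: &[(&str, &str)]) -> String {
    // Each (name, version) pair is rendered as "name:version/", so a fork
    // appends its own identifier after the base implementation's.
    let mut ua = String::from("/");
    for (name, version) in stack {
        ua.push_str(&format!("{name}:{version}/"));
    }
    ua
}

fn main() {
    // A hypothetical fork of Zebra advertising both the base node and its own layer:
    println!("{}", user_agent(&[("Zebra", "2.0.0"), ("CrosslinkNode", "0.1.0")]));
    // prints "/Zebra:2.0.0/CrosslinkNode:0.1.0/"
}
```

Peers and network crawlers could then tell forks apart by prefix-matching the user agent, much as crawlers on other networks distinguish between node implementations today.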

Getting away from the idea that there’s an “official” node is kind of the whole point; which Ethereum node implementation is the “official” one? Is it Geth? Besu? Erigon?

The idea that there should be an “official” node is far too centralizing. The Zcash protocol is defined by its specification, not its implementation.

This could mean that the community has to move to more of a patch-based workflow like the Linux kernel uses. And this doesn’t preclude having dominant implementations; it just means that if someone wants to promote a change to Zcash, they have to do the work - not just on the consensus node implementation, but throughout the stack. In the past, this has been exactly the role that ECC has taken: ECC has been the primary maintainer of zcashd, the embedded zcashd wallet, and the librustzcash stack (in addition to the protocol, the proving systems, and so on).

In order for a network upgrade to go live, it needs to be tested and broadly supported; testing requires testing tools, which really means wallet integration. And important actors in the ecosystem (miners, exchanges, DEX implementers, etc.) all need to be able to upgrade; if critical pieces are missing, it puts the whole of Zcash at risk.

This also doesn’t in any way preclude different organizations from collaborating! But it makes the bar clear: for a feature to go live, the entity (or entities) responsible for that feature must see its implementation through end-to-end.

Then it’s up to that proposer to get enough of the community on board to get it implemented.

4 Likes

It’s important to note that all of this can happen today if a big enough actor wants it and gains consensus. Moreover, given the hashrate concentration Zcash has, a single mining pool could launch its own version of Zcash with no consensus whatsoever. So there is no “big change” in terms of what is possible today.

Coming back to the proposed process, I’m trying to wrap my head around it. I can see it as a “release train” kind of thing, where there’s a release cadence of every X weeks and teams that have a feature ready to be released can hop on the next departing train and get their software released.

I wonder how that can be done in a non-chaotic and decentralized way given that:

  • There could be many (or no) trains departing on a given day.
  • Software developers don’t actually decide which trains depart; miners do (except for developers that are miners too).
  • Whether a given train departs, or none does, reflects which set of trains miners have decided to run.
  • A train departure board will be needed to know what’s going on. That’s true already, but so far there has been a single train, so the “screen” could be somebody shouting in the station’s main hall.
  • Two visually identical trains can split while already in transit. This is also true today, but the environment was much more controlled, given there were only one or two implementations of the same train.

Interesting times ahead.

I’m not super familiar with them, but from what I could quickly tell they are all very different, with Erigon written in another language entirely, which is different from keeping multiple almost-identical forks of the same codebase. Also, I’m not sure Ethereum is a good reference when they have orders of magnitude more resources than we have.

How is this different from the current way things are done? QEDIT had to implement the whole thing and now has the work of keeping up with upstream. Same for Shielded Labs.

The thing about requiring integrating with the whole stack makes sense, but I don’t see how it requires different orgs to have different releases of the node.

I’m not against the concept; I just feel this is trying to solve a problem that doesn’t exist yet. Decentralization for decentralization’s sake is not always better. I thought we learned that with the dozen-not-great-wallets situation.

7 Likes

In this new model, the party implementing a protocol change will have individual responsibility to drive the full lifecycle of the intended change. This includes ownership of relevant ecosystem participant coordination, and the software development, testing, and auditing of their proposed features within their own forks of the relevant GitHub repositories. This may include development or developer coordination for changes required for commonly used wallet software and supporting services. We encourage them to do this development in public and to solicit early feedback from node implementors and protocol experts. When they are ready, they will then advocate for inclusion of their feature in the next Network Upgrade

I would consider this the ideal development model.

and for the other repository owners to merge their changes.

I see why that approach seems attractive, but I see a couple problems:

  1. Code quality will suffer due to, off the top of my head:
  • Competing design patterns and tendencies; most notably, in my experience, Zebra code is unusually functional, modular, well-documented, and concurrent compared to other crates in the ecosystem.
  • Under-familiarity with the codebase and what is already there; we’re seeing this problem with Zaino already, where development effort, either in Zaino or in Zebra, has been invested into supporting use cases that were already supported (most recently, #9725).
  • Concurrent efforts to fix the same issues (effort that could instead be invested in making additional improvements), which could introduce duplicate logic - particularly consensus-critical logic that should never be duplicated.
  • Excessive conflicts that require care to resolve correctly and always pose a risk of being resolved incorrectly.
  • Less clarity about who bears responsibility for the boring maintenance work; with no one wanting to do it, it’ll be left undone.
  2. It’s expensive and does not scale. For example, Qedit’s changes in Zebra to support ZSAs have gone through several rounds of review from multiple Zebra team members, and are likely due at least a couple more. If there are n repos cutting releases, it would multiply the review effort by roughly n^2. Even internally, code reviews represent a significant portion of the effort that goes into Zebra development, such that I suspect raising n from 1 to just 2 would significantly slow down development.
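The scaling concern can be sketched with back-of-the-envelope arithmetic (an illustration of the argument, not a measurement): if each of n repos originates changes, and every change must be re-reviewed by the maintainers of all n repos, the number of review interactions per round grows with n².

```rust
// Back-of-the-envelope sketch of the review-scaling concern; the numbers
// are illustrative only, not measurements from any real project.
fn review_interactions(n_repos: u32, changes_per_repo: u32) -> u32 {
    // Total changes per round is n_repos * changes_per_repo; each change is
    // reviewed once per repo, giving n_repos^2 * changes_per_repo interactions.
    n_repos * changes_per_repo * n_repos
}

fn main() {
    // Going from one repo to two quadruples the review interactions:
    println!("{}", review_interactions(1, 4)); // prints 4
    println!("{}", review_interactions(2, 4)); // prints 16
}
```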
8 Likes

Does anyone remember where I get my permit stamped?

1 Like

As much as I love snarks, they are often not productive in discussions. What are you trying to say? You don’t need any permits, and no one is saying you do.

As mentioned elsewhere, I am not clear on the exact problem statement that this proposal seeks to solve. In part because of that, I am not prepared to offer a detailed analysis of this proposal.

Per my previous post, sorry if the reference to a “permit” was confusing.

I meant to highlight, in a lighthearted way, what I take your point to be:

No permit is required for the experiments suggested here.

Per an initial reaction to the proposal

I don’t feel like our code is taking “too long” to land in Zebra, so I don’t understand why we would want to create a divergent version. Let’s invest in gitops and better errors instead.

Per code taking too long to land in other repos: I have the impression that code review is the bottleneck, and, setting aside lowering the bar on code review, the obvious solution is to add code reviewers.

That this perspective differs so significantly from others is what I call the “feeling the elephant in the dark room” issue I mentioned elsewhere. It seems like different things are “obvious” to different parties. Better test infrastructure (e.g. with gitops and better errors) would “brighten the room”.

Per the specific idea of investing in divergent repos

Having a simple “cooperative” GitHub fork that offers a convenient testbed to push the envelope on things like “gitops”-driven test automation is obviously worthwhile, and so mundane (zebra has 131 forks as of this writing) that it probably isn’t worth much commentary.

Per having more divergent forks of Zebra: that appears to be a more extreme approach with significant potential downsides, and I’m curious why it’s proposed when (again, to me, dangling from the left ear of the elephant) there’s much lower-hanging fruit that is obvious:

  1. Proper gitops-driven testing.
  2. More attention (across projects) to error handling.

Finally, I will note that in any robust fork better gitops and better error handling would be beneficial. The simplest way to get something into every fork is to land it before the divergence.

I just thought of something

Now that I have written this, I notice that I have an interesting difference in perspective here. The code that I think should be landing in Zebra isn’t new protocol features. From my post-at-the-ear, the code that should land should make bug-hunts much easier.

By my previous assertion that pre-divergence code is the code that everyone gets, I have now talked myself into a stronger assertion:

T/F We should not invest in more divergent forks (right now), instead we should invest in things we know we want all forks to have (immediately).

1 Like

Good Arborist call this week! Check out the open discussion at the end for important ideas expressed with regard to “protocol upgrade processes” :eyes: :student:

3 Likes

I’ll try to further clarify my personal motivations here:

As mentioned before, for a long time the responsibility for ensuring end-to-end readiness for network upgrades rested primarily with the ECC team, with support from and coordination with ZF as Zebra has come online. This responsibility needs to be distributed in order for the Zcash ecosystem to be able to grow at a faster pace.

Let’s think about this in the context of a few of the major post-NU7 Zcash upgrades that have been proposed, in no particular order:

  • Tachyon
  • Hybrid Proof-of-Stake / Crosslink
  • ZSA atomic swaps

Each of these changes has broad-reaching implications for node maintainers, node operators, centralized exchanges, DEXes, and wallets. In order for any of these protocol upgrades to go live on the network, it’s insufficient to just implement the necessary changes in the consensus rules. So there are a couple of possibilities.

One possible approach would be for the entire ecosystem to come to consensus on which change is the “next one”, and then for each entity in the ecosystem to take on a piece of the problem. This could be an efficient way to go about things, but it has some downsides: the process of coming to agreement can be contentious, a single entity can hold up the whole process, and there’s risk for the entire ecosystem if, for example, an entity responsible for some key piece runs out of funding. In an ideal scenario this could be the most efficient way forward, but it’s not really decentralized, and it carries significant risks.

The other approach, the approach that we’re advocating here, is that we take the hacker way of “loose consensus and running code”. Anyone who’s willing to do the work - all of the work - can make a network upgrade happen. And this doesn’t preclude collaboration! But it’s the bazaar approach instead of the cathedral approach; those collaborations are likely to be more transactional and/or market-oriented in nature. There are tradeoffs in terms of risk - as @arya2 mentioned, I’m sure there will be people who worry about issues of code quality and robustness (myself among them!) - but in my opinion the greatest risk to the Zcash project has always been the risk of being outcompeted by more nimble projects.

5 Likes