Proposal: Lower Zcash block target spacing to 25s
This post proposes lowering the Zcash network’s block target spacing from 75s to 25s in a small upgrade this year. The aim is to start discussion on the metrics, benchmarks and improvements required to make this happen.
Zcash faces a few performance bottlenecks on its path to serving billions of users. The two that come from consensus are:
- Transaction inclusion time: Users wait on average 75 seconds for block inclusion, regardless of network utilization. This delays every use case: private payment users, people onboarding via exchanges, and cross-chain bridges (Near Intents).
- Bandwidth scaling: Right now, consensus bandwidth can only support 2.9 shielded Orchard TPS (27kb/s), which is far too low.
This proposal takes a step toward improving both: reducing block target spacing by 3x, which yields a throughput improvement that multiplies with Tachyon’s, along with significant UX improvements.
Precise proposal
- Lower block target spacing from 75s to 25s
- Keep the following parameters at the same expected wall-clock values:
  - Anchor block interval
  - All economic parameters (e.g. issuance per block is lowered so that issuance per day stays the same)
    - Any rounding remainder can be sent to the NSM.
- Difficulty adjustment stays at the same number of blocks, as was done in Blossom[1].
- Introduce per-pool limits on the 2MB blockspace:
  - All 2MB can be used for Orchard txs
  - All 2MB can be used for Transparent txs
  - Up to 600kb of blockspace can be used for Sapling txs
  - Up to 600kb of blockspace can be used for Sprout txs
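The per-pool limits and the subsidy rescaling listed above can be sketched roughly as follows. This is a minimal illustration, not consensus code: `POOL_LIMITS_BYTES`, `check_pool_limits` and `rescale_subsidy` are hypothetical names, the 2MB and 600kb figures are read as decimal bytes, and the caller is assumed to have already attributed each transaction’s bytes to a pool.

```python
# Hypothetical sketch of the proposed per-pool blockspace limits.
# Assumption: sizes are decimal bytes per block (2MB = 2_000_000, 600kb = 600_000).
MAX_BLOCK_BYTES = 2_000_000
POOL_LIMITS_BYTES = {
    "orchard": 2_000_000,
    "transparent": 2_000_000,
    "sapling": 600_000,
    "sprout": 600_000,
}


def check_pool_limits(bytes_by_pool: dict[str, int]) -> bool:
    """Return True if a block's per-pool byte usage fits the proposed limits."""
    if sum(bytes_by_pool.values()) > MAX_BLOCK_BYTES:
        return False
    return all(
        used <= POOL_LIMITS_BYTES[pool] for pool, used in bytes_by_pool.items()
    )


def rescale_subsidy(subsidy_zat: int, speedup: int = 3) -> tuple[int, int]:
    """Split a per-block subsidy across `speedup` faster blocks.

    Returns (new_per_block_subsidy, remainder_per_old_block). The remainder
    is the rounding dust that the proposal suggests sending to the NSM.
    """
    return divmod(subsidy_zat, speedup)
```

For example, `check_pool_limits({"orchard": 1_400_000, "sapling": 700_000})` returns `False` because Sapling exceeds its cap, and `rescale_subsidy(156_250_000)` (an illustrative input, in zatoshis) leaves a remainder of 1 zatoshi per old block interval.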
Furthermore, we claim:
- This triples the 2-Action Orchard TPS, from 2.9 to 8.7
- This significantly improves the worst-case wallet bandwidth from trial decryption.
  - The chain at maximum (adversarial) capacity would cause 9.8GB/year less bandwidth to wallets.
  - Wallets will have to download extra compact block headers due to the faster block time (80MB per year).
- Bandwidth reductions for Sapling and Sprout cause no empirical usage concern
- Worst-case block processing time will not meaningfully increase
- Stale block rate will remain well under 5% (Ethereum operated long term at 5.4%, with 12s blocks[2])
We expand on these claims in more depth below. First, more on the motivation.
Motivation
Why raise max TPS
Zcash currently operates with a 75-second target block time and a 2 MB max block size, which implies that the Orchard pool’s max TPS is 2.9. Bitcoin’s max theoretical TPS is 7, and it runs at 5.4 TPS empirically in production. Our ambition is far higher than both of these Bitcoin figures, with an aim to serve billions of users.
Just enabling users to quickly migrate to post-quantum recoverability requires higher TPS. Bitcoin at its 7 theoretical TPS will already have issues: https://x.com/zkDragon/status/2017899288378904941?s=20
For reference, Mastercard and Visa operate in the thousands of TPS consistently. To realize Zcash’s end-state vision, we must scale the TPS considerably over time, and bandwidth scaling is a critical step in that direction.
Why reduce latency
Today, every transaction waits ~75 seconds for inclusion, even when blocks are not full. This imposes unnecessary friction across the ecosystem.
- Cross-chain swaps: Market makers must see a transaction on-chain before acting. While they tolerate PoW finality risk, they require at least 1 confirmation (often 2–3).
  - This increases inventory risk, leading to worse pricing and longer user-facing delays.
- Spendability of received funds:
  - Directly limits ZEC’s money velocity.
  - Zashi currently waits 10 blocks (~12.5 minutes) before funds are spendable.
  - Bots at minimum must wait one block, just due to anchor references.
All current use cases benefit from lower latency. Today, users are forced to wait even when blockspace is underutilized, degrading UX without providing additional security or throughput benefits.
25s is not where we should stop in the long term, but it is a great next move.
Tachyon
Project Tachyon [3] is underway to address critical, privacy-specific scaling challenges, namely eliminating wallet sync overheads, full node processing times, and nullifier state growth. As Tachyon delivers, we need to focus on raising consensus bandwidth limits even higher.
Justification of claims
TPS Count
This is straightforward. A 2-in-2-out Orchard tx is 9.14kb. The usable space in a block is 1.998MB [4].
So a block holds at most 218 two-Action txs. At one block every 75s on average, that’s 2.9 TPS. At one block every 25s, that is 8.7 TPS.
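The arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes decimal units (1 MB = 1,000,000 bytes, 1 kb = 1,000 bytes), and the constant and function names are made up for illustration.

```python
# Assumption: decimal units, so 1.998MB = 1_998_000 bytes and 9.14kb = 9_140 bytes.
USABLE_BLOCK_BYTES = 1_998_000  # usable space per block, from the post
TX_BYTES = 9_140                # 2-in-2-out Orchard tx, from the post


def max_tps(block_interval_s: float) -> float:
    """Max 2-Action Orchard transactions per second at a given block interval."""
    txs_per_block = USABLE_BLOCK_BYTES // TX_BYTES  # whole transactions only
    return txs_per_block / block_interval_s


print(USABLE_BLOCK_BYTES // TX_BYTES)  # 218
print(round(max_tps(75), 1))           # 2.9
print(round(max_tps(25), 1))           # 8.7
```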
Worst Case Wallet Bandwidth
A major bottleneck for Zcash wallets is trial-decrypting shielded outputs. A wallet’s bandwidth requirement scales with the number of compact block headers and the number of shielded outputs.
An empty block’s compact block header is ~100 bytes [5]. The faster block time thus leads to an additional 80.2MB/year being downloaded by clients.
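As a rough check of the header overhead, the sketch below counts the extra blocks per year and multiplies by the ~100 byte header size. The names are illustrative. Note that the 80.2 figure is reproduced when MB is read as 2^20 bytes; in decimal megabytes the same quantity is about 84.1.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
HEADER_BYTES = 100  # approximate size of an empty compact block, from the post


def extra_header_bytes_per_year(old_interval_s: int, new_interval_s: int) -> int:
    """Additional compact-block bytes per year caused by the shorter interval."""
    old_blocks = SECONDS_PER_YEAR // old_interval_s
    new_blocks = SECONDS_PER_YEAR // new_interval_s
    return (new_blocks - old_blocks) * HEADER_BYTES


extra = extra_header_bytes_per_year(75, 25)
print(extra)                     # 84096000 bytes
print(round(extra / 2**20, 1))   # 80.2 (MiB per year)
```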
However, the worst-case number of shielded outputs stays the same here. To adversarially grief wallets, you’d maximize the number of outputs per byte of tx size. Today, you do this via Sapling txs, due to Groth16’s much smaller proof size. We need to compare 1-in-many-out Sapling txs with many-Action Orchard txs.
Today, a 1-in-32-out Sapling tx has size 30.8kb. This lets you have 32 * (1.998 MB / 30.8kb) = 2075 outputs/block or 27.6 outputs per second at today’s block time.
A 32 action Orchard tx takes 104kb. This leads to 32 * (1.998 MB / 104kb) = 614 outputs/block. At the proposed new block time, this is 24.5 outputs per second. Thus the proposed Orchard max outputs per second is less than today’s from Sapling, so Orchard doesn’t increase max wallet overhead.
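The comparison between the two worst cases can be reproduced with a small helper. This is a sketch under the post’s own simplifications: decimal byte units, and fractional transactions per block allowed (as in the formulas above). The function name is hypothetical.

```python
USABLE_BLOCK_BYTES = 1_998_000  # decimal bytes, from the post


def outputs_per_second(tx_bytes, outputs_per_tx, block_interval_s,
                       space_bytes=USABLE_BLOCK_BYTES):
    """Worst-case shielded outputs per second for blocks filled with one tx shape."""
    return outputs_per_tx * (space_bytes / tx_bytes) / block_interval_s


sapling_today = outputs_per_second(30_800, 32, 75)      # roughly 27.7
orchard_proposed = outputs_per_second(104_000, 32, 25)  # roughly 24.6

# The proposed Orchard worst case stays below today's Sapling worst case.
print(orchard_proposed < sapling_today)  # True
```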
The proposed parameters slightly lower the bandwidth available for Sapling transactions, from today’s 1.998MB per 75s to 1.8MB per 75s (600kb per 25s block). That drops the worst-case Sapling outputs per second to 24.9 (assuming 32-output txs).
This reduction improves the worst case for wallets in the event of another sandblasting. A compact Sapling output costs 116 bytes for a client to trial decrypt. Dropping the Sapling max outputs per second by 2.7 saves light clients 9.8GB of download a year in the worst case.
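The saving can be estimated as follows. The sketch assumes decimal bytes and fractional txs per block, and its names are illustrative. With unrounded rates it lands at roughly 10GB/year, in line with the 9.8GB obtained from the rounded 2.7 outputs per second.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
COMPACT_OUTPUT_BYTES = 116  # per compact Sapling output, from the post
SAPLING_TX_BYTES = 30_800   # 1-in-32-out Sapling tx, from the post
OUTPUTS_PER_TX = 32


def sapling_outputs_per_second(space_bytes: int, interval_s: int) -> float:
    """Worst-case Sapling outputs per second for a given per-block byte budget."""
    return OUTPUTS_PER_TX * (space_bytes / SAPLING_TX_BYTES) / interval_s


today = sapling_outputs_per_second(1_998_000, 75)   # full block, 75s spacing
proposed = sapling_outputs_per_second(600_000, 25)  # 600kb cap, 25s spacing

saved_bytes_per_year = (today - proposed) * COMPACT_OUTPUT_BYTES * SECONDS_PER_YEAR
print(round(proposed, 1))                 # 24.9
print(round(saved_bytes_per_year / 1e9))  # 10 (GB per year, decimal)
```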
Bandwidth reductions on Sapling and Sprout
Sapling and Sprout txs have never taken anywhere near the full block space allotted. (The sandblast was in Orchard.) Today, Sapling accounts for ~12% of all shielded supply and Sprout for ~0.5%, with these relative percentages continuing to decline. So lowering their allocated bandwidth by 10% as proposed causes no user friction. (See zcashpulse.net)
Worst case block processing time
We are concerned with the time it takes to validate a block, since this affects:
- Block propagation delay
  - Note that all of the expensive ZKP verifications and signature checks should be cached from nodes seeing them in the mempool.
  - There is no advantage to miners “withholding” Orchard txs from the mempool. That is the equivalent of selfish mining, which they could just do directly.
- The advantage a non-block-validating miner can have over one who validates their block
  - This concern is minimal with miners who can verify blocks in parallel
- Full nodes’ ability to sync on cheaper hardware
The block validation time consists of:
- Verifying Transparent/Sapling/Orchard transactions
- Merkleization updates
We will do more in-depth posts on this in the future. But intuitively, the main bottleneck for consensus is verifying ZKPs, and the second bottleneck is doing Merkleization updates.
An under-appreciated fact about Orchard is that its ZKPs batch verify incredibly well by design, and this is already done on-chain. On our twelve-core machine, 1 standard 2-Action tx verifies in 9.8ms, whereas 64 standard 2-Action txs batch verify in 59ms. (This is full Action verification, not just the ZKP.)
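To make the batching gain concrete, the two measurements above can be compared directly. This is just arithmetic on the quoted figures, not a benchmark, and the variable names are illustrative.

```python
SINGLE_TX_MS = 9.8  # one 2-Action tx verified alone (12-core machine, from the post)
BATCH_64_MS = 59.0  # 64 such txs verified as one batch (same machine)

sequential_ms = 64 * SINGLE_TX_MS      # cost if the 64 txs were verified one by one
speedup = sequential_ms / BATCH_64_MS  # gain from batching
amortized_ms = BATCH_64_MS / 64        # per-tx cost inside a batch

print(round(speedup, 1))       # 10.6
print(round(amortized_ms, 2))  # 0.92
```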
Under this new parameterization, our current estimate is that verifying a max-capacity block, with nothing pre-checked in the mempool, should take no more than 1s on a machine with 4 physical cores. However, we expect most transactions to have already been checked in the mempool, leaving verification time under 300ms. We will test this more rigorously, as was done for Blossom.
We get to the 1s estimate by noting that a max-capacity block holds 218 two-Action txs. If nothing was pre-checked, we would verify with three 64-bundle batches and one 26-bundle batch. This gives a total verification time of ~215ms on our 12-core machine. Scaling down to 4 physical cores yields ~650ms. Adding ~200ms for state updates, I/O and mempool overhead, we get 850ms of worst-case full block verification time, which we round up to 1s. If 90% of the transactions were already checked in the mempool (as is the case in almost every network without MEV), we get the 300ms estimate. We will optimize this and benchmark it far more thoroughly over the coming weeks. (There are many known improvements.)
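The estimate can be laid out as a small model. This is a sketch with hypothetical function names; it assumes proof verification scales linearly with core count and that the ~200ms flat overhead is not reduced by mempool pre-checks. It takes the ~215ms 12-core figure as an input rather than deriving it.

```python
def split_into_batches(num_txs: int, batch_size: int = 64) -> list[int]:
    """Split a block's transactions into verification batches."""
    full, rest = divmod(num_txs, batch_size)
    return [batch_size] * full + ([rest] if rest else [])


def block_verify_ms(zkp_ms_12_core: float, cores: int = 4,
                    overhead_ms: float = 200.0, prechecked: float = 0.0) -> float:
    """Estimate full-block verification time in milliseconds.

    Assumes linear scaling with core count and a flat overhead that is
    unaffected by how many transactions were pre-checked in the mempool.
    """
    scaled = zkp_ms_12_core * 12 / cores
    return scaled * (1.0 - prechecked) + overhead_ms


print(split_into_batches(218))  # [64, 64, 64, 26]
print(block_verify_ms(215.0))   # 845.0
typical = block_verify_ms(215.0, prechecked=0.9)  # about 264.5
```

The outputs match the post’s rounded figures: 845ms is rounded to 850ms and then up to 1s, and about 265ms is rounded up to the 300ms estimate.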
Note that for live consensus, this should be very safe. The load scales almost perfectly with the number of cores. Miners spend millions on GPUs, so they can be asked to have a 16-physical-core CPU. Furthermore, realistically all the txs under load will have been pre-gossiped and verified from the mempool, keeping verification times for miners likely well under 300ms.
Stale rate
Stale rate is the percentage of blocks that get orphaned. It depends on the block propagation delay between miners and on the block production rate. Proof-of-work block times are typically modelled as a Poisson process (if there were no P2P delays, it would be exactly a Poisson process). Today, our stale rate is 0.4%. However, our stale rate is likely artificially lower than gossip delay alone would imply, due to centralization within the mining pools. (https://zecminers.xyz/)
The stale rate is then the percentage of blocks for which a competing block gets mined within one “block_propagation_delay”. Fitting today’s 0.4% stale rate to the Poisson model yields an implied network delay of ~300ms, without accounting for any “flat overheads” applied to all nodes or the many low-level P2P concerns which we need to optimize.
We empirically measure the network conditions of a new, 1 physical CPU node in US-East, as having a median gossip delay of 350ms, and p90 of 700ms. (https://zcashpulse.net/).
At 25s, the same “theoretical” model estimates the stale rate would go to 1.3%. If we took today’s measured p90, rounded it up to 1s, and assumed that was always the gossip delay, we would predict a 3.9% stale rate. Both are far less than Ethereum PoW’s demonstrated 5.4% stale rate.[6]
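A minimal sketch of the Poisson model behind these numbers follows. It assumes block arrivals are exponentially distributed with mean equal to the target spacing, and that a block goes stale exactly when a competing block is found within the propagation delay. The function names are illustrative. Under these assumptions the 25s estimate comes out near 1.2%, slightly below the 1.3% quoted above, which may reflect unrounded inputs.

```python
import math


def stale_rate(delay_s: float, block_interval_s: float) -> float:
    """P(a competing block is found within delay_s) for exponential block times."""
    return 1.0 - math.exp(-delay_s / block_interval_s)


def implied_delay(stale: float, block_interval_s: float) -> float:
    """Invert stale_rate: the delay that would explain an observed stale rate."""
    return -block_interval_s * math.log(1.0 - stale)


delay = implied_delay(0.004, 75)  # about 0.30 s, from today's 0.4% at 75s
at_25s = stale_rate(delay, 25)    # about 0.012
pessimistic = stale_rate(1.0, 25) # about 0.039, assuming a constant 1s delay
```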
Also note that these delays all improve significantly as gossip engineering improves, e.g. Compact Blocks and UDP packets.
The typical “folklore”[7] is to keep gossip delay plus processing time under 10% of the mean block time, and we are well within that.
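The rule of thumb can be checked against the figures quoted in this post. The sketch below is illustrative only: the function name is hypothetical, and the two scenarios simply pair the measured median gossip delay with mempool-warm verification, and the rounded-up p90 gossip delay with the 1s cold-verification bound.

```python
def within_folklore_budget(gossip_s: float, processing_s: float,
                           block_interval_s: float, budget: float = 0.10):
    """Return (fraction_of_block_time, ok) for the 10% rule of thumb."""
    fraction = (gossip_s + processing_s) / block_interval_s
    return fraction, fraction <= budget


typical = within_folklore_budget(0.35, 0.3, 25)     # about 2.6% of block time
pessimistic = within_folklore_budget(1.0, 1.0, 25)  # 8% of block time
```

Both scenarios stay under the 10% budget at 25s spacing, with the pessimistic one leaving noticeably less headroom.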
Conclusion: Conservative Improvement, Clear Path Forward
We propose going towards 25-second blocks in a small upgrade this year. We view it as a key step to improve UX for users, and scale the TPS to address demand and enable swift quantum migrations.
We think it is conservative on the consensus safety side as well, and look forward to discussing strategies to better prove this.
As a next step, we will prove out the gossip delay and processing time overheads on more recent hardware (e.g. assuming more parallelism), in a similar fashion to what was done for Blossom[8], and aim to land optimizations for both.
While Tachyon will enable true scale on the privacy front, we must lay the right foundation today to unlock its full potential. Reducing block target spacing is one key step in that direction.
Co-written with @valardragon