Edit: I must troutslap myself for not double-checking while everything is shut down.
Will try and see…
A datapoint: My node has been stuck since late June because of this (combined with another problem). It was not the reason for my return to this forum; that was coincidental. I have been watching some relevant activity on GitHub, patiently awaiting a fix; however, if the blockchain disk space requirements have ballooned as implied here, there will be no way for me to catch up.
I run on slow/old hardware—canary in the coal mine for those who lack wealth or otherwise access to the latest hardware.
[Merged] daira merged 8 commits into zcash:master from str4d:orchard-batch-validation Jul 1, 2022
It’s too bad that people are still trying to use Zcash with swaps as a Bitcoin mixer,* without realizing that the only way to get strong privacy from Zcash is to hold ZEC for a significant time in a shielded value pool. As @zooko says, privacy comes from shielded money at rest, not from money in flight.
Money within a Zcash shielded value pool has no “anonymity set” issues like CoinJoins or Monero mixins. In and of itself, a fully shielded transaction reveals only, “Someone sent some money somewhere.” (Timing, and network-layer linkage of txids, may be compromising metadata in some scenarios; but that is an orthogonal issue. If you want to use a very large number of inputs in a single spend, I do recommend merging them a random time before—just not to reveal to your counterparty that you had an unusually large number of shielded notes in your wallet.)
Grokking Zcash requires a different mindset. It most attracts those who already have that mindset. It tends to be misunderstood by those who think in terms of counting inputs for anonymity sets.
(* Although I myself have sometimes used Zcash to unlink BTC, it is not simply a matter of swapping in/out. I always hold significant ZEC; thus, I can make adjustments in either direction between my ZEC and BTC holdings at arbitrary times in a variety of ways, without making it obvious to cross-chain analytics that the transactions are linked.)
The below re recursion was substantively on my mind for near-future reply to the above, when @str4d said this one minute before I finished writing my first post on this thread:
In Bitcoin, I accepted long ago the reality that it is physically impossible for billions of people to Be Their Own Banks on-chain—on any blockchain. That left me standing with Hal Finney’s vision of “secondary level payment systems”—viz., in my own thoughts, a new Free Banking Era of banks of issue (at risk of “wildcat banks” like Terra). I advocate keeping blocks small, and performance high, so that anyone anywhere in the world can afford to BYOB if sufficiently motivated; but I do not pretend that any blockchain scaling solution can ever bridge an orders-of-magnitude TPS gap.
“Blockchain” is an unfortunate buzzword. By design, it is the world’s most inefficient database—I do not even refer to the POW vs. POS debate, but rather, to the blockchain concept in itself. Wouldn’t it be awful awesome to make all of the “Blockchain! Blockchain!” hype buzzword careers obsolete. So many tears on “crypto Twitter”, mmmm. Delicious.
Zero-knowledge proofs open the way to succinctness, but I wish to go a step further. I dislike how Mina has done that (and more strongly dislike how they didn’t make transactional privacy a version 1 release feature—a ZK proof coin with no privacy, say what!?). I wonder how it could be done better, and if the Zcash team may be interested in doing it better.
The “crypto” world is a cesspit of cargo-culting. Most people have no idea why we use a blockchain. Let’s get back to basics: In the abstract, in rigorous theory, what problems does Satoshi’s blockchain solve?
It is only a problem of distributed consensus on transaction ordering. That’s it! Admittedly, it is a huge problem; that is why we can’t just pass around PGP-signed notes saying, “I pay to @daira and @str4d 1 PGP each.” (As a personal aside, it is also why my 2006 vintage half-baked idea fantasy of a system for cryptographic money was quite useless.)
Now, I despise the word “mempool” in any context except discussing a single node. There are mempools, plural; the purpose of the blockchain is globally to synchronize one view of many mempools, and authoritatively to resolve inter-mempool conflicts without any central authority. Mempools contain inchoate potentials and proposals; the blockchain is a plenary global source of Truth.
Enough hints and vague talk. Just in case I may have hit on any commercially valuable ideas, I do not want to be too specific about what’s on my mind. Let’s just say that if we were to build new distributed consensus algorithms that are designed from scratch fully to exploit the network characteristics of P2P cryptocurrency, and the awe-inspiring raw power of ZKP recursion, without cargo-culting the block-production process, as Mina does… (Suspension point, ellipsis, thoughts. Ping me if curious.)
Edit, P.S. @daira, somewhat OT
If you dislike high fees, dig around in the Solana GH discussions for a view of a chain that sold itself on low fees preparing to commit fee-market suicide. IMO, it can work for Bitcoin—but probably only for Bitcoin.
Maybe it would make sense to discourage almost all non-financial use of the blockchain. (In Bitcoin, almost any non-financial on-chain data are spam, stigmatized as harshly as we can.)
It makes no sense to run chat apps and websites off a blockchain: An extremely inefficient, append-only global database. More importantly here, it is a net negative for ZEC holders: It brings relatively small new demand for ZEC, while degrading the blockchain’s performance and raising its costs for financial usages.
I strongly urge that ECC/ZF should focus on P2P, B2C, and B2B financial usages for ZEC, future ZSAs, and other things that are about money and valuable assets, not fun and games.
it’s not a solution to this problem. you can scale to whatever level of storage efficiency, even 1 bit/transaction, if transaction fees don’t reflect the price of resources used by the transaction (storage, disk IO, CPU time, RAM usage, network bandwidth, etc.), a single spammer will still be able to cripple your blockchain at a minimal cost. the solution is accurate resource pricing. (I implied that in my previous post too, you’re correct that a fee market alone won’t solve this.)
this is not true, because shielded addresses have close to zero commercial adoption. this single fact makes shielded pools nothing but mostly big mixers for transparent ZEC, BTC, or whatever else you exchange the shielded ZEC for, no matter for how long or how big amounts you keep in the shielded pool. (even you mentioned the need for on-chain obfuscation techniques, contradicting your main argument.) you need a circular economy of shielded ZEC to sidestep this in any meaningful way. the easiest and fastest way to do that is sunsetting the transparent pool. the ECC leadership implicitly expressed they won’t do this, and even a part of the community supports them in this.
The specific problem you posed was disk storage, and you claimed that a fee market would be required to fix the disk storage consumption issue:
Even if we have accurate resource pricing, there is nothing stopping blocks being consistently filled by legitimate non-spam usage because they agree with that resource pricing and want to use the chain. A fee market can alter resource allocation (by e.g. enabling spammers to be out-priced by legitimate users), but it cannot prevent peak resource usage - if one or more people are collectively willing to pay the cost of filling blocks, the blocks will be full.
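To make the allocation-versus-capacity distinction concrete, here is a minimal sketch of a fee-rate-ordered block filler. It is a toy model only: `Tx`, `fill_block`, and the sizes and fees are hypothetical, and this is not how zcashd actually assembles blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    label: str  # who sent it (illustrative only)
    size: int   # bytes
    fee: int    # smallest currency units


def fill_block(mempool, capacity):
    """Greedily pack the highest fee-per-byte transactions into one block.

    Returns the chosen transactions and the number of bytes used.
    """
    chosen = []
    used = 0
    # Highest fee rate first: this is the "fee market" part of the model.
    for tx in sorted(mempool, key=lambda t: t.fee / t.size, reverse=True):
        if used + tx.size <= capacity:
            chosen.append(tx)
            used += tx.size
    return chosen, used
```

In this model, higher-paying transactions displace low-fee spam, yet the block is exactly as full either way. Pricing changes who gets the space, not how much space is consumed whenever total demand exceeds capacity.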
nothing can prevent peak resource usage, but accurate resource pricing radically reduces the probability of attackers using up resources instead of legitimate users. the proof of this is in all other major blockchains.
you made it clear you don’t intend to solve this, that’s all I was really here for.
Highlighted context was restored by nullius to an overly-snipped internal quote:
LOL, I see that @str4d is playing it cool with his replies. Poker face! What did you mean about recursion?
As I recently observed in another context, most people do not and cannot understand recursion. Most people cannot understand zero-knowledge proofs. Put the two together, and people will be stunned at achievements that seem impossible.
You entirely missed my point. Will not repeat. A different mindset is required to understand zero-knowledge proof privacy.
By projecting your own view of ZEC onto the whole world, you are revealing your usage patterns of ZEC. I suggest that you should be more careful with your own opsec here. Anyway, it is irrelevant: All the world is not you.
Even for that use case, time within the shielded value pool is important. Linking close-in-time BTC→ZEC→BTC swaps in similar amounts is trivial for cross-chain analytics. That is not a Zcash problem: It is a PEBKAC problem. And it is an even worse problem if using Monero as a Bitcoin mixer. The amounts on the BTC side are always fully transparent; and Monero mixins definitely leak enough information for a confirmation attack. If BTC→XMR and XMR→BTC swaps show similar amounts on the BTC side (adjusted for known XMR/BTC price data), and the BTC chain preserves the approximate times when the swaps occurred on the Monero side, Monero preserves a clear trail leading from one transaction to the other—a trail that is highly improbable to occur by random chance—even if you churn. Zcash has no such problem.
Linking BTC→ZEC and ZEC→BTC trades of very different amounts, far apart in time—LOL, good luck with that. (I also have some other tricks which, for obvious reasons, I will decline to discuss publicly.) I have never used XMR for this purpose, for the above-stated reasons inter alia.
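For readers who want to see why quick swaps in similar amounts are weak, the naive heuristic can be sketched as follows. It looks only at the transparent BTC side. `Swap`, `candidate_links`, and both thresholds are hypothetical illustrations, not a description of any real analytics product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Swap:
    txid: str
    timestamp: int  # Unix time, seconds
    sats: int       # BTC-side amount in satoshis


def candidate_links(deposits, withdrawals, max_gap_seconds, rel_tolerance):
    """Pair each BTC-in swap with later BTC-out swaps of a similar amount.

    A pair is flagged when the withdrawal follows the deposit within
    max_gap_seconds and the amounts differ by at most rel_tolerance
    (as a fraction of the deposit amount).
    """
    links = []
    for d in deposits:
        for w in withdrawals:
            gap = w.timestamp - d.timestamp
            if gap < 0 or gap > max_gap_seconds:
                continue
            if abs(w.sats - d.sats) <= rel_tolerance * d.sats:
                links.append((d.txid, w.txid))
    return links
```

A withdrawal of a very different amount, long after the deposit, falls outside both thresholds and produces no candidate pair. That is the whole point of holding value in the shielded pool for a significant time.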
No, I didn’t. I mentioned that I think it’s a good idea not to show your counterparties if you have a large number of shielded notes in your wallet. That information would reveal nothing else, in itself; in particular, it would not reduce the notes’ “anonymity set” as you put it, counting potential inputs. In anonymity matters, I always just like to make things look as ordinary as I can.
This is true, within the four corners of what I have quoted. I use fully-shielded ZEC with counterparties who pay or accept it. For instance—not revealing anything that isn’t already public—I have some domain names registered with Njalla (onion). Njalla only has shielded Zcash addresses in their shopcart; if you send from a shielded address, then the tx is fully-shielded. I recommend them to others, and I recommend using fully-shielded ZEC with Njalla.
True. I will say it: Zcash blocks are too big and too fast. I should have said it years ago. Now, I am burned.
Question: I have never pruned—not on any chain. (I actually run zcashd with txindex enabled, at a very moderate cost, so that I can examine historical transactions without hitting an untrusted block explorer.) Do you have any practical, hands-on advice for how to switch suddenly to a very tight pruning policy, without potentially somehow shooting myself in the foot? I don’t want to—I really do not want to. But I may have no choice. (I have not even tried to catch up now, due to this aspect of the problem.)
That forbids people who aren’t rich from running full nodes. 2.14 GiB per day devoted to zcashd alone may be trivial to you. It is cost-prohibitive to me, and no doubt to many others around the world.
The disk cost may be partly resolved by pruning—but as I have repeatedly urged in Bitcoinland, disk cost is, albeit problematic, the least problem. Pruning does not help with network cost. (Fortunately not an issue for me; but I have personally spoken to others for whom Blockstream Satellite makes obtaining the whole Bitcoin blockchain a possibility. Zcash has no equivalent infrastructure; and its blockchain is set up to grow much faster.) Pruning also does not help with the resource-use issues from UTXO set size (Zcash transparent pool will behave identically to Bitcoin here), or with any potential performance issues in the nullifier system that Zcash needs to prevent double-spends in the shielded pools.
And running a full node is even more important for Zcash than for Bitcoin: Light wallets for Zcash will never be able to match the privacy of running your own zcashd, unless they switch to some form of fancy PIR. A full node essentially gives the theoretical maximum benefits of PIR for viewing your own transactions.
(Yes, I am a “small-blocker” in Bitcoin. There is a reason for that. I never did “our team, rah rah” type of politics on any side.)
Edit: Fairly restored some context that I inadvertently chopped up in a way that I did not intend. I noticed this myself on a reread. See diff.
I hope that they will never blindly prioritize smaller transactions over larger ones: Shielded transactions are necessarily larger than simple, common cases of transparent transactions. If that ever becomes a problem, then I would urge consensus changes to realign incentives. (It is a complicated issue, so I will decline to toss out off-the-cuff ideas which could be totally wrong.)
Any type of discrimination against shielded transactions would damage the fundamental value of ZEC, so I would hope that miners are already sufficiently incentivized never to do that.
I’ve never tried - because I started with one normal pruned (then non-pruned) full node, then I wanted to start doing offline / private full ZEC exploring like you, and that had to re-index from scratch, so I had to undergo the ‘weeks of pain’ for that to be created.
But perhaps in the reverse (make a copy of your explorer mode datadir instance, then remove -index and related tags), it might ‘just work’ and won’t ask to re-index? Try that. Nothing to lose if you have the disk space and just make another copy to play around with and modify.
Now that they’re indexed, I live with two nodes side by side, and update them separately.
@Zchurn, thanks for the suggestions. I began to write a long analysis here, then realized that I made a serious error I must correct posthaste lest it mislead anyone reading this.
This also raises a feature request. @str4d and @daira, please consider a short- to mid-term solution of pruned node wallet support.
Pruning is not an option here. man zcashd says:
Reduce storage requirements by pruning (deleting) old blocks. This mode disables wallet support and is incompatible with -txindex. Warning: Reverting this setting requires re-downloading the entire blockchain. (default: 0 = disable pruning blocks, >550 = target size in MiB to use for block files)
This is entirely different from Bitcoin. I had known this, but had forgotten. In Bitcoin, you can discard most of the block data after processing, and keep using your wallet with your own node; there is even some recent work for running Lightning nodes off pruned Bitcoin nodes. There are people running Bitcoin nodes with full security and privacy benefits, albeit some severe functional limitations (no reindex/rescan), in 20–30 GiB disk usage or less (can be customized) when the blockchain is >400 GiB. In Zcash, pruning disables the wallet.
I understand that they made this change when developing shielded support. Zcash has new, complicated machinery for that, which Bitcoin does not have. I now understand that perhaps, the Zcash zero-knowledge wizards may be pondering how to make the blockchain disappear, not with curses, but with re-cursion: abracadabra! In the interim, however, I urge that a pruning implementation that supports private wallet usage is critical to letting people run nodes.
I myself will probably wind up buying new hardware just for this. But this is not only about me. Many other people cannot reasonably devote this type of disk space to the permanent storage of Zcash block data:
That is an upper bound of >781 GiB per year in disk growth—just for blocks, not counting chainstate (and excluding orphans that stay stored on disk, which are more frequent in Zcash due to the lower blocktime). For comparison, the Bitcoin blockchain has been running since 2009; its blocks have been full for longer than Zcash has existed. As of mid-2022, it is “only” in total just over half the size by which Zcash could grow in one year.
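A quick check of that upper bound, as a rough sketch. The 2,000,000-byte maximum block size and the 75-second target block spacing are assumptions not stated in this post, and the function names are illustrative.

```python
# Upper bound on block-data growth if every block were completely full.
# Assumed parameters (not stated in the post): 2,000,000-byte maximum
# block size and a 75-second target block spacing.
MAX_BLOCK_BYTES = 2_000_000
TARGET_SPACING_SECONDS = 75
SECONDS_PER_DAY = 86_400
GIB = 2**30


def blocks_per_day(spacing_seconds=TARGET_SPACING_SECONDS):
    """Number of blocks produced per day at the target spacing."""
    return SECONDS_PER_DAY / spacing_seconds


def max_growth_gib_per_day(max_block_bytes=MAX_BLOCK_BYTES,
                           spacing_seconds=TARGET_SPACING_SECONDS):
    """Worst-case daily growth of block data, in GiB."""
    return blocks_per_day(spacing_seconds) * max_block_bytes / GIB


def max_growth_gib_per_year(days=365):
    """Worst-case yearly growth of block data, in GiB."""
    return max_growth_gib_per_day() * days
```

Under those assumptions, the daily figure comes out a little above 2.14 GiB and the yearly figure a little above 781 GiB, consistent with the numbers quoted in this thread. Orphans and chainstate are not modeled.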
With no option to prune, running a fully private, secure full-node Zcash wallet will be cost-prohibitive for many people.
Note that with Zcash, I very strongly advise against running nodes “in the cloud” for wallet usage. For full privacy and security, people need to be able to run independent nodes on hardware in their physical possession. (I do appreciate people who run cloud nodes to help support the network!)
Documentation references in the source (not digging into the functional source now):
Another important way to help people handle blockchain growth:
As a separate issue, the status of pruning in zcashd:
How does that help users who want for their own Zcash wallets to have the full privacy and security of a full node?
In Bitcoin, the principal argument for individuals and businesses to run nodes is self-interest aligned with the common good: Be Your Own Bank. Bitcoin Validators who enforce the consensus rules don’t need any direct financial compensation, because the security and privacy benefits of your own full node are payment enough—and the costs of running a Bitcoin Validator are so low.
Multiply and exponentiate this for Zcash. I sometimes run a zcashd but no bitcoind, because running your own node on your own hardware, under your physical control is critical for gaining the full privacy benefits of Zcash. And privacy is the reason to use Zcash. If you can’t run your own Zcash node, why bother?
I thought this chart was interesting. The ‘flood’ of ‘spam’ transactions seems trivial compared to other chains. I’ve checked a few block explorers and they all seem to converge at 12.5k tx/day for current transaction throughput.
Pay per output is a good step forward towards actual tx cost balancing. It does not address what happens when transactions are 10x or 100x the current volume.
I am of the belief that running a full node is not for everyone and it cannot be a hard requirement if adoption is the goal. End users need key management, signing & shielding, and safe transmission of signed shielded transactions. Lightwalletd and the wallet threat model are good progress in supporting this.
Super users, companies, finance groups, miners, etc. run nodes. They are not the AAA* user; they have specialized knowledge, higher costs for hardware, and higher volume with higher note counts.
There’s a reason for that: a succinct blockchain is easier to do in an account-based model, but account-based models are incompatible with the commitment-and-nullifier approach to privacy used by Zcash (originated by [Sander and Ta-Shma 1999]). The problem is that Mina’s approach requires the account tree to be public in order to perform updates. A complete redesign is needed to get privacy and blockchain succinctness simultaneously.
Let’s calculate the cost: you can buy a new 6 TB hard disk with a 5-year warranty for USD 80 (this is an “enterprise” drive rated for continuous operation, I’m not choosing a consumer drive here). 2.14 GiB/day will fill that disk in ~2611 days (~7.1 years) if it is only used for block data, giving an amortized cost of ~3 cents a day. Of course in practice you have to buy a disk in advance. It’s a bit more complicated because zcashd doesn’t actually support spanning block data across disks, but we could make it do so before that became a problem.
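The arithmetic above can be reproduced with a few lines. This sketch uses only the figures from the post and assumes that "6 TB" means decimal terabytes; the constant and function names are illustrative.

```python
# Amortized disk cost for block data, using the figures quoted above.
DISK_BYTES = 6 * 10**12    # "6 TB", assumed to be decimal terabytes
DISK_PRICE_USD = 80.0
GROWTH_GIB_PER_DAY = 2.14  # worst-case block-data growth
GIB = 2**30


def days_to_fill(disk_bytes=DISK_BYTES, growth_gib_per_day=GROWTH_GIB_PER_DAY):
    """Days until the disk is full if it holds block data only."""
    return disk_bytes / (growth_gib_per_day * GIB)


def amortized_usd_per_day(price_usd=DISK_PRICE_USD):
    """Disk price spread evenly over the days it takes to fill."""
    return price_usd / days_to_fill()


if __name__ == "__main__":
    days = days_to_fill()
    print(f"{days:.0f} days ({days / 365.25:.1f} years), "
          f"{amortized_usd_per_day() * 100:.1f} cents/day")
```

This covers the disk purchase only. Bandwidth, power, and the rest of the machine are outside the model.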
Given that around 1 billion people live on less than 1 USD/day, it’s true that many people around the world couldn’t afford to run a Zcash node (or any computer at all) given their other living costs. But most people in developed countries can, if they can afford a computer.
What do you think of the “private memory” [Khovratovich and Vladimirov 2019] approach used by Dusk? I have not rigorously analysed it; I do not vouch for its privacy properties. I don’t know if it could work with a succinct blockchain; but I don’t immediately see why it couldn’t. (Disclosure: I have some DUSKs that I picked up a while ago on a whim, because ZK proof privacy is like catnip to me. Have not paid much attention to it. I am unhappy that it’s POS.)
On a brief search now for other potentially relevant ideas, I find BlockMaze [Guan, Wan, Yang, Zhou, and Huang 2019]. Need to read, and follow backwards/forwards references. On Ctrl-F, I note that your 2019 Halo paper is reference 29.
I tend to be reluctant to discuss this in public, outside of a strictly development forum. There is too much hype in this space; and although I have been gone from this forum for years, I have some rep elsewhere. By mentioning a few things off-hand, I do not want to give the impression of having the magic answer; you may reply, “LOLWUT, I can’t make that do whatever you imagine”, or worse, find a privacy-wrecking information leak. More fun is that when trying to get a more rigorous handle on my understanding of the two different transaction models, I noticed that something on p. 11 of [Zahnentferner 2018] is encrypted with RSA key, ID 0x7998B6321C09B6CE. A brief search of various keyservers (including Protonmail’s) and the web fails to find any information about this key. Sigh. I love little puzzles, but this is just mean!