Quantifying Zcash's Privacy Set

I am defining the privacy set as the number of transactions that include a shielded output.

The ECC page refers to the Sapling pool as the “privacy set” (Zcash Metrics - Electric Coin Company).

  1. Is it correct to assume the Zcash privacy set is defined by only the current pool (Sapling in this case)?

  2. Or, can the privacy set be defined by all the pools (Sapling + Sprout)?

  3. Will the privacy set reset with the new Orchard upgrade, which will have a new pool?

What is the proper way of quantifying the Zcash privacy set as a whole?

Hi there! Welcome.

Great questions!

I have to start with a caveat that the size of the privacy set is not all that matters, because a “set intersection attack” (which is the mathematical generalization of more or less all privacy-penetrating attacks) can, in the wrong circumstances, eliminate an arbitrarily large amount of your set.

So, it’s one of those confusing “yes and no” situations where if someone asks “Is X better than Y because X’s privacy set is bigger than Y’s?”, the answer is often “Yes, if all other things are equal, but all other things are not equal, so what matters more in this case is what set intersections the attackers can apply.”
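To make the caveat concrete, here is a minimal sketch of a set intersection attack. Everything in it is hypothetical: the user names, the size of the starting set, and the four "observations" are made up for illustration and do not correspond to any real Zcash or Monero data.

```python
def intersect_candidates(anonymity_set, observations):
    """Narrow a candidate set by intersecting it with each observation.

    Each observation is the set of candidates consistent with one piece
    of side information the attacker has collected.
    """
    candidates = set(anonymity_set)
    for consistent_with_observation in observations:
        candidates &= set(consistent_with_observation)
    return candidates


# Hypothetical starting anonymity set of 1,000 users.
anonymity_set = {f"user{i}" for i in range(1000)}

# Hypothetical side information. Each observation alone still leaves
# a large set (500, 334, 200 and 143 users respectively).
observations = [
    {f"user{i}" for i in range(0, 1000, 2)},
    {f"user{i}" for i in range(0, 1000, 3)},
    {f"user{i}" for i in range(0, 1000, 5)},
    {f"user{i}" for i in range(0, 1000, 7)},
]

remaining = intersect_candidates(anonymity_set, observations)
print(f"started with {len(anonymity_set)} candidates, {len(remaining)} remain")
```

The point of the sketch is that no single observation looks dangerous, yet combining them shrinks the set dramatically, which is why the starting size alone does not tell you how much privacy you have.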

(Here’s a talk I gave recently in which I give an example of an attack that I discovered that can eliminate basically 100% of Monero’s protections every time: the attacker makes two controlled buys. The Monero researchers call this and other similar situations “EAE” for “Eve→Alice→Eve”, and Ian Miers called this a “flashlight attack”.)

Note that with today’s Zcash wallets, today’s version of Monero, as well as things like MetaMask and others, a huge attack surface for intersection attacks is the network layer. If the attacker can leverage that against you, then it can often negate most or all of the protection that the blockchain layer is trying to offer!

(So, in terms of technology roadmap, it looks to me like the most important privacy upgrade for crypto as a whole would be stronger network-level privacy, such as Tor’s Arti project, the Nym project, or Taylor Hornby’s idea that the Zcash p2p network is already an anonymous communication network. I basically think that, once wallets deploy the new Shielded By Default standard and Halo/Orchard, the Zcash blockchain layer will be so good at privacy that we won’t get much added value from further improving privacy at the blockchain layer, but that we very much need improvements at the network layer. If others here have different opinions about that, I’d be interested to hear them.)

Okay, so with that caveat aside, yes, we at ECC currently think “number of Sapling transactions with at least one shielded output” is the best approximation for the privacy set, as shown on our metrics page. That’s assuming that what you want to do is make a transaction from your shielded ZEC in the Sapling pool. If you make a transaction from your shielded ZEC in the Sapling pool, then there is no [*] information at the blockchain layer that gives any hints as to which of those 416,000 previous transactions your shielded ZEC came from. Make sense?

[*] almost no

Now as to your question about whether the “privacy set” includes Sprout, if you make a transaction from your Sprout shielded ZEC, then it is easy to tell from the blockchain layer that your money did not come (directly) from the Sapling pool, so those 416,000 previous Sapling transactions are not part of your privacy set.

As to your question 3, yes, the Orchard pool will begin life with no ZEC in it. Once the first person moves some of their ZEC into the Orchard pool and leaves it there [**], the privacy set will be 1 — if they were then to move their ZEC back out, everyone would be able to tell, from the blockchain layer, where it came from. Once the second person moves some ZEC into the Orchard pool, the privacy set will be 2 — there is no information visible at the blockchain layer allowing an attacker to tell which of the two sources the output came from, and so on.

Shielded By Default wallets will all support auto-shielding, which means moving ZEC from the transparent pool to the Orchard pool. Hopefully they will also support auto-migration, which means moving ZEC from the Sprout and/or Sapling pools to the Orchard pool.

[**] IMPORTANT NOTE: If you move your ZEC through a shielded pool on its way to somewhere else, that provides little to no added privacy — to you or to anyone else. Privacy comes from your shielded wallet keeping your ZEC in a privacy pool. I also touch on this issue in the talk I linked above (Zooko Wilcox: How to protect your crypto through privacy - YouTube).

Hope this helps! Interested to find out where you are going with this.

This definitely helps, thanks for the detailed response.

My goal was to build an interactive dashboard which helps others realize that Zcash has “infinitesimal privacy”. Over time, as the number of transactions with a single shielded output increases, the chance of one's transaction being pinpointed goes down in terms of information entropy.

The motivation was from this article: About 33 Bits | 33 Bits of Entropy

When this article was written, 33 bits of information were enough to determine someone's identity out of a total of 6.6 billion people. Since most people in the world have a global identity, information leakage is very easy. Taken from the article above: “The second consequence is that 33 bits is not really a lot. If your hometown has 100,000 people, then knowing your hometown gives me 16 bits of entropy about you, and only 17 bits remain.”
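The arithmetic behind the quoted figures can be checked with a short sketch. The function names below are my own (hypothetical), and the only inputs are the two numbers from the article: 6.6 billion people and a hometown of 100,000.

```python
import math


def bits_to_identify(population):
    """Bits needed to single out one member of a population."""
    return math.log2(population)


def bits_revealed(population, matching):
    """Bits revealed by a fact that narrows `population` down to `matching`."""
    return math.log2(population / matching)


world = 6_600_000_000  # world population used in the article
hometown = 100_000     # hometown size used in the article

total = bits_to_identify(world)
revealed = bits_revealed(world, hometown)
remaining = total - revealed  # equals log2(hometown)

print(f"bits to identify one person: {total:.2f}")
print(f"bits revealed by hometown:   {revealed:.2f}")
print(f"bits still needed:           {remaining:.2f}")
```

The exact values come out slightly below the article's rounded figures of 33, 16 and 17, because 6.6 billion is somewhat less than 2^33.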

Applying this idea to Zcash: as the number of transactions (ones with a single shielded output) increases, the number of bits of information needed to determine whose transaction is whose also increases. But the information leakage stays constant (or so I thought before reading your response). An outsider would only know that a single transaction has taken place, so the chance of pinpointing it is 1/416,000.

For example:

Number of bits of information entropy needed to pin down someone's “identity” in the Sapling pool: ~19 bits (2^18.67 ≈ 416k, the number of transactions with a single shielded output in Sapling).
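For a dashboard, the same calculation can be done for any set size. This is only a sketch: `anonymity_bits` is a hypothetical helper name, the 416,000 figure is the one quoted in this thread, and the other sizes are arbitrary sample points. It also assumes every transaction in the set is equally likely, which is the best case for the user.

```python
import math


def anonymity_bits(set_size):
    """Bits needed to single out one transaction from `set_size` equally likely ones."""
    if set_size < 1:
        raise ValueError("set_size must be at least 1")
    return math.log2(set_size)


# 416,000 is the Sapling figure quoted in this thread; the rest are sample points.
for set_size in (1, 2, 1_000, 416_000, 1_000_000):
    print(f"{set_size:>9,} transactions -> {anonymity_bits(set_size):5.2f} bits")
```

Because the relationship is logarithmic, each doubling of the privacy set adds exactly one bit, so the curve flattens as the set grows.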

Now I am stuck on calculating how many bits of information are available to an attacker at the blockchain level, just as knowing someone's hometown yields 16 bits of entropy. I would then use this to make a compelling case for why Zcash has infinitesimal privacy.

Thoughts?

Are you using infinitesimal correctly here? It sounds like you are saying Zcash has very little privacy?

I meant to say that the attack on privacy becomes infinitesimal over time. In other words, the information leakage becomes smaller over time.

What do you mean, on the blockchain level?

On the blockchain*

Would this be a correct conclusion below?

On the blockchain, no new information can be seen other than that a transaction involved a shielded output. The real-world equivalent would be an attacker knowing that the target is a human. Having that information does not help at all, since everyone is a human. There is no “entropy” in either piece of information.

Then there is no way someone can get to the roughly 18.7 bits of information entropy (the number of bits needed to pinpoint a transaction in the Sapling pool) from the Zcash blockchain. Over time, as the privacy set grows, the number of required bits will only increase.
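One way to put numbers on this reasoning is sketched below. It assumes the idealized case described in the thread, where the on-chain observation is consistent with every transaction in the set. The 1,000-candidate side channel is purely hypothetical and stands in for something like a network-layer leak; `bits_leaked` is an illustrative name, not part of any Zcash tooling.

```python
import math

SAPLING_SET_SIZE = 416_000  # figure quoted in this thread; it grows over time


def bits_leaked(set_size, still_consistent):
    """Bits revealed by an observation that leaves `still_consistent` candidates."""
    if not 0 < still_consistent <= set_size:
        raise ValueError("still_consistent must be between 1 and set_size")
    return math.log2(set_size / still_consistent)


needed = math.log2(SAPLING_SET_SIZE)

# "A shielded output was involved" is true of every candidate, so it rules nobody out.
on_chain = bits_leaked(SAPLING_SET_SIZE, SAPLING_SET_SIZE)

# Hypothetical off-chain side channel that narrows the set to 1,000 candidates.
side_channel = bits_leaked(SAPLING_SET_SIZE, 1_000)

print(f"bits needed to pinpoint a transaction: {needed:.2f}")
print(f"bits leaked by the on-chain fact:      {on_chain:.2f}")
print(f"bits leaked by the side channel:       {side_channel:.2f}")
```

Under these assumptions the on-chain fact contributes zero bits, so any bits an attacker does obtain would have to come from outside the blockchain layer.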

Do you feel my logic makes sense here?