Can shielded transaction values be unmasked via practical brute forcing?

Hi again,

As I continue to consolidate my cryptocurrency OPSEC*, this question suddenly came to mind.

If it were possible to cryptographically prove the value of a fully shielded transaction on the blockchain, by already knowing its actual ZEC amount (or successfully guessing it), then obviously that would be very bad news. I just realised there’s not that many numbers to choose from to brute force with, and humans tend to use more rounded numbers with fewer decimal places and have predictable usage patterns.

Various multi-step attacks would become possible. (E.g. to unmask someone churning their Zcash, if you unmasked one Tx amount, you brute force values within close range on other Tx’s.)

E.g. with AES-256, if you have a really dumb password like 1.125 (protecting the privacy of your encrypted blob), no amount of AES quantum resistance will protect against a simple, numbers-based dictionary attack!

Hopefully the protocol design is smarter than I fear - e.g. the attacker would need multiple pieces of usually secret information (known only to the sender, sender+receiver, or sender+receiver+node at worst), all of which would be needed to create the final hash representing the transaction amount. Like this analogy? (If only I were a cryptographer I’d easily know!)

I’m actually curious about this regarding Monero’s masked value amount as much as Zcash, but I lean on Zcash as my final ‘last hope’ in my current financial OPSEC.

Thanks team, wishing you continued success in early 2022 as you work on my favourite cryptocurrency. :slight_smile:

(*) It still is clear, in my continued observation including real world usage of full nodes, that Zcash is far better than Monero for passive untraceability, but TIL Monero at least uses Dandelion++ to mitigate against active network graph analysis. Zcash should adopt that as a minimum!


There is a randomization factor in the value commitments. It protects against replay attacks and against what you mention (kind of like the salt in a password hash).

So, even if I know the transaction amount (e.g. it’s my own Tx I sent), it’s not possible, for me even, to cryptographically reproduce the hash on the ledger representing the Tx amount, using knowledge of Tx amount alone?

If so, it’d be good to know the full list of vectors needed to cryptographically determine the Tx amount (i.e. to do a reverse ‘proof’). I assume it’s scattered inside protocol.pdf which I keep like a bible.

Would a layman’s analogy be: compared to an AES-256 blob you want to decrypt (in order to know / see / ‘prove’ its plaintext value), the Zcash protocol requires multiple “passwords” (public keys like sender address, recipient address, etc.), in order to ‘decrypt’ the Tx amount?

I wonder how many other vectors (other than Tx amount) are needed to make a numbers-based dictionary attack of Tx amount become practical. (All of them? Only recipient address?)

Nonetheless, if you churn at least once with good OPSEC, in order to place an air gap between incoming and outgoing ZEC Tx’s in terms of other people’s knowledge of such vectors (meaning you’re not reusing addresses already known to other people for a second time), I can’t think of a way where an attacker could discover other Tx vectors and not the Tx value all at the same time. (I.e. they’d have to gain access to the system running your full node. IIRC, Zcash p2p network can only expose the link between an individual Tx and the node’s IP address which creates it, and/or other non-cryptographic heuristics like timestamp.)

The net value is in the clear. The transparent inputs/outputs are also in the clear, obviously. You “just” need to find the shielded output values. Either you can decrypt them because you have the incoming/outgoing viewing keys, or you can guess the preimage of the value commitments. However, the latter has a random factor (a nonce, if you want).

Edit: My bad, rcv is not stored. I mixed it up with another randomization factor.

You are correct that note values are particularly sensitive information, because humans are predictable, meaning they will both select nice values, and not intentionally select awkward values.

In transactions that involve intentionally-transparent ZEC movements (e.g. transparent-to-Sapling, or Sprout-to-Sapling via the turnstile), some or all of the net transaction amount gets exposed (depending on how the transaction was constructed), which is why wallets generally need to take more care in those situations[1]. But this is inherent to the “transparent” aspect of these transactions and thus unavoidable disclosure, so we can put those aside for the remainder of the discussion.

In fully-shielded transactions, values are stored inside note plaintexts, which are encrypted to the recipient using ChaCha20Poly1305. This is a widely-used and well-studied AEAD encryption scheme, so we’re happy to rely on it for confidentiality. We use an encryption key derived using ephemeral-static ECDH from the recipient’s address and an ephemeral secret (\mathsf{esk} in the protocol spec). The corresponding ephemeral public key \mathsf{epk} is published in the transaction, but the recipient’s address is not published in the transaction. So even if an adversary can recover \mathsf{esk} from \mathsf{epk} (e.g. the sender’s randomness source was weak, or the adversary can break EC-DL), they don’t have enough information to derive the encryption key.
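To make the ephemeral-static key agreement concrete, here is a rough toy sketch in Python. It is not the real protocol: the tiny group (the order-11 subgroup of the integers mod 23), the fixed generator, the names `ivk`, `pk_d` and `kdf`, and the use of BLAKE2b over two bytes are all assumptions made for illustration. It only shows that both sides reach the same key, and that knowing \mathsf{esk} and \mathsf{epk} without the recipient's public key does not give you that key.

```python
import hashlib

# Toy group: order-11 subgroup of the integers mod 23 (NOT secure, illustration only).
P, Q, G = 23, 11, 4

def kdf(shared: int, epk: int) -> bytes:
    """Hypothetical KDF: hash the shared secret together with epk."""
    return hashlib.blake2b(bytes([shared, epk]), digest_size=32).digest()

# Recipient's long-term secret and the public key carried in their address.
ivk = 6                      # assumed name for the recipient's secret scalar
pk_d = pow(G, ivk, P)        # known to the sender, never published on-chain

# Sender's ephemeral key pair; only epk goes into the transaction.
esk = 9
epk = pow(G, esk, P)

# Both sides compute the same shared secret, then the same symmetric key.
sender_key = kdf(pow(pk_d, esk, P), epk)
recipient_key = kdf(pow(epk, ivk, P), epk)

# An adversary who learned esk but guesses the wrong recipient key gets nothing useful.
wrong_pk = pow(G, 3, P)
adversary_key = kdf(pow(wrong_pk, esk, P), epk)

print(sender_key == recipient_key, adversary_key == sender_key)
```

In a group this small the adversary could of course just try every possible public key; the point of the sketch is only which inputs the key depends on.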

The other place that values “appear” in fully-shielded transactions is in the value commitments (the “hashes” you were referring to above). These are commitments to the values that are used for connecting otherwise-independent ZKPs together for balancing value across the transaction. We use a really neat cryptographic technique to ensure that these commitments reveal no information about the values they commit to.

Value commitments are Pedersen commitments, of the form

\mathsf{cv} = [\mathsf{v}]\mathcal{G} + [\mathsf{rcv}]\mathcal{H}

where \mathcal{G} and \mathcal{H} are independent bases (meaning no-one knows any k such that \mathcal{H} = [k] \mathcal{G}). The neat cryptographic property is that, if \mathsf{rcv} is a scalar randomly sampled from the uniform distribution, then \mathsf{cv} will also be uniformly random, no matter what \mathsf{v} is. This means that Pedersen commitments are an unconditionally hiding commitment scheme, meaning they reveal no information about the value being committed to.
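The hiding property can be checked exhaustively in a toy group. The sketch below is only an illustration under loud assumptions: it uses the order-11 subgroup of the integers mod 23 in multiplicative notation (so \mathsf{cv} = G^{\mathsf{v}} \cdot H^{\mathsf{rcv}} instead of [\mathsf{v}]\mathcal{G} + [\mathsf{rcv}]\mathcal{H}), and it picks H as a known power of G, which a real scheme must never do. The names `commit`, `commitments_for_value` and `K_TOY` are made up for the example.

```python
# Toy Pedersen commitment (NOT secure, illustration only).
P = 23        # toy prime modulus
Q = 11        # prime order of the subgroup we work in
G = 4         # generator of the order-Q subgroup
K_TOY = 7     # toy-only: in a real scheme nobody knows k with H = G^k
H = pow(G, K_TOY, P)

def commit(v: int, rcv: int) -> int:
    """cv = G^v * H^rcv mod P, the multiplicative analogue of [v]G + [rcv]H."""
    return (pow(G, v % Q, P) * pow(H, rcv % Q, P)) % P

def commitments_for_value(v: int) -> list[int]:
    """All commitments to v as rcv ranges over every possible scalar."""
    return sorted(commit(v, rcv) for rcv in range(Q))

# Every element of the subgroup, for comparison.
subgroup = sorted(pow(G, i, P) for i in range(Q))

# For each value v, a uniformly random rcv hits every group element exactly once,
# so the distribution of cv is the same no matter which v was committed to.
for v in range(Q):
    assert commitments_for_value(v) == subgroup
```

Because each value produces exactly the same set of possible commitments, seeing one \mathsf{cv} tells an observer nothing about which \mathsf{v} is inside it.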

Now, let’s say the sender selects \mathsf{rcv} = 0 when creating a shielded output (say their RNG is completely and utterly broken and returning zeroes). Then [\mathsf{rcv}]\mathcal{H} = 0 and so \mathsf{cv} = [\mathsf{v}]\mathcal{G}. You might think this opens that output up to a brute-force attack where the adversary generates [\mathsf{v}]\mathcal{G} points for a wide array of plausible \mathsf{v} values (selecting based on human predictability), stores them in a table, and then looks for them occurring as \mathsf{cv} values on-chain. If they find one of those generated points on-chain, it’s plausible that it’s a \mathsf{cv} generated with a zero \mathsf{rcv}. However, it’s impossible for the adversary to know this for certain, because those points can also be produced from any other value \mathsf{v}', by sampling \mathsf{rcv}' = \frac{\mathsf{v} - \mathsf{v}'}{k} (where k is the unknown scalar from above with \mathcal{H} = [k] \mathcal{G}):

\mathsf{cv}' = [\mathsf{v}']\mathcal{G} + [\mathsf{rcv}']\mathcal{H} = [\mathsf{v}]\mathcal{G} + [\mathsf{v}' - \mathsf{v}]\mathcal{G} + [\frac{\mathsf{v} - \mathsf{v}'}{k} \cdot k]\mathcal{G} = [\mathsf{v}]\mathcal{G} = \mathsf{cv}
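The identity above can be checked numerically in a toy group. This is a sketch under stated assumptions, not anything from the real protocol: it works in the order-11 subgroup of the integers mod 23 in multiplicative notation, and it deliberately knows k (`K_TOY`) so that \mathsf{rcv}' can be computed, which no one can do for the real bases. The helper names are hypothetical.

```python
# Toy check that a commitment made with rcv = 0 has a valid opening for EVERY value.
P, Q, G = 23, 11, 4     # toy modulus, subgroup order, generator (NOT secure)
K_TOY = 7               # toy-only discrete log of H; unknown in a real scheme
H = pow(G, K_TOY, P)

def commit(v: int, rcv: int) -> int:
    """cv = G^v * H^rcv mod P, the multiplicative analogue of [v]G + [rcv]H."""
    return (pow(G, v % Q, P) * pow(H, rcv % Q, P)) % P

def alternative_opening(v: int, v_prime: int) -> int:
    """rcv' = (v - v') / k mod Q, so that commit(v', rcv') == commit(v, 0)."""
    k_inv = pow(K_TOY, -1, Q)   # modular inverse, needs Python 3.8+
    return ((v - v_prime) * k_inv) % Q

v = 5
cv = commit(v, 0)               # what a broken RNG returning zeroes would publish

# For every other candidate value there is exactly one rcv' giving the same cv.
openings = {v_prime: alternative_opening(v, v_prime) for v_prime in range(Q)}
for v_prime, rcv_prime in openings.items():
    assert commit(v_prime, rcv_prime) == cv
```

So a table hit on [\mathsf{v}]\mathcal{G} is consistent with any value at all; what changes the adversary's belief is only how likely a zero \mathsf{rcv} is compared with an honestly sampled one.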

It’s 2:30am so I’m going to leave working through the probability analysis (comparing the probability of such a \mathsf{cv} corresponding to a broken sender setting \mathsf{rcv} = 0 vs a real value with a randomly-sampled \mathsf{rcv}' producing it) as an exercise to the reader. What this means in practice is that as long as the sender’s RNG is good, we can rely on Pedersen commitments being unconditionally hiding, and therefore there is no brute-force attack possible with work smaller than the security parameter. And we rely on the sender’s RNG to be good for most aspects of the shielded protocol.

Just given the note value, no. You need to also know \mathsf{rcv} in order to derive \mathsf{cv} as above. If you’re the sender you’ll know this value for each of the output notes because they are sampled from your RNG during transaction creation, but transaction builders (should) discard them once the transaction has been created.

[1] Ideally wallets would check with their users as to what information a transaction is going to reveal before creating the transaction. This is completely impractical with the Bitcoin-inherited zcashd RPC workflow, but is something I’d like to see implemented in mobile wallets.


@str4d Thank you for the mini crypto lesson!

Indeed, I have learned more recently that plausible deniability, under several threat models, isn’t much of a defence if the attacker makes up their own mind about statistical probability using their own analysis. “There’s a 23% chance this person sent that transaction” isn’t very private when you think about it. In fact, it can be dangerous - a (sane) attacker might falsely think you’re the sender, or give you grief, due to mere probability-derived suspicion. Far better to not be discoverable as a potential candidate in the first place. This is clearly an important OPSEC concept.

(E.g. Monero I see relies on plausible deniability as a key part of its core defence, e.g. their ‘(currently it’s) 10 decoy outputs per Tx’ design, whereas Zcash has more ‘perfect’ privacy in the default experience, which is far more desirable.)

So to summarise in my words: so long as the rcv value generated by ZEC software has sufficient cryptographic strength (which obviously is the case in default zcashd/zcash-cli etc.), and is not leaked out to the attacker somehow, there’s simply no possibility of reversing a transaction value using any kind of dictionary attack checking commonly-used values, and there is not even a sane reason for plausible deniability to be relevant to the scenario. Any transaction value they might try to claim, or prove, is as cryptographically ‘likely’ as any other.

Super cool. That’s more than plausible deniability - that’s cryptographic gaslighting!

But obviously either rcv or certain other values used in the crypto (known to the receiver or sender, e.g. their own public keys) are kept in perpetuity, otherwise in the future they wouldn’t be able to know the value of their past Tx’s to track their own money.

So I think my ELI5 analogy of ‘multiple passwords instead of one password like AES’ is sound. In fact, it’s more like multi-factor auth, i.e. an extra ‘something you possess’ (a randomly generated, privately held key like an ‘object’ you have on file and don’t need to memorise) is also needed for decryption, and is why no one else can prove or reverse the Tx amount on the blockchain, even if they know its value.

In all honesty, I hope both Monero and Zcash improve (and hell, Bitcoin too, and Ethereum). Clearly there are different coins for different use cases, and that’s fine. More than one coin for a single use case is healthy too. Fortunately, fully shielded ZEC daily usage seems to have (at least) doubled recently: https://zcha.in/statistics/usage - it was more like 100 a month ago. This very small real-time Tx anonymity set (compared to Monero, which has tens of thousands of Tx’s per day) wouldn’t be feasible if not for the perfectly hiding design of ZEC’s crypto. If we do the best we can with Tor and other local node OPSEC, all we then have to worry about is timing heuristics, and possible far-future quantum deobfuscation (in part or even in full).

I urge Zcash project to embrace Tor Project-esque sensibilities and literal implementation of Tor tech like default .onion URLs for all websites. It helps win users over. Sadly most people don’t make their decisions based on objective information. Gotta look the part too.
