I’ve been trying to understand how Zcash and Libsnark works for the past week. Here is my current understanding. Is it correct? Please explain if not correct.

There is a proof attached to any private transaction. It proves the following statement: “I know a private key for an address for which there exist unspent inputs, and do not exist any output transactions.”
Here is how it is proven. Developers of Zcash written a function in C language, which looks through all the transactions in blockchain and returns True if the statement above is correct, and False otherwise. Then Zcash developers use the Libsnark library, given this C function as an input, they output public parameters, which include proving key and verification key. Proving and verification keys are distributed with Zcash software for everyone to use.
Then, when Alice wants to send some money to Bob anonymously, she uses the proving key, her private key with money in it as an input, she generates a private transaction with a proof.
Everyone else uses verification key to make sure the private transaction is correct, meaning if they knew sender’s address and amount, and if they run the C function above on that transaction, it would return True.

My understanding is that proving key contains many randomly generated requests to a proof, and the requests are in form of quadratic polynomials, which should be satisfied by a proof. Is it correct? Are these requests known to sender/prover Alice? Why can’t she create a fake solution that would satisfy this particular subset of proofs, but does not satisfy other requests, which are not being checked?

My understanding is that PCP proofs are always probabilistic, meaning the proofs are not absolute, and there is a small chance that Alice can spend money she does not have. What is the chance of that in case of Zcash? I assume this chance is diminishingly small and is about the same as for Alice to randomly generate someone elses private key with money in it.

My own understanding is only at a very abstract level but what I understand of the nature of the proof is just that input and output amounts are equal. This permits the integrity of the blockchain and the privacy of the transaction to be maintained.

Beyond that, it has been mentioned (somewhere…) that wallets containing protected addresses will need to scan the entire blockchain for protected transactions and individually test each of them to determin if they pertain to those addresses in order to derive balances for them.

He talks about minting zerocash from Bitcoin, which I know is no longer true, because zcash will have it’s own blockchain with it’s own tokens. Are any other parts of his explanation has also changed? It is confusing.

Is zero knowledge only used to prove a statement “I know original value of this hash h, here is a proof of that” and everything else builds on that? Or is zero knowledge used to prove more complicated statement, involving scanning blockchain, similar to the one in my question?

Most of it is above my head so sorry I can’t directly answer your question.

And as you mentioned the Zerocash protocol has changed since it has been developed into Zcash so for a complete answer hopefully we can get a developer like @daira to chime in with more technical details.

I’ll try to make this as concise as possible without losing too much accuracy:

Value is carried by “notes” which specify an amount and a public key. To each note there is cryptographically associated a commitment, and a nullifier (so that there is a 1:1:1 relation between notes, commitments, and nullifiers). However, it is not possible to correlate a commitment with its nullifier without knowledge of the note. Computing the nullifier requires the associated private key. An unspent valid note, at a given point on the block chain, is one for which the commitment has been publicly revealed on the block chain prior to that point, but the nullifier has not.

Transactions can contain “transparent” inputs, outputs, and scripts, which all work basically as in Bitcoin. They also contain a sequence of zero or more “JoinSplits”. Each JoinSplit takes in a transparent value and up to two input notes, and produces a transparent value and up to two output notes. The nullifiers of the input notes are revealed (thus preventing them from being spent again) and the commitments of the output notes are revealed (this allowing them to be spent in future). Each JoinSplit also includes a computationally sound zero-knowledge proof-of-knowledge (SNARK) which proves all of the following:

The inputs and outputs balance (individually for each JoinSplit).

For each input note of non-zero value, some revealed commitment exists for that note.

The prover knew the private keys of the input notes.

The nullifiers and commitments are computed correctly.

The private keys of the input notes are cryptographically linked to a signature over the whole transaction, in such a way that the transaction cannot be modified by a party who did not know these private keys.

Each output note is generated in such a way that its nullifier will not collide with the nullifier of any other note.

Outside the SNARK, it is also checked that the nullifiers for the input notes had not already been revealed (i.e. they had not already been spent).

A payment address includes two public keys: one that matches the public key of notes sent to that address, and another for a key-private asymmetric encryption scheme. “Key-private” means essentially that ciphertexts look like random data and do not reveal information about which key they were encrypted to, except to a holder of the corresponding private key. This is used to communicate encrypted output notes on the block chain to their intended recipient, who can use the corresponding private “viewing key” to scan the block chain for notes addressed to them.

So, the basis of the privacy properties of Zcash is that when we spend a note, we only prove that some commitment for it had been revealed, without revealing which one. I.e. the anonymity set is all previously created notes, and a spent note cannot be linked to the transaction in which it was created. The nullifiers are necessary to prevent double-spending: each note only has one valid nullifier, and so attempting to spend a note twice would reveal the nullifier twice, which would cause the second transaction to be rejected.

Incidentally, the NP statement to be proven/verified is expressed by calls to the libsnark API written in C++. (You might be confusing this with the C-to-TinyRAM compiler from SNARKs for C: Verifying Program Executions Succinctly and in Zero Knowledge . Zcash uses the same proving system but does not use that compiler.)

This reminds me that Andrew Miller pointed out the Zerocash paper uses “NP statement” in a sort-of-invalid way. It’s really an NP language which describes the circuit.

Something I’ve too readily glossed over in my own understanding is the proving key(s). Surely it’s not one big, single key - so is it lots of smaller keys? When the size of that file changes with new releases, what’s actually happening? And what aspects of Zcash are influenced by the proving key(s)?

Thanks a lot Daira for your comprehensive explanation! Wow, Zcash has so many levels of complexity. Your explanation is the missing link between superficial descriptions of zcash in articles or podcasts, and highly technical description in papers. This gives me enough context to start understanding the zcash paper.

Could you also please explain a bit lower level of how Libsnark works? What exactly do proving and verifying keys contain? Why is toxic waste being generated?
I heard that proving key contains public keys for each “wire” of a circuit, and toxic waste contains private keys for those public keys. However I cannot imagine how public keys can be used without private keys.
Also, my understanding, is that in order to turn an interactive proving scheme into non-interactive, we need somehow to encode requests to a verifier, so he can later answer them. However these requests should look random to the verifier, so that he cannot predict them in advance to produce a fake solution that answers requests.
Could you please clarify?

No, the verifying key is used to verify proofs. (The integrity of the downloaded proving and verifying keys is checked using hashes by the script that downloads them, so that we don’t rely on the S3 hosting of those keys for integrity.)

Hi @daira, just trying to close the loop for this process in my mind.

I think understand the way the nullifiers are revealed and the commitments are created and revealed. I’m hoping to get a high level explanation of how the SNARK works. I understand how the public key allows us to construct the non-interactive proof.

What I’d like some more light on is SNARKS. I understand that SNARKs work by making it so that…you’re tasked with making two polynomials that are equal to each other. And you can only do that if you know the numbers you say that you know. Could you shed some light on how that works? It sounds like elliptic curves are involved. Is this similar to the general concept of “you can prove primality and composition using elliptic curves”?

Thanks in advance! Really thankful the zcash team is as communicative as you all are.

For non-interactive proof: is it true that Zcash-specific design include a challenge (toxic waste) hidden from provers, and circuit-derived polynomials evaluated at that challenge with bilinear pairing? At what point challenge is produced with a hash? It seems this design idea is getting wider adoption.