Idea for private syncing of deterministically derived t-addresses

I’ve been thinking more lately about the privacy trade-offs with t-addresses. I wanted to share my idea, and see if anyone can poke any holes in it.

First, for context, here’s the problem I’m trying to solve:
Currently, major Zcash wallets do not rotate t-addresses. There are valid reasons for this, but t-address reuse is still highly undesirable.

The leading candidate to avoid t-address reuse (afaict) is to use ephemerally generated t-addresses. I’m concerned about risk of loss, however, as users may receive money to ephemeral keys they no longer have access to. If the user gives out a t-address, but then changes wallets or loses their phone, etc., then any money sent to that address will not be recoverable from their wallet recovery phrase. This is a small risk, but I think it will happen more than you might expect… Users are likely to bookmark a t-address for reuse, without realizing it’s ephemeral or what the consequences of an ephemeral address are. Educating normies about the consequences of ephemeral addresses will be a challenge (to put it gently :sweat_smile:).

Ok, so my idea:
The problem really has to do with the way HD wallets sync. An HD wallet derives 20 addresses (or whatever you’ve configured for the gap limit) and queries the RPC to see if any address has been used. If so, it derives the next 20 addresses and checks again, until it has discovered every address which has been used. It then downloads the UTXOs these addresses own, so that it can construct new transactions that spend those UTXOs. This exposes to the RPC that you own all of the requested addresses… a very bad privacy leak.
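To make the leak concrete, here is a minimal sketch of the batch-style gap-limit scan described above. The `derive` and `server_has_history` callables are hypothetical stand-ins (not any real wallet or lightwalletd API), and the addresses are made-up strings.

```python
from typing import Callable, List, Set, Tuple


def gap_limit_scan(
    derive: Callable[[int], str],
    server_has_history: Callable[[str], bool],
    gap_limit: int = 20,
) -> Tuple[List[str], List[str]]:
    """Return (used_addresses, addresses_revealed_to_server)."""
    used: List[str] = []
    revealed: List[str] = []
    start = 0
    while True:
        # Derive the next batch and ask the server about every address in it.
        batch = [derive(i) for i in range(start, start + gap_limit)]
        revealed.extend(batch)  # the server now knows these belong together
        hits = [addr for addr in batch if server_has_history(addr)]
        used.extend(hits)
        if not hits:
            # A whole batch with no history: stop scanning.
            return used, revealed
        start += gap_limit


# Demo with made-up addresses; indices 0, 3 and 25 have received funds.
on_chain: Set[str] = {"t1-demo-0", "t1-demo-3", "t1-demo-25"}
used, revealed = gap_limit_scan(
    lambda i: f"t1-demo-{i}",
    lambda addr: addr in on_chain,
)
print(len(used), "used;", len(revealed), "revealed to the server")
```

In this toy run the wallet only has 3 used addresses, but the server sees all 60 that were queried, including ones that have never appeared on chain.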

But we don’t have to scan this way. We could download the entire t-addr UTXO set, locally derive the first 10k addresses (or some reasonably large number) and just scan the UTXO set for any of these addresses.
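A rough sketch of that local scan, under loud assumptions: `derive_demo_address` is a hypothetical placeholder that uses HMAC instead of real BIP-32/secp256k1 derivation, and the `Utxo` record shape is invented for illustration. The point is only that matching happens on the device, so no address is ever sent to a server.

```python
import hashlib
import hmac
from typing import Iterable, List, NamedTuple, Tuple


class Utxo(NamedTuple):
    txid: str
    vout: int
    address: str
    value_zat: int


def derive_demo_address(seed: bytes, index: int) -> str:
    # HYPOTHETICAL stand-in for HD derivation: NOT BIP-32, do not use for keys.
    digest = hmac.new(seed, index.to_bytes(4, "big"), hashlib.sha256).hexdigest()
    return "t1demo" + digest[:28]


def scan_utxo_set(
    seed: bytes, utxos: Iterable[Utxo], address_count: int = 10_000
) -> List[Tuple[int, Utxo]]:
    """Return (derivation_index, utxo) for every UTXO paying one of our addresses."""
    # Derive the watch set locally, once.
    watch = {derive_demo_address(seed, i): i for i in range(address_count)}
    found = []
    for utxo in utxos:  # single pass over the downloaded UTXO set
        index = watch.get(utxo.address)
        if index is not None:
            found.append((index, utxo))
    return found


# Demo: two UTXOs belong to the wallet, one does not.
seed = b"demo seed, not a real wallet seed"
utxo_set = [
    Utxo("aa" * 32, 0, derive_demo_address(seed, 7), 50_000),
    Utxo("bb" * 32, 1, "t1someoneelse", 10_000),
    Utxo("cc" * 32, 0, derive_demo_address(seed, 9_999), 25_000),
]
found = scan_utxo_set(seed, utxo_set)
print([index for index, _ in found])
```

Because the lookup is a hash-map probe per UTXO, the cost is dominated by downloading the set and deriving the addresses, not by the matching itself.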

Obviously this has some performance tradeoffs:

  1. a few GB download of UTXO set
  2. <1MB of memory for 10k address derivations
  3. time to derive the addresses and search the UTXO set (even an old phone can likely do this in <60 seconds)
  4. user can only receive 10k transactions, lol
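As a back-of-the-envelope check on item 2, the raw key material for 10k addresses fits comfortably under 1 MB. The byte sizes are the standard secp256k1 / P2PKH sizes; container overhead in any particular language is not counted, so treat this as a lower bound rather than a measurement.

```python
ADDRESS_COUNT = 10_000
PRIVATE_KEY_BYTES = 32        # secp256k1 private key
COMPRESSED_PUBKEY_BYTES = 33  # compressed secp256k1 public key
PUBKEY_HASH_BYTES = 20        # hash160 committed to by a P2PKH address

# Only the hashes are needed to recognise incoming UTXOs.
lookup_only = ADDRESS_COUNT * PUBKEY_HASH_BYTES
# Keeping the keys around as well (e.g. for later spending).
with_keys = ADDRESS_COUNT * (
    PRIVATE_KEY_BYTES + COMPRESSED_PUBKEY_BYTES + PUBKEY_HASH_BYTES
)

print(lookup_only, "bytes for lookup only")
print(with_keys, "bytes with keys cached")
```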

IMO the initialization time is acceptable, because it’s a one-time cost and not meaningfully different from the initialization time already required for syncing shielded outputs.

There’s also some work to be done syncing updates to the UTXO set, but this is doable.
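One way the incremental part could look, as a sketch only: it assumes the wallet can obtain, for each new block, the list of transparent outpoints spent and outputs created. That per-block feed and the tuple shapes below are assumptions for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Outpoint = Tuple[str, int]                 # (txid, output index)
CreatedOutput = Tuple[Outpoint, str, int]  # (outpoint, address, value in zatoshi)


def apply_block(
    wallet_utxos: Dict[Outpoint, int],
    watch: Set[str],
    spent: Iterable[Outpoint],
    created: Iterable[CreatedOutput],
) -> None:
    """Update the wallet's own UTXOs in place from one block's transparent data."""
    # Add new outputs first so an output created and spent in the same block
    # is correctly removed again below.
    for outpoint, address, value in created:
        if address in watch:
            wallet_utxos[outpoint] = value
    for outpoint in spent:
        wallet_utxos.pop(outpoint, None)  # ignore spends of other people's coins


# Demo: one old coin is spent, one new coin arrives, one coin is
# received and spent within the same block.
watch = {"t1-mine-a", "t1-mine-b"}
wallet_utxos: Dict[Outpoint, int] = {("aa", 0): 5}
apply_block(
    wallet_utxos,
    watch,
    spent=[("aa", 0), ("dd", 0), ("ee", 3)],
    created=[
        (("bb", 1), "t1-mine-a", 7),
        (("cc", 0), "t1-someone-else", 9),
        (("dd", 0), "t1-mine-b", 2),
    ],
)
print(wallet_utxos)
```

The wallet still receives every block's transparent data, so the server learns nothing about which outputs matched.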

Conclusion
Unless I’m missing something, I believe this would give us the best of both worlds: Deterministic wallets (no risk of loss of funds to ephemeral addresses) without revealing addresses to a wallet RPC.

Curious to hear what others think.

I think it makes sense? Wallets already have to scan the entire blockchain to find shielded txs sent to them. Why not look for UTXOs while they’re at it?

Yep, that was my thought.

Just to be clear though, this would be more efficient than downloading the entire blockchain. This proposal only requires the wallet to download the current UTXO set.

There are a few things here:

  • With respect to downloading the UTXO set, something we have discussed in the past is representing it as a compressed map - doing so would allow us to discover subsections of the chain that are potentially relevant to the wallet. I would be really enthusiastic if someone has the time to add this kind of functionality to the light wallet protocol - it would also be excellent to get nullifiers for shielded notes in this fashion; it would make syncing much much faster because wallets could then entirely skip scanning sections of the chain that aren’t relevant to them.
  • We are already in the process of adding transparent input and output information to compact blocks so that normal scanning can discover UTXOs without linking addresses in the view of the light wallet server.
  • We already implement fetching of specific transactions over Tor.

The main thing I am concerned about is seed reuse in wallets that don’t handle transparent UTXOs properly. Now, I think that seed reuse in general is a terrible idea, and we should discourage it wherever possible across the ecosystem, but there are times where it is necessary. So the other part to all of this is to define standards for how well wallets protect users who receive funds on taddrs, and to have well-publicized ratings for how well each wallet conforms to those standards. We need multiple wallets in the ecosystem that are all doing things right, so that people can have safe recovery options in case the wallet they are using becomes unmaintained or suffers from a blocking bug or whatever.

Can you expand on the seed reuse issue? I don’t really get it. If the wallet uses normal scanning to discover UTXOs, does it matter? Or is the concern about wallets that don’t do that? (In that case, does it matter? They already compromise privacy anyway.)

Yes, the concern is about wallets that have Bitcoin-derived transparent functionality and so (a) query block explorers or light wallet servers in a way that links addresses, and even worse (b) may construct transactions in ways that link addresses.

This is very interesting. I’ll look into this and see if I can help out.

I understand this concern and this is actually what led me down this path. Hear me out…

Bad wallets will do bad things, regardless of t-addr rotation/derivation/scanning/etc. If a user chooses a wallet which handles any form of wallet logic improperly, then those users will be subject to privacy or even security risks. But we can’t eliminate the freedom of choice for users to choose whichever wallet software they like.

But what we CAN do is standardize the correct wallet scanning technique, and allow the community to encourage/discourage use of wallets that do/don’t follow the Zcash standard way of operating HD wallets. Much like how BIP-32 defined the HD wallet structure that the Bitcoin community argues to be “correct”, and now Bitcoin wallet developers would never dare implement something other than the BIP-32 standard.

I am happy to write a ZIP for this.

Another interesting consequence of this design is that the wallet would no longer require indexers of any sort.

I don’t see how that is solvable, or how it is not already a problem. Don’t existing wallets already query derived addresses?

This is a misunderstanding. When we say “ephemeral addresses”, we do not mean “randomly-generated addresses”. Ephemeral addresses are still deterministically derived under the account’s BIP 44 derivation path; they are just derived in a separate subtree that Bitcoin-derived transparent wallets do not look in (and if they are closely aligned with Bitcoin, will never look in because the Bitcoin ecosystem makes a bunch of assumptions about there only being two places that addresses can be derived), which means that importing a Zashi seed phrase into a Bitcoin-derived transparent wallet won’t cause the ephemeral addresses to be linked on chain.
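For readers unfamiliar with the layout, here is a small sketch of what "a separate subtree" means in path terms. Coin type 133 is the SLIP-44 registration for Zcash; the external/internal values 0 and 1 are the two BIP 44 chains. The value 2 for the ephemeral chain is my assumption about how this is specified and is not stated in this thread, so check the relevant ZIPs before relying on it.

```python
ZCASH_COIN_TYPE = 133  # SLIP-44 coin type for Zcash mainnet

EXTERNAL = 0   # BIP 44 receiving chain
INTERNAL = 1   # BIP 44 change chain
EPHEMERAL = 2  # ASSUMPTION: extra chain that Bitcoin-derived wallets never scan


def transparent_path(account: int, chain: int, index: int) -> str:
    """Build a BIP 44 style path string for a transparent address."""
    return f"m/44'/{ZCASH_COIN_TYPE}'/{account}'/{chain}/{index}"


print(transparent_path(0, EXTERNAL, 0))
print(transparent_path(0, INTERNAL, 0))
print(transparent_path(0, EPHEMERAL, 5))
```

A wallet that only walks chains 0 and 1 never derives anything under chain 2, which is why importing the same seed there would not query or link those addresses.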

Ahhh, ok, I definitely was thrown off by that terminology. Is there a spec for this anywhere I can read? I still have lots of questions :sweat_smile:

First, if I recover my wallet into Zashi again in the future, how does Zashi know which ephemeral key indices I have already used without having to scan (and thus link) the keys under my wallet’s “ephemeral” tree?

Second, maybe this is just semantics, but purpose = 44' in the BIP-44 path specifies that all subtrees will be scanned with the same gap-limit logic. So perhaps this “ephemeral address” tree should be under a different purpose identifier with different scanning expectations?

Third, I’m confused how this provides any additional protection. An improperly implemented wallet (e.g. an out-of-spec Zcash or multi-coin wallet) may still scan whichever ephemeral key tree you choose, e.g. to ensure they discover all the user’s funds. The effectiveness of the ephemeral key tree approach (as I understand it) still comes down to the Zcash community’s ability to actively discourage use of those out-of-spec wallets, which is the same regardless of which BIP-44 path is used.

Lastly, despite all of this, I think my proposal still offers a meaningful privacy benefit, since it eliminates the need to query the RPC at all. Curious if you agree, or if you see something I’m missing.

Please don’t waste your time answering my questions if there is a spec I can read instead! I only ask, because I haven’t found documentation for these design decisions elsewhere.

The main drawback is that it’s gonna make transparent scanning very slow. Without testing, it’s hard to say by how much, but I wouldn’t be surprised if it was as slow as shielded syncing.

IMO, you give up privacy the moment you use transparent addresses. At least, I get nearly instant synchronization. I wouldn’t use a wallet that takes that from me too.

Yeah, t-addr syncing will be slower, but it’s a one-time cost at startup that can happen in parallel with the z-addr sync, so I doubt it would actually affect UX in any meaningful way.

Zcash supports t-addrs and Zashi is already pioneering how to use t-addrs (when you must) without giving up on privacy. Seems like a worthwhile effort to me :person_shrugging:

Not everyone uses shielded addresses. I guess it makes sense for Zashi and privacy purists, but I think you’ll get a lot of hate from the other users.

If you don’t care about privacy or shielded usage, I think Zashi is the wrong wallet for you. Sorry.

I’m ok with it as long as it still supports the old synchronization method.