ECC Acceleration using GPU Compute on Mobile devices

Applicant background

I am a software developer who has held senior positions at a major software company. I was the CTO of listed companies and hedge funds.

Description of Problem or Opportunity

Zcash’s underlying cryptography makes heavy use of elliptic curve cryptography (ECC) on specialized curves (Jubjub & Pasta). Currently, there is no hardware acceleration support for these curves, so the work must be done by the CPU.

For instance, synchronization from seed requires scanning the blockchain and building the note commitment tree. With Warp Sync, I optimized the process down from several hours to a few minutes. The remaining bottleneck is the Pedersen hashes. They can be batch computed in parallel, which makes them ideal for GPU compute.
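A minimal sketch of the batching pattern described here, run on CPU threads rather than a GPU. The hash is a cheap placeholder (FNV-1a), not a Pedersen hash, and the names `placeholder_hash` and `hash_batch` are hypothetical. The point is only that each item is hashed independently, so the batch can be split across workers.

```rust
use std::thread;

/// Stand-in for a per-note hash. This is NOT a Pedersen hash; it is a cheap
/// placeholder (FNV-1a) so the batching pattern can be shown on its own.
fn placeholder_hash(input: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in input {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hash every item in the batch, splitting the batch into up to `workers`
/// chunks that run in parallel. Results keep the input order.
fn hash_batch(inputs: &[Vec<u8>], workers: usize) -> Vec<u64> {
    let workers = workers.max(1);
    let mut out = vec![0u64; inputs.len()];
    // Ceiling division; at least 1 so that `chunks` never receives 0.
    let chunk = ((inputs.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for (ins, outs) in inputs.chunks(chunk).zip(out.chunks_mut(chunk)) {
            s.spawn(move || {
                for (i, o) in ins.iter().zip(outs.iter_mut()) {
                    *o = placeholder_hash(i);
                }
            });
        }
    });
    out
}
```

Because no item depends on another, the same structure maps onto a GPU dispatch where each work item handles one hash.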

Proposed Solution

I propose to develop a library that will leverage GPGPU compute in OpenCL to offload the ECC calculations and distribute them to a large number of ALU cores.

Solution Format

It will be in the form of source code that builds a library usable from Android and iOS.

Technical approach

It would be similar to the approach taken in hhanh00/secp256k1-cl on GitHub, a similar project that I have done for secp256k1, the ECC curve used in Bitcoin.

I need to implement the integer (finite-field) operations and the ECC addition and multiplication as GPU compute kernels. The first target is batch Pedersen hash calculation in order to speed up Warp Sync further.
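To make the building blocks concrete, here is a small CPU-side sketch of field arithmetic, twisted Edwards point addition, and double-and-add scalar multiplication. It rests on toy assumptions: the prime 13 and the parameters a = 1, d = 2 are picked for readability and are not Jubjub's parameters, and all function names are hypothetical. A real kernel would use multi-word integers and a projective or extended coordinate system to avoid the inversions.

```rust
/// Toy prime modulus (assumption: chosen for readability, not a real curve field).
const P: u64 = 13;
/// Toy twisted Edwards parameters for a*x^2 + y^2 = 1 + d*x^2*y^2 (mod P).
/// a = 1 is a square and d = 2 is a non-square mod 13, so the addition law
/// below is complete (the denominators never vanish).
const A: u64 = 1;
const D: u64 = 2;

// All field helpers expect inputs already reduced mod P.
fn fadd(x: u64, y: u64) -> u64 { (x + y) % P }
fn fsub(x: u64, y: u64) -> u64 { (x + P - y) % P }
fn fmul(x: u64, y: u64) -> u64 { (x * y) % P }

/// Modular exponentiation by squaring.
fn fpow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    while exp > 0 {
        if exp & 1 == 1 { acc = fmul(acc, base); }
        base = fmul(base, base);
        exp >>= 1;
    }
    acc
}

/// Inverse via Fermat's little theorem: x^(P-2) mod P, for non-zero x.
fn finv(x: u64) -> u64 { fpow(x, P - 2) }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Point { x: u64, y: u64 }

const IDENTITY: Point = Point { x: 0, y: 1 };

fn on_curve(p: Point) -> bool {
    let (x2, y2) = (fmul(p.x, p.x), fmul(p.y, p.y));
    fadd(fmul(A, x2), y2) == fadd(1, fmul(D, fmul(x2, y2)))
}

/// Affine twisted Edwards addition; the same formula also handles doubling.
fn point_add(p: Point, q: Point) -> Point {
    let t = fmul(D, fmul(fmul(p.x, q.x), fmul(p.y, q.y)));
    let x_num = fadd(fmul(p.x, q.y), fmul(p.y, q.x));
    let y_num = fsub(fmul(p.y, q.y), fmul(A, fmul(p.x, q.x)));
    Point {
        x: fmul(x_num, finv(fadd(1, t))),
        y: fmul(y_num, finv(fsub(1, t))),
    }
}

/// Double-and-add scalar multiplication, scanning the scalar from its low bit.
fn scalar_mul(mut k: u64, mut p: Point) -> Point {
    let mut acc = IDENTITY;
    while k > 0 {
        if k & 1 == 1 { acc = point_add(acc, p); }
        p = point_add(p, p);
        k >>= 1;
    }
    acc
}
```

On this toy curve the point (4, 4) has order 8, so `scalar_mul(8, Point { x: 4, y: 4 })` returns the identity.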

How big of a problem would it be to not solve this problem?

Orchard also uses ECC, albeit on different curves. Without appropriate hardware support, other coins perform much better because they rely on popular curves for which such libraries exist.

Execution risks

  1. Performance of hardware acceleration is notoriously hard to predict, and the results could be disappointing. Without a prototype, we don’t know how well the ECC algorithms are going to behave. There is prior work, but on other curves. Also, differences in hardware, even between generations, can have an impact.
  2. GPU programming is hard. It is akin to embedded development (e.g. hardware wallets). We may experience delays due to technical difficulties.

Unintended Consequences

It could be too fast

Evaluation plan

We will benchmark with several account IVKs against other wallets.

Schedule and Milestones

Subject to review and adjustments

  1. Proof of concept, Integral modular math
  2. Add ECC math
  3. Add PH
  4. Warp Sync
  5. Investigate Trial Decrypt
  6. Pasta curves
  7. Multi Exp
  8. Orchard Trial Decrypt

Budget

  1: ~1 month
  2-5: 6-12 months
  7-8: 4 months

We are looking for funding for the first phase at 15k.

Then we can assess the time & cost needed for the rest.


Great idea !


Can you copy the contents of the proposal and paste it in the post?

  1. Have you studied whether the data structures used in the Jubjub & Pasta curves can be represented in a GPGPU implementation on the targeted mobile processors? Or does your work include working on the underlying math? Or is it simply porting the math to OpenCL?

Congrats on the WarpSync work, it seems to be slightly faster than BlazeSync by @adityapk00

  1. Do you have experience in packaging GPU compute code as Mobile SDKs with appropriate interfaces for use within Mobile applications?
  2. Is there a special hardware acceleration permission/approval required for the proposed Android & iOS integration?
  3. Are there specific versions of Android and iPhones that support the GPU usage? Is there a minimum OS requirement to utilize the APIs that access the ALUs?
  4. Do you have any efficiency benchmarks for off-loading several computations (series of blocks/UTXO data) sequentially via GPU vs CPU?
  5. Have you communicated and collected feedback from the potential users of the solution that you are building?
  1. Does that mean the deliverable will be plain C/C++/Rust code, or will you be delivering usable Android & iOS specific libs via the Maven Central/pods distribution channels with documentation?
  1. The linked SECP256K1-CL is a fork of sipa’s (Pieter Wuille) optimized ECDSA library for Bitcoin. Is your modification of batching & parallelizing being used anywhere? Is your fork being maintained? (The last commit was 7 years ago.)
  1. Are you working on this alone @hanh or will there be a team working on this per the “we” language?
  1. Is your plan to get compensated at $15,000 per month for 11 to 17 months?

Can we have developers from ECC/ZF comment on the initiative, to check whether anyone else has pursued hardware acceleration improvements and whether any existing GPU computing methods can be reused? @daira @dconnolly @str4d
There’s work done towards accelerating the ECC for Zcash in ZcashFoundation/zcash-fpga (Zcash FPGA acceleration engine) on GitHub, and the documentation is quite detailed: zcash-fpga/zcash_fpga_design_doc_v1.4.2.pdf.
IMO It would be preferable to have a detailed spec for any math related improvements/optimizations to evaluate and audit the output.

–
@ZcashGrants since this proposal was submitted just 1 day before the ZOMG bi-weekly meeting, I request giving the community ample time to comment on this research-driven proposal and to discuss the benefits of the work to the Zcash community for paying top coin, or will this be another throw-away code of changes with the math/acceleration.


GPUs have a large number of arithmetic cores (ALUs) but less control-flow hardware. In order to use the ALUs efficiently, the code has to be written to maximize parallelism. The work includes choosing and implementing appropriate math algorithms.
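One common way to suit code to limited control-flow hardware is to replace data-dependent branches with arithmetic. The sketch below shows modular addition written both ways. It is an illustration on the CPU under toy assumptions: a single 31-bit modulus stands in for a real multi-word field modulus, and the function names are hypothetical. Compilers often perform this transformation on their own for simple cases, so whether it matters on a given device is something profiling would have to show.

```rust
/// Toy modulus for illustration only (assumption: real field moduli are about
/// 255 bits and would be split across several machine words).
const P: u32 = 2_147_483_647; // 2^31 - 1

/// Modular addition with a data-dependent branch. Inputs must be < P.
fn add_mod_branchy(a: u32, b: u32) -> u32 {
    let s = a + b; // cannot overflow: a, b < 2^31
    if s >= P { s - P } else { s }
}

/// Same result without a branch: subtract P unconditionally, then add it back
/// under a mask when the subtraction borrowed. Inputs must be < P.
fn add_mod_branchless(a: u32, b: u32) -> u32 {
    let s = a + b;
    let (d, borrow) = s.overflowing_sub(P);
    let mask = (borrow as u32).wrapping_neg(); // all ones if s < P, else zero
    d.wrapping_add(P & mask)
}

/// Element-wise batch version: every element follows the same instruction path.
fn add_mod_batch(a: &[u32], b: &[u32]) -> Vec<u32> {
    a.iter().zip(b).map(|(&x, &y)| add_mod_branchless(x, y)).collect()
}
```

Both variants return the same values; the branchless one simply keeps every lane on an identical instruction sequence.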

I’m not sure what you mean by data structures used in Jubjub & Pasta. Could you clarify the question?

Thanks, WarpSync is 2x to 10x faster than BlazeSync. I’d be glad to participate in an official benchmark with ZecWallet, NightHawk, and ZWallet.

No, but initial investigation shows that it is not really different from regular C code. It should be easier than using assembly language, for instance. OpenCL is quite portable.

Most likely not. Games use GPU and they don’t require special permissions.

GPU usage started at least 5 years ago. Nowadays there are several APIs (Vulkan, RenderScript, etc.). I intend to use OpenCL because of its maturity and widespread support.

For zcash, I don’t think so since there is no implementation on GPU.
For bitcoin, I have benchmarked parallel batch signature verification on desktop.

A big part of the work here is to benchmark and optimize. But obviously, we need to implement it first.
Btw, the advantage of the GPU will show if we can parallelize the work. Offloading sequentially won’t help.
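A back-of-the-envelope cost model shows why sequential offloading does not help. Every number and name below is a made-up placeholder, not a measurement: the model only assumes that each GPU dispatch carries a fixed overhead and that one dispatch can process many items at once.

```rust
/// Hypothetical cost model. All fields are illustrative placeholders,
/// not measurements of any real device.
struct CostModel {
    dispatch_overhead_us: f64, // fixed cost per GPU dispatch (transfer + launch)
    gpu_item_us: f64,          // time for one GPU lane to process one item
    gpu_lanes: usize,          // items processed concurrently per dispatch (> 0)
    cpu_item_us: f64,          // time for the CPU to process one item
}

impl CostModel {
    /// CPU processes the items one after another.
    fn cpu_total(&self, n: usize) -> f64 {
        n as f64 * self.cpu_item_us
    }

    /// One dispatch per item: the overhead is paid n times and one lane is busy.
    fn gpu_sequential(&self, n: usize) -> f64 {
        n as f64 * (self.dispatch_overhead_us + self.gpu_item_us)
    }

    /// One dispatch for the whole batch: the overhead is paid once and the
    /// items run in waves of `gpu_lanes`.
    fn gpu_batched(&self, n: usize) -> f64 {
        let waves = (n + self.gpu_lanes - 1) / self.gpu_lanes;
        self.dispatch_overhead_us + waves as f64 * self.gpu_item_us
    }
}
```

With placeholder values of 100 µs overhead, 50 µs per GPU item, 1024 lanes and 10 µs per CPU item, 10,000 items cost 100,000 µs on the CPU, 1,500,000 µs when offloaded one at a time, and 600 µs when offloaded as a single batch.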

I had a discussion with @str4d on the R&D discord. I don’t want to quote him without permission but he seemed interested for Sapling and Orchard too.

Just like librustzcash. Portable Rust code that can be interfaced from Android and iOS. I will dogfood it in ZWallet for sure.

At that time, I was more involved in the Bitcoin community and I worked with Pieter Wuille and Gregory Maxwell. The result is this prototype. I haven’t touched it since, but it should still work. The math hasn’t changed.
I quote this project for its similar approach, but not because the deliverable should be in the same form.

I welcome any help. Moreover, if anyone wants to do this project (even without me), I’ll be backing their submission as long as they can show the right skillset.

The next line in the proposal says

Agreed. That’s exactly the purpose of funding the first phase. We can investigate the math, write a prototype, evaluate and audit the results. And then decide whether to proceed with the rest.

Yes, there is no hurry at all. I believe it is fairly priced with clearly defined objectives and deliverables. @ZcashGrants This proposal could also be a bounty.

What other “throw-away code of changes with the math/acceleration” are you referring to? FPGA is not throw away. It helped ASICs for sure.


I have developer options enabled on my mobile, GPU will always be available to apps that request it. Happy to see this.

The data structure support in the platforms/devices for OpenCL or any other framework that you are planning to build on. If you are sticking to OpenCL, it should support all kinds of data types, versus experimenting with Vulkan/ML Kit. Maybe you can go over it in detail.

I don’t see the point in benchmarking with wallets that are not optimized. IMO it would have been beneficial for the broader Zcash users had you contributed WarpSync upstream, so wallets that depend on full calculations using ECC SDKs like Nighthawk & Unstoppable would have benefitted with your improvements. Not a problem now, as there are even faster variants of fast sync in the works. I have made the point earlier about how the small Zcash dev community should learn to fill the gaps and contribute to building the software that enables financial privacy to the world.

Okay, do you agree it would be better to apply for this grant with a working barebones PoC lib, integrated into a client app to check what it takes to offload dummy tasks to the GPU, before embarking on paid research for a PoC?

Per your own recent comment in ΚΣ Labs / 1y Fellowship :

You stated this in a very confident manner, to have technical specifications ready for a research driven proposal.

Good, but I don’t believe @str4d would be the end user of the output of this research, sure he can help verify and port the GPGPU support upstream as your grant doesn’t include the creation of ready-to-use libs for wallet developers. Would you be willing to collaborate and chart a path for your work for use in existing Zcash toolchain?

This is one primary concern with this grant: without connecting with wallet developers and getting their feedback, how do we know if the output of this work will benefit Zcash users? And how do we avoid a Ledger-grant-like situation where the underlying cryptography work is paid for, done & verified, but the output remains unusable and unavailable for the broader Zcash community?

So it will be you working by yourself. Do you have a mentor or a researcher with whom you will be sharing your progress? You have good documentation skills as seen in the WarpSync writeup, maybe you can help document the tech specs planned out for this grant like the one for the FPGA improvements?

This is a secondary concern, 3 months ago Han had proposed to integrate ZEC to BTCPayServer for $80,000. ZF Grants - Payment Gateway
Which was replaced with another grant to add ZEC support to CoinPayments where the rate was more than doubled to $192,000 ZF Grants - CoinPayments Integration which ran into the identified execution risk of “Not much risks except the dependencies on vendor participation.”
Can you provide an update on the status of CoinPayments/BTCPayServer work? Are you going to help add Zcash support to BTCPayServer upstream repo so BTCPayServer users can start accepting payments in ZEC?

The rates for month 2-16 of this grant are unknown. It would be good to have an approximate estimate of the ask, it can always be extended if the work needs more time.
Otherwise, the ask is to start work on this PoC with $15,000 in funding and an approximate 11-16 month timeline to deliver the end result (which can turn into a full-fledged $240,000 payment for 16 months at $15,000/month). I am all for long-term funding of highly skilled individuals who can collaborate & contribute to Zcash. Keep in mind, though, that several grants for projects with a lower burn rate per team member (and possibly higher ROI) than this R&D grant were struck down recently.


I’m sorry but I still don’t understand what data structures you expect an ECC library to have and what it has to do with Vulkan. Could you give an example?

The API I intend to expose will deal with Hashes and Points in Affine and Extended form but most of it can be treated as opaque byte arrays.
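To illustrate what "opaque byte arrays" could look like at the API boundary, here is a sketch of fixed-size wrapper types and a pack/unpack pair that flattens points into one contiguous buffer. The type names, the 32-byte sizes, and the functions are assumptions for illustration, not the actual API.

```rust
/// Hypothetical opaque wrappers. The 32-byte sizes are assumptions made for
/// this sketch; callers never look inside the bytes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct HashBytes(pub [u8; 32]);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AffinePointBytes(pub [u8; 32]);

/// Flatten points into one contiguous buffer, the shape a device upload
/// would typically want.
pub fn pack_points(points: &[AffinePointBytes]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(points.len() * 32);
    for p in points {
        buf.extend_from_slice(&p.0);
    }
    buf
}

/// Split a contiguous buffer back into points. Returns `None` if the length
/// is not a multiple of 32 bytes.
pub fn unpack_points(buf: &[u8]) -> Option<Vec<AffinePointBytes>> {
    if buf.len() % 32 != 0 {
        return None;
    }
    Some(
        buf.chunks_exact(32)
            .map(|c| {
                let mut a = [0u8; 32];
                a.copy_from_slice(c);
                AffinePointBytes(a)
            })
            .collect(),
    )
}
```

Keeping the boundary types as plain bytes means the same interface can be bound from Android and iOS without exposing field or curve internals.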

  • BlazeSync is an optimization. ZecWallet is clearly optimized for sync. I guess you don’t want NightHawk and Unstoppable to participate?
  • Users see the performance and don’t care if the developer has optimized or not
  • Benchmarks provide a baseline for further work. You need a benchmark to measure your progress against it.
  • I’m replying to your usage of the word slightly

I feel it downplays the improvement achieved when the increase is at least 2x.

WarpSync is open sourced in its own crate with only dependencies on librustzcash. This is as upstream as it should be. Users of librustzcash don’t need to use WarpSync if they don’t want it, and if they do want it, there are no extra dependencies beyond what they probably already use.

Even librustzcash is moving in the direction of breaking down into smaller crates.

Whether ECC SDK, NH or U want to use WS is beyond my control. If you expected a drop-in replacement of the existing sync algorithm, I’m afraid that it is not the case. WS uses a different API because it needs to.

I think we should try to increase the size of the Zcash dev community instead. I have provided a list of suggestions during the ZOMG - What to build meeting. I’d rather not have developers use libraries without understanding how they work, because they are too busy.

It’s good to hear. Again, I’d like to have an open set of scenarios and metrics in order to objectively measure sync progress. It’d be best if they were chosen by the user community with participation from wallet owners. At this point, it is impossible to include ZecWallet or NightHawk because one has to share the seed phrase. That’s obviously not acceptable for a public study. Adding support for viewing keys would go a long way towards transparency.

That’s the goal of the first phase. I expect people to be compensated for POC work. It’s standard practice.

This project requires skills in:

  1. zcash protocol: to understand which part of the protocol can be improved
  2. software engineering: define API, integrate with platform libs, optimal data structures, parallelism, concurrency
  3. cryptography: knowledge of ECC, field math and the various algorithms used to speed up operations (algorithmically and mathematically)
  4. GPU programming: Which functions would benefit from being on GPU, how to use VRAM efficiently

I think it’s a rare combination of skills, so I provided a link to a previous project in order to give the community prior work experience in these areas.

Warp Sync was about 1-3. This adds 4.

If you want, I can give you technical specifications for phase 1. The project goal and deliverables are well defined already.

I will provide an example of integration since it will be in ZWallet as a separate crate. To be clear, it will have a different API but it will not depend on WS. It will be similar to the jubjub crate.

As it stands, I would be working by myself.

I will share the progress with the community as always.

Andre and Michelle have provided an update in CoinPayments Integration - #10 by ml_sudo

Unfortunately, as you said, it ran into the risk mentioned as the vendor decided to go a different direction.

The ZEC/BTCPayServer was not funded.

I offered to look into reusing some of the work done in the CP integration but on top of BTCPayServer.
Once I have a proposal, I will submit it on the portal.

I’m 100% OK with that. Grant money should be used on the projects that bring the most value to Zcash.

On a parting note, I’d like to say that even 7 years after, secp256k1-cl is my most starred and forked repo. I think it’s because:

  • there is still ongoing interest in ECC on GPU (last star was a week ago)
  • it’s difficult to do.

Great, is this done by Nighthawk team? Can you show us the progress? Or if not, when do you think you can show it to the community? Cheers.

Btw I am supportive of this grant as @hanh has clearly prioritized working and sharing the progress with the community, especially regarding Zwallet.


I agree with @aiyadt that it would be really helpful to get feedback from ECC (@str4d @daira) and ZF (@teor @dconnolly) on the overall likelihood of this bringing major improvements to end users.

@hanh my questions for you are:

  1. My understanding is that your work on this will become part of librustzcash, or that it will be really easy to integrate into zecwallet-light-cli or the official mobile SDK that Nighthawk uses once it is complete. Is that correct? Where would it be merged to? librustzcash? Can you confirm with ZF and ECC that they would merge these changes if this was completed and working? Knowing this up front, and uncovering any reasons why they might not want to merge these changes, would help mitigate the risk that this work will not be useful to a large portion of Zcash users. It will be helpful to have some indication that, if successful, this work is likely to land in mainline Zcash tools used by most or many projects.

  2. How vulnerable is this work to future changes in the protocol? What are the chances of the operations you are optimizing being abandoned or changing in some way in the time it takes to complete this work and “land” it in the broader set of tools the ecosystem uses? Is the protocol still changing too much for investment in this kind of optimization to be worthwhile? It would be great to have an opinion on this from ECC/ZF as well.

  3. On what order do you expect the performance improvements to be? 10x? 100x?

  4. What are the primary UX issues you hope to address with this work (initial sync time? tx sending time?) and can you give us some numbers on likely impact in those cases? Are you sure this is the lowest hanging fruit for addressing those issues?

  5. Am I correct that after one month you’ll have data showing us how promising this approach is likely to be? How much confidence will we be able to put in that data? And if you know after one month how fast it’s really going to be when complete, help me understand the work that comes after that. It sort of sounds like you’re saying “We won’t know how well this works until it works” and also “I’ll know how promising this path is after 1 month”. I can see ways that both of these can be true, but I want to know more about this.

Also thanks @aiyadt for kicking the tires on this grant and advocating for prioritizing reusability of code across the ecosystem. And thanks @hanh for engaging so eagerly with questions and feedback!


The Filecoin Project has a fork of bellman that already supports GPU-based Groth16 proving on BLS12-381: filecoin-project/bellperson (a zk-SNARK library) on GitHub. bellperson is licensed in the same way as bellman (MIT/Apache).

bellperson supports a wide range of GPUs and has been well tested on the Filecoin network. The reason it hasn’t been adopted in zcashd and the mobile SDKs is lack of time/bandwidth on ECC’s part (mainly, to resolve any conflicts between the bellman / bellperson and pairing / paired code bases), rather than because more work is needed on the GPU-related code itself. I believe that code is generic enough that it could be adapted to Pasta and Halo 2, although that might require a significant amount of work.


We’ll have an update and possibly apply for a grant when a PoC is ready.

In your Milestone 1, you stated “Code an OpenCL library that implements the field arithmetic operations on the Gallois Field of Jubjub. - Benchmark batch operations compared to CPU only Use ARM based mobile GPU (Adreno)”.
It would be important to have a technical spec and support by your chosen framework to build on for the PoC. But it seems like you want to work on that as part of the PoC grant.

The point isn’t about not wanting to participate or writing up app comparison tables which can be misunderstood. It is about having access to re-useable code which can be shared by Zcash Builders, which ultimately benefits the broader Zcash community VS making Proof of Concepts.
But your point stands: benchmarks indeed provide a baseline for further work. That is why we (LCWG Mobile SDKs) have been pushing for improvements which will most likely be bundled with support for Orchard addresses.

It wasn’t my intention to downplay your efforts. In my wallet testing, BlazeSync performance was apt enough VS WarpSync, it may be due to the new wallet/length of history.

Absolutely, Zcash needs many more builders. But not at the cost of fragmentation, or no one will know where to build from.

Thanks for sharing your concern. It would have helped if you had communicated and let me or @adityapk00 know.

I’m glad you have the experience in all 4 and hope you respect the experiences from other grant writers when they come up with their proposals.


I was under the impression that this kind of work would be covered by the existing R&D grant to NH. So far, it has been mostly UI updates: new transaction page, new logo, etc. UI is important but IMO, sync performance is an even higher priority.

Of course, feel free to submit another grant if you disagree.

Yes, the GPU ecosystem on mobile is a hot mess. It is much more fragmented than on desktop where CUDA and OpenCL dominate. I should have clarified this.

Code can be reused in a variety of ways. I think that the dev ecosystem should work towards offering more choices with pros and cons rather than one library.
With no alternate choice, apps end up simply wrapping the library.

For example, in the early days of JPEG, there was a single library capable of decoding. It was the reference library provided by the JPEG group. Every image app was built around it. Then came ACDSee which was orders of magnitude faster because it had its own decoder. Eventually, the competition picked up, and now the JPEG group library serves as a reference implementation only (as it was meant to be).

I understand that some wallet devs would like to have a simple SDK and focus on other things. Not everyone should be a core dev.

But on the other hand, saying that ALL of the core work should fit the current API limits innovation.

I am not asking any wallet to participate in a benchmark if they don’t feel they are ready.
However, if one claims it is 2x or 5x or 10x faster than XYZ, it must expect to be benchmarked.

At that time, we should have clear benchmarks with several goals:

For example,

  • time to correct wallet balance. Notes may not be spendable yet.
  • time to first note spendable
  • time to all notes spendable

Different reference hardware and wallet composition (age, number of tx, number of notes)
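One way to make these measurements comparable across wallets is to record each run in a fixed shape. The sketch below is a hypothetical record type built from the metrics and wallet attributes listed above; the type, field, and function names are illustrative and not an agreed format.

```rust
use std::time::Duration;

/// Hypothetical record for one benchmark run. Field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct SyncBenchmark {
    wallet: String,
    hardware: String,
    wallet_age_days: u32,
    tx_count: u32,
    note_count: u32,
    time_to_correct_balance: Duration,
    time_to_first_spendable: Duration,
    time_to_all_spendable: Duration,
}

impl SyncBenchmark {
    /// Sanity check: the first note cannot become spendable after all notes are.
    fn is_consistent(&self) -> bool {
        self.time_to_first_spendable <= self.time_to_all_spendable
    }
}

/// Ratio of a baseline time to a candidate time for the same metric;
/// values above 1.0 mean the candidate is faster.
fn speedup(baseline: Duration, candidate: Duration) -> f64 {
    baseline.as_secs_f64() / candidate.as_secs_f64()
}
```

Reporting the three times separately avoids the ambiguity of stopping the stopwatch at different points for different wallets.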

I believe I mentioned it in the forum already.

Well, that’s why we need the above-mentioned benchmarks. I can also provide cases where BlazeSync is 10x slower. Though in general, I’d say that BlazeSync’s performance is good.

However, with BlazeSync, people may stop the stopwatch when they see the correct balance. AFAIK, BlazeSync continues syncing and all the notes can’t be spent yet (?) @adityapk00

IMO, it’s not a matter of respect but due diligence. I apologize to anyone who was hurt.
I want to evaluate grants on these factors:

  • What value does it provide the zcash community?
  • Will the team be able to execute and deliver the project?
  • Is the cost worth the value?

If it’s not the case, maybe the grant can still be adjusted. Therefore, we need to ask questions and work out a deal.

Edit: Replying to @daira

I have to say I’m a bit miffed that Filecoin has a GPU accelerated implementation of Groth16 before Zcash when Groth16 was invented by the Zcash team. If the ECC team doesn’t have the time and resources, and the project is deemed worthwhile, maybe this is something that the MG was created for.


This is incorrect. The 10 month Nighthawk grant was especially for Design & Development to maintain, support and ship the wallets & contribute to SDK development. And I agree, fast syncing is important, that’s why LCWG is making efforts to implement an optimal syncing algorithm.

Additionally, Nighthawk has kept the promise to post updates on each grant every quarter and taken active part in monthly calls per ZIP 1014: Establishing a Dev Fund for ECC, ZF, and Major Grants.

More choices are better, and first, I believe Zcash needs to develop standardized toolchains for each platform so new developers who want to interface can do it quicker.

Hi @hanh,

ZOMG is really interested in this proposal, but we want to make sure it gets used and, as I’m sure you can appreciate, it’s hard for us to evaluate this proposal without a lot of outside input from qualified people.

Can you try to give a reply to each of my questions above? I think those are the answers I need to make the case for this in the next ZOMG meeting. (And I think if I’d had answers to them in the last meeting there’s a chance we could be farther along at this point.)

I’m really conscious of this having dragged out and that’s not the experience we want to give you, as an applicant. I hope we can get everything we need lined up before the next meeting!

Also, what is your response given the information that daira provided above re: the bellperson library? Does this change anything about the proposal?

@daira – do you think it would be fruitful for @hanh (and worth community funding through ZOMG) to pursue adapting this GPU-based Groth16 library from Filecoin to Zcash?

@dconnolly @teor does the Zebra team have a perspective on this?


Two points of clarification:

  • Groth16 was not invented by us; it is a proving system created by Jens Groth in a 2016 paper. I think it’s fair to say that it was popularised by us, due to us writing the bellman crate (though nowadays there are several other proving system impls).
  • Filecoin’s GPU acceleration was driven by their needs: onboarding disk space to their network requires very heavy circuits that are computationally intensive to prove, so they were incentivised to make that as fast as possible. Zcash Sapling is a much smaller circuit by comparison, and proving times of a few seconds are perfectly sufficient for most desktop users, so there wasn’t an incentive to spend engineering resources to accelerate that further.

By the time the Zcash mobile SDKs were first released, I think Filecoin had already done their GPU backend work, so there was then no incentive for us to reinvent the wheel - we should leverage their work, in the same way they are leveraging ours. To that end, I’ve recently (in spare time around NU5 work) been upstreaming various changes from Filecoin’s forks (and discussing those changes with the Filecoin devs), to benefit both Zcash (we get useful improvements - yay open source!) and Filecoin (instead of maintaining forks, they can contribute to a shared crate ecosystem). The first set of those changes has just been released in the latest updates to ff, pairing, and bellman.

It is my personal goal to have Filecoin’s GPU backend upstreamed into our Sapling stack by the end of the year. Right now, that goal is limited by the time I have to spend on it; additional people working on or reviewing PRs would probably make this more likely to occur.

As @daira notes, some of this upstreaming work will likely be generalisable to Orchard; I’m hoping that the GPU field arithmetic backend will end up in the ff crate as a feature that can be used by both the bls12_381 and pasta_curves crates. But my initial goal for the upstreaming process is to enable bellman to be feature-compatible with bellperson, rather than to add GPU proving to halo2; the latter will likely require GPU knowledge that I don’t currently have.

I think it would therefore be useful to figure out whether Filecoin’s GPU backend could be extended to mobile GPUs, or whether the latter will need a separate backend. That would inform whether it is better to have the grant be linked to the upstreaming work (and/or contributing to it), or an independent development effort. I definitely want to ensure that any funded grant in this area will end up in our crates, so we should proceed in any case on the assumption that the Filecoin GPU code will be part of that ecosystem.


I spent an hour or so last night looking over the Filecoin repos, and I think the relevant place to start looking for this is filecoin-project/ec-gpu on GitHub (an OpenCL code generator for finite-field arithmetic over arbitrary prime fields), which the Filecoin devs are currently migrating from their crate forks to the latest ff etc. crates, and I think possibly planning to add EC codegen as well (moved out of bellperson). So it definitely seems like this would be a good place to start.


Feedback from Filecoin dev dignifiedquire:

ec-gpu-gen generates working opencl for both nvidia, intel and amd devices, so any tweaks needed on that front can totally be PRed into there.

For the actual execution abstraction we are building filecoin-project/rust-gpu-tools on GitHub (Rust tools for OpenCL and GPU management), which makes it nicer to spawn kernels and such things, and abstracts over opencl/cuda calling conventions. So that should need only some smaller tweaks to run on mobile as well I would expect

  1. My understanding is that your work on this will become part of librustzcash, …

This work would not be directly integrated into librustzcash because it is related to fundamental arithmetic operations on integers and elliptic curves. Upon these, zksnarks proving systems are built. And then the proving systems are used to build circuits and finally zksnarks. The current relation between crates is roughly ff → jubjub → bellman → librustzcash.

  1. How vulnerable is this work to future changes in the protocol?

The work proposed here is at the level of ff, jubjub and bellman.

ff (finite fields ~ integer math) is used by all ECC. jubjub is a specific curve but the other curves have a similar structure.
There are properties that do not exist in all ECC, though I don’t think they are blockers.

  1. On what order do you expect the performance improvements to be? 10x? 100x?

It will depend on the scenario. I think 10x or even 100x are possible.

  1. What are the primary UX issues you hope to address with this work …

Initial sync time would benefit from a faster Pedersen Hash Batch implementation. And tx sending time would benefit from a faster circuit implementation. Though for sapling, the circuit is quite short. But on mobile devices, where resources are more limited, making a tx with a lot of outputs can take minutes.

  1. Am I correct that after one month you’ll have data showing us how promising this approach is likely to be? …

The first month would be about porting the OpenCL FF library to Mobile and profiling the results. IMO, this is an important step
because it clarifies which toolkit and platform is best suited for Mobile dev. On Desktop, the situation is fairly simple.
There are several Rust libraries that allow you to do GPU work with minimum friction. On Mobile, there aren’t any (AFAIK).
So even simple things like getting the OpenCL library runtime aren’t obvious. Which brings up the question: Should we even use OpenCL? Trying and testing other libraries is part of step 1.

The performance numbers for ff are useful for deciding if it is worth proceeding further. When we have ECC math working, it will be more accurate.

For instance, if we were building a racing car, we could work on the engine first. With good engine performance, we can hope that the racing car will be successful. However, there are many other components that are still left to build and that can limit the engine.

zksnarks rely on ECC math and ECC math depends on FF math. Therefore, we start with FF then ECC (jubjub, pasta) then proving systems (sapling, halo). FF is smaller than the rest and has no dependencies. IMO it’s a good starting point. Filecoin has an implementation for OpenCL (desktop). It will help too.


As currently implemented, neither jubjub used for Sapling nor the pasta_curves used in Orchard use the derive-able prime order field implementations from ff, they implement the traits provided by ff explicitly. bls12_381 which is used inside the Sapling circuit does not use ff at all. bellman contains batch Groth16 proof verification implementations that may also need to be updated. jubjub, pasta_curves, bls12_381, bellman, and maybe orchard would all have to be ported to use any OpenCL targets produced.
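The difference between deriving a field implementation and writing the trait implementation by hand can be pictured with a heavily simplified sketch. `ToyField` below is a hypothetical stand-in, far smaller than the real traits in the ff crate, and `F13` is a toy type. The point is that each hand-written implementation carries its own arithmetic, so each would need its own port to a new backend, while code written against the trait stays unchanged.

```rust
/// Hypothetical, heavily simplified stand-in for a field trait
/// (the real traits in the ff crate are much richer).
trait ToyField: Copy {
    const ZERO: Self;
    fn add(self, rhs: Self) -> Self;
    fn mul(self, rhs: Self) -> Self;
}

/// A toy field type that implements the trait by hand rather than through
/// a derive macro. Values are kept reduced mod 13.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct F13(u8);

impl ToyField for F13 {
    const ZERO: Self = F13(0);
    fn add(self, rhs: Self) -> Self { F13((self.0 + rhs.0) % 13) }
    fn mul(self, rhs: Self) -> Self { F13((self.0 * rhs.0) % 13) }
}

/// Code written against the trait does not care which type implements it.
fn inner_product<F: ToyField>(a: &[F], b: &[F]) -> F {
    a.iter()
        .zip(b)
        .fold(F::ZERO, |acc, (&x, &y)| acc.add(x.mul(y)))
}
```

Here `inner_product` would work with any other `ToyField` implementation, but the arithmetic inside `impl ToyField for F13` is exactly the part that a new backend would have to replace.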

@str4d can correct me if any of that is inaccurate
