Breaking equihash in Solutions per GB second

Speaking only for my solver, yes it discards solutions. That’s the only way I can guarantee a peak memory that’s 47% higher than average.

To guarantee finding all solutions all the time would, I think, require a peak memory usage of many, many GB …

2 Likes

I see no reason why the number of discarded solutions actually matters very much, since the solution rate takes that into account. What matters for performance on given hardware is:

  • the solution rate;
  • the latency to find solutions after a new block header is available;
  • the memory usage (the peak usage is most important).

All of these performance factors need to be considered.
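As a rough illustration of how these factors combine into the "solutions per GB second" figure from the thread title, here is a minimal sketch. The `SolverProfile` type and the memory and latency numbers are hypothetical placeholders, not measurements from any solver discussed here; only the 5.9 Sol/s rate appears in this thread.

```python
from dataclasses import dataclass


@dataclass
class SolverProfile:
    """Hypothetical container for the three factors listed above."""
    name: str
    sol_per_s: float      # net solution rate (discarded solutions already excluded)
    peak_mem_gb: float    # peak memory usage in GB
    latency_s: float      # time to first solution after a new block header


def sol_per_gb_second(p: SolverProfile) -> float:
    """Solution rate normalised by peak memory."""
    return p.sol_per_s / p.peak_mem_gb


# Memory and latency values below are made up for illustration only.
lean = SolverProfile("lean", sol_per_s=5.9, peak_mem_gb=0.5, latency_s=0.2)
greedy = SolverProfile("greedy", sol_per_s=6.5, peak_mem_gb=2.0, latency_s=0.2)

for p in (lean, greedy):
    print(f"{p.name}: {sol_per_gb_second(p):.2f} Sol/(GB*s), latency {p.latency_s}s")
```

In this made-up comparison the solver with the higher raw rate scores worse per GB, which is why the raw rate alone does not settle the question.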

2 Likes

Due to lack of interest, I am retracting my offer to open source my solvers in exchange for Cuckoo Cycle bounty insurance.

Lack of interest? From reading your previous comments, I would think the lack of interest is on your part. There are many people watching who are very interested that you don’t see. You said earlier that you’re not sold on this specific currency. What if a crowdfund of $60,000 were set up in escrow? Would you then consider making your solver public? Many people have not pulled the trigger on the crowdfund because they haven’t seen a developer who’s capable, like you.

2 Likes

@tromp It’s unfortunate that you weren’t able to find a way to release your work. It now seems like the GPU takeover has begun and the only guy who could re-balance the scale for CPU is you.
Do you think you will ever release it, even as a binary (not open source)? I think there may still be a window in which you can make some money off your work, but with a new GPU farm popping up every day, the likelihood of CPU mining being abandoned by the public is looming.

2 Likes

Well, it seems that no matter how you slice it, GPU is going to end up with an edge over CPU that makes CPU a non-viable option, even if heavily optimized. I may be wrong, but that’s what I’m thinking.

If tromp’s 36x improvement is accurate, that would go far toward leveling the scale, even if not achieving parity.

3 Likes

Having completed my initial CUDA port, which just treats the GPU like a many-core CPU and implements library routines not available in CUDA, I can now report that its results perfectly match those of the CPU miner.

On a stock Nvidia GTX980, it’s doing 14.4 S/s,
which is pretty close to jtoomim’s results for an R9 290.

(the 16S/s number I had briefly put there turns out to be a fluke)

3 Likes

I rewrote my CPU miner for speed and can now report the following performance on a Core i7 4790K at 4GHz:

single-threaded: 1.22 Sol/s
8-threaded: 5.5 Sol/s
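To put the two figures above in perspective, a small sketch of the implied thread scaling. The helper name `scaling` is just for illustration; the only inputs are the two rates quoted above. Note that the i7-4790K has 4 physical cores exposing 8 hardware threads, so 8-way linear scaling would not be expected.

```python
def scaling(single_rate: float, multi_rate: float, threads: int) -> tuple[float, float]:
    """Return (speedup over one thread, efficiency per thread)."""
    speedup = multi_rate / single_rate
    efficiency = speedup / threads
    return speedup, efficiency


# Rates quoted above: 1.22 Sol/s single-threaded, 5.5 Sol/s with 8 threads.
speedup, efficiency = scaling(1.22, 5.5, 8)
print(f"speedup: {speedup:.2f}x, per-thread efficiency: {efficiency:.0%}")
```

That works out to roughly a 4.5x speedup, or a bit over half of ideal scaling per hardware thread.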

I also twice rewrote my CUDA solver today hoping to speed it up, but to no avail:-(
The Nvidia GTX980 is still stuck at 14.4 Sol/s, which is no longer that close to jtoomim’s R9 290 solver.
I feel there should be plenty of opportunities for improvement, but it might take a more experienced GPU coder to identify them…

Assuming jtoomim is just about done optimizing his GPU solver,
that gives a GPU/CPU advantage of 5x.

4 Likes

Update on the CUDA solver; some minor improvements have taken it to

17.3 Sol/s on a GTX980, still way behind jtoomim…

A tiny improvement to 17.6 Sol/s on a GTX980

Feels like I’m scraping the bottom with my limited CUDA expertise…

Also a tiny improvement to the CPU miner:

now doing 5.9 Sol/s on 8 threads of a Core i7 4790K at 4GHz

Still 4.5x slower than jtoomim’s OpenCL GPU miner on R9 290…
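For readers following the numbers, a short sketch of what the stated ratio implies. The GPU rate computed here is only back-derived from "4.5x" and the 5.9 Sol/s CPU figure; it is not a measurement reported in this thread, and the function name is purely illustrative.

```python
def implied_rate(reference_rate: float, ratio: float) -> float:
    """Rate implied by 'ratio times faster than reference_rate'."""
    return reference_rate * ratio


cpu_rate = 5.9    # Sol/s, 8 threads on the i7-4790K (quoted above)
cuda_rate = 17.6  # Sol/s, CUDA solver on the GTX980 (quoted above)

# Back-derived estimate, not a reported benchmark.
implied_opencl = implied_rate(cpu_rate, 4.5)
print(f"implied OpenCL rate on R9 290: ~{implied_opencl:.1f} Sol/s")
print(f"CUDA solver would need ~{implied_opencl / cuda_rate:.2f}x to match it")
```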

2 Likes

I forget. Will you apply for the bounty with the CPU miner or what is the plan?

1 Like

There’s a possibility jtoomim will set up a crowdfund for his miners that includes incentives for open sourcing my solvers.

I’ve claimed that the Cuckoo Cycle Bounty Insurance should be valued at no more than 20BTC, considering the unlikelihood of further optimization of my Cuckoo Cycle solvers on GitHub (tromp/cuckoo: a memory-bound graph-theoretic proof-of-work system).

So I’ll enter the contest if people contribute 20BTC towards the Cuckoo Cycle Bounty Fund (which at least for the next 4 years can only be used to pay out bounties).

I found that my single-threaded CPU performance suffers from having atomics enabled by default. I decided to compile separate non-atomic binaries for single-threaded use.

I now get 1.53 Sol/s on a single core of an Intel Core i7-4790K CPU @ 4.00GHz.

Running 4 single-threaded solvers is actually about 12% faster than running a single 4-threaded solver (while taking 4x more memory).
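In terms of the thread's "solutions per GB second" metric, that trade-off can be sketched as follows. The function name is illustrative, and the only inputs are the two ratios stated above (about 12% more throughput for 4x the memory); no absolute memory figure is assumed.

```python
def per_gb_ratio(throughput_ratio: float, memory_ratio: float) -> float:
    """Relative Sol/(GB*s) of configuration B versus configuration A."""
    return throughput_ratio / memory_ratio


# B = four single-threaded solvers, A = one 4-threaded solver.
ratio = per_gb_ratio(throughput_ratio=1.12, memory_ratio=4.0)
print(f"relative Sol/(GB*s): {ratio:.2f}x")
```

So the four separate solvers win on raw rate but deliver only about 0.28x the solutions per GB second, which matters whenever memory rather than core count is the limiting resource.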

And with some new optimization, I’m up to

1.72 Sol/s on a single core of an Intel Core i7-4790K CPU @ 4.00GHz …

1 Like

So I’ll enter the contest if people contribute 20BTC towards the Cuckoo Cycle Bounty Fund (which at least for the next 4 years can only be used to pay out bounties).

How can one contribute?

We need to either set up a crowdfunding escrow, or if people trust me with the funds then I could just publish a bitcoin address for the Cuckoo Cycle Bounty Fund. And I’d pay back the contributors if the threshold is not met at the deadline (in which case I’d request multiples of 0.5 BTC).

1 Like

So on an octa-core CPU, performance could be ca. 13.76 Sol/s (1/3 of GPU)?

Hello tromp,
It looks like an awesomely big quantity of S/GBs, doesn’t it?
Waiting to see some pictures of the GPU miner running smoothly and fast as light.
Thank you very much.