ZCash GPU mining faster than CPU mining?

How is Equihash different from Ethereum’s mining algorithm?

Both are based on memory hardness and memory bandwidth. Ethereum’s mining is basically all done on GPUs at this point.

So why would we expect anything different from Equihash?

Edit:

Also, GPUs are roughly 100 times more efficient than CPUs for mining Ethereum at the same hardware price. So we should expect the same for ZCash, correct?

Update:

OK, I think the efficiency gap comes from the typical laptop’s bottleneck for ZCash mining being the number of cores rather than memory. On a MacBook Pro Retina, you’d only have 2 cores, so you can’t even utilize all the RAM. Whereas on a GPU at a quarter of the price, you can get 32 cores with 5-10x the memory bandwidth. So the multiplier might be something like 16 * 10 * a price multiplier. Someone correct me if I’m wrong?
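
Here’s that back-of-envelope as a quick Python sketch; all the inputs (core counts, bandwidth ratio, price ratio) are my assumptions from above, not measurements:

```python
# Back-of-envelope GPU vs. laptop mining efficiency.
# All inputs are assumptions from this post, not benchmarks.
laptop_cores = 2       # e.g. a dual-core MacBook Pro Retina
gpu_cores = 32         # compute units on a commodity GPU
bandwidth_ratio = 10   # assumed GPU-to-laptop memory-bandwidth edge (5-10x)
price_ratio = 4        # GPU assumed to cost roughly 1/4 of the laptop

core_ratio = gpu_cores / laptop_cores        # 16x
raw_speedup = core_ratio * bandwidth_ratio   # 160x, if the two factors stack
per_dollar = raw_speedup * price_ratio       # upper bound per dollar spent

print(f"{core_ratio:.0f}x cores * {bandwidth_ratio}x bandwidth "
      f"* {price_ratio}x price = {per_dollar:.0f}x per dollar")
```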

Update 2:
Does anyone have benchmark numbers for how long context switching takes during Ethereum/ZCash mining? And how often do context switches tend to happen, etc.? That may prove or disprove the core-bottleneck hypothesis.

If GPUs are that much faster than CPUs at mining Ethereum, then their PoW can’t be as memory-hard as they hoped.

Ethereum is actually more memory-hard than ZCash: Ethereum has a 1 GB per-thread memory requirement, whereas ZCash’s is 700 MB per thread with the current parameters.

Amount of memory is too simple a metric to make that call. I’ve only skimmed an Equihash paper or two but even that was enough to help me understand your assertion is poorly founded.

Yeah, I haven’t read the paper either, only skimmed it.

But these are higher-level, very important issues; someone should be able to explain them in a few sentences rather than 13 pages.

Also, the abstract says basically the entire idea is based on memory and memory bandwidth, so even without reading the paper I think the assertions are well-founded.

Also, the only relevant things to know are what the specific bottlenecks are. And some forum members seemed to confirm my initial assumptions.

The specific bottleneck is memory bandwidth and while graphics memory is faster (at least for sequential accesses) than system memory, it isn’t 100x faster. If that comment about Ethereum mining is accurate, then clearly the Ethereum POW also relies heavily on processing in such a way that GPUs are favoured - whether intentionally or not.

[quote=“Voluntary, post:6, topic:1300”]
The specific bottleneck is memory bandwidth and while graphics memory is faster (at least for sequential accesses) than system memory, it isn’t 100x faster. If that comment about Ethereum mining is accurate, then clearly the Ethereum POW also relies heavily on processing in such a way that GPUs are favoured - whether intentionally or not.[/quote]

Yes, the comment about Ethereum mining is accurate when comparing typical laptop hardware to commodity GPU hardware. Ethereum’s PoW doesn’t rely on processing that favors GPUs any more than ZCash’s does.

The memory-bandwidth bottleneck matters only if the number of cores is not the bottleneck. But on the hardware the typical computer user has around, the number of cores actually is the bottleneck; hence the 100x efficiency. In other words, yes, memory bandwidth is only 5-10x better, but typical consumer hardware vs. dedicated hardware turns out to have a ~100x efficiency gain per unit price, because the number of cores is the bottleneck, not memory bandwidth.
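
To make that concrete, here is a minimal toy model (my own sketch, with made-up illustrative numbers): throughput is the minimum of what the cores can produce and what the memory bus can carry, so once the core limit is the lower one, extra bandwidth buys nothing.

```python
def hashrate(cores, threads_per_core, secs_per_hash, bandwidth_gb_s, gb_per_hash):
    """Toy model: effective hashrate is capped by the tighter of two limits."""
    compute_limit = cores * threads_per_core / secs_per_hash  # hashes/s from cores
    memory_limit = bandwidth_gb_s / gb_per_hash               # hashes/s from the bus
    return min(compute_limit, memory_limit)

# Illustrative assumptions only, not benchmarks:
laptop = hashrate(cores=2, threads_per_core=1, secs_per_hash=30,
                  bandwidth_gb_s=25, gb_per_hash=5)
gpu = hashrate(cores=32, threads_per_core=1, secs_per_hash=30,
               bandwidth_gb_s=250, gb_per_hash=5)
print(f"laptop {laptop:.3f} H/s vs GPU {gpu:.3f} H/s")  # both core-bound here
```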

For those that don’t choose to ignore the meaning of a POW that is actually bound by memory bandwidth and not merely advertised as such, additional cores don’t give additional memory bandwidth. GPUs certainly do outperform CPUs at particular tasks but when a task is being completed 100x faster and the memory speed difference is only 10x faster, that task is clearly not memory bound.

It puzzles me why this should not be obvious. Now, bearing all this in mind, it is true that video memory chips tend to be faster than what is standard at any given time: when DDR2 was the norm for CPUs, DDR3 was being put into a lot of video cards.

But this does not give a 100x advantage. At absolute best, the advantage is a factor of the clock-per-byte rates and the number of channels. As far as I know, no optimisation exists on any system that lets you choose how your processor (GPUs included) accesses the memory channels; as far as I know, that is baked into the memory-controller chips.

This is something that could be discovered by intrepid, well-heeled testers, but I don’t know of any way currently to control it. Possibly there is some way to identify the memory regions where a channel lives, and that could achieve this end. If it is possible, someone will do it. But one computer with 4 fast cores and 4 memory channels, running a memory-optimised algorithm that restricts each thread to a core, is going to be up against a dual-core CPU with a single channel, and the second computer will cost less than a quarter of the first.

What I think is awesome and very helpful is that this tightens the price/performance criteria for mining down to two things: the cheapest hardware that can execute a thread with a given amount of memory bandwidth, and the amount of power that system requires to run.

Outside this, the price-to-hashpower optimisation mainly just becomes a matter of watts = hashpower. I believe this is the objective, and Ethereum isn’t the first to try; Litecoin also makes an attempt at memory hardness. I have benchmarked a machine with only a 1.2 GHz quad-core, and its Equihash performance in z8 was not too shabby compared to what people with 3+ GHz quads and 1600 MHz memory got (mine has 1333 MHz).

So far, from what I have directly seen, ZCash’s PoW lives up to the hype and makes ricing your rig a waste of time.

Exactly. I strongly suggest @overclocked read the paper first and then come back to talk about this insane 100x number. One more thing I want to point out: zooko seems concerned about GPUs and is paying close attention to this, and the PoW is subject to change if GPUs take too much of an advantage. The PoW could change, e.g., by increasing 700 MB to 2 GB per core in the future if GPUs jump in. Given that current GPUs have less memory than your PC’s RAM (the highest is still 8 GB GDDR5, maybe?), and taking into account that CPUs run at high frequencies compared to GPUs (< 1.5 GHz), the advantage of, say, a GDDR5 card over 4 mining cores on an i7 could be very small, or the CPU could even outperform the GPU. That’s my understanding; correct me if I am wrong.

[quote=“Voluntary, post:8, topic:1300”]
For those that don’t choose to ignore the meaning of a POW that is actually bound by memory bandwidth and not merely advertised as such, additional cores don’t give additional memory bandwidth. GPUs certainly do outperform CPUs at particular tasks but when a task is being completed 100x faster and the memory speed difference is only 10x faster, that task is clearly not memory bound.[/quote]

The ratio right now is 700 MB of memory to 1 mining thread. But let’s say you have a computer with 32 GB of RAM but only 4 cores. Then you may effectively only be able to use 2.8-5.6 GB of RAM if you run 1-2 threads per core. So in this case, it is not in fact bound by memory.
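
The arithmetic behind those figures, assuming the 700 MB-per-thread parameter:

```python
# Usable RAM when cores, not memory, are the limit (700 MB per thread).
mem_per_thread_gb = 0.7
cores = 4
total_ram_gb = 32
for threads_per_core in (1, 2):
    used = cores * threads_per_core * mem_per_thread_gb
    print(f"{threads_per_core} thread(s)/core: {used:.1f} of {total_ram_gb} GB used")
# 1 thread/core -> 2.8 GB, 2 threads/core -> 5.6 GB: the cores bind first.
```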

Well, with the current mining parameters, I wonder why anybody would sanely estimate anything less than 100x efficiency.

Ethereum’s mining algorithm is basically the same in terms of bottlenecks, and the (official?) docs even mention this number. It’s also empirically agreed upon by a lot of people actually mining on dedicated GPUs vs. the typical laptops they have lying around.

Ethash is memory-hard but GPU-friendly. That’s the reason. Any algorithm can claim to be memory-hard as long as it uses lots of memory and bandwidth. The key to being CPU-friendly is designing the algorithm to prevent certain kinds of parallelism.

From the official Ethereum GitHub, the Ethash Design Rationale reads:

GPU friendliness: We try to make it as easy as possible to mine with GPUs. Targeting CPUs is almost certainly impossible, as potential specialization gains are too great, and there do exist criticisms of CPU-friendly algorithms that they are vulnerable to botnets, so we target GPUs as a compromise.

Completely different claims from the ZCash team.

I’m also still trying to make sense of how Ethereum GPU mining is 100 times more efficient than CPU mining. If Equihash can hold GPU mining to only 4-10x more efficient, then that is a great improvement, and I would be super excited to hear the theory behind it. But I have yet to hear a clear explanation of what innovations Equihash has made to make this a reality.

This 100x number for Ethereum appears to come from comparing normal hardware (a MacBook Pro, etc.) to dedicated GPUs. And my hypothesis is that it is because mining is limited by cores.

Yes, this is true, but these mining algorithms also have a lot fewer interrupts, so my hypothesis is that the time spent context switching is long enough that you can’t effectively use all the memory without adding cores on a normal PC/laptop. Thoughts?
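
If anyone wants to gather those context-switch numbers, here is a minimal Linux-only sketch (it reads /proc/self/status, so it won’t work on macOS or Windows); the busy loop is just a stand-in for a mining inner loop:

```python
# Count context switches around a CPU-bound loop (Linux only).
def ctxt_switches():
    counts = {}
    with open("/proc/self/status") as f:
        for line in f:
            if "ctxt_switches" in line:
                key, value = line.split(":")
                counts[key] = int(value)
    return counts

before = ctxt_switches()
x = 0
for i in range(10_000_000):  # stand-in for a mining inner loop
    x ^= i
after = ctxt_switches()

for key in before:
    print(key, after[key] - before[key])
```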

Yes, and they are completely different claims, because Ethereum actually acknowledges that it is “almost certainly impossible” to target CPU mining. From the link you referenced:

We try to make it as easy as possible to mine with GPUs. Targeting CPUs is almost certainly impossible, as potential specialization gains are too great, and there do exist criticisms of CPU-friendly algorithms that they are vulnerable to botnets, so we target GPUs as a compromise.

So, first of all, Equihash is claiming that Ethereum’s statement is in fact false. That is a pretty big claim to make.

Second, Ethereum uses 1 GB of RAM per mining thread, so you can’t parallelize within that 1 GB. ZCash’s current Equihash parameters require only 700 MB per mining thread, so you can in fact fit more parallel threads with ZCash than with Ethereum. So reduced parallelism does not appear to be Equihash’s claimed innovation over Ethash.

You can easily work on the same solve in ETH with multiple cores. The whole purpose of Equihash, invented after ETH, is to make that hard. You need fast access to each 700 MB block, and only one core can be sorting that block at a time. Think about how you would sort a bunch of numbers. How do you do it without keeping all the numbers in memory during the sorting? How would you let another thread work on that memory block of numbers without messing up what the first thread is doing? It’s not easy or efficient. There is a long history of trying to get GPUs to sort as well as CPUs. Some rare expert could do the parallel programming needed to get some benefit on this particular sort task, but it will come at an electrical cost that I’ve estimated in past posts to be over 2x, possibly 4x.
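
To illustrate why sorting sits at the center of this, here is a toy sketch of sort-based collision finding (my own illustration with tiny made-up parameters, not the real Equihash algorithm): the sort drags the whole table through memory, which is exactly where the bandwidth bound bites.

```python
import hashlib

N_BITS = 20      # toy collision width; real Equihash parameters are much larger
COUNT = 1 << 16  # toy table size; the real tables are what fill the ~700 MB

def prefix(i):
    # BLAKE2b is what Equihash builds on, but this indexing scheme is made up.
    digest = hashlib.blake2b(i.to_bytes(4, "little"), digest_size=8).digest()
    return int.from_bytes(digest, "little") >> (64 - N_BITS)

# Sort the whole table, then scan adjacent entries for colliding prefixes.
table = sorted((prefix(i), i) for i in range(COUNT))  # the memory-heavy step
pairs = [(a[1], b[1]) for a, b in zip(table, table[1:]) if a[0] == b[0]]
print(f"{len(pairs)} colliding pairs on the top {N_BITS} bits")
```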

Does AMD OpenCL have something similar to CUDA in terms of an optimized sorting library?

Yes, that is the stated purpose, but it actually makes it even easier than ETH to work with multiple cores. As I said before on another thread, ETH requires 1 GB of memory per thread (you cannot parallelize that 1 GB any further). Equihash, however, requires 700 MB per mining thread with the current parameters. Obviously they can change the parameters later, but the basic premise for ETH and ZCash is the same in terms of preventing parallelization beyond a certain point.

Does AMD OpenCL have something similar to CUDA in terms of an optimized sorting library?

I saw numbers of a 1.5x to 2x efficiency increase for ETH by using CUDA instead of OpenCL on Nvidia.

1 GB for ETH is not similar to 700 MB for ZEC. As sections of the Equihash paper describe, it is “parallelism constrained” and “optimization-free”. A GPU capable of 10x a CPU will average 4 hashes per second when using 200 W, a factor of 1,000,000,000x more expensive per hash than for ETH miners. Electricity costs will not matter until there are 50,000 CPU-equivalents on the network, at which point it will cost $1.50 per coin in electricity.
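
Taking the 200 W and 4 hashes-per-second figures above at face value, the energy cost per hash works out as follows (the electricity price is my assumption):

```python
# Energy and cost per hash from the figures quoted above.
watts = 200.0
hashes_per_s = 4.0
joules_per_hash = watts / hashes_per_s   # 50 J per hash

price_per_kwh = 0.10                     # assumed electricity price in USD
kwh_per_hash = joules_per_hash / 3.6e6   # 1 kWh = 3.6e6 J
print(f"{joules_per_hash:.0f} J/hash -> ${kwh_per_hash * price_per_kwh:.2e}/hash")
```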