Max and relative GPU speeds based on their RAM bandwidth

zawy · November 12, 2016, 3:27pm

DDR3 1333 maxes out at 21 S/s on the best CPUs running the xenoncat code. DDR3 1600 and DDR4 2400 results are in agreement with this (when using 2 cards).

The Equihash paper says the GTX 480's GDDR5 bus was 134 GB/s vs DDR3 1600, which is 17 GB/s. Using this as a baseline, together with the above CPU observations and the GDDR5 MHz and bus width of each GPU, I get the following:

RX 470 4GB max is 5.4x faster than DDR3 1333 = 114 S/s max (my baseline)
RX 480 4GB bus is 7% faster = 121 S/s max
RX 480 8GB bus is 21% faster = 138 S/s max (best buy?)
R9 270x 2 GB bus is 25% faster = 142 S/s max
R9 280x 3 GB bus is 34% faster = 152 S/s max
R9 290 4GB bus is 49% faster = 170 S/s max
HD 7850 2 GB bus is 75% as fast = 85 S/s max
GTX 1070 8GB bus is 20% faster = 136 S/s max (worst buy?)
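
The arithmetic behind the list above can be sketched in a few lines of Python. This is only an illustration: every number is taken from the post, the variable and function names are made up, and it assumes (as the post does) that S/s scales linearly with memory bus bandwidth. The post rounds inconsistently, so the results land within about 1 S/s of the posted figures rather than matching exactly.

```python
# All numbers come from the post; names are illustrative only.
CPU_DDR3_1333 = 21      # S/s, xenoncat code on the best CPUs
RX470_BASELINE = 114    # S/s, the post's rounding of 5.4 x 21

# Bus speed relative to the RX 470 4GB, as listed in the post.
relative_bus = {
    "RX 480 4GB": 1.07,
    "RX 480 8GB": 1.21,
    "R9 270x 2GB": 1.25,
    "R9 280x 3GB": 1.34,
    "R9 290 4GB": 1.49,
    "HD 7850 2GB": 0.75,
    "GTX 1070 8GB": 1.20,
}

# Maximum S/s figures as posted, kept here for comparison.
posted = {
    "RX 480 4GB": 121,
    "RX 480 8GB": 138,
    "R9 270x 2GB": 142,
    "R9 280x 3GB": 152,
    "R9 290 4GB": 170,
    "HD 7850 2GB": 85,
    "GTX 1070 8GB": 136,
}

def max_rate(card):
    """Estimated ceiling in S/s, assuming linear scaling with bus speed."""
    return RX470_BASELINE * relative_bus[card]

for card in relative_bus:
    print(f"{card}: {max_rate(card):.1f} S/s (posted: {posted[card]})")
```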

The Nano and Fury use HBM instead of GDDR5, so I can’t compare them directly, except by using optiminer’s data, where his code has a Nano going 38% faster than where GDDR5 memory would have placed it (127 S/s is where GDDR5 would put it). So I get as future max values:
R9 nano: 187 S/s max
Fury: 374 S/s ??
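
As a rough check of the HBM adjustment, the sketch below applies the 38% uplift to the 127 S/s GDDR5 placement and compares the two projected maximums. The numbers are the post's own; the variable names are hypothetical, and the sketch does not derive the 187 or 374 S/s projections, it only shows how the quoted figures relate to each other.

```python
# Figures quoted in the post; the variable names are made up for this sketch.
gddr5_placement = 127   # S/s the Nano would get if it scaled like a GDDR5 card
hbm_uplift = 0.38       # optiminer's Nano runs 38% above that placement

nano_implied = gddr5_placement * (1 + hbm_uplift)
print(f"Nano rate implied by optiminer's data: {nano_implied:.0f} S/s")

# The post's projected maximums for the two HBM cards.
nano_max, fury_max = 187, 374
print(f"Fury / Nano ratio in the post: {fury_max / nano_max:.1f}x")
```

The implied Nano rate comes out at about 175 S/s, and the Fury projection is exactly double the Nano one.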

Compare this to all the results and I think you’ll see the memory bus bandwidth is determining how fast every card and CPU can go.

RX 470 watts are not maxed out, but the GDDR5 bus is.
Therefore devs should spend more core watts to spare the GDDR5 bus.
GPU algorithms can become as efficient as the CPU algorithm if they do not parallelize the code in an inefficient way. So far they are doing well, since the watts are not maxed out. From here they will probably have to parallelize in inefficient ways, so these figures are the maximums, if the CPU results are themselves maxed out.

trolloniex · November 12, 2016, 3:47pm

The Nano is just a cut-down Fury X for mini-ITX builds. Its performance in games and other algorithms is 15-30% lower.

zawy · November 12, 2016, 4:01pm

Since he’s getting 175 S/s on the nano with half the MHz, do you think he’ll get 350 on fury?

trolloniex · November 12, 2016, 4:03pm

One user is getting 200