mrb
October 31, 2016, 2:40am
248
I released SILENTARMY v2, which now supports 1st-gen GCN GPUs. In theory all AMD GPUs should be supported, but keep in mind that I have still done zero testing on TeraScale (pre-GCN) AMD GPUs. See:
# Current tip
* Avoid 100% CPU usage with Nvidia's OpenCL, aka busywait fix (Kubuxu)
* Optimization: +10% speedup, increase collision items tracked per thread
  (nerdralph). 'make test' finds 196 sols again
* Implement mining.extranonce.subscribe (kenshirothefist)
* mining.authorize sends an empty string if no password is specified
* Fix memory leaks
* Avoid fatal error when OpenCL platform returns CL_DEVICE_NOT_FOUND
# Version 5 (11 Nov 2016)
* Optimization: major 2x speedup (eXtremal) by storing 8 atomic counters in
  1 uint, and by reducing branch divergence when iterating over and XORing Xi's;
  note that as a result of these optimizations, sa-solver compiled with
  NR_ROWS_LOG=20 now only finds 182 out of 196 existing solutions ("make test"
  verification data was adjusted accordingly)
* Defaulting OPTIM_SIMPLIFY_ROUND to 1; GPU memory usage down to 0.8 GB per
  instance
* Optimization: significantly reduce CPU usage and PCIe bandwidth (before:
This file has been truncated.
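For anyone wondering what "storing 8 atomic counters in 1 uint" in the Version 5 notes means in practice, here is a rough Python sketch of the general packing idea. It is not the actual kernel code: the 4-bit counter width, the 32-bit word size, and the names `increment` and `read` are assumptions made for illustration, and a plain addition stands in for the GPU's atomic add.

```python
# Sketch only: 8 small counters packed into one 32-bit word.
# Assumption: each counter is 4 bits wide (8 x 4 = 32 bits).
BITS_PER_COUNTER = 4
COUNTERS_PER_WORD = 32 // BITS_PER_COUNTER
COUNTER_MASK = (1 << BITS_PER_COUNTER) - 1


def increment(words, row):
    """Bump the counter for `row` and return its previous value.

    On a GPU this would be a single atomic add on the shared word;
    here an ordinary addition stands in for it.
    """
    word_idx, slot = divmod(row, COUNTERS_PER_WORD)
    shift = slot * BITS_PER_COUNTER
    old = words[word_idx]
    # Adding 1 << shift increments only the counter living in that slot.
    words[word_idx] = (old + (1 << shift)) & 0xFFFFFFFF
    return (old >> shift) & COUNTER_MASK


def read(words, row):
    """Return the current value of the counter for `row`."""
    word_idx, slot = divmod(row, COUNTERS_PER_WORD)
    return (words[word_idx] >> (slot * BITS_PER_COUNTER)) & COUNTER_MASK


words = [0, 0]          # room for 16 counters
increment(words, 0)     # row 0 lives in word 0, slot 0
increment(words, 0)
increment(words, 9)     # row 9 lives in word 1, slot 1
```

One caveat with this layout: a counter that goes past 15 carries into its neighbour's bits, so the caller has to cap how many items it tracks per row.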
I would be interested if people with Nvidia GPUs could test SILENTARMY now. There is a chance my fixes help with a separate Nvidia issue, “clEnqueueReadBuffer (-5)”; see issue #6 on GitHub (mbevand/silentarmy): “No Nvidia support: errors out with clEnqueueReadBuffer (-5) with both CUDA 7.5 libs and system libs”.
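When reporting test results, it helps to translate the numeric OpenCL status into its symbolic name. The small lookup below is just a convenience sketch (the helper name `describe_cl_error` is made up here, and it is not part of SILENTARMY); the values are the standard ones from the OpenCL headers, where -5 is `CL_OUT_OF_RESOURCES` and -1 is `CL_DEVICE_NOT_FOUND`.

```python
# Standard OpenCL status codes (first few entries from cl.h).
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe_cl_error(code):
    """Map a numeric OpenCL status to its symbolic name (hypothetical helper)."""
    return OPENCL_ERRORS.get(code, f"unknown OpenCL error ({code})")


print(describe_cl_error(-5))
```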