EWBF: workers stopping and not recovering

I'm using EWBF and hadn't had any problems until recently. Now my rig seems to stop and not recover.

It'll run for 18+ hours and then stop for some unknown reason. By the time I get back to the rig, the log is filled with restart attempts.

Edit: there was a slight overclock and a reduced power limit. At first I thought it was the overclock, but now it seems to do it even at stock settings.

Does this have something to do with the temperature settings in Afterburner?

My average hashrate is taking a beating from this.

Luckily, I had TeamViewer up and ready, so I was able to restart it manually. Edit: the computer locked up after the remote restart, and I had to perform a hard reset a few hours later.

I think the overclock was unstable despite running for over 18 hours, so I'm just going to lower the overclock and try again.

What cards are you running in your rig? Some earlier GTX 1070s may be unstable at stock settings and need a BIOS update.

I'm running Gigabyte 980 Tis, BIOS revision A1. Three cards at the moment; I'm waiting on an RMA for the fourth.

Logs are great. It would be good to know which error code it stopped with, before the restart attempts started. Most likely it's still an overclock issue, though. I've had 3 days of error-free mining with my GTX 1050 before it crashed with:
CUDA: Device: 0 Thread exited with code: 77
ERROR: Looks like GPU0 are stopped. Restart attempt.
CUDA: Device: 0 Thread exited with code: 46
Most likely because of a high +700 memory overclock (now experimenting with +680).
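
If the log is long, a small script can pull out the error code that came right before the first restart attempt. This is only a rough sketch: the two patterns are copied from the lines quoted above, and it assumes the rest of the log uses the same wording (possibly with a timestamp in front). The function name is made up for illustration.

```python
import re

# Patterns copied from the log excerpt in this thread. Real logs may carry
# extra prefixes such as timestamps, so search() is used instead of match().
EXIT_RE = re.compile(r"CUDA: Device: (\d+) Thread exited with code: (\d+)")
RESTART_RE = re.compile(r"Restart attempt")


def first_failure(lines):
    """Return (device, code) for the last thread exit seen before the first
    restart attempt. Returns None if there is no restart attempt, or if no
    exit line came before it."""
    last_exit = None
    for line in lines:
        m = EXIT_RE.search(line)
        if m:
            last_exit = (int(m.group(1)), int(m.group(2)))
        elif RESTART_RE.search(line):
            return last_exit
    return None
```

You would feed it the lines of whatever file you pass to --logfile, for example by opening the file and passing the file object straight in.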

And the GTX 1070 also ran stable for more than a day before crashing with the same error. I removed the curve-adjusted overclock, which ran the core at a constant 2000 MHz at 950 mV, and it seems to be running fine now with core +100, memory +500.

That's interesting. Normally I'd consider an overclock stable after 24 hours of stress testing, so I'm not sure what could cause this after such an extended period of time.

Could heat soak and the temperature limit have anything to do with it? I raised my home temperature by 2 °F, and 8 hours later I got code 46, when prior to that I had run stable for 2 days.

Hey Project, I’m dealing with this issue right now... well, I’ve been dealing with it ever since I assembled my rig, but now it’s every single day. When I run at stock values nothing happens (besides lame revenues). I get errors 4 and 46 a lot and, just like @l33t666, I happen to be at work, but without remote access ¬¬.

I tried pulling the OC down a bit, but I still get the exit code and the GPUs still stop. What I did see is that the voltage fluctuates a lot when I modify ANY single value, even if it’s just 1 MHz of clock. How did you manage to fix the voltage at 950 mV? The "Force constant voltage" option won’t work in my Afterburner.
I have 3 GTX 1080s, Win 10 and EWBF.

I only experimented with the core MHz / core voltage curve for a short time. I decided to just go with a fixed MHz overclock and leave the voltage alone.

Here are a couple of basic things you can do. Upgrade your Win 10 (?) to the latest version, 1703. Download new LAN drivers, and don't use Wi-Fi. Update the motherboard BIOS.

Next, try running the cards in separate windows using the --cuda_devices option (in one script you use --cuda_devices 0, in the next one --cuda_devices 1, and for the third card --cuda_devices 2).
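
If you'd rather not maintain three nearly identical scripts, the one-miner-per-card idea can be sketched in Python. Treat this as an illustration only: both function names are made up, the base command is whatever you already run, and the only miner option it adds is --cuda_devices.

```python
import subprocess
import sys


def per_device_commands(base_cmd, device_ids):
    """Return one command list per GPU, each pinned with --cuda_devices."""
    return [list(base_cmd) + ["--cuda_devices", str(d)] for d in device_ids]


def launch_all(commands):
    """Start every command as its own process (own console on Windows)."""
    # CREATE_NEW_CONSOLE only exists on Windows builds of Python, so it is
    # looked up only when actually running on Windows.
    flags = subprocess.CREATE_NEW_CONSOLE if sys.platform == "win32" else 0
    return [subprocess.Popen(cmd, creationflags=flags) for cmd in commands]
```

For three cards you would call per_device_commands(your_base_command, range(3)) and hand the result to launch_all. That way a crash on one card only takes down its own window.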

And this is what my config looks like. It still loses sols when it has to restart, but at least it doesn't crash completely:

:restart
TIMEOUT 3
miner --server eu1-zcash.flypool.org --port 3333 --bla.rig1 --pass --pec --log 1 --logfile flyerrors --eexit 3
goto :restart

Also add a shortcut to your flypool start script to your computer's Startup folder (Start button - Run - shell:startup).
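
For anyone who prefers a script over a .bat, the same restart loop could look roughly like this in Python. It is a sketch, not a drop-in replacement: run_forever and its parameters are made-up names, and it relies on the miner actually exiting on errors (which is what --eexit 3 is for) so that the call returns.

```python
import subprocess
import time


def run_forever(cmd, delay=3, max_restarts=None,
                run=subprocess.call, sleep=time.sleep):
    """Re-launch cmd every time it exits, pausing `delay` seconds first.

    Collects the exit codes it has seen. With max_restarts=None it loops
    forever, like the goto in the batch file; a number makes it stop after
    that many launches.
    """
    exit_codes = []
    while max_restarts is None or len(exit_codes) < max_restarts:
        sleep(delay)                 # same role as TIMEOUT 3
        exit_codes.append(run(cmd))  # blocks until the miner exits
    return exit_codes
```

The run and sleep parameters are only there so the loop can be tried out without starting a real miner. In normal use you would pass just the command list.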

I will try that out, thanks for the tip!
I never used Wi-Fi though. I will check the mobo for a BIOS update and see which version of Win I have.

Is that restart command the one that comes with EWBF? I was already thinking of some complicated automation with some code I am writing plus the Windows Task Scheduler, but that one seems to be more efficient. So far so good: --eexit 3 works well, and it’s the one I need for the script to ask whether the .bat is running or not.
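
A rough sketch of the "is it still running?" check, in case it helps someone. A .bat itself runs under cmd.exe, so this looks for the miner's process instead, using the Windows tasklist command. The image name you pass (for example miner.exe) is an assumption about what the executable is called on your machine, and both function names are made up.

```python
import subprocess


def image_in_tasklist(image_name, tasklist_output):
    """True if any row of `tasklist` output starts with image_name."""
    wanted = image_name.lower()
    for line in tasklist_output.splitlines():
        fields = line.split()
        if fields and fields[0].lower() == wanted:
            return True
    return False


def is_running(image_name):
    """Ask Windows whether a process with this image name exists."""
    out = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}", "/NH"],
        capture_output=True, text=True, check=False,
    ).stdout
    return image_in_tasklist(image_name, out)
```

A scheduled task could call is_running every few minutes and start the .bat again when it returns False.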

By the way, which version of Afterburner do you have? The GUI differs a little from mine.
Thanks again! :wink: