Even at stock clocks the system was crashing within 10 minutes each time. I ended up removing the problem GTX 1080Ti (there was 1 card that would keep losing performance down to 650 sols then it would crash, so I inferred this was the problem card). I swapped this card to another riser and motherboard PCI port, and at default clocks the system seems stable so far (been running for ~3-4 hours now without a crash) but performance is pretty low at these clocks (~4.4k for 5x GTX 1080Ti’s and 2x GTX 1070’s).
A trick that works for me. Put the problem card on the mobo in the last X16 pcie slot so it does not block a slot. This seems to make an unstable card stable… But you can’t have more than one / rig.
Thanks, I’ll give that a go if it keeps having problems. For now I am just overclocking the core clock and memory clock separately for 1 card at a time until it starts having crashes, but it’s a time consuming process because to know that each card is really stable at that overclock I really should let it run for at least 6 hours +.
I’m still not sure why each EVGA SC 1080Ti has such a drastic performance difference from the others though. I have 3 of those cards and right now at stock clocks 1 is running at ~720-730 sols, 1 at ~690 sols and 1 at ~680 sols. Before adding the 3rd EVGA 1080Ti I was running each card at +110 core and +450 memory and getting around 750 to 760 sols per card.
Well the system has gone back to crashing within 10 min at stock clocks again… I’m not home at the moment so I can’t try plugging this card directly into the motherboard just yet… but for now I am trying running every card with an underclock.
I have a theory that the performance loss and crashing is related to a power issue. I am using 3x Corsair 1000W HX power supplies so there shouldn’t be a problem there (unless there is a fault with a power supply), but because of the nature of the performance, there will always be 1 GPU that is lower performance but it’s not always the same GPU. Perhaps there is a problem with the mains power connection that is limiting the amount of power (the wall socket is rated for 10A which is ~2400W on 240V, and the system is drawing ~1850W measured at the wall). I have severely limited the power consumption of each card to ~200W per 1080Ti and I’ve bumped the overclocks back up to what I reported as stable before adding the last 2 cards. If it is stable now then that should confirm this theory. If this is the case I’ll just plug 1 of the power supplies into a different power socket (instead of having them all on 1x 10A power board) and see how that goes.
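For anyone following along, the wall-socket arithmetic above can be sketched like this. The figures are the ones from this post (10A socket, 240V mains, ~1850W measured at the wall); the function names are made up for illustration only.

```python
def socket_capacity_watts(amps, volts):
    """Maximum power the socket is rated for (P = I * V)."""
    return amps * volts


def headroom(capacity_w, draw_w):
    """Return (spare watts, fraction of the rated capacity in use)."""
    return capacity_w - draw_w, draw_w / capacity_w


capacity = socket_capacity_watts(10, 240)  # 10 A socket on 240 V mains
spare, used = headroom(capacity, 1850)     # ~1850 W measured at the wall
print(f"capacity {capacity:.0f} W, spare {spare:.0f} W, load {used:.0%}")
```

On paper that leaves roughly 550W spare, so this only checks the socket rating itself and says nothing about the power board or the wiring behind it.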
EDIT:
I’ll just add that since reducing the power limit of each card to ~200W (each Zotac card is at 65% and each EVGA card at 80%) all 1080Ti’s are getting roughly equal performance now (~680 sols) which seems like a good sign. I am still running the 1070’s at 100% though.
It seems like my theory was correct. It ran fine all afternoon without any stability issues. I just got home and plugged 1 of the power supplies into a different wall socket. The system still crashes almost immediately if I run each card at 100% power. I measured the power consumption of each power supply and they’re all sitting between 500 and 700W (when running the cards at 100% power). I am currently running the 1070’s at 95%, the Zotac 1080Ti’s at 80% and the EVGA cards at 90% and that seems stable so far. Getting ~4520 sols total.
I don’t think it’s actually related to a problem with the load through the main circuit, because if I turn both Zotac cards down to 50% I still can’t run the EVGA cards at 100% (even though it would use less total power). Each sata riser for the EVGA cards has its own sata cable.
Do you think it’s a riser problem or a socket problem ?
How do you have your risers powered? Via the same PSU that is powering the GPU? There is a lot of false information out there from the early days when risers did not have their power lines cut to the mobo. Today all risers have no power connection between the riser and the mobo, just data (you should measure and confirm for your risers). The 3.3V for the riser is created ON the riser from the 12V rail with either a regulator or a switcher. Therefore it is important to use the same PSU for the GPU AND the riser, otherwise you can cross PSU 12V rails and get PSU’s fighting each other to regulate the 12V rail. Generally speaking you can get away with a graphics card powered from two different PSU’s, as the card usually has protection in place between the PCIe slot and 8 pin power, but you can’t guarantee this. It would not be fun to find out a particular OEM card does not have this.
There are still people that insist that you should power the riser with the same PSU as the mobo (from when the risers had power lines connected to the mobo). Today, with newer powered risers this is wrong.
Also a SATA cable is rated for 4.5 amps on the 12V rail (three 12V pins at 1.5 amps each). A molex connector is rated at 11 amps on the 12V pin (much higher). The PCIe specification allows a maximum of 5.5 amps from the slot. The GTX 1080 Ti FE’s draw ~4.4 amps at 80% power (see link below). Therefore you are likely drawing too much power over your SATA connector at 100% power. PSU’s usually allow a peripheral cable (4 pin molex) to plug into a SATA port, giving you the higher amp molex connector to your risers. You still use only one 4 pin molex cable for each GPU riser. I will admit I have run GTX 1080 Ti’s with a SATA cable without issue, but strictly speaking it’s drawing too much power and is a bad idea. It all depends on the SATA connectors and their quality: some could be fine, others may get warm or hot. If they heat up at all, then they are dropping voltage and will cause problems.
Try swapping to 4 pin molex.
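A rough way to see the connector argument in numbers is sketched below. The ratings and the ~4.4 amps at 80% figure are the ones quoted in this post. The linear scaling of slot current with power limit is purely an assumption for illustration (real cards may not behave that way), and the function names are made up.

```python
# 12 V current ratings quoted in the post above (amps)
RATING_AMPS = {"sata": 4.5, "molex": 11.0}
SLOT_MAX_AMPS = 5.5        # PCIe slot maximum quoted above
AMPS_AT_80_PERCENT = 4.4   # GTX 1080 Ti FE draw at 80% power, per the post


def estimated_slot_amps(power_limit_pct):
    """ASSUMPTION: current scales linearly with the power limit, capped at the slot maximum."""
    return min(SLOT_MAX_AMPS, AMPS_AT_80_PERCENT * power_limit_pct / 80.0)


def check(power_limit_pct):
    """Return {connector: True if the estimated draw is within its rating}."""
    amps = estimated_slot_amps(power_limit_pct)
    return {name: amps <= rating for name, rating in RATING_AMPS.items()}


for pct in (80, 90, 100):
    print(pct, round(estimated_slot_amps(pct), 2), check(pct))
```

Under that assumption a single riser sits just inside the SATA rating at 80% and goes over it somewhere before 100%, while molex has plenty of margin either way.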
Thanks. All the EVGA cards are powered by a single Sata cable that is connected to the same PSU powering each card. 1 of the EVGA cards gets amazing performance (around 760 sols/s) but the other 2 get average performance (and must be run at less than 100% power).
The Zotac cards have no issue, they run really well, but I have both Zotac cards run off a single SATA cable. Each Zotac card has 2x 8 pin power connections though (maybe this doesn’t matter?). Is it still possible for the power to the Zotac cards to affect the EVGA cards? I also have my 2 1070’s off 1 SATA cable but they run fine (around 480 sols).
Maybe it’s just an average card idk? I just rewired the whole build to make sure that every sata riser has its own individual sata cable and that each card is powered by the same psu powering the riser. No difference. That card still runs at 670-690 sols and I have to run it at 85% power. The other EVGA card must be run at a max of 90% and gets ~700 sols, and the good EVGA card runs at ~740-750 sols at 100% power. The Zotac cards aren’t bad… I have them at 85% (although they can run at 100% no drama) and they get 750-760 sols (they get ~780 sols at 100% power but that’s an extra 50W per card). I’m pretty happy with the 1070’s, getting ~480 sols per card.
EDIT:
Been monitoring voltages for each EVGA card. All 3 EVGA cards are set to 85% power with the same overclock, and they are all drawing around 210W at this power setting.
Card 1: 0.900V - 720 sols
Card 2: 0.875V - 690 sols
Card 3: 0.825V - 650 sols
Card 3 is the problem card. After starting the miner program it will start out at around 0.875V, then slowly decrease to ~0.825V, and then the miner will crash.
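To put the three readings side by side, here is a small sketch that works out sols per watt and how far each card is below the best one. The voltages and sols are the numbers listed above; the 210W figure is the approximate draw mentioned above, so treat the efficiency values as rough.

```python
POWER_W = 210  # approximate draw per card at the 85% setting

# (name, core voltage in V, sols) as listed above
cards = [
    ("Card 1", 0.900, 720),
    ("Card 2", 0.875, 690),
    ("Card 3", 0.825, 650),
]

best_sols = max(sols for _, _, sols in cards)
rows = []
for name, volts, sols in cards:
    efficiency = sols / POWER_W      # sols per watt
    deficit = 1 - sols / best_sols   # fraction below the best card
    rows.append((name, volts, sols, efficiency, deficit))
    print(f"{name}: {volts:.3f} V, {sols} sols, "
          f"{efficiency:.2f} sols/W, {deficit:.1%} below best")
```

At the same power draw, Card 3 ends up a little under 10% behind Card 1, and the ordering by sols matches the ordering by voltage.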
My power supplies have multi 12V rail and single 12V rail, not sure if 1 is better than the other. Tried both and neither seems to make any difference.
If you have more than 1 GPU riser running off a SATA cable you are asking for trouble. Each riser can draw up to 5.5 amps per the PCIe 3 specification. Your SATA cable and connector is rated at 4.5 amps max. You NEVER run more than 1 riser / SATA cable. I would argue don’t use SATA cables at all, and do so in another post.
If you have more than one riser on a SATA cable you ARE dropping voltage to your risers over the SATA cable, since it can’t handle that much current. A rig will behave erratically when the 12 volt rail is not regulated well. Your not being able to run at 100% would be the symptom of a SATA cable getting hot and dropping voltage due to overloading.
Putting all that aside you are playing with FIRE, literally. Running more than one riser / SATA cable can cause a fire!!!
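To show why stacking risers on one SATA cable is a problem, here is a worst-case sketch using the 5.5 amp per riser and 4.5 amp per SATA cable figures from this post. It assumes every riser pulls the full slot maximum at the same time, which is the pessimistic case, and the function names are made up for illustration.

```python
RISER_MAX_AMPS = 5.5     # worst-case slot draw per riser, quoted above
SATA_RATING_AMPS = 4.5   # 12 V rating of a SATA cable, quoted above


def worst_case_amps(risers):
    """Total 12 V current if every riser draws the slot maximum."""
    return risers * RISER_MAX_AMPS


def overload_factor(risers):
    """Worst-case current as a multiple of the SATA rating (>1 means overloaded)."""
    return worst_case_amps(risers) / SATA_RATING_AMPS


for n in (1, 2, 3):
    print(f"{n} riser(s): {worst_case_amps(n):.1f} A, "
          f"{overload_factor(n):.0%} of the SATA rating")
```

Even a single riser can exceed the rating in the worst case, and two or three on one cable are several times over it.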
When run at the same settings, all my GPU’s perform about the same on all my rigs with a slight exception for GPU0 (the one with video on it). I can run all my cards at 150% if I want with no issues. Once you start overclocking cards things will move around and some will be better than others.
So if you fixed the SATA cable issue the dropping voltage still points to a power issue. Are you using the SATA to molex adapters? If so check those, they can get warm and drop voltage. I have seen them get very hot with a single riser. Go straight 4 pin molex to GPU3 and see if the problem goes away.
If that does not solve your problem then get a better PSU. You definitely have a power issue.
I have 3 Corsair HX1000’s and each is only at about 60-70% load. I tried running the card on 2 of the different PSU’s with no change. These are Sata risers so I can’t directly plug in a molex cable without using an adaptor, but I’ll see if I have any sata to molex adaptors lying around and I’ll give that a go to see if it helps. Would you suggest buying some molex risers to see if that solves the problem? Why do you think 1 of the GTX 1080Ti’s has no problem? I also notice when observing the voltage to the problem card that it fluctuates a fair bit (up and down ~0.02V fairly regularly) while the card with solid performance stays very constant. I have tried running this card on another sata riser and it didn’t seem to make any difference, but I have another 2 spare sata risers I could also try.
Well I tried running that card via Molex using an adaptor and that didn’t make any difference either. I’m going to try running the card in my desktop computer to see if perhaps it’s a problem with the card itself.
Well, it appears as though the fault is with the card itself. I can’t get it to run at over 80% power in my desktop computer directly into the motherboard (with a 1000W PSU powering just that card and a GTX 980).
Are the other cards in your rig working normally now? I have seen bad GPU’s before but never seen a bad GPU mess with the performance of other GPU’s in a rig.
However, I don’t mix GPU’s on my rigs anymore. Always same brand and model for each rig. I noticed I always had more issues with rigs that had mixed GPU’s.
Every card is working normally except for 2 of the EVGA cards. To give you the full context I am running 5x GTX 1080Ti’s and 2x GTX 1070’s. 3 of the 1080Ti’s are EVGA SC and 2x are Zotac Amp extreme. The zotac cards and the 1070’s are running perfectly and I can ramp them up to 120% power if I want with no problems at all. 1 of the EVGA cards runs really well also, but of the remaining 2 I need to run 1 of them at 90% and the other at 80%. This is independent of the computer these cards are in (I tried them in my desktop) and what else is plugged into the system.
I’ll just add that these GPU’s are not messing with the performance of other cards (other than causing the program to crash).
Did you change all your PCIe BIOS settings on the computer to GEN2?
Yes, I set up the board as per: Asus Z270-A / AR Walkthrough for 6+ crypto rig stability - YouTube
Well, I appear to have lucked out with my final EVGA 1080Ti (just arrived). Getting 760 sols for 250W out of it, no problem so far at 100%. Still can’t run the other 2 cards above 80% and 90% and even at those power limits they get pretty awful performance (if I put my good cards to that limit they get far more sols). I think I might send them to EVGA to get fixed/replaced/whatever. My only concern is I haven’t tested them for anything other than zcash mining, so maybe they will work for whatever they test them for (but I have confirmed using multiple computers that they will not run EWBF at more than 80% and 90% power respectively while the other cards have no problem, so there must be something wrong with them).