Remote rig resetting tool with PoE and relay switch

My rigs are a 30-35 minute drive from home, and it’s a pain in the ass to go there every time I need to restart a miner that froze, or when some rigs won’t come back to life after a power blackout. So I came up with the idea of building a remote resetting tool from a PoE switch and relays. Anyone can do this (this was my first soldering experience)… It is cheap and it works.
What I used:

  1. Cheap PoE switch from eBay, a 24-port PowerDsine 6524 - $35

  2. 12V TIANBO relay from a local electronics store - $0.76

  3. Ethernet cable crimped with an RJ45 plug (T568B standard) on one side to connect to the switch
     (you don’t need an image for that)

  4. Power/Reset SW cable without a button (you don’t need the button either) - $0.10

I soldered one pair of wires from the Ethernet cable coming out of the switch onto the relay’s coil contacts (solid green and striped brown; other wires may also work).


On the relay’s other contacts (the switched side) I soldered the Power SW cable, whose pins will later connect to the motherboard.

Then I connected the pins to the motherboard’s power switch header and plugged the RJ45 end into a PoE switch port.

Write down which PoE switch port is connected to which rig, so you know which port to toggle for the miner you need to restart.
Once everything is set up, the next steps are simple. When a rig freezes or won’t come back after a power blackout, connect to the PoE switch’s management IP and go to the Port Configurations Enable/Disable section. Keep all ports disabled by default. To restart the rig connected to port 1 of the PoE switch, enable that port for about 3 seconds and then disable it; the rig will shut down. Then repeat the step: enable the port for 3 seconds and disable it again, and the rig will start up and begin mining.
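
The enable/disable sequence above could be scripted roughly like this. It is only a sketch: the post drives the switch through its management page, so `poe_port_set` is a hypothetical hook that you would have to fill in with whatever your switch actually offers, and here it just prints what it would do. The rig names, port numbers and the 10 second pause between pulses are made-up examples too.

```shell
#!/bin/bash
# Hypothetical hook: replace the body with the real call for your switch.
# As written it only prints the action, so the script is safe to dry-run.
poe_port_set() {
    echo "port $1 -> $2"
}

# Example rig-to-port notes (made-up names); keep them in sync with your wiring.
port_for_rig() {
    case "$1" in
        rig-a) echo 1 ;;
        rig-b) echo 2 ;;
        *) echo "unknown rig: $1" >&2; return 1 ;;
    esac
}

# Energise the relay for $2 seconds, i.e. "press" the power button once.
pulse_port() {
    poe_port_set "$1" enable
    sleep "$2"
    poe_port_set "$1" disable
}

# First pulse shuts the rig down, second pulse starts it again.
# Usage: restart_rig RIG [HOLD_SECONDS] [GAP_SECONDS]
restart_rig() {
    local port hold gap
    port=$(port_for_rig "$1") || return 1
    hold=${2:-3}
    gap=${3:-10}
    pulse_port "$port" "$hold"
    sleep "$gap"
    pulse_port "$port" "$hold"
}
```

Calling `restart_rig rig-a` would then pulse port 1 twice with the default timings. Keeping the rig-to-port notes inside the script means the list you wrote down and the thing that toggles the ports cannot drift apart.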

I use H81 BTC and ASUS H81M-K motherboards. With the H81 BTC I never used HDMI dummies: after a restart the rigs power up and launch the miner from the startup folder. With the H81M-K I use HDMI dummies, because in most cases those boards need a monitor to be plugged in.

I’m a little bit sleepy now, so if something is hard to understand or I made some technical mistakes, feel free to tell me. I just wanted to share.
If it works, it ain’t stupid!


I tried this with a Cisco 3560 I had lying around, but the switch will only enable power if it detects a valid device on the port. Just connecting a relay to 2 pins is not enough to fool the switch into allowing power. The switch doesn’t see the relay as a valid NIC either, so the port is effectively shut down. I figured the switch you used must not do any checking for a valid powered device, so I ordered one. It does the same thing as the Cisco, and the manual even states that it will not output power on a port without a valid PD attached. Is there some other step you did to bypass this safety check? I have the device detection set to “802.3af & legacy”. The only other option is just 802.3af. A multimeter shows a fraction of a volt either way, which is not enough to close the relay.

If anyone has some advice, I’d appreciate it. I’ve already completely automated the reset via Claymore’s Ethman and a PowerShell script, which I’ll post if the hardware can actually do this.

Cisco is a smart ass, but the device I used doesn’t require any pre-configuration besides the first few steps (management IP and so on). With it you just turn the port on and off; it actually doesn’t care whether anything is connected on the other side.
BTW, thank you for reminding me about this thread. I’m testing a new, cheaper and smaller option for the same purpose. Some products cost $100+ and can handle only a couple of devices, which only motivates me and others to find better and cheaper solutions. I’ll update as soon as I get the parts to build the prototype. For now I can say that it should be even cheaper than the PoE option, under $30 or so, and able to handle up to 20 devices, but this is not yet confirmed.

Oh, and also consider using a 24V relay; 12V is not recommended any more.

I also tried the same switch as you except it’s the gigabit version (6524G). It definitely cares what is connected. Guess I need to grab the non-gigabit one or just go another route. I have 24V relays, 12V relays, and some really rugged automotive 12V ones. None of them trigger on the Powerdsine. The Cisco would occasionally trigger the 12V ones but only for a split second and not reliably. I suspect this was because it wasn’t providing enough power since it never detected a valid device.


There is one option for the Cisco case that comes to my mind; I’ll try to explain as well as I can. For rigs you only need 2 pairs, for RX and TX (at 10/100 speeds). If you somehow manage to find the other two pairs (as far as I remember the brown one is used for PoE), which only carry power in the PoE case, you will be able to use one Ethernet cable for both resetting and networking. You can even skip the RJ45 wiring standards and, for short distances, make up your own. This is just a thought which I would like to try myself too.
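
To make the pair idea easier to follow, here is a small lookup of the T568B pinout, marking which pins carry data on a 10/100 link and which are spare. This is a sketch for reference only: the function names are made up, gigabit links use all four pairs for data so the "spare" label applies to 10/100 only, and whether a given PoE source puts power on the spare pairs (blue and brown, "Mode B") or on the data pairs ("Mode A") depends on the device.

```shell
#!/bin/bash
# T568B wire colour for each RJ45 pin (1-8).
t568b_colour() {
    case "$1" in
        1) echo "white/orange" ;;
        2) echo "orange" ;;
        3) echo "white/green" ;;
        4) echo "blue" ;;
        5) echo "white/blue" ;;
        6) echo "green" ;;
        7) echo "white/brown" ;;
        8) echo "brown" ;;
        *) return 1 ;;
    esac
}

# Role of each pin on a 10/100 link (gigabit uses all four pairs for data).
pin_role_100m() {
    case "$1" in
        1|2|3|6) echo "data" ;;
        4|5|7|8) echo "spare" ;;
        *) return 1 ;;
    esac
}

# Print the whole table: pin number, colour, role.
for pin in 1 2 3 4 5 6 7 8; do
    printf '%s %-12s %s\n' "$pin" "$(t568b_colour "$pin")" "$(pin_role_100m "$pin")"
done
```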

So, as I promised, I’m testing an Arduino with a relay and Ethernet module, which lets the user remotely restart a rig via a telnet command. The price of the whole build won’t exceed $40, and so far it can handle up to 16 rigs.
Further development is in progress, and after successful testing I’ll upload the code and every detail you need to know about it. Here’s a preview pic of it:
[img]//cdck-file-uploads-global.s3.dualstack.us-west-2.amazonaws.com/zcash/original/2X/e/ed55c00a68c6caebb06b1835d0d0260b8de5f467.jpg[/img]

What a great project you are doing!
I’ve found this product https://shop.simplemining.net/ssrv2.html?___SID=U that has the same function, but it costs way more!

Are you going to make any sales?

Best luck mate!

Everything you are trying to do can be done with software. Hard resets without flushing HD buffers will eventually cause file system problems. Unless the power goes out I never do a hard reset on my rigs anymore.

  1. Set the BIOS to auto boot after power loss (when AC comes back on).
  2. A watchdog on the rig restarts the PC and the miner if it detects any issues.
  3. The watchdog does a shutdown instead of a reboot if it detects a serious issue that you need to fix (fan failure, smoke alarm goes off).
  4. You can also use a smart power switch to kill power from your cell phone, or have an IFTTT routine automatically kill power if a sensor trips for heat or smoke.

I have a Raspberry Pi that monitors my rigs and can send a command to shut down all or just one of my rigs in about 7 seconds if it thinks there is a problem, and I can also do it manually from my phone. The IFTTT trip is instantaneous and is a hard power kill tied to the smoke alarm and fire sensors (I only use this for the rigs in my home).

Example code:
#!/bin/bash
# Run as root: writing to /proc/sys/kernel/sysrq and /proc/sysrq-trigger requires it.

# code to determine fault

# code to log the error and/or send you a text message about the restart

# code to kill the appropriate miner depending on what coin is currently running
sudo killall miner # Nvidia_ZEC_EWBF=miner, Genoil_ETC=ethminer, AMD_Optiminer=optiminer-zcash, etc.

# begin controlled restart
sleep 5
echo 1 > /proc/sys/kernel/sysrq   # enable all sysrq functions
echo "Taking keyboard from X11"
echo r > /proc/sysrq-trigger      # r: take the keyboard out of raw mode
echo "Syncing disks"
echo s > /proc/sysrq-trigger      # s: sync all mounted filesystems
echo "Remounting filesystems RO"
echo u > /proc/sysrq-trigger      # u: remount filesystems read-only
sleep 1
echo "Rebooting"
echo b > /proc/sysrq-trigger      # b: reboot immediately

Replace the last line with "echo o > /proc/sysrq-trigger" if you want a shutdown instead of a reboot.
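
The reboot and shutdown variants of the script above could share one function, along these lines. This is a sketch rather than the poster's actual watchdog code: the function names and the `DRY_RUN` and `SETTLE` variables are assumptions added here so the sequence can be rehearsed without touching the kernel.

```shell
#!/bin/bash
# Map the requested action to its final sysrq letter: b = reboot, o = power off.
final_key() {
    case "$1" in
        reboot) echo b ;;
        shutdown) echo o ;;
        *) echo "usage: controlled_stop reboot|shutdown" >&2; return 1 ;;
    esac
}

# Send one sysrq letter, or just print it when DRY_RUN is set (assumed variable).
send_key() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "sysrq $1"
    else
        echo "$1" > /proc/sysrq-trigger
    fi
}

# Keyboard out of raw mode (r), sync (s), remount read-only (u), then b or o.
controlled_stop() {
    local last key
    last=$(final_key "$1") || return 1
    # Enable all sysrq functions first (needs root), unless rehearsing.
    [ -n "${DRY_RUN:-}" ] || echo 1 > /proc/sys/kernel/sysrq || return 1
    for key in r s u "$last"; do
        send_key "$key" || return 1
        if [ "$key" = u ]; then
            sleep "${SETTLE:-1}"   # give the remount a moment, as in the original
        fi
    done
}
```

Running it with `DRY_RUN=1` set first, then `controlled_stop shutdown`, prints the four letters in order without doing anything; without `DRY_RUN` it has to run as root.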

That’s it, no extra hardware, and it works every time.

You already wrote one good reason to have a hard reset.

Let’s say that if you want to be professional, you have to have some sort of physical access to your systems. Software can get stuck, and if you don’t live next to your mining farm, having to drive over gets annoying (2am email: “worker is not working”).

Your solution is great and very solid for personal use

Sorry but needing physical access to your rigs is OK for personal use. If you want to be a “professional”, then you need a full automation and remote control system. Physical access is to perform hardware maintenance only.

If your rigs lock up and you can’t cleanly reset them automatically (without you even getting involved), then you are doing something wrong.

Update:
I must apologize. Perhaps I did not explain in enough detail. The Linux sysrq magic keys are kernel-level commands and will NEVER lock up. You can blow up the hard drive and they still work, since they run at the kernel level and nothing needs to load from disk.

Selling? Of course not. This is all open source, even the code I managed to rewrite for my needs (I have zero understanding of coding, and after release everyone can modify it and make it better). People need stuff like this. Of course you can pay much more for Ethernet relays and/or other solutions to meet your needs, but when you are mining, and even providing hosting or a maintenance service, it’s a good option. I’m managing more than 150 rigs and they are all over town, 50 in one place and 20 in another… I know you can set up a lot of triggers in software to make a rig do whatever needs to be done, from rebooting to disabling a specific GPU, but sometimes, in rare cases, the rig just freezes and there’s nothing you can do, nothing. At least I didn’t find any way other than “physical contact” to reboot the rig itself.

I’m pretty sure there are plenty of solutions out there, but what I like the most about the Arduino is that the possibilities are enormous. I have no idea how to code or do web programming, but I’m pretty sure there is a way to build a page where you can add and edit everything, with a minimal, nice-looking interface, some buttons and so on. But for now I only have this, and I’m sure lots of people out there were waiting for something cheaper than the other products on the market. Plus it’s more fun to build it yourself than to buy an expensive one and just plug and play.

ZC93_
I agree with all the steps you wrote above; these are just the basics, and everyone should know them. I will also test the script you shared. Are you saying that no matter what happens to the system, even when it freezes, this will let me reboot it remotely?

Yes, but only in Linux. The sysrq magic keys are kernel level and very powerful. They are functional from the second the kernel is loaded into RAM and even work when booting hangs (the kernel loads first). Hitting the power switch or the reset switch is like using a sledgehammer; sysrq is more like a screwdriver.

All my rigs are 100% automated. They set their own overclock, throttle their power based on temperature, and reboot when a GPU hangs, the whole rig hangs, or a GPU’s performance starts dropping. They log everything: shares, GPU temps, GPU performance and statistics, and all errors. They share all this with a central monitor and can accept incoming commands as well. I don’t ever have to touch them, even to update software.

Update:
You need to run this as root or with sudo privileges, and the sysrq keys have to be enabled in the kernel. Since these are low-level kernel commands, if they ever fail, your PC is a brick anyway. You need to pay attention to how you implement the script: if it’s called from a script that is running in X11 and X11 locks up, then it may not get called. So I use two watchdogs, one that does the heavy lifting (miner logging, decision making) and a second, a root cron job, that can execute the reboot within itself (in memory, without calling a script from the HD). The script also needs to monitor itself (how often it runs) to prevent a reboot loop when there is a serious problem. If my rigs reboot three or more times in 6 minutes, they enter a non-mining safe mode and send me a message so I can log in and see what the problem is (I’ve never had that happen, but I have tested it).
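
The reboot-loop guard described above might look something like the following. It is a rough sketch, not the poster's code (which was not shared): the log path, the variable names and the "safe-mode" label are assumptions, and the actual reboot or safe-mode handling is left to the caller.

```shell
#!/bin/bash
# Assumed location for the reboot history; any path that survives a reboot works.
REBOOT_LOG=${REBOOT_LOG:-/var/tmp/rig-reboots.log}
WINDOW=${WINDOW:-360}            # look-back window in seconds (6 minutes)
MAX_REBOOTS=${MAX_REBOOTS:-3}    # reboots allowed inside the window

# Count logged reboots that fall inside the window ending at $1 (epoch seconds).
recent_reboots() {
    local now=$1 count=0 ts
    if [ -f "$REBOOT_LOG" ]; then
        while read -r ts; do
            # Skip blank or non-numeric lines.
            case "$ts" in ''|*[!0-9]*) continue ;; esac
            if [ $((now - ts)) -lt "$WINDOW" ]; then
                count=$((count + 1))
            fi
        done < "$REBOOT_LOG"
    fi
    echo "$count"
}

# Print "reboot" (and record the timestamp) or "safe-mode" if the limit is hit.
next_action() {
    local now=${1:-$(date +%s)}
    if [ "$(recent_reboots "$now")" -ge "$MAX_REBOOTS" ]; then
        echo safe-mode
    else
        echo "$now" >> "$REBOOT_LOG"
        echo reboot
    fi
}
```

A watchdog would call `next_action` just before rebooting and only go ahead if it prints `reboot`; on `safe-mode` it would stop the miner and send the alert instead.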

This sounds interesting, and I would like to know more about it if you are willing to share.

How well do you know Linux, and can you write basic Bash and/or Python scripts? I can help and give some guidance, but due to a pending business arrangement I can’t openly share my code.

That would be very kind of you, just to point me in the right direction. I was thinking about Zabbix and full automation. Any guidance will help me a lot.