Automatically Power Servers Back On After a Power Outage With a Raspberry Pi and Wake on Lan

Context

I recently woke up to a non-working network connection in my house: both internet and LAN access were down.
I immediately checked with my internet provider, but my router seemed to be running just fine and no outages were reported on their side in my region.

I then decided to check on my own infrastructure and realised that both of my physical servers were down. Those servers host my DNS and DHCP services, which explains why I had a faulty network connection.
After a moment’s thought, I understood that a power outage had occurred during the night (which wasn’t obvious at first, as the power was back when I woke up), causing my servers to go down. While my router automatically switched back on when the power came back, my servers didn’t. Indeed, without extra configuration, you need to physically press the power button to turn them on.

While thinking about how to prevent that in the future, I obviously thought of buying a UPS to ride out short power outages, or at least have the servers shut down properly during long ones. But given how rarely power cuts happen where I currently live, and how little impact a possible downtime of my services/infrastructure has when they do, it simply isn’t worth the (high) cost.
So I ended up looking for a cheap, modular and easy way to add a sort of power outage “tolerance” to my servers with a simple Raspberry Pi, which I’ll present in this article.

The hardware

I personally used a Raspberry Pi (3B+) that was sleeping in my drawer, but any model would do.
There are no specific requirements in terms of performance: as long as it is capable of running a simple Linux server, you’re good.

The only prerequisite is hardware that automatically powers on as soon as it receives power, as is the case for Raspberry Pis (and for most other small single-board computers that do not have a power button).

Raspberry Pis are cheap, have low power consumption and run Linux pretty well, which is why I recommend them. But if you have other hardware that meets the above prerequisites (powers on automatically and runs Linux), that’s perfect!

The solution

The solution consists of a script running on the Raspberry Pi that monitors my servers’ health and makes use of the “Wake On Lan” network standard to power them back on remotely after a (configurable) period of downtime.
Since the Raspberry Pi boots on its own as soon as power is restored after an outage, it can then check on my other physical servers and power them back on.

But before creating the script on the Raspberry Pi, we have to make the monitored servers compatible with Wake On Lan so they can handle the related network packet sent by the script properly.
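
For the curious, that network packet is commonly called a “magic packet”: 6 bytes of 0xFF followed by the MAC address of the target network adapter repeated 16 times, typically broadcast over UDP. Here’s a minimal sketch that only builds that payload as a hex string (it doesn’t send anything). The function name build_magic_packet_hex is made up for illustration and isn’t part of any existing tool:

```shell
# Print the payload of a Wake On Lan magic packet as a hex string.
# Layout: 6 bytes of 0xFF, then the target MAC address repeated 16 times.
# (Hypothetical helper, for illustration only.)
build_magic_packet_hex() {
    local mac="$1"
    local mac_hex packet i

    # Accept only the colon-separated notation, e.g. 7c:10:c9:8c:88:9d
    if ! printf '%s\n' "${mac}" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "Invalid MAC address: ${mac}" >&2
        return 1
    fi

    # Strip the colons and normalise to lowercase hex digits
    mac_hex="$(printf '%s' "${mac}" | tr -d ':' | tr 'A-F' 'a-f')"

    packet="ffffffffffff"
    i=0
    while [ "${i}" -lt 16 ]; do
        packet="${packet}${mac_hex}"
        i=$((i + 1))
    done

    printf '%s\n' "${packet}"
}
```

Calling build_magic_packet_hex "7c:10:c9:8c:88:9d" prints 204 hex characters, i.e. the 102 bytes of the payload. In practice you don’t need to build this yourself, as a Wake On Lan utility does it for you.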

Enabling Wake On Lan support on the monitored servers

If your hardware/motherboard is fairly recent, it should be compatible with Wake On Lan but you may have to enable the related parameter in your UEFI/BIOS settings.
It is usually located under the “power management” or “network” section.

If you can’t find such a parameter, it might be named differently or already enabled by default.
Check instructions from your motherboard vendor.

Once enabled on the hardware side, Wake On Lan support has to be enabled on the software side as well.
To do so on your network adapter, install the ethtool package (if not installed already) and run the following command (which enables Wake On Lan support for your current session only):

sudo ethtool -s eth0 wol g # Replace "eth0" with the name of your network adapter
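
To double check that the setting was applied, you can look at what ethtool reports for the adapter. The sketch below assumes the usual output format, where a “Supports Wake-on:” line lists what the adapter can do and a “Wake-on:” line shows the current setting (“g” meaning magic packet, “d” meaning disabled). The helper name wol_magic_enabled is hypothetical:

```shell
# Succeeds when the `ethtool <interface>` output read from stdin reports
# magic-packet wake-up ("g") in its current "Wake-on:" setting.
# (Hypothetical helper, assuming ethtool's usual output format.)
wol_magic_enabled() {
    awk '
        # Match only the current setting, not the "Supports Wake-on:" line
        $1 == "Wake-on:" { found = ($2 ~ /g/) }
        END { exit(found ? 0 : 1) }
    '
}
```

For instance: sudo ethtool eth0 | wol_magic_enabled && echo "Wake On Lan is enabled" (again, replace "eth0" with the name of your network adapter).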

Then, to enable it permanently, follow the instructions related to the network manager you use:

Configuring the Raspberry Pi

The first thing we need is a Wake On Lan application/utility capable of sending the network packet needed to power on the servers.
I personally use the wakeonlan utility (the command called by the script below), which is packaged by most Linux distributions.

We then need a script to monitor servers and send a Wake On Lan packet if needed.
Here’s the one I wrote:

It pings each server in the given list and increments a per-server “fail counter” each time a ping doesn’t get a response.
By default, a Wake On Lan packet is sent after 6 consecutive failures with a wait period of 5 minutes between each try, so a server stays down for up to roughly 30 minutes before being woken up.
I chose those not too “aggressive” default values so I can still shut down my servers for maintenance (for instance) without the script being triggered instantly and sending Wake On Lan packets right away. But you can, of course, modify those values to your liking!

See the comments in the script to adapt it to your needs and environment.

#!/bin/bash

# Replace "Server1/Server2" with the DNS name or IP address of your servers.
# If you use DNS names and your DNS server is running on the monitored servers (as is the case for me), remember to fill in `/etc/hosts` accordingly.
#
# Then replace "MAC_address_of_the_network_adapter" with the MAC address of the network adapter of the corresponding server.
# You can find it in the `link/ether` field when running `ip link` on your server.
#
# Example:
# servers["pmx01.rc"]="7c:10:c9:8c:88:9d"
# servers["pmx02.rc"]="68:1d:ef:30:cc:88"
#
# You can declare as many servers as you want.
declare -A servers
servers["Server1"]="MAC_address_of_the_network_adapter"
servers["Server2"]="MAC_address_of_the_network_adapter"

declare -A fail_counter

# This log file is used to collect logs for sent Wake On Lan packets.
# Put it wherever you want but make sure the parent directories have been created beforehand.
logfile="/var/log/monitor-servers-wakeonlan/wol_packet.log"

while true; do
        for server in "${!servers[@]}"; do
                if ping -c1 "${server}" &>/dev/null; then
                        fail_counter["${server}"]=0
                        echo "${server} fail counter: ${fail_counter["${server}"]}"
                else
                        fail_counter["${server}"]=$((fail_counter["${server}"] + 1))
                        echo "${server} fail counter: ${fail_counter["${server}"]}"

                        # Number of consecutive failures needed before a Wake On Lan packet is sent.
                        # You can adapt it if needed.
                        if [ "${fail_counter["${server}"]}" -eq 6 ]; then
                                if wakeonlan "${servers["${server}"]}"; then
                                        echo "$(date) - Wake On Lan packet sent to ${server}" >> "${logfile}"
                                else
                                        echo "$(date) - Error sending a Wake On Lan packet to ${server}" >> "${logfile}"
                                fi
                                # Step back by one so the next failed ping reaches the threshold again and a new packet is sent.
                                fail_counter["${server}"]=$((fail_counter["${server}"] - 1))
                        fi
                fi
        done

        # Wait period between each try (in seconds).
        # You can adapt it if needed.
        sleep 300
done
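
If you want to get a feel for how the fail counter behaves before changing the default values, the logic can be replayed without pinging anything. This is only a sketch of the counter handling for a single server, fed with a made-up sequence of ping results. The function name simulate_checks and the "ok"/"fail" keywords are assumptions for illustration:

```shell
# Replay a sequence of ping results ("ok" or "fail") for one server and print
# the number of each check at which a Wake On Lan packet would be sent.
# Usage: simulate_checks <threshold> <result>...
# (Hypothetical helper, it doesn't ping or wake anything.)
simulate_checks() {
    local threshold="$1"
    shift
    local counter=0
    local check=0
    local result

    for result in "$@"; do
        check=$((check + 1))
        if [ "${result}" = "ok" ]; then
            # Any successful ping resets the counter
            counter=0
        else
            counter=$((counter + 1))
            if [ "${counter}" -eq "${threshold}" ]; then
                echo "check ${check}: packet sent"
                # Same trick as the script: step back by one so that the next
                # failed check reaches the threshold again and retries.
                counter=$((counter - 1))
            fi
        fi
    done
}
```

Running simulate_checks 6 fail fail fail fail fail fail fail ok fail prints “check 6: packet sent” and “check 7: packet sent”. In other words, once the threshold is reached a new packet goes out at every following failed check, and a single successful ping starts the count over.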

Finally, we can create a systemd service to launch the script automatically at boot (assuming your Linux distribution uses systemd as its init system; if not, check your init system’s documentation).

Here’s the one I wrote, under /usr/local/lib/systemd/system/monitor-servers-wakeonlan.service:

[Unit]
Description=Run the script that monitors physical servers' responsiveness and power them back on if needed
After=network-online.target

[Service]
Type=oneshot
# Adapt this line to the path of your script
ExecStart=/path/to/the/script

[Install]
WantedBy=default.target

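Then enable the service at boot and start it. Because the unit is declared as Type=oneshot and the script never exits, a plain start would wait indefinitely, hence the --no-block option:

sudo systemctl enable monitor-servers-wakeonlan.service
sudo systemctl start --no-block monitor-servers-wakeonlan.service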

Since the script is launched via a systemd service, you can actually see the output of the script in real time with journalctl/systemctl status:

$ sudo systemctl status monitor-servers-wakeonlan.service

● monitor-servers-wakeonlan.service - Run the script that monitors physical servers' responsiveness and power them back on if needed
     Loaded: loaded (/usr/local/lib/systemd/system/monitor-servers-wakeonlan.service; enabled; preset: enabled)
     Active: activating (start) since Wed 2023-09-20 14:15:24 CEST; 1 month 6 days ago
   Main PID: 492 (monitor-servers)
      Tasks: 2 (limit: 963)
     Memory: 2.7M
        CPU: 11.972s
     CGroup: /system.slice/monitor-servers-wakeonlan.service
             ├─  492 /bin/bash /usr/local/bin/monitor-servers-wakeonlan
             └─20701 sleep 300

Oct 27 12:46:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx02.rc fail counter: 0
Oct 27 12:46:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx01.rc fail counter: 0
Oct 27 12:51:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx02.rc fail counter: 0
Oct 27 12:51:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx01.rc fail counter: 0
Oct 27 12:56:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx02.rc fail counter: 0
Oct 27 12:56:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx01.rc fail counter: 0
Oct 27 13:01:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx02.rc fail counter: 0
Oct 27 13:01:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx01.rc fail counter: 0
Oct 27 13:06:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx02.rc fail counter: 0
Oct 27 13:06:23 rasp01.rc monitor-servers-wakeonlan[492]: pmx01.rc fail counter: 0
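
The journal only shows the fail counters. The packets that were actually sent end up in the log file defined in the script. Here’s a small sketch to summarise that file per server, relying on the “Wake On Lan packet sent to” wording the script writes. The function name count_wol_packets is made up for illustration:

```shell
# Print how many Wake On Lan packets were logged as sent, per server.
# Usage: count_wol_packets /var/log/monitor-servers-wakeonlan/wol_packet.log
# (Hypothetical helper, relying on the log line format written by the script.)
count_wol_packets() {
    local logfile="$1"

    # The server name is the last field of every "packet sent" line;
    # "Error sending ..." lines don't match and are ignored.
    awk '/ - Wake On Lan packet sent to / { count[$NF]++ }
         END { for (server in count) print server, count[server] }' "${logfile}" | sort
}
```

Each output line holds a server name followed by the number of packets sent to it. Nothing is printed if no packet has been sent yet.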

You can also stop the script’s execution if needed by simply stopping the associated service:

sudo systemctl stop monitor-servers-wakeonlan.service

Conclusion

So this is it! A simple, easy and cheap solution to monitor your servers’ responsiveness and power them back on automatically if needed.

However, while this solution works to power servers back on after a power outage, it does not prevent them from being abruptly shut down when the power loss occurs.
For proper tolerance to short power outages and a clean shutdown of your servers during long ones, buy a UPS.

Note that this “Wake On Lan” solution can totally co-exist with a UPS, so your servers are properly shut down in case of a long power outage and automatically powered back on when the power’s back! 🙂