Thursday, December 1, 2022

Host is down - one way to fix broken systemd

kill -15 1

Some of our automation jobs occasionally seem to break systemd on our hosts, resulting in the following errors when interacting with it (e.g. when trying to reboot):

System has not been booted with systemd as init system (PID 1). Can't operate.

Failed to connect to bus: Host is down

Failed to talk to init daemon.

Clearly this is a lie: ps confirms that PID 1 is still systemd.
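A quick way to run that check is sketched below. It assumes a Linux host where either ps or /proc is available; pid_name is a hypothetical helper name, not something from systemd itself.

```shell
# Hypothetical helper: print the command name of a given PID.
# Tries ps first and falls back to /proc if ps is missing or lacks -p.
pid_name() {
    ps -p "$1" -o comm= 2>/dev/null || cat "/proc/$1/comm"
}

# Ask what is really running as PID 1.
# On the affected hosts we would expect this to print "systemd".
pid_name 1
```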

The usual solutions for "Host is down" recommend issuing some systemctl command, but in our case every systemctl call just failed with the same errors.

However, as the systemd(1) manpage reveals, some of those commands can also be issued by sending signals directly to PID 1. In our case sending SIGTERM to PID 1 (the kill -15 1 above) solved the issue, though we are not sure what causes it in the first place :/

SIGNALS

       SIGTERM

           Upon receiving this signal the systemd system manager serializes its state, reexecutes itself and deserializes the saved state again. This is mostly equivalent to systemctl daemon-reexec.
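Since SIGTERM only has this "re-execute" meaning for systemd, it seems wise to verify what PID 1 is before signalling it. The following is a minimal sketch of such a guard, not a hardened script: reexec_cmd is a hypothetical helper, and it assumes that systemd reports its command name as "systemd". It only prints the command instead of running it.

```shell
# Hypothetical helper: print the command that re-executes systemd,
# but only if the given PID 1 process name really is "systemd".
reexec_cmd() {
    # $1: name of the process running as PID 1
    if [ "$1" = "systemd" ]; then
        echo "kill -TERM 1"   # same signal as kill -15 1
    else
        echo "PID 1 is '$1', not systemd; refusing" >&2
        return 1
    fi
}

# Dry run for the current host: prints the command, does not execute it.
reexec_cmd "$(ps -p 1 -o comm= 2>/dev/null || cat /proc/1/comm)" || true
```

If the dry run prints kill -TERM 1, running that command as root should have the same effect as the one-liner at the top of this post.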