How to restart a Linux service automatically to avoid server downtime
Use the systemd service restart policy to control it easily.
Symptom
I have had an nginx server running for months. Suddenly I got an alarm from the monitoring service indicating that nginx was no longer serving requests.
I could still ssh to the server, so the machine itself was online. I then checked the nginx status with systemctl status nginx and saw that nginx was not running because of a core dump. Yes, even nginx may crash.
$ systemctl status nginx # or sudo service nginx status
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: failed (Result: core-dump) since Thu 2022-01-06 18:51:49 PST; 702ms ago
Docs: man:nginx(8)
Process: 720751 ExecReload=/usr/sbin/nginx -g daemon on; master_process on; -s reload (code=exited, status=0/SUCCES>
Main PID: 700787 (code=dumped, signal=SEGV)
Tasks: 0 (limit: 1110)
Memory: 14.5M
CGroup: /system.slice/nginx.service
Jan 06 10:57:38 systemd[1]: Reloading A high performance web server and a reverse proxy server.
Jan 06 10:57:38 systemd[1]: Reloaded A high performance web server and a reverse proxy server.
Jan 06 18:51:46 systemd[1]: Reloading A high performance web server and a reverse proxy server.
Jan 06 18:51:46 systemd[1]: Reloaded A high performance web server and a reverse proxy server.
Jan 06 18:51:49 systemd[1]: nginx.service: Main process exited, code=dumped, status=11/SEGV
Jan 06 18:51:49 systemd[1]: nginx.service: Killing process 718970 (nginx) with signal SIGKILL.
Jan 06 18:51:49 systemd[1]: nginx.service: Killing process 718971 (nginx) with signal SIGKILL.
Jan 06 18:51:49 systemd[1]: nginx.service: Killing process 718970 (nginx) with signal SIGKILL.
Jan 06 18:51:49 systemd[1]: nginx.service: Killing process 718971 (nginx) with signal SIGKILL.
Jan 06 18:51:49 systemd[1]: nginx.service: Failed with result 'core-dump'.
ps -ef|grep nginx also confirmed that no nginx process was running:
$ ps -ef|grep nginx
ubuntu 720761 718615 0 18:52 pts/0 00:00:00 grep --color=auto nginx
I can restart the nginx service manually to recover, but I would rather have the service restart automatically to minimize downtime.
Solution: automatically restart a Linux service to avoid downtime
Luckily systemd, the Linux system and service manager, already provides this feature in the service configuration. You can specify a Restart policy.
Service Restart policy
The following is the full description of the Restart policy from systemd.service(5):
Configures whether the service shall be restarted when the service process exits, is killed, or a timeout is reached. The service process may be the main service process, but it may also be one of the processes specified with ExecStartPre=, ExecStartPost=, ExecStop=, ExecStopPost=, or ExecReload=. When the death of the process is a result of systemd operation (e.g. service stop or restart), the service will not be restarted. Timeouts include missing the watchdog "keep-alive ping" deadline and a service start, reload, and stop operation timeouts.
The Restart value takes one of no, on-success, on-failure, on-abnormal, on-watchdog, on-abort, or always.
If set to no (the default), the service will not be restarted.
If set to on-success, it will be restarted only when the service process exits cleanly. In this context, a clean exit means any of the following:
- exit code of 0;
- for types other than Type=oneshot, one of the signals SIGHUP, SIGINT, SIGTERM, or SIGPIPE;
- exit statuses and signals specified in SuccessExitStatus=.
If set to on-failure, the service will be restarted when the process exits with a non-zero exit code, is terminated by a signal (including on core dump, but excluding the aforementioned four signals), when an operation (such as service reload) times out, and when the configured watchdog timeout is triggered.
If set to on-abnormal, the service will be restarted when the process is terminated by a signal (including on core dump, excluding the aforementioned four signals), when an operation times out, or when the watchdog timeout is triggered.
If set to on-abort, the service will be restarted only if the service process exits due to an uncaught signal not specified as a clean exit status.
If set to on-watchdog, the service will be restarted only if the watchdog timeout for the service expires.
If set to always, the service will be restarted regardless of whether it exited cleanly or not, got terminated abnormally by a signal, or hit a timeout.
Table: Exit causes and the effect of the Restart= settings
┌──────────────┬────┬────────┬────────────┬────────────┬─────────────┬──────────┬─────────────┐
│Restart │ no │ always │ on-success │ on-failure │ on-abnormal │ on-abort │ on-watchdog │
│settings/Exit │ │ │ │ │ │ │ │
│causes │ │ │ │ │ │ │ │
├──────────────┼────┼────────┼────────────┼────────────┼─────────────┼──────────┼─────────────┤
│Clean exit │ │ X │ X │ │ │ │ │
│code or │ │ │ │ │ │ │ │
│signal │ │ │ │ │ │ │ │
├──────────────┼────┼────────┼────────────┼────────────┼─────────────┼──────────┼─────────────┤
│Unclean exit │ │ X │ │ X │ │ │ │
│code │ │ │ │ │ │ │ │
├──────────────┼────┼────────┼────────────┼────────────┼─────────────┼──────────┼─────────────┤
│Unclean │ │ X │ │ X │ X │ X │ │
│signal │ │ │ │ │ │ │ │
├──────────────┼────┼────────┼────────────┼────────────┼─────────────┼──────────┼─────────────┤
│Timeout │ │ X │ │ X │ X │ │ │
├──────────────┼────┼────────┼────────────┼────────────┼─────────────┼──────────┼─────────────┤
│Watchdog │ │ X │ │ X │ X │ │ X │
└──────────────┴────┴────────┴────────────┴────────────┴─────────────┴──────────┴─────────────┘
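As a rough illustration of how to read the table above, the sketch below encodes it as a shell lookup. The function name would_restart and the cause labels (clean, unclean-exit, unclean-signal, timeout, watchdog) are made up for this example and are not systemd identifiers. It covers only the table itself, not exit-status overrides or start rate limiting.

```shell
# would_restart POLICY CAUSE
# Hypothetical helper: prints "yes" if the table marks the combination with an X,
# otherwise "no". CAUSE is one of: clean, unclean-exit, unclean-signal,
# timeout, watchdog (labels invented for this sketch).
would_restart() {
    case "$1:$2" in
        always:clean|always:unclean-exit|always:unclean-signal|always:timeout|always:watchdog)
            echo yes ;;
        on-success:clean)
            echo yes ;;
        on-failure:unclean-exit|on-failure:unclean-signal|on-failure:timeout|on-failure:watchdog)
            echo yes ;;
        on-abnormal:unclean-signal|on-abnormal:timeout|on-abnormal:watchdog)
            echo yes ;;
        on-abort:unclean-signal)
            echo yes ;;
        on-watchdog:watchdog)
            echo yes ;;
        *)
            # Restart=no, unmarked cells, and unknown input all land here.
            echo no ;;
    esac
}

# A SEGV crash like the one in the Symptom section is an unclean signal.
would_restart on-failure unclean-signal   # prints: yes
would_restart on-success unclean-signal   # prints: no
```

Read this way, any policy other than no, on-success, and on-watchdog would have brought nginx back after the core dump shown earlier.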
As exceptions to the setting above, the service will not be restarted if the exit code or signal is specified in RestartPreventExitStatus= or the service is stopped with systemctl stop or an equivalent operation. Also, the services will always be restarted if the exit code or signal is specified in RestartForceExitStatus=.
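A minimal sketch of how these two exceptions could look in a drop-in. The exit codes 2 and 75 are purely illustrative assumptions; they are not codes that nginx is known to use, so pick values that match what your own service actually returns.

```ini
[Service]
Restart=on-failure
# Illustrative assumption: exit code 2 means "bad configuration",
# so retrying would not help and the service should stay down.
RestartPreventExitStatus=2
# Illustrative assumption: exit code 75 means "temporary failure",
# so restart even if the Restart= policy alone would not.
RestartForceExitStatus=75
```

Comments in unit files must sit on their own lines, as above; text after a value on the same line is treated as part of the value.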
Note that service restart is subject to unit start rate limiting configured with StartLimitIntervalSec= and StartLimitBurst=, see systemd.unit(5) for details. A restarted service enters the failed state only after the start limits are reached.
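As a sketch of how rate limiting and restarts fit together, a drop-in along the following lines combines the two. The numbers are illustrative assumptions, not recommendations, and RestartSec= (the delay before each restart attempt, 100ms by default) is an extra setting from systemd.service(5) that the description above does not cover.

```ini
[Unit]
# Illustrative values: allow at most 5 start attempts per 60-second window.
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
Restart=on-failure
# Wait 5 seconds after a failure before trying to start again.
RestartSec=5
```

With these values a service that crashes on every start would be retried five times over roughly half a minute and then left in the failed state. Once the underlying problem is fixed, sudo systemctl reset-failed nginx clears the rate limit counter so the service can be started again.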
Setting this to on-failure is the recommended choice for long-running services, in order to increase reliability by attempting automatic recovery from errors. For services that shall be able to terminate on their own choice (and avoid immediate restarting), on-abnormal is an alternative choice.
Change Restart policy
systemctl has an edit command to override the service config:
edit UNIT...
Edit a drop-in snippet or a whole replacement file if --full is specified, to extend or override the specified unit.
To override the existing unit file for nginx, run sudo systemctl edit nginx, then paste the following two lines. They set the Restart policy to always, so the service is restarted regardless of whether it exited cleanly or not, got terminated abnormally by a signal, or hit a timeout:
[Service]
Restart=always
Save and quit. systemctl edit reloads the systemd configuration when the editor exits, so the new policy takes effect immediately.
The sudo systemctl edit nginx command writes the override to /etc/systemd/system/nginx.service.d/override.conf. You can use cat to see its content:
$ cat /etc/systemd/system/nginx.service.d/override.conf
[Service]
Restart=always
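If the same override has to be applied from a provisioning script rather than an interactive editor, the drop-in can also be written directly. The sketch below assumes a hypothetical helper named write_restart_dropin; its optional third argument exists only so the example can run against a scratch directory without touching /etc.

```shell
# write_restart_dropin UNIT POLICY [ROOT]
# Hypothetical helper: creates ROOT/UNIT.d/override.conf containing a Restart= policy.
# ROOT defaults to /etc/systemd/system, the location used by systemctl edit.
write_restart_dropin() {
    unit="$1"
    policy="$2"
    root="${3:-/etc/systemd/system}"
    dir="$root/$unit.d"
    mkdir -p "$dir" || return 1
    printf '[Service]\nRestart=%s\n' "$policy" > "$dir/override.conf"
}

# Dry run against a scratch directory, so nothing under /etc is modified.
scratch="$(mktemp -d)"
write_restart_dropin nginx.service always "$scratch"
cat "$scratch/nginx.service.d/override.conf"
```

On a real host the function would be run as root without the third argument, followed by sudo systemctl daemon-reload, because a hand-written drop-in is not picked up until systemd reloads its configuration. Afterwards systemctl show nginx -p Restart should report the new policy.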
You can also check the full configuration of the nginx service unit, including the override, with systemctl cat nginx.service:
$ systemctl cat nginx.service
# /lib/systemd/system/nginx.service
# Stop dance for nginx
# =======================
#
# ExecStop sends SIGSTOP (graceful stop) to the nginx process.
# If, after 5s (--retry QUIT/5) nginx is still running, systemd takes control
# and sends SIGTERM (fast shutdown) to the main process.
# After another 5s (TimeoutStopSec=5), and if nginx is alive, systemd sends
# SIGKILL to all the remaining processes in the process group (KillMode=mixed).
#
# nginx signals reference doc:
# http://nginx.org/en/docs/control.html
#
[Unit]
Description=A high performance web server and a reverse proxy server
Documentation=man:nginx(8)
After=network.target
[Service]
Type=forking
PIDFile=/run/nginx.pid
ExecStartPre=/usr/sbin/nginx -t -q -g 'daemon on; master_process on;'
ExecStart=/usr/sbin/nginx -g 'daemon on; master_process on;'
ExecReload=/usr/sbin/nginx -g 'daemon on; master_process on;' -s reload
ExecStop=-/sbin/start-stop-daemon --quiet --stop --retry QUIT/5 --pidfile /run/nginx.pid
TimeoutStopSec=5
KillMode=mixed
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/nginx.service.d/override.conf
[Service]
Restart=always
Then start the nginx service:
$ sudo systemctl start nginx
$ systemctl status nginx
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/nginx.service.d
└-override.conf
Active: active (running) since Tue 2022-02-01 15:18:18 PST; 4s ago
Docs: man:nginx(8)
Process: 2302672 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 2302673 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 2302674 (nginx)
Tasks: 3 (limit: 1113)
Memory: 9.2M
CGroup: /system.slice/nginx.service
├-2302674 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
├-2302675 nginx: worker process
└-2302676 nginx: worker process
To test, kill the nginx processes with sudo pkill nginx, then check the status with systemctl status nginx. You should see that nginx is active (running) but with different process IDs, which indicates that the nginx service restarted automatically. Cheers.
$ sudo pkill nginx
$ systemctl status nginx
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/nginx.service.d
└-override.conf
Active: active (running) since Tue 2022-02-01 15:18:56 PST; 2s ago
Docs: man:nginx(8)
Process: 2302830 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Process: 2302831 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
Main PID: 2302832 (nginx)
Tasks: 3 (limit: 1113)
Memory: 9.8M
CGroup: /system.slice/nginx.service
├-2302832 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
├-2302833 nginx: worker process
└-2302834 nginx: worker process
Summary
To configure a service to restart automatically, use sudo systemctl edit <service name> to add the following config:
[Service]
Restart=always
Save and quit. It is that simple.
Read more in the Linux manual pages systemd(1), systemctl(1), and systemd.service(5).
References
- systemd(1) — Linux manual page
- systemctl(1) — Linux manual page
- systemd.service(5) — Linux manual page
- Configure Debian start up services