Eventually, everyone runs into the problem of a long-running script. These scripts shouldn’t fail, but if they do, there should be some way to relaunch them so that your application stays up.
If you are new to process supervision on Linux, be prepared to step into a long-running controversy over service management. System services, often called daemons on Linux/Unix, are programs that run in the background and provide functionality to other software.
Upstart first shipped in 2006, when Canonical (the creators of Ubuntu) released Ubuntu 6.10. It could supervise daemons while still running the existing startup scripts unmodified, and several other distributions went on to adopt it. Here’s where the war starts. During this shift in process supervision, another init system called Systemd appeared, and distributions, eventually including Debian and Ubuntu itself, began migrating to it.
Systemd was created by Lennart Poettering and Kay Sievers, both of whom were Red Hat engineers at the time. Within a few years the majority of distributions had switched over to Systemd. Today, Systemd ships with most distributions, but you will still find Upstart on some legacy systems. For anyone developing on Linux, this controversy has turned into a war over which program is better.
Recently, I created a Python application that used sockets and many third-party libraries to manage data. The script ran well most of the time, but certain data caused the process to die. I started investigating which program would be best for keeping this script running as a service in the operating system. In the end, I decided to abandon both projects for Supervisor, a process management tool that met all my needs.
Install Supervisor and start the service:
sudo apt-get update
sudo apt-get install -y supervisor
sudo service supervisor start
Then open its main configuration file:
cd /etc/supervisor/
vim supervisord.conf
In the supervisord.conf file you will notice the following section:
[include]
files = /etc/supervisor/conf.d/*.conf
This tells you that any file in the conf.d folder whose name ends in .conf is included when Supervisor loads its configuration. Now we can create a configuration file for our long-running process. First change into the conf.d folder and create a file ending in .conf for your application. The following configuration file is for my app, SITL.
[program:sitl]
directory=/home/SITL/
command=/usr/bin/python3 /home/SITL/server.py
autostart=true
autorestart=true
startretries=3
stderr_logfile=/var/log/sitl/sitl.err.log
stdout_logfile=/var/log/sitl/sitl.out.log
environment=PYTHONPATH=/home/.local/lib/python3.5/site-packages
[program:sitl] – Defines the program to monitor; “sitl” is the name you will use with supervisorctl.
command – The command that kicks off the monitored process.
directory – A directory for Supervisord to “cd” into before running the process, useful when the process assumes a directory structure relative to the location of the executed script.
autostart – Setting this to “true” means the process starts when Supervisord starts (essentially on system boot).
autorestart – If this is “true”, the program is restarted whenever it exits, whatever its exit code. The value “unexpected” restarts it only when the exit code is not one of the expected ones.
startretries – The number of consecutive failed start attempts Supervisord allows before it gives up and marks the process as “FATAL”.
stderr_logfile – The file that receives the process’s error output (stderr).
stdout_logfile – The file that receives the process’s regular output (stdout).
user – The user the process runs as. It is not set in the example above, so the process runs as the same user as Supervisord (root here).
environment – Environment variables to pass to the process.
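One practical detail when a Python process writes to stdout_logfile: Python may buffer standard output when it is not attached to a terminal, so lines can show up in the log later than expected. The following is a minimal sketch, assuming a script like server.py logs with print; the helper names log_info and log_error are hypothetical and not taken from the SITL app.

```python
import sys


def log_info(message):
    """Write a regular message to stdout (ends up in stdout_logfile)."""
    # flush=True pushes the line out immediately instead of waiting
    # for Python's output buffer to fill up.
    print(message, flush=True)


def log_error(message):
    """Write an error message to stderr (ends up in stderr_logfile)."""
    print(message, file=sys.stderr, flush=True)


if __name__ == "__main__":
    log_info("server starting")
    log_error("example error line")
```

Running the interpreter with the -u flag has a similar effect without touching the code.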
Since we specified where the log files should be stored, we need to create that directory ourselves:
sudo mkdir /var/log/sitl
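If you manage several program files, it can help to check for missing log directories before reloading Supervisor. The following is a rough sketch, not part of Supervisor: the helper missing_log_dirs is hypothetical, and it assumes the configuration can be read by Python's configparser, which holds for a simple file like the one above.

```python
import configparser
import os


def missing_log_dirs(config_text):
    """Return parent directories of *_logfile options that do not exist.

    Hypothetical helper for illustration; Supervisor has its own parser.
    """
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(config_text)
    missing = []
    for section in parser.sections():
        for key, value in parser.items(section):
            if not key.endswith("_logfile"):
                continue
            parent = os.path.dirname(value)
            if parent and not os.path.isdir(parent) and parent not in missing:
                missing.append(parent)
    return missing


# A trimmed copy of the SITL program section from above.
SITL_CONF = """
[program:sitl]
command=/usr/bin/python3 /home/SITL/server.py
stderr_logfile=/var/log/sitl/sitl.err.log
stdout_logfile=/var/log/sitl/sitl.out.log
"""

if __name__ == "__main__":
    for path in missing_log_dirs(SITL_CONF):
        print("missing log directory:", path)
```

An empty result means every log directory named in the section already exists.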
Next, tell Supervisor to reread its configuration files and apply the changes:
sudo supervisorctl reread
sudo supervisorctl update
To check on the process and its uptime, run:
sudo supervisorctl status
You should now see the process along with its PID and uptime:
sitl RUNNING pid 11713, uptime 1:25:24
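If you want to read that status line from a script, for example in a health check, it splits cleanly into a name, a state, and a free-form description. This is a small sketch based only on the line shown above; parse_status_line is a hypothetical helper, and the description format for states other than RUNNING is not covered here.

```python
import re


def parse_status_line(line):
    """Split one status line into name, state, description and pid."""
    parts = line.split(None, 2)  # name, state, then the rest of the line
    if len(parts) < 2:
        raise ValueError("not a status line: %r" % line)
    description = parts[2] if len(parts) > 2 else ""
    match = re.search(r"pid (\d+)", description)
    return {
        "name": parts[0],
        "state": parts[1],
        "description": description,
        # pid is only present while the process is running
        "pid": int(match.group(1)) if match else None,
    }


if __name__ == "__main__":
    status = parse_status_line("sitl RUNNING pid 11713, uptime 1:25:24")
    print(status["name"], status["state"], status["pid"])
```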
You can double-check that the process is running with ps:
ps aux | grep server.py
This should give you similar output:
root 11713 0.0 3.0 1145480 122668 ? S 18:56 0:05 /usr/bin/python3 /home/SITL/server.py
Remember that you can start and stop the process at will. Killing the process by its PID will not stop the service: because autorestart is true, Supervisor relaunches it and keeps the app alive. To stop and start the service itself, use supervisorctl:
sudo supervisorctl stop sitl
sudo supervisorctl start sitl
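The relaunch-on-exit idea behind autorestart can be pictured as a small loop. The sketch below is a deliberately simplified model and not how Supervisor is implemented: run_with_restarts is a hypothetical helper, it relaunches only after a non-zero exit, and it has none of Supervisor's backoff, logging, or state tracking.

```python
import subprocess
import sys


def run_with_restarts(command, max_restarts=3):
    """Launch command, relaunching it after each non-zero exit.

    Returns the total number of launches. Illustration only.
    """
    launches = 0
    while True:
        launches += 1
        exit_code = subprocess.call(command)
        if exit_code == 0:
            return launches  # clean exit: nothing to relaunch
        if launches > max_restarts:
            return launches  # allowed restarts are used up, give up


if __name__ == "__main__":
    # A stand-in child process that always fails, so every restart is used.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print("launches:", run_with_restarts(failing, max_restarts=2))
```

A real supervisor adds the parts this loop leaves out, which is the reason to reach for Supervisor rather than a hand-written wrapper.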