Eventually, everyone runs into the problem of a long-running script. These scripts shouldn’t fail, but if they do, there should be some way to relaunch them so that your application stays up.

If you are new to process supervision on Linux, be prepared to step into a long-running controversy over service management. System services, often called daemons on Linux/Unix, are programs that run in the background and provide functionality to other software.

Upstart dates back to 2006: it is a program that could run daemons without having to modify the existing startup scripts. Canonical (creators of Ubuntu) made Upstart part of the Ubuntu 6.10 release, and a number of other Linux distributions migrated towards it afterwards. Here’s where the war starts. During this shift in process supervision, Debian decided to migrate towards a different program called Systemd.

Systemd was created by Lennart Poettering and Kay Sievers, both of whom were Red Hat engineers. Within the next three years, a majority of distributions switched over to Systemd. Today, Systemd ships with most distributions, but you will still find Upstart on some legacy systems. For anyone developing on Linux, this controversy has turned into a war over which program is better.

Recently, I created a Python application that used sockets and many third-party libraries to manage data. The script ran well most of the time, but certain data caused the process to die. I started investigating which program would be best for keeping this script running as a service in the operating system. In the end, I decided to abandon both projects for Supervisor, a process management tool that met all my needs.

Installation

sudo apt-get update
sudo apt-get install -y supervisor
sudo service supervisor start

Configuration

cd /etc/supervisor/
vim supervisord.conf

In the supervisord.conf file you will notice the following section:

[include] 
files = /etc/supervisor/conf.d/*.conf

This tells you that any file in the conf.d folder whose name ends in .conf will be loaded by Supervisor. Now we can create a configuration file for our long-running process. First, change directories into the conf.d folder and create a file for your application. The following configuration file is for my app, SITL.

[program:sitl]
directory=/home/SITL/
command=/usr/bin/python3 /home/SITL/server.py
autostart=true
autorestart=true
startretries=3
stderr_logfile=/var/log/sitl/sitl.err.log
stdout_logfile=/var/log/sitl/sitl.out.log
environment=PYTHONPATH=/home/.local/lib/python3.5/site-packages

Here is what each setting does:

  • [program:sitl] – Defines the program to monitor; the name after the colon (sitl) is the name supervisorctl uses for it.
  • command – The command that starts the monitored process.
  • directory – A directory for Supervisord to “cd” into before running the process; useful when the process assumes a directory structure relative to the location of the executed script.
  • autostart – Setting this to “true” means the process will start when Supervisord starts (essentially on system boot).
  • autorestart – If this is “true”, the program will be restarted whenever it exits, whatever its exit code.
  • startretries – The number of consecutive failed start attempts Supervisord allows before it gives up and marks the process as failed.
  • stderr_logfile – The file that receives the process’s error output (stderr).
  • stdout_logfile – The file that receives the process’s regular output (stdout).
  • user – The user the process is run as. It is not set in the example above, so the process runs as the same user as Supervisord.
  • environment – Environment variables to pass to the process.

Since we specified where the log files should be stored, we need to create that directory on our system.

sudo mkdir /var/log/sitl
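
A forgotten log directory is an easy mistake to make, so it can help to check the configuration before handing it to Supervisor. The following is only a minimal sketch: the helper name missing_log_dirs and the SAMPLE text are made up for illustration, and it assumes Python's standard configparser reads the file closely enough for this purpose (Supervisor has its own parser, which may differ in edge cases).

```python
import configparser
import os

# Sample text mirroring the sitl configuration from this post.
SAMPLE = """\
[program:sitl]
directory=/home/SITL/
command=/usr/bin/python3 /home/SITL/server.py
autostart=true
autorestart=true
startretries=3
stderr_logfile=/var/log/sitl/sitl.err.log
stdout_logfile=/var/log/sitl/sitl.out.log
"""


def missing_log_dirs(config_text):
    """Return log directories named in [program:...] sections that do not exist."""
    # Split only on "=" and disable interpolation so values are read verbatim.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(config_text)
    missing = set()
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        for key in ("stdout_logfile", "stderr_logfile"):
            path = parser[section].get(key)
            # Skip unset values and special non-path values such as AUTO or NONE.
            if not path or not os.path.isabs(path):
                continue
            directory = os.path.dirname(path)
            if not os.path.isdir(directory):
                missing.add(directory)
    return sorted(missing)


if __name__ == "__main__":
    for directory in missing_log_dirs(SAMPLE):
        print("missing log directory:", directory)
```

To check a real file, read its contents and pass the text to missing_log_dirs; any directory it reports needs to be created with mkdir first.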

Controlling Processes

sudo supervisorctl reread 
sudo supervisorctl update

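The reread command makes Supervisor re-read its configuration files, and update applies the changes, which starts the new sitl program. To see the process and check its uptime, run the following command: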

sudo supervisorctl

Now you should see the process along with its PID and uptime:

sitl                             RUNNING   pid 11713, uptime 1:25:24
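
If you want to check this status from another script (a cron job or a health check, for instance), the line above is easy to parse. This is a rough sketch rather than an official interface: the function names parse_status_line and is_running are hypothetical, and the pattern assumes the status line keeps the "name, STATE, details" layout shown above.

```python
import re
import subprocess

# Matches lines such as: "sitl    RUNNING   pid 11713, uptime 1:25:24"
STATUS_RE = re.compile(r"^(?P<name>\S+)\s+(?P<state>[A-Z]+)(?:\s+(?P<detail>.*))?$")


def parse_status_line(line):
    """Parse one status line into a dict, or return None if it does not match."""
    match = STATUS_RE.match(line.strip())
    if match is None:
        return None
    info = {"name": match.group("name"), "state": match.group("state"), "pid": None}
    pid = re.search(r"pid (\d+)", match.group("detail") or "")
    if pid:
        info["pid"] = int(pid.group(1))
    return info


def is_running(program):
    """Run `supervisorctl status <program>` and report whether it is RUNNING.

    May need the same privileges as the sudo commands used in this post.
    """
    result = subprocess.run(
        ["supervisorctl", "status", program],
        capture_output=True,
        text=True,
    )
    for line in result.stdout.splitlines():
        info = parse_status_line(line)
        if info is not None and info["name"] == program:
            return info["state"] == "RUNNING"
    return False
```

Calling is_running("sitl") on the machine above would be expected to return True while the process is up.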

You can double-check this by running the following command:

ps -aux | grep server.py 

This should give you similar output:

root 11713  0.0  3.0 1145480 122668 ?  S    18:56   0:05 /usr/bin/python3 /home/SITL/server.py 

Remember that you can start and stop the process at will. Killing the process by its PID will not stop the service: because autorestart is set to true, Supervisor simply relaunches it, which is what keeps the app alive. To stop and start the service itself, use the following commands:

sudo supervisorctl stop sitl
sudo supervisorctl start sitl