At some point, anyone who writes code long enough runs into it—the script that needs to keep running.
Not the kind you casually restart. Not the kind you watch closely. I’m talking about the ones that quietly sit in the background, doing real work, until one day… they stop. No warning, no message, just gone.
And suddenly you realize you don’t just need code—you need something keeping an eye on the code.
That was the situation I found myself in recently. I had built a Python application that relied on sockets and a handful of third-party libraries to manage data. For the most part, it worked exactly as expected. But every now and then, under the right conditions, it would crash.
Not often enough to ignore. Not consistently enough to easily debug.
So I started thinking less about fixing it immediately, and more about making sure it could recover on its own.
That’s when I fell into the world of process supervision on Linux.
If you’ve never gone down that rabbit hole, it’s surprisingly… opinionated. What should be a simple question—“how do I keep my script running?”—quickly turns into a debate about system design, philosophy, and control over the operating system.
Historically, Linux has gone through a bit of an identity shift here. For a while, the answer was Upstart, which Canonical built for Ubuntu to make managing background services easier without constantly hand-editing init scripts. Once Ubuntu shipped it, a number of other distributions followed.
Then Systemd came along.
Built by engineers at Red Hat, Systemd didn’t just solve the same problem—it redefined how services were managed entirely. Within a few years, most major Linux distributions had adopted it, and suddenly the ecosystem split between those who embraced it and those who… very much did not.
Even today, you’ll still find people who feel strongly about both sides.
I looked into both approaches, mostly out of curiosity, but in the middle of all that, I realized something: I didn’t actually need to pick a side in a decades-long debate.
I just needed my script to stay alive.
That’s what led me to Supervisor.
It’s one of those tools that doesn’t try to do everything—it just does one thing well: managing and monitoring processes. It felt straightforward, easy to reason about, and most importantly, it worked the way I expected it to.
Installation
sudo apt-get update
sudo apt-get install -y supervisor
sudo service supervisor start
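Before moving on, it is worth confirming that the daemon actually came up. On a Debian/Ubuntu-style system like the one assumed by the apt-get commands above, a quick status check does it:
sudo service supervisor status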
Configuration
cd /etc/supervisor/
vim supervisord.conf
In the supervisord.conf file you will notice the following code:
[include]
files = /etc/supervisor/conf.d/*.conf
The first thing that stood out to me was how clean the structure is. There’s a central config file that basically says, “look in this folder for anything I need to run.” That’s it. From there, you just define your program in its own file—what command to run, where to run it, and how you want it handled. My application below is called SITL.
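I put the program definition in its own file under conf.d. Any filename ending in .conf gets picked up by the include directive above; sitl.conf is just a natural choice for this example:
sudo vim /etc/supervisor/conf.d/sitl.conf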
[program:sitl]
directory=/home/SITL/
command=/usr/bin/python3 /home/SITL/server.py
autostart=true
autorestart=true
startretries=3
stderr_logfile=/var/log/sitl/sitl.err.log
stdout_logfile=/var/log/sitl/sitl.out.log
environment=PYTHONPATH=/home/.local/lib/python3.5/site-packages
[program:sitl] – Defines the program to monitor.
command – The command to run that kicks off the monitored process.
directory – A directory for Supervisord to “cd” into before running the process; useful when the process assumes a directory structure relative to the location of the executed script.
autostart – Setting this to “true” means the process starts when Supervisord starts (essentially on system boot).
autorestart – If this is “true”, the program is restarted if it exits unexpectedly.
startretries – The number of restart attempts before the process is considered “failed”.
stderr_logfile – The file to write error output to.
stdout_logfile – The file to write standard output to.
user – The user the process runs as.
environment – Environment variables to pass to the process.
For my setup, I gave it a few basic instructions: start automatically, restart if it crashes, try a few times before giving up, and log everything.
That last part ended up being more useful than I expected.
When your script fails, the hardest part is often figuring out why. Having separate log files for standard output and errors made debugging much easier. It turned the problem from “it stopped working” into “here’s exactly where it broke.”
Supervisor will not create the log directory for you, so it needs to exist before the process starts:
sudo mkdir /var/log/sitl
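Once the process is up and running, following those files is the quickest way to see what it is actually doing; the paths here are just the ones defined in the config above:
sudo tail -f /var/log/sitl/sitl.out.log
sudo tail -f /var/log/sitl/sitl.err.log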
Controlling Processes
Once everything was configured, bringing it online was almost anticlimactic. A quick reload, an update, and suddenly the script was running… but differently. It wasn’t tied to a terminal session anymore. It wasn’t something I had to babysit. It was just there, running as part of the system.
sudo supervisorctl reread
sudo supervisorctl update
To see the process and check how long it has been running, open the supervisorctl console:
sudo supervisorctl
You should see the process listed along with its PID and uptime.
sitl RUNNING pid 11713, uptime 1:25:24
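Running supervisorctl with no arguments drops you into an interactive console; if you just want the status printed and your shell back, the status subcommand works too:
sudo supervisorctl status sitl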
You can double-check this from the process table:
ps aux | grep server.py
The PID should match what supervisorctl reported:
root 11713 0.0 3.0 1145480 122668 ? S 18:56 0:05 /usr/bin/python3 /home/SITL/server.py
Remember, you can start and stop the process whenever you like. Killing the PID directly will not terminate the service keeping the app alive; Supervisor will simply spawn it again. To stop or start the program properly, go through supervisorctl:
sudo supervisorctl stop sitl
sudo supervisorctl start sitl
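Two other subcommands worth knowing: restart bounces the process in a single step, and tail follows its logs without opening the files directly.
sudo supervisorctl restart sitl
sudo supervisorctl tail -f sitl stderr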
Looking back, I probably spent more time than necessary reading about system-level debates when the solution I needed was much simpler. But that’s kind of part of working in this space. You don’t just learn the tool—you learn the ecosystem around it.
And sometimes the takeaway isn’t which approach is “better.” It’s understanding what you actually need in that moment.
For me, it was reliability.
Something that runs, restarts when it fails, and gives me just enough visibility to trust it.
Supervisor did exactly that.
And now, I don’t think twice about long-running scripts anymore.