Keeping a bunch of processes running

From time to time, I need to keep some processes running. If they were simple daemons, I could use something like monit, but what if I need X instances of worker A and Y instances of worker B?

I whipped up a quick script that makes it pretty easy to do that, when needed:

#!/usr/bin/perl -w
use strict;

use Config::Any;

my %pid_to_class;
my %workers;

$SIG{INT} = \&catch_kill;
my $killed = 0;

sub catch_kill {
  $killed = 1;
  foreach my $pid (keys %pid_to_class) {
    kill 9,$pid;
  }
}

sub start {
  my $worker = shift;
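  my $pid = fork;
  unless (defined $pid) { # fork returns undef on failure
    warn "fork failed for $worker->{invoke}: $!\n";
    return;
  }
  if ($pid) { # in parent
    $pid_to_class{$pid} = $worker;
    $worker->{active}++;
    $worker->{started}=time;
  } else { # in child; die if exec fails so we never run the parent's code
    exec($worker->{invoke})
      or die "exec of $worker->{invoke} failed: $!\n";
  }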
}

foreach (@ARGV) { # remove file extensions, they confuse Config::Any
  s/\.(\w+)$//;
}

my $cfg = Config::Any->load_stems({stems => \@ARGV, use_ext=>1});

foreach my $c (@{$cfg}) {
  foreach my $file (keys %{$c}) {
    %workers = %{$c->{$file}{workers}{worker}};
    foreach my $wk (keys %workers) {
      my $worker = $workers{$wk};
      $worker->{name}=$wk;
      print "Worker: ".$worker->{name}." ".$worker->{invoke}."\n";
      $worker->{active} = 0;
      $worker->{wait_before_restart} = 10 unless $worker->{wait_before_restart};
      $worker->{min} = $worker->{start} unless $worker->{min};
      $worker->{max} = $worker->{start} unless $worker->{max};
      $worker->{start} = $worker->{min} unless $worker->{start};
      $worker->{min_runtime} = 30       unless $worker->{min_runtime};
      for my $n (1..$worker->{start}) { # $n only counts; avoids shadowing the outer $c
        start($worker);
      }
    }
  }
}

while ((my $pid = wait) != -1) {
  my $worker = $pid_to_class{$pid};
  my $runtime = time-$worker->{started}; # measure before any sleep below
  print "pid $pid exited, ".$worker->{name}."\n";
  $worker->{active}--;
  delete($pid_to_class{$pid});
  last if $killed && ((scalar keys %pid_to_class) == 0);
  next if $killed;
  if ($runtime < $worker->{min_runtime}) { # warn once, then back off
    warn sprintf "Worker %s (%s) died prematurely (%d seconds)\n",
      $worker->{name},$worker->{invoke},$runtime;
    sleep $worker->{wait_before_restart};
  }
  # restart only while below the minimum, and not if Ctrl-C arrived during the sleep
  start($worker) unless $killed || $worker->{active} >= $worker->{min};
}

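As you can see, it takes a configuration file in your favorite format (XML, JSON, YAML, etc.), and for each worker you can specify how many processes to start (start), the minimum number of processes to keep running (min), and the runtime below which a process is deemed to have exited prematurely (min_runtime, 30 seconds by default). After a premature exit, the script waits wait_before_restart seconds (10 by default) before starting a replacement.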

Here’s an example of the configuration file:

<xml>
<general>
<minimum_respawn>10</minimum_respawn>
</general>
<workers>
  <worker name="A" start="3" invoke="bin/some_process_a"/>
  <worker name="B" start="8" invoke="bin/some_process_b"/>
  <worker name="C" start="1" invoke="bin/some_process_c"/>
<!--   -->
</workers>

</xml>
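
The script also looks for a few optional per-worker keys that the example above doesn't use: min, wait_before_restart and min_runtime. Here is a rough sketch of how they might be set, assuming extra XML attributes reach the script the same way start and invoke do. The values are made up for illustration:

```xml
<xml>
<workers>
  <!-- hypothetical values: start 4, keep at least 2 alive,
       treat an exit within 60 seconds as premature and wait 5 seconds before restarting -->
  <worker name="A" start="4" min="2" min_runtime="60" wait_before_restart="5" invoke="bin/some_process_a"/>
  <!-- no optional keys: min falls back to start, min_runtime to 30, wait_before_restart to 10 -->
  <worker name="B" start="8" invoke="bin/some_process_b"/>
</workers>
</xml>
```

With these settings, worker A would only be restarted once fewer than two instances are left, while worker B is kept at eight.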
