The "get_pid" function for slon-tools.pm tries to find a running "slon" for a given node by doing this:

    ps | egrep "host=$dbhost dbname=$dbname.*port=$dbport"

The ".*" on the end of the dbname means that weird stuff happens if you have a set of nodes like this:

    node1 host=127.0.0.1 dbname=testdb
    node2 host=127.0.0.1 dbname=testdb2
    node3 host=127.0.0.1 dbname=testdb3

Specifically, the success of slon_start is then order-dependent. This works:

    slon_start 1
    slon_start 2
    slon_start 3

...but this fails:

    slon_start 3
    slon_start 2
    slon_start 1   << fails with "slon already running"

I think you want a "\b" before the ".*" to indicate word boundary?
I should add: this is because "dbname.*" matches "dbname" as well as "dbname1", "dbname2", etc. The \b prevents this.
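
For anyone who wants to see the effect without starting any slons, here is a minimal sketch that runs both patterns over canned input instead of real ps output. The cluster name, user, and port in the sample lines are made up for illustration, and it assumes a grep that understands \b as a word boundary (GNU grep does).

```shell
# Made-up ps-style lines; cluster name, user, and port are assumptions.
# Only the slons for testdb2 and testdb3 are "running" here.
sample_ps() {
    cat <<'EOF'
slon mycluster host=127.0.0.1 dbname=testdb2 user=slony port=5432
slon mycluster host=127.0.0.1 dbname=testdb3 user=slony port=5432
EOF
}

dbhost=127.0.0.1
dbname=testdb      # node 1, which is NOT running in the sample above
dbport=5432

# Pattern in the style described above: the bare ".*" lets "testdb"
# also match "testdb2" and "testdb3".
# (grep -c exits non-zero on zero matches, hence the "|| true".)
loose=$(sample_ps | grep -E -c "host=$dbhost dbname=$dbname.*port=$dbport" || true)

# Same pattern with \b after the database name.
strict=$(sample_ps | grep -E -c "host=$dbhost dbname=$dbname\b.*port=$dbport" || true)

echo "loose=$loose strict=$strict"
```

With GNU grep this prints "loose=2 strict=0": the original pattern reports node 1 as running when it is not, and the \b version does not. Note that \b is only a partial fix, since "-" is not a word character: "testdb" would still match a database named "testdb-old".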
I wonder if this is perhaps the wrong approach altogether. It draws us deeper into a "world of hurt" where we are essentially requiring that database connection strings have a particular format, and where we are parsing them in detail. It heads towards more sophisticated Perl regular expressions, and while Perl is pretty good at that, people aren't necessarily good at reading, understanding, or debugging those!

The "better" way I am thinking of would be to consult the appropriate PID file generated by each slon process, which would allow us to avoid using ps altogether. I'd like to poke at that a bit more; I'd be happy to hear other thoughts on the matter.
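
To make the PID file idea concrete, here is a rough sketch of what such a check could look like. It is not the slon-tools implementation: the function name pid_from_file is hypothetical, and it assumes each slon writes its PID as a decimal number on the first line of a known file.

```shell
# Hypothetical helper: print the PID recorded in a PID file, but only if
# that process is still alive. Returns non-zero when the file is missing,
# unreadable, malformed, or stale.
pid_from_file() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1

    # Take the first line and strip any whitespace.
    pid=$(head -n 1 "$pidfile" | tr -d '[:space:]')

    # Accept only a plain decimal number.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # kill -0 sends no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null || return 1

    printf '%s\n' "$pid"
}
```

Usage would be something like `pid_from_file /path/to/node1.pid && echo "slon already running"`, where the path is a placeholder for wherever the PID files end up. One caveat: a stale PID file can name a PID that has since been reused by an unrelated process, so a more careful check might also confirm that the process really is a slon.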