Bug 307

Summary: slon-tools.pm get_pid function uses "ps | egrep" badly
Product: Slony-I
Reporter: p.mayers
Component: altperl
Assignee: Christopher Browne <cbbrowne>
Status: ASSIGNED
CC: slony1-bugs
Severity: minor
Priority: low
Version: 2.0
Hardware: All
OS: Linux

Description p.mayers 2013-07-29 04:31:21 UTC
The "get_pid" function for slon-tools.pm tries to find a running "slon" for a given node by doing this:

 ps | egrep "host=$dbhost dbname=$dbname.*port=$dbport"

The ".*" directly after the dbname means the pattern also matches any database whose name merely begins with $dbname, which causes trouble if you have a set of nodes like this:

 node1 host=127.0.0.1 dbname=testdb
 node2 host=127.0.0.1 dbname=testdb2
 node3 host=127.0.0.1 dbname=testdb3

Specifically, the success of slon_start is then order-dependent. This works:

 slon_start 1
 slon_start 2
 slon_start 3

...but this fails:

 slon_start 3
 slon_start 2
 slon_start 1 << fails with "slon already running"

I think you want a "\b" before the ".*" to indicate word boundary?
Comment 1 p.mayers 2013-07-29 05:25:57 UTC
I should add: this is because "dbname.*" matches "dbname" as well as "dbname1", "dbname2", etc.

The \b prevents this.
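
The over-match can be reproduced without any running slon. The following is a minimal sketch, not the actual get_pid code: the simulated ps lines, the user= field, the config file names, and port 5432 are all made up for illustration, and it assumes a grep that understands "\b" in extended regular expressions (GNU grep does).

```shell
#!/bin/sh
# Simulated "ps" output (hypothetical command lines): only the slons for
# node2 and node3 are running; node1 (dbname=testdb) is NOT running.
ps_output='101 slon -f node2.conf host=127.0.0.1 dbname=testdb2 user=slony port=5432
102 slon -f node3.conf host=127.0.0.1 dbname=testdb3 user=slony port=5432'

# Connection details for node1, the node we are about to start.
dbhost=127.0.0.1
dbname=testdb
dbport=5432

# Pattern as described in the report: ".*" follows the dbname directly.
# "|| true" is there because grep -c exits non-zero when the count is 0.
loose=$(printf '%s\n' "$ps_output" |
    grep -Ec "host=$dbhost dbname=$dbname.*port=$dbport" || true)

# Same pattern with a word boundary after the dbname.
strict=$(printf '%s\n' "$ps_output" |
    grep -Ec "host=$dbhost dbname=$dbname\b.*port=$dbport" || true)

echo "without \\b: $loose matching line(s)"
echo "with \\b:    $strict matching line(s)"
```

Without the boundary, the node1 pattern matches the node2 and node3 lines, which is consistent with the "slon already running" failure. Note that "\b" only separates word characters from non-word characters, so a dbname such as "test" would still match a line containing "dbname=test-db".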
Comment 2 Christopher Browne 2013-08-15 12:59:44 UTC
I wonder if this is perhaps totally the wrong approach altogether.

This draws us deeper into a "world of hurt" where we are essentially requiring that database connection strings have a particular format, and that we are, in detail, parsing them.  It heads towards more sophisticated Perl regular expressions, and while Perl is pretty good at that, people aren't necessarily good at reading, understanding, or debugging those!

The "better" way I am thinking of would be to consult the appropriate PID file generated by each slon process, which would allow us to avoid using ps altogether.

I'd like to poke at that a bit more; I'd be happy to hear other thoughts on the matter.
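
As a rough illustration of the PID file idea, here is a minimal sketch. It assumes each slon was started so that it writes its PID to a known file; the function name slon_running and the file path used in the demonstration are hypothetical and are not part of slon-tools.pm.

```shell
#!/bin/sh
# slon_running PIDFILE
# Succeeds if PIDFILE exists and names a process we can signal.
# Hypothetical helper for illustration only.
slon_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1        # no readable PID file: not running
    pid=$(head -n 1 "$pidfile")
    case $pid in
        ''|*[!0-9]*) return 1 ;;         # empty or non-numeric content
    esac
    # Signal 0 delivers nothing; it only checks that the process exists
    # and that we are permitted to signal it.
    kill -0 "$pid" 2>/dev/null
}

# Demonstration with a made-up PID file that holds this shell's own PID.
demo_pidfile=${TMPDIR:-/tmp}/slon_demo_node1.$$.pid
echo $$ > "$demo_pidfile"
if slon_running "$demo_pidfile"; then echo "running"; else echo "not running"; fi

# Once the PID file is gone, the check reports that nothing is running.
rm -f "$demo_pidfile"
if slon_running "$demo_pidfile"; then echo "running"; else echo "not running"; fi
```

This avoids parsing connection strings entirely, but it has its own caveats: a stale PID file left behind by a crashed slon could name an unrelated process that has since reused the PID, and kill -0 also fails when the process belongs to another user.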