Bug 303 - Slony Watchdog failed starting up the child process
Summary: Slony Watchdog failed starting up the child process
Status: ASSIGNED
Alias: None
Product: Slony-I
Classification: Unclassified
Component: slon
Version: 2.0
Hardware: PC Linux
Importance: low major
Assignee: Steve Singer
Reported: 2013-07-24 12:57 UTC by Rose Nancy
Modified: 2014-01-29 12:22 UTC

Description Rose Nancy 2013-07-24 12:57:53 UTC
At server startup, the slony daemon crashed with the following error:

FATAL localListenThread: "select "_xxxx".cleanupNodelock(); insert into "_xxxx".sl_nodelock values ( 9151, 0,
"pg_catalog".pg_backend_pid()); " - ERROR: duplicate key value violates unique constraint "sl_nodelock-pkey"
DETAIL: Key (nl_nodeid, nl_conncnt)=(9151, 0) already exists.
2013-07-23 07:10:16 UTC FATAL Do you already have a slon running against this node?
2013-07-23 07:10:16 UTC FATAL Or perhaps a residual idle backend connection from a dead slon?
2013-07-23 07:10:16 UTC DEBUG2 slon_abort() from pid=1699
2013-07-23 07:10:16 UTC FATAL main: localListenThread did not start
2013-07-23 07:10:16 UTC CONFIG slon: child terminated signal: 9; pid: 1699, current worker pid: 1699
2013-07-23 07:10:16 UTC INFO slon: done

I know the origin of the duplicate key error, but I expected the Watchdog to try every 10s to start the daemon again.
In my case it seems like the watchdog process died with the child process. 


---------------

The proposed solution is to add a parameter to the slony configuration file that sets how many times the watchdog process will try to restart the child before giving up and exiting.
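
To illustrate the requested behaviour, here is a minimal Python sketch of a watchdog loop with a restart limit. It is not the slon implementation (slon is written in C) and not the patch linked in the comments. The names `run_watchdog`, `start_child`, `max_restarts` and `retry_delay` are hypothetical and chosen only for this example; the 10 second delay comes from the expectation described above.

```python
import time
from typing import Callable


def run_watchdog(
    start_child: Callable[[], int],
    max_restarts: int,
    retry_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Start the child and restart it after each abnormal exit.

    start_child blocks until the child exits and returns its exit status
    (0 means a clean shutdown). max_restarts stands in for the proposed
    configuration parameter (hypothetical name): the number of restarts
    allowed after the first start; a negative value means retry forever.
    Returns the total number of times the child was started.
    """
    starts = 0
    while True:
        status = start_child()
        starts += 1
        if status == 0:
            return starts  # clean exit, nothing to restart
        if 0 <= max_restarts < starts:
            return starts  # restart budget used up, watchdog gives up
        sleep(retry_delay)  # wait before the next attempt
```

With `max_restarts=3` and a child that always fails, the child is started four times (one initial start plus three restarts) with a 10 second pause between attempts. In a real setup, `start_child` could be a small wrapper that launches the daemon with `subprocess.call` and returns its exit status.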
Comment 1 Jan Wieck 2013-07-31 09:51:08 UTC
Patch for feature implementation: https://github.com/wieck/slony1-engine/commit/e4285eba5740dfe535925af232086e0ab3d0077b
Comment 2 Steve Singer 2013-08-12 07:08:36 UTC
This looks fine to me