Bug 174

Summary: Isolated Nodes
Product: Slony-I
Component: slonik
Status: ASSIGNED
Severity: enhancement
Priority: low
Version: devel
Hardware: All
OS: All
Reporter: Christopher Browne <cbbrowne>
Assignee: Christopher Browne <cbbrowne>
CC: slony1-bugs
URL: http://wiki.postgresql.org/wiki/SlonyBrainstorming#Isolated_Node

Description Christopher Browne 2010-12-03 13:42:15 UTC
Per [http://lists.slony.info/pipermail/slony1-general/2010-November/011353.html discussion on list]

* Allow configuring that some nodes should be ignored for the purpose
  of confirmations.

* This allows the cleanup thread to trim out sl_log_(1|2) data for the
  ignored nodes.

* FAILOVER may already support the notion that an "ignored" node might
  be lost.

* An interesting extension: SUBSCRIBE SET might automatically mark the
  "child" node as shunned until such time as the subscription has
  completed and caught up.  This only works if only one set is
  involved; if there are multiple subscriptions, shunning only works
  out well for the first one.

* [Jan] It will work just fine for multiple subscriptions. The point
  here is to free the master, and any forwarders other than the data
  provider of the node that is busy subscribing, from keeping copies
  of the log.
Comment 1 Christopher Browne 2010-12-09 12:33:20 UTC
Making this more concrete...

Set up branch for development...  https://github.com/cbbrowne/slony1-engine/tree/bug174
Comment 2 Christopher Browne 2010-12-09 12:40:39 UTC
Proposed...  Add a table that will be replicated via event publishing that captures tuples of the form [excluder, excludee]

That is, node [excluder] will ignore events coming from node [excludee].

(Open question: Will it just be SYNC events that are ignored?  Seems quite likely apropos...)
Comment 3 Christopher Browne 2010-12-09 12:54:57 UTC
Add in functions to manage the exclusion table.

https://github.com/cbbrowne/slony1-engine/commit/062e79d42c986fc08e826ef2c0523350ee913abb

4 functions:

- excludenode(excluder, excludee)
- excludenode_int(excluder, excludee)
- unexcludenode(excluder, excludee)
- unexcludenode_int(excluder, excludee)

The exclude functions add an [excluder, excludee] pair to the table.

The unexclude functions remove it.
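
To illustrate the table and its add/remove functions, here is a minimal Python sketch. It is not Slony-I code: the class and method names are hypothetical, the table is modelled as an in-memory set, and the rule that a node cannot exclude itself is an assumption that does not appear in the proposal.

```python
# Hypothetical in-memory model of the proposed exclusion table.
# Each entry is an (excluder, excludee) pair of node IDs.


class ExclusionTable:
    def __init__(self):
        self._pairs = set()

    def exclude_node(self, excluder, excludee):
        """Record that `excluder` ignores events from `excludee`."""
        # Assumption (not in the proposal): self-exclusion is rejected.
        if excluder == excludee:
            raise ValueError("a node cannot exclude itself")
        self._pairs.add((excluder, excludee))

    def unexclude_node(self, excluder, excludee):
        """Remove the pair; removing a missing pair is a no-op."""
        self._pairs.discard((excluder, excludee))

    def is_excluded(self, excluder, excludee):
        """True if `excluder` currently ignores events from `excludee`."""
        return (excluder, excludee) in self._pairs


table = ExclusionTable()
table.exclude_node(1, 4)
print(table.is_excluded(1, 4))  # True
print(table.is_excluded(4, 1))  # False
```

The relation is directional: node 1 excluding node 4 says nothing about whether node 4 excludes node 1.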
Comment 4 Christopher Browne 2010-12-09 12:56:38 UTC
Add validation rules:

- You can't add an exclusion if there's a subscription between the nodes

https://github.com/cbbrowne/slony1-engine/commit/5cb7cad6aa22069360be9aa7ad3d5edffd78255d

- You can't set up a subscription if the nodes are involved in exclusion

https://github.com/cbbrowne/slony1-engine/commit/796df78da7e8e639e7df2df642800671ca945b04
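
The two rules can be sketched as a pair of checks. This is a hedged Python illustration, not the code in the commits above: the function names and the set-of-pairs representation are hypothetical, and it assumes both rules apply regardless of direction (a subscription or exclusion either way between the two nodes blocks the request).

```python
# Hypothetical checks mirroring the two validation rules.
# subscriptions: set of (provider, receiver) node ID pairs.
# exclusions:    set of (excluder, excludee) node ID pairs.


def can_add_exclusion(subscriptions, excluder, excludee):
    """Rule 1: refuse an exclusion if the nodes share a subscription."""
    # Assumption: a subscription in either direction blocks the exclusion.
    return (
        (excluder, excludee) not in subscriptions
        and (excludee, excluder) not in subscriptions
    )


def can_subscribe(exclusions, provider, receiver):
    """Rule 2: refuse a subscription if the nodes are in an exclusion."""
    # Assumption: an exclusion in either direction blocks the subscription.
    return (
        (provider, receiver) not in exclusions
        and (receiver, provider) not in exclusions
    )


subscriptions = {(1, 2)}  # node 2 subscribes from node 1
exclusions = {(1, 4)}     # node 1 ignores events from node 4

print(can_add_exclusion(subscriptions, 2, 1))  # False
print(can_add_exclusion(subscriptions, 1, 3))  # True
print(can_subscribe(exclusions, 4, 1))         # False
print(can_subscribe(exclusions, 2, 3))         # True
```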
Comment 5 Steve Singer 2010-12-09 13:02:58 UTC
(In reply to comment #2)
> Proposed...  Add a table that will be replicated via event publishing that
> captures tuples of the form [excluder, excludee]
> 
> That is, node [excluder] will ignore events coming from node [excludee].
> 
> (Open question: Will it just be SYNC events that are ignored?  Seems quite
> likely apropos...)

How would WAIT FOR (today's manual WAIT FOR, ignoring any automatic one) work if you ignore non-SYNC events?

Also, just because node 1 is excluding node 4 doesn't mean that node 3 is excluding node 4 (it probably isn't).  How will WAIT FOR change?

Also, what are the implications of this change for FAILOVER, MOVE SET, or reshaping a subscription?

Also can you propose some syntax for making a node excluded?

What happens if you no longer want a node to be excluded?  Especially if the events have already been deleted on the excluder.
Comment 6 Christopher Browne 2010-12-09 13:37:56 UTC
(In reply to comment #5)
> (In reply to comment #2)
> > Proposed...  Add a table that will be replicated via event publishing that
> > captures tuples of the form [excluder, excludee]
> > 
> > That is, node [excluder] will ignore events coming from node [excludee].
> > 
> > (Open question: Will it just be SYNC events that are ignored?  Seems quite
> > likely apropos...)
> 
> How would WAIT FOR (today's manual WAIT FOR, ignoring any automatic one) work
> if you ignore non-SYNC events?

> Also, just because node 1 is excluding node 4 doesn't mean that node 3 is
> excluding node 4 (it probably isn't).  How will WAIT FOR change?

If it's only present in the new world (e.g., where WAIT is implicit), then that's not *totally* important :-).

But sure, it's a good question.

I imagine the result is that events from Node A will get ignored on Node B.

WAIT FOR presently has 2 forms:

a) WAIT FOR a particular node; this only breaks down if you ask to wait for an event from Node A arriving at Node B.  I think we can make such a request error out - it's not sensible to ask for this if there's an "exclusion" present.

b) WAIT FOR ALL.  I expect we'd have these requests ignore exclusions.

Seems sensible?
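
One possible reading of the two cases, as a minimal Python sketch. Everything here is hypothetical (function name, arguments, error type). In particular it assumes that "ignore exclusions" for WAIT FOR ALL means skipping the nodes that exclude the event's origin, rather than waiting on them anyway.

```python
# Hypothetical model of WAIT FOR in the presence of exclusions.
# exclusions: set of (excluder, excludee) node ID pairs.


def nodes_to_wait_for(exclusions, all_nodes, origin, wait_on="ALL"):
    """Return the node IDs whose confirmation of an event from
    `origin` would be awaited."""
    if wait_on == "ALL":
        # Case (b), under the assumption stated above: skip nodes that
        # exclude the origin, since they would never confirm its events.
        return sorted(
            n for n in all_nodes
            if n != origin and (n, origin) not in exclusions
        )
    # Case (a): waiting on a node that excludes the origin is an error.
    if (wait_on, origin) in exclusions:
        raise ValueError(
            f"node {wait_on} excludes node {origin}; cannot wait for it"
        )
    return [wait_on]


exclusions = {(2, 1)}  # node 2 ignores events from node 1
print(nodes_to_wait_for(exclusions, [1, 2, 3], origin=1))             # [3]
print(nodes_to_wait_for(exclusions, [1, 2, 3], origin=1, wait_on=3))  # [3]
```

Asking for `wait_on=2` with the same exclusions raises the error described in case (a).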
 
> Also, what are the implications of this change for FAILOVER, MOVE SET, or
> reshaping a subscription?

I expect we have to refuse these requests.  It shouldn't be problematic to add tests similar to <https://github.com/cbbrowne/slony1-engine/commit/796df78da7e8e639e7df2df642800671ca945b04> to FAILOVER and MOVE SET.
 
> Also can you propose some syntax for making a node excluded?
> 
> What happens if you no longer want a node to be excluded?  Especially if the
> events have already been deleted on the excluder.

I'd think slonik syntax would look like:

IGNORE NODE (IGNORE ID=2, IGNORE BY=4);
UNIGNORE NODE (ID=2, IGNORE BY=4);

Whether events have been deleted or not seems beside the point; that's a question at the time of SUBSCRIBE SET, MOVE SET, FAILOVER.
Comment 7 Jan Wieck 2011-01-27 08:19:26 UTC
(In reply to comment #2)
> Proposed...  Add a table that will be replicated via event publishing that
> captures tuples of the form [excluder, excludee]
> 
> That is, node [excluder] will ignore events coming from node [excludee].
> 
> (Open question: Will it just be SYNC events that are ignored?  Seems quite
> likely apropos...)

This definition is different from the original and changes the semantics fundamentally.

The original idea is to "assume" confirmations in order to avoid keeping events and log data for distant remote nodes.

That does not mean that we ignore events at all. It means that we don't care when a node two hops away catches up to them.
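
A minimal Python sketch of what "assuming" confirmations could mean for cleanup. This is an illustration only: the function name and data layout are hypothetical, and the real cleanup logic works against Slony-I's own tables and is more involved.

```python
# Hypothetical model: how far log data could be trimmed when the
# confirmations of some nodes are assumed rather than awaited.


def trim_horizon(confirmed, isolated):
    """confirmed: dict of node ID -> highest event seqno it has confirmed.
    isolated:  set of node IDs whose confirmations are assumed.
    Returns the highest seqno confirmed by every non-isolated node,
    or None if no non-isolated node has reported."""
    relevant = [
        seqno for node, seqno in confirmed.items()
        if node not in isolated
    ]
    return min(relevant) if relevant else None


confirmed = {2: 1500, 3: 1480, 4: 900}  # node 4 is distant and lagging

print(trim_horizon(confirmed, isolated=set()))  # 900
print(trim_horizon(confirmed, isolated={4}))    # 1480
```

With node 4 treated as isolated, its lag no longer holds back trimming, yet events still flow to it through its own provider.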