Bug 137

Summary: execute script does not get applied in the correct order
Product: Slony-I Reporter: Steve Singer <ssinger>
Component: stored proceduresAssignee: Jan Wieck <jan>
Status: RESOLVED FIXED    
Severity: major CC: slony1-bugs
Priority: low    
Version: devel   
Hardware: All   
OS: All   
Bug Depends on: 134    
Bug Blocks:    

Description Steve Singer 2010-06-24 10:48:39 UTC
This issue was found while running the javascript version of the testdeadlockddl test.

We have a table table4 with columns id1,id2.

We insert into the table as follows

INSERT INTO table4 VALUES ('......')

Some DDL via EXECUTE SCRIPT is executed to rename id1=>col1 and id2=>col2.

The test as run when the problem occured did the following
1. Launch a large file of INSERTS via psql. Don't wait for it to finish, the individual inserts appear to be running in autocommit mode
2. execute the DDL via EXECUTE SCRIPT. Wait for slonik to finish
3. Launch more transactions

After the DDL_SCRIPT event is processed on the slave we start to get replication failures slon is trying to apply INSERTS with id1,id2 but the columns on the slave have already been switched.

The relevant part of sl_event is

    1 | 5000000022 | SYNC                | 144375:144525:144375,144383,144393,144407,144412,144423,144476,144485
    1 | 5000000023 | DDL_SCRIPT          | 144522:144538:144529
    1 | 5000000024 | SYNC                | 144375:144529:144375,144383,144393,144407,144412,144423,144476,144485,144522


in sl_log_1 the inserts on table4 switch from referencing id1,id2 to col1,col2 at log_txid=144526(references id1,id2)  log_txid=144583 (references col1,col2)

Notice that sync 24 includes logtx_id numbers below 144538

What I THINK is happening is


T1: transaction starts - via slonik EXECUTE SCRIPT
T1: ddlScriptPrepare  creates a SYNC event in sl_event event #10
T2: transaction starts - some sql session
T2: INSERT into a table 'foo'
T2: COMMITS  - this works so far T1 has had no reason to obtain any locks on T1.
T1:  executes the DDL script, this script ALTERS the table 'foo' and renames a column.  This obtains the requisite locks on 'foo'.  T2 has long since committed and released its locks
T1: Inserts the DDL_SCRIPT event into sl_event event #11
T3:  A worker thread generates the next normal sync event on the database this is event #12

The INSERT that T2 did will be picked up by sync #12 since it hadn't yet committed when sync #10 happened.

On the replica  the DDL script (event number #11) will happen before event #12.  When it gets to sync #12 the data in sl_log for the insert will reference the old columns of 'foo' which no longer exist.
Comment 1 Christopher Browne 2010-08-11 13:08:03 UTC
This can only really be rectified in 2.1.
Comment 2 Jan Wieck 2010-08-11 13:36:45 UTC
This is an MVCC problem. Consider the following time line with xact2 being the EXECUTE SCRIPT done by slonik:

xact1: begin;
xact1: insert ...

xact2: begin;
xact2: select ddlScript_prepare_int(); -- Creates SYNC event

xact1: commit;

xact2: delete ...

This is running in read committed mode, so the delete of xact2 will see the row, inserted by xact1 and delete it. However, the SYNC was created before xact1 committed, so it is considered IN PROGRESS. This causes the insert not to be part of the SYNC and it will get replicated after the ddl script delete.

To fix this, slonik needs to issue LOCK TABLE statements right after "begin;". That way, it will wait until all conflicting transactions have committed before generating its own snapshot. 

We need new syntax to add this information to the EXECUTE SCRIPT command and it will be the responsibility of the user to provide the list of tables to lock.


Jan
Comment 3 Christopher Browne 2010-08-18 10:22:09 UTC
Concurrent with resolution of the other issues, we might wish to shift the DDL data from sl_event to sl_log_1/2.

In that case, we'd parse the statements just once, inside Slonik, and introduce one sl_log_* tuple for each statement.

This eliminates the need to parse things later, and eliminates the need to link the C code in src/parsestatements to both slon and slonik - it's only needed by slon.

We had speculated that this split might allow avoiding the ordering problems, but that is regrettably not the case, even in the ambitious scenario where we'd have commit timestamps available to us, because the time at which the DDL is executed and the time at which the DDL statement is recorded in sl_log_* isn't identical.

But we'd get the other benefits mentioned earlier from the split/shift, e.g. - parse once, and eliminate component from slon.
Comment 4 Christopher Browne 2010-08-18 10:23:28 UTC
BTW, the dependancy on #134 is there because, if we shift DDL into sl_log_*, then the processing of it on subscribers shifts around into the same logic that bug #134 (implementing TRUNCATE-handling triggers) uses.
Comment 5 Christopher Browne 2010-08-19 12:59:05 UTC
(In reply to comment #4)
> BTW, the dependancy on #134 is there because, if we shift DDL into sl_log_*,
> then the processing of it on subscribers shifts around into the same logic that
> bug #134 (implementing TRUNCATE-handling triggers) uses.

Note that some further discussion is taking place on the mailing list; see the following thread:

http://lists.slony.info/pipermail/slony1-general/2010-August/011051.html
Comment 6 Christopher Browne 2011-05-05 13:13:11 UTC
I am "poking away" at this; see branch at GitHub:

https://github.com/cbbrowne/slony1-engine/tree/bug137
Comment 7 Christopher Browne 2011-05-17 10:07:56 UTC
After discussion with Marko Kreen and Jan Wieck, it should not be necessary to split off the LOCK statements from the DDL.

If it's needful to lock tables in particular order to evade deadlock, that is dealt with by putting LOCK requests into the DDL script.

But by running each statement immediately before logging into sl_log_{1,2}, that imposes all necessary locks in an order that protects from race conditions.
Comment 8 Christopher Browne 2011-11-09 14:56:58 UTC
A version of this that functions against version 2.2 may be found here:
https://github.com/cbbrowne/slony1-engine/tree/bug137-squashed

Note that this branch was arrived at by merging Jan's 2.2 branch and Steve's remote worker changes atop "master":

https://github.com/wieck/slony1-engine/tree/copy-protocol-22
https://github.com/ssinger/slony1-engine/tree/copy-protocol_22_remoteworker

In practice, I think my changes are all expressed by one patch thus:
https://github.com/cbbrowne/slony1-engine/commit/5187d5343145c3c145e328ad05912d35334cd601

I'll be poking further at the documentation, as some of the old docs are invalidated by this change.
Comment 9 Steve Singer 2011-11-15 13:04:19 UTC
I have looked at your changes.

Here is my review



1. Your unit test changes should be to the javascript version of the regression ones not the shell ones

2. It would be nice if we could be syntax backwards compatible on EXECUTE SCRIPT.
ie 
EXECUTE SCRIPT(set id=1, FILENAME='foo', EXECUTE ONLY ON=1, EVENT NODE=1); 

should still work. The grammer accepts this, it is the code in assign_options that need to have an INT_OR_STR option. or something.  We would also need to make event_node optional if only 1 ONLY ON node is specifid.  



3.  In ddlchanges.sgml item 3
I am opposed to saying in the manual that DDL can be submitted directly with the ddlCapture SQL function.  I feel that only slonik should be calling these slony functions.  By telling people in the manual that they can do this then we are committing to a somewhat stable API.
I think it will also encourage people to call other slony SQL functions directly.


The counter argument is that now that ddlChange() isn't an event it actually can be safely called by someone other than slonik and it allows applications (ie java, php,perl,python etc...) do do there own DDL without having to make a system() call to slonik.  I am somewhat sympathetic to this but I remind people that in any version prior to 2.2 this was a really bad/broken idea. Even in 2.2 calling ddlChange() outside of slonik with more than one EVENT NODE can expose you to race conditions.


In slonik_ref.sgml
I would like to see any encouragement to calling ddlCapture() directly removed (see above for reasons)


slony1_funcs.sql

In 2.1 the ddlSCript_complete_int() function calls repair_log_triggers() and updateRelname() to fixup the slony sl_table and log triggers after a DDL event.  I don't see these being used in 2.2.  I *think* the log_apply trigger needs to perform both of these actions after executing the SQL.

slony1_base.sql

In sl_log_script definition we are using bigint for log_txid and int8 for log_actionseq.  I see sl_log_1 sl_log_2 are defined the same way.  Is there a difference between bigint and int8?  Should we be consistnet? Should we be changing sl_log_1 and sl_log_2 (since the 2.2 upgrade code is dropping sl_log_1 anyway)?


slonik.c
--------------

slonik_ddl_script

We don't free res1 in the happy path, just in the error case.

If the SET ID is never used should we make it an optional element in the grammer (script_check_stmtS)




remote_worker.c
----------------------

Create a two node test case.

create set(set id=1, origin=1);
set add table(set id=1, fully qualified name='public.foo');
subscribe set(id=1,provider=1,receiver=2);
execute script(set id=1, filename='bar.sql', execute on=2);

The script gets executed on node 2 but never on node 1.  Even though I didn't submit a 'EXECUTE ONLY ON'.  The problem is that if a node 1 is not receiving any subscription sets from node 2 the node 1 will never see any DDL from node 2 because the provider structures in the remoteWorker don't have it listening for these events.
Comment 10 Christopher Browne 2011-11-15 14:41:25 UTC
(In reply to comment #9)
> I have looked at your changes.
> 
> 1. Your unit test changes should be to the javascript version of the regression
> ones not the shell ones

Good point; I have patched the clustertest tests.

https://github.com/cbbrowne/slony1-engine/commit/e6eb7808f6cc3124385c974c235bb48e925af4f7

https://github.com/cbbrowne/slony1-engine/commit/8065a2e4c4426f8f66a9a276fd90a90407bc9dd0
Comment 11 Christopher Browne 2011-11-15 15:00:09 UTC
(In reply to comment #9)
> 2. It would be nice if we could be syntax backwards compatible on EXECUTE
> SCRIPT.
> ie 
> EXECUTE SCRIPT(set id=1, FILENAME='foo', EXECUTE ONLY ON=1, EVENT NODE=1); 
> 
> should still work. The grammer accepts this, it is the code in assign_options
> that need to have an INT_OR_STR option. or something.  We would also need to
> make event_node optional if only 1 ONLY ON node is specifid.  

Good points.  I'm taking a poke at this.
Comment 12 Christopher Browne 2011-11-17 10:00:30 UTC
(In reply to comment #9)
> 2. It would be nice if we could be syntax backwards compatible on EXECUTE
> SCRIPT.
> ie 
> EXECUTE SCRIPT(set id=1, FILENAME='foo', EXECUTE ONLY ON=1, EVENT NODE=1); 
> 
> should still work. The grammer accepts this, it is the code in assign_options
> that need to have an INT_OR_STR option. or something.

https://github.com/cbbrowne/slony1-engine/commit/30912a5af6a1077e07437e96a09d788dd412e544
Comment 13 Christopher Browne 2011-11-17 10:04:55 UTC
(In reply to comment #9)
> 3.  In ddlchanges.sgml item 3
> I am opposed to saying in the manual that DDL can be submitted directly with
> the ddlCapture SQL function.  I feel that only slonik should be calling these
> slony functions.  By telling people in the manual that they can do this then we
> are committing to a somewhat stable API.
> I think it will also encourage people to call other slony SQL functions
> directly.
> 
> 
> The counter argument is that now that ddlChange() isn't an event it actually
> can be safely called by someone other than slonik and it allows applications
> (ie java, php,perl,python etc...) do do there own DDL without having to make a
> system() call to slonik.  I am somewhat sympathetic to this but I remind people
> that in any version prior to 2.2 this was a really bad/broken idea. Even in 2.2
> calling ddlChange() outside of slonik with more than one EVENT NODE can expose
> you to race conditions.
 
https://github.com/cbbrowne/slony1-engine/commit/af54adb535146a84923ae3e59c1c6c27217e1ab8
 
> In slonik_ref.sgml
> I would like to see any encouragement to calling ddlCapture() directly removed
> (see above for reasons)

https://github.com/cbbrowne/slony1-engine/commit/789534ca1658bcc685a59f325fe2658827d3de5c
Comment 14 Christopher Browne 2011-11-18 09:07:28 UTC
(In reply to comment #9)
> 2. It would be nice if we could be syntax backwards compatible on EXECUTE
> SCRIPT.
> ie 
> EXECUTE SCRIPT(set id=1, FILENAME='foo', EXECUTE ONLY ON=1, EVENT NODE=1); 
> 
> should still work. The grammer accepts this, it is the code in assign_options
> that need to have an INT_OR_STR option. or something.  We would also need to
> make event_node optional if only 1 ONLY ON node is specifid.  

Added this to docs:

https://github.com/cbbrowne/slony1-engine/commit/6dc3401b52b3ee03dd26b0a6d87c22101a19fffb

Added regression tests for several cases surrounding this:

https://github.com/cbbrowne/slony1-engine/commit/4e6e90dd332c965dea399abde2266b2b9cdc5063
Comment 15 Christopher Browne 2011-11-18 09:40:02 UTC
(In reply to comment #9)
> slony1_funcs.sql
> 
> In 2.1 the ddlSCript_complete_int() function calls repair_log_triggers() and
> updateRelname() to fixup the slony sl_table and log triggers after a DDL event.
>  I don't see these being used in 2.2.  I *think* the log_apply trigger needs to
> perform both of these actions after executing the SQL.

Yep, quite right.

Call these functions:
https://github.com/cbbrowne/slony1-engine/commit/b07cf0014b8e52b7c597b54a4765ee3cda6d421e

Optimization: only fix values that are busted
https://github.com/cbbrowne/slony1-engine/commit/bd913b2adbe356e15b3a59c85a830ea4269d403a
Comment 16 Steve Singer 2013-06-06 11:41:59 UTC
This was fixed in the 2.2 development cycle