HW/SW info: Slony: v2.2.1 OS: CentOS release 6.4 (2.6.32-358.6.2.el6.x86_64) DB: PostgreSQL 9.2.4 Details: Sync failed right after initial copy. At first time, with "omit copy = yes" it failed with: >>>> 2013-12-03 22:13:02 HKT ERROR remoteWorkerThread_1: SYNC aborted 2013-12-03 22:13:12 HKT INFO remoteWorkerThread_1: syncing set 18 with 219 table(s) from provider 1 2013-12-03 22:13:12 HKT ERROR remoteWorkerThread_1_1: error at end of COPY IN: ERROR: invalid memory alloc request size 1869439327 CONTEXT: COPY sl_log_2, line 949: "1 1810515783 18203 27151 casino jackpot_contrib_summary U 1 {contrib_amount_bc,4.317294024,summary_d..." 2013-12-03 22:13:12 HKT ERROR remoteWorkerThread_1_1: failed SYNC's log selection query was ' COPY ( select log_origin, log_txid, NULL::integer, log_actionseq, NULL::text, NULL::text, log_cmdtype, NULL::integer, log_cmdargs from "_cluster_casino".sl_log_script where log_origin = 1 and log_txid >= "pg_catalog".txid_snapshot_xmax('1810515020:1810515020:') and log_txid < '1810515860' and "pg_catalog".txid_visible_in_snapshot(log_txid, '1810515860:1810515860:') union all select log_origin, log_txid, NULL::integer, log_actionseq, NULL::text, NULL::text, log_cmdtype, NULL::integer, log_cmdargs from "_cluster_casino".sl_log_script where log_origin = 1 and log_txid in (select * from "pg_catalog".txid_snapshot_xip('1810515020:1810515020:') except select * from "pg_catalog".txid_snapshot_xip('1810515860:1810515860:') ) union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_1 where log_origin = 1 and log_tableid in (18229,18046,18058,18047,18048,18080,18104,18049,18050,18001,18002,18003,18005,18006,18007,18008,18009,18011,18012,18013,18010,18014,18015, 18016,18017,18018,18019,18020,18021,18022,18023,18024,18025,18026,18030,18027,18028,18031,18032,18033,18034,18035,18036,18037,18038,18039,18040, 18041,18042,18043,18044,18045,18051,18052,18053,18054,18055,18056,18057,18059,18060,18061,18062,18063,18064,18065,18066,18067,18068,18069,18070, 18071,18072,18073,18074,18075,18076,18077,18078,18079,18081,18082,18083,18084,18085,18086,18087,18088,18089,18090,18091,18092,18093,18094,18095, 18096,18097,18098,18099,18100,18101,18102,18103,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122, 18123,18124,18125,18126,18127,18128,18129,18130,18132,18135,18136,18137,18138,18139,18140,18141,18142,18143,18145,18146,18147,18148,18149,18150, 18151,18152,18153,18154,18155,18156,18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173,18174, 18175,18176,18177,18178,18179,18180,18181,18182,18183,18184,18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199, 18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,18212,18213,18214,18215,18216,18217,18218,18219,18220,18221,18222,18223,18224, 18225,18226,18227,18228) and log_txid >= '1810515020' and log_txid < '1810515860' and "pg_catalog".txid_visible_in_snapshot(log_txid, '1810515860:1810515860:') union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_1 where log_origin = 1 and log_tableid in (18229,18046,18058,18047,18048,18080,18104,18049,18050,18001,18002,18003,18005,18006,18007,18008,18009,18011,18012,18013, 18010,18014,18015,18016,18017,18018,18019,18020,18021,18022,18023,18024,18025,18026,18030,18027,18028,18031,18032,18033,18034,18035,18036, 18037,18038,18039,18040,18041,18042,18043,18044,18045,18051,18052,18053,18054,18055,18056,18057,18059,18060,18061,18062,18063,18064, 18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18081,18082,18083,18084,18085,18086,18087, 18088,18089,18090,18091,18092,18093,18094,18095,18096,18097,18098,18099,18100,18101,18102,18103,18105,18106,18107,18110,18111,18112, 18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124,18125,18126,18127,18128,18129,18130,18132,18135,18136,18137, 18138,18139,18140,18141,18142,18143,18145,18146,18147,18148,18149,18150,18151,18152,18153,18154,18155,18156,18157,18158,18159,18160, 18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173,18174,18175,18176,18177,18178,18179,18180,18181,18182, 18183,18184,18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206, 18207,18208,18209,18210,18211,18212,18213,18214,18215,18216,18217,18218,18219,18220,18221,18222,18223,18224,18225,18226,18227,18228) and log_txid in (select * from "pg_catalog".txid_snapshot_xip('1810515020:1810515020:') except select * from "pg_catalog".txid_snapshot_xip('1810515860:1810515860:') ) union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_2 where log_origin = 1 and log_tableid in (18229,18046,18058,18047,18048,18080,18104,18049,18050,18001,18002,18003,18005, 18006,18007,18008,18009,18011,18012,18013,18010,18014,18015,18016,18017,18018,18019,18020,18021,18022,18023,18024,18025,18026,18030,18027,18028,18031, 18032,18033,18034,18035,18036,18037,18038,18039,18040,18041,18042,18043,18044,18045,18051,18052,18053,18054,18055,18056,18057,18059,18060,18061,18062,18063,18064, 18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18081,18082,18083,18084,18085,18086,18087,18088,18089,18090,18091,18092,18093, 18094,18095,18096,18097,18098,18099,18100,18101,18102,18103,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124, 18125,18126,18127,18128,18129,18130,18132,18135,18136,18137,18138,18139,18140,18141,18142,18143,18145,18146,18147,18148,18149,18150,18151,18152,18153,18154,18155,18156, 18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173,18174,18175,18176,18177,18178,18179,18180,18181,18182,18183,18184, 18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,18212,18213,18214, 18215,18216,18217,18218,18219,18220,18221,18222,18223,18224,18225,18226,18227,18228) and log_txid >= '1810515020' and log_txid < '1810515860' and "pg_catalog".txid_visible_in_snapshot(log_txid, '1810515860:1810515860:') union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_2 where log_origin = 1 and log_tableid in (18229,18046,18058,18047,18048,18080,18104,18049,18050,18001,18002,18003,18005,18006,18007,18008,18009,18011,18012,18013,18010,18014,18015,18016,18017,18018,18019,18020,18021,18022,18023,18024,18025,18026,18030,18027,18028,18031,18032,18033,18034,18035,18036,18037,18038,18039,18040,18041,18042,18043,18044,18045,18051,18052,18053,18054,18055,18056,18057,18059,18060,18061,18062,18063,18064,18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18081,18082,18083,18084,18085,18086,18087,18088,18089,18090,18091,18092,18093,18094,18095,18096,18097,18098,18099,18100,18101,18102,18103,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124,18125,18126,18127,18128,18129,18130,18132,18135,18136,18137,18138,18139,18140,18141, 18142,18143,18145,18146,18147,18148,18149,18150,18151,18152,18153,18154,18155,18156,18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173,18174,18175,18176,18177,18178,18179,18180,18181,18182,18183,18184,18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,18212,18213,18214,18215,18216,18217,18218,18219,18220,18221,18222,18223,18224,18225,18226,18227,18228) and log_txid in (select * from "pg_catalog".txid_snapshot_xip('1810515020:1810515020:') except select * from "pg_catalog".txid_snapshot_xip('1810515860:1810515860:') ) order by log_actionseq) TO STDOUT <<<< Later after cluster uninstall, after initial copy with "omit copy = no": >>>> 2013-12-04 06:59:42 HKT INFO remoteWorkerThread_1: 0.280 seconds to copy sequences 2013-12-04 06:59:42 HKT INFO remoteWorkerThread_1: copy_set SYNC found, use event seqno 5000000641. 2013-12-04 06:59:43 HKT INFO remoteWorkerThread_1: 0.477 seconds to build initial setsync status 2013-12-04 06:59:43 HKT INFO copy_set 18 done in 10132.370 seconds 2013-12-04 06:59:43 HKT CONFIG enableSubscription: sub_set=18 2013-12-04 07:00:13 HKT CONFIG storeListen: li_origin=1 li_receiver=2 li_provider=1 2013-12-04 07:00:13 HKT CONFIG remoteWorkerThread_1: update provider configuration 2013-12-04 07:00:13 HKT CONFIG remoteWorkerThread_1: added active set 18 to provider 1 2013-12-04 07:00:13 HKT CONFIG version for "dbname=db01 host=db01-slony port=5443 user=slon_casino" is 90204 2013-12-04 07:00:13 HKT INFO remoteWorkerThread_1: syncing set 18 with 219 table(s) from provider 1 2013-12-04 07:00:13 HKT INFO cleanupThread: 9616.145 seconds for cleanupEvent() 2013-12-04 07:00:17 HKT INFO remoteWorkerThread_1: SYNC 5000000642 done in 3.942 seconds 2013-12-04 07:00:17 HKT INFO remoteWorkerThread_1: syncing set 18 with 219 table(s) from provider 1 2013-12-04 07:01:26 HKT ERROR remoteWorkerThread_1_1: error at end of COPY IN: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. 2013-12-04 07:01:26 HKT ERROR remoteWorkerThread_1_1: failed SYNC's log selection query was 'COPY ( select log_origin, log_txid, NULL::integer, log_actionseq, NULL::text, NULL::text, log_cmdtype, NULL::integer, log_cmdargs from "_clust er_casino".sl_log_script where log_origin = 1 and log_txid >= "pg_catalog".txid_snapshot_xmax('1814446848:1814446855:1814446848') and log_txid < '1814450484' and "pg_catalog".txid_visible_in_snapshot(log_txid, '1814450484:1814450484:') union all select log_origin, log_txid, NULL::integer, log_actionseq, NULL::text, NULL::text, log_cmdtype, NULL::integer, log_cmdargs from "_cluster_casino".sl_log_script where log_origin = 1 and log_txid in (select * from "pg_catalog".t xid_snapshot_xip('1814446848:1814446855:1814446848') except select * from "pg_catalog".txid_snapshot_xip('1814450484:1814450484:') ) union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_1 where log_origin = 1 and log_tableid in (18001,18002,18003,18005,18006,18007,18008,18009,18010,18011,18012,18013,18014,18015,18016,18017,18018,18019,18020,18021,1 8022,18023,18024,18025,18026,18027,18028,18030,18031,18032,18033,18034,18035,18036,18037,18038,18039,18040,18041,18042,18043,18044,18045,18046,18047,18048,18049,18050,18051,18052,18053,18054,18055,18056,18057,18058,18059,18060,18061,180 62,18063,18064,18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18080,18081,18082,18083,18084,18085,18086,18087,18088,18089,18090,18091,18092,18093,18094,18095,18096,18097,18098,18099,18100,18101 ,18102,18103,18104,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124,18125,18126,18127,18128,18129,18130,18132,18135,18136,18137,18138,18139,18140,18141,18142,18143,18145,18146,1 8147,18148,18149,18150,18151,18152,18153,18154,18155,18156,18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173,18174,18175,18176,18177,18178,18179,18180,18181,18182,18183,18184,18185,181 86,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,18212,18213,18214,18215,18216,18217,18218,18219,18220,18221,18222,18223,18224,18225,18226,18227 ,18228,18229) and log_txid >= '1814446855' and log_txid < '1814450484' and "pg_catalog".txid_visible_in_snapshot(log_txid, '1814450484:1814450484:') union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, lo g_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_1 where log_origin = 1 and log_tableid in (18001,18002,18003,18005,18006,18007,18008,18009,18010,18011,18012,18013,18014,18015,18016,18017,18018,180 19,18020,18021,18022,18023,18024,18025,18026,18027,18028,18030,18031,18032,18033,18034,18035,18036,18037,18038,18039,18040,18041,18042,18043,18044,18045,18046,18047,18048,18049,18050,18051,18052,18053,18054,18055,18056,18057,18058,18059 ,18060,18061,18062,18063,18064,18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18080,18081,18082,18083,18084,18085,18086,18087,18088,18089,18090,18091,18092,18093,18094,18095,18096,18097,18098,1 8099,18100,18101,18102,18103,18104,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124,18125,18126,18127,18128,18129,18130,18132,18135,18136,18137,18138,18139,18140,18141,18142,181 43,18145,18146,18147,18148,18149,18150,18151,18152,18153,18154,18155,18156,18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173,18174,18175,18176,18177,18178,18179,18180,18181,18182,18183 ,18184,18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,18212,18213,18214,18215,18216,18217,18218,18219,18220,18221,18222,18223,18224,1 8225,18226,18227,18228,18229) and log_txid in (select * from "pg_catalog".txid_snapshot_xip('1814446848:1814446855:1814446848') except select * from "pg_catalog".txid_snapshot_xip('1814450484:1814450484:') ) union all select log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_2 where log_origin = 1 and log_tableid in (18001,18002,18003,18005,18006,18007,18008,1800 9,18010,18011,18012,18013,18014,18015,18016,18017,18018,18019,18020,18021,18022,18023,18024,18025,18026,18027,18028,18030,18031,18032,18033,18034,18035,18036,18037,18038,18039,18040,18041,18042,18043,18044,18045,18046,18047,18048,18049, 18050,18051,18052,18053,18054,18055,18056,18057,18058,18059,18060,18061,18062,18063,18064,18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18080,18081,18082,18083,18084,18085,18086,18087,18088,18 089,18090,18091,18092,18093,18094,18095,18096,18097,18098,18099,18100,18101,18102,18103,18104,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124,18125,18126,18127,18128,18129,1813 0,18132,18135,18136,18137,18138,18139,18140,18141,18142,18143,18145,18146,18147,18148,18149,18150,18151,18152,18153,18154,18155,18156,18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18171,18172,18173, 18174,18175,18176,18177,18178,18179,18180,18181,18182,18183,18184,18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,18212,18213,18214,18 215,18216,18217,18218,18219,18220,18221,18222,18223,18224,18225,18226,18227,18228,18229) and log_txid >= '1814446855' and log_txid < '1814450484' and "pg_catalog".txid_visible_in_snapshot(log_txid, '1814450484:1814450484:') union all se lect log_origin, log_txid, log_tableid, log_actionseq, log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs from "_cluster_casino".sl_log_2 where log_origin = 1 and log_tableid in (18001,18002,18003,18005,18006, 18007,18008,18009,18010,18011,18012,18013,18014,18015,18016,18017,18018,18019,18020,18021,18022,18023,18024,18025,18026,18027,18028,18030,18031,18032,18033,18034,18035,18036,18037,18038,18039,18040,18041,18042,18043,18044,18045,18046,18 047,18048,18049,18050,18051,18052,18053,18054,18055,18056,18057,18058,18059,18060,18061,18062,18063,18064,18065,18066,18067,18068,18069,18070,18071,18072,18073,18074,18075,18076,18077,18078,18079,18080,18081,18082,18083,18084,18085,1808 6,18087,18088,18089,18090,18091,18092,18093,18094,18095,18096,18097,18098,18099,18100,18101,18102,18103,18104,18105,18106,18107,18110,18111,18112,18113,18114,18115,18116,18117,18118,18119,18120,18121,18122,18123,18124,18125,18126,18127, 18128,18129,18130,18132,18135,18136,18137,18138,18139,18140,18141,18142,18143,18145,18146,18147,18148,18149,18150,18151,18152,18153,18154,18155,18156,18157,18158,18159,18160,18161,18162,18163,18164,18165,18166,18167,18168,18169,18170,18 171,18172,18173,18174,18175,18176,18177,18178,18179,18180,18181,18182,18183,18184,18185,18186,18187,18189,18190,18191,18192,18193,18194,18195,18196,18197,18198,18199,18201,18202,18203,18204,18205,18206,18207,18208,18209,18210,18211,1821 2,18213,18214,18215,18216,18217,18218,18219,18220,18221,18222,18223,18224,18225,18226,18227,18228,18229) and log_txid in (select * from "pg_catalog".txid_snapshot_xip('1814446848:1814446855:1814446848') except select * from "pg_catalog" .txid_snapshot_xip('1814450484:1814450484:') ) order by log_actionseq) TO STDOUT' 2013-12-04 07:01:26 HKT ERROR remoteWorkerThread_1: SYNC aborted 2013-12-04 07:01:26 HKT ERROR remoteWorkerThread_1: "rollback transaction" PGRES_FATAL_ERROR 2013-12-04 07:01:26 HKT CONFIG slon: child terminated signal: 9; pid: 12018, current worker pid: 12018 2013-12-04 07:01:26 HKT CONFIG slon: restart of worker in 10 seconds <<<< Together with second attempt's failure I got "Segmentation fault" in Postgres: >>>> <2013-12-04 07:00:13 HKT-slon_casino-XXX.XXX.XXX.85(57578)-db01 [app:slon.local_cleanup,pid:12031,00000]>LOG: process 12031 acquired AccessExclusiveLock on relation 355503 of database 31761 after 9616098.245 ms <2013-12-04 07:00:13 HKT-slon_casino-XXX.XXX.XXX.85(57578)-db01 [app:slon.local_cleanup,pid:12031,00000]>STATEMENT: begin;lock table "_cluster_casino".sl_config_lock;select "_cluster_casino".cleanupEvent('10 minutes'::interval);commi t; <2013-12-04 07:00:13 HKT-slon_casino-XXX.XXX.XXX.85(57578)-db01 [app:slon.local_cleanup,pid:12031,00000]>LOG: duration: 9616145.184 ms statement: begin;lock table "_cluster_casino".sl_config_lock;select "_cluster_casino".cleanupEven t('10 minutes'::interval);commit; <2013-12-04 07:01:26 HKT--- [app:,pid:75714,00000]>LOG: server process (PID 12028) was terminated by signal 11: Segmentation fault <2013-12-04 07:01:26 HKT--- [app:,pid:75714,00000]>DETAIL: Failed process was running: COPY "_cluster_casino"."sl_log_2" ( log_origin, log_txid,log_tableid,log_actionseq,log_tablenspname, log_tablerelname, log_cmdtype, log_cmdupdncols, log_cmdargs) FROM STDIN <2013-12-04 07:01:26 HKT--- [app:,pid:75714,00000]>LOG: terminating any other active server processes <2013-12-04 07:01:27 HKT--- [app:,pid:75714,00000]>LOG: all server processes terminated; reinitializing <2013-12-04 07:01:30 HKT--- [app:,pid:21980,00000]>LOG: database system was interrupted; last known up at 2013-12-04 06:58:01 HKT <2013-12-04 07:01:30 HKT--- [app:,pid:21980,00000]>LOG: database system was not properly shut down; automatic recovery in progress <2013-12-04 07:01:30 HKT--- [app:,pid:21980,00000]>LOG: redo starts at 759/FA2ED570 <2013-12-04 07:01:36 HKT-slon_xg-XXX.XXX.XXX.84(58626)-db01 [app:[unknown],pid:21983,57P03]>FATAL: the database system is in recovery mode <<<< Restart neither postgresql nor the slony has not helped. So, I did a rollback to Slony 2.1.4 and it works perfect now. P.s. I have 3 independent slony clusters (for 3 different db schemas) within one database. I got such problems only with one of them. Other 2 worked good with v2.2.1. Unfortunately, I can't do more investigations for you, because this is production env and I already did a rollback to 2.1.4.
Interestingly the number 1869439327 (the requested memory size) represents the string "_amo", which is part of the array data of that very log row in the column/value list right in the first column name "contrib_amount_bc". This would indicate something went wrong in the apply trigger and apparently wrote that column name over some other variables. I'll take a detailed look at the apply trigger code.
I think I may have hit this same bug. My symptoms: Set up a cluster with Postgres 9.2.6 and Slony 2.2.1. Cascading replication set up with Master -> Intermediate Slave -> 4 Slaves After 2 days of uptime and no DDL changes, the intermediate slave's slon daemon stated saying messages like: 2014-01-23 14:55:58 EST ERROR remoteWorkerThread_1_1: error at end of COPY IN: ERROR: invalid input syntax for integer: "33.7726910000000018" The column in question is a “double precision” on both master & slave. I tried restarting the slon daemon on both master and intermediate slave. The error message changed slightly: COPY IN: ERROR: invalid input syntax for integer: "Los Angeles" Unfortunately I have to tear down this cluster since we need the machinery, but I'll keep my eye out for it again
Created attachment 199 [details] Proposed fix
I was able to reproduce the memory corruption and found what I believe is the place, where it happens. The code, that evicts an apply query form the cache when more than apply_cache_size different query plans are needed inside of a single SYNC processing, free'd data from the "new" cache entry, not the one to evict. The commit is here: https://github.com/wieck/slony1-engine/commit/6b56d480e211344eb952c9c23924ba555ac21d28
This fix looks safe to commit. I haven't verified if the fix actually fixes the problem reported. How difficult is it to write a unit test that would reproduce this issue?
Patch applied to REL_2_2_STABLE and master. A unit test does not necessarily make sense for this particular bug. Reproducing it was rather simple by creating an artificial transaction profile that caused individual LRU based apply cache eviction. After fixing it I ran the corresponding test application over night, pushing millions of transactions with a random number of updates while replicating thousands of SYNC events without any problems or apparent memory leaks. Any sort of unit test would not even scratch that test.
*** Bug 330 has been marked as a duplicate of this bug. ***