PATCH: logical_work_mem and logical streaming of large in-progress transactions


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Dilip Kumar-2
On Mon, Sep 7, 2020 at 12:00 PM Amit Kapila <[hidden email]> wrote:
>
> On Sat, Sep 5, 2020 at 8:55 PM Dilip Kumar <[hidden email]> wrote:
> >
> >
> > I have reviewed the changes and looks fine to me.
> >
>
> Thanks, I have pushed the last patch. Let's wait for a day or so to
> see the buildfarm reports and then we can probably close this CF
> entry.

Thanks.

> I am aware that we have one patch related to stats still
> pending but I think we can tackle it along with the spill stats patch
> which is being discussed in a different thread [1]. Do let me know if
> I have missed anything?
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1JBqQh9cBKjO-nKOOE%3D7f6ONDCZp0TJZfn4VsQqRZ%2BuYA%40mail.gmail.com

Sounds good to me.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Mon, Sep 7, 2020 at 12:57 PM Dilip Kumar <[hidden email]> wrote:

>
> On Mon, Sep 7, 2020 at 12:00 PM Amit Kapila <[hidden email]> wrote:
>>
>> On Sat, Sep 5, 2020 at 8:55 PM Dilip Kumar <[hidden email]> wrote:
>> >
>> >
>> > I have reviewed the changes and looks fine to me.
>> >
>>
>> Thanks, I have pushed the last patch. Let's wait for a day or so to
>> see the buildfarm reports and then we can probably close this CF
>> entry.
>
>
> Thanks.
>

I have now updated the status of the CF entry to committed.

--
With Regards,
Amit Kapila.



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Tomas Vondra-4
Hi,

while looking at the streaming code I noticed two minor issues:

1) logicalrep_read_stream_stop is never defined/called, so the prototype
in logicalproto.h is unnecessary

2) minor typo in one of the comments

Patch attached.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment: streaming-fixes.patch (1K)

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Dilip Kumar-2
On Wed, Sep 9, 2020 at 2:13 PM Tomas Vondra <[hidden email]> wrote:
> Hi,
>
> while looking at the streaming code I noticed two minor issues:
>
> 1) logicalrep_read_stream_stop is never defined/called, so the prototype
> in logicalproto.h is unnecessary

Yeah, right.

> 2) minor typo in one of the comments
>
> Patch attached.

Looks good to me.
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Wed, Sep 9, 2020 at 2:13 PM Tomas Vondra
<[hidden email]> wrote:

>
> Hi,
>
> while looking at the streaming code I noticed two minor issues:
>
> 1) logicalrep_read_stream_stop is never defined/called, so the prototype
> in logicalproto.h is unnecessary
>
> 2) minor typo in one of the comments
>
> Patch attached.
>

LGTM.

--
With Regards,
Amit Kapila.



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Wed, Sep 9, 2020 at 2:26 PM Amit Kapila <[hidden email]> wrote:

>
> On Wed, Sep 9, 2020 at 2:13 PM Tomas Vondra
> <[hidden email]> wrote:
> >
> > Hi,
> >
> > while looking at the streaming code I noticed two minor issues:
> >
> > 1) logicalrep_read_stream_stop is never defined/called, so the prototype
> > in logicalproto.h is unnecessary
> >
> > 2) minor typo in one of the comments
> >
> > Patch attached.
> >
>
> LGTM.
>

Pushed.

--
With Regards,
Amit Kapila.



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Tom Lane-2
Amit Kapila <[hidden email]> writes:
> Pushed.

Observe the following reports:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04

These are all on HEAD, and all within the last ten days, and I see
nothing comparable in any branch before that.  So it's hard to avoid
the conclusion that somebody broke something about ten days ago.

None of these animals provided gdb backtraces; but we do have a built-in
trace from several, and they all look like pgoutput.so is trying to
list_free() garbage, somewhere inside a relcache invalidation/rebuild
scenario:

TRAP: FailedAssertion("list->length > 0", File: "/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/backend/nodes/list.c", Line: 68)
postgres: publisher: walsender bf [local] idle(ExceptionalCondition+0x57)[0x9081f7]
postgres: publisher: walsender bf [local] idle[0x6bcc70]
postgres: publisher: walsender bf [local] idle(list_free+0x11)[0x6bdc01]
/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/tmp_install/home/bf/build/buildfarm-idiacanthus/HEAD/inst/lib/postgresql/pgoutput.so(+0x35d8)[0x7fa4c5a6f5d8]
postgres: publisher: walsender bf [local] idle(LocalExecuteInvalidationMessage+0x15b)[0x8f0cdb]
postgres: publisher: walsender bf [local] idle(ReceiveSharedInvalidMessages+0x4b)[0x7bca0b]
postgres: publisher: walsender bf [local] idle(LockRelationOid+0x56)[0x7c19e6]
postgres: publisher: walsender bf [local] idle(relation_open+0x1c)[0x4a2d0c]
postgres: publisher: walsender bf [local] idle(table_open+0x6)[0x524486]
postgres: publisher: walsender bf [local] idle[0x9017f2]
postgres: publisher: walsender bf [local] idle[0x8fabd4]
postgres: publisher: walsender bf [local] idle[0x8fa58a]
postgres: publisher: walsender bf [local] idle(RelationCacheInvalidateEntry+0xaf)[0x8fbdbf]
postgres: publisher: walsender bf [local] idle(LocalExecuteInvalidationMessage+0xec)[0x8f0c6c]
postgres: publisher: walsender bf [local] idle(ReceiveSharedInvalidMessages+0xcb)[0x7bca8b]
postgres: publisher: walsender bf [local] idle(LockRelationOid+0x56)[0x7c19e6]
postgres: publisher: walsender bf [local] idle(relation_open+0x1c)[0x4a2d0c]
postgres: publisher: walsender bf [local] idle(table_open+0x6)[0x524486]
postgres: publisher: walsender bf [local] idle[0x8ee8b0]

010_truncate.pl itself hasn't changed meaningfully in a good long time.
However, I see that 464824323 added a whole boatload of code to
pgoutput.c, and the timing is right for that commit to be the culprit,
so that's what I'm betting on.

Probably this requires a relcache inval at the wrong time;
although we have recent passes from CLOBBER_CACHE_ALWAYS animals,
so that can't be the whole triggering condition.  I wonder whether
it is relevant that all of the complaining animals are JIT-enabled.

                        regards, tom lane



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Tom Lane-2
I wrote:
> Probably this requires a relcache inval at the wrong time;
> although we have recent passes from CLOBBER_CACHE_ALWAYS animals,
> so that can't be the whole triggering condition.  I wonder whether
> it is relevant that all of the complaining animals are JIT-enabled.

Hmmm ... I take that back.  hyrax has indeed passed since this went
in, but *it doesn't run any TAP tests*.  So the buildfarm offers no
information about whether the replication tests work under
CLOBBER_CACHE_ALWAYS.

Realizing that, I built an installation that way and tried to run
the subscription tests.  Results so far:

* Running 010_truncate.pl by itself passed for me.  So there's still
some unexplained factor needed to trigger the buildfarm failures.
(I'm wondering about concurrent autovacuum activity now...)

* Starting over, it appears that 001_rep_changes.pl almost immediately
gets into an infinite loop.  It does not complete the third test step,
rather infinitely waiting for progress to be made.  The publisher log
shows a repeating loop like

2020-09-13 21:16:05.734 EDT [928529] tap_sub LOG:  could not send data to client: Broken pipe
2020-09-13 21:16:05.734 EDT [928529] tap_sub CONTEXT:  slot "tap_sub", output plugin "pgoutput", in the commit callback, associated LSN 0/1660628
2020-09-13 21:16:05.843 EDT [928581] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:05.861 EDT [928582] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:16:05.929 EDT [928582] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  starting logical decoding for slot "tap_sub"
2020-09-13 21:16:05.930 EDT [928582] tap_sub DETAIL:  Streaming transactions committing after 0/1652820, reading WAL from 0/1651B20.
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  logical decoding found consistent point at 0/1651B20
2020-09-13 21:16:05.930 EDT [928582] tap_sub DETAIL:  There are no running transactions.
2020-09-13 21:16:21.560 EDT [928600] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:37.291 EDT [928610] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:52.959 EDT [928627] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:17:06.866 EDT [928636] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:06.934 EDT [928636] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:06.934 EDT [928636] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:06.934 EDT [928636] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:07.811 EDT [928638] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:07.880 EDT [928638] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:07.881 EDT [928638] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:07.881 EDT [928638] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:08.618 EDT [928641] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:17:08.753 EDT [928642] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:08.821 EDT [928642] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:08.821 EDT [928642] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:08.821 EDT [928642] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:09.689 EDT [928645] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:09.756 EDT [928645] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:09.757 EDT [928645] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:09.757 EDT [928645] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:09.841 EDT [928582] tap_sub LOG:  could not send data to client: Broken pipe
2020-09-13 21:17:09.841 EDT [928582] tap_sub CONTEXT:  slot "tap_sub", output plugin "pgoutput", in the commit callback, associated LSN 0/1660628

while the subscriber is repeating

2020-09-13 21:15:01.598 EDT [928528] LOG:  logical replication apply worker for subscription "tap_sub" has started
2020-09-13 21:16:02.178 EDT [928528] ERROR:  terminating logical replication worker due to timeout
2020-09-13 21:16:02.179 EDT [920797] LOG:  background worker "logical replication worker" (PID 928528) exited with exit code 1
2020-09-13 21:16:02.606 EDT [928571] LOG:  logical replication apply worker for subscription "tap_sub" has started
2020-09-13 21:16:03.117 EDT [928571] ERROR:  could not start WAL streaming: ERROR:  replication slot "tap_sub" is active for PID 928529
2020-09-13 21:16:03.118 EDT [920797] LOG:  background worker "logical replication worker" (PID 928571) exited with exit code 1
2020-09-13 21:16:03.544 EDT [928574] LOG:  logical replication apply worker for subscription "tap_sub" has started
2020-09-13 21:16:04.053 EDT [928574] ERROR:  could not start WAL streaming: ERROR:  replication slot "tap_sub" is active for PID 928529
2020-09-13 21:16:04.054 EDT [920797] LOG:  background worker "logical replication worker" (PID 928574) exited with exit code 1
2020-09-13 21:16:04.479 EDT [928576] LOG:  logical replication apply worker for subscription "tap_sub" has started
2020-09-13 21:16:04.990 EDT [928576] ERROR:  could not start WAL streaming: ERROR:  replication slot "tap_sub" is active for PID 928529
2020-09-13 21:16:04.990 EDT [920797] LOG:  background worker "logical replication worker" (PID 928576) exited with exit code 1
2020-09-13 21:16:05.415 EDT [928579] LOG:  logical replication apply worker for subscription "tap_sub" has started
2020-09-13 21:17:05.994 EDT [928579] ERROR:  terminating logical replication worker due to timeout

I'm out of patience to investigate this for tonight, but there is
something extremely broken here; maybe more than one something.

                        regards, tom lane



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Mon, Sep 14, 2020 at 3:08 AM Tom Lane <[hidden email]> wrote:
I'll analyze these reports.

--
With Regards,
Amit Kapila.



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Tom Lane-2
I wrote:
> * Starting over, it appears that 001_rep_changes.pl almost immediately
> gets into an infinite loop.  It does not complete the third test step,
> rather infinitely waiting for progress to be made.

Ah, looking closer, the problem is that wal_receiver_timeout = 60s
is too short when the sender is using CCA.  It times out before we
can get through the needed data transmission.

                        regards, tom lane



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Mon, Sep 14, 2020 at 3:08 AM Tom Lane <[hidden email]> wrote:

>
> Amit Kapila <[hidden email]> writes:
> > Pushed.
>
> Observe the following reports:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04
>
> These are all on HEAD, and all within the last ten days, and I see
> nothing comparable in any branch before that.  So it's hard to avoid
> the conclusion that somebody broke something about ten days ago.
>
> None of these animals provided gdb backtraces; but we do have a built-in
> trace from several, and they all look like pgoutput.so is trying to
> list_free() garbage, somewhere inside a relcache invalidation/rebuild
> scenario:
>

Yeah, this is right, and here is some initial analysis. It seems to be
failing in below code:
rel_sync_cache_relation_cb(){ ...list_free(entry->streamed_txns);..}

This list can have elements only in 'streaming' mode (need to enable
'streaming' with Create Subscription command) whereas none of the
tests in 010_truncate.pl is using 'streaming', so this list should be
empty (NULL). The two different assertion failures shown in BF reports
in list_free code are as below:
Assert(list->length > 0);
Assert(list->length <= list->max_length);

It seems to me that this list is not initialized properly when it is
not used or maybe that is true in some special circumstances because
we initialize it in get_rel_sync_entry(). I am not sure if CCA build
is impacting this in some way.

--
With Regards,
Amit Kapila.
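
For readers following the analysis above, a simplified sketch of the code path in question (field and function names follow the thread's discussion of pgoutput.c, but the body is abridged, assumes the usual backend headers and the RelationSyncCache hash table, and is not the actual committed source):

typedef struct RelationSyncEntry
{
    Oid         relid;          /* hash key */
    bool        replicate_valid;
    bool        schema_sent;
    List       *streamed_txns;  /* must be NIL before any callback can run */
    /* ... other fields ... */
} RelationSyncEntry;

static void
rel_sync_cache_relation_cb(Datum arg, Oid relid)
{
    RelationSyncEntry *entry;

    entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                              (void *) &relid,
                                              HASH_FIND, NULL);
    if (entry == NULL)
        return;

    entry->schema_sent = false;

    /*
     * If the entry was just created with HASH_ENTER and get_rel_sync_entry()
     * has not yet stored NIL here, this frees whatever bytes happen to be in
     * the fresh hash entry, and list_free()'s sanity checks
     * (Assert(list->length > 0), Assert(list->length <= list->max_length))
     * trip on the garbage header.
     */
    list_free(entry->streamed_txns);
    entry->streamed_txns = NIL;
}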



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Dilip Kumar-2
On Mon, Sep 14, 2020 at 8:48 AM Amit Kapila <[hidden email]> wrote:

>
> On Mon, Sep 14, 2020 at 3:08 AM Tom Lane <[hidden email]> wrote:
> >
> > Amit Kapila <[hidden email]> writes:
> > > Pushed.
> >
> > Observe the following reports:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04
> >
> > These are all on HEAD, and all within the last ten days, and I see
> > nothing comparable in any branch before that.  So it's hard to avoid
> > the conclusion that somebody broke something about ten days ago.
> >
> > None of these animals provided gdb backtraces; but we do have a built-in
> > trace from several, and they all look like pgoutput.so is trying to
> > list_free() garbage, somewhere inside a relcache invalidation/rebuild
> > scenario:
> >
>
> Yeah, this is right, and here is some initial analysis. It seems to be
> failing in below code:
> rel_sync_cache_relation_cb(){ ...list_free(entry->streamed_txns);..}
>
> This list can have elements only in 'streaming' mode (need to enable
> 'streaming' with Create Subscription command) whereas none of the
> tests in 010_truncate.pl is using 'streaming', so this list should be
> empty (NULL). The two different assertion failures shown in BF reports
> in list_free code are as below:
> Assert(list->length > 0);
> Assert(list->length <= list->max_length);
>
> It seems to me that this list is not initialized properly when it is
> not used or maybe that is true in some special circumstances because
> we initialize it in get_rel_sync_entry(). I am not sure if CCA build
> is impacting this in some way.


I have also analyzed this but did not find any reason why the
streamed_txns list should be anything other than NULL.  The only thing
is that we are initializing entry->streamed_txns to NULL while
list_free() checks "if (list == NIL)" and then returns; IMHO, that
should not be an issue because NIL is defined as (List *) NULL.  I am
doing further testing and investigation.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
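
To illustrate the point about NIL, a minimal sketch (relying only on pg_list.h; the appended value is an arbitrary example): list_free() on a field that really is NIL returns immediately, so an initialized-but-empty list cannot explain the crash.

#include "postgres.h"
#include "nodes/pg_list.h"

static void
nil_list_demo(void)
{
    List       *streamed_txns = NIL;    /* NIL is ((List *) NULL) in pg_list.h */

    list_free(streamed_txns);           /* no-op: list == NIL */

    streamed_txns = lappend_oid(streamed_txns, (Oid) 1234);    /* arbitrary element */
    list_free(streamed_txns);           /* frees the one-cell list */
}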



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Mon, Sep 14, 2020 at 1:23 PM Dilip Kumar <[hidden email]> wrote:

>
> On Mon, Sep 14, 2020 at 8:48 AM Amit Kapila <[hidden email]> wrote:
> >
> >
> > Yeah, this is right, and here is some initial analysis. It seems to be
> > failing in below code:
> > rel_sync_cache_relation_cb(){ ...list_free(entry->streamed_txns);..}
> >
> > This list can have elements only in 'streaming' mode (need to enable
> > 'streaming' with Create Subscription command) whereas none of the
> > tests in 010_truncate.pl is using 'streaming', so this list should be
> > empty (NULL). The two different assertion failures shown in BF reports
> > in list_free code are as below:
> > Assert(list->length > 0);
> > Assert(list->length <= list->max_length);
> >
> > It seems to me that this list is not initialized properly when it is
> > not used or maybe that is true in some special circumstances because
> > we initialize it in get_rel_sync_entry(). I am not sure if CCA build
> > is impacting this in some way.
>
>
> I have also analyzed this but did not find any reason why the
> streamed_txns list should be anything other than NULL.  The only thing
> is that we are initializing entry->streamed_txns to NULL while
> list_free() checks "if (list == NIL)" and then returns; IMHO, that
> should not be an issue because NIL is defined as (List *) NULL.
>
Yeah, that is not the issue, but it is better to initialize it with NIL
for the sake of consistency. The basic issue here was that we were trying
to open/lock the relation(s) before initializing this list. Now, when
we process the invalidations during relation open, we try to access
this list in rel_sync_cache_relation_cb() and that leads to the assertion
failure. I have reproduced the exact scenario of 010_truncate.pl via
the debugger. Basically, the backend on the publisher sends the
invalidation after truncating the relation 'tab1', and if the WALSender
receives that message exactly after creating the RelSyncEntry for 'tab1'
while processing the truncate, the Assertion shown in the BF reports can
be reproduced.

The attached patch will fix the issue. What do you think?

--
With Regards,
Amit Kapila.

Attachment: v1-0001-Fix-initialization-of-RelationSyncEntry-for-strea.patch (3K)
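
To make the window concrete, here is roughly the shape of the pre-fix code path being described (heavily abridged and simplified; names follow the discussion above, not the exact pgoutput.c source):

entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                          (void *) &relid,
                                          HASH_ENTER, &found);

if (!found || !entry->replicate_valid)
{
    List       *pubids = GetRelationPublications(relid);

    /*
     * Building the entry opens/locks catalog relations, which processes
     * pending shared invalidations.  If a backend has just truncated
     * 'tab1', rel_sync_cache_relation_cb() runs here and reaches
     * list_free() on the still-uninitialized entry->streamed_txns ...
     */
    ...

    if (!found)
        entry->streamed_txns = NIL;     /* ... before this assignment runs */
}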

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Dilip Kumar-2
On Mon, Sep 14, 2020 at 4:50 PM Amit Kapila <[hidden email]> wrote:

>
> On Mon, Sep 14, 2020 at 1:23 PM Dilip Kumar <[hidden email]> wrote:
> >
> > On Mon, Sep 14, 2020 at 8:48 AM Amit Kapila <[hidden email]> wrote:
> > >
> > >
> > > Yeah, this is right, and here is some initial analysis. It seems to be
> > > failing in below code:
> > > rel_sync_cache_relation_cb(){ ...list_free(entry->streamed_txns);..}
> > >
> > > This list can have elements only in 'streaming' mode (need to enable
> > > 'streaming' with Create Subscription command) whereas none of the
> > > tests in 010_truncate.pl is using 'streaming', so this list should be
> > > empty (NULL). The two different assertion failures shown in BF reports
> > > in list_free code are as below:
> > > Assert(list->length > 0);
> > > Assert(list->length <= list->max_length);
> > >
> > > It seems to me that this list is not initialized properly when it is
> > > not used or maybe that is true in some special circumstances because
> > > we initialize it in get_rel_sync_entry(). I am not sure if CCA build
> > > is impacting this in some way.
> >
> >
> > I have also analyzed this but did not find any reason why the
> > streamed_txns list should be anything other than NULL.  The only thing
> > is that we are initializing entry->streamed_txns to NULL while
> > list_free() checks "if (list == NIL)" and then returns; IMHO, that
> > should not be an issue because NIL is defined as (List *) NULL.
> >
>
> Yeah, that is not the issue, but it is better to initialize it with NIL
> for the sake of consistency. The basic issue here was that we were trying
> to open/lock the relation(s) before initializing this list. Now, when
> we process the invalidations during relation open, we try to access
> this list in rel_sync_cache_relation_cb() and that leads to the assertion
> failure. I have reproduced the exact scenario of 010_truncate.pl via
> the debugger. Basically, the backend on the publisher sends the
> invalidation after truncating the relation 'tab1', and if the WALSender
> receives that message exactly after creating the RelSyncEntry for 'tab1'
> while processing the truncate, the Assertion shown in the BF reports can
> be reproduced.

Yeah, this is an issue and I am also able to reproduce it manually
using gdb.  Basically, I inserted some data into the publication table
and after that stopped in get_rel_sync_entry after creating the
entry and before calling GetRelationPublications.  Meanwhile, I
truncated this table and then hit the same issue you pointed out here.

> The attached patch will fix the issue. What do you think?

The patch looks good to me and fixes the reported issue.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Tom Lane-2
Amit Kapila <[hidden email]> writes:
> The attached patch will fix the issue. What do you think?

I think it'd be cleaner to separate the initialization of a new entry from
validation altogether, along the lines of

    /* Find cached function info, creating if not found */
    oldctx = MemoryContextSwitchTo(CacheMemoryContext);
    entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                              (void *) &relid,
                                              HASH_ENTER, &found);
    MemoryContextSwitchTo(oldctx);
    Assert(entry != NULL);

    if (!found)
    {
        /* immediately make a new entry valid enough to satisfy callbacks */
        entry->schema_sent = false;
        entry->streamed_txns = NIL;
        entry->replicate_valid = false;
        /* are there any other fields we should clear here for safety??? */
    }

    /* Fill it in if not valid */
    if (!entry->replicate_valid)
    {
        List       *pubids = GetRelationPublications(relid);
        ...

BTW, unless someone has changed the behavior of dynahash when I
wasn't looking, those MemoryContextSwitchTos shown above are useless.
Also, why does the comment refer to a "function" entry?

                        regards, tom lane



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Mon, Sep 14, 2020 at 9:42 PM Tom Lane <[hidden email]> wrote:

>
> Amit Kapila <[hidden email]> writes:
> > The attached patch will fix the issue. What do you think?
>
> I think it'd be cleaner to separate the initialization of a new entry from
> validation altogether, along the lines of
>
>     /* Find cached function info, creating if not found */
>     oldctx = MemoryContextSwitchTo(CacheMemoryContext);
>     entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
>                                               (void *) &relid,
>                                               HASH_ENTER, &found);
>     MemoryContextSwitchTo(oldctx);
>     Assert(entry != NULL);
>
>     if (!found)
>     {
>         /* immediately make a new entry valid enough to satisfy callbacks */
>         entry->schema_sent = false;
>         entry->streamed_txns = NIL;
>         entry->replicate_valid = false;
>         /* are there any other fields we should clear here for safety??? */
>     }
>

If we want to separate validation, then we need to initialize other
fields like 'pubactions' and 'publish_as_relid' as well. I think it
will be better to arrange it the way you are suggesting, so I will
change it along with the other fields that require initialization.

>     /* Fill it in if not valid */
>     if (!entry->replicate_valid)
>     {
>         List       *pubids = GetRelationPublications(relid);
>         ...
>
> BTW, unless someone has changed the behavior of dynahash when I
> wasn't looking, those MemoryContextSwitchTos shown above are useless.
>

As far as I can see they are useless in this case but I think they
might be required in case the user provides its own allocator function
(using HASH_ALLOC). So, we can probably remove those from here?

> Also, why does the comment refer to a "function" entry?
>

It should be "relation" instead. I'll take care of changing this as well.

--
With Regards,
Amit Kapila.
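
Putting the pieces together, the initialization-on-creation would presumably end up looking something like this (a sketch assembled from the field names mentioned in this thread and the PublicationActions members in pg_publication.h; the committed code may differ in detail):

entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
                                          (void *) &relid,
                                          HASH_ENTER, &found);
Assert(entry != NULL);

if (!found)
{
    /* immediately make a new entry valid enough to satisfy callbacks */
    entry->replicate_valid = false;
    entry->schema_sent = false;
    entry->streamed_txns = NIL;
    entry->pubactions.pubinsert = false;
    entry->pubactions.pubupdate = false;
    entry->pubactions.pubdelete = false;
    entry->pubactions.pubtruncate = false;
    entry->publish_as_relid = InvalidOid;
}

/* Fill it in (build the publication info) only if not valid */
if (!entry->replicate_valid)
{
    List       *pubids = GetRelationPublications(relid);
    ...
}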



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

Tom Lane-2
Amit Kapila <[hidden email]> writes:
> On Mon, Sep 14, 2020 at 9:42 PM Tom Lane <[hidden email]> wrote:
>> BTW, unless someone has changed the behavior of dynahash when I
>> wasn't looking, those MemoryContextSwitchTos shown above are useless.

> As far as I can see they are useless in this case but I think they
> might be required in case the user provides its own allocator function
> (using HASH_ALLOC). So, we can probably remove those from here?

You could imagine writing a HASH_ALLOC allocator whose behavior
varies depending on CurrentMemoryContext, but it seems like a
pretty foolish/fragile way to do it.  In any case I can think of,
the hash table lives in one specific context and you really
really do not want parts of it spread across other contexts.
dynahash.c is not going to look kindly on pieces of what it
is managing disappearing from under it.

(To be clear, objects that the hash entries contain pointers to
are a different question.  But the hash entries themselves have
to have exactly the same lifespan as the hash table.)

                        regards, tom lane
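
For reference, the usual dynahash way to pin a table (and its entries) to one context is HASH_CONTEXT at creation time, rather than switching CurrentMemoryContext around hash_search(). A sketch, assuming one wanted the cache in CacheMemoryContext (not necessarily what pgoutput.c actually does):

HASHCTL     ctl;

MemSet(&ctl, 0, sizeof(ctl));
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(RelationSyncEntry);
ctl.hcxt = CacheMemoryContext;

RelationSyncCache = hash_create("logical replication output relation cache",
                                128, &ctl,
                                HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);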



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Tue, Sep 15, 2020 at 8:38 AM Tom Lane <[hidden email]> wrote:

>
> Amit Kapila <[hidden email]> writes:
> > On Mon, Sep 14, 2020 at 9:42 PM Tom Lane <[hidden email]> wrote:
> >> BTW, unless someone has changed the behavior of dynahash when I
> >> wasn't looking, those MemoryContextSwitchTos shown above are useless.
>
> > As far as I can see they are useless in this case but I think they
> > might be required in case the user provides its own allocator function
> > (using HASH_ALLOC). So, we can probably remove those from here?
>
> You could imagine writing a HASH_ALLOC allocator whose behavior
> varies depending on CurrentMemoryContext, but it seems like a
> pretty foolish/fragile way to do it.  In any case I can think of,
> the hash table lives in one specific context and you really
> really do not want parts of it spread across other contexts.
> dynahash.c is not going to look kindly on pieces of what it
> is managing disappearing from under it.
>
I agree that doesn't make sense. I have fixed all the comments
discussed in the attached patch.

--
With Regards,
Amit Kapila.

Attachment: v2-0001-Fix-initialization-of-RelationSyncEntry-for-strea.patch (4K)

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

akapila
On Tue, Sep 15, 2020 at 10:17 AM Amit Kapila <[hidden email]> wrote:

>
> On Tue, Sep 15, 2020 at 8:38 AM Tom Lane <[hidden email]> wrote:
> > > As far as I can see they are useless in this case but I think they
> > > might be required in case the user provides its own allocator function
> > > (using HASH_ALLOC). So, we can probably remove those from here?
> >
> > You could imagine writing a HASH_ALLOC allocator whose behavior
> > varies depending on CurrentMemoryContext, but it seems like a
> > pretty foolish/fragile way to do it.  In any case I can think of,
> > the hash table lives in one specific context and you really
> > really do not want parts of it spread across other contexts.
> > dynahash.c is not going to look kindly on pieces of what it
> > is managing disappearing from under it.
> >
>
> I agree that doesn't make sense. I have fixed all the comments
> discussed in the attached patch.
>

Pushed.

--
With Regards,
Amit Kapila.

