Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Bharath Rupireddy
Hi,

While providing thoughts on the design in [1], I found a strange
behaviour with the $subject. The use case is shown below as a sequence
of steps to be run on the publisher and subscriber to arrive at
the strange behaviour.  In step 5, the table is dropped from the
publication, and in step 6, refresh publication is run on the
subscriber. From here onwards, the expectation is that no further
inserts into the publisher table are replicated to the
subscriber, but the opposite happens, i.e. the inserts are still
replicated to the subscriber. ISTM this is a bug. Let me know if I'm
missing anything.

Thoughts?

step 1) on the publisher:
DROP TABLE t1;
DROP PUBLICATION mypub1;
CREATE TABLE t1 (a int);
INSERT INTO t1 VALUES (1);
CREATE PUBLICATION mypub1 FOR TABLE t1;
postgres=# SELECT r1.*, r2.relname, r3.* FROM pg_publication_rel r1,
pg_class r2, pg_publication r3 WHERE r1.prrelid = r2.oid AND
r1.prpubid = r3.oid;
  oid  | prpubid | prrelid | relname |  oid  | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot
-------+---------+---------+---------+-------+---------+----------+--------------+-----------+-----------+-----------+-------------+------------
 16462 |   16461 |   16458 | t1      | 16461 | mypub1  |       10 | f            | t         | t         | t         | t           | f
(1 row)

step 2) on the subscriber:
DROP TABLE t1;
DROP SUBSCRIPTION mysub1;
CREATE TABLE t1 (a int);
CREATE SUBSCRIPTION mysub1 CONNECTION 'host=localhost dbname=postgres
user=bharath port=5432' PUBLICATION mypub1;
postgres=# SELECT r1.*, r2.relname, r3.* FROM pg_subscription_rel r1,
pg_class r2, pg_subscription r3 WHERE r1.srrelid = r2.oid AND
r1.srsubid = r3.oid;
 srsubid | srrelid | srsubstate | srsublsn | relname |  oid  | subdbid | subname | subowner | subenabled | subbinary | substream |                      subconninfo                      | subslotname | subsynccommit | subpublications
---------+---------+------------+----------+---------+-------+---------+---------+----------+------------+-----------+-----------+-------------------------------------------------------+-------------+---------------+-----------------
   16446 |   16443 | i          |          | t1      | 16446 |   12872 | mysub1  |       10 | t          | f         | f         | host=localhost dbname=postgres user=bharath port=5432 | mysub1      | off           | {mypub1}
(1 row)
postgres=# SELECT * FROM t1;
 a
---
 1
(1 row)

step 3) on the publisher:
INSERT INTO t1 VALUES (2);

step 4) on the subscriber:
postgres=# SELECT * FROM t1;
 a
---
 1
 2
(2 rows)

step 5) on the publisher:
ALTER PUBLICATION mypub1 DROP TABLE t1;
postgres=# SELECT r1.*, r2.relname, r3.* FROM pg_publication_rel r1,
pg_class r2, pg_publication r3 WHERE r1.prrelid = r2.oid AND
r1.prpubid = r3.oid;
 oid | prpubid | prrelid | relname | oid | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot
-----+---------+---------+---------+-----+---------+----------+--------------+-----------+-----------+-----------+-------------+------------
(0 rows)
INSERT INTO t1 VALUES (3);

step 6) on the subscriber:
postgres=# SELECT * FROM t1;
 a
---
 1
 2
 3
(3 rows)
ALTER SUBSCRIPTION mysub1 REFRESH PUBLICATION;
postgres=# SELECT r1.*, r2.relname, r3.* FROM pg_subscription_rel r1,
pg_class r2, pg_subscription r3 WHERE r1.srrelid = r2.oid AND
r1.srsubid = r3.oid;
 srsubid | srrelid | srsubstate | srsublsn | relname | oid | subdbid | subname | subowner | subenabled | subbinary | substream | subconninfo | subslotname | subsynccommit | subpublications
---------+---------+------------+----------+---------+-----+---------+---------+----------+------------+-----------+-----------+-------------+-------------+---------------+-----------------
(0 rows)

step 7) on the publisher:
INSERT INTO t1 VALUES (4);

step 8) on the subscriber:
postgres=# SELECT * FROM t1;
 a
---
 1
 2
 3
 4
(4 rows)

step 9) on the publisher:
INSERT INTO t1 SELECT * FROM generate_series(5,100);

step 10) on the subscriber:
postgres=# SELECT count(*) FROM t1;
 count
-------
   100
(1 row)

[1] - https://www.postgresql.org/message-id/CAA4eK1L5TejNHNctyPB3GVuEriRQw6xxU32iMyv%3Dh4tCJKkLew%40mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

akapila
On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
<[hidden email]> wrote:

>
> Hi,
>
> While providing thoughts on the design in [1], I found a strange
> behaviour with the $subject. The use case is shown below as a sequence
> of steps that need to be run on publisher and subscriber to arrive at
> the strange behaviour.  In step 5, the table is dropped from the
> publication and in step 6, the refresh publication is run on the
> subscriber, from here onwards, the expectation is that no further
> inserts into the publisher table have to be replicated on to the
> subscriber, but the opposite happens i.e. the inserts are still
> replicated to the subscriber. ISTM as a bug. Let me know if I'm
> missing anything.
>

Did you try to investigate what's going on? Can you please check what
is the behavior if, after step-5, you restart the subscriber and
separately try creating a new subscription (maybe on a different
server) for that publication after step-5 and see if that allows the
relation to be replicated? AFAIU, in AlterSubscription_refresh, we
remove such dropped rels and stop their corresponding apply workers
which should stop the further replication of such relations but that
doesn't seem to be happening in your case.

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin

On Tue, 12 Jan 2021 at 11:37, Amit Kapila wrote:

> On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
> <[hidden email]> wrote:
>>
>> Hi,
>>
>> While providing thoughts on the design in [1], I found a strange
>> behaviour with the $subject. The use case is shown below as a sequence
>> of steps that need to be run on publisher and subscriber to arrive at
>> the strange behaviour.  In step 5, the table is dropped from the
>> publication and in step 6, the refresh publication is run on the
>> subscriber, from here onwards, the expectation is that no further
>> inserts into the publisher table have to be replicated on to the
>> subscriber, but the opposite happens i.e. the inserts are still
>> replicated to the subscriber. ISTM as a bug. Let me know if I'm
>> missing anything.
>>
>
> Did you try to investigate what's going on? Can you please check what
> is the behavior if, after step-5, you restart the subscriber and
> separately try creating a new subscription (maybe on a different
> server) for that publication after step-5 and see if that allows the
> relation to be replicated? AFAIU, in AlterSubscription_refresh, we
> remove such dropped rels and stop their corresponding apply workers
> which should stop the further replication of such relations but that
> doesn't seem to be happening in your case.

If we restart the subscriber after step-5, it will not replicate the records.

As I said in [1], if we don't insert new data in step-5, it will not
replicate the records.

In both cases, AlterSubscription_refresh() calls RemoveSubscriptionRel()
and logicalrep_worker_stop_at_commit().  However, if we insert data in
step-5, it doesn't work as expected.  Any thoughts?

[1] https://www.postgresql.org/message-id/A7A618FB-F87C-439C-90A3-93CF9E7341FF@...

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

akapila
On Tue, Jan 12, 2021 at 9:58 AM japin <[hidden email]> wrote:

>
>
> On Tue, 12 Jan 2021 at 11:37, Amit Kapila wrote:
> > On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
> > <[hidden email]> wrote:
> >>
> >> Hi,
> >>
> >> While providing thoughts on the design in [1], I found a strange
> >> behaviour with the $subject. The use case is shown below as a sequence
> >> of steps that need to be run on publisher and subscriber to arrive at
> >> the strange behaviour.  In step 5, the table is dropped from the
> >> publication and in step 6, the refresh publication is run on the
> >> subscriber, from here onwards, the expectation is that no further
> >> inserts into the publisher table have to be replicated on to the
> >> subscriber, but the opposite happens i.e. the inserts are still
> >> replicated to the subscriber. ISTM as a bug. Let me know if I'm
> >> missing anything.
> >>
> >
> > Did you try to investigate what's going on? Can you please check what
> > is the behavior if, after step-5, you restart the subscriber and
> > separately try creating a new subscription (maybe on a different
> > server) for that publication after step-5 and see if that allows the
> > relation to be replicated? AFAIU, in AlterSubscription_refresh, we
> > remove such dropped rels and stop their corresponding apply workers
> > which should stop the further replication of such relations but that
> > doesn't seem to be happening in your case.
>
> If we restart the subscriber after step-5, it will not replicate the records.
>
> As I said in [1], if we don't insert a new data in step-5, it will not
> replicate the records.
>

Hmm, but in Bharath's test, it is replicating the Inserts in step-7
and step-9 as well. Are you seeing something different?

> In both cases, the AlterSubscription_refresh() call RemoveSubscriptionRel()
> and logicalrep_worker_stop_at_commit().  However, if we insert a data in
> step-5, it doesn't work as expected.  Any thoughts?
>

I think the data inserted in step-5 might be visible because we have
stopped the apply process after that, but it is not clear why the data
inserted in steps 7 and 9 is getting replicated. I think that, for some
reason, the apply worker is not getting stopped even after the Refresh
Publication statement has finished, so the data keeps being
replicated after that as well. After a restart, the apply worker won't
be restarted, so the data replication doesn't happen.

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin

On Tue, 12 Jan 2021 at 13:39, Amit Kapila wrote:

> On Tue, Jan 12, 2021 at 9:58 AM japin <[hidden email]> wrote:
>>
>>
>> On Tue, 12 Jan 2021 at 11:37, Amit Kapila wrote:
>> > On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
>> > <[hidden email]> wrote:
>> >>
>> >> Hi,
>> >>
>> >> While providing thoughts on the design in [1], I found a strange
>> >> behaviour with the $subject. The use case is shown below as a sequence
>> >> of steps that need to be run on publisher and subscriber to arrive at
>> >> the strange behaviour.  In step 5, the table is dropped from the
>> >> publication and in step 6, the refresh publication is run on the
>> >> subscriber, from here onwards, the expectation is that no further
>> >> inserts into the publisher table have to be replicated on to the
>> >> subscriber, but the opposite happens i.e. the inserts are still
>> >> replicated to the subscriber. ISTM as a bug. Let me know if I'm
>> >> missing anything.
>> >>
>> >
>> > Did you try to investigate what's going on? Can you please check what
>> > is the behavior if, after step-5, you restart the subscriber and
>> > separately try creating a new subscription (maybe on a different
>> > server) for that publication after step-5 and see if that allows the
>> > relation to be replicated? AFAIU, in AlterSubscription_refresh, we
>> > remove such dropped rels and stop their corresponding apply workers
>> > which should stop the further replication of such relations but that
>> > doesn't seem to be happening in your case.
>>
>> If we restart the subscriber after step-5, it will not replicate the records.
>>
>> As I said in [1], if we don't insert a new data in step-5, it will not
>> replicate the records.
>>
>
> Hmm, but in Bharath's test, it is replicating the Inserts in step-7
> and step-9 as well. Are you seeing something different?
>

Yes; however, if we don't do the Inserts in step-5, the Inserts in step-7 and
step-9 will not be replicated.


--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Bharath Rupireddy
In reply to this post by akapila
On Tue, Jan 12, 2021 at 9:05 AM Amit Kapila <[hidden email]> wrote:

>
> On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
> <[hidden email]> wrote:
> >
> > Hi,
> >
> > While providing thoughts on the design in [1], I found a strange
> > behaviour with the $subject. The use case is shown below as a sequence
> > of steps that need to be run on publisher and subscriber to arrive at
> > the strange behaviour.  In step 5, the table is dropped from the
> > publication and in step 6, the refresh publication is run on the
> > subscriber, from here onwards, the expectation is that no further
> > inserts into the publisher table have to be replicated on to the
> > subscriber, but the opposite happens i.e. the inserts are still
> > replicated to the subscriber. ISTM as a bug. Let me know if I'm
> > missing anything.
> >
>
> Did you try to investigate what's going on? Can you please check what
> is the behavior if, after step-5, you restart the subscriber and
> separately try creating a new subscription (maybe on a different
> server) for that publication after step-5 and see if that allows the
> relation to be replicated? AFAIU, in AlterSubscription_refresh, we
> remove such dropped rels and stop their corresponding apply workers
> which should stop the further replication of such relations but that
> doesn't seem to be happening in your case.

Here's my analysis:
1) On the publisher, alter publication drop table successfully
removes (PublicationDropTables) the table from the catalogue
pg_publication_rel.
2) On the subscriber, alter subscription refresh publication
successfully removes the table from the catalogue pg_subscription_rel
(AlterSubscription_refresh->RemoveSubscriptionRel).
So far so good.
3) After the insertion into the table on the publisher (remember that
it was dropped from the publication in (1)), the walsender process is
unable to detect that the table has been dropped from the publication,
i.e. it doesn't look at the pg_publication_rel catalogue or any
other; it only does the is_publishable_relation() check, which returns
true in pgoutput_change(). Maybe the walsender should look at the
catalogue pg_publication_rel in is_publishable_relation()?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

akapila
On Tue, Jan 12, 2021 at 11:39 AM Bharath Rupireddy
<[hidden email]> wrote:

>
> On Tue, Jan 12, 2021 at 9:05 AM Amit Kapila <[hidden email]> wrote:
> >
> > On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
> > <[hidden email]> wrote:
> > >
> > > Hi,
> > >
> > > While providing thoughts on the design in [1], I found a strange
> > > behaviour with the $subject. The use case is shown below as a sequence
> > > of steps that need to be run on publisher and subscriber to arrive at
> > > the strange behaviour.  In step 5, the table is dropped from the
> > > publication and in step 6, the refresh publication is run on the
> > > subscriber, from here onwards, the expectation is that no further
> > > inserts into the publisher table have to be replicated on to the
> > > subscriber, but the opposite happens i.e. the inserts are still
> > > replicated to the subscriber. ISTM as a bug. Let me know if I'm
> > > missing anything.
> > >
> >
> > Did you try to investigate what's going on? Can you please check what
> > is the behavior if, after step-5, you restart the subscriber and
> > separately try creating a new subscription (maybe on a different
> > server) for that publication after step-5 and see if that allows the
> > relation to be replicated? AFAIU, in AlterSubscription_refresh, we
> > remove such dropped rels and stop their corresponding apply workers
> > which should stop the further replication of such relations but that
> > doesn't seem to be happening in your case.
>
> Here's my analysis:
> 1) in the publisher, alter publication drop table successfully
> removes(PublicationDropTables) the table from the catalogue
> pg_publication_rel
> 2) in the subscriber, alter subscription refresh publication
> successfully removes the table from the catalogue pg_subscription_rel
> (AlterSubscription_refresh->RemoveSubscriptionRel)
> so far so good
>

Here, it should register the worker to stop on commit, and then on
commit it should call AtEOXact_ApplyLauncher to stop the apply worker.
Once the apply worker is stopped, the corresponding WALSender will
also be stopped. Something here is not happening as per expected
behavior.

> 3) after the insertion into the table in the publisher(remember that
> it's dropped from the publication in (1)), the walsender process is
> unable detect that the table has been dropped from the publication
> i.e. it doesn't look at the pg_publication_rel catalogue or some
> other, but it only does is_publishable_relation() check which returns
> true in pgoutput_change(). Maybe the walsender should look at the
> catalogue pg_publication_rel in is_publishable_relation()?
>

We must be checking pg_publication_rel somewhere before sending the
decoded change because otherwise, we would have sent the changes for
tables which are not even part of this publication. I think you can
try to create a separate table that is not part of the publication
under test and see how the changes for that are filtered.

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin

On Tue, 12 Jan 2021 at 14:38, Amit Kapila wrote:

> On Tue, Jan 12, 2021 at 11:39 AM Bharath Rupireddy
> <[hidden email]> wrote:
>>
>> On Tue, Jan 12, 2021 at 9:05 AM Amit Kapila <[hidden email]> wrote:
>> >
>> > On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
>> > <[hidden email]> wrote:
>> > >
>> > > Hi,
>> > >
>> > > While providing thoughts on the design in [1], I found a strange
>> > > behaviour with the $subject. The use case is shown below as a sequence
>> > > of steps that need to be run on publisher and subscriber to arrive at
>> > > the strange behaviour.  In step 5, the table is dropped from the
>> > > publication and in step 6, the refresh publication is run on the
>> > > subscriber, from here onwards, the expectation is that no further
>> > > inserts into the publisher table have to be replicated on to the
>> > > subscriber, but the opposite happens i.e. the inserts are still
>> > > replicated to the subscriber. ISTM as a bug. Let me know if I'm
>> > > missing anything.
>> > >
>> >
>> > Did you try to investigate what's going on? Can you please check what
>> > is the behavior if, after step-5, you restart the subscriber and
>> > separately try creating a new subscription (maybe on a different
>> > server) for that publication after step-5 and see if that allows the
>> > relation to be replicated? AFAIU, in AlterSubscription_refresh, we
>> > remove such dropped rels and stop their corresponding apply workers
>> > which should stop the further replication of such relations but that
>> > doesn't seem to be happening in your case.
>>
>> Here's my analysis:
>> 1) in the publisher, alter publication drop table successfully
>> removes(PublicationDropTables) the table from the catalogue
>> pg_publication_rel
>> 2) in the subscriber, alter subscription refresh publication
>> successfully removes the table from the catalogue pg_subscription_rel
>> (AlterSubscription_refresh->RemoveSubscriptionRel)
>> so far so good
>>
>
> Here, it should register the worker to stop on commit, and then on
> commit it should call AtEOXact_ApplyLauncher to stop the apply worker.
> Once the apply worker is stopped, the corresponding WALSender will
> also be stopped. Something here is not happening as per expected
> behavior.
>
>> 3) after the insertion into the table in the publisher(remember that
>> it's dropped from the publication in (1)), the walsender process is
>> unable detect that the table has been dropped from the publication
>> i.e. it doesn't look at the pg_publication_rel catalogue or some
>> other, but it only does is_publishable_relation() check which returns
>> true in pgoutput_change(). Maybe the walsender should look at the
>> catalogue pg_publication_rel in is_publishable_relation()?
>>
>
> We must be somewhere checking pg_publication_rel before sending the
> decoded change because otherwise, we would have sent the changes for
> the table which are not even part of this publication. I think you can
> try to create a separate table that is not part of the publication
> under test and see how the changes for that are filtered.

I find that pgoutput_change() uses a hash table, RelationSyncCache, to
cache the publication info for tables.  When we drop tables from the
publication, the RelationSyncCache isn't updated, so it keeps replicating
records.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin

On Jan 12, 2021, at 5:47 PM, japin <[hidden email]> wrote:


> I find that pgoutput_change() use a hash table RelationSyncCache to
> cache the publication info for tables.  When we drop tables from the
> publication, the RelationSyncCache doesn't updated, so it replicate
> records.


IIUC, logical replication only replicates the tables in the publication, so I think
tables that aren't in the publication should not be replicated.

Attached is a patch that fixes it.  Thoughts?
 
-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.





Attachment: alter-publication-drop-table.patch (2K)

Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Bharath Rupireddy
In reply to this post by akapila
On Tue, Jan 12, 2021 at 12:06 PM Amit Kapila <[hidden email]> wrote:

> > Here's my analysis:
> > 1) in the publisher, alter publication drop table successfully
> > removes(PublicationDropTables) the table from the catalogue
> > pg_publication_rel
> > 2) in the subscriber, alter subscription refresh publication
> > successfully removes the table from the catalogue pg_subscription_rel
> > (AlterSubscription_refresh->RemoveSubscriptionRel)
> > so far so good
> >
>
> Here, it should register the worker to stop on commit, and then on
> commit it should call AtEOXact_ApplyLauncher to stop the apply worker.
> Once the apply worker is stopped, the corresponding WALSender will
> also be stopped. Something here is not happening as per expected
> behavior.

On the subscriber, an entry for worker stop is created in AlterSubscription_refresh --> logicalrep_worker_stop_at_commit. At the end of the txn, in AtEOXact_ApplyLauncher, we try to stop that worker, but it cannot be stopped because logicalrep_worker_find returns null (AtEOXact_ApplyLauncher --> logicalrep_worker_stop --> logicalrep_worker_find). The worker entry for that subscription has relid 0 [1], so the if condition in the loop below is never hit, and the apply worker for the subscription on which refresh publication was run is not stopped. It looks like relid 0 is valid for the apply worker, because a non-zero relid is applicable only during the table sync phase; the comment in the LogicalRepWorker structure says that.

Also, I think expecting the apply worker to be stopped this way doesn't make sense, because there is one apply worker per subscription, and the subscription can have other tables too.

    /* Search for attached worker for a given subscription id. */
    for (i = 0; i < max_logical_replication_workers; i++)
    {
        LogicalRepWorker *w = &LogicalRepCtx->workers[i];

        if (w->in_use && w->subid == subid && w->relid == relid &&
            (!only_running || w->proc))
        {
            res = w;
            break;
        }
    }

[1]
(gdb) p subid
$5 = 16391
(gdb) p relid
$6 = 16388
(gdb) p *w
$4 = {launch_time = 663760343317760, in_use = true, generation = 1, proc = 0x7fdfd9a7cc90,
  dbid = 12872, userid = 10, subid = 16391, relid = 0, relstate = 0 '\000', relstate_lsn = 0,
  relmutex = 0 '\000', last_lsn = 22798424, last_send_time = 663760483945980,
  last_recv_time = 663760483946087, reply_lsn = 22798424, reply_time = 663760483945980}

postgres=# select * from pg_stat_get_subscription(16391);
 subid | relid |  pid   | received_lsn |        last_msg_send_time        |      last_msg_receipt_time       | latest_end_lsn |         latest_end_time          
-------+-------+--------+--------------+----------------------------------+----------------------------------+----------------+----------------------------------
 16391 |       | 466779 | 0/15BE140    | 2021-01-12 15:26:48.778813+05:30 | 2021-01-12 15:26:48.778878+05:30 | 0/15BE140      | 2021-01-12 15:26:48.778813+05:30
(1 row)
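
To make the failed lookup concrete, here is a minimal, self-contained sketch of the matching rule in the loop above. It is a toy model, not PostgreSQL source: ToyWorker, toy_worker_find and toy_refresh_finds_worker are hypothetical names, the struct keeps only the fields the condition reads, and the OIDs are the ones from the gdb session in [1].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Toy stand-in for LogicalRepWorker: only the fields the lookup reads. */
typedef struct ToyWorker
{
    bool in_use;
    bool running;   /* stands in for "w->proc != NULL" */
    Oid  subid;
    Oid  relid;     /* 0 = apply worker, table OID = table sync worker */
} ToyWorker;

#define TOY_MAX_WORKERS 4

/* Same matching rule as the quoted loop, over a caller-supplied array. */
static const ToyWorker *
toy_worker_find(const ToyWorker *workers, int nworkers,
                Oid subid, Oid relid, bool only_running)
{
    assert(workers != NULL);

    for (int i = 0; i < nworkers; i++)
    {
        const ToyWorker *w = &workers[i];

        if (w->in_use && w->subid == subid && w->relid == relid &&
            (!only_running || w->running))
            return w;
    }
    return NULL;
}

/*
 * Scenario from the gdb session: a single apply worker (relid = 0) for
 * subscription 16391.  Returns true if a lookup with lookup_relid finds
 * a worker that could be stopped.
 */
static bool
toy_refresh_finds_worker(Oid lookup_relid)
{
    ToyWorker workers[TOY_MAX_WORKERS] = {
        { .in_use = true, .running = true, .subid = 16391, .relid = 0 },
    };

    return toy_worker_find(workers, TOY_MAX_WORKERS,
                           16391, lookup_relid, false) != NULL;
}
```

In this model, toy_refresh_finds_worker(16388) is false because the only worker has relid 0, while toy_refresh_finds_worker(0) is true. That matches the observation above: the stop request is keyed on the table's relid, so it can only ever match a table sync worker, never the per-subscription apply worker.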

> > 3) after the insertion into the table in the publisher(remember that
> > it's dropped from the publication in (1)), the walsender process is
> > unable detect that the table has been dropped from the publication
> > i.e. it doesn't look at the pg_publication_rel catalogue or some
> > other, but it only does is_publishable_relation() check which returns
> > true in pgoutput_change(). Maybe the walsender should look at the
> > catalogue pg_publication_rel in is_publishable_relation()?
> >
>
> We must be somewhere checking pg_publication_rel before sending the
> decoded change because otherwise, we would have sent the changes for
> the table which are not even part of this publication. I think you can
> try to create a separate table that is not part of the publication
> under test and see how the changes for that are filtered.

As pointed out by japin in the next thread, the walsender process on the publisher uses RelationSyncCache to hold the relations whose changes need to be sent to the subscriber. The RelationSyncCache is created during startup of the walsender process (pgoutput_startup->init_rel_sync_cache), where the rel_sync_cache_publication_cb callback is also registered. So, if the pg_publication_rel catalog table is altered, the callback rel_sync_cache_publication_cb is called and all the entries in the RelationSyncCache are marked as invalid, with the expectation that on the next use of any cache entry in get_rel_sync_entry (pgoutput_change->get_rel_sync_entry), that entry is validated again.

In this use case, the invalidation happens as expected in rel_sync_cache_publication_cb. While revalidating the entry in get_rel_sync_entry, since the given relation was attached to only one publication, pubids will be null and we don't set entry->pubactions.pubinsert/update/delete/truncate to false, so the publisher keeps publishing the inserts to the relation.

    /* Validate the entry */
    if (!entry->replicate_valid)
    {
        List   *pubids = GetRelationPublications(relid);

But we cannot set entry->pubactions.pubinsert/update/delete/truncate to false right away when there are no publications attached to the given relation, because in that case we don't know whether the user has run alter subscription...refresh publication on the subscriber.

If we want to achieve the drop table behaviour that's stated in the documentation (i.e. after the publisher drops a table from the publication, the subscriber keeps receiving the data until refresh publication is run on it), the right place to set those to false, i.e. to tell the walsender process not to publish further insertions, is at the point where the publisher learns that the user has run alter subscription...refresh publication on the subscriber. But currently I don't see the subscriber informing the publisher after AlterSubscription_refresh().
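
The stale-entry behaviour described above can be sketched with a toy model. This is not the pgoutput code: ToyRelSyncEntry and the toy_* functions are hypothetical, only the insert flag is modelled, and reset_first is an assumed knob that stands for clearing the cached flags before revalidation (the immediate-stop behaviour discussed above).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one RelationSyncCache entry. */
typedef struct ToyRelSyncEntry
{
    bool replicate_valid;   /* false after an invalidation callback */
    bool pubinsert;         /* cached "publish INSERTs" decision */
} ToyRelSyncEntry;

/* Models the invalidation callback: it only marks the entry stale. */
static void
toy_invalidate(ToyRelSyncEntry *entry)
{
    entry->replicate_valid = false;
}

/*
 * Models revalidation on next use.  npubs is the number of publications
 * the relation currently belongs to.  With reset_first = false the flag
 * is only ever switched on, so an empty publication list keeps the old
 * value.
 */
static bool
toy_should_publish_insert(ToyRelSyncEntry *entry, int npubs, bool reset_first)
{
    if (!entry->replicate_valid)
    {
        if (reset_first)
            entry->pubinsert = false;
        for (int i = 0; i < npubs; i++)
            entry->pubinsert = true;    /* every toy publication publishes INSERT */
        entry->replicate_valid = true;
    }
    return entry->pubinsert;
}

/* Scenario: table cached as published, then dropped from its only publication. */
static bool
toy_insert_sent_after_drop(bool reset_first)
{
    ToyRelSyncEntry entry = { .replicate_valid = false, .pubinsert = false };
    bool before_drop;

    before_drop = toy_should_publish_insert(&entry, 1, reset_first);   /* steps 1-4 */
    assert(before_drop);
    (void) before_drop;

    toy_invalidate(&entry);                                            /* step 5 */
    return toy_should_publish_insert(&entry, 0, reset_first);          /* step 7 */
}
```

With reset_first = false the model keeps sending inserts after the drop, which is the behaviour reported in this thread. With reset_first = true it stops as soon as the table leaves the publication, without waiting for refresh publication on the subscriber.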

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Bharath Rupireddy
In reply to this post by japin
On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
> IIUC the logical replication only replicate the tables in publication, I think
> when the tables that aren't in publication should not be replicated.
>
> Attached the patch that fixes it.  Thought?

With that change, we don't get the behaviour that's stated in the
document - "The ADD TABLE and DROP TABLE clauses will add and remove
one or more tables from the publication. Note that adding tables to a
publication that is already subscribed to will require a ALTER
SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
order to become effective" -
https://www.postgresql.org/docs/devel/sql-alterpublication.html.

With the patch, the publisher stops sending tuples as soon as the
relation is dropped from the publication, without waiting for alter
subscription ... refresh publication on the subscriber.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin

On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:

> On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
>> IIUC the logical replication only replicate the tables in publication, I think
>> when the tables that aren't in publication should not be replicated.
>>
>> Attached the patch that fixes it.  Thought?
>
> With that change, we don't get the behaviour that's stated in the
> document - "The ADD TABLE and DROP TABLE clauses will add and remove
> one or more tables from the publication. Note that adding tables to a
> publication that is already subscribed to will require a ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
> order to become effective" -
> https://www.postgresql.org/docs/devel/sql-alterpublication.html.
>

The documentation only mentions adding tables to a publication; it
does not cover dropping tables from a publication.

> The publisher stops sending the tuples whenever the relation gets
> dropped from the publication, not waiting until alter subscription ...
> refresh publication on the subscriber.
>

If we want to wait for the subscriber to execute alter subscription ... refresh publication,
maybe we should send some feedback to the walsender.  How can we send this feedback to
the walsender from a non-walreceiver process?

> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com


--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

akapila
On Tue, Jan 12, 2021 at 5:23 PM japin <[hidden email]> wrote:

>
> On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:
> > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
> >> IIUC the logical replication only replicate the tables in publication, I think
> >> when the tables that aren't in publication should not be replicated.
> >>
> >> Attached the patch that fixes it.  Thought?
> >
> > With that change, we don't get the behaviour that's stated in the
> > document - "The ADD TABLE and DROP TABLE clauses will add and remove
> > one or more tables from the publication. Note that adding tables to a
> > publication that is already subscribed to will require a ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
> > order to become effective" -
> > https://www.postgresql.org/docs/devel/sql-alterpublication.html.
> >
>
> The documentation only emphasize adding tables to a publication, not
> include dropping tables from a publication.
>

Right, and I think that is ensured by the subscriber calling
should_apply_changes_for_rel(), which won't return true until the
newly added relation has been synced via Refresh Publication. So, does
this mean that with or without this patch we would be sending the
changes of the newly published table from the publisher?
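For reference, the subscriber-side check can be modelled roughly as follows. This is a simplified sketch of the apply-worker branch only, written from memory: the RelEntry struct, the shortened function name and passing remote_final_lsn as a parameter are assumptions made to keep the example self-contained, the tablesync-worker branch is omitted, and the state characters are my recollection of the pg_subscription_rel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State codes as I recall them from pg_subscription_rel (assumption). */
#define SUBREL_STATE_UNKNOWN  '\0'
#define SUBREL_STATE_SYNCDONE 's'
#define SUBREL_STATE_READY    'r'

typedef uint64_t XLogRecPtr;

/* Hypothetical, trimmed-down stand-in for LogicalRepRelMapEntry. */
typedef struct RelEntry
{
    char       state;
    XLogRecPtr statelsn;
} RelEntry;

/*
 * Apply changes only for relations that are fully synced (READY), or whose
 * initial sync finished at or before the end of the current remote
 * transaction (SYNCDONE).  Anything else, including UNKNOWN, is skipped.
 */
bool
should_apply_changes(const RelEntry *rel, XLogRecPtr remote_final_lsn)
{
    assert(rel != NULL);
    return rel->state == SUBREL_STATE_READY ||
           (rel->state == SUBREL_STATE_SYNCDONE &&
            rel->statelsn <= remote_final_lsn);
}
```

Under this model a table that the subscriber has not yet picked up via Refresh Publication has no READY or SYNCDONE state, so its changes are skipped even if the publisher sends them.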

I have another question on your patch: why is it that in some cases,
such as when we have not inserted in step-5 (as mentioned by you), the
subsequent insertions are not sent? Are we somehow setting the
pubactions to false in that case, and if so, how?

> > The publisher stops sending the tuples whenever the relation gets
> > dropped from the publication, not waiting until alter subscription ...
> > refresh publication on the subscriber.
> >
>
> If we want to wait the subscriber executing alter subscription ... refresh publication,
> maybe we should send some feedback to walsender.  How can we send this feedback to
> walsender in non-walreceiver process?
>

I don't think we need this if what I said above is correct.

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

akapila
In reply to this post by Bharath Rupireddy
On Tue, Jan 12, 2021 at 4:59 PM Bharath Rupireddy
<[hidden email]> wrote:

>
> On Tue, Jan 12, 2021 at 12:06 PM Amit Kapila <[hidden email]> wrote:
> > > Here's my analysis:
> > > 1) in the publisher, alter publication drop table successfully
> > > removes(PublicationDropTables) the table from the catalogue
> > > pg_publication_rel
> > > 2) in the subscriber, alter subscription refresh publication
> > > successfully removes the table from the catalogue pg_subscription_rel
> > > (AlterSubscription_refresh->RemoveSubscriptionRel)
> > > so far so good
> > >
> >
> > Here, it should register the worker to stop on commit, and then on
> > commit it should call AtEOXact_ApplyLauncher to stop the apply worker.
> > Once the apply worker is stopped, the corresponding WALSender will
> > also be stopped. Something here is not happening as per expected
> > behavior.
>
> On the subscriber, an entry for worker stop is created in AlterSubscription_refresh --> logicalrep_worker_stop_at_commit. At the end of txn, in AtEOXact_ApplyLauncher, we try to stop that worker, but it cannot be stopped because logicalrep_worker_find returns null (AtEOXact_ApplyLauncher --> logicalrep_worker_stop --> logicalrep_worker_find). The worker entry for that subscriber is having relid as 0 [1], due to which the following if condition will not be hit. The apply worker on the subscriber related to the subscription on which refresh publication was run is not closed. It looks like relid 0 is valid because it will be applicable only during the table sync phase, the comment in the LogicalRepWorker structure says that.
>
> And also, I think, expecting the apply worker to be closed this way doesn't make sense because the apply worker is a per-subscription base, and the subscription can have other tables too.
>

Okay, that makes sense. As I said in my response to Li Japin, let's
focus on figuring out why we are sending the changes from the
publisher node in some cases and not in others.

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Bharath Rupireddy
In reply to this post by akapila
On Wed, Jan 13, 2021 at 10:33 AM Amit Kapila <[hidden email]> wrote:

>
> On Tue, Jan 12, 2021 at 5:23 PM japin <[hidden email]> wrote:
> >
> > On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:
> > > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
> > >> IIUC the logical replication only replicate the tables in publication, I think
> > >> when the tables that aren't in publication should not be replicated.
> > >>
> > >> Attached the patch that fixes it.  Thought?
> > >
> > > With that change, we don't get the behaviour that's stated in the
> > > document - "The ADD TABLE and DROP TABLE clauses will add and remove
> > > one or more tables from the publication. Note that adding tables to a
> > > publication that is already subscribed to will require a ALTER
> > > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
> > > order to become effective" -
> > > https://www.postgresql.org/docs/devel/sql-alterpublication.html.
> > >
> >
> > The documentation only emphasize adding tables to a publication, not
> > include dropping tables from a publication.
> >
>
> Right and I think that is ensured by the subscriber by calling
> should_apply_changes_for_rel() which won't return true unless the
> newly added relation is not synced via Refresh Publication. So, this
> means with or without this patch we should be sending the changes of
> the newly published table from the publisher?

Oh, my bad, alter subscription...refresh publication is required only when tables are added to the publication. The patch by japin makes the walsender process stop sending the data to the subscriber/apply worker. It is based on the idea of looking at PUBLICATIONRELMAP in get_rel_sync_entry once the entries have been invalidated in rel_sync_cache_publication_cb because of alter publication...drop table.

When alter subscription...refresh publication is run on the subscriber, the SUBSCRIPTIONRELMAP catalogue gets invalidated, but the corresponding cache entries in LogicalRepRelMap, which is used by logicalrep_rel_open, are not. LogicalRepRelMap is used to know which relations are associated with the subscription. We do not seem to invalidate those entries, even though the invalidation callback invalidate_syncing_table_states is registered for SUBSCRIPTIONRELMAP in ApplyWorkerMain. So we fail to update the entries in LogicalRepRelMap.

IMO, the ideal way to fix this issue is: 1) stop the walsender from sending changes for dropped tables, for which japin's patch works; 2) mark all the LogicalRepRelMap entries as invalid in invalidate_syncing_table_states, so that on the next logicalrep_rel_open an invalidated entry makes us call GetSubscriptionRelState to get the latest state, as shown in [1]. Likewise, we might also have to mark the cache entries invalid in subscription_change_cb, which is the invalidation callback for pg_subscription.

Thoughts?

[1] -
    if (entry->state != SUBREL_STATE_READY || entry->invalid)
        entry->state = GetSubscriptionRelState(MySubscription->oid,
                                               entry->localreloid,
                                               &entry->statelsn);

    if (entry->invalid)
        entry->invalid = false;

    return entry;
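A small self-contained model of the idea in [1] may help when reviewing it. Everything here is a hypothetical stand-in (Catalog, MapEntry, catalog_rel_state, mark_entry_invalid and rel_open are not PostgreSQL functions); the point is only to show how an invalid flag forces a re-read of the catalogue state even when the cached state is READY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBREL_STATE_UNKNOWN '\0'
#define SUBREL_STATE_READY   'r'

/* Hypothetical stand-in for the pg_subscription_rel row of one table. */
typedef struct Catalog
{
    bool has_row;
    char state;
} Catalog;

/* Hypothetical stand-in for one LogicalRepRelMap entry. */
typedef struct MapEntry
{
    char state;
    bool invalid;
} MapEntry;

/* Catalogue lookup analogue: a missing row is reported as UNKNOWN. */
char
catalog_rel_state(const Catalog *cat)
{
    assert(cat != NULL);
    return cat->has_row ? cat->state : SUBREL_STATE_UNKNOWN;
}

/* What the invalidation callback would do for each entry in this proposal. */
void
mark_entry_invalid(MapEntry *entry)
{
    assert(entry != NULL);
    entry->invalid = true;
}

/* Tail of the "open" step: re-read the state if not READY or if invalidated. */
MapEntry *
rel_open(MapEntry *entry, const Catalog *cat)
{
    assert(entry != NULL);
    if (entry->state != SUBREL_STATE_READY || entry->invalid)
        entry->state = catalog_rel_state(cat);

    entry->invalid = false;
    return entry;
}
```

Without the call to mark_entry_invalid(), a cached READY state survives the removal of the catalogue row; with it, the next rel_open() sees SUBREL_STATE_UNKNOWN.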

> I have another question on your patch which is why in some cases like
> when we have not inserted in step-5 (as mentioned by you) the
> following insertions are not sent. Is somehow we are setting the
> pubactions as false in that case, if so, how?

The issue reported in this thread occurs when the walsender process is already running and RelationSyncCache is initialized: we insert some data into the table, which is sent to the subscriber, and then the table is dropped from the publication. In get_rel_sync_entry we fail to set the pubactions to false, even though the cache entries have been invalidated.

In some cases it works properly because the old walsender process was stopped; when a new walsender is started, RelationSyncCache gets initialized again and the pubactions are set to false in get_rel_sync_entry.

    if (!found)
    {
        /* immediately make a new entry valid enough to satisfy callbacks */
        entry->schema_sent = false;
        entry->streamed_txns = NIL;
        entry->replicate_valid = false;
        entry->pubactions.pubinsert = entry->pubactions.pubupdate =
            entry->pubactions.pubdelete = entry->pubactions.pubtruncate = false;
        entry->publish_as_relid = InvalidOid;
    }

Hope that clarifies why the issue happens in some cases and not in other cases.
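The difference between the two paths can be sketched in a few lines of standalone C. This is a model under assumptions, not the pgoutput code: get_entry is a hypothetical find-or-create on a single slot, and revalidate_with_reset shows one possible shape of a fix (clear the flags before merging in the remaining publications). I am not claiming that this is what japin's patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct PubActions
{
    bool pubinsert;
    bool pubupdate;
    bool pubdelete;
    bool pubtruncate;
} PubActions;

/* Hypothetical single-slot cache entry. */
typedef struct SyncEntry
{
    bool       in_use;
    bool       replicate_valid;
    PubActions pubactions;
} SyncEntry;

/*
 * Find-or-create analogue: a new walsender starts with no entry, so the
 * "not found" path zeroes everything, including all pubactions flags.
 */
SyncEntry *
get_entry(SyncEntry *slot, bool *found)
{
    assert(slot != NULL && found != NULL);
    *found = slot->in_use;
    if (!*found)
    {
        memset(slot, 0, sizeof(*slot));
        slot->in_use = true;
    }
    return slot;
}

/*
 * Possible fix shape (assumption): clear the flags before merging, so an
 * invalidated entry with no remaining publications ends up all-false,
 * just like a freshly created one.
 */
void
revalidate_with_reset(SyncEntry *entry, const PubActions *pubs, size_t npubs)
{
    size_t i;

    assert(entry != NULL);
    if (entry->replicate_valid)
        return;

    memset(&entry->pubactions, 0, sizeof(entry->pubactions));
    for (i = 0; i < npubs; i++)
    {
        entry->pubactions.pubinsert |= pubs[i].pubinsert;
        entry->pubactions.pubupdate |= pubs[i].pubupdate;
        entry->pubactions.pubdelete |= pubs[i].pubdelete;
        entry->pubactions.pubtruncate |= pubs[i].pubtruncate;
    }
    entry->replicate_valid = true;
}
```

In this model a long-lived entry only loses its stale flags if the revalidation path clears them, whereas a restarted process gets the all-false state for free from the creation path.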

> > > The publisher stops sending the tuples whenever the relation gets
> > > dropped from the publication, not waiting until alter subscription ...
> > > refresh publication on the subscriber.
> > >
> >
> > If we want to wait the subscriber executing alter subscription ... refresh publication,
> > maybe we should send some feedback to walsender.  How can we send this feedback to
> > walsender in non-walreceiver process?
> >
>
> I don't think we need this if what I said above is correct.

My bad. This thought stemmed from my poor reading of the documentation.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

akapila
On Wed, Jan 13, 2021 at 11:08 AM Bharath Rupireddy
<[hidden email]> wrote:

>
> On Wed, Jan 13, 2021 at 10:33 AM Amit Kapila <[hidden email]> wrote:
> >
> > On Tue, Jan 12, 2021 at 5:23 PM japin <[hidden email]> wrote:
> > >
> > > On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:
> > > > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
> > > >> IIUC the logical replication only replicate the tables in publication, I think
> > > >> when the tables that aren't in publication should not be replicated.
> > > >>
> > > >> Attached the patch that fixes it.  Thought?
> > > >
> > > > With that change, we don't get the behaviour that's stated in the
> > > > document - "The ADD TABLE and DROP TABLE clauses will add and remove
> > > > one or more tables from the publication. Note that adding tables to a
> > > > publication that is already subscribed to will require a ALTER
> > > > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
> > > > order to become effective" -
> > > > https://www.postgresql.org/docs/devel/sql-alterpublication.html.
> > > >
> > >
> > > The documentation only emphasize adding tables to a publication, not
> > > include dropping tables from a publication.
> > >
> >
> > Right and I think that is ensured by the subscriber by calling
> > should_apply_changes_for_rel() which won't return true unless the
> > newly added relation is not synced via Refresh Publication. So, this
> > means with or without this patch we should be sending the changes of
> > the newly published table from the publisher?
>
> Oh, my bad, alter subscription...refresh publication is required only when the tables are added to the publisher. Patch by japin makes the walsender process to stop sending the data to the subscriber/apply worker. The patch is based on the idea of looking at the PUBLICATIONRELMAP in get_rel_sync_entry when the entries have been invalidated in rel_sync_cache_publication_cb because of alter publication...drop table.
> ,
> When the alter subscription...refresh publication is run on the subscriber, the SUBSCRIPTIONRELMAP catalogue gets invalidated but the corresponding cache entries in the LogicalRepRelMap which is used by logicalrep_rel_open are not invalidated. LogicalRepRelMap is used to know the relations that are associated with the subscription. But we seem to have not taken care of invalidating those entries, though we have the invalidation callback invalidate_syncing_table_states registered for SUBSCRIPTIONRELMAP in ApplyWorkerMain. So, we miss to work on updating the entries in LogicalRepRelMap.
>
> IMO, the ideal way to fix this issue is 1) stop the walsender sending the changes to dropped tables, for this japin patch works 2) we must mark all the LogicalRepRelMap entries as invalid in invalidate_syncing_table_states so that in the next logicalrep_rel_open, if the entry is invalidated, then we have to call GetSubscriptionRelState to get the latest state, as shown in [1]. Likewise, we might also have to mark the cache entries invalid in  subscription_change_cb which is invalidation callback for pg_subscription
>

Is the second point you described here related to the originally
reported bug, or will it cause some other problem?

> Thoughts?
>
> [1] -
>     if (entry->state != SUBREL_STATE_READY || entry->invalid)
>         entry->state = GetSubscriptionRelState(MySubscription->oid,
>                                                entry->localreloid,
>                                                &entry->statelsn);
>
>    if (entry->invalid)
>        entry->invalid = false;
>
>     return entry;
>
> > I have another question on your patch which is why in some cases like
> > when we have not inserted in step-5 (as mentioned by you) the
> > following insertions are not sent. Is somehow we are setting the
> > pubactions as false in that case, if so, how?
>
> The reason is that the issue reported in this thread occurs - when we have the walsender process running, RelationSyncCache is initialized, we inserted some data into the table that's sent to the subscriber and the table is dropped, we miss to set the pubactions to false in get_rel_sync_entry, though the cache entries have been invalidated.
>
> In some cases, it works properly because the old walsender process was stopped and when a new walsender is started, then the cache RelationSyncCache gets initialized again and the pubactions will be set to false in get_rel_sync_entry.
>

Why was the walsender process getting stopped in one case but not in the other?

--
With Regards,
Amit Kapila.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin
In reply to this post by akapila

On Wed, 13 Jan 2021 at 13:26, Amit Kapila wrote:

> On Tue, Jan 12, 2021 at 4:59 PM Bharath Rupireddy
> <[hidden email]> wrote:
>>
>> On Tue, Jan 12, 2021 at 12:06 PM Amit Kapila <[hidden email]> wrote:
>> > > Here's my analysis:
>> > > 1) in the publisher, alter publication drop table successfully
>> > > removes(PublicationDropTables) the table from the catalogue
>> > > pg_publication_rel
>> > > 2) in the subscriber, alter subscription refresh publication
>> > > successfully removes the table from the catalogue pg_subscription_rel
>> > > (AlterSubscription_refresh->RemoveSubscriptionRel)
>> > > so far so good
>> > >
>> >
>> > Here, it should register the worker to stop on commit, and then on
>> > commit it should call AtEOXact_ApplyLauncher to stop the apply worker.
>> > Once the apply worker is stopped, the corresponding WALSender will
>> > also be stopped. Something here is not happening as per expected
>> > behavior.
>>
>> On the subscriber, an entry for worker stop is created in AlterSubscription_refresh --> logicalrep_worker_stop_at_commit. At the end of txn, in AtEOXact_ApplyLauncher, we try to stop that worker, but it cannot be stopped because logicalrep_worker_find returns null (AtEOXact_ApplyLauncher --> logicalrep_worker_stop --> logicalrep_worker_find). The worker entry for that subscriber is having relid as 0 [1], due to which the following if condition will not be hit. The apply worker on the subscriber related to the subscription on which refresh publication was run is not closed. It looks like relid 0 is valid because it will be applicable only during the table sync phase, the comment in the LogicalRepWorker structure says that.
>>
>> And also, I think, expecting the apply worker to be closed this way doesn't make sense because the apply worker is a per-subscription base, and the subscription can have other tables too.
>>
>
> Okay, that makes sense. As responded to Li Japin, let's focus on
> figuring out why we are sending the changes from the publisher node in
> some cases and not in other cases.

After some analysis, I find that changes to the dropped table are always sent to the subscriber.
The difference is that if we drop the table from the publication and then refresh the
publication (on the subscriber), the LogicalRepRelMapEntry seen in should_apply_changes_for_rel()
has its state set to SUBREL_STATE_UNKNOWN.

(gdb) p *rel
$2 = {remoterel = {remoteid = 16410, nspname = 0x5564fb0177c0 "public",
    relname = 0x5564fb0177a0 "t1", natts = 1, attnames = 0x5564fb0177e0, atttyps = 0x5564fb017780,
    replident = 100 'd', relkind = 0 '\000', attkeys = 0x0}, localrelvalid = true,
  localreloid = 16412, localrel = 0x7f78705da1b8, attrmap = 0x5564fb017800, updatable = false,
  *state = 0 '\000'*, statelsn = 0}

If we insert data between dropping the table from the publication and refreshing the
publication, the LogicalRepRelMapEntry state is always SUBREL_STATE_READY.

(gdb) p *rel
$2 = {remoterel = {remoteid = 16410, nspname = 0x5564fb0177c0 "public",
    relname = 0x5564fb0177a0 "t1", natts = 1, attnames = 0x5564fb0177e0, atttyps = 0x5564fb017780,
    replident = 100 'd', relkind = 0 '\000', attkeys = 0x0}, localrelvalid = true,
  localreloid = 16412, localrel = 0x7f78705d9d38, attrmap = 0x5564fb017800, updatable = false,
  *state = 114 'r'*, statelsn = 23545672}

I will dig into why the state of the LogicalRepRelMapEntry doesn't change in the second case.

Any suggestion is welcome!

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Bharath Rupireddy
In reply to this post by akapila
On Wed, Jan 13, 2021 at 11:27 AM Amit Kapila <[hidden email]> wrote:

>
> On Wed, Jan 13, 2021 at 11:08 AM Bharath Rupireddy
> <[hidden email]> wrote:
> >
> > On Wed, Jan 13, 2021 at 10:33 AM Amit Kapila <[hidden email]> wrote:
> > >
> > > On Tue, Jan 12, 2021 at 5:23 PM japin <[hidden email]> wrote:
> > > >
> > > > On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:
> > > > > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
> > > > >> IIUC the logical replication only replicate the tables in publication, I think
> > > > >> when the tables that aren't in publication should not be replicated.
> > > > >>
> > > > >> Attached the patch that fixes it.  Thought?
> > > > >
> > > > > With that change, we don't get the behaviour that's stated in the
> > > > > document - "The ADD TABLE and DROP TABLE clauses will add and remove
> > > > > one or more tables from the publication. Note that adding tables to a
> > > > > publication that is already subscribed to will require a ALTER
> > > > > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
> > > > > order to become effective" -
> > > > > https://www.postgresql.org/docs/devel/sql-alterpublication.html.
> > > > >
> > > >
> > > > The documentation only emphasize adding tables to a publication, not
> > > > include dropping tables from a publication.
> > > >
> > >
> > > Right and I think that is ensured by the subscriber by calling
> > > should_apply_changes_for_rel() which won't return true unless the
> > > newly added relation is not synced via Refresh Publication. So, this
> > > means with or without this patch we should be sending the changes of
> > > the newly published table from the publisher?
> >
> > Oh, my bad, alter subscription...refresh publication is required only when the tables are added to the publisher. Patch by japin makes the walsender process to stop sending the data to the subscriber/apply worker. The patch is based on the idea of looking at the PUBLICATIONRELMAP in get_rel_sync_entry when the entries have been invalidated in rel_sync_cache_publication_cb because of alter publication...drop table.
> > ,
> > When the alter subscription...refresh publication is run on the subscriber, the SUBSCRIPTIONRELMAP catalogue gets invalidated but the corresponding cache entries in the LogicalRepRelMap which is used by logicalrep_rel_open are not invalidated. LogicalRepRelMap is used to know the relations that are associated with the subscription. But we seem to have not taken care of invalidating those entries, though we have the invalidation callback invalidate_syncing_table_states registered for SUBSCRIPTIONRELMAP in ApplyWorkerMain. So, we miss to work on updating the entries in LogicalRepRelMap.
> >
> > IMO, the ideal way to fix this issue is 1) stop the walsender sending the changes to dropped tables, for this japin patch works 2) we must mark all the LogicalRepRelMap entries as invalid in invalidate_syncing_table_states so that in the next logicalrep_rel_open, if the entry is invalidated, then we have to call GetSubscriptionRelState to get the latest state, as shown in [1]. Likewise, we might also have to mark the cache entries invalid in  subscription_change_cb which is invalidation callback for pg_subscription
> >
>
> Is the second point you described here is related to the original
> bug-reported or will it cause some other problem?

It can cause some other problems. I found this while investigating
from the subscriber perspective why the subscriber is accepting the
inserts even though the relation is removed from the
pg_subscription_rel catalogue after refresh publication. I ended up
finding that the cache entries are not being invalidated in
invalidate_syncing_table_states.

Having said that, the patch proposed by japin is enough to solve the
bug reported here.

While we are fixing the bug, I thought it's better to fix this as
well, maybe as a 0002 patch? If okay, I can work on the patch and post
it in a separate thread?

> > Thoughts?
> >
> > [1] -
> >     if (entry->state != SUBREL_STATE_READY || entry->invalid)
> >         entry->state = GetSubscriptionRelState(MySubscription->oid,
> >                                                entry->localreloid,
> >                                                &entry->statelsn);
> >
> >    if (entry->invalid)
> >        entry->invalid = false;
> >
> >     return entry;
> >
> > > I have another question on your patch which is why in some cases like
> > > when we have not inserted in step-5 (as mentioned by you) the
> > > following insertions are not sent. Is somehow we are setting the
> > > pubactions as false in that case, if so, how?
> >
> > The reason is that the issue reported in this thread occurs - when we have the walsender process running, RelationSyncCache is initialized, we inserted some data into the table that's sent to the subscriber and the table is dropped, we miss to set the pubactions to false in get_rel_sync_entry, though the cache entries have been invalidated.
> >
> > In some cases, it works properly because the old walsender process was stopped and when a new walsender is started, then the cache RelationSyncCache gets initialized again and the pubactions will be set to false in get_rel_sync_entry.
> >
>
> Why is walsender process was getting stopped in one case but not in another?

Both walsender and logical replication apply workers are getting
stopped because of the default values of 60s for wal_sender_timeout
and wal_receiver_timeout respectively. To analyze further, I increased
both timeouts to 600s.

Here's what is happening:

Note that for the sake of simplicity, I'm skipping the create
publication, subscription, initial insertion statements, I'm starting
from alter publication ... drop table statement.

case 1 - issue reported in this thread is observed:
1) on publisher - alter publication ... drop table t1; and insert into
t1 values(67);   Though the table is dropped from the publication, the
walsender process publishes the inserts, which will be fixed by the
patch proposed by japin.

2) on subscriber - on the insert into t1 from (1) on the publisher,
logicalrep_relmap_update gets called in the subscriber because the
publisher informs the subscriber that it has dropped the table from
the publication. In logicalrep_relmap_update, the LogicalRepRelMap
cache entry corresponding to the dropped relation is reset. Since
alter subscription ... refresh publication has not yet been run, the
insert is applied to the target table. The LogicalRepRelMap cache
entry for that table is updated in logicalrep_rel_open: because the
reset in logicalrep_relmap_update left entry->state as
SUBREL_STATE_UNKNOWN, its state is read from the catalogues using
GetSubscriptionRelState, after which entry->state becomes
SUBREL_STATE_READY.
    if (entry->state != SUBREL_STATE_READY)
        entry->state = GetSubscriptionRelState(MySubscription->oid,
                                               entry->localreloid,
                                               &entry->statelsn);

3) on subscriber - run alter subscription ... refresh publication, the
dropped relation entry is removed from the pg_subscription_rel
catalogue and invalidate_syncing_table_states gets called. But note
that we have not invalidated the LogicalRepRelMap. The
LogicalRepRelMap cache entry for the table remains what we have set in
(2) i.e. entry->state is SUBREL_STATE_READY.

4) on publisher - insert into t1 values(68); as mentioned in (1),
walsender process keeps sending the inserts.

5) on subscriber - apply_handle_insert is called, and
logicalrep_rel_open does not call GetSubscriptionRelState to pick up
the pg_subscription_rel change made in (3), because entry->state is
still SUBREL_STATE_READY as set in (2). So it ends up applying the
incoming inserts until GetSubscriptionRelState is called for that
entry, which happens either when the apply worker stops and restarts
due to the timeout, or when another change to the publication happens
on the publisher so that logicalrep_relmap_update is called on the
subscriber. If we had marked all the LogicalRepRelMap cache entries as
invalid in the invalidate_syncing_table_states callback, as I pointed
out in the earlier mail, this problem would be solved.

case 2 - issue is not observed:
1) on publisher - alter publication ... drop table t1; no inserts into
the table t1, subscriber will not receive logicalrep_relmap_update
yet.

2) on subscriber - run alter subscription ... refresh publication, the
dropped relation entry is removed from the pg_subscription_rel
catalogue and invalidate_syncing_table_states gets called. But note
that we have not invalidated the LogicalRepRelMap. The
LogicalRepRelMap cache entry for the table remains the same.

3) on publisher - insert into t1 values(67);   Though the table is
dropped from the publication (1), the walsender process publishes the
inserts, which will be fixed by the patch proposed by japin.

4) on subscriber - on the insert into t1 from (3) on the publisher,
logicalrep_relmap_update gets called in the subscriber because the
publisher informs the subscriber that it has dropped the table from
the publication. In logicalrep_relmap_update, the LogicalRepRelMap
cache entry corresponding to the dropped relation is reset, so the
next logicalrep_rel_open calls GetSubscriptionRelState (as
entry->state is SUBREL_STATE_UNKNOWN after the reset in
logicalrep_relmap_update). GetSubscriptionRelState returns
SUBREL_STATE_UNKNOWN because it doesn't find the tuple in the updated
pg_subscription_rel catalogue: alter subscription ... refresh
publication has already removed the tuple for the table. So the
inserts are ignored, because should_apply_changes_for_rel returns
false after logicalrep_rel_open in apply_handle_insert.
    /* Try finding the mapping. */
    tup = SearchSysCache2(SUBSCRIPTIONRELMAP,
                          ObjectIdGetDatum(relid),
                          ObjectIdGetDatum(subid));

    if (!HeapTupleIsValid(tup))
    {
        *sublsn = InvalidXLogRecPtr;
        return SUBREL_STATE_UNKNOWN;
    }

5) From here on, all further inserts into the publication table are
ignored by the subscriber for the same reason as in (4). So, the issue
reported in this thread doesn't occur.
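The two orderings above can be replayed with a tiny state machine. This is a deliberately simplified model with hypothetical names (Subscriber, relmap_update, refresh_publication, apply_insert); it assumes the catalogue row is in READY state whenever it exists, and it assumes the cache entry starts out READY in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBREL_STATE_UNKNOWN '\0'
#define SUBREL_STATE_READY   'r'

/* Hypothetical subscriber-side view of table t1. */
typedef struct Subscriber
{
    bool catalog_has_row;   /* row for t1 in pg_subscription_rel */
    char cached_state;      /* state held in the LogicalRepRelMap entry */
} Subscriber;

/* A relation message arrived: the cached entry is reset. */
void
relmap_update(Subscriber *s)
{
    assert(s != NULL);
    s->cached_state = SUBREL_STATE_UNKNOWN;
}

/* Refresh removes the catalogue row but leaves the cache untouched. */
void
refresh_publication(Subscriber *s)
{
    assert(s != NULL);
    s->catalog_has_row = false;
}

/* Re-read the state only if the cache is not READY; true means "applied". */
bool
apply_insert(Subscriber *s)
{
    assert(s != NULL);
    if (s->cached_state != SUBREL_STATE_READY)
        s->cached_state = s->catalog_has_row ? SUBREL_STATE_READY
                                             : SUBREL_STATE_UNKNOWN;
    return s->cached_state == SUBREL_STATE_READY;
}

/* Case 1: an insert arrives before the refresh, another one after it. */
bool
case1_insert_applied(void)
{
    Subscriber s = { true, SUBREL_STATE_READY };

    relmap_update(&s);          /* step 2: relation message for insert 67 */
    (void) apply_insert(&s);    /* step 2: state re-read, becomes READY */
    refresh_publication(&s);    /* step 3 */
    return apply_insert(&s);    /* step 5: insert 68 */
}

/* Case 2: the refresh runs first, the first insert arrives afterwards. */
bool
case2_insert_applied(void)
{
    Subscriber s = { true, SUBREL_STATE_READY };

    refresh_publication(&s);    /* step 2 */
    relmap_update(&s);          /* step 4: relation message for insert 67 */
    return apply_insert(&s);    /* step 4: state re-read, stays UNKNOWN */
}
```

In this model case1_insert_applied() reports that the insert after the refresh is still applied, while case2_insert_applied() reports that it is ignored, which matches the two outcomes described above.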

I'm sorry for the huge write up. I hope I clarified the point this time.

In summary, I feel we need to stop the publisher from sending the
inserts once the table is dropped from the publication, which is what
the patch proposed by japin does. This solves the bug reported in this
thread.

And also, it's good to have the LogicalRepRelMap invalidation fix as a
0002 patch in invalidate_syncing_table_states, subscription_change_cb
and logicalrep_rel_open as proposed by me.
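
To make the two cases and the proposed 0002 fix easier to follow, here is
a minimal toy model in plain C. It is only a sketch: every Toy* type and
toy_* function is a hypothetical stand-in rather than PostgreSQL code, the
catalogue is reduced to a single boolean, and should_apply_changes_for_rel
is simplified to "state is READY".

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for SUBREL_STATE_UNKNOWN / SUBREL_STATE_READY. */
typedef enum { TOY_STATE_UNKNOWN, TOY_STATE_READY } ToyRelState;

/* Toy model of one LogicalRepRelMap entry (hypothetical layout). */
typedef struct
{
    ToyRelState state;   /* cached copy of the pg_subscription_rel state */
    bool        invalid; /* proposed flag: cached state must be re-read */
} ToyRelEntry;

/* Toy model of pg_subscription_rel: is the table still listed? */
typedef struct
{
    bool in_subscription_rel;
} ToyCatalog;

/* Stand-in for GetSubscriptionRelState(): no tuple means UNKNOWN. */
ToyRelState
toy_get_state(const ToyCatalog *cat)
{
    return cat->in_subscription_rel ? TOY_STATE_READY : TOY_STATE_UNKNOWN;
}

/* Stand-in for logicalrep_rel_open(); use_fix enables the extra check. */
void
toy_rel_open(ToyRelEntry *entry, const ToyCatalog *cat, bool use_fix)
{
    if (entry->state != TOY_STATE_READY || (use_fix && entry->invalid))
        entry->state = toy_get_state(cat);
    entry->invalid = false;
}

/* Stand-in for REFRESH PUBLICATION plus the invalidation callback. */
void
toy_refresh(ToyRelEntry *entry, ToyCatalog *cat)
{
    cat->in_subscription_rel = false; /* tuple removed from the catalogue */
    entry->invalid = true;            /* proposed: mark the cache entry stale */
}

/* Case 1: an insert is applied before the refresh, another one after it. */
bool
toy_case1_applies_after_refresh(bool use_fix)
{
    ToyCatalog  cat = { true };
    ToyRelEntry entry = { TOY_STATE_UNKNOWN, false };

    toy_rel_open(&entry, &cat, use_fix); /* first insert: state becomes READY */
    toy_refresh(&entry, &cat);
    toy_rel_open(&entry, &cat, use_fix); /* insert after the refresh */
    return entry.state == TOY_STATE_READY;
}

/* Case 2: the refresh runs first; the entry is in its reset state. */
bool
toy_case2_applies_after_refresh(bool use_fix)
{
    ToyCatalog  cat = { true };
    ToyRelEntry entry = { TOY_STATE_UNKNOWN, false };

    toy_refresh(&entry, &cat);
    toy_rel_open(&entry, &cat, use_fix); /* first insert after the refresh */
    return entry.state == TOY_STATE_READY;
}

/* Self-check summarizing the outcomes described in the two cases. */
void
toy_self_check(void)
{
    assert(toy_case1_applies_after_refresh(false));  /* stale READY is reused */
    assert(!toy_case1_applies_after_refresh(true));  /* flag forces a re-read */
    assert(!toy_case2_applies_after_refresh(false)); /* already ignored */
}
```

In this model, case 1 keeps applying inserts only because the cached READY
state is never re-read; once the entry is marked invalid, the next open
re-reads the catalogue and the insert is skipped. Case 2 behaves the same
with or without the flag.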

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

japin

On Wed, 13 Jan 2021 at 16:49, Bharath Rupireddy wrote:

> On Wed, Jan 13, 2021 at 11:27 AM Amit Kapila <[hidden email]> wrote:
>>
>> On Wed, Jan 13, 2021 at 11:08 AM Bharath Rupireddy
>> <[hidden email]> wrote:
>> >
>> > On Wed, Jan 13, 2021 at 10:33 AM Amit Kapila <[hidden email]> wrote:
>> > >
>> > > On Tue, Jan 12, 2021 at 5:23 PM japin <[hidden email]> wrote:
>> > > >
>> > > > On Tue, 12 Jan 2021 at 19:32, Bharath Rupireddy wrote:
>> > > > > On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:
>> > > > >> IIUC logical replication only replicates the tables in the publication, so I think
>> > > > >> tables that aren't in the publication should not be replicated.
>> > > > >>
>> > > > >> Attached is a patch that fixes it.  Thoughts?
>> > > > >
>> > > > > With that change, we don't get the behaviour that's stated in the
>> > > > > document - "The ADD TABLE and DROP TABLE clauses will add and remove
>> > > > > one or more tables from the publication. Note that adding tables to a
>> > > > > publication that is already subscribed to will require a ALTER
>> > > > > SUBSCRIPTION ... REFRESH PUBLICATION action on the subscribing side in
>> > > > > order to become effective" -
>> > > > > https://www.postgresql.org/docs/devel/sql-alterpublication.html.
>> > > > >
>> > > >
>> > > > The documentation only emphasize adding tables to a publication, not
>> > > > include dropping tables from a publication.
>> > > >
>> > >
>> > > Right and I think that is ensured by the subscriber by calling
>> > > should_apply_changes_for_rel() which won't return true unless the
>> > > newly added relation is not synced via Refresh Publication. So, this
>> > > means with or without this patch we should be sending the changes of
>> > > the newly published table from the publisher?
>> >
>> > Oh, my bad, alter subscription...refresh publication is required only when the tables are added to the publisher. Patch by japin makes the walsender process to stop sending the data to the subscriber/apply worker. The patch is based on the idea of looking at the PUBLICATIONRELMAP in get_rel_sync_entry when the entries have been invalidated in rel_sync_cache_publication_cb because of alter publication...drop table.
>> >
>> > When the alter subscription...refresh publication is run on the subscriber, the SUBSCRIPTIONRELMAP catalogue gets invalidated but the corresponding cache entries in the LogicalRepRelMap which is used by logicalrep_rel_open are not invalidated. LogicalRepRelMap is used to know the relations that are associated with the subscription. But we seem to have not taken care of invalidating those entries, though we have the invalidation callback invalidate_syncing_table_states registered for SUBSCRIPTIONRELMAP in ApplyWorkerMain. So, we miss to work on updating the entries in LogicalRepRelMap.
>> >
>> > IMO, the ideal way to fix this issue is 1) stop the walsender sending the changes to dropped tables, for this japin patch works 2) we must mark all the LogicalRepRelMap entries as invalid in invalidate_syncing_table_states so that in the next logicalrep_rel_open, if the entry is invalidated, then we have to call GetSubscriptionRelState to get the latest state, as shown in [1]. Likewise, we might also have to mark the cache entries invalid in  subscription_change_cb which is invalidation callback for pg_subscription
>> >
>>
>> Is the second point you described here related to the originally
>> reported bug, or will it cause some other problem?
>
> It can cause some other problems. I found this while investigating
> from the subscriber perspective why the subscriber is accepting the
> inserts even though the relation is removed from the
> pg_subscription_rel catalogue after refresh publication. I ended up
> finding that the cache entries are not being invalidated in
> invalidate_syncing_table_states.
>
> Having said that, the patch proposed by japin is enough to solve the
> bug reported here.
>
> While we are fixing the bug, I thought it's better to fix this as
> well, maybe as a 0002 patch? If okay, I can work on the patch and post
> it in a separate thread?
>
>> > Thoughts?
>> >
>> > [1] -
>> >     if (entry->state != SUBREL_STATE_READY || entry->invalid)
>> >         entry->state = GetSubscriptionRelState(MySubscription->oid,
>> >                                                entry->localreloid,
>> >                                                &entry->statelsn);
>> >
>> >    if (entry->invalid)
>> >        entry->invalid = false;
>> >
>> >     return entry;
>> >
>> > > I have another question on your patch which is why in some cases like
>> > > when we have not inserted in step-5 (as mentioned by you) the
>> > > following insertions are not sent. Is somehow we are setting the
>> > > pubactions as false in that case, if so, how?
>> >
>> > The reason is that the issue reported in this thread occurs - when we have the walsender process running, RelationSyncCache is initialized, we inserted some data into the table that's sent to the subscriber and the table is dropped, we miss to set the pubactions to false in get_rel_sync_entry, though the cache entries have been invalidated.
>> >
>> > In some cases, it works properly because the old walsender process was stopped and when a new walsender is started, then the cache RelationSyncCache gets initialized again and the pubactions will be set to false in get_rel_sync_entry.
>> >
>>
>> Why was the walsender process getting stopped in one case but not in another?
>
> Both walsender and logical replication apply workers are getting
> stopped because of the default values of 60s for wal_sender_timeout
> and wal_receiver_timeout respectively. To analyze further, I increased
> both timeouts to 600s.
>
> Here's what's happening:
>
> Note that for the sake of simplicity, I'm skipping the create
> publication, subscription, initial insertion statements, I'm starting
> from alter publication ... drop table statement.
>
> case 1 - issue reported in this thread is observed:
> 1) on publisher - alter publication ... drop table t1; and insert into
> t1 values(67);   Though the table is dropped from the publication, the
> walsender process publishes the inserts, which will be fixed by the
> patch proposed by japin.
>
> 2) on subscriber - on the insert into t1(1) on the publisher,
> logicalrep_relmap_update gets called in the subscriber because the
> publisher informs the subscriber that it has dropped the table from
> the publication. In logicalrep_relmap_update, the LogicalRepRelMap
> cache entry corresponding to the dropped relation is reset. Since the
> alter subscription ... refresh publication is not yet run, the insert
> is taken to the target table and the LogicalRepRelMap cache entry for
> that table is updated in logicalrep_rel_open, its state is read from
> the catalogues using GetSubscriptionRelState, because of the entry
> reset in logicalrep_relmap_update, entry->state is
> SUBREL_STATE_UNKNOWN. After GetSubscriptionRelState, entry->state
> becomes SUBREL_STATE_READY
>     if (entry->state != SUBREL_STATE_READY)
>         entry->state = GetSubscriptionRelState(MySubscription->oid,
>                                                entry->localreloid,
>                                                &entry->statelsn);
>
> 3) on subscriber - run alter subscription ... refresh publication, the
> dropped relation entry is removed from the pg_subscription_rel
> catalogue and invalidate_syncing_table_states gets called. But note
> that we have not invalidated the LogicalRepRelMap. The
> LogicalRepRelMap cache entry for the table remains what we have set in
> (2) i.e. entry->state is SUBREL_STATE_READY.
>
> 4) on publisher - insert into t1 values(68); as mentioned in (1),
> walsender process keeps sending the inserts.
>
> 5) on subscriber - apply_handle_insert is called and the
> logicalrep_rel_open will not call GetSubscriptionRelState to get the
> new catalogue entry for pg_subscription_rel (3), because the
> entry->state is still SUBREL_STATE_READY which was set (2), so it ends
> up in applying the incoming inserts, until the GetSubscriptionRelState
> is called for that entry which happens either when the apply worker stops
> and restarts due to timeout or another change to the publication
> happens in the publisher so that logicalrep_relmap_update is called on
> the subscriber. If we had marked all the LogicalRepRelMap cache
> entries as invalid in the invalidate_syncing_table_states callback as
> pointed out by me in the earlier mail, then this problem will be
> solved.
>
> case 2 - issue is not observed:
> 1) on publisher - alter publication ... drop table t1; no inserts into
> the table t1, subscriber will not receive logicalrep_relmap_update
> yet.
>
> 2) on subscriber - run alter subscription ... refresh publication, the
> dropped relation entry is removed from the pg_subscription_rel
> catalogue and invalidate_syncing_table_states gets called. But note
> that we have not invalidated the LogicalRepRelMap. The
> LogicalRepRelMap cache entry for the table remains the same.
>
> 3) on publisher - insert into t1 values(67);   Though the table is
> dropped from the publication (1), the walsender process publishes the
> inserts, which will be fixed by the patch proposed by japin.
>
> 4) on subscriber - when the insert from (3) arrives from the publisher,
> logicalrep_relmap_update gets called in the subscriber because the
> publisher informs the subscriber that it has dropped the table from
> the publication. In logicalrep_relmap_update, the LogicalRepRelMap
> cache entry corresponding to the dropped relation is reset, so the
> next logicalrep_rel_open calls GetSubscriptionRelState (entry->state
> is SUBREL_STATE_UNKNOWN due to the reset in logicalrep_relmap_update).
> GetSubscriptionRelState returns SUBREL_STATE_UNKNOWN because it doesn't
> find the tuple in pg_subscription_rel: alter subscription ... refresh
> publication has already removed the tuple for this table. So, the
> insert is ignored because should_apply_changes_for_rel returns false
> after the logicalrep_rel_open in apply_handle_insert.
>     /* Try finding the mapping. */
>     tup = SearchSysCache2(SUBSCRIPTIONRELMAP,
>                           ObjectIdGetDatum(relid),
>                           ObjectIdGetDatum(subid));
>
>     if (!HeapTupleIsValid(tup))
>     {
>         *sublsn = InvalidXLogRecPtr;
>         return SUBREL_STATE_UNKNOWN;
>     }
>
> 5) From here on, all further inserts into the publication table are
> ignored by the subscriber for the same reason as in (4). So, the issue
> reported in this thread doesn't occur.
>
> I'm sorry for the huge write up. I hope I clarified the point this time.
>

Yes! Very clearly.

> In summary, I feel we need to stop the publisher from sending the
> inserts once the table is dropped from the publication, which is what
> the patch proposed by japin does. This solves the bug reported in this
> thread.
>
> And also, it's good to have the LogicalRepRelMap invalidation fix as a
> 0002 patch in invalidate_syncing_table_states, subscription_change_cb
> and logicalrep_rel_open as proposed by me.
>
> Thoughts?
>

I think invalidating the LogicalRepRelMap is necessary.  If the table isn't in
the subscription, can we remove the LogicalRepRelMapEntry from LogicalRepRelMap?

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



Re: Logical Replication - behavior of ALTER PUBLICATION .. DROP TABLE and ALTER SUBSCRIPTION .. REFRESH PUBLICATION

Dilip Kumar-2
In reply to this post by japin
On Tue, Jan 12, 2021 at 4:47 PM Li Japin <[hidden email]> wrote:

>
>
> On Jan 12, 2021, at 5:47 PM, japin <[hidden email]> wrote:
>
>
> On Tue, 12 Jan 2021 at 14:38, Amit Kapila wrote:
>
> On Tue, Jan 12, 2021 at 11:39 AM Bharath Rupireddy
> <[hidden email]> wrote:
>
>
> On Tue, Jan 12, 2021 at 9:05 AM Amit Kapila <[hidden email]> wrote:
>
>
> On Mon, Jan 11, 2021 at 6:51 PM Bharath Rupireddy
> <[hidden email]> wrote:
>
>
> Hi,
>
> While providing thoughts on the design in [1], I found a strange
> behaviour with the $subject. The use case is shown below as a sequence
> of steps that need to be run on publisher and subscriber to arrive at
> the strange behaviour.  In step 5, the table is dropped from the
> publication and in step 6, the refresh publication is run on the
> subscriber, from here onwards, the expectation is that no further
> inserts into the publisher table have to be replicated on to the
> subscriber, but the opposite happens i.e. the inserts are still
> replicated to the subscriber. ISTM as a bug. Let me know if I'm
> missing anything.
>
>
> Did you try to investigate what's going on? Can you please check what
> is the behavior if, after step-5, you restart the subscriber and
> separately try creating a new subscription (maybe on a different
> server) for that publication after step-5 and see if that allows the
> relation to be replicated? AFAIU, in AlterSubscription_refresh, we
> remove such dropped rels and stop their corresponding apply workers
> which should stop the further replication of such relations but that
> doesn't seem to be happening in your case.
>
>
> Here's my analysis:
> 1) in the publisher, alter publication drop table successfully
> removes(PublicationDropTables) the table from the catalogue
> pg_publication_rel
> 2) in the subscriber, alter subscription refresh publication
> successfully removes the table from the catalogue pg_subscription_rel
> (AlterSubscription_refresh->RemoveSubscriptionRel)
> so far so good
>
>
> Here, it should register the worker to stop on commit, and then on
> commit it should call AtEOXact_ApplyLauncher to stop the apply worker.
> Once the apply worker is stopped, the corresponding WALSender will
> also be stopped. Something here is not happening as per expected
> behavior.
>
> 3) after the insertion into the table in the publisher(remember that
> it's dropped from the publication in (1)), the walsender process is
> unable to detect that the table has been dropped from the publication
> i.e. it doesn't look at the pg_publication_rel catalogue or some
> other, but it only does is_publishable_relation() check which returns
> true in pgoutput_change(). Maybe the walsender should look at the
> catalogue pg_publication_rel in is_publishable_relation()?
>
>
> We must be somewhere checking pg_publication_rel before sending the
> decoded change because otherwise, we would have sent the changes for
> the table which are not even part of this publication. I think you can
> try to create a separate table that is not part of the publication
> under test and see how the changes for that are filtered.
>
>
> I find that pgoutput_change() uses a hash table, RelationSyncCache, to
> cache the publication info for tables.  When we drop tables from the
> publication, the RelationSyncCache isn't updated, so it still replicates
> records.
>
>
> IIUC logical replication only replicates the tables in the publication, so I think
> tables that aren't in the publication should not be replicated.
>
> Attached is a patch that fixes it.  Thoughts?
>

Instead of doing this, I would expect the RelationSyncCache entry to be
removed when the relation is dropped from the publication.  Once that
is done, the next change will reload the publication information, find
that the relation is no longer published, and ignore the changes.  But
the patch doesn't seem to be exactly along those lines.  Am I missing
something here?
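
To illustrate what I mean, here is a rough toy model in plain C. It is a
sketch under assumptions, not pgoutput code: the Toy* types and toy_*
functions are hypothetical, the catalogue is a single boolean, and the
reload is assumed to only ever switch the insert flag on, which is how the
earlier mails in this thread describe the missed reset of pubactions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one RelationSyncCache entry (hypothetical layout). */
typedef struct
{
    bool valid;     /* false means: reload from the catalogue on next use */
    bool pubinsert; /* cached "publish inserts for this table" decision */
} ToySyncEntry;

/* Toy model of pg_publication_rel: is the table in the publication? */
typedef struct
{
    bool table_in_publication;
} ToyPubCatalog;

/* Stand-in for the lookup done before sending a decoded insert. */
bool
toy_should_send_insert(ToySyncEntry *entry, const ToyPubCatalog *cat)
{
    if (!entry->valid)
    {
        /* Assumed reload: only turns the flag on, so a stale true survives. */
        if (cat->table_in_publication)
            entry->pubinsert = true;
        entry->valid = true;
    }
    return entry->pubinsert;
}

/* Stand-in for ALTER PUBLICATION ... DROP TABLE and its invalidation. */
void
toy_drop_table(ToySyncEntry *entry, ToyPubCatalog *cat, bool remove_entry)
{
    cat->table_in_publication = false;
    entry->valid = false; /* invalidation callback marks the entry stale */
    if (remove_entry)
        *entry = (ToySyncEntry) { false, false }; /* fresh entry starts zeroed */
}

/* Replays: insert, drop the table from the publication, insert again. */
bool
toy_insert_sent_after_drop(bool remove_entry)
{
    ToyPubCatalog cat = { true };
    ToySyncEntry  entry = { false, false };

    (void) toy_should_send_insert(&entry, &cat); /* first insert fills cache */
    toy_drop_table(&entry, &cat, remove_entry);
    return toy_should_send_insert(&entry, &cat);
}
```

With remove_entry set to false the second insert is still sent, because
the reload never clears the old flag; with remove_entry set to true the
entry starts from scratch and the insert is filtered out.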

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

