Add a GUC variable that control logical replication

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Add a GUC variable that control logical replication

Quan Zongliang-3


Sybase has a feature to turn off replication at the session level: set
replication = off, which can be temporarily turned off when there is a
maintenance action on the table. Our users also want this feature.
I add a new flag bit in xinfo, control it with a session-level variable,
when set to true, this flag is written when the transaction is
committed, and when the logic is decoded it abandons the transaction
like aborted transactions. Since PostgreSQL has two types of
replication, I call the variable "logical_replication" to avoid
confusion and default value is true.

Sample SQL

insert into a values(100);
set logical_replication to off;
insert into a values(200);
reset logical_replication;
insert into a values(300);

pg_recvlogical output(the second is not output.)
BEGIN 492
table public.a: INSERT: col1[integer]:100
COMMIT 492
BEGIN 494
table public.a: INSERT: col1[integer]:300
COMMIT 494

I'm not sure this is the most appropriate way. What do you think?

Regards,
Quan Zongliang

logi_replication.patch (3K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Peter Eisentraut-6
On 2019-09-18 10:39, Quan Zongliang wrote:
> Sybase has a feature to turn off replication at the session level: set
> replication = off, which can be temporarily turned off when there is a
> maintenance action on the table. Our users also want this feature.

These kinds of feature requests are always dubious because just because
Sybase behaves this way for some implementation or architectural reason
doesn't necessarily mean it makes sense for PostgreSQL too.

Why do you need to turn off replication when there is "maintenance" on a
table?  What does that even mean?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Quan Zongliang-3
On 2019/9/18 17:11, Peter Eisentraut wrote:
> On 2019-09-18 10:39, Quan Zongliang wrote:
>> Sybase has a feature to turn off replication at the session level: set
>> replication = off, which can be temporarily turned off when there is a
>> maintenance action on the table. Our users also want this feature.
>
> These kinds of feature requests are always dubious because just because
> Sybase behaves this way for some implementation or architectural reason
> doesn't necessarily mean it makes sense for PostgreSQL too.
>
Agree
> Why do you need to turn off replication when there is "maintenance" on a
> table?  What does that even mean?
>
In a table, the user only keep data for a period of time and delete
expired records every day, involving about 10 million to 20 million
records at a time. They want to not pass similar bulk operations in
logical replication.
My English is bad that I use the wrong word “maintenance” in my description.



Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Peter Eisentraut-6
On 2019-09-18 11:33, Quan Zongliang wrote:
> On 2019/9/18 17:11, Peter Eisentraut wrote:
>> Why do you need to turn off replication when there is "maintenance" on a
>> table?  What does that even mean?
>>
> In a table, the user only keep data for a period of time and delete
> expired records every day, involving about 10 million to 20 million
> records at a time. They want to not pass similar bulk operations in
> logical replication.

You can probably achieve that using ALTER PUBLICATION to disable
publication of deletes or truncates, as the case may be, either
permanently or just for the duration of the operations you want to skip.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Euler Taveira
Em sáb, 19 de out de 2019 às 14:11, Peter Eisentraut
<[hidden email]> escreveu:

>
> On 2019-09-18 11:33, Quan Zongliang wrote:
> > On 2019/9/18 17:11, Peter Eisentraut wrote:
> >> Why do you need to turn off replication when there is "maintenance" on a
> >> table?  What does that even mean?
> >>
> > In a table, the user only keep data for a period of time and delete
> > expired records every day, involving about 10 million to 20 million
> > records at a time. They want to not pass similar bulk operations in
> > logical replication.
>
> You can probably achieve that using ALTER PUBLICATION to disable
> publication of deletes or truncates, as the case may be, either
> permanently or just for the duration of the operations you want to skip.
>
... then you are skipping all tables in the publication. I think this
feature is not essential for unidirectional logical replication.
However, it is important for multi-master replication. Data
synchronization tool will generate transactions with rows that are
already in the other node(s) so those transactions can't be replicated
to avoid (expensive) conflicts.


--
   Euler Taveira                                   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Peter Eisentraut-6
On 2019-10-20 00:23, Euler Taveira wrote:
>> You can probably achieve that using ALTER PUBLICATION to disable
>> publication of deletes or truncates, as the case may be, either
>> permanently or just for the duration of the operations you want to skip.
>>
> ... then you are skipping all tables in the publication.

You can group tables into different publications and set the
subscription to subscribe to multiple publications if you need this kind
of granularity.

In any case, this kind of thing needs to be handled by the decoding
plugin based on its configuration policies and depending on its needs.
For example, let's say you have two decoding plugins running: one for a
replication system and one for writing an audit log.  It would not be
appropriate to disable logging for both of them because of some
performance optimization for one of them.  And it would also not be
appropriate to do this with a USERSET setting.

If we need different hooks or more DDL commands do this better, then
that can be considered.  But this seems to be the wrong way to do it.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Quan Zongliang-3
On 2019/11/1 20:49, Peter Eisentraut wrote:

> On 2019-10-20 00:23, Euler Taveira wrote:
>>> You can probably achieve that using ALTER PUBLICATION to disable
>>> publication of deletes or truncates, as the case may be, either
>>> permanently or just for the duration of the operations you want to skip.
>>>
>> ... then you are skipping all tables in the publication.
>
> You can group tables into different publications and set the
> subscription to subscribe to multiple publications if you need this kind
> of granularity.
>
> In any case, this kind of thing needs to be handled by the decoding
> plugin based on its configuration policies and depending on its needs.
> For example, let's say you have two decoding plugins running: one for a
> replication system and one for writing an audit log.  It would not be
> appropriate to disable logging for both of them because of some
> performance optimization for one of them.  And it would also not be
> appropriate to do this with a USERSET setting.
>
> If we need different hooks or more DDL commands do this better, then
> that can be considered.  But this seems to be the wrong way to do it.
>

What the user needs is the same replication link that selectively skips
some transactions. And this choice only affects transactions that are
doing bulk delete sessions. The operations of other sessions are not
affected and can continue to output replication messages.
For example, session 1 wants to bulk delete 1 million old data from the
T1 table, which can be done without replication. At the same time,
session 2 deletes 10 records from T1, which is expected to be passed on
through replication.
Therefore, the two decoders can not meet this requirement. It is also
inappropriate to temporarily disable subscriptions because it skips all
transactions for a certain period of time.

--
权宗亮
神州飞象(北京)数据科技有限公司
我们的力量源自最先进的开源数据库PostgreSQL
[hidden email]



Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Craig Ringer-3
In reply to this post by Quan Zongliang-3
On Wed, 18 Sep 2019 at 16:39, Quan Zongliang <[hidden email]> wrote:

Sybase has a feature to turn off replication at the session level: set
replication = off, which can be temporarily turned off when there is a
maintenance action on the table. Our users also want this feature.
I add a new flag bit in xinfo, control it with a session-level variable,
when set to true, this flag is written when the transaction is
committed, and when the logic is decoded it abandons the transaction
like aborted transactions. Since PostgreSQL has two types of
replication, I call the variable "logical_replication" to avoid
confusion and default value is true.

There's something related to this already. You can set the replication origin for the transaction to the special value DoNotReplicateId (replication origin id 65535). This will suppress replication of the transaction, at least for output plugins that're aware of replication origins.

This isn't presently exposed to SQL, it's there for the use of logical replication extensions. It's possible to expose it with a pretty trivial C function in an extension.

I think it's a bit of a hack TBH, it's something I perpetrated sometime in the 9.4 series when we needed a way to suppress replication of individual transactions. It originated out of core, so the original design was constrained in how it worked, and maybe it would've actually made more sense to use an xlinfo flag. Probably not worth changing now though.

Be extremely careful though. If you're hiding things from logical replication you can get all sorts of confusing and exciting results. I very strongly suggest you make anything like this superuser-only.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise
Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Michael Paquier-2
In reply to this post by Quan Zongliang-3
On Wed, Nov 06, 2019 at 10:01:43PM +0800, Quan Zongliang wrote:

> What the user needs is the same replication link that selectively skips some
> transactions. And this choice only affects transactions that are doing bulk
> delete sessions. The operations of other sessions are not affected and can
> continue to output replication messages.
> For example, session 1 wants to bulk delete 1 million old data from the T1
> table, which can be done without replication. At the same time, session 2
> deletes 10 records from T1, which is expected to be passed on through
> replication.
> Therefore, the two decoders can not meet this requirement. It is also
> inappropriate to temporarily disable subscriptions because it skips all
> transactions for a certain period of time.
Hmm.  The patch discussed on this thread does not have much support
from Peter and Craig, so I am marking it as RwF.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Add a GUC variable that control logical replication

Craig Ringer-3
On Thu, 28 Nov 2019 at 11:53, Michael Paquier <[hidden email]> wrote:
On Wed, Nov 06, 2019 at 10:01:43PM +0800, Quan Zongliang wrote:
> What the user needs is the same replication link that selectively skips some
> transactions. And this choice only affects transactions that are doing bulk
> delete sessions. The operations of other sessions are not affected and can
> continue to output replication messages.
> For example, session 1 wants to bulk delete 1 million old data from the T1
> table, which can be done without replication. At the same time, session 2
> deletes 10 records from T1, which is expected to be passed on through
> replication.
> Therefore, the two decoders can not meet this requirement. It is also
> inappropriate to temporarily disable subscriptions because it skips all
> transactions for a certain period of time.

Hmm.  The patch discussed on this thread does not have much support
from Peter and Craig, so I am marking it as RwF.


Yeah. I'm not against it as such. But I'd like to either see it work by exposing the ability to use DoNotReplicateId to SQL or if that's not satisfactory, potentially replace that mechanism with the newly added one and emulate DoNotReplicateId for BC.

I don't want two orthogonal ways to say "don't consider this for logical replication".
--
 Craig Ringer                   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise