[PATCH] Note effect of max_replication_slots on subscriber side in documentation.

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

[PATCH] Note effect of max_replication_slots on subscriber side in documentation.

Paul Martinez-2
Hey, all,

The configuration parameter max_replication_slots is most notably used
to control how many replication slots can be created on a server, but it
also controls how many replication origins can be tracked on the
subscriber side.

This is noted in the Configuration Settings section in the Logical
Replication Chapter [1], but it is not mentioned in the documentation
the parameter itself [2].

The attached patch adds an extra paragraph explaining its effect on
subscribers.


Using max_replication_slots for sizing the number available of
replication origin states is a little odd, and is actually noted twice
in the source code [3] [4]:

> XXX: Should we use a separate variable to size this rather than
> max_replication_slots?

> XXX: max_replication_slots is arguably the wrong thing to use, as here
> we keep the replay state of *remote* transactions. But for now it
> seems sufficient to reuse it, rather than introduce a separate GUC.

This is a different usage of max_replication_slots than originally
intended, managing resource usage on the subscriber side, rather than
the provider side. This manifests itself in the awkwardness of the
documentation, where max_replication_slots is only listed in the Sending
Server section, and not mentioned in the Subscribers section.

Given this, I think introducing a new parameter would make sense
(max_replication_origins? slightly confusing because there's no limit on
the number of records in pg_replication_origins; tracking of replication
origins is displayed in pg_replication_origin_status).

I'd be happy to make a patch for a new GUC parameter, if people think
it's worth it to separate the functionality. Until then, however, the
addition to the documentation should help prevent confusion.


- Paul

[1]: https://www.postgresql.org/docs/13/logical-replication-config.html
[2]: https://www.postgresql.org/docs/13/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS
[3]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/origin.c;h=685eaa6134e7cad193b583ff28284d877a6d8055;hb=HEAD#l162
[4]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/origin.c;h=685eaa6134e7cad193b583ff28284d877a6d8055;hb=HEAD#l495

max_replication_slots_subscriber_doc_v00.diff (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

Paul Martinez-2
Hey, all,

I went ahead and made a patch for introducing a new GUC variable,
max_replication_origins, to replace the awkward re-use of
max_replication_slots.

I'm mostly indifferent whether a new GUC variable is necessary, or
simply just updating the existing documentation (the first patch I
sent) is sufficient, but one of them should definitely be done to
clear up the confusion.

- Paul

On Tue, Feb 16, 2021 at 1:03 PM Paul Martinez <[hidden email]> wrote:
Hey, all,

The configuration parameter max_replication_slots is most notably used
to control how many replication slots can be created on a server, but it
also controls how many replication origins can be tracked on the
subscriber side.

This is noted in the Configuration Settings section in the Logical
Replication Chapter [1], but it is not mentioned in the documentation
the parameter itself [2].

The attached patch adds an extra paragraph explaining its effect on
subscribers.


Using max_replication_slots for sizing the number available of
replication origin states is a little odd, and is actually noted twice
in the source code [3] [4]:

> XXX: Should we use a separate variable to size this rather than
> max_replication_slots?

> XXX: max_replication_slots is arguably the wrong thing to use, as here
> we keep the replay state of *remote* transactions. But for now it
> seems sufficient to reuse it, rather than introduce a separate GUC.

This is a different usage of max_replication_slots than originally
intended, managing resource usage on the subscriber side, rather than
the provider side. This manifests itself in the awkwardness of the
documentation, where max_replication_slots is only listed in the Sending
Server section, and not mentioned in the Subscribers section.

Given this, I think introducing a new parameter would make sense
(max_replication_origins? slightly confusing because there's no limit on
the number of records in pg_replication_origins; tracking of replication
origins is displayed in pg_replication_origin_status).

I'd be happy to make a patch for a new GUC parameter, if people think
it's worth it to separate the functionality. Until then, however, the
addition to the documentation should help prevent confusion.


- Paul

[1]: https://www.postgresql.org/docs/13/logical-replication-config.html
[2]: https://www.postgresql.org/docs/13/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS
[3]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/origin.c;h=685eaa6134e7cad193b583ff28284d877a6d8055;hb=HEAD#l162
[4]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/origin.c;h=685eaa6134e7cad193b583ff28284d877a6d8055;hb=HEAD#l495

max_replication_origins_v00.diff (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

akapila
On Thu, Feb 25, 2021 at 2:19 AM Paul Martinez <[hidden email]> wrote:

>
> Hey, all,
>
> I went ahead and made a patch for introducing a new GUC variable,
> max_replication_origins, to replace the awkward re-use of
> max_replication_slots.
>
> I'm mostly indifferent whether a new GUC variable is necessary, or
> simply just updating the existing documentation (the first patch I
> sent) is sufficient, but one of them should definitely be done to
> clear up the confusion.
>

+1. I also think one of them is required. I think users who are using
cascaded replication (means subscribers are also publishers), setting
this parameter might be a bit confusing and difficult. Anybody else
has an opinion on this matter?

For docs only patch, I have few suggestions:
1. On page [1], it is not very clear that we are suggesting to set
max_replication_slots for origins whereas your new doc patch has
clarified it, can we update the other page as well.
2.
Setting it a lower value than the current
+         number of tracked replication origins (reflected in
+         <link
linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>,
+         not <link
linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
+         will prevent the server from starting.
+        </para>

Why can't we just mention pg_replication_origin above?

[1] - https://www.postgresql.org/docs/13/logical-replication-config.html

--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

Paul Martinez-2
On Thu, Feb 25, 2021 at 5:31 AM Amit Kapila <[hidden email]> wrote:
>
> For docs only patch, I have few suggestions:
> 1. On page [1], it is not very clear that we are suggesting to set
> max_replication_slots for origins whereas your new doc patch has
> clarified it, can we update the other page as well.

Sorry, what other page are you referring to?


> 2.
> Setting it a lower value than the current
> +         number of tracked replication origins (reflected in
> +         <link
> linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>,
> +         not <link
> linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
> +         will prevent the server from starting.
> +        </para>
>
> Why can't we just mention pg_replication_origin above?
>

So this is slightly confusing:

pg_replication_origin just contains mappings from origin names to oids.
It is regular catalog table and has no limit on its size. Users can also
manually insert rows into this table.

https://www.postgresql.org/docs/13/catalog-pg-replication-origin.html

The view showing the in-memory information is actually
pg_replication_origin_status. The number of entries here is what is
actually constrained by the GUC parameter.

https://www.postgresql.org/docs/13/view-pg-replication-origin-status.html

I clarified pointing to pg_replication_origin_status because it could in
theory be out of sync with pg_replication_origin. I'm actually not sure
how entries there are managed. Perhaps if you were replicating from one
database and then stopped and started replicating from another database
you'd have two replication origins, but only one replication origin
status?


This also brings up a point regarding the naming of the added GUC.
max_replication_origins is cleanest, but has this confusion regarding
pg_replication_origin vs. pg_replication_origin_status.
max_replication_origin_statuses is weird (and long).
max_tracked_replication_origins is a possibility?

(One last bit of naming confusion; the internal code refers to them as
ReplicationStates, rather than ReplicationOrigins or
ReplicationOriginStatuses, or something like that.)


- Paul


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

akapila
On Fri, Feb 26, 2021 at 1:53 AM Paul Martinez <[hidden email]> wrote:

>
> On Thu, Feb 25, 2021 at 5:31 AM Amit Kapila <[hidden email]> wrote:
> >
> > For docs only patch, I have few suggestions:
> > 1. On page [1], it is not very clear that we are suggesting to set
> > max_replication_slots for origins whereas your new doc patch has
> > clarified it, can we update the other page as well.
>
> Sorry, what other page are you referring to?
>

https://www.postgresql.org/docs/devel/logical-replication-config.html

>
> > 2.
> > Setting it a lower value than the current
> > +         number of tracked replication origins (reflected in
> > +         <link
> > linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>,
> > +         not <link
> > linkend="catalog-pg-replication-origin">pg_replication_origin</link>)
> > +         will prevent the server from starting.
> > +        </para>
> >
> > Why can't we just mention pg_replication_origin above?
> >
>
> So this is slightly confusing:
>
> pg_replication_origin just contains mappings from origin names to oids.
> It is regular catalog table and has no limit on its size. Users can also
> manually insert rows into this table.
>
> https://www.postgresql.org/docs/13/catalog-pg-replication-origin.html
>
> The view showing the in-memory information is actually
> pg_replication_origin_status. The number of entries here is what is
> actually constrained by the GUC parameter.
>

Okay, that makes sense. However, I have sent a patch today (see [1])
where I have slightly updated the subscriber-side configuration
paragraph. From PG-14 onwards, table synchronization workers also use
origins on subscribers, so you might want to adjust.

>
>
> This also brings up a point regarding the naming of the added GUC.
> max_replication_origins is cleanest, but has this confusion regarding
> pg_replication_origin vs. pg_replication_origin_status.
> max_replication_origin_statuses is weird (and long).
> max_tracked_replication_origins is a possibility?
>

or maybe max_replication_origin_states. I guess we can leave adding
GUC to some other day as that might require a bit broader acceptance
and we are already near to the start of last CF. I think we can still
consider it if we few more people share the same opinion as yours.

[1] - https://www.postgresql.org/message-id/CAA4eK1KkbppndxxRKbaT2sXrLkdPwy44F4pjEZ0EDrVjD9MPjQ%40mail.gmail.com

--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

Paul Martinez-2
On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote:
>
> https://www.postgresql.org/docs/devel/logical-replication-config.html
>

Ah, yep. I added a clause to the end of the sentence to clarify why we're
using max_replication_slots here:

- The subscriber also requires the max_replication_slots to be set.

+ The subscriber also requires that max_replication_slots be set to
+ configure how many replication origins can be tracked.

>
> Okay, that makes sense. However, I have sent a patch today (see [1])
> where I have slightly updated the subscriber-side configuration
> paragraph. From PG-14 onwards, table synchronization workers also use
> origins on subscribers, so you might want to adjust.
>
> ...
>
> I guess we can leave adding GUC to some other day as that might
> require a bit broader acceptance and we are already near to the start
> of last CF. I think we can still consider it if we few more people
> share the same opinion as yours.
>
Great. I'll wait to update the GUC patch until your patch and/or my
doc-only patch get merged. Should I add it to the March CF?

Separate question: are documentation updates like these ever backported
to older versions that are still supported? And if so, would the changes
be reflected immediately, or would they require a minor point release?
When I was on an older release I found that I'd jump back and forth
between the version I was using and the latest version to see if
anything had changed.


- Paul

max_replication_slots_subscriber_doc_v01.diff (2K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

akapila
On Sat, Feb 27, 2021 at 2:47 AM Paul Martinez <[hidden email]> wrote:

>
> On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote:
> >
> > https://www.postgresql.org/docs/devel/logical-replication-config.html
> >
>
> Ah, yep. I added a clause to the end of the sentence to clarify why we're
> using max_replication_slots here:
>
> - The subscriber also requires the max_replication_slots to be set.
>
> + The subscriber also requires that max_replication_slots be set to
> + configure how many replication origins can be tracked.
>

LGTM.

> >
> > Okay, that makes sense. However, I have sent a patch today (see [1])
> > where I have slightly updated the subscriber-side configuration
> > paragraph. From PG-14 onwards, table synchronization workers also use
> > origins on subscribers, so you might want to adjust.
> >
> > ...
> >
> > I guess we can leave adding GUC to some other day as that might
> > require a bit broader acceptance and we are already near to the start
> > of last CF. I think we can still consider it if we few more people
> > share the same opinion as yours.
> >
>
> Great. I'll wait to update the GUC patch until your patch and/or my
> doc-only patch get merged. Should I add it to the March CF?
>

Which patch are you asking about doc-patch or GUC one? If you are
asking for a doc-patch, then I don't think it is required, I'll take
care of this sometime next week. For the GUC patch, my suggestion
would be to propose for v15 with an appropriate use-case. At this
point (just before the last CF of release), people are mostly busy
with patches that are going on for a long time so this might not get
due attention unless few people show-up and say it is important.
However, it is up to you, if you want feel free to register your GUC
patch in the upcoming CF.

> Separate question: are documentation updates like these ever backported
> to older versions that are still supported?
>

Not every doc-change is back-ported but I think it is good to backport
the user-visible ones. It is on a case-by-case basis. For this, I
think we can backport unless you or others feel otherwise?

> And if so, would the changes
> be reflected immediately, or would they require a minor point release?
>

Where you are referring to the docs? If you are checking from code, it
will be reflected immediately.

--
With Regards,
Amit Kapila.


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

akapila
On Sat, Feb 27, 2021 at 2:35 PM Amit Kapila <[hidden email]> wrote:

>
> On Sat, Feb 27, 2021 at 2:47 AM Paul Martinez <[hidden email]> wrote:
> >
> > On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote:
> > >
> > > https://www.postgresql.org/docs/devel/logical-replication-config.html
> > >
> >
> > Ah, yep. I added a clause to the end of the sentence to clarify why we're
> > using max_replication_slots here:
> >
> > - The subscriber also requires the max_replication_slots to be set.
> >
> > + The subscriber also requires that max_replication_slots be set to
> > + configure how many replication origins can be tracked.
> >
>
> LGTM.
>
The rebased version attached. As mentioned earlier, I think we can
backpatch this patch as this clarifies the already existing behavior.
Do let me know if you or others think otherwise?

--
With Regards,
Amit Kapila.

max_replication_slots_subscriber_doc_v02.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] Note effect of max_replication_slots on subscriber side in documentation.

akapila
On Mon, Mar 1, 2021 at 5:32 PM Amit Kapila <[hidden email]> wrote:

>
> On Sat, Feb 27, 2021 at 2:35 PM Amit Kapila <[hidden email]> wrote:
> >
> > On Sat, Feb 27, 2021 at 2:47 AM Paul Martinez <[hidden email]> wrote:
> > >
> > > On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote:
> > > >
> > > > https://www.postgresql.org/docs/devel/logical-replication-config.html
> > > >
> > >
> > > Ah, yep. I added a clause to the end of the sentence to clarify why we're
> > > using max_replication_slots here:
> > >
> > > - The subscriber also requires the max_replication_slots to be set.
> > >
> > > + The subscriber also requires that max_replication_slots be set to
> > > + configure how many replication origins can be tracked.
> > >
> >
> > LGTM.
> >
>
> The rebased version attached.
>

Pushed!

--
With Regards,
Amit Kapila.