Hey, all,
The configuration parameter max_replication_slots is most notably used to control how many replication slots can be created on a server, but it also controls how many replication origins can be tracked on the subscriber side. This is noted in the Configuration Settings section in the Logical Replication Chapter [1], but it is not mentioned in the documentation the parameter itself [2]. The attached patch adds an extra paragraph explaining its effect on subscribers. Using max_replication_slots for sizing the number available of replication origin states is a little odd, and is actually noted twice in the source code [3] [4]: > XXX: Should we use a separate variable to size this rather than > max_replication_slots? > XXX: max_replication_slots is arguably the wrong thing to use, as here > we keep the replay state of *remote* transactions. But for now it > seems sufficient to reuse it, rather than introduce a separate GUC. This is a different usage of max_replication_slots than originally intended, managing resource usage on the subscriber side, rather than the provider side. This manifests itself in the awkwardness of the documentation, where max_replication_slots is only listed in the Sending Server section, and not mentioned in the Subscribers section. Given this, I think introducing a new parameter would make sense (max_replication_origins? slightly confusing because there's no limit on the number of records in pg_replication_origins; tracking of replication origins is displayed in pg_replication_origin_status). I'd be happy to make a patch for a new GUC parameter, if people think it's worth it to separate the functionality. Until then, however, the addition to the documentation should help prevent confusion. - Paul [1]: https://www.postgresql.org/docs/13/logical-replication-config.html [2]: https://www.postgresql.org/docs/13/runtime-config-replication.html#GUC-MAX-REPLICATION-SLOTS [3]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/origin.c;h=685eaa6134e7cad193b583ff28284d877a6d8055;hb=HEAD#l162 [4]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/origin.c;h=685eaa6134e7cad193b583ff28284d877a6d8055;hb=HEAD#l495 |
Hey, all, I went ahead and made a patch for introducing a new GUC variable, max_replication_origins, to replace the awkward re-use of max_replication_slots. I'm mostly indifferent whether a new GUC variable is necessary, or simply just updating the existing documentation (the first patch I sent) is sufficient, but one of them should definitely be done to clear up the confusion. - Paul On Tue, Feb 16, 2021 at 1:03 PM Paul Martinez <[hidden email]> wrote: Hey, all, |
On Thu, Feb 25, 2021 at 2:19 AM Paul Martinez <[hidden email]> wrote:
> > Hey, all, > > I went ahead and made a patch for introducing a new GUC variable, > max_replication_origins, to replace the awkward re-use of > max_replication_slots. > > I'm mostly indifferent whether a new GUC variable is necessary, or > simply just updating the existing documentation (the first patch I > sent) is sufficient, but one of them should definitely be done to > clear up the confusion. > +1. I also think one of them is required. I think users who are using cascaded replication (means subscribers are also publishers), setting this parameter might be a bit confusing and difficult. Anybody else has an opinion on this matter? For docs only patch, I have few suggestions: 1. On page [1], it is not very clear that we are suggesting to set max_replication_slots for origins whereas your new doc patch has clarified it, can we update the other page as well. 2. Setting it a lower value than the current + number of tracked replication origins (reflected in + <link linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>, + not <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>) + will prevent the server from starting. + </para> Why can't we just mention pg_replication_origin above? [1] - https://www.postgresql.org/docs/13/logical-replication-config.html -- With Regards, Amit Kapila. |
On Thu, Feb 25, 2021 at 5:31 AM Amit Kapila <[hidden email]> wrote:
> > For docs only patch, I have few suggestions: > 1. On page [1], it is not very clear that we are suggesting to set > max_replication_slots for origins whereas your new doc patch has > clarified it, can we update the other page as well. Sorry, what other page are you referring to? > 2. > Setting it a lower value than the current > + number of tracked replication origins (reflected in > + <link > linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>, > + not <link > linkend="catalog-pg-replication-origin">pg_replication_origin</link>) > + will prevent the server from starting. > + </para> > > Why can't we just mention pg_replication_origin above? > So this is slightly confusing: pg_replication_origin just contains mappings from origin names to oids. It is regular catalog table and has no limit on its size. Users can also manually insert rows into this table. https://www.postgresql.org/docs/13/catalog-pg-replication-origin.html The view showing the in-memory information is actually pg_replication_origin_status. The number of entries here is what is actually constrained by the GUC parameter. https://www.postgresql.org/docs/13/view-pg-replication-origin-status.html I clarified pointing to pg_replication_origin_status because it could in theory be out of sync with pg_replication_origin. I'm actually not sure how entries there are managed. Perhaps if you were replicating from one database and then stopped and started replicating from another database you'd have two replication origins, but only one replication origin status? This also brings up a point regarding the naming of the added GUC. max_replication_origins is cleanest, but has this confusion regarding pg_replication_origin vs. pg_replication_origin_status. max_replication_origin_statuses is weird (and long). max_tracked_replication_origins is a possibility? (One last bit of naming confusion; the internal code refers to them as ReplicationStates, rather than ReplicationOrigins or ReplicationOriginStatuses, or something like that.) - Paul |
On Fri, Feb 26, 2021 at 1:53 AM Paul Martinez <[hidden email]> wrote:
> > On Thu, Feb 25, 2021 at 5:31 AM Amit Kapila <[hidden email]> wrote: > > > > For docs only patch, I have few suggestions: > > 1. On page [1], it is not very clear that we are suggesting to set > > max_replication_slots for origins whereas your new doc patch has > > clarified it, can we update the other page as well. > > Sorry, what other page are you referring to? > https://www.postgresql.org/docs/devel/logical-replication-config.html > > > 2. > > Setting it a lower value than the current > > + number of tracked replication origins (reflected in > > + <link > > linkend="view-pg-replication-origin-status">pg_replication_origin_status</link>, > > + not <link > > linkend="catalog-pg-replication-origin">pg_replication_origin</link>) > > + will prevent the server from starting. > > + </para> > > > > Why can't we just mention pg_replication_origin above? > > > > So this is slightly confusing: > > pg_replication_origin just contains mappings from origin names to oids. > It is regular catalog table and has no limit on its size. Users can also > manually insert rows into this table. > > https://www.postgresql.org/docs/13/catalog-pg-replication-origin.html > > The view showing the in-memory information is actually > pg_replication_origin_status. The number of entries here is what is > actually constrained by the GUC parameter. > Okay, that makes sense. However, I have sent a patch today (see [1]) where I have slightly updated the subscriber-side configuration paragraph. From PG-14 onwards, table synchronization workers also use origins on subscribers, so you might want to adjust. > > > This also brings up a point regarding the naming of the added GUC. > max_replication_origins is cleanest, but has this confusion regarding > pg_replication_origin vs. pg_replication_origin_status. > max_replication_origin_statuses is weird (and long). > max_tracked_replication_origins is a possibility? > or maybe max_replication_origin_states. I guess we can leave adding GUC to some other day as that might require a bit broader acceptance and we are already near to the start of last CF. I think we can still consider it if we few more people share the same opinion as yours. [1] - https://www.postgresql.org/message-id/CAA4eK1KkbppndxxRKbaT2sXrLkdPwy44F4pjEZ0EDrVjD9MPjQ%40mail.gmail.com -- With Regards, Amit Kapila. |
On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote:
> > https://www.postgresql.org/docs/devel/logical-replication-config.html > Ah, yep. I added a clause to the end of the sentence to clarify why we're using max_replication_slots here: - The subscriber also requires the max_replication_slots to be set. + The subscriber also requires that max_replication_slots be set to + configure how many replication origins can be tracked. > > Okay, that makes sense. However, I have sent a patch today (see [1]) > where I have slightly updated the subscriber-side configuration > paragraph. From PG-14 onwards, table synchronization workers also use > origins on subscribers, so you might want to adjust. > > ... > > I guess we can leave adding GUC to some other day as that might > require a bit broader acceptance and we are already near to the start > of last CF. I think we can still consider it if we few more people > share the same opinion as yours. > doc-only patch get merged. Should I add it to the March CF? Separate question: are documentation updates like these ever backported to older versions that are still supported? And if so, would the changes be reflected immediately, or would they require a minor point release? When I was on an older release I found that I'd jump back and forth between the version I was using and the latest version to see if anything had changed. - Paul |
On Sat, Feb 27, 2021 at 2:47 AM Paul Martinez <[hidden email]> wrote:
> > On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote: > > > > https://www.postgresql.org/docs/devel/logical-replication-config.html > > > > Ah, yep. I added a clause to the end of the sentence to clarify why we're > using max_replication_slots here: > > - The subscriber also requires the max_replication_slots to be set. > > + The subscriber also requires that max_replication_slots be set to > + configure how many replication origins can be tracked. > LGTM. > > > > Okay, that makes sense. However, I have sent a patch today (see [1]) > > where I have slightly updated the subscriber-side configuration > > paragraph. From PG-14 onwards, table synchronization workers also use > > origins on subscribers, so you might want to adjust. > > > > ... > > > > I guess we can leave adding GUC to some other day as that might > > require a bit broader acceptance and we are already near to the start > > of last CF. I think we can still consider it if we few more people > > share the same opinion as yours. > > > > Great. I'll wait to update the GUC patch until your patch and/or my > doc-only patch get merged. Should I add it to the March CF? > Which patch are you asking about doc-patch or GUC one? If you are asking for a doc-patch, then I don't think it is required, I'll take care of this sometime next week. For the GUC patch, my suggestion would be to propose for v15 with an appropriate use-case. At this point (just before the last CF of release), people are mostly busy with patches that are going on for a long time so this might not get due attention unless few people show-up and say it is important. However, it is up to you, if you want feel free to register your GUC patch in the upcoming CF. > Separate question: are documentation updates like these ever backported > to older versions that are still supported? > Not every doc-change is back-ported but I think it is good to backport the user-visible ones. It is on a case-by-case basis. For this, I think we can backport unless you or others feel otherwise? > And if so, would the changes > be reflected immediately, or would they require a minor point release? > Where you are referring to the docs? If you are checking from code, it will be reflected immediately. -- With Regards, Amit Kapila. |
On Sat, Feb 27, 2021 at 2:35 PM Amit Kapila <[hidden email]> wrote:
> > On Sat, Feb 27, 2021 at 2:47 AM Paul Martinez <[hidden email]> wrote: > > > > On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote: > > > > > > https://www.postgresql.org/docs/devel/logical-replication-config.html > > > > > > > Ah, yep. I added a clause to the end of the sentence to clarify why we're > > using max_replication_slots here: > > > > - The subscriber also requires the max_replication_slots to be set. > > > > + The subscriber also requires that max_replication_slots be set to > > + configure how many replication origins can be tracked. > > > > LGTM. > backpatch this patch as this clarifies the already existing behavior. Do let me know if you or others think otherwise? -- With Regards, Amit Kapila. |
On Mon, Mar 1, 2021 at 5:32 PM Amit Kapila <[hidden email]> wrote:
> > On Sat, Feb 27, 2021 at 2:35 PM Amit Kapila <[hidden email]> wrote: > > > > On Sat, Feb 27, 2021 at 2:47 AM Paul Martinez <[hidden email]> wrote: > > > > > > On Fri, Feb 26, 2021 at 5:22 AM Amit Kapila <[hidden email]> wrote: > > > > > > > > https://www.postgresql.org/docs/devel/logical-replication-config.html > > > > > > > > > > Ah, yep. I added a clause to the end of the sentence to clarify why we're > > > using max_replication_slots here: > > > > > > - The subscriber also requires the max_replication_slots to be set. > > > > > > + The subscriber also requires that max_replication_slots be set to > > > + configure how many replication origins can be tracked. > > > > > > > LGTM. > > > > The rebased version attached. > Pushed! -- With Regards, Amit Kapila. |
Free forum by Nabble | Edit this page |