min_safe_lsn column in pg_replication_slots view

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
52 messages Options
123
Reply | Threaded
Open this post in threaded view
|

min_safe_lsn column in pg_replication_slots view

Fujii Masao-4
Hi,

Per the docs, pg_replication_slots.min_safe_lsn inedicates "the minimum
LSN currently available for walsenders". When I executed pg_walfile_name()
with min_safe_lsn, the function returned the name of the last removed
WAL file instead of minimum available WAL file name. This happens because
min_safe_lsn actually indicates the ending position (the boundary byte)
of the last removed WAL file.

I guess that some users would want to calculate the minimum available
WAL file name from min_safe_lsn by using pg_walfile_name(), but the result
would be incorrect. Isn't this confusing? min_safe_lsn should indicate
the bondary byte + 1, instead?

BTW, I just wonder why each row in pg_replication_slots needs to have
min_safe_lsn column? Basically min_safe_lsn should be the same between
every replication slots.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Michael Paquier-2
On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> BTW, I just wonder why each row in pg_replication_slots needs to have
> min_safe_lsn column? Basically min_safe_lsn should be the same between
> every replication slots.

Indeed, that's confusing in its current shape.  I would buy putting
this value into pg_replication_slots if there were some consistency of
the data to worry about because of locking issues, but here this data
is controlled within info_lck, which is out of the the repslot LW
lock.  So I think that it is incorrect to put this data in this view
and that we should remove it, and that instead we had better push for
a system view that maps with the contents of XLogCtl.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier <[hidden email]> wrote in

> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> > BTW, I just wonder why each row in pg_replication_slots needs to have
> > min_safe_lsn column? Basically min_safe_lsn should be the same between
> > every replication slots.
>
> Indeed, that's confusing in its current shape.  I would buy putting
> this value into pg_replication_slots if there were some consistency of
> the data to worry about because of locking issues, but here this data
> is controlled within info_lck, which is out of the the repslot LW
> lock.  So I think that it is incorrect to put this data in this view
> and that we should remove it, and that instead we had better push for
> a system view that maps with the contents of XLogCtl.
It was once the difference from the safe_lsn to restart_lsn which is
distinct among slots. Then it was changed to the safe_lsn. I agree to
the discussion above, but it is needed anywhere since no one can know
the margin until the slot goes to the "lost" state without it.  (Note
that currently even wal_status and min_safe_lsn can be inconsistent in
a line.)

Just for the need for table-consistency and in-line consistency, we
could just remember the result of XLogGetLastRemovedSegno() around
taking info_lock in the function.  That doesn't make a practical
difference but makes the view look consistent.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index d6fe205eb4..f57d67a900 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9498,7 +9498,7 @@ CreateRestartPoint(int flags)
  * * WALAVAIL_INVALID_LSN means the slot hasn't been set to reserve WAL.
  */
 WALAvailability
-GetWALAvailability(XLogRecPtr targetLSN)
+GetWALAvailability(XLogRecPtr targetLSN, XLogSegNo last_removed_seg)
 {
  XLogRecPtr currpos; /* current write LSN */
  XLogSegNo currSeg; /* segid of currpos */
@@ -9524,7 +9524,7 @@ GetWALAvailability(XLogRecPtr targetLSN)
  * the first WAL segment file since startup, which causes the status being
  * wrong under certain abnormal conditions but that doesn't actually harm.
  */
- oldestSeg = XLogGetLastRemovedSegno() + 1;
+ oldestSeg = last_removed_seg + 1;
 
  /* calculate oldest segment by max_wal_size and wal_keep_segments */
  XLByteToSeg(currpos, currSeg, wal_segment_size);
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 1b929a603e..bd2e3e84ed 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -243,6 +243,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  MemoryContext per_query_ctx;
  MemoryContext oldcontext;
  int slotno;
+ XLogSegNo last_removed_seg;
 
  /* check to see if caller supports us returning a tuplestore */
  if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -272,6 +273,14 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  rsinfo->setResult = tupstore;
  rsinfo->setDesc = tupdesc;
 
+ /*
+ * Remember the last removed segment at this point for the consistency in
+ * this table. Since there's no interlock between slot data and
+ * checkpointer, the segment can be removed in-between, but that doesn't
+ * make any practical difference.
+ */
+ last_removed_seg = XLogGetLastRemovedSegno();
+
  MemoryContextSwitchTo(oldcontext);
 
  LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
@@ -282,7 +291,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  Datum values[PG_GET_REPLICATION_SLOTS_COLS];
  bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
  WALAvailability walstate;
- XLogSegNo last_removed_seg;
  int i;
 
  if (!slot->in_use)
@@ -342,7 +350,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  else
  nulls[i++] = true;
 
- walstate = GetWALAvailability(slot_contents.data.restart_lsn);
+ walstate = GetWALAvailability(slot_contents.data.restart_lsn,
+  last_removed_seg);
 
  switch (walstate)
  {
@@ -368,7 +377,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 
  if (max_slot_wal_keep_size_mb >= 0 &&
  (walstate == WALAVAIL_NORMAL || walstate == WALAVAIL_RESERVED) &&
- ((last_removed_seg = XLogGetLastRemovedSegno()) != 0))
+ (last_removed_seg != 0))
  {
  XLogRecPtr min_safe_lsn;
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e917dfe92d..c05a18148d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -326,7 +326,8 @@ extern void ShutdownXLOG(int code, Datum arg);
 extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
-extern WALAvailability GetWALAvailability(XLogRecPtr restart_lsn);
+extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN,
+  XLogSegNo last_removed_seg);
 extern XLogRecPtr CalculateMaxmumSafeLSN(void);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr XLogRestorePoint(const char *rpName);
Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Fujii Masao-4


On 2020/06/15 16:35, Kyotaro Horiguchi wrote:

> At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier <[hidden email]> wrote in
>> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
>>> BTW, I just wonder why each row in pg_replication_slots needs to have
>>> min_safe_lsn column? Basically min_safe_lsn should be the same between
>>> every replication slots.
>>
>> Indeed, that's confusing in its current shape.  I would buy putting
>> this value into pg_replication_slots if there were some consistency of
>> the data to worry about because of locking issues, but here this data
>> is controlled within info_lck, which is out of the the repslot LW
>> lock.  So I think that it is incorrect to put this data in this view
>> and that we should remove it, and that instead we had better push for
>> a system view that maps with the contents of XLogCtl.

Agreed. But as you know it's too late to do that for v13...
So firstly I'd like to fix the issues in pg_replication_slots view,
and then we can improve the things later for v14 if necessary.


> It was once the difference from the safe_lsn to restart_lsn which is
> distinct among slots. Then it was changed to the safe_lsn. I agree to
> the discussion above, but it is needed anywhere since no one can know
> the margin until the slot goes to the "lost" state without it.  (Note
> that currently even wal_status and min_safe_lsn can be inconsistent in
> a line.)
>
> Just for the need for table-consistency and in-line consistency, we
> could just remember the result of XLogGetLastRemovedSegno() around
> taking info_lock in the function.  That doesn't make a practical
> difference but makes the view look consistent.

Agreed. Thanks for the patch. Here are the review comments:


Not only the last removed segment but also current write position
should be obtain at the beginning of pg_get_replication_slots()
and should be given to GetWALAvailability(), for the consistency?


Even after applying your patch, min_safe_lsn is calculated for
each slot even though the calculated result is always the same.
Which is a bit waste of cycle. We should calculate min_safe_lsn
at the beginning and use it for each slot?


                        XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,

Isn't it better to use 1 as the second argument of the above,
in order to address the issue that I reported upthread?
Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn) returns
would be confusing.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
Thanks for the comments.

At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <[hidden email]> wrote in

> On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > At Mon, 15 Jun 2020 13:44:31 +0900, Michael Paquier
> > <[hidden email]> wrote in
> >> On Mon, Jun 15, 2020 at 12:40:03PM +0900, Fujii Masao wrote:
> >>> BTW, I just wonder why each row in pg_replication_slots needs to have
> >>> min_safe_lsn column? Basically min_safe_lsn should be the same between
> >>> every replication slots.
> >>
> >> Indeed, that's confusing in its current shape.  I would buy putting
> >> this value into pg_replication_slots if there were some consistency of
> >> the data to worry about because of locking issues, but here this data
> >> is controlled within info_lck, which is out of the the repslot LW
> >> lock.  So I think that it is incorrect to put this data in this view
> >> and that we should remove it, and that instead we had better push for
> >> a system view that maps with the contents of XLogCtl.
>
> Agreed. But as you know it's too late to do that for v13...
> So firstly I'd like to fix the issues in pg_replication_slots view,
> and then we can improve the things later for v14 if necessary.
>
>
> > It was once the difference from the safe_lsn to restart_lsn which is
> > distinct among slots. Then it was changed to the safe_lsn. I agree to
> > the discussion above, but it is needed anywhere since no one can know
> > the margin until the slot goes to the "lost" state without it.  (Note
> > that currently even wal_status and min_safe_lsn can be inconsistent in
> > a line.)
> > Just for the need for table-consistency and in-line consistency, we
> > could just remember the result of XLogGetLastRemovedSegno() around
> > taking info_lock in the function.  That doesn't make a practical
> > difference but makes the view look consistent.
>
> Agreed. Thanks for the patch. Here are the review comments:
>
>
> Not only the last removed segment but also current write position
> should be obtain at the beginning of pg_get_replication_slots()
> and should be given to GetWALAvailability(), for the consistency?
You are right.  Though I faintly thought that I didn't need that since
WriteRecPtr doesn't move by so wide steps as removed_segment, actually
it moves.

> Even after applying your patch, min_safe_lsn is calculated for
> each slot even though the calculated result is always the same.
> Which is a bit waste of cycle. We should calculate min_safe_lsn
> at the beginning and use it for each slot?

Agreed. That may results in a wastful calculation but it's better than
repeated wasteful calculations.

> XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
>
> Isn't it better to use 1 as the second argument of the above,
> in order to address the issue that I reported upthread?
> Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> returns
> would be confusing.

Mmm. pg_walfile_name seems too specialize to
pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
segment boundaries.)  I'm not willing to do that only to follow such
suspicious(?) specification, but surely it would practically be better
doing that. Please find the attached first patch.  I found that
there's no reason to hide min_safe_lsn when wal_status has certain
values. So I changed it to be shown always.

By the way, I noticed that when a replication slot reserves all
existing WAL segments, checkpoint cannot remove a file and
lastRemovedSegment is left being 0. The second attached forces
RemoveOldXlogFiles to initialize the variable even when no WAL
segments are removed.  It puts no additional loads on file system
since the directory is scanned anyway.  My old proposal to
unconditionally initialize it separately from checkpoint was rejected,
but I think this is acceptable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

From 6742e3b31ac53ef54458d64e34feeac7d4c710d2 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <[hidden email]>
Date: Thu, 18 Jun 2020 13:36:23 +0900
Subject: [PATCH v2 1/2] Make pg_stat_replication view consistent

min_safe_lsn is not consistent within the result of a query. Make it
consistent. On the way fixing that min_safe_lsn is changed to show the
second byte of a segment instead of first byte in order to users can
get the intended segment file name from pg_walfile_name.
---
 src/backend/access/transam/xlog.c   | 15 +++++++------
 src/backend/replication/slotfuncs.c | 35 ++++++++++++++++++++---------
 src/include/access/xlog.h           |  4 +++-
 3 files changed, 35 insertions(+), 19 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 55cac186dc..940f5fcb18 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9484,6 +9484,9 @@ CreateRestartPoint(int flags)
  * Report availability of WAL for the given target LSN
  * (typically a slot's restart_lsn)
  *
+ * currWriteLSN and lastRemovedSeg is the results of XLogGetLastRemovedSegno()
+ * and GetXLogWriteRecPtr() respectively at the referring time.
+ *
  * Returns one of the following enum values:
  * * WALAVAIL_NORMAL means targetLSN is available because it is in the range
  *   of max_wal_size.
@@ -9498,9 +9501,9 @@ CreateRestartPoint(int flags)
  * * WALAVAIL_INVALID_LSN means the slot hasn't been set to reserve WAL.
  */
 WALAvailability
-GetWALAvailability(XLogRecPtr targetLSN)
+GetWALAvailability(XLogRecPtr targetLSN, XLogRecPtr currWriteLSN,
+   XLogSegNo lastRemovedSeg)
 {
- XLogRecPtr currpos; /* current write LSN */
  XLogSegNo currSeg; /* segid of currpos */
  XLogSegNo targetSeg; /* segid of targetLSN */
  XLogSegNo oldestSeg; /* actual oldest segid */
@@ -9513,21 +9516,19 @@ GetWALAvailability(XLogRecPtr targetLSN)
  if (XLogRecPtrIsInvalid(targetLSN))
  return WALAVAIL_INVALID_LSN;
 
- currpos = GetXLogWriteRecPtr();
-
  /* calculate oldest segment currently needed by slots */
  XLByteToSeg(targetLSN, targetSeg, wal_segment_size);
- KeepLogSeg(currpos, &oldestSlotSeg);
+ KeepLogSeg(currWriteLSN, &oldestSlotSeg);
 
  /*
  * Find the oldest extant segment file. We get 1 until checkpoint removes
  * the first WAL segment file since startup, which causes the status being
  * wrong under certain abnormal conditions but that doesn't actually harm.
  */
- oldestSeg = XLogGetLastRemovedSegno() + 1;
+ oldestSeg = lastRemovedSeg + 1;
 
  /* calculate oldest segment by max_wal_size and wal_keep_segments */
- XLByteToSeg(currpos, currSeg, wal_segment_size);
+ XLByteToSeg(currWriteLSN, currSeg, wal_segment_size);
  keepSegs = ConvertToXSegs(Max(max_wal_size_mb, wal_keep_segments),
   wal_segment_size) + 1;
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 1b929a603e..063c6dd435 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -243,6 +243,9 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  MemoryContext per_query_ctx;
  MemoryContext oldcontext;
  int slotno;
+ XLogSegNo last_removed_seg;
+ XLogRecPtr curr_write_lsn;
+ XLogRecPtr min_safe_lsn = 0;
 
  /* check to see if caller supports us returning a tuplestore */
  if (rsinfo == NULL || !IsA(rsinfo, ReturnSetInfo))
@@ -272,6 +275,24 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  rsinfo->setResult = tupstore;
  rsinfo->setDesc = tupdesc;
 
+ /*
+ * Remember the last removed segment and current WAL write LSN at this
+ * point for the consistency in this table. Since there's no interlock
+ * between slot data and checkpointer, the segment can be removed
+ * in-between, but that doesn't make any practical difference.  Also we
+ * calculate safe_min_lsn here as the same value is shown for all slots.
+ */
+ last_removed_seg = XLogGetLastRemovedSegno();
+ curr_write_lsn = GetXLogWriteRecPtr();
+
+ /*
+ * Show the next byte after segment boundary as min_safe_lsn so that
+ * pg_walfile_name() returns the correct file name for the value.
+ */
+ if (max_slot_wal_keep_size_mb >= 0 && last_removed_seg != 0)
+ XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 1,
+ wal_segment_size, min_safe_lsn);
+
  MemoryContextSwitchTo(oldcontext);
 
  LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
@@ -282,7 +303,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  Datum values[PG_GET_REPLICATION_SLOTS_COLS];
  bool nulls[PG_GET_REPLICATION_SLOTS_COLS];
  WALAvailability walstate;
- XLogSegNo last_removed_seg;
  int i;
 
  if (!slot->in_use)
@@ -342,7 +362,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  else
  nulls[i++] = true;
 
- walstate = GetWALAvailability(slot_contents.data.restart_lsn);
+ walstate = GetWALAvailability(slot_contents.data.restart_lsn,
+  curr_write_lsn, last_removed_seg);
 
  switch (walstate)
  {
@@ -366,16 +387,8 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  elog(ERROR, "invalid walstate: %d", (int) walstate);
  }
 
- if (max_slot_wal_keep_size_mb >= 0 &&
- (walstate == WALAVAIL_NORMAL || walstate == WALAVAIL_RESERVED) &&
- ((last_removed_seg = XLogGetLastRemovedSegno()) != 0))
- {
- XLogRecPtr min_safe_lsn;
-
- XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
- wal_segment_size, min_safe_lsn);
+ if (min_safe_lsn != 0)
  values[i++] = Int64GetDatum(min_safe_lsn);
- }
  else
  nulls[i++] = true;
 
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e917dfe92d..bbe00e7195 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -326,7 +326,9 @@ extern void ShutdownXLOG(int code, Datum arg);
 extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
-extern WALAvailability GetWALAvailability(XLogRecPtr restart_lsn);
+extern WALAvailability GetWALAvailability(XLogRecPtr targetLSN,
+  XLogRecPtr currWriteLSN,
+  XLogSegNo last_removed_seg);
 extern XLogRecPtr CalculateMaxmumSafeLSN(void);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr XLogRestorePoint(const char *rpName);
--
2.18.4


From 10425b3c062ed3972c0ad067adb7c14c5dca43c8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <[hidden email]>
Date: Thu, 18 Jun 2020 14:49:48 +0900
Subject: [PATCH v2 2/2] Forcibly initialize lastRemovedSegNo at the first
 checkpoint

If we have a replication slot retaining all existing WAL segments, a
checkpoint doesn't initialize lastRemovedSegNo. That means
pg_replication_slots can not show min_safe_lsn until any slot loses a
segment. Forcibly initialize it even if no WAL segments are removed at
the first checkpoint.
---
 src/backend/access/transam/xlog.c | 33 +++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 940f5fcb18..282bac32ab 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4025,6 +4025,16 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr)
  DIR   *xldir;
  struct dirent *xlde;
  char lastoff[MAXFNAMELEN];
+ char oldest[MAXFNAMELEN];
+ bool remember_oldest = false;
+
+ /*
+ * If we haven't set the initial last removed segno, force to set it even
+ * if we wouldn't have removed any segments.
+ */
+ oldest[0] = 0;
+ if (XLogGetLastRemovedSegno() == 0)
+ remember_oldest = true;
 
  /*
  * Construct a filename of the last segment to be kept. The timeline ID
@@ -4062,10 +4072,33 @@ RemoveOldXlogFiles(XLogSegNo segno, XLogRecPtr lastredoptr, XLogRecPtr endptr)
  {
  /* Update the last removed location in shared memory first */
  UpdateLastRemovedPtr(xlde->d_name);
+ remember_oldest = false;
 
  RemoveXlogFile(xlde->d_name, lastredoptr, endptr);
  }
  }
+ /* Remember the oldest file left in the directotry */
+ else if(remember_oldest &&
+ (oldest[0] == 0 || strcmp(xlde->d_name + 8, oldest + 8) < 0))
+ strncpy(oldest, xlde->d_name, MAXFNAMELEN);
+ }
+
+ /*
+ * We haven't initialize the pointer, initialize it.
+ * The last removed pointer is the oldest existing segment minus 1.
+ */
+ if (remember_oldest && oldest[0] != 0)
+ {
+ uint32 tli;
+ XLogSegNo segno;
+
+ XLogFromFileName(oldest, &tli, &segno, wal_segment_size);
+
+ if (segno > 1)
+ {
+ XLogFileName(oldest, tli, segno - 1, wal_segment_size);
+ UpdateLastRemovedPtr(oldest);
+ }
  }
 
  FreeDir(xldir);
--
2.18.4

Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

akapila
On Thu, Jun 18, 2020 at 11:52 AM Kyotaro Horiguchi
<[hidden email]> wrote:

>
> At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <[hidden email]> wrote in
> > On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > Isn't it better to use 1 as the second argument of the above,
> > in order to address the issue that I reported upthread?
> > Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> > returns
> > would be confusing.
>
> Mmm. pg_walfile_name seems too specialize to
> pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> segment boundaries.)  I'm not willing to do that only to follow such
> suspicious(?) specification, but surely it would practically be better
> doing that. Please find the attached first patch.
>

It is a little unclear to me how this or any proposed patch will solve
the original problem reported by Fujii-San?  Basically, the problem
arises because we don't have an interlock between when the checkpoint
removes the WAL segment and the view tries to acquire the same.  Am, I
missing something?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <[hidden email]> wrote in

> On Thu, Jun 18, 2020 at 11:52 AM Kyotaro Horiguchi
> <[hidden email]> wrote:
> >
> > At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <[hidden email]> wrote in
> > > On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > > Isn't it better to use 1 as the second argument of the above,
> > > in order to address the issue that I reported upthread?
> > > Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> > > returns
> > > would be confusing.
> >
> > Mmm. pg_walfile_name seems too specialize to
> > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > segment boundaries.)  I'm not willing to do that only to follow such
> > suspicious(?) specification, but surely it would practically be better
> > doing that. Please find the attached first patch.
> >
>
> It is a little unclear to me how this or any proposed patch will solve
> the original problem reported by Fujii-San?  Basically, the problem
> arises because we don't have an interlock between when the checkpoint
> removes the WAL segment and the view tries to acquire the same.  Am, I
> missing something?

I'm not sure, but I don't get the point of blocking WAL segment
removal until the view is completed. The said columns of the view are
just for monitoring, which needs an information snapshot seemingly
taken at a certain time. And InvalidateObsoleteReplicationSlots kills
walsenders using lastRemovedSegNo of a different time.  The two are
independent each other.

Also the patch changes min_safe_lsn to show an LSN at segment boundary
+ 1.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Michael Paquier-2
On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <[hidden email]> wrote in
>> It is a little unclear to me how this or any proposed patch will solve
>> the original problem reported by Fujii-San?  Basically, the problem
>> arises because we don't have an interlock between when the checkpoint
>> removes the WAL segment and the view tries to acquire the same.  Am, I
>> missing something?

The proposed patch fetches the computation of the minimum LSN across
all slots before taking ReplicationSlotControlLock so its value gets
more lossy, and potentially older than what the slots actually
include.  So it is an attempt to take the safest spot possible.

Honestly, I find a bit silly the design to compute and use the same
minimum LSN value for all the tuples returned by
pg_get_replication_slots, and you can actually get a pretty good
estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
directly with what pg_replication_slot provides as we have a min()
aggregate for pg_lsn.

For these reasons, I think that we should remove for now this
information from the view, and reconsider this part more carefully for
14~ with a clear definition of how much lossiness we are ready to
accept for the information provided here, if necessary.  We could for
example just have a separate SQL function that just grabs this value
(or a more global SQL view for XLogCtl data that includes this data).

> I'm not sure, but I don't get the point of blocking WAL segment
> removal until the view is completed.

We should really not do that anyway for a monitoring view.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <[hidden email]> wrote in

> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
> > At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <[hidden email]> wrote in
> >> It is a little unclear to me how this or any proposed patch will solve
> >> the original problem reported by Fujii-San?  Basically, the problem
> >> arises because we don't have an interlock between when the checkpoint
> >> removes the WAL segment and the view tries to acquire the same.  Am, I
> >> missing something?
>
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include.  So it is an attempt to take the safest spot possible.

Minimum LSN (lastRemovedSegNo) is not protected by the lock. That
makes no defference.

> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good

I see it as silly.  I think I said upthread that it was the distance
to the point where the slot loses a segment, and it was rejected but
just removing it makes us unable to estimate the distance so it is
there.

> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slot provides as we have a min()
> aggregate for pg_lsn.

min(lastRemovedSegNo) is the earliest value. It is enough to read it
at the first then use it in all slots.

> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary.  We could for
> example just have a separate SQL function that just grabs this value
> (or a more global SQL view for XLogCtl data that includes this data).

I think, we need at least one of the "distance" above or min_safe_lsn
in anywhere reachable from users.

> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
>
> We should really not do that anyway for a monitoring view.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

akapila
In reply to this post by Kyotaro Horiguchi-4
On Fri, Jun 19, 2020 at 6:32 AM Kyotaro Horiguchi
<[hidden email]> wrote:

>
> At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <[hidden email]> wrote in
> > On Thu, Jun 18, 2020 at 11:52 AM Kyotaro Horiguchi
> > <[hidden email]> wrote:
> > >
> > > At Wed, 17 Jun 2020 21:37:55 +0900, Fujii Masao <[hidden email]> wrote in
> > > > On 2020/06/15 16:35, Kyotaro Horiguchi wrote:
> > > > Isn't it better to use 1 as the second argument of the above,
> > > > in order to address the issue that I reported upthread?
> > > > Otherwise, the WAL file name that pg_walfile_name(min_safe_lsn)
> > > > returns
> > > > would be confusing.
> > >
> > > Mmm. pg_walfile_name seems too specialize to
> > > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > > segment boundaries.)  I'm not willing to do that only to follow such
> > > suspicious(?) specification, but surely it would practically be better
> > > doing that. Please find the attached first patch.
> > >
> >
> > It is a little unclear to me how this or any proposed patch will solve
> > the original problem reported by Fujii-San?  Basically, the problem
> > arises because we don't have an interlock between when the checkpoint
> > removes the WAL segment and the view tries to acquire the same.  Am, I
> > missing something?
>
> I'm not sure, but I don't get the point of blocking WAL segment
> removal until the view is completed.
>

I am not suggesting to do that.

> The said columns of the view are
> just for monitoring, which needs an information snapshot seemingly
> taken at a certain time. And InvalidateObsoleteReplicationSlots kills
> walsenders using lastRemovedSegNo of a different time.  The two are
> independent each other.
>
> Also the patch changes min_safe_lsn to show an LSN at segment boundary
> + 1.
>

But aren't we doing last_removed_seg+1 even without the patch?  See code below

- {
- XLogRecPtr min_safe_lsn;
-
- XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
- wal_segment_size, min_safe_lsn);



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

akapila
In reply to this post by Kyotaro Horiguchi-4
On Fri, Jun 19, 2020 at 8:44 AM Kyotaro Horiguchi
<[hidden email]> wrote:

>
> At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <[hidden email]> wrote in
> > Honestly, I find a bit silly the design to compute and use the same
> > minimum LSN value for all the tuples returned by
> > pg_get_replication_slots, and you can actually get a pretty good
>
> I see it as silly.  I think I said upthread that it was the distance
> to the point where the slot loses a segment, and it was rejected but
> just removing it makes us unable to estimate the distance so it is
> there.
>

IIUC, the value of min_safe_lsn will lesser than restart_lsn, so one
can compute the difference of those to see how much ahead the
replication slot's restart_lsn is from min_safe_lsn but still it is
not clear how user will make any use of it.  Can you please explain
how the distance you are talking about is useful to users or anyone?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
At Fri, 19 Jun 2020 09:09:03 +0530, Amit Kapila <[hidden email]> wrote in

> On Fri, Jun 19, 2020 at 8:44 AM Kyotaro Horiguchi
> <[hidden email]> wrote:
> >
> > At Fri, 19 Jun 2020 10:39:58 +0900, Michael Paquier <[hidden email]> wrote in
> > > Honestly, I find a bit silly the design to compute and use the same
> > > minimum LSN value for all the tuples returned by
> > > pg_get_replication_slots, and you can actually get a pretty good
> >
> > I see it as silly.  I think I said upthread that it was the distance
> > to the point where the slot loses a segment, and it was rejected but
> > just removing it makes us unable to estimate the distance so it is
> > there.
> >
>
> IIUC, the value of min_safe_lsn will lesser than restart_lsn, so one
> can compute the difference of those to see how much ahead the
> replication slot's restart_lsn is from min_safe_lsn but still it is
> not clear how user will make any use of it.  Can you please explain
> how the distance you are talking about is useful to users or anyone?

When max_slot_wal_keep_size is set, the slot may retain up to as many
as that amount of old WAL segments then suddenly loses the oldest
segments.  *I* thought that I would use it in an HA cluster tool to
inform users about the remaining time (not literally, of course) a
disconnected standy is allowed diconnected.  Of course even if some
segments have been lost, they could be copied from the primary's
archive so that's not critical in theory.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
In reply to this post by akapila
At Fri, 19 Jun 2020 08:59:48 +0530, Amit Kapila <[hidden email]> wrote in

> > > > Mmm. pg_walfile_name seems too specialize to
> > > > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for
> > > > segment boundaries.)  I'm not willing to do that only to follow such
> > > > suspicious(?) specification, but surely it would practically be better
> > > > doing that. Please find the attached first patch.
> > > >
> > >
> > > It is a little unclear to me how this or any proposed patch will solve
> > > the original problem reported by Fujii-San?  Basically, the problem
> > > arises because we don't have an interlock between when the checkpoint
> > > removes the WAL segment and the view tries to acquire the same.  Am, I
> > > missing something?
> >
> > I'm not sure, but I don't get the point of blocking WAL segment
> > removal until the view is completed.
> >
>
> I am not suggesting to do that.
>
> > The said columns of the view are
> > just for monitoring, which needs an information snapshot seemingly
> > taken at a certain time. And InvalidateObsoleteReplicationSlots kills
> > walsenders using lastRemovedSegNo of a different time.  The two are
> > independent each other.
> >
> > Also the patch changes min_safe_lsn to show an LSN at segment boundary
> > + 1.
> >
>
> But aren't we doing last_removed_seg+1 even without the patch?  See code below
>
> - {
> - XLogRecPtr min_safe_lsn;
> -
> - XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0,
> - wal_segment_size, min_safe_lsn);

It is at the beginning byte of the *next* segment. Fujii-san told that
it should be the next byte of it, namely
"XLogSegNoOffsetToRecPtr(last_removed_seg + 1, *1*,", and the patch
calculates as that. It adds the follows instead.

+ if (max_slot_wal_keep_size_mb >= 0 && last_removed_seg != 0)
+ XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 1,
+ wal_segment_size, min_safe_lsn);

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Fujii Masao-4
In reply to this post by Michael Paquier-2


On 2020/06/19 10:39, Michael Paquier wrote:

> On Fri, Jun 19, 2020 at 10:02:54AM +0900, Kyotaro Horiguchi wrote:
>> At Thu, 18 Jun 2020 18:18:37 +0530, Amit Kapila <[hidden email]> wrote in
>>> It is a little unclear to me how this or any proposed patch will solve
>>> the original problem reported by Fujii-San?  Basically, the problem
>>> arises because we don't have an interlock between when the checkpoint
>>> removes the WAL segment and the view tries to acquire the same.  Am, I
>>> missing something?
>
> The proposed patch fetches the computation of the minimum LSN across
> all slots before taking ReplicationSlotControlLock so its value gets
> more lossy, and potentially older than what the slots actually
> include.  So it is an attempt to take the safest spot possible.
>
> Honestly, I find a bit silly the design to compute and use the same
> minimum LSN value for all the tuples returned by
> pg_get_replication_slots, and you can actually get a pretty good
> estimate of that by emulating ReplicationSlotsComputeRequiredLSN()
> directly with what pg_replication_slot provides as we have a min()
> aggregate for pg_lsn.
>
> For these reasons, I think that we should remove for now this
> information from the view, and reconsider this part more carefully for
> 14~ with a clear definition of how much lossiness we are ready to
> accept for the information provided here, if necessary.

Agreed. But isn't it too late to remove the columns (i.e., change
the catalog) for v13? Because v13 beta1 was already released.
IIUC the catalog should not be changed since beta1 release so that
users can upgrade PostgreSQL without initdb.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Michael Paquier-2
On Fri, Jun 19, 2020 at 04:13:27PM +0900, Fujii Masao wrote:
> Agreed. But isn't it too late to remove the columns (i.e., change
> the catalog) for v13? Because v13 beta1 was already released.
> IIUC the catalog should not be changed since beta1 release so that
> users can upgrade PostgreSQL without initdb.

Catalog bumps have happened in the past between beta versions:
git log -p REL_12_BETA1..REL_12_BETA2 src/include/catalog/catversion.h
git log -p REL_11_BETA1..REL_11_BETA2 src/include/catalog/catversion.h
git log -p REL_10_BETA1..REL_10_BETA2 src/include/catalog/catversion.h

So we usually avoid to do that between betas, but my take here is that
a catalog bump is better than regretting a change we may have to live
with after the release is sealed.
--
Michael

signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <[hidden email]> wrote in

> On Fri, Jun 19, 2020 at 04:13:27PM +0900, Fujii Masao wrote:
> > Agreed. But isn't it too late to remove the columns (i.e., change
> > the catalog) for v13? Because v13 beta1 was already released.
> > IIUC the catalog should not be changed since beta1 release so that
> > users can upgrade PostgreSQL without initdb.
>
> Catalog bumps have happened in the past between beta versions:
> git log -p REL_12_BETA1..REL_12_BETA2 src/include/catalog/catversion.h
> git log -p REL_11_BETA1..REL_11_BETA2 src/include/catalog/catversion.h
> git log -p REL_10_BETA1..REL_10_BETA2 src/include/catalog/catversion.h
>
> So we usually avoid to do that between betas, but my take here is that
> a catalog bump is better than regretting a change we may have to live
> with after the release is sealed.

FWIW if we decide that it is really useless, I agree to remove it now.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Fujii Masao-4


On 2020/06/19 16:43, Kyotaro Horiguchi wrote:

> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <[hidden email]> wrote in
>> On Fri, Jun 19, 2020 at 04:13:27PM +0900, Fujii Masao wrote:
>>> Agreed. But isn't it too late to remove the columns (i.e., change
>>> the catalog) for v13? Because v13 beta1 was already released.
>>> IIUC the catalog should not be changed since beta1 release so that
>>> users can upgrade PostgreSQL without initdb.
>>
>> Catalog bumps have happened in the past between beta versions:
>> git log -p REL_12_BETA1..REL_12_BETA2 src/include/catalog/catversion.h
>> git log -p REL_11_BETA1..REL_11_BETA2 src/include/catalog/catversion.h
>> git log -p REL_10_BETA1..REL_10_BETA2 src/include/catalog/catversion.h
>>
>> So we usually avoid to do that between betas, but my take here is that
>> a catalog bump is better than regretting a change we may have to live
>> with after the release is sealed.

Sounds reasonable.

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Michael Paquier-2
On Fri, Jun 19, 2020 at 05:34:01PM +0900, Fujii Masao wrote:
> On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
>> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <[hidden email]> wrote in
>>> So we usually avoid to do that between betas, but my take here is that
>>> a catalog bump is better than regretting a change we may have to live
>>> with after the release is sealed.
>
> Sounds reasonable.

If we want to make this happen, I am afraid that the time is short as
beta2 is planned for next week.  As the version will be likely tagged
by Monday US time, it would be good to get this addressed within 48
hours to give some room to the buildfarm to react.  Attached is a
straight-forward proposal of patch.  Any thoughts?

(The change in catversion.h is a self-reminder.)
--
Michael

repslot-min-safe-cleanup.patch (9K) Download Attachment
signature.asc (849 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Kyotaro Horiguchi-4
At Fri, 19 Jun 2020 21:15:52 +0900, Michael Paquier <[hidden email]> wrote in

> On Fri, Jun 19, 2020 at 05:34:01PM +0900, Fujii Masao wrote:
> > On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
> >> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <[hidden email]> wrote in
> >>> So we usually avoid to do that between betas, but my take here is that
> >>> a catalog bump is better than regretting a change we may have to live
> >>> with after the release is sealed.
> >
> > Sounds reasonable.
>
> If we want to make this happen, I am afraid that the time is short as
> beta2 is planned for next week.  As the version will be likely tagged
> by Monday US time, it would be good to get this addressed within 48
> hours to give some room to the buildfarm to react.  Attached is a
> straight-forward proposal of patch.  Any thoughts?
>
> (The change in catversion.h is a self-reminder.)

Thanks for the patch.

As a whole it contains all needed for ripping off the min_safe_lsn.
Some items in the TAP test gets coarse but none of them lose
significance. Compiles almost cleanly and passes all tests including
TAP test.

The variable last_removed_seg in slotfuncs.c:285 is left alone but no
longer used after applying this patch. It should be removed as well.

Other than that the patch looks good to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


Reply | Threaded
Open this post in threaded view
|

Re: min_safe_lsn column in pg_replication_slots view

Fujii Masao-4
In reply to this post by Michael Paquier-2


On 2020/06/19 21:15, Michael Paquier wrote:

> On Fri, Jun 19, 2020 at 05:34:01PM +0900, Fujii Masao wrote:
>> On 2020/06/19 16:43, Kyotaro Horiguchi wrote:
>>> At Fri, 19 Jun 2020 16:36:09 +0900, Michael Paquier <[hidden email]> wrote in
>>>> So we usually avoid to do that between betas, but my take here is that
>>>> a catalog bump is better than regretting a change we may have to live
>>>> with after the release is sealed.
>>
>> Sounds reasonable.
>
> If we want to make this happen, I am afraid that the time is short as
> beta2 is planned for next week.  As the version will be likely tagged
> by Monday US time, it would be good to get this addressed within 48
> hours to give some room to the buildfarm to react.  Attached is a
> straight-forward proposal of patch.  Any thoughts?

It's better if we can do that. But I think that we should hear Alvaro's opinion
about this before rushing to push the patch. Even if we miss beta2 as the result
of that, I'm ok. We would be able to push something better into beta3.
So, CC Alvaro.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


123