hot_standby_feedback vs excludeVacuum and snapshots

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
26 messages Options
12
Reply | Threaded
Open this post in threaded view
|

hot_standby_feedback vs excludeVacuum and snapshots

Greg Stark
I'm poking around to see debug a vacuuming problem and wondering if
I've found something more serious.

As far as I can tell the snapshots on HOT standby are built using a
list of running xids that the primary builds and puts in the WAL and
that seems to include all xids from transactions running in all
databases. The HOT standby would then build a snapshot and eventually
send the xmin of that snapshot back to the primary in the hot standby
feedback and that would block vacuuming tuples that might be visible
to the standby.

Many ages ago Alvaro sweated blood to ensure vacuums could run for
long periods of time without holding back the xmin horizon and
blocking other vacuums from cleaning up tuples. That's the purpose of
the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
because we know vacuums won't insert any tuples that queries might try
to view and also vacuums won't try to perform any sql queries on other
tables.

I can't find anywhere that the standby snapshot building mechanism
gets this same information about which xids are actually vacuums that
can be ignored when building a snapshot. So I'm concerned that the hot
standby sending back its xmin would be effectively undermining this
mechanism and forcing vacuum xids to be included in the xmin horizon
and prevent vacuuming of tuples.

Am I missing something obvious? Is this a known problem?

--
greg

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

akapila
On Thu, Mar 29, 2018 at 4:47 PM, Greg Stark <[hidden email]> wrote:

> I'm poking around to see debug a vacuuming problem and wondering if
> I've found something more serious.
>
> As far as I can tell the snapshots on HOT standby are built using a
> list of running xids that the primary builds and puts in the WAL and
> that seems to include all xids from transactions running in all
> databases. The HOT standby would then build a snapshot and eventually
> send the xmin of that snapshot back to the primary in the hot standby
> feedback and that would block vacuuming tuples that might be visible
> to the standby.
>
> Many ages ago Alvaro sweated blood to ensure vacuums could run for
> long periods of time without holding back the xmin horizon and
> blocking other vacuums from cleaning up tuples. That's the purpose of
> the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
> because we know vacuums won't insert any tuples that queries might try
> to view and also vacuums won't try to perform any sql queries on other
> tables.
>
> I can't find anywhere that the standby snapshot building mechanism
> gets this same information about which xids are actually vacuums that
> can be ignored when building a snapshot.
>

I think the vacuum assigns xids only if it needs to truncate some of
the pages in the relation which happens towards the end of vacuum.
So, it shouldn't hold back the xmin horizon for long.


--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
On 31 March 2018 at 14:21, Amit Kapila <[hidden email]> wrote:

> On Thu, Mar 29, 2018 at 4:47 PM, Greg Stark <[hidden email]> wrote:
>> I'm poking around to see debug a vacuuming problem and wondering if
>> I've found something more serious.
>>
>> As far as I can tell the snapshots on HOT standby are built using a
>> list of running xids that the primary builds and puts in the WAL and
>> that seems to include all xids from transactions running in all
>> databases. The HOT standby would then build a snapshot and eventually
>> send the xmin of that snapshot back to the primary in the hot standby
>> feedback and that would block vacuuming tuples that might be visible
>> to the standby.
>>
>> Many ages ago Alvaro sweated blood to ensure vacuums could run for
>> long periods of time without holding back the xmin horizon and
>> blocking other vacuums from cleaning up tuples. That's the purpose of
>> the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
>> because we know vacuums won't insert any tuples that queries might try
>> to view and also vacuums won't try to perform any sql queries on other
>> tables.
>>
>> I can't find anywhere that the standby snapshot building mechanism
>> gets this same information about which xids are actually vacuums that
>> can be ignored when building a snapshot.
>>
>
> I think the vacuum assigns xids only if it needs to truncate some of
> the pages in the relation which happens towards the end of vacuum.
> So, it shouldn't hold back the xmin horizon for long.
Yes, that's the reason. I recall VACUUMs giving lots of problems
during development  of Hot Standby.

VACUUM FULL was the thing that needed to be excluded in the past
because it needed an xid to move rows.

Greg's concern is a good one and his noticing that we hadn't
specifically excluded VACUUMs is valid, so we should exclude them.
Well spotted, Greg.

So although this doesn't have the dramatic effect it might have had,
there is still the possibility of some effect and I think we should
treat it as a bug.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

ignore_lazy_vacuums_in_RunningTransactionData.v1.patch (710 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

akapila
On Sun, Apr 1, 2018 at 3:30 PM, Simon Riggs <[hidden email]> wrote:

> On 31 March 2018 at 14:21, Amit Kapila <[hidden email]> wrote:
>>
>> I think the vacuum assigns xids only if it needs to truncate some of
>> the pages in the relation which happens towards the end of vacuum.
>> So, it shouldn't hold back the xmin horizon for long.
>
> Yes, that's the reason. I recall VACUUMs giving lots of problems
> during development  of Hot Standby.
>
> VACUUM FULL was the thing that needed to be excluded in the past
> because it needed an xid to move rows.
>
> Greg's concern is a good one and his noticing that we hadn't
> specifically excluded VACUUMs is valid, so we should exclude them.
> Well spotted, Greg.
>
> So although this doesn't have the dramatic effect it might have had,
> there is still the possibility of some effect and I think we should
> treat it as a bug.
>

+1.  I think you missed to update the comments on top of the modified
function ("Similar to GetSnapshotData but returns more information. We
include all PGXACTs with an assigned TransactionId, even VACUUM
processes." ).  It seems last part of the sentence should be omitted
after your patch, otherwise, patch looks good to me.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Andres Freund
In reply to this post by Greg Stark
Hi,

On 2018-03-29 12:17:24 +0100, Greg Stark wrote:

> I'm poking around to see debug a vacuuming problem and wondering if
> I've found something more serious.
>
> As far as I can tell the snapshots on HOT standby are built using a
> list of running xids that the primary builds and puts in the WAL and
> that seems to include all xids from transactions running in all
> databases. The HOT standby would then build a snapshot and eventually
> send the xmin of that snapshot back to the primary in the hot standby
> feedback and that would block vacuuming tuples that might be visible
> to the standby.

> Many ages ago Alvaro sweated blood to ensure vacuums could run for
> long periods of time without holding back the xmin horizon and
> blocking other vacuums from cleaning up tuples. That's the purpose of
> the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
> because we know vacuums won't insert any tuples that queries might try
> to view and also vacuums won't try to perform any sql queries on other
> tables.

> I can't find anywhere that the standby snapshot building mechanism
> gets this same information about which xids are actually vacuums that
> can be ignored when building a snapshot. So I'm concerned that the hot
> standby sending back its xmin would be effectively undermining this
> mechanism and forcing vacuum xids to be included in the xmin horizon
> and prevent vacuuming of tuples.

> Am I missing something obvious? Is this a known problem?

Maybe I'm missing something, but the running transaction data reported
to the standby does *NOT* include anything about lazy vacuums - they
don't have an xid. The reason there's PROC_IN_VACUUM etc isn't the xid,
it's *xmin*, no?

We currently do acquire an xid when truncating the relation - but I
think it'd somewhat fair to argue that that's somewhat of a bug. The
reason a log is acquired is that we need to log AEL locks, and that
currently means they have to be assigned to a transaction.

Given that the truncation happens at the end of VACUUM and it *NEEDS* to
be present on the standby - otherwise the locking stuff is useless - I
don't think the fix commited in this thread is correct.

Wonder if the right thing here wouldn't be to instead transiently
acquire an AEL lock during replay when truncating a relation?

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Andres Freund
On 2018-06-07 14:19:18 -0700, Andres Freund wrote:

> Hi,
>
> On 2018-03-29 12:17:24 +0100, Greg Stark wrote:
> > I'm poking around to see debug a vacuuming problem and wondering if
> > I've found something more serious.
> >
> > As far as I can tell the snapshots on HOT standby are built using a
> > list of running xids that the primary builds and puts in the WAL and
> > that seems to include all xids from transactions running in all
> > databases. The HOT standby would then build a snapshot and eventually
> > send the xmin of that snapshot back to the primary in the hot standby
> > feedback and that would block vacuuming tuples that might be visible
> > to the standby.
>
> > Many ages ago Alvaro sweated blood to ensure vacuums could run for
> > long periods of time without holding back the xmin horizon and
> > blocking other vacuums from cleaning up tuples. That's the purpose of
> > the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
> > because we know vacuums won't insert any tuples that queries might try
> > to view and also vacuums won't try to perform any sql queries on other
> > tables.
>
> > I can't find anywhere that the standby snapshot building mechanism
> > gets this same information about which xids are actually vacuums that
> > can be ignored when building a snapshot. So I'm concerned that the hot
> > standby sending back its xmin would be effectively undermining this
> > mechanism and forcing vacuum xids to be included in the xmin horizon
> > and prevent vacuuming of tuples.
>
> > Am I missing something obvious? Is this a known problem?
>
> Maybe I'm missing something, but the running transaction data reported
> to the standby does *NOT* include anything about lazy vacuums - they
> don't have an xid. The reason there's PROC_IN_VACUUM etc isn't the xid,
> it's *xmin*, no?
>
> We currently do acquire an xid when truncating the relation - but I
> think it'd somewhat fair to argue that that's somewhat of a bug. The
> reason a log is acquired is that we need to log AEL locks, and that
> currently means they have to be assigned to a transaction.
>
> Given that the truncation happens at the end of VACUUM and it *NEEDS* to
> be present on the standby - otherwise the locking stuff is useless - I
> don't think the fix commited in this thread is correct.
>
> Wonder if the right thing here wouldn't be to instead transiently
> acquire an AEL lock during replay when truncating a relation?

Isn't the fact that vacuum truncation requires an AEL, and that the
change committed today excludes those transactions from running xacts
records, flat out broken?

Look at:

void
ProcArrayApplyRecoveryInfo(RunningTransactions running)
...
        /*
         * Remove stale locks, if any.
         *
         * Locks are always assigned to the toplevel xid so we don't need to care
         * about subxcnt/subxids (and by extension not about ->suboverflowed).
         */
        StandbyReleaseOldLocks(running->xcnt, running->xids);

by excluding running transactions you have, as far as I can tell,
effectively removed the vacuum truncation AEL from the standby. Now that
only happens when a running xact record is logged, but as that happens
in the background...

I also don't understand why this change would be backpatched in the
first place. It's a relatively minor efficiency thing, no?


Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
In reply to this post by Andres Freund
On 7 June 2018 at 22:19, Andres Freund <[hidden email]> wrote:

> Wonder if the right thing here wouldn't be to instead transiently
> acquire an AEL lock during replay when truncating a relation?

The way AELs are replayed in generic, all AEL requests are handled that way.

So yes, you could invent a special case for this specific situation.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

akapila
In reply to this post by Andres Freund
On Fri, Jun 8, 2018 at 2:55 AM, Andres Freund <[hidden email]> wrote:

>
> On 2018-06-07 14:19:18 -0700, Andres Freund wrote:
> > Hi,
> >
> > On 2018-03-29 12:17:24 +0100, Greg Stark wrote:
> > > I'm poking around to see debug a vacuuming problem and wondering if
> > > I've found something more serious.
> > >
> > > As far as I can tell the snapshots on HOT standby are built using a
> > > list of running xids that the primary builds and puts in the WAL and
> > > that seems to include all xids from transactions running in all
> > > databases. The HOT standby would then build a snapshot and eventually
> > > send the xmin of that snapshot back to the primary in the hot standby
> > > feedback and that would block vacuuming tuples that might be visible
> > > to the standby.
> >
> > > Many ages ago Alvaro sweated blood to ensure vacuums could run for
> > > long periods of time without holding back the xmin horizon and
> > > blocking other vacuums from cleaning up tuples. That's the purpose of
> > > the excludeVacuum flag in GetCurrentVirtualXIDs(). That's possible
> > > because we know vacuums won't insert any tuples that queries might try
> > > to view and also vacuums won't try to perform any sql queries on other
> > > tables.
> >
> > > I can't find anywhere that the standby snapshot building mechanism
> > > gets this same information about which xids are actually vacuums that
> > > can be ignored when building a snapshot. So I'm concerned that the hot
> > > standby sending back its xmin would be effectively undermining this
> > > mechanism and forcing vacuum xids to be included in the xmin horizon
> > > and prevent vacuuming of tuples.
> >
> > > Am I missing something obvious? Is this a known problem?
> >
> > Maybe I'm missing something, but the running transaction data reported
> > to the standby does *NOT* include anything about lazy vacuums - they
> > don't have an xid. The reason there's PROC_IN_VACUUM etc isn't the xid,
> > it's *xmin*, no?
> >
> > We currently do acquire an xid when truncating the relation - but I
> > think it'd somewhat fair to argue that that's somewhat of a bug. The
> > reason a log is acquired is that we need to log AEL locks, and that
> > currently means they have to be assigned to a transaction.
> >
> > Given that the truncation happens at the end of VACUUM and it *NEEDS* to
> > be present on the standby - otherwise the locking stuff is useless - I
> > don't think the fix commited in this thread is correct.
> >
> > Wonder if the right thing here wouldn't be to instead transiently
> > acquire an AEL lock during replay when truncating a relation?
>

If we go that route, then don't we need to bother about when to
release the lock which is right now done at the commit/abort of the
transaction.  In master, for vacuum, we do release the lock
immediately after truncate, but I think that is not true generally.
So, if we release the lock at a time other than commit/abort, we might
end up releasing them at the different times in master and standby.

> Isn't the fact that vacuum truncation requires an AEL, and that the
> change committed today excludes those transactions from running xacts
> records, flat out broken?
>
> Look at:
>
> void
> ProcArrayApplyRecoveryInfo(RunningTransactions running)
> ...
>         /*
>          * Remove stale locks, if any.
>          *
>          * Locks are always assigned to the toplevel xid so we don't need to care
>          * about subxcnt/subxids (and by extension not about ->suboverflowed).
>          */
>         StandbyReleaseOldLocks(running->xcnt, running->xids);
>
> by excluding running transactions you have, as far as I can tell,
> effectively removed the vacuum truncation AEL from the standby. Now that
> only happens when a running xact record is logged, but as that happens
> in the background...
>

Yeah, that seems problematic.  I think we can avoid that if before
releasing the lock in StandbyReleaseOldLocks, we also have an
additional check to see whether the transaction is committed or
aborted as we do before adding it to KnownAssignedXids.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
In reply to this post by Andres Freund
On 7 June 2018 at 22:25, Andres Freund <[hidden email]> wrote:
> On 2018-06-07 14:19:18 -0700, Andres Freund wrote:

> Look at:
>
> void
> ProcArrayApplyRecoveryInfo(RunningTransactions running)
> ...
>         /*
>          * Remove stale locks, if any.
>          *
>          * Locks are always assigned to the toplevel xid so we don't need to care
>          * about subxcnt/subxids (and by extension not about ->suboverflowed).
>          */
>         StandbyReleaseOldLocks(running->xcnt, running->xids);
>
> by excluding running transactions you have, as far as I can tell,
> effectively removed the vacuum truncation AEL from the standby.
I agree, that is correct, there is a bug in my recent commit that
causes a small race window that could potentially lead to someone
reading the size of a relation just before it is truncated and then
falling off the end of the scan, resulting in a block access ERROR,
potentially much later than the truncate.

I have also found another bug which affects what we do next.

For context, AEL locks are normally removed by COMMIT or ABORT.
StandbyReleaseOldLocks() is just a sweeper to catch anything that
didn't send an abort before it died, so it hardly ever activates. The
coding of StandbyReleaseOldLocks() is backwards... if it ain't in the
running list, then we remove it.

But that wasn't working correctly either, since as of 49bff5300d527 we
assigned AELs to subxids. Subxids weren't in the running list and so
AELs held by them would have been removed at the wrong time, an extant
bug in PG10. It looks to me like they would have been removed either
early or late, up to the next runningxact info record. They would be
removed, so no leakage, but the late timing wouldn't be noticeable for
tests or most usage, since it would look just like lock contention.
Early release might give same issue of block access to non-existent
block/file.

So the attached patch fixes both the bug in the recent commit and the
one I just found by observation of 49bff5300d527, since they are
related.

StandbyReleaseOldLocks() can sweep in the same way as
ExpireOldKnownAssignedTransactionIds().

> I also don't understand why this change would be backpatched in the
> first place. It's a relatively minor efficiency thing, no?

As for everything, that is open to discussion. Yes, it seems minor to
me.... until it affects you, then its not. It seems to have affected
Greg.

The attached patch, or a later revision, needs to be backpatched to
PG10 independently of the recent committed patch.

I have yet to test this manually, but will do so tomorrow morning.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

remove_standby_subxid_locks.v1.patch (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

akapila
On Fri, Jun 8, 2018 at 1:53 PM, Simon Riggs <[hidden email]> wrote:
>
> So the attached patch fixes both the bug in the recent commit and the
> one I just found by observation of 49bff5300d527, since they are
> related.
>
> StandbyReleaseOldLocks() can sweep in the same way as
> ExpireOldKnownAssignedTransactionIds().
>

-StandbyReleaseOldLocks(int nxids, TransactionId *xids)
+StandbyReleaseOldLocks(TransactionId oldxid)
 {
  ListCell   *cell,
     *prev,
@@ -741,26 +741,8 @@ StandbyReleaseOldLocks(int nxids, TransactionId *xids)

  if (StandbyTransactionIdIsPrepared(lock->xid))
  remove = false;
- else
- {
- int i;
- bool found = false;
-
- for (i = 0; i < nxids; i++)
- {
- if (lock->xid == xids[i])
- {
- found = true;
- break;
- }
- }
-
- /*
- * If its not a running transaction, remove it.
- */
- if (!found)
- remove = true;
- }
+ else if (TransactionIdPrecedes(lock->xid, oldxid))
+ remove = true;


How will this avoid the bug introduced by your recent commit
(32ac7a11)?  After that commit, we no longer consider vacuum's xid in
RunningTransactions->oldestRunningXid calculation, so it is possible
that we still remove/release its lock on standby earlier than
required.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
On 8 June 2018 at 11:33, Amit Kapila <[hidden email]> wrote:

> On Fri, Jun 8, 2018 at 1:53 PM, Simon Riggs <[hidden email]> wrote:
>>
>> So the attached patch fixes both the bug in the recent commit and the
>> one I just found by observation of 49bff5300d527, since they are
>> related.
>>
>> StandbyReleaseOldLocks() can sweep in the same way as
>> ExpireOldKnownAssignedTransactionIds().
>>
>

> How will this avoid the bug introduced by your recent commit
> (32ac7a11)?  After that commit, we no longer consider vacuum's xid in
> RunningTransactions->oldestRunningXid calculation, so it is possible
> that we still remove/release its lock on standby earlier than
> required.

In the past, the presence of an xid was required to prevent removal by
StandbyReleaseOldLocks().

The new patch removes that requirement and removes when the xid is
older than oldestxid

The normal path of removal is at commit or abort,
StandbyReleaseOldLocks is used almost never (as originally intended).

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Andres Freund
In reply to this post by akapila
On 2018-06-08 11:29:17 +0530, Amit Kapila wrote:

> On Fri, Jun 8, 2018 at 2:55 AM, Andres Freund <[hidden email]> wrote:
> >
> > On 2018-06-07 14:19:18 -0700, Andres Freund wrote:
> > > We currently do acquire an xid when truncating the relation - but I
> > > think it'd somewhat fair to argue that that's somewhat of a bug. The
> > > reason a log is acquired is that we need to log AEL locks, and that
> > > currently means they have to be assigned to a transaction.
> > >
> > > Given that the truncation happens at the end of VACUUM and it *NEEDS* to
> > > be present on the standby - otherwise the locking stuff is useless - I
> > > don't think the fix commited in this thread is correct.
> > >
> > > Wonder if the right thing here wouldn't be to instead transiently
> > > acquire an AEL lock during replay when truncating a relation?
> >
>
> If we go that route, then don't we need to bother about when to
> release the lock which is right now done at the commit/abort of the
> transaction.  In master, for vacuum, we do release the lock
> immediately after truncate, but I think that is not true generally.
> So, if we release the lock at a time other than commit/abort, we might
> end up releasing them at the different times in master and standby.

I don't think that'd be an actual problem. For TRUNCATE we'd just
temporarily hold two locks. One for the DDL command itself, and one for
the trunction. Same is true for VACUUM, except that only one of them is
an AEL.


> > Isn't the fact that vacuum truncation requires an AEL, and that the
> > change committed today excludes those transactions from running xacts
> > records, flat out broken?
> >
> > Look at:
> >
> > void
> > ProcArrayApplyRecoveryInfo(RunningTransactions running)
> > ...
> >         /*
> >          * Remove stale locks, if any.
> >          *
> >          * Locks are always assigned to the toplevel xid so we don't need to care
> >          * about subxcnt/subxids (and by extension not about ->suboverflowed).
> >          */
> >         StandbyReleaseOldLocks(running->xcnt, running->xids);
> >
> > by excluding running transactions you have, as far as I can tell,
> > effectively removed the vacuum truncation AEL from the standby. Now that
> > only happens when a running xact record is logged, but as that happens
> > in the background...

> Yeah, that seems problematic.  I think we can avoid that if before
> releasing the lock in StandbyReleaseOldLocks, we also have an
> additional check to see whether the transaction is committed or
> aborted as we do before adding it to KnownAssignedXids.

I think the right fix is to simply revert the change here, rather than
do unplanned open heart surgery.  No convincing explanation has been
made why the change is necessary in the first place - the xid assignment
comes directly before vacuum finishes. That's much less than a lot of
other normal transactions will hold back concurrent vacuums / the
standby's reported horizon.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Andres Freund
In reply to this post by Simon Riggs
Hi,

On 2018-06-08 09:23:02 +0100, Simon Riggs wrote:

> I have also found another bug which affects what we do next.
>
> For context, AEL locks are normally removed by COMMIT or ABORT.
> StandbyReleaseOldLocks() is just a sweeper to catch anything that
> didn't send an abort before it died, so it hardly ever activates. The
> coding of StandbyReleaseOldLocks() is backwards... if it ain't in the
> running list, then we remove it.
>
> But that wasn't working correctly either, since as of 49bff5300d527 we
> assigned AELs to subxids. Subxids weren't in the running list and so
> AELs held by them would have been removed at the wrong time, an extant
> bug in PG10. It looks to me like they would have been removed either
> early or late, up to the next runningxact info record. They would be
> removed, so no leakage, but the late timing wouldn't be noticeable for
> tests or most usage, since it would look just like lock contention.
> Early release might give same issue of block access to non-existent
> block/file.

Yikes, that's kinda bad. It can also just cause plain crashes, no? The
on-disk / catalog state isn't necessarily consistent during DDL, which
is why we hold AE locks. At the very least it can cause corruption of
in-use relcache entries and such.  In practice the fact this probably
hits only a limited number of people because it requires DDL to happen
in subtransactions, which probably isn't *that* common.


> So the attached patch fixes both the bug in the recent commit and the
> one I just found by observation of 49bff5300d527, since they are
> related.

Can we please keep them separate?


> StandbyReleaseOldLocks() can sweep in the same way as
> ExpireOldKnownAssignedTransactionIds().
>
> > I also don't understand why this change would be backpatched in the
> > first place. It's a relatively minor efficiency thing, no?
>
> As for everything, that is open to discussion. Yes, it seems minor to
> me.... until it affects you, then its not.

How is it any worse than any other normal short-lived write transaction?
The truncation is done shortly before commit.


> It seems to have affected Greg.

As far as I can tell Greg was just theorizing?


Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
On 8 June 2018 at 19:03, Andres Freund <[hidden email]> wrote:

>> It seems to have affected Greg.
>
> As far as I can tell Greg was just theorizing?

I'll wait for Greg to say whether this was an actual case that needs
to be fixed or not. If not, happy to revert.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
In reply to this post by Andres Freund
On 8 June 2018 at 19:03, Andres Freund <[hidden email]> wrote:

> Hi,
>
> On 2018-06-08 09:23:02 +0100, Simon Riggs wrote:
>> I have also found another bug which affects what we do next.
>>
>> For context, AEL locks are normally removed by COMMIT or ABORT.
>> StandbyReleaseOldLocks() is just a sweeper to catch anything that
>> didn't send an abort before it died, so it hardly ever activates. The
>> coding of StandbyReleaseOldLocks() is backwards... if it ain't in the
>> running list, then we remove it.
>>
>> But that wasn't working correctly either, since as of 49bff5300d527 we
>> assigned AELs to subxids. Subxids weren't in the running list and so
>> AELs held by them would have been removed at the wrong time, an extant
>> bug in PG10. It looks to me like they would have been removed either
>> early or late, up to the next runningxact info record. They would be
>> removed, so no leakage, but the late timing wouldn't be noticeable for
>> tests or most usage, since it would look just like lock contention.
>> Early release might give same issue of block access to non-existent
>> block/file.
>
> Yikes, that's kinda bad. It can also just cause plain crashes, no? The
> on-disk / catalog state isn't necessarily consistent during DDL, which
> is why we hold AE locks. At the very least it can cause corruption of
> in-use relcache entries and such.  In practice the fact this probably
> hits only a limited number of people because it requires DDL to happen
> in subtransactions, which probably isn't *that* common.

Yep, kinda bad, hence fix.

>> So the attached patch fixes both the bug in the recent commit and the
>> one I just found by observation of 49bff5300d527, since they are
>> related.
>
> Can we please keep them separate?

The attached patch is separate. It fixes 49bff5300d527 and also the
committed code, but should work fine even if we revert. Will
doublecheck.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

akapila
In reply to this post by Simon Riggs
On Fri, Jun 8, 2018 at 5:27 PM, Simon Riggs <[hidden email]> wrote:

> On 8 June 2018 at 11:33, Amit Kapila <[hidden email]> wrote:
>> On Fri, Jun 8, 2018 at 1:53 PM, Simon Riggs <[hidden email]> wrote:
>>>
>>> So the attached patch fixes both the bug in the recent commit and the
>>> one I just found by observation of 49bff5300d527, since they are
>>> related.
>>>
>>> StandbyReleaseOldLocks() can sweep in the same way as
>>> ExpireOldKnownAssignedTransactionIds().
>>>
>>
>
>> How will this avoid the bug introduced by your recent commit
>> (32ac7a11)?  After that commit, we no longer consider vacuum's xid in
>> RunningTransactions->oldestRunningXid calculation, so it is possible
>> that we still remove/release its lock on standby earlier than
>> required.
>
> In the past, the presence of an xid was required to prevent removal by
> StandbyReleaseOldLocks().
>
> The new patch removes that requirement and removes when the xid is
> older than oldestxid
>

The case I have in mind is: "Say vacuum got xid 600 while truncating,
and then some other random transactions 601,602  starts modifying the
database.  At this point, I think the computed value of
RunningTransactions->oldestRunningXid will be 601.  Now, on standby
when StandbyReleaseOldLocks sees the lock from transaction 600, it
will remove it which doesn't appear correct to me.".

> The normal path of removal is at commit or abort,
> StandbyReleaseOldLocks is used almost never (as originally intended).
>

Okay, but that doesn't prevent us to ensure that whenever used, it
does the right thing.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
On 9 June 2018 at 15:41, Amit Kapila <[hidden email]> wrote:

> On Fri, Jun 8, 2018 at 5:27 PM, Simon Riggs <[hidden email]> wrote:
>> On 8 June 2018 at 11:33, Amit Kapila <[hidden email]> wrote:
>>> On Fri, Jun 8, 2018 at 1:53 PM, Simon Riggs <[hidden email]> wrote:
>>>>
>>>> So the attached patch fixes both the bug in the recent commit and the
>>>> one I just found by observation of 49bff5300d527, since they are
>>>> related.
>>>>
>>>> StandbyReleaseOldLocks() can sweep in the same way as
>>>> ExpireOldKnownAssignedTransactionIds().
>>>>
>>>
>>
>>> How will this avoid the bug introduced by your recent commit
>>> (32ac7a11)?  After that commit, we no longer consider vacuum's xid in
>>> RunningTransactions->oldestRunningXid calculation, so it is possible
>>> that we still remove/release its lock on standby earlier than
>>> required.
>>
>> In the past, the presence of an xid was required to prevent removal by
>> StandbyReleaseOldLocks().
>>
>> The new patch removes that requirement and removes when the xid is
>> older than oldestxid
>>
>
> The case I have in mind is: "Say vacuum got xid 600 while truncating,
> and then some other random transactions 601,602  starts modifying the
> database.  At this point, I think the computed value of
> RunningTransactions->oldestRunningXid will be 601.  Now, on standby
> when StandbyReleaseOldLocks sees the lock from transaction 600, it
> will remove it which doesn't appear correct to me.".
OK, thanks. Patch attached.

>> The normal path of removal is at commit or abort,
>> StandbyReleaseOldLocks is used almost never (as originally intended).
>>
> Okay, but that doesn't prevent us to ensure that whenever used, it
> does the right thing.

What do you mean? Has anyone argued in favour of doing the wrong thing?


I'm playing around with finding a test case to prove this area works,
rather than just manual testing. Suggestions welcome.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

execution_ordering_in_running_xact.v1.patch (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Andres Freund
Hi,

On 2018-06-11 10:15:39 +0100, Simon Riggs wrote:

> diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
> index 9db184f8fe..c280744fdd 100644
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c
> @@ -1995,10 +1995,6 @@ GetRunningTransactionData(void)
>   volatile PGXACT *pgxact = &allPgXact[pgprocno];
>   TransactionId xid;
>  
> - /* Ignore procs running LAZY VACUUM */
> - if (pgxact->vacuumFlags & PROC_IN_VACUUM)
> - continue;
> -
>   /* Fetch xid just once - see GetNewTransactionId */
>   xid = pgxact->xid;
>  
> @@ -2009,13 +2005,21 @@ GetRunningTransactionData(void)
>   if (!TransactionIdIsValid(xid))
>   continue;
>  
> - xids[count++] = xid;
> -
> + /*
> + * Be careful not to exclude any xids from calculating the values of
> + * oldestRunningXid and suboverflowed.
> + */
>   if (TransactionIdPrecedes(xid, oldestRunningXid))
>   oldestRunningXid = xid;
>  
>   if (pgxact->overflowed)
>   suboverflowed = true;
> +
> + /* Ignore procs running LAZY VACUUM */
> + if (pgxact->vacuumFlags & PROC_IN_VACUUM)
> + continue;
> +
> + xids[count++] = xid;

I don't think this is a good idea. We shouldn't continue down the path
of having running xacts not actually running xacts, but rather go back
to including everything. The case presented in the thread didn't
actually do what it claimed originally, and there's a fair amount of
potential for the excluded xids to cause problems down the line.

Especially not when the fixes should be backpatched.  I think the
earlier patch should be reverted, and then the AEL lock release problem
should be fixed separately.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Simon Riggs
On 11 June 2018 at 17:56, Andres Freund <[hidden email]> wrote:

> I don't think this is a good idea. We shouldn't continue down the path
> of having running xacts not actually running xacts, but rather go back
> to including everything. The case presented in the thread didn't
> actually do what it claimed originally, and there's a fair amount of
> potential for the excluded xids to cause problems down the line.
>
> Especially not when the fixes should be backpatched.  I think the
> earlier patch should be reverted, and then the AEL lock release problem
> should be fixed separately.

Since Greg has not reappeared to speak either way, I agree we should
revert, though I will add comments to document this. I will do this
today.

Looks like we would need a multi-node isolation tester to formally
test the AEL lock release, so I won't add tests for that.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: hot_standby_feedback vs excludeVacuum and snapshots

Noah Misch-2
In reply to this post by Andres Freund
On Fri, Jun 08, 2018 at 11:03:38AM -0700, Andres Freund wrote:

> On 2018-06-08 09:23:02 +0100, Simon Riggs wrote:
> > For context, AEL locks are normally removed by COMMIT or ABORT.
> > StandbyReleaseOldLocks() is just a sweeper to catch anything that
> > didn't send an abort before it died, so it hardly ever activates. The
> > coding of StandbyReleaseOldLocks() is backwards... if it ain't in the
> > running list, then we remove it.
> >
> > But that wasn't working correctly either, since as of 49bff5300d527 we
> > assigned AELs to subxids. Subxids weren't in the running list and so
> > AELs held by them would have been removed at the wrong time, an extant
> > bug in PG10. It looks to me like they would have been removed either
> > early or late, up to the next runningxact info record. They would be
> > removed, so no leakage, but the late timing wouldn't be noticeable for
> > tests or most usage, since it would look just like lock contention.
> > Early release might give same issue of block access to non-existent
> > block/file.
>
> Yikes, that's kinda bad. It can also just cause plain crashes, no? The
> on-disk / catalog state isn't necessarily consistent during DDL, which
> is why we hold AE locks. At the very least it can cause corruption of
> in-use relcache entries and such.

Yes.  If I'm reading the commit message right, the committed code also
releases locks too early:

> commit 15378c1
> Author:     Simon Riggs <[hidden email]>
> AuthorDate: Sat Jun 16 14:03:29 2018 +0100
> Commit:     Simon Riggs <[hidden email]>
> CommitDate: Sat Jun 16 14:03:29 2018 +0100
>
>     Remove AELs from subxids correctly on standby
>    
>     Issues relate only to subtransactions that hold AccessExclusiveLocks
>     when replayed on standby.
>    
>     Prior to PG10, aborting subtransactions that held an
>     AccessExclusiveLock failed to release the lock until top level commit or
>     abort. 49bff5300d527 fixed that.
>    
>     However, 49bff5300d527 also introduced a similar bug where subtransaction
>     commit would fail to release an AccessExclusiveLock, leaving the lock to
>     be removed sometimes early and sometimes late. This commit fixes
>     that bug also. Backpatch to PG10 needed.

Subtransaction commit is too early to release an arbitrary
AccessExclusiveLock.  The primary releases every AccessExclusiveLock at
top-level transaction commit, top-level transaction abort, or subtransaction
abort.  CommitSubTransaction() doesn't do that; it transfers locks to the
parent sub(xact).  Standby nodes can't safely remove an arbitrary lock earlier
than the primary would.

12