Remove HeapTuple and Buffer dependency for predicate locking functions

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Remove HeapTuple and Buffer dependency for predicate locking functions

Ashwin Agrawal

Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.

- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
  passing HeapTuple to it, just pass ItemPointer and tuple insert
  transaction id if known. This was also discussed earlier in [1],
  don't think was done in that context but would be helpful in future
  if such requirement comes up as well.

- CheckForSerializableConflictIn() take blocknum instead of
  buffer. Currently, the function anyways does nothing with the buffer
  just needs blocknum. Also, helps to decouple dependency on buffer as
  not all AMs may have one to one notion between blocknum and single
  buffer. Like for zedstore, tuple is stored across individual column
  buffers. So, wish to have way to lock not physical buffer but
  logical blocknum.

- CheckForSerializableConflictOut() no more takes HeapTuple nor
  buffer, instead just takes xid. Push heap specific parts from
  CheckForSerializableConflictOut() into its own function
  HeapCheckForSerializableConflictOut() which calls
  CheckForSerializableConflictOut(). The alternative option could be
  CheckForSerializableConflictOut() take callback function and
  callback arguments, which gets called if required after performing
  prechecks. Though currently I fell AM having its own wrapper to
  perform AM specific task and then calling
  CheckForSerializableConflictOut() is fine.

Attaching patch which makes these changes.

This way PredicateLockTID(), CheckForSerializableConflictIn() and
CheckForSerializableConflictOut() functions become usable by any AM.


1] https://www.postgresql.org/message-id/CAEepm%3D2QbqQ_%2BKQQCnhKukF6NEAeq4SqiO3Qxe%2BfHza5-H-jKA%40mail.gmail.com

v1-0001-Remove-HeapTuple-dependency-for-predicate-locking.patch (34K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Andres Freund
Hi,

On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:
> Proposing following changes to make predicate locking and checking
> functions generic and remove dependency on HeapTuple and Heap AM. We
> made these changes to help with Zedstore. I think the changes should
> help Zheap and other AMs in general.

Indeed.


> - Change PredicateLockTuple() to PredicateLockTID(). So, instead of
>   passing HeapTuple to it, just pass ItemPointer and tuple insert
>   transaction id if known. This was also discussed earlier in [1],
>   don't think was done in that context but would be helpful in future
>   if such requirement comes up as well.

Right.


> - CheckForSerializableConflictIn() take blocknum instead of
>   buffer. Currently, the function anyways does nothing with the buffer
>   just needs blocknum. Also, helps to decouple dependency on buffer as
>   not all AMs may have one to one notion between blocknum and single
>   buffer. Like for zedstore, tuple is stored across individual column
>   buffers. So, wish to have way to lock not physical buffer but
>   logical blocknum.

Hm.  I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation.  But even if, that's
probably a separate patch.


> - CheckForSerializableConflictOut() no more takes HeapTuple nor
>   buffer, instead just takes xid. Push heap specific parts from
>   CheckForSerializableConflictOut() into its own function
>   HeapCheckForSerializableConflictOut() which calls
>   CheckForSerializableConflictOut(). The alternative option could be
>   CheckForSerializableConflictOut() take callback function and
>   callback arguments, which gets called if required after performing
>   prechecks. Though currently I fell AM having its own wrapper to
>   perform AM specific task and then calling
>   CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.


> Attaching patch which makes these changes.

Please make sure that there's a CF entry for this (I'm in a plane with a
super slow connection, otherwise I'd check).

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Thomas Munro-5
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]> wrote:
> On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:
> > Proposing following changes to make predicate locking and checking
> > functions generic and remove dependency on HeapTuple and Heap AM. We
> > made these changes to help with Zedstore. I think the changes should
> > help Zheap and other AMs in general.
>
> Indeed.

+1

> > - Change PredicateLockTuple() to PredicateLockTID(). So, instead of
> >   passing HeapTuple to it, just pass ItemPointer and tuple insert
> >   transaction id if known. This was also discussed earlier in [1],
> >   don't think was done in that context but would be helpful in future
> >   if such requirement comes up as well.
>
> Right.

+1

> > - CheckForSerializableConflictIn() take blocknum instead of
> >   buffer. Currently, the function anyways does nothing with the buffer
> >   just needs blocknum. Also, helps to decouple dependency on buffer as
> >   not all AMs may have one to one notion between blocknum and single
> >   buffer. Like for zedstore, tuple is stored across individual column
> >   buffers. So, wish to have way to lock not physical buffer but
> >   logical blocknum.
>
> Hm.  I wonder if we somehow ought to generalize the granularity scheme
> for predicate locks to not be tuple/page/relation.  But even if, that's
> probably a separate patch.

What do you have in mind?  This is certainly the traditional way to do
lock hierarchies (archeological fun: we used to have
src/backend/storage/lock/multi.c, a "standard multi-level lock manager
as per the Gray paper", before commits 3f7fbf85 and e6e9e18e
decommissioned it;  SSI brought the concept back again in a parallel
lock manager), but admittedly it has assumptions about physical
storage baked into the naming.  Perhaps you just want to give those
things different labels, "TID range" instead of page, for the benefit
of "logical" TID users?  Perhaps you want to permit more levels?  That
seems premature as long as TIDs are defined in terms of blocks and
offsets, so this stuff is reflected all over the source tree...

> > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> >   buffer, instead just takes xid. Push heap specific parts from
> >   CheckForSerializableConflictOut() into its own function
> >   HeapCheckForSerializableConflictOut() which calls
> >   CheckForSerializableConflictOut(). The alternative option could be
> >   CheckForSerializableConflictOut() take callback function and
> >   callback arguments, which gets called if required after performing
> >   prechecks. Though currently I fell AM having its own wrapper to
> >   perform AM specific task and then calling
> >   CheckForSerializableConflictOut() is fine.
>
> I think it's right to move the xid handling out of
> CheckForSerializableConflictOut(). But I think we also ought to move the
> subtransaction handling out of the function - e.g. zheap doesn't
> want/need that.

Thoughts on this Ashwin?

> > Attaching patch which makes these changes.
>
> Please make sure that there's a CF entry for this (I'm in a plane with a
> super slow connection, otherwise I'd check).

This all makes sense, and I'd like to be able to commit this soon.  We
come up with something nearly identical for zheap.  I'm subscribing
Kuntal Ghosh to see if he has comments, as he worked on that.  The
main point of difference seems to be the assumption about how
subtransactions work.

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Ashwin Agrawal

On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <[hidden email]> wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]> wrote:
> > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> >   buffer, instead just takes xid. Push heap specific parts from
> >   CheckForSerializableConflictOut() into its own function
> >   HeapCheckForSerializableConflictOut() which calls
> >   CheckForSerializableConflictOut(). The alternative option could be
> >   CheckForSerializableConflictOut() take callback function and
> >   callback arguments, which gets called if required after performing
> >   prechecks. Though currently I fell AM having its own wrapper to
> >   perform AM specific task and then calling
> >   CheckForSerializableConflictOut() is fine.
>
> I think it's right to move the xid handling out of
> CheckForSerializableConflictOut(). But I think we also ought to move the
> subtransaction handling out of the function - e.g. zheap doesn't
> want/need that.

Thoughts on this Ashwin?

I think the only part its doing for sub-transaction is SubTransGetTopmostTransaction(xid). If xid passed to this function is already top most transaction which is case for zheap and zedstore, then there is no downside to keeping that code here in common place.

Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Andres Freund
In reply to this post by Thomas Munro-5
Hi,

On 2019-07-31 09:57:58 +1200, Thomas Munro wrote:
> On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]> wrote:
> > Hm.  I wonder if we somehow ought to generalize the granularity scheme
> > for predicate locks to not be tuple/page/relation.  But even if, that's
> > probably a separate patch.
>
> What do you have in mind?

My concern is that continuing to inferring the granularity levels from
the tid doesn't seem like a great path forward. An AMs use of tids might
not necessarily be very amenable to that, if the mapping isn't actually
block based.


> Perhaps you just want to give those things different labels, "TID
> range" instead of page, for the benefit of "logical" TID users?
> Perhaps you want to permit more levels?  That seems premature as long
> as TIDs are defined in terms of blocks and offsets, so this stuff is
> reflected all over the source tree...

I'm mostly wondering if the different levels shouldn't be computed
outside of predicate.c.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Andres Freund
In reply to this post by Ashwin Agrawal
Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:

> On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <[hidden email]> wrote:
>
> > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]> wrote:
> > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > >   buffer, instead just takes xid. Push heap specific parts from
> > > >   CheckForSerializableConflictOut() into its own function
> > > >   HeapCheckForSerializableConflictOut() which calls
> > > >   CheckForSerializableConflictOut(). The alternative option could be
> > > >   CheckForSerializableConflictOut() take callback function and
> > > >   callback arguments, which gets called if required after performing
> > > >   prechecks. Though currently I fell AM having its own wrapper to
> > > >   perform AM specific task and then calling
> > > >   CheckForSerializableConflictOut() is fine.
> > >
> > > I think it's right to move the xid handling out of
> > > CheckForSerializableConflictOut(). But I think we also ought to move the
> > > subtransaction handling out of the function - e.g. zheap doesn't
> > > want/need that.
> >
> > Thoughts on this Ashwin?
> >
>
> I think the only part its doing for sub-transaction is
> SubTransGetTopmostTransaction(xid). If xid passed to this function is
> already top most transaction which is case for zheap and zedstore, then
> there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Ashwin Agrawal

On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <[hidden email]> wrote:
Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <[hidden email]> wrote:
>
> > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]> wrote:
> > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > >   buffer, instead just takes xid. Push heap specific parts from
> > > >   CheckForSerializableConflictOut() into its own function
> > > >   HeapCheckForSerializableConflictOut() which calls
> > > >   CheckForSerializableConflictOut(). The alternative option could be
> > > >   CheckForSerializableConflictOut() take callback function and
> > > >   callback arguments, which gets called if required after performing
> > > >   prechecks. Though currently I fell AM having its own wrapper to
> > > >   perform AM specific task and then calling
> > > >   CheckForSerializableConflictOut() is fine.
> > >
> > > I think it's right to move the xid handling out of
> > > CheckForSerializableConflictOut(). But I think we also ought to move the
> > > subtransaction handling out of the function - e.g. zheap doesn't
> > > want/need that.
> >
> > Thoughts on this Ashwin?
> >
>
> I think the only part its doing for sub-transaction is
> SubTransGetTopmostTransaction(xid). If xid passed to this function is
> already top most transaction which is case for zheap and zedstore, then
> there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do feel about adding boolean isTopTransactionId argument to function CheckForSerializableConflictOut(). The AMs, which implicitly know, only pass top transaction Id to this function, can pass true and avoid the function call to SubTransGetTopmostTransaction(xid). With this subtransaction code remains in generic place and AMs intending to use it continue to leverage the common code, plus explicitly clarifies the behavior as well.
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Ashwin Agrawal

On Wed, Jul 31, 2019 at 12:37 PM Ashwin Agrawal <[hidden email]> wrote:

On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <[hidden email]> wrote:
Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <[hidden email]> wrote:
>
> > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]> wrote:
> > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > >   buffer, instead just takes xid. Push heap specific parts from
> > > >   CheckForSerializableConflictOut() into its own function
> > > >   HeapCheckForSerializableConflictOut() which calls
> > > >   CheckForSerializableConflictOut(). The alternative option could be
> > > >   CheckForSerializableConflictOut() take callback function and
> > > >   callback arguments, which gets called if required after performing
> > > >   prechecks. Though currently I fell AM having its own wrapper to
> > > >   perform AM specific task and then calling
> > > >   CheckForSerializableConflictOut() is fine.
> > >
> > > I think it's right to move the xid handling out of
> > > CheckForSerializableConflictOut(). But I think we also ought to move the
> > > subtransaction handling out of the function - e.g. zheap doesn't
> > > want/need that.
> >
> > Thoughts on this Ashwin?
> >
>
> I think the only part its doing for sub-transaction is
> SubTransGetTopmostTransaction(xid). If xid passed to this function is
> already top most transaction which is case for zheap and zedstore, then
> there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do feel about adding boolean isTopTransactionId argument to function CheckForSerializableConflictOut(). The AMs, which implicitly know, only pass top transaction Id to this function, can pass true and avoid the function call to SubTransGetTopmostTransaction(xid). With this subtransaction code remains in generic place and AMs intending to use it continue to leverage the common code, plus explicitly clarifies the behavior as well.

Added argument to function to make the subtransaction handling optional in attached v2 of patch.


v2-0001-Remove-HeapTuple-dependency-for-predicate-locking.patch (35K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Andres Freund
In reply to this post by Ashwin Agrawal
Hi,

On 2019-07-31 12:37:58 -0700, Ashwin Agrawal wrote:

> On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <[hidden email]> wrote:
>
> > Hi,
> >
> > On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> > > On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <[hidden email]>
> > wrote:
> > >
> > > > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <[hidden email]>
> > wrote:
> > > > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > > > >   buffer, instead just takes xid. Push heap specific parts from
> > > > > >   CheckForSerializableConflictOut() into its own function
> > > > > >   HeapCheckForSerializableConflictOut() which calls
> > > > > >   CheckForSerializableConflictOut(). The alternative option could
> > be
> > > > > >   CheckForSerializableConflictOut() take callback function and
> > > > > >   callback arguments, which gets called if required after
> > performing
> > > > > >   prechecks. Though currently I fell AM having its own wrapper to
> > > > > >   perform AM specific task and then calling
> > > > > >   CheckForSerializableConflictOut() is fine.
> > > > >
> > > > > I think it's right to move the xid handling out of
> > > > > CheckForSerializableConflictOut(). But I think we also ought to move
> > the
> > > > > subtransaction handling out of the function - e.g. zheap doesn't
> > > > > want/need that.
> > > >
> > > > Thoughts on this Ashwin?
> > > >
> > >
> > > I think the only part its doing for sub-transaction is
> > > SubTransGetTopmostTransaction(xid). If xid passed to this function is
> > > already top most transaction which is case for zheap and zedstore, then
> > > there is no downside to keeping that code here in common place.
> >
> > Well, it's far from a cheap function. It'll do unnecessary on-disk
> > lookups in many cases. I'd call that quite a downside.
> >
>
> Okay, agree, its costly function and better to avoid the call if possible.
>
> Instead of moving the handling out of the function, how do feel about
> adding boolean isTopTransactionId argument to function
> CheckForSerializableConflictOut(). The AMs, which implicitly know, only
> pass top transaction Id to this function, can pass true and avoid the
> function call to SubTransGetTopmostTransaction(xid). With this
> subtransaction code remains in generic place and AMs intending to use it
> continue to leverage the common code, plus explicitly clarifies the
> behavior as well.

Looking at the code as of master, we currently have:

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
  out a whether the tuple has been locked by the current
  transaction. That check afaict just should be
  TransactionIdIsCurrentTransactionId(), without all the other
  stuff that's done today.

  TransactionIdIsCurrentTransactionId() imo ought to be optimized to
  always check for the top level transactionid first - that's a good bet
  today, but even moreso for the upcoming AMs that won't have separate
  xids for subtransactions.  Alternatively we shouldn't make that a
  binary search for each subtrans level, but just have a small
  simplehash hashtable for xids.

- CheckForSerializableConflictOut() wants to get the toplevel xid for
  the tuple, because that's the one the predicate hashtable stores.

  In your patch you've already moved the HTSV() call etc out of
  CheckForSerializableConflictOut(). I'm somewhat inclined to think that
  the SubTransGetTopmostTransaction() call ought to go along with that.
  I don't really think that belongs in predicate.c, especially if
  most/all new AMs don't use subtransaction ids.

  The only downside is that currently the
  TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
  avoids the SubTransGetTopmostTransaction() check.

  But again, the better fix for that seems to be to improve the generic
  code. As written the check won't prevent a subtrans lookup for heap
  when subtransactions are in use, and it's IME pretty common for tuples
  to get looked at again in the transaction that has created them. So
  I'm somewhat inclined to think that SubTransGetTopmostTransaction()
  should have a fast-path for the current transaction - probably just
  employing TransactionIdIsCurrentTransactionId().

I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument.  The relevant mapping should be one
line in the caller.

I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.


Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
  updating xid etc. And while you can argue that an update is an insert
  in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
  if (valid)
  {
  ItemPointerSetOffsetNumber(tid, offnum);
- PredicateLockTuple(relation, heapTuple, snapshot);
+ PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+ HeapTupleHeaderGetXmin(heapTuple->t_data));
  if (all_dead)
  *all_dead = false;
  return true;

 What are those parens - as placed they can't do anything. Did you
 intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
 but it at least clarifies the precedence.

 I'm also a bit confused why we don't need to pass in the offset of the
 current tuple, rather than the HOT root tuple here. That's not related
 to this patch. But aren't we locking the wrong tuple here, in case of
 HOT?

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
  portion of it's code as a static inline. In particular, it's a shame
  that we currently perform external function calls at quite the
  frequency when serializable isn't even in use.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Kuntal Ghosh
On Thu, Aug 1, 2019 at 2:36 AM Andres Freund <[hidden email]> wrote:

> > > >
> > > > I think the only part its doing for sub-transaction is
> > > > SubTransGetTopmostTransaction(xid). If xid passed to this function is
> > > > already top most transaction which is case for zheap and zedstore, then
> > > > there is no downside to keeping that code here in common place.
> > >
> > > Well, it's far from a cheap function. It'll do unnecessary on-disk
> > > lookups in many cases. I'd call that quite a downside.
> > >
> >
> > Okay, agree, its costly function and better to avoid the call if possible.
> >
> > Instead of moving the handling out of the function, how do feel about
> > adding boolean isTopTransactionId argument to function
> > CheckForSerializableConflictOut(). The AMs, which implicitly know, only
> > pass top transaction Id to this function, can pass true and avoid the
> > function call to SubTransGetTopmostTransaction(xid). With this
> > subtransaction code remains in generic place and AMs intending to use it
> > continue to leverage the common code, plus explicitly clarifies the
> > behavior as well.
>
> Looking at the code as of master, we currently have:
>
> - PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
>   out a whether the tuple has been locked by the current
>   transaction. That check afaict just should be
>   TransactionIdIsCurrentTransactionId(), without all the other
>   stuff that's done today.
>
Yeah. this is the only part where predicate locking uses the subxids.
Since, predicate locking always use the top xid, IMHO, it'll be good
to make this api independent of subxids.

>   TransactionIdIsCurrentTransactionId() imo ought to be optimized to
>   always check for the top level transactionid first - that's a good bet
>   today, but even moreso for the upcoming AMs that won't have separate
>   xids for subtransactions.  Alternatively we shouldn't make that a
>   binary search for each subtrans level, but just have a small
>   simplehash hashtable for xids.
A check for top transaction id first and usage of simple sound like
good optimizations. But, I'm not sure whether these changes should be
part of this patch or a separate one.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Ashwin Agrawal
In reply to this post by Andres Freund

On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <[hidden email]> wrote:
Looking at the code as of master, we currently have:

Super awesome feedback and insights, thank you!

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
  out a whether the tuple has been locked by the current
  transaction. That check afaict just should be
  TransactionIdIsCurrentTransactionId(), without all the other
  stuff that's done today.

Agree. v1-0002 patch attached does that now. Please let me know if that's what you meant.

  TransactionIdIsCurrentTransactionId() imo ought to be optimized to
  always check for the top level transactionid first - that's a good bet
  today, but even moreso for the upcoming AMs that won't have separate
  xids for subtransactions.  Alternatively we shouldn't make that a
  binary search for each subtrans level, but just have a small
  simplehash hashtable for xids.

v1-0001 patch checks for GetTopTransactionIdIfAny() first in TransactionIdIsCurrentTransactionId() which seems yes better in general and more for future. That mostly meets the needs for current discussion.

The alternative of not using binary search seems bigger refactoring and should be handled as separate optimization exercise outside of this thread.
 
- CheckForSerializableConflictOut() wants to get the toplevel xid for
  the tuple, because that's the one the predicate hashtable stores.

  In your patch you've already moved the HTSV() call etc out of
  CheckForSerializableConflictOut(). I'm somewhat inclined to think that
  the SubTransGetTopmostTransaction() call ought to go along with that.
  I don't really think that belongs in predicate.c, especially if
  most/all new AMs don't use subtransaction ids.

  The only downside is that currently the
  TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
  avoids the SubTransGetTopmostTransaction() check.

  But again, the better fix for that seems to be to improve the generic
  code. As written the check won't prevent a subtrans lookup for heap
  when subtransactions are in use, and it's IME pretty common for tuples
  to get looked at again in the transaction that has created them. So
  I'm somewhat inclined to think that SubTransGetTopmostTransaction()
  should have a fast-path for the current transaction - probably just
  employing TransactionIdIsCurrentTransactionId().

That optimization, as Kuntal also mentioned, seems something which can be done on-top afterwards on current patch.
 
I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument.  The relevant mapping should be one
line in the caller.

Okay, I moved the sub transaction handling out of CheckForSerializableConflictOut() and have it along side HTSV() now.

The reason I felt leaving subtransaction handling in generic place, was it might be premature to thing no future AM will need it. Plus, all serializable function api's having same expectations is easier. Like PredicateLockTuple() can be passed top or subtransaction id and it can handle it but with the change CheckForSerializableConflictOut() only be feed top transaction ID. But its fine and can see the point of AM needing it can easily get top transaction ID and feed it as heap.
 
I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.

Maybe, will give thought to it separate from the current patch.
 
Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
  updating xid etc. And while you can argue that an update is an insert
  in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
                        if (valid)
                        {
                                ItemPointerSetOffsetNumber(tid, offnum);
-                               PredicateLockTuple(relation, heapTuple, snapshot);
+                               PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+                                                                HeapTupleHeaderGetXmin(heapTuple->t_data));
                                if (all_dead)
                                        *all_dead = false;
                                return true;

 What are those parens - as placed they can't do anything. Did you
 intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
 but it at least clarifies the precedence.

Fixed. No idea what I was thinking there, mostly feel I intended to have it as like &(heapTuple->t_self).

  I'm also a bit confused why we don't need to pass in the offset of the
 current tuple, rather than the HOT root tuple here. That's not related
 to this patch. But aren't we locking the wrong tuple here, in case of
 HOT?

Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
  portion of it's code as a static inline. In particular, it's a shame
  that we currently perform external function calls at quite the
  frequency when serializable isn't even in use.

I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.

I understand that with refactor, HeapCheckForSerializableConflictOut() is called which calls CheckForSerializableConflictOutNeeded(). If that's the problem, for addressing the same, I had proposed alternative way to refactor. CheckForSerializableConflictOut() can take callback function and void* callback argument for AM specific check instead. So, the flow would be AM calling CheckForSerializableConflictOut() as today and only if serializable in use will invoke the callback to check with AM if more work should be performed or not. Essentially HeapCheckForSerializableConflictOut() will become callback function instead. Due to void* callback argument aspect I didn't like that solution and felt AM performing checks and calling CheckForSerializableConflictOut() seems more straight forward.


v1-0001-Optimize-TransactionIdIsCurrentTransactionId.patch (1K) Download Attachment
v1-0002-Optimize-PredicateLockTuple.patch (2K) Download Attachment
v1-0003-Remove-HeapTuple-dependency-for-predicate-locking.patch (35K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Thomas Munro-5
On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <[hidden email]> wrote:
> On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <[hidden email]> wrote:
>>   I'm also a bit confused why we don't need to pass in the offset of the
>>  current tuple, rather than the HOT root tuple here. That's not related
>>  to this patch. But aren't we locking the wrong tuple here, in case of
>>  HOT?
>
> Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).

Here are three relevant commits:

1.  Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2.  Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

3.  Commit b89e151054a "Introduce logical decoding." (2014) also did
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
offnum), for the benefit of historical MVCC snapshots (unnecessarily,
considering the change in the commit #2), but then, intending to
"reset to original, non-redirected, tid", clobbered it, reintroducing
the bug fixed by #2.

My first guess is that commit #3 above was developed before commit #2,
and finished up clobbering it.  In fact, both logical decoding and SSI
want offnum, so we should be able to just remove the "reset" bit
(perhaps like in the attached sketch, not really tested, though it
passes).  This must be in want of an isolation test, but I haven't yet
tried to get my head around how to write a test that would show the
difference.

--
Thomas Munro
https://enterprisedb.com

fix.txt (1K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Andres Freund
Hi,

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:

> On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <[hidden email]> wrote:
> > On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <[hidden email]> wrote:
> >>   I'm also a bit confused why we don't need to pass in the offset of the
> >>  current tuple, rather than the HOT root tuple here. That's not related
> >>  to this patch. But aren't we locking the wrong tuple here, in case of
> >>  HOT?
> >
> > Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).
>
> Here are three relevant commits:

Thanks for digging!


> 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
> level." (2011) locked the root tuple, because it set t_self to *tid.
> Possibly due to confusion about the effect of the preceding line
> ItemPointerSetOffsetNumber(tid, offnum).
>
> 2.  Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> offnum).

Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.


> 3.  Commit b89e151054a "Introduce logical decoding." (2014) also did
> ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
> offnum), for the benefit of historical MVCC snapshots (unnecessarily,
> considering the change in the commit #2), but then, intending to
> "reset to original, non-redirected, tid", clobbered it, reintroducing
> the bug fixed by #2.

> My first guess is that commit #3 above was developed before commit #2,
> and finished up clobbering it.

Yea, that sounds likely.


> This must be in want of an isolation test, but I haven't yet tried to
> get my head around how to write a test that would show the difference.

Indeed.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Thomas Munro-5
On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <[hidden email]> wrote:

> On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
> > 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
> > level." (2011) locked the root tuple, because it set t_self to *tid.
> > Possibly due to confusion about the effect of the preceding line
> > ItemPointerSetOffsetNumber(tid, offnum).
> >
> > 2.  Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> > offnum).
>
> Hm. It's not at all sure that it's ok to report the non-root tuple tid
> here. I.e. I'm fairly sure there was a reason to not just set it to the
> actual tid. I think I might have written that up on the list at some
> point. Let me dig in memory and list. Obviously possible that that was
> also obsoleted by parallel changes.
Adding Heikki and Kevin.

I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it.   I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there.  It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.

As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain?  (2) Is it
OK to do that with a HOT root?

The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here.  By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version.  That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root.  Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root.  A HOT root might never be considered
again by concurrent writers, no?

As a minor consequence, the optimisation in
CheckTargetForConflictsIn() assumes that a tuple being updated has the
same tag as we locked when reading the tuple, which isn't the case if
we locked the root while reading but now have the TID for the version
we actually read, so in master we leak a tuple lock unnecessarily
until end-of-transaction when we update a HOT tuple.

> > This must be in want of an isolation test, but I haven't yet tried to
> > get my head around how to write a test that would show the difference.
>
> Indeed.

One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks).  So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1].  I'm out of steam for this problem today though.

The simple test from the report[3] that resulted in commit 81fbbfe3352
doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row).  The best I've come up with so far is an assertion
that we predicate-lock the same row version that we emitted to the
user, when reached via an index lookup that visits a HOT row.  The
test outputs 'f' for master, but 't' with the change to heapam.c.

[1] Excerpt from README-SSI:

===
    * PostgreSQL does not use "update in place" with a rollback log
for its MVCC implementation.  Where possible it uses "HOT" updates on
the same page (if there is room and no indexed value is changed).
For non-HOT updates the old tuple is expired in place and a new tuple
is inserted at a new location.  Because of this difference, a tuple
lock in PostgreSQL doesn't automatically lock any other versions of a
row.  We don't try to copy or expand a tuple lock to any other
versions of the row, based on the following proof that any additional
serialization failures we would get from that would be false
positives:

          o If transaction T1 reads a row version (thus acquiring a
predicate lock on it) and a second transaction T2 updates that row
version (thus creating a rw-conflict graph edge from T1 to T2), must a
third transaction T3 which re-updates the new version of the row also
have a rw-conflict in from T1 to prevent anomalies?  In other words,
does it matter whether we recognize the edge T1 -> T3?

          o If T1 has a conflict in, it certainly doesn't. Adding the
edge T1 -> T3 would create a dangerous structure, but we already had
one from the edge T1 -> T2, so we would have aborted something anyway.
(T2 has already committed, else T3 could not have updated its output;
but we would have aborted either T1 or T1's predecessor(s).  Hence
no cycle involving T1 and T3 can survive.)

          o Now let's consider the case where T1 doesn't have a
rw-conflict in. If that's the case, for this edge T1 -> T3 to make a
difference, T3 must have a rw-conflict out that induces a cycle in the
dependency graph, i.e. a conflict out to some transaction preceding T1
in the graph. (A conflict out to T1 itself would be problematic too,
but that would mean T1 has a conflict in, the case we already
eliminated.)

          o So now we're trying to figure out if there can be an
rw-conflict edge T3 -> T0, where T0 is some transaction that precedes
T1. For T0 to precede T1, there has to be some edge, or sequence of
edges, from T0 to T1. At least the last edge has to be a wr-dependency
or ww-dependency rather than a rw-conflict, because T1 doesn't have a
rw-conflict in. And that gives us enough information about the order
of transactions to see that T3 can't have a rw-conflict to T0:
 - T0 committed before T1 started (the wr/ww-dependency implies this)
 - T1 started before T2 committed (the T1->T2 rw-conflict implies this)
 - T2 committed before T3 started (otherwise, T3 would get aborted
                                   because of an update conflict)

          o That means T0 committed before T3 started, and therefore
there can't be a rw-conflict from T3 to T0.

          o So in all cases, we don't need the T1 -> T3 edge to
recognize cycles.  Therefore it's not necessary for T1's SIREAD lock
on the original tuple version to cover later versions as well.
===

[2] https://www.postgresql.org/message-id/52527E4D.4060302%40vmware.com
[3] https://www.postgresql.org/message-id/flat/523C29A8.20904%40vmware.com

--
Thomas Munro
https://enterprisedb.com

0001-Predicate-lock-the-visible-heap-tuple-not-the-HOT-ro.patch (6K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Kuntal Ghosh
Hello Thomas,

On Tue, Aug 6, 2019 at 9:50 AM Thomas Munro <[hidden email]> wrote:

>
> On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <[hidden email]> wrote:
> > On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
> > > 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
> > > level." (2011) locked the root tuple, because it set t_self to *tid.
> > > Possibly due to confusion about the effect of the preceding line
> > > ItemPointerSetOffsetNumber(tid, offnum).
> > >
> > > 2.  Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
> > > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
> > > offnum).
> >
> > Hm. It's not at all sure that it's ok to report the non-root tuple tid
> > here. I.e. I'm fairly sure there was a reason to not just set it to the
> > actual tid. I think I might have written that up on the list at some
> > point. Let me dig in memory and list. Obviously possible that that was
> > also obsoleted by parallel changes.
>
> Adding Heikki and Kevin.
>
> I haven't found your earlier discussion about that yet, but would be
> keen to read it if you can find it.   I wondered if your argument
> might have had something to do with heap pruning, but I can't find a
> problem there.  It's not as though the TID of any visible tuples
> change, it's just that some dead stuff goes away and the root line
> pointer is changed to LP_REDIRECT if it wasn't already.
>
> As for the argument for locking the tuple we emit rather than the HOT
> root, I think the questions are: (1) How exactly do we get away with
> locking only one version in a regular non-HOT update chain?  (2) Is it
> OK to do that with a HOT root?
>
> The answer to the first question is given in README-SSI[1].
> Unfortunately it doesn't discuss HOT directly, but I suspect the
> answer is no, HOT is not special here.  By my reading, it relies on
> the version you lock being the version visible to your snapshot, which
> is important because later updates have to touch that tuple to write
> the next version.  That doesn't apply to some arbitrarily older tuple
> that happens to be a HOT root.  Concretely, heap_update() does
> CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
> only going to produce a rw conflict if T1 took an SIREAD on precisely
> the version T2 locks in that path, not some arbitrarily older version
> that happens to be a HOT root.  A HOT root might never be considered
> again by concurrent writers, no?
>
If I understand the problem, this is the same serialization issue as
with in-place updates for zheap. I had a discussion with Kevin
regarding the same in this thread [1]. It seems if we're locking the
hot root id, we may report some false positive serializable errors.


> > > This must be in want of an isolation test, but I haven't yet tried to
> > > get my head around how to write a test that would show the difference.
> >
> > Indeed.
>
> One practical problem is that the only way to reach
> PredicateLockTuple() is from an index scan, and the index scan locks
> the index page (or the whole index, depending on
> rd_indam->ampredlocks).  So I think if you want to see a serialization
> anomaly you'll need multiple indexes (so that index page locks don't
> hide the problem), a long enough HOT chain and then probably several
> transactions to be able to miss a cycle that should be picked up by
> the logic in [1].  I'm out of steam for this problem today though.
>
> The simple test from the report[3] that resulted in commit 81fbbfe3352
> doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
> twice in a row).  The best I've come up with so far is an assertion
> that we predicate-lock the same row version that we emitted to the
> user, when reached via an index lookup that visits a HOT row.  The
> test outputs 'f' for master, but 't' with the change to heapam.c.
>
Here is an example from the multiple-row-versions isolation test which
fails if we perform in-place updates for zheap. I think the same will
be relevant if we lock root tuple id instead of the tuple itself.
Step 1: T1-> BEGIN; Read FROM t where id=1000000;
Step 2: T2-> BEGIN; UPDATE t where id=1000000; COMMIT; (creates T1->T2)
Step 3: T3-> BEGIN; UPDATE t where id=1000000; Read FROM t where id=500000;
Step 4: T4-> BEGIN; UPDATE t where id= 500000; Read FROM t where id=1;
COMMIT;  (creates T3->T4)
Step 5: T3-> COMMIT;
Step 6: T1-> UPDATE t where id=1; COMMIT;  (creates T4->T1,)

At step 6, when the update statement is executed, T1 is rolled back
because of T3->T4->T1.

But for zheap, step 3 also creates a dependency T1->T3 because of
in-place update. When T4 commits in step 4, it marks T3 as doomed
because of T1 --> T3 --> T4. Hence, in step 5, T3 is rolled back.

[1]  Re: In-place updates and serializable transactions:
https://www.postgresql.org/message-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg%40mail.gmail.com

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Heikki Linnakangas
In reply to this post by Thomas Munro-5
On 06/08/2019 07:20, Thomas Munro wrote:

> On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <[hidden email]> wrote:
>> On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
>>> 1.  Commit dafaa3efb75 "Implement genuine serializable isolation
>>> level." (2011) locked the root tuple, because it set t_self to *tid.
>>> Possibly due to confusion about the effect of the preceding line
>>> ItemPointerSetOffsetNumber(tid, offnum).
>>>
>>> 2.  Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
>>> fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
>>> offnum).
>>
>> Hm. It's not at all sure that it's ok to report the non-root tuple tid
>> here. I.e. I'm fairly sure there was a reason to not just set it to the
>> actual tid. I think I might have written that up on the list at some
>> point. Let me dig in memory and list. Obviously possible that that was
>> also obsoleted by parallel changes.
>
> Adding Heikki and Kevin.
>
> I haven't found your earlier discussion about that yet, but would be
> keen to read it if you can find it.   I wondered if your argument
> might have had something to do with heap pruning, but I can't find a
> problem there.  It's not as though the TID of any visible tuples
> change, it's just that some dead stuff goes away and the root line
> pointer is changed to LP_REDIRECT if it wasn't already.
>
> As for the argument for locking the tuple we emit rather than the HOT
> root, I think the questions are: (1) How exactly do we get away with
> locking only one version in a regular non-HOT update chain?  (2) Is it
> OK to do that with a HOT root?
>
> The answer to the first question is given in README-SSI[1].
> Unfortunately it doesn't discuss HOT directly, but I suspect the
> answer is no, HOT is not special here.  By my reading, it relies on
> the version you lock being the version visible to your snapshot, which
> is important because later updates have to touch that tuple to write
> the next version.  That doesn't apply to some arbitrarily older tuple
> that happens to be a HOT root.  Concretely, heap_update() does
> CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
> only going to produce a rw conflict if T1 took an SIREAD on precisely
> the version T2 locks in that path, not some arbitrarily older version
> that happens to be a HOT root.  A HOT root might never be considered
> again by concurrent writers, no?
Your analysis is spot on. Thanks for the clear write-up!

>>> This must be in want of an isolation test, but I haven't yet tried to
>>> get my head around how to write a test that would show the difference.
>>
>> Indeed.
>
> One practical problem is that the only way to reach
> PredicateLockTuple() is from an index scan, and the index scan locks
> the index page (or the whole index, depending on
> rd_indam->ampredlocks).  So I think if you want to see a serialization
> anomaly you'll need multiple indexes (so that index page locks don't
> hide the problem), a long enough HOT chain and then probably several
> transactions to be able to miss a cycle that should be picked up by
> the logic in [1].  I'm out of steam for this problem today though.
I had some steam, and wrote a spec that reproduces this bug. It wasn't
actually that hard to reproduce, fortunately. Or unfortunately; people
might well be hitting it in production. I used the "freezetest.spec"
from the 2013 thread as the starting point, and added one UPDATE to the
initialization, so that the test starts with an already HOT-updated
tuple. It should throw a serialization error, but on current master, it
does not. After applying your fix.txt, it does.

Your fix.txt seems correct. For clarity, I'd prefer moving things around
a bit, though, so that the t_self is set earlier in the function, at the
same place where the other HeapTuple fields are set. And set blkno and
offnum together, in one ItemPointerSet call. With that, I'm not sure we
need such a verbose comment explaining why t_self needs to be updated
but I kept it for now.

Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.

PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
to the returned tuple version, updating *tid is redundant. And the call
in heapam_index_fetch_tuple() wouldn't need to do
"bslot->base.tupdata.t_self = *tid".

- Heikki

0001-Fix-predicate-locking-of-HOT-updated-rows.patch (7K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Thomas Munro-5
On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <[hidden email]> wrote:
> I had some steam, and wrote a spec that reproduces this bug. It wasn't
> actually that hard to reproduce, fortunately. Or unfortunately; people
> might well be hitting it in production. I used the "freezetest.spec"
> from the 2013 thread as the starting point, and added one UPDATE to the
> initialization, so that the test starts with an already HOT-updated
> tuple. It should throw a serialization error, but on current master, it
> does not. After applying your fix.txt, it does.

Thanks!  Ahh, right, I was expecting it to be harder to see an
undetected anomaly, because of the index page lock, but of course we
never actually write to that page so it's just the heap tuple lock
holding everything together.

> Your fix.txt seems correct. For clarity, I'd prefer moving things around
> a bit, though, so that the t_self is set earlier in the function, at the
> same place where the other HeapTuple fields are set. And set blkno and
> offnum together, in one ItemPointerSet call. With that, I'm not sure we
> need such a verbose comment explaining why t_self needs to be updated
> but I kept it for now.

+1

> Attached is a patch that contains your fix.txt with the changes for
> clarity mentioned above, and an isolationtester test case.

LGTM.

> PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
> to the returned tuple version, updating *tid is redundant. And the call
> in heapam_index_fetch_tuple() wouldn't need to do
> "bslot->base.tupdata.t_self = *tid".

Right, that sounds like a separate improvement for master only.

--
Thomas Munro
https://enterprisedb.com


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Heikki Linnakangas
On 06/08/2019 13:35, Thomas Munro wrote:
> On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <[hidden email]> wrote:
>> Attached is a patch that contains your fix.txt with the changes for
>> clarity mentioned above, and an isolationtester test case.
>
> LGTM.

Pushed, thanks!

- Heikki


Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Ashwin Agrawal
In reply to this post by Ashwin Agrawal

On Fri, Aug 2, 2019 at 4:56 PM Ashwin Agrawal <[hidden email]> wrote:

On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <[hidden email]> wrote:
Looking at the code as of master, we currently have:

Super awesome feedback and insights, thank you!

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
  out a whether the tuple has been locked by the current
  transaction. That check afaict just should be
  TransactionIdIsCurrentTransactionId(), without all the other
  stuff that's done today.

Agree. v1-0002 patch attached does that now. Please let me know if that's what you meant.

  TransactionIdIsCurrentTransactionId() imo ought to be optimized to
  always check for the top level transactionid first - that's a good bet
  today, but even moreso for the upcoming AMs that won't have separate
  xids for subtransactions.  Alternatively we shouldn't make that a
  binary search for each subtrans level, but just have a small
  simplehash hashtable for xids.

v1-0001 patch checks for GetTopTransactionIdIfAny() first in TransactionIdIsCurrentTransactionId() which seems yes better in general and more for future. That mostly meets the needs for current discussion.

The alternative of not using binary search seems bigger refactoring and should be handled as separate optimization exercise outside of this thread.
 
- CheckForSerializableConflictOut() wants to get the toplevel xid for
  the tuple, because that's the one the predicate hashtable stores.

  In your patch you've already moved the HTSV() call etc out of
  CheckForSerializableConflictOut(). I'm somewhat inclined to think that
  the SubTransGetTopmostTransaction() call ought to go along with that.
  I don't really think that belongs in predicate.c, especially if
  most/all new AMs don't use subtransaction ids.

  The only downside is that currently the
  TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
  avoids the SubTransGetTopmostTransaction() check.

  But again, the better fix for that seems to be to improve the generic
  code. As written the check won't prevent a subtrans lookup for heap
  when subtransactions are in use, and it's IME pretty common for tuples
  to get looked at again in the transaction that has created them. So
  I'm somewhat inclined to think that SubTransGetTopmostTransaction()
  should have a fast-path for the current transaction - probably just
  employing TransactionIdIsCurrentTransactionId().

That optimization, as Kuntal also mentioned, seems something which can be done on-top afterwards on current patch.
 
I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument.  The relevant mapping should be one
line in the caller.

Okay, I moved the sub transaction handling out of CheckForSerializableConflictOut() and have it along side HTSV() now.

The reason I felt leaving subtransaction handling in generic place, was it might be premature to thing no future AM will need it. Plus, all serializable function api's having same expectations is easier. Like PredicateLockTuple() can be passed top or subtransaction id and it can handle it but with the change CheckForSerializableConflictOut() only be feed top transaction ID. But its fine and can see the point of AM needing it can easily get top transaction ID and feed it as heap.
 
I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.

Maybe, will give thought to it separate from the current patch.
 
Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
  updating xid etc. And while you can argue that an update is an insert
  in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
                        if (valid)
                        {
                                ItemPointerSetOffsetNumber(tid, offnum);
-                               PredicateLockTuple(relation, heapTuple, snapshot);
+                               PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+                                                                HeapTupleHeaderGetXmin(heapTuple->t_data));
                                if (all_dead)
                                        *all_dead = false;
                                return true;

 What are those parens - as placed they can't do anything. Did you
 intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
 but it at least clarifies the precedence.

Fixed. No idea what I was thinking there, mostly feel I intended to have it as like &(heapTuple->t_self).

  I'm also a bit confused why we don't need to pass in the offset of the
 current tuple, rather than the HOT root tuple here. That's not related
 to this patch. But aren't we locking the wrong tuple here, in case of
 HOT?

Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
  portion of it's code as a static inline. In particular, it's a shame
  that we currently perform external function calls at quite the
  frequency when serializable isn't even in use.

I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.

I understand that with refactor, HeapCheckForSerializableConflictOut() is called which calls CheckForSerializableConflictOutNeeded(). If that's the problem, for addressing the same, I had proposed alternative way to refactor. CheckForSerializableConflictOut() can take callback function and void* callback argument for AM specific check instead. So, the flow would be AM calling CheckForSerializableConflictOut() as today and only if serializable in use will invoke the callback to check with AM if more work should be performed or not. Essentially HeapCheckForSerializableConflictOut() will become callback function instead. Due to void* callback argument aspect I didn't like that solution and felt AM performing checks and calling CheckForSerializableConflictOut() seems more straight forward.

Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate locking bug spotted by Andres committed now.


v2-0001-Optimize-TransactionIdIsCurrentTransactionId.patch (1K) Download Attachment
v2-0002-Optimize-PredicateLockTuple.patch (2K) Download Attachment
v2-0003-Remove-HeapTuple-dependency-for-predicate-locking.patch (34K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Thomas Munro-5
On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal <[hidden email]> wrote:
>>> - I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
>>>   portion of it's code as a static inline. In particular, it's a shame
>>>   that we currently perform external function calls at quite the
>>>   frequency when serializable isn't even in use.
>>
>> I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.

I agree that it's strange that we do these high frequency function
calls just to figure out that we're not even using this stuff, which
ultimately comes down to the static global variable MySerializableXact
being not reachable from (say) an inline function defined in a header.
That's something to look into in another thread.

> Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate locking bug spotted by Andres committed now.

I'm planning to commit these three patches on Monday.  I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.

v3-0001-Optimize-TransactionIdIsCurrentTransactionId.patch (1K) Download Attachment
v3-0002-Optimize-PredicateLockTuple.patch (2K) Download Attachment
v3-0003-Remove-HeapTuple-dependency-for-predicate-locking.patch (35K) Download Attachment
12